Problem description
While extracting text from a PDF file using iTextSharp, I am getting the error "Could not find image data or EI".
My code and a sample file are below. Please help.
Dim simg,tmp,sImgPDFLst As String
simg = "AOTD_SU_20131118_006.pdf"
Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
tmp = pdf.parser.PdfTextExtractor.GetTextFromPage(reader, 1, New pdf.parser.SimpleTextExtractionStrategy())
If tmp.Length = 0 Then
sImgPDFLst = "Following files are IMAGE PDF"
End If
reader.Dispose()
reader.Close()
reader = Nothing
Link: https://drive.google.com/file/d/0B_nzYHWVJJ7KbnFSRWx5ZVNpSkk/edit?usp=sharing
Recommended answer
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports System.IO

Partial Public Class WebForm2
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        ' Resolve the path inside Page_Load, where the request context is available.
        Dim pdfFile As String = System.IO.Path.Combine(Server.MapPath("PDFs"), "AOTD_SU_20131118_006.pdf")
        ' Extract once and reuse the result instead of parsing the file twice.
        Dim extracted As String = GetTextFromPDF(pdfFile)
        If extracted <> "" Then
            Me.Label1.Text = extracted
        Else
            Me.Label1.Text = "The PDF has Images"
        End If
    End Sub

    Public Shared Function GetTextFromPDF(ByVal PdfFileName As String) As String
        Dim oReader As New iTextSharp.text.pdf.PdfReader(PdfFileName)
        Dim sOut As String = ""
        Try
            For i As Integer = 1 To oReader.NumberOfPages
                ' Use a fresh strategy per page so text does not carry over between pages.
                Dim its As New iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy()
                sOut &= iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(oReader, i, its)
            Next
        Finally
            ' Release the file handle even if extraction throws.
            oReader.Close()
        End Try
        Return sOut
    End Function
End Class