How to extract text from PDFs using xpdf?
Problem description
I have many PDFs in a folder. I want to extract the text from each of them using xpdf. For example:
- example1.pdf is extracted to example1.txt
- example2.pdf is extracted to example2.txt
- etc.
Here is my code:
<?php
$path = 'C:/AppServ/www/pdfs/';
$dir = opendir($path);
$f = readdir($dir);
while ($f = readdir($dir)) {
    if (eregi("\.pdf",$f)){
        $content = shell_exec('C:/AppServ/www/pdfs/pdftotext '.$f.' ');
        $read = strtok ($f,".");
        $testfile = "$read.txt";
        $file = fopen($testfile,"r");
        if (filesize($testfile)==0){}
        else{
            $text = fread($file,filesize($testfile));
            fclose($file);
            echo "</br>"; echo "</br>";
        }
    }
}
I get a blank result. What's wrong with my code?
Recommended answer
Try this approach:
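<?php
$path = 'C:/AppServ/www/pdfs/';
$dir = opendir($path);
// Compare against false so a file named "0" does not end the loop early.
while (($filename = readdir($dir)) !== false) {
    // eregi() was removed in PHP 7; match names ending in .pdf, case-insensitively.
    if (preg_match('/\.pdf$/i', $filename)) {
        // Build full paths so the result does not depend on the working directory.
        $pdffile = $path . $filename;
        $testfile = $path . pathinfo($filename, PATHINFO_FILENAME) . '.txt';
        // Pass the output file explicitly: pdftotext <PDF-file> <text-file>.
        shell_exec('C:/AppServ/www/pdfs/pdftotext ' . escapeshellarg($pdffile) . ' ' . escapeshellarg($testfile));
        // Skip PDFs for which no text was produced.
        if (is_file($testfile) && filesize($testfile) > 0) {
            $text = file_get_contents($testfile);
            // The original never printed $text, which is why the page was blank.
            echo htmlspecialchars($text), "<br><br>";
        }
    }
}
closedir($dir);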
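If the output is still blank, the command itself may be failing silently, since shell_exec() gives no exit status. The following is only a minimal sketch of one way to check for that, not part of the original answer: txtPathFor() and extractText() are hypothetical helper names, and the path to the pdftotext binary is an assumption you would adjust for your install. It uses pathinfo() rather than strtok(), so a name such as report.v2.pdf maps to report.v2.txt instead of being cut at the first dot.

```php
<?php
// Hypothetical helper: map "dir/example1.pdf" to "dir/example1.txt".
function txtPathFor(string $pdfPath): string
{
    $info = pathinfo($pdfPath);
    return $info['dirname'] . '/' . $info['filename'] . '.txt';
}

// Hypothetical helper: run pdftotext on one PDF and return its text,
// or null if the command failed or produced no text.
// $pdftotext is an assumed path to the xpdf pdftotext binary.
function extractText(string $pdftotext, string $pdfPath): ?string
{
    $txtPath = txtPathFor($pdfPath);
    $cmd = escapeshellarg($pdftotext) . ' '
         . escapeshellarg($pdfPath) . ' '
         . escapeshellarg($txtPath) . ' 2>&1';

    // exec() exposes the exit code, which shell_exec() does not.
    exec($cmd, $output, $exitCode);
    if ($exitCode !== 0) {
        // Log the tool's own message instead of failing silently.
        error_log("pdftotext failed ($exitCode): " . implode("\n", $output));
        return null;
    }

    $text = is_file($txtPath) ? file_get_contents($txtPath) : false;
    return ($text === false || $text === '') ? null : $text;
}
```

One possible use is to call extractText() for each path returned by glob($path . '*.pdf') and print only the non-null results. A null result with a logged message then points to the command (wrong binary path, unreadable PDF) rather than to the PHP loop.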