Problem description
I have an OFX file downloaded from Citibank. Its DTD is defined at http://www.ofx.net/DownloadPage/Files/ofx102spec.zip (file OFXBANK.DTD), and the OFX file appears to be valid SGML. I am trying to parse it with DomDocument in PHP 5.4.13, but I get several warnings and the file is not parsed. My code is:
$file = "source/ACCT_013.OFX";
$dtd = "source/ofx102spec/OFXBANK.DTD";
$doc = new DomDocument();
$doc->loadHTMLFile($file);
$doc->schemaValidate($dtd);
$dom->validateOnParse = true;
The OFX file starts with:
OFXHEADER:100
DATA:OFXSGML
VERSION:102
SECURITY:NONE
ENCODING:USASCII
CHARSET:1252
COMPRESSION:NONE
OLDFILEUID:NONE
NEWFILEUID:NONE
<OFX>
<SIGNONMSGSRSV1>
<SONRS>
<STATUS>
<CODE>0
<SEVERITY>INFO
</STATUS>
<DTSERVER>20130331073401
<LANGUAGE>SPA
</SONRS>
</SIGNONMSGSRSV1>
<BANKMSGSRSV1>
<STMTTRNRS>
<TRNUID>0
<STATUS>
<CODE>0
<SEVERITY>INFO
</STATUS>
<STMTRS>
<CURDEF>COP
<BANKACCTFROM> ...
I am open to installing and using any program on the server (CentOS) that can be called from PHP.
PS: This class, http://www.phpclasses.org/package/5778-PHP-Parse-and-extract-financial-records-from-OFX-files.html, does not work for me.
Recommended answer
First of all, even though XML is a subset of SGML, a valid SGML file is not necessarily a well-formed XML file. XML is stricter and does not use all the features that SGML offers.
As DOMDocument is XML-based (not SGML-based), it is not really compatible with such a file.
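To illustrate the difference, here is a minimal sketch showing that DOMDocument::loadXML() rejects OFX-style SGML in which leaf elements are left unclosed. The sample string is made up for illustration and is not taken from the Citibank file.

```php
<?php
// Hypothetical OFX-style fragment: <CODE> is never closed. SGML with a DTD
// can allow omitted end tags, but XML cannot.
$sgml = '<OFX><STATUS><CODE>0</STATUS></OFX>';

// Collect libxml errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadXML($sgml);   // false: the input is not well-formed XML
$errors = libxml_get_errors();    // array of LibXMLError objects
libxml_clear_errors();
libxml_use_internal_errors($previous);

var_dump($loaded);
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}
```

The error messages come from libxml and name the mismatched tags, which is the same kind of warning the code in the question triggers.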
Apart from that problem, see section 2.2, Open Financial Exchange Headers, in Ofexfin1.doc. It explains that an OFX file begins with a set of headers, and further that a blank line separates the last header from the data.
So locate the first blank line and strip everything up to it. Then load the SGML part into DOMDocument by converting the SGML into XML first:
$source = fopen('file.ofx', 'r');
if (!$source) {
    throw new Exception('Unable to open OFX file.');
}
// skip headers of OFX file
$headers = array();
// map the OFX CHARSET header value to an iconv encoding name
$charsets = array(
    1252 => 'WINDOWS-1252',
);
while (!feof($source)) {
    $line = trim(fgets($source));
    if ($line === '') {
        // the first blank line ends the header block
        break;
    }
    list($header, $value) = explode(':', $line, 2);
    $headers[$header] = $value;
}
$buffer = '';
// dead-cheap SGML to XML conversion
// see as well http://www.hanselman.com/blog/PostprocessingAutoClosedSGMLTagsWithTheSGMLReader.aspx
while (!feof($source)) {
    $line = trim(fgets($source));
    if ($line === '') continue;
    $line = iconv($charsets[$headers['CHARSET']], 'UTF-8', $line);
    // a line that does not end with '>' is "<TAG>value": append "</TAG>"
    if (substr($line, -1, 1) !== '>') {
        list($tag) = explode('>', $line, 2);
        $line .= '</' . substr($tag, 1) . '>';
    }
    $buffer .= $line . "\n";
}
fclose($source);
// use DOMDocument with non-standard recover mode
$doc = new DOMDocument();
$doc->recover = true;
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
$save = libxml_use_internal_errors(true);
$doc->loadXML($buffer);
libxml_use_internal_errors($save);
echo $doc->saveXML();
This code example then outputs the following (re-formatted) XML, which also shows that DOMDocument loaded the data properly:
<?xml version="1.0"?>
<OFX>
<SIGNONMSGSRSV1>
<SONRS>
<STATUS>
<CODE>0</CODE>
<SEVERITY>INFO</SEVERITY>
</STATUS>
<DTSERVER>20130331073401</DTSERVER>
<LANGUAGE>SPA</LANGUAGE>
</SONRS>
</SIGNONMSGSRSV1>
<BANKMSGSRSV1>
<STMTTRNRS>
<TRNUID>0</TRNUID>
<STATUS>
<CODE>0</CODE>
<SEVERITY>INFO</SEVERITY>
</STATUS>
<STMTRS><CURDEF>COP</CURDEF><BANKACCTFROM> ...</BANKACCTFROM>
</STMTRS>
</STMTTRNRS>
</BANKMSGSRSV1>
</OFX>
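Once the data is in a DOMDocument, individual values can be read with DOMXPath. The following is a minimal sketch, assuming a small excerpt of the converted XML inlined as a string so that it runs on its own. The variable names are illustrative, and treating DTSERVER as UTC is an assumption made here because the sample value carries no time zone.

```php
<?php
// Excerpt of the converted XML, inlined for a self-contained example.
$xml = '<OFX><SIGNONMSGSRSV1><SONRS>'
     . '<STATUS><CODE>0</CODE><SEVERITY>INFO</SEVERITY></STATUS>'
     . '<DTSERVER>20130331073401</DTSERVER><LANGUAGE>SPA</LANGUAGE>'
     . '</SONRS></SIGNONMSGSRSV1></OFX>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// string(...) returns the text content of the first matching node,
// or an empty string when nothing matches.
$statusCode = $xpath->evaluate('string(/OFX/SIGNONMSGSRSV1/SONRS/STATUS/CODE)');
$serverTime = $xpath->evaluate('string(/OFX/SIGNONMSGSRSV1/SONRS/DTSERVER)');
$language   = $xpath->evaluate('string(/OFX/SIGNONMSGSRSV1/SONRS/LANGUAGE)');

// Assumption: a 14-digit YYYYMMDDHHMMSS value without a time zone suffix,
// interpreted as UTC.
$when = DateTime::createFromFormat('YmdHis', $serverTime, new DateTimeZone('UTC'));

echo $statusCode, ' ', $language, ' ', $when->format('Y-m-d H:i:s'), "\n";
```

The same approach should extend to the statement data under BANKMSGSRSV1, using the element names that appear in the actual file.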
I do not know whether this can then be validated against the DTD; maybe it works. Additionally, this conversion is fragile: it requires that each tag's value sits on the same line as the tag, with only a single element per line. If the SGML is not written that way, the conversion will break.
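One way to reduce that fragility is to close leaf elements with a regular expression over the whole SGML body instead of working line by line. The sketch below is an alternative to the line-based conversion, not part of it, and closeOfxLeafTags is a hypothetical helper name. It assumes the headers have already been stripped, that any element followed by text is a leaf, and that values contain no literal "<".

```php
<?php
/**
 * Close unclosed leaf elements in an OFX 1.x SGML body.
 * Hypothetical helper, sketch only: assumes headers are already removed.
 */
function closeOfxLeafTags($sgml)
{
    return preg_replace_callback(
        // <TAG> followed by text up to the next "<". The possessive "++"
        // stops backtracking, so the negative lookahead reliably skips
        // elements that already have their own closing tag.
        '~<([A-Za-z0-9_.]+)>([^<]++)(?!</\1>)~',
        function ($m) {
            $value = trim($m[2]);
            if ($value === '') {
                // Only whitespace follows: an aggregate such as <STATUS>.
                return $m[0];
            }
            return '<' . $m[1] . '>' . $value . '</' . $m[1] . '>';
        },
        $sgml
    );
}

// Several elements on one line, and a value on the line after its tag.
$sgml = "<OFX>\n<STATUS><CODE>0<SEVERITY>INFO</STATUS>\n<DTSERVER>\n20130331073401\n</OFX>";
$converted = closeOfxLeafTags($sgml);
echo $converted, "\n";
```

The result can be passed to DOMDocument::loadXML() in the same way as the buffer built by the line-based loop. Unescaped "&" characters in values are not handled here and would still need attention.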