Question
I have a CSV file that has some quoting issues:
"Albanese Confectionery","157137","ALBANESE BULK ASST. MINI WILD FRUIT WORMS 2" 4/5LB",9,90,0,0,0,.53,"21",50137,"3441851137","5 lb",1,4,4,$6.7,$6.7,$26.8
SuperCSV is choking on these fruit worms (pun intended). I know that the 2" should probably be 2"", but it's not. LibreOffice actually parses this correctly (which surprises me). I was thinking of just writing my own little parser, but other rows have commas inside the string:
"Albanese Confectionery","157230","ALBANESE BULK JET FIGHTERS,ASSORTED 4/5 B",9,90,0,0,0,.53,"21",50230,"3441851230","5 lb",1,4,4,$6.7,$6.7,$26.8
Does anyone know of a Java library that will handle crazy stuff like this? Or should I try all the available ones? Or am I better off hacking this out myself?
Recommended answer
The right solution is to find the person who generated the data and beat them over the head with a keyboard until they fix the problem on their end.
Once you've exhausted that route, you could try some of the other CSV parsers on the market; I've used OpenCSV with success in the past.
Even if OpenCSV won't solve the problem out of the box, the code is fairly easy to read and available under an Apache license, so it might be possible to modify the algorithm to work with your wonky data, and probably easier than starting from scratch.
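If it does come to hacking something out yourself, one heuristic that copes with both sample rows is to treat a quote as closing a field only when it is immediately followed by a comma or the end of the line, and to keep any other quote as literal text. The sketch below is only an illustration of that idea, not OpenCSV's algorithm; the class name LenientCsv and the method parseLine are made up for this example. It assumes one record per line (no embedded newlines), and it will still split incorrectly if a stray quote happens to sit directly before a comma inside a field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of SuperCSV or OpenCSV.
final class LenientCsv {

    // Splits a single CSV line, tolerating unescaped quotes inside quoted fields.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c);
                    continue;
                }
                boolean atEnd = (i + 1 == line.length());
                if (!atEnd && line.charAt(i + 1) == '"') {
                    field.append('"');   // properly escaped "" becomes one quote
                    i++;
                } else if (atEnd || line.charAt(i + 1) == ',') {
                    inQuotes = false;    // quote before a comma or end of line closes the field
                } else {
                    field.append('"');   // stray unescaped quote: keep it as data
                }
            } else if (c == '"' && field.length() == 0) {
                inQuotes = true;         // opening quote at the start of a field
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());    // last field has no trailing comma
        return fields;
    }
}
```

Calling LenientCsv.parseLine on each line of the file returns the fields with the surrounding quotes removed, so the fruit worms description keeps its inch mark and the jet fighters description keeps its comma.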