Problem description
I want to read a CSV file into a list in an Apache Beam application, where each element of the list is a tuple or a list (it doesn't really matter which), so that the CSV
1,2,3
4,5,6
becomes
[(1,2,3) , (4,5,6)]
or
[ [1,2,3], [4,5,6] ]
I tried following the instructions in "How to convert csv into a dictionary in apache beam dataflow", but when I try to use
from beam_utils.sources import CsvFileSource
I get
from beam_utils.sources import CsvFileSource
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/beam_utils/sources.py", line 9, in <module>
    from apache_beam.io import fileio
ImportError: cannot import name fileio
If I try to import it directly with
from apache_beam.io import fileio
I get the same error. However, I can run both of
import apache_beam.io
import beam_utils
without any issues. Does anyone know what the problem might be, or how I could do this a different way?
I currently have
with beam.Pipeline(options=pipeline_options) as p:
    csvfile = p | ReadFromText(known_args.input)
so if I can turn csvfile into the desired format some other way, that works well too.
Recommended answer
Just ran into this same problem a few minutes ago. The issue is that fileio is apparently no longer in apache_beam (at least it wasn't for me). It appears to have been replaced by filesystem.
Not a great solution, but in sources.py from beam_utils I replaced all instances of "fileio" with "filesystem".
So
from apache_beam.io import fileio
becomes
from apache_beam.io import filesystem
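If patching beam_utils is not appealing, another route is to skip CsvFileSource entirely and parse each line that ReadFromText emits. The following is only a minimal sketch, not something from beam_utils: the names parse_csv_line and build_rows are made up for illustration, it assumes every field is an integer as in the question's sample data, and it assumes no quoted field spans more than one line (ReadFromText splits on newlines before the parser sees anything).

```python
import csv


def parse_csv_line(line):
    """Parse one line of CSV text into a tuple of ints (hypothetical helper)."""
    # csv.reader copes with quoted fields, which a plain line.split(",") would not.
    fields = next(csv.reader([line]))
    # Assumption: every column is an integer, as in the sample data.
    return tuple(int(field) for field in fields)


def build_rows(pipeline, input_path):
    """Attach read-and-parse steps to an existing Beam pipeline (hypothetical helper)."""
    # Imported here so parse_csv_line can be tried out without Beam installed.
    import apache_beam as beam
    from apache_beam.io import ReadFromText

    return (
        pipeline
        | "ReadLines" >> ReadFromText(input_path)
        | "ParseCsv" >> beam.Map(parse_csv_line)
    )


# Quick local check of the parsing step on the sample rows.
print([parse_csv_line(line) for line in ["1,2,3", "4,5,6"]])
# prints [(1, 2, 3), (4, 5, 6)]
```

Inside the question's with block this would be used as csvfile = build_rows(p, known_args.input). Note that the result is a PCollection of tuples rather than an in-memory Python list, so later steps would consume it with further transforms.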