
Problem description

I am trying to use Google Datalab to read a file in an IPython notebook with a basic pd.read_csv(), but I can't find the path of the file. I have the file locally and have also uploaded it to a bucket in Google Cloud Storage.

I ran the following commands to find out where I am:

os.getcwd()

gives '/content/[email protected]'

os.listdir('/content/[email protected]')

gives ['.git', '.gitignore', 'datalab', 'Hello World.ipynb', '.ipynb_checkpoints']

Recommended answer

The following reads the contents of the object into a string variable called text:

%%storage read --object "gs://path/to/data.csv" --variable text

Then:

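import pandas as pd
from cStringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO

mydata = pd.read_csv(StringIO(text))  # treat the string as an in-memory file
mydata.head()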
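To make the StringIO step a little more concrete, here is a minimal sketch of wrapping already-downloaded CSV content in a file-like object. It uses only the standard library so it runs anywhere. The helper name `as_text_buffer` and the sample data are made up for illustration, and whether the variable filled in by %%storage read arrives as str or bytes is an assumption that may depend on your Python version.

```python
import csv
import io


def as_text_buffer(payload, encoding="utf-8"):
    """Wrap CSV content (str or bytes) in an in-memory, file-like object.

    Hypothetical helper, not part of Datalab or pandas.
    """
    if isinstance(payload, bytes):
        # Assumption: the object is text encoded as UTF-8.
        payload = payload.decode(encoding)
    return io.StringIO(payload)


# Made-up sample standing in for the `text` variable from the cell above.
sample = b"name,score\nalice,1\nbob,2\n"

# Any reader that accepts a file-like object can consume the buffer.
rows = list(csv.DictReader(as_text_buffer(sample)))
print(rows[0]["name"], rows[1]["score"])
```

In a notebook, the same buffer could be passed to pandas instead, for example pd.read_csv(as_text_buffer(text)).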

Hopefully pandas will support "gs://" URLs (as it currently does for s3://) to allow reading directly from Google Cloud Storage.

I have found the following docs really helpful:

https://github.com/GoogleCloudPlatform/datalab/tree/master/content/datalab/tutorials

Hope that helps (just getting started with Datalab too, so maybe someone will have a cleaner method soon).


08-28 08:36