Problem description
Is there a way to load a JSON file from local file system to BigQuery using Google BigQuery Client API?
All the options I found are:
1- Streaming the records one by one.
2- Loading JSON data from GCS.
3- Using raw POST requests to load the JSON (i.e. not through Google Client API).
I'm assuming from the python tag that you want to do this from python. There is a load example here that loads data from a local file (it uses CSV, but it is easy to adapt it to JSON... there is another json example in the same directory).
The basic flow is:
from googleapiclient.http import MediaFileUpload
import time

# `jobs` below is the jobs collection of an authorized BigQuery client,
# e.g. jobs = build('bigquery', 'v2', credentials=credentials).jobs()

# Load configuration with the destination specified.
load_config = {
    'destinationTable': {
        'projectId': PROJECT_ID,
        'datasetId': DATASET_ID,
        'tableId': TABLE_ID
    }
}
load_config['schema'] = {
    'fields': [
        {'name': 'string_f', 'type': 'STRING'},
        {'name': 'boolean_f', 'type': 'BOOLEAN'},
        {'name': 'integer_f', 'type': 'INTEGER'},
        {'name': 'float_f', 'type': 'FLOAT'},
        {'name': 'timestamp_f', 'type': 'TIMESTAMP'}
    ]
}
load_config['sourceFormat'] = 'NEWLINE_DELIMITED_JSON'

# This tells it to perform a resumable upload of a local file
# called 'foo.json'
upload = MediaFileUpload('foo.json',
                         mimetype='application/octet-stream',
                         # This enables resumable uploads.
                         resumable=True)

start = time.time()
job_id = 'job_%d' % start

# Create the job.
result = jobs.insert(
    projectId=PROJECT_ID,
    body={
        'jobReference': {
            'jobId': job_id
        },
        'configuration': {
            'load': load_config
        }
    },
    media_body=upload).execute()

# Then you'd also want to wait for the result and check the status. (check out
# the example at the link for more info).
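The "wait for the result" step can be sketched as a simple polling loop. This is a minimal sketch, not the sample's actual code: `wait_for_job` and `poll_status` are hypothetical names, and in practice you would pass something like `lambda: jobs.get(projectId=PROJECT_ID, jobId=job_id).execute()` as the poller.

```python
import time

def wait_for_job(poll_status, timeout_secs=300, interval_secs=2):
    """Poll until the load job reaches the DONE state.

    poll_status is a callable returning the job resource dict, e.g.
    lambda: jobs.get(projectId=PROJECT_ID, jobId=job_id).execute()
    (a hypothetical wiring; `jobs`, PROJECT_ID, and job_id come from
    the snippet above).
    """
    deadline = time.time() + timeout_secs
    while time.time() < deadline:
        job = poll_status()
        if job['status']['state'] == 'DONE':
            # A DONE job may still have failed; check errorResult.
            if 'errorResult' in job['status']:
                raise RuntimeError(job['status']['errorResult'])
            return job
        time.sleep(interval_secs)
    raise TimeoutError('load job did not finish in time')
```

Note that BigQuery reports a job as DONE even when it failed, so checking `status.errorResult` is what actually distinguishes success from failure.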
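For reference, `sourceFormat = 'NEWLINE_DELIMITED_JSON'` means the uploaded file must contain one JSON object per line, with keys matching the schema's field names. A sketch that writes a hypothetical `foo.json` with made-up sample values:

```python
import json

# Hypothetical sample rows matching the schema fields from the snippet.
rows = [
    {'string_f': 'hello', 'boolean_f': True, 'integer_f': 42,
     'float_f': 3.14, 'timestamp_f': '2014-08-19 12:00:00'},
    {'string_f': 'world', 'boolean_f': False, 'integer_f': 7,
     'float_f': 2.72, 'timestamp_f': '2014-08-19 13:00:00'},
]

# Newline-delimited JSON: one serialized object per line, no enclosing array.
with open('foo.json', 'w') as f:
    for row in rows:
        f.write(json.dumps(row) + '\n')
```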