Problem description
I have created a Google Sheets spreadsheet with two worksheets, called Sheet1 and Sheet2.
Sheet1 has columns col1 and col2 in row 1, and Sheet2 has columns col3 and col4 in row 1.
I would like to be able to query both sheets using the BigQuery command-line interface. It seems this was not possible until a feature was added in May 2018:
https://issuetracker.google.com/issues/35905674#comment12
I tried to replicate what this comment describes, but I can't get it to work. Here is what I've done so far.
The description of the new feature shows an example google_sheets_tabeledef.json that looks like this:
{
  "autodetect": false,
  "sourceFormat": "GOOGLE_SHEETS",
  "sourceUris": [
    "https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxx"
  ],
  "maxBadRecords": 1,
  "googleSheetsOptions": {
    "range": "test_sheet!A1:B20",
    "skipLeadingRows": 0
  },
  "schema": {
    "fields": [
      {
        "name": "col1",
        "type": "string"
      },
      {
        "name": "col2",
        "type": "int64"
      }
    ]
  }
}
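A definition file like this needs to parse as JSON before it is of any use. As a minimal sketch (not part of the original post), the Python check below uses only the standard json module to catch syntax slips such as a trailing comma after the last element of a list. The function name check_tabledef and the two sample strings are hypothetical.

```python
import json


def check_tabledef(text):
    """Return (True, parsed_dict) if text is valid JSON, else (False, error message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        # err.lineno and err.colno point at the offending character.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


# Hypothetical samples: a trailing comma after the last field is not valid JSON.
bad = '{"schema": {"fields": [{"name": "col1", "type": "string"},]}}'
good = '{"schema": {"fields": [{"name": "col1", "type": "string"}]}}'

print(check_tabledef(bad))
print(check_tabledef(good)[0])
```

To check a real file, read its contents and pass the text to check_tabledef before handing the file to bq.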
Given my spreadsheet with two sheets, Sheet1 and Sheet2, I created a table for each of them in my BigQuery account, then tried to query each one via the command-line bq query. For Sheet1 it works as shown:
bq query --format=csv --use_legacy_sql=false 'SELECT * FROM `cegx-test-project1.multi_sheet_test.Sheet1`' 2>/dev/null
col1,col2
col1,col2
foo,bar
abc,def
If I try the same for Sheet2, without the external_table_definition flag, I get:
bq query --format=csv --use_legacy_sql=false 'SELECT * FROM `cegx-test-project1.multi_sheet_test.Sheet2`' 2>/dev/null
col3,col4
col1,col2
foo,bar
abc,def
So it falls back to the first worksheet instead of querying Sheet2. So far, this is as expected.
I then created an equivalent google_sheets_sheet2_tabledef.json file like this:
{
  "autodetect": true,
  "sourceFormat": "GOOGLE_SHEETS",
  "sourceUris": [
    "https://docs.google.com/spreadsheets/d/1P1WH7cwVDaG6k-OQxKVXtnjBXI1NGFYvHD6IxCRFsZc"
  ],
  "maxBadRecords": 1,
  "googleSheetsOptions": {
    "range": "Sheet2!A1:B10",
    "skipLeadingRows": 0
  },
  "schema": {
    "fields": [
      {"name": "col3", "type": "string"},
      {"name": "col4", "type": "string"}
    ]
  }
}
Notice that the range parameter points to Sheet2!A1:B10.
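Since the only things that differ between worksheets are the range and the column names, the definition can be generated rather than edited by hand. The sketch below mirrors the fields of the file above. The helper make_tabledef, the placeholder spreadsheet URL, and the temporary output directory are illustrative choices, not something bq requires.

```python
import json
import tempfile
from pathlib import Path


def make_tabledef(sheet_url, sheet_name, columns, cell_range="A1:B10"):
    """Build a table definition dict for one worksheet (hypothetical helper)."""
    return {
        "autodetect": True,
        "sourceFormat": "GOOGLE_SHEETS",
        "sourceUris": [sheet_url],
        "maxBadRecords": 1,
        "googleSheetsOptions": {
            # The worksheet name prefix selects the tab, e.g. "Sheet2!A1:B10".
            "range": f"{sheet_name}!{cell_range}",
            "skipLeadingRows": 0,
        },
        "schema": {"fields": [{"name": c, "type": "string"} for c in columns]},
    }


# Placeholder URL; substitute the real spreadsheet ID.
url = "https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxx"
sheets = {"Sheet1": ["col1", "col2"], "Sheet2": ["col3", "col4"]}

out_dir = Path(tempfile.mkdtemp())
paths = {}
for name, cols in sheets.items():
    path = out_dir / f"google_sheets_{name.lower()}_tabledef.json"
    path.write_text(json.dumps(make_tabledef(url, name, cols), indent=2))
    paths[name] = path
    print(name, "->", path)
```

Each generated file can then be passed to the external_table_definition flag under the matching table name.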
If I query the table again, now with the external_table_definition flag, I get:
bq query --external_table_definition="Sheet2::/tmp/google_sheets_sheet2_tabledef.json" --format=csv --use_legacy_sql=false 'SELECT * FROM `cegx-test-project1.multi_sheet_test.Sheet2`' 2>/dev/null
col3,col4
col1,col2
foo,bar
abc,def
So I still get the results from Sheet1, even with the external definition file pointing to the range in Sheet2.
Any ideas what is going wrong?
Recommended answer
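It looks like the command you used to query a temporary external table definition was not actually referring to the external table. The query selects from cegx-test-project1.multi_sheet_test.Sheet2, which is the permanent table in the dataset, rather than from Sheet2, the temporary table name given in the external_table_definition flag.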
If you change the command to the following, it should work as expected:
bq query --external_table_definition="Sheet2::/tmp/google_sheets_sheet2_tabledef.json" --format=csv --use_legacy_sql=false 'SELECT * FROM Sheet2' 2>/dev/null
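The detail that matters here is that the name to the left of :: in the external_table_definition flag is the name the query has to use. As a sketch of that rule, the hypothetical helper below builds the bq argument list from a single table name so the two cannot drift apart. It only prints the command and does not call bq.

```python
import shlex


def bq_external_query_args(table_name, tabledef_path, columns="*"):
    """Build bq arguments that query a temporary external table (hypothetical helper)."""
    return [
        "bq", "query",
        f"--external_table_definition={table_name}::{tabledef_path}",
        "--format=csv",
        "--use_legacy_sql=false",
        # Reference the temporary table by its bare name, not project.dataset.table.
        f"SELECT {columns} FROM {table_name}",
    ]


args = bq_external_query_args("Sheet2", "/tmp/google_sheets_sheet2_tabledef.json")
print(shlex.join(args))
```

On a machine where bq is installed and authenticated, the same list could be passed to subprocess.run instead of being printed.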