I am trying to import a json file using the function:
sku = pandas.read_json('https://cws01.worldstores.co.uk/api/product.php?product_sku=125T:FT0111')
但是,我一直得到以下错误
ValueError:数组的长度必须相同
如何将其正确导入数据帧?
这是json的结构:
{
"id": "5",
"sku": "JOSH:BECO-BRN",
"last_updated": "2013-06-10 15:46:22",
...
"propertyType1": [
"manufacturer_colour"
],
"category": [
{
"category_id": "10",
"category_name": "All Products"
},
...
{
"category_id": "238",
"category_name": "All Sofas"
}
],
"root_categories": [
"516"
],
"url": "/p/Beco Suede Sofa Bed?product_id=5",
"item": [
"2"
],
"image_names": "[\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/L\\/19\\/Beco_Suede_Sofa_Bed-1.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/P\\/19\\/Beco_Suede_Sofa_Bed-1.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/SP\\/19\\/Beco_Suede_Sofa_Bed-1.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/SS\\/19\\/Beco_Suede_Sofa_Bed-1.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/ST\\/19\\/Beco_Suede_Sofa_Bed-1.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/WP\\/19\\/Beco_Suede_Sofa_Bed-1.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/L\\/19\\/Beco_Suede_Sofa_Bed-2.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/P\\/19\\/Beco_Suede_Sofa_Bed-2.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/SP\\/19\\/Beco_Suede_Sofa_Bed-2.jpg\",\"https:\\/\\/cdn.worldstores.co.uk \\/images\\/products\\/SS\\/19\\/Beco_Suede_Sofa_Bed-2.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/ST\\/19\\/Beco_Suede_Sofa_Bed-2.jpg\",\"https:\\/\\/cdn.worldstores.co.uk\\/images\\/products\\/WP\\/19\\/Beco_Suede_Sofa_Bed-2.jpg\"]"
}
最佳答案
。
由于您没有指定json文件的格式(pandas.read_json
属性),pandas将默认认为您的数据是columnar。。
The data that you are trying to parse from The orient=
function
。。
。
编辑:
Simple way of getting multiple rows and parsing it with the read_json
function:
import urllib2
import pandas as pd
url_base = "https://cws01.worldstores.co.uk/api/product.php?product_sku={}"
products = ["125T:FT0111", "125T:FT0111", "125T:FT0111"]
raw_data_list = []
for sku in products:
url = url_base.format(sku)
raw_data_list.append(urllib2.urlopen(url).read())
data = "[" + (",".join(raw_data_list)) + "]"
data = pd.read_json(data, orient='records')
data
/
。
https://cws01.worldstores.co.uk/api/product.php?product_sku=125T:FT0111是熊猫试图将尽可能多的功能塞进单个功能的又一个光辉例子。。
Series
如果您的数据是
pandas.read_json
,pandas.read_json
默认为Series
The values allowed for orient while parsing a
pandas.read_json(orient=)
are: 'index'
。
数据帧
If your data is a
Series
, {'split','records','index'}
defaults to orient='index'
The values allowed for orient while parsing a
DataFrame
are:pandas.read_json(orient=)
请注意,对于
'columns'
和DataFrame
,序列索引必须是唯一的;对于{'split','records','index','columns','values'}
、orient='index'
和orient='columns'
,DataFrame列必须是唯一的。格式
No matter if your data is a
orient='index'
or a orient='columns'
, the orient='records'
will expect data in the same format:分裂
需要dict的字符串表示形式,如
DataFrame
构造函数所采用的:{"index":[1,2,3,4], "columns":["col1","col2"], "data":[[8,7,6,5], [5,6,7,8]]}
记录
需要一个dict列表的字符串表示形式,如:
[{"col1":8,"col2":5},{"col1":7,"col2":6},{"col1":6,"col2":7},{"col1":5,"col2":8}]
。
Index
Expects a string representation of a nested dict dict like:
{"1":{"col1":8,"col2":5},"2":{"col1":7,"col2":6},"3":{"col1":6,"col2":7},"4":{"col1":5,"col2":8}}
。可能在以后的版本中修复。
柱
Expects a string representation of a nested dict like:
{"col1":{"1":8,"2":7,"3":6,"4":5},"col2":{"1":5,"2":6,"3":7,"4":8}}
Values
Expects a string representation of a list like:
[[8, 5],[7, 6],[6, 7],[5, 8]]
Resulting dataframe
In most cases, the dataframe you get will look like this, with the json strings above:
col1 col2
1 8 5
2 7 6
3 6 7
4 5 8
关于python - 在 Pandas 中使用read_json导入单个记录,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/31656040/