问题描述
我的外部json数据无效,名称周围没有双引号.
示例:
{
data: [
{
idx: 0,
id: "0",
url: "http://247wallst.com/",
a: [
{
t: "Title",
u: "http://247wallst.com/2012/07/30/",
sp: "About"
}
],
doc_id: "9386093612452939480"
},
{
idx: 1,
id: "-1"
}
],
results_per_page: 10,
total_number_of_news: 76,
news_per_month: [20, 0, 8, 1, 1, 2, 0, 2, 1, 0, 0, 1, 1, 0, 5, 1, 1, 1, 0, 2, 5, 16, 7, 1],
result_start_num: 2,
result_end_num: 2,
result_total_articles: 76
}
如您所见,很多名称(例如data,idx,id,url和其他名称)都没有用双引号引起来,因此这使此json无效.如何使此外部json有效?我已经尝试过str_replace,将'{'替换为{{'',将':'替换为':',在未加引号的名称周围添加双引号,但这会弄乱一些已经被双引号引起来的变量.
如何使此json有效,以便可以使用PHP json_decode读取此数据?我对preg_replace不太熟悉.
有效的json看起来像:
{
"data": [
{
"idx": 0,
"id": "0",
"url": "http://247wallst.com/",
"a": [
{
"t": "Title",
"u": "http://247wallst.com/2012/07/30/",
"sp": "About"
}
],
"doc_id": "9386093612452939480"
},
{
"idx": 1,
"id": "-1"
}
],
"results_per_page": 10,
"total_number_of_news": 76,
"news_per_month": [20, 0, 8, 1, 1, 2, 0, 2, 1, 0, 0, 1, 1, 0, 5, 1, 1, 1, 0, 2, 5, 16, 7, 1],
"result_start_num": 2,
"result_end_num": 2,
"result_total_articles": 76
}
请建议我一些php preg_replace函数.
数据源: http://www.google.com /finance/company_news?q = aapl& output = json& start = 1& num = 1
使用preg_replace
,您可以执行以下操作:
json_decode(preg_replace('#(?<pre>\{|\[|,)\s*(?<key>(?:\w|_)+)\s*:#im', '$1"$2":', $in));
由于上面的示例不适用于真实数据(战斗计划很少在与敌人第一次接触后才能幸存),这是我的第二个看法:
$infile = 'http://www.google.com/finance/company_news?q=aapl&output=json&start=1&num=1';
// first, get rid of the \x26 and other encoded bytes.
$in = preg_replace_callback('/\\\x([0-9A-F]{2})/i',
function($match){
return chr(intval($match[1], 16));
}, file_get_contents($infile));
$out = $in;
// find key candidates
preg_match_all('#(?<=\{|\[|,)\s*(?<key>(?:\w|_)+?)\s*:#im', $in, $m, PREG_OFFSET_CAPTURE);
$replaces_so_far = 0;
// check each candidate if its in a quoted string or not
foreach ($m['key'] as $match) {
$position = $match[1] + ($replaces_so_far * 2); // every time you expand one key, offsets need to be shifted with 2 (for the two " chars)
$key = $match[0];
$quotes_before = preg_match_all('/(?<!\\\)"/', substr($out, 0, $position), $m2);
if ($quotes_before % 2) { // not even number of not-escaped quotes, we are in quotes, ignore candidate
continue;
}
$out = substr_replace($out, '"'.$key.'"', $position, strlen($key));
++$replaces_so_far;
}
var_export(json_decode($out, true));
但是由于google会在RSS feed中提供此数据,因此我建议您使用该数据(如果它适用于您的用例),这只是出于娱乐目的(-:
I have invalid external json data, without double quotes around names.
Example:
{
data: [
{
idx: 0,
id: "0",
url: "http://247wallst.com/",
a: [
{
t: "Title",
u: "http://247wallst.com/2012/07/30/",
sp: "About"
}
],
doc_id: "9386093612452939480"
},
{
idx: 1,
id: "-1"
}
],
results_per_page: 10,
total_number_of_news: 76,
news_per_month: [20, 0, 8, 1, 1, 2, 0, 2, 1, 0, 0, 1, 1, 0, 5, 1, 1, 1, 0, 2, 5, 16, 7, 1],
result_start_num: 2,
result_end_num: 2,
result_total_articles: 76
}
As you see a lot of names like data,idx,id,url and others are not double quoted, so this makes this json invalid.How can I make this external json valid? I already tried str_replace, replacing '{' to '{"' and ':' to '":' adding double quotes around unquoted names, but this messes up some already double quoted variables.
How can I make this json valid so I can read this data with PHP json_decode? I'm not very familiar with preg_replace..
Valid json will look like:
{
"data": [
{
"idx": 0,
"id": "0",
"url": "http://247wallst.com/",
"a": [
{
"t": "Title",
"u": "http://247wallst.com/2012/07/30/",
"sp": "About"
}
],
"doc_id": "9386093612452939480"
},
{
"idx": 1,
"id": "-1"
}
],
"results_per_page": 10,
"total_number_of_news": 76,
"news_per_month": [20, 0, 8, 1, 1, 2, 0, 2, 1, 0, 0, 1, 1, 0, 5, 1, 1, 1, 0, 2, 5, 16, 7, 1],
"result_start_num": 2,
"result_end_num": 2,
"result_total_articles": 76
}
Please suggest me some php preg_replace function.
Data source:http://www.google.com/finance/company_news?q=aapl&output=json&start=1&num=1
With preg_replace
you can do:
json_decode(preg_replace('#(?<pre>\{|\[|,)\s*(?<key>(?:\w|_)+)\s*:#im', '$1"$2":', $in));
Since the above example won't work with real data (the battle plans seldom survive first contact with the enemy) heres my second take:
$infile = 'http://www.google.com/finance/company_news?q=aapl&output=json&start=1&num=1';
// first, get rid of the \x26 and other encoded bytes.
$in = preg_replace_callback('/\\\x([0-9A-F]{2})/i',
function($match){
return chr(intval($match[1], 16));
}, file_get_contents($infile));
$out = $in;
// find key candidates
preg_match_all('#(?<=\{|\[|,)\s*(?<key>(?:\w|_)+?)\s*:#im', $in, $m, PREG_OFFSET_CAPTURE);
$replaces_so_far = 0;
// check each candidate if its in a quoted string or not
foreach ($m['key'] as $match) {
$position = $match[1] + ($replaces_so_far * 2); // every time you expand one key, offsets need to be shifted with 2 (for the two " chars)
$key = $match[0];
$quotes_before = preg_match_all('/(?<!\\\)"/', substr($out, 0, $position), $m2);
if ($quotes_before % 2) { // not even number of not-escaped quotes, we are in quotes, ignore candidate
continue;
}
$out = substr_replace($out, '"'.$key.'"', $position, strlen($key));
++$replaces_so_far;
}
var_export(json_decode($out, true));
But since google offers this data in RSS feed, i would recommend you to use that one if it works for your usecase, this is just for fun (-:
这篇关于PHP使用json_decode()读取无效的json;的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!