
Question

I have this kind of JSON file:

{
"params": {
    "apiKey": "key",
    "sessionId": "123433890",
    "lang": "en",
    "timezone": "America/New_York",
    "query": "hi all",
    "latitude": "37.459157",
    "longitude": "-122.17926",
    "context": "[{"
     name ": "
     weather ","
     lifespan ": 4}]"
}

}

Because of

"context": "[{"
     name ": "
     weather ","
     lifespan ": 4}]"

I cannot decode it with json_decode.

So I wonder whether it is possible to decode only the first-level keys. The result would then look something like this:

    array(1) {
  'parameters' =>
  array(8) {
    'apiKey' =>
    string(32) "key"
    'sessionId' =>
    string(10) "123433890"
    'lang' =>
    string(2) "en"
    'timezone' =>
    string(16) "America/New_York"
    'query' =>
    string(16) "hi all"
    'latitude' =>
    string(9) "37.459157"
    'longitude' =>
    string(10) "-122.17926"
    'context' =>
    string(16) "[{"name ": "weather ","lifespan ": 4}]"
  }
}

Thanks!

Also, this is valid JSON, but it cannot be decoded with json_decode either:

    {
    "query": [
        "and for tomorrow"
    ],
    "contexts": "[{'name':'weather', 'lifespan' : 4}]",
    "location": {
        "latitude": 37.459157,
        "longitude": -122.17926
    },
    "timezone": "America/New_York",
    "lang": "en",
    "sessionId": "1234567890"
}

Recommended answer

Your JSON is indeed not valid. It should look like this:

{
  "params": {
    "apiKey": "key",
    "sessionId": "123433890",
    "lang": "en",
    "timezone": "America/New_York",
    "query": "hi all",
    "latitude": "37.459157",
    "longitude": "-122.17926",
    "context": [{"name":"weather","lifespan": 4}]
  }
}

The error is that the value of the context key was put in quotes. It should not have been, because it is not a string but a nested structure (an array containing an object).

If you have no control over the file and cannot fix it at the source, you could use the following code, which tries to repair the JSON after you have read it:

// Invalid JSON as read from your file:
$json = '{
  "params": {
    "apiKey": "key",
    "sessionId": "123433890",
    "lang": "en",
    "timezone": "America/New_York",
    "query": "hi all",
    "latitude": "37.459157",
    "longitude": "-122.17926",
    "context": "[{"
     name ": "
     weather ","
     lifespan ": 4}]"
  }
}';
// Fix malformed JSON
$json = preg_replace_callback('~"([\[{].*?[}\]])"~s', function ($match) {
    return preg_replace('~\s*"\s*~', "\"", $match[1]);
}, $json);
// Now you can do:
$arr = json_decode($json, true);

After running the above code, $arr will contain this:

array (
  'params' => array (
    'apiKey' => 'key',
    'sessionId' => '123433890',
    'lang' => 'en',
    'timezone' => 'America/New_York',
    'query' => 'hi all',
    'latitude' => '37.459157',
    'longitude' => '-122.17926',
    'context' => array (
      array (
        'name' => 'weather',
        'lifespan' => 4,
      ),
    ),
  ),
)

See it run on eval.in.

Note how the context property now also holds structured information (an array).
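The question asked for only the first level to be decoded, with context kept as a string. One possible way to get there, as a minimal sketch that is not part of the original answer, is to decode the repaired JSON and then re-encode any nested array with json_encode. The shortened input below is an assumption made for illustration:

```php
<?php
// Assumption: a shortened version of the repaired JSON from above.
$arr = json_decode(
    '{"params":{"lang":"en","context":[{"name":"weather","lifespan":4}]}}',
    true
);

// Turn every nested array under "params" back into a JSON string,
// so that only the first level stays decoded.
foreach ($arr['params'] as $key => $value) {
    if (is_array($value)) {
        $arr['params'][$key] = json_encode($value);
    }
}

var_dump($arr['params']['context']);
```

Scalar values such as lang are left untouched; only array values are converted back to strings.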

First, the following pattern is searched for:

~"([\[{].*?[}\]])"~s

The ~ characters are just delimiters for the regular expression. Then:

  • ": matches a double quote
  • ( ... ): defines the part that we actually want to capture. We want to remove the outermost double quotes, so they are not inside these parentheses.
  • [\[{]: matches either one of these literal characters: [ {
  • .*?: matches any characters, but no more than necessary for the rest of the pattern to match (the ? makes it non-greedy, i.e. lazy).
  • [}\]]: matches either one of these literal characters: } ]
  • s: this is a modifier that makes the . also match newline characters
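To see what the pattern described above captures, here is a small sketch using preg_match on a shortened sample string. The sample is an assumption made for illustration, not the file from the question:

```php
<?php
$pattern = '~"([\[{].*?[}\]])"~s';

// Shortened malformed sample: the quoted value spans two lines.
$sample = "{\"a\": \"x\", \"context\": \"[{\"\n name \": 4}]\"}";

preg_match($pattern, $sample, $match);
var_dump($match[0]); // the complete match, including the outer double quotes
var_dump($match[1]); // the captured part, without the outer double quotes

// Without the s modifier the . cannot cross the newline, so nothing matches.
$withoutS = preg_match('~"([\[{].*?[}\]])"~', $sample);
var_dump($withoutS);
```

The ordinary string "x" is not matched, because its opening quote is not followed by [ or {.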

For every match, preg_replace_callback calls the function we pass as the second argument and passes it an array. The first element of that array is the complete match, while the second is the captured part, i.e. the part between the parentheses, which is the one we are interested in:

$match[1]

We apply a second regular expression to that, which removes all white space around double quotes, including newlines. This way the key names, like name, end up tightly wrapped in double quotes, as they should be:

~\s*"\s*~

Again, the ~ characters are just delimiters for the regular expression.

  • \s*: matches any amount of white space, including newlines
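The effect of this inner replacement can be tried in isolation. The sketch below assumes $captured holds what the outer pattern captured from the question's file (the variable names are made up for illustration):

```php
<?php
// Assumption: the captured part of the outer match, line breaks included.
$captured = "[{\"\n     name \": \"\n     weather \",\"\n     lifespan \": 4}]";

// Remove all white space directly before and after every double quote.
$fixed = preg_replace('~\s*"\s*~', '"', $captured);

echo $fixed, PHP_EOL;
var_dump(json_decode($fixed, true));
```

Be aware that this also trims white space next to the quotes of string values inside the captured part, so a value that is meant to start or end with a space would lose it.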

The string modified in this way must be returned to the outer preg_replace_callback function, which inserts it into the final result string.
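Because the repair is only a heuristic, it seems prudent to check whether decoding actually succeeded. The following is a sketch under that assumption; the helper name decodeRepairedJson is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper: applies the repair shown above, then decodes
// and fails loudly if the result is still not usable JSON.
function decodeRepairedJson(string $json): array
{
    $fixed = preg_replace_callback('~"([\[{].*?[}\]])"~s', function ($match) {
        return preg_replace('~\s*"\s*~', '"', $match[1]);
    }, $json);

    if ($fixed === null) {
        throw new RuntimeException('Regular expression failed: ' . preg_last_error());
    }

    $arr = json_decode($fixed, true);
    if (!is_array($arr)) {
        throw new RuntimeException('JSON could not be repaired: ' . json_last_error_msg());
    }
    return $arr;
}

// Shortened malformed input, assumed for illustration.
$arr = decodeRepairedJson(
    "{\"params\": {\"lang\": \"en\", \"context\": \"[{\"\n name \": \"\n weather \",\"\n lifespan \": 4}]\"}}"
);
var_dump($arr['params']['context'][0]['lifespan']);
```

Throwing an exception is just one option; returning null or logging the error would work as well, depending on how the caller wants to handle unreadable files.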

Of course, if you do have control over the file, or over how it is generated, then fix the cause of this issue there.

Note that valid JSON does not use single quotes to delimit strings. They must be double quotes.
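A quick way to check the single-quote rule is to feed json_decode both variants. This is a small sketch with made-up sample strings:

```php
<?php
// Single quotes as string delimiters: not valid JSON.
$single = json_decode("{'name': 'weather'}", true);
$singleError = json_last_error();
var_dump($single);

// Double quotes as string delimiters: valid JSON.
$double = json_decode('{"name": "weather"}', true);
var_dump($double);

// Single quotes inside a double-quoted string are ordinary characters,
// so the outer document decodes and the value stays a plain string.
$outer = json_decode('{"contexts": "[{\'name\':\'weather\'}]"}', true);
var_dump($outer['contexts']);

// That inner string is not valid JSON on its own, though.
$inner = json_decode($outer['contexts'], true);
var_dump($inner);
```

So a document whose only single quotes sit inside a double-quoted string value can still be decoded at its first level; it is the embedded single-quoted text that json_decode rejects.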
