


What is the best way to convert JSON to XML and back? For example, the JSON below

{
    "user": "gerry",
    "likes": [1, 2, 4],
    "followers": [
        {
            "name": "megan"
        },
        {
            "name": "pupkin"
        }
    ]
}


could be converted into XML like this (#1):

<?xml version="1.0" encoding="UTF-8" ?>


<?xml version="1.0" encoding="UTF-8"?>


In particular, the difference arises when converting arrays; converting object properties is trivial. I am sure there are other ways to convert JSON to XML as well.


So the question is: What is the best way? Are there any standards?


Another question: is there a way to express the conversion mapping itself in some formal, mathematical way? E.g., is it possible to describe a mapping such that a conversion function, given the JSON object and the mapping object, would know exactly which XML to produce, and could reverse the conversion too?

XML_1 = convert(JSON, mapping_1)
XML_2 = convert(JSON, mapping_2)
JSON  = convert(XML_1, mapping_1)
JSON  = convert(XML_2, mapping_2)
JSON  = convert(XML_1, mapping_2) # Error!
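
One way to picture such a mapping, as a minimal sketch only: the mapping records how arrays are encoded, which keys are arrays, and how to cast values back. The names `to_xml`, `from_xml`, and the mapping keys `arrays`, `array_keys`, and `types` are made up for illustration, and the sketch only handles flat objects with scalar values and arrays of scalars.

```python
import xml.etree.ElementTree as ET

def to_xml(data, mapping, root_tag="root"):
    """Serialize a flat dict; mapping["arrays"] picks the array encoding."""
    root = ET.Element(root_tag)
    for key, value in data.items():
        if isinstance(value, list):
            if mapping["arrays"] == "wrapped":
                # <likes><item>1</item><item>2</item></likes>
                parent = ET.SubElement(root, key)
                for item in value:
                    ET.SubElement(parent, "item").text = str(item)
            else:
                # <likes>1</likes><likes>2</likes>
                for item in value:
                    ET.SubElement(root, key).text = str(item)
        else:
            ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

def from_xml(xml_text, mapping):
    """Reverse to_xml using the same mapping."""
    result = {}
    for child in ET.fromstring(xml_text):
        cast = mapping["types"].get(child.tag, str)
        if child.tag in mapping["array_keys"]:
            if mapping["arrays"] == "wrapped":
                result[child.tag] = [cast(item.text) for item in child]
            else:
                if len(child):
                    raise ValueError("nested items found; wrong mapping?")
                result.setdefault(child.tag, []).append(cast(child.text))
        else:
            result[child.tag] = cast(child.text)
    return result

data = {"user": "gerry", "likes": [1, 2, 4]}
mapping_1 = {"arrays": "wrapped", "array_keys": {"likes"}, "types": {"likes": int}}
mapping_2 = {"arrays": "repeated", "array_keys": {"likes"}, "types": {"likes": int}}

xml_1 = to_xml(data, mapping_1)
xml_2 = to_xml(data, mapping_2)
print(from_xml(xml_1, mapping_1) == data)
print(from_xml(xml_2, mapping_2) == data)
try:
    from_xml(xml_1, mapping_2)
except ValueError as exc:
    print("Error:", exc)
```

Note that the reverse direction only works because the mapping carries information the XML itself does not: which keys are arrays and what type the values had.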



You're obviously interested in the theory behind data serialization. I'll try to explain using the following headings.

  • The problem with XML as a data serialization format
  • Why other formats are preferred
  • It's really about information and relationships


What I'm leading to is an introduction to the Semantic web and how it expresses the same data in several different formats.

As you've discovered, there are several ways to structure data in XML. This is because XML started life as a document markup language. XML has no built-in way to describe simple data structures like lists or hashes.


<data>
  <user name="gerry"/>
</data>


This can be deserialized as a simple hash:

data.user.name = "gerry"


or less obviously as a list of hashes:

data.user[0].name = "gerry"


The fact is that a different XML document could specify multiple user tags:

<data>
  <user name="gerry"/>
  <user name="tom"/>
</data>
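
To make the ambiguity concrete, here is a small sketch of a schema-less converter. The function `naive_to_dict` is hypothetical, and the `<data>` root element is assumed from the `data.user.name` notation above. It has to guess the structure from what it happens to see, so the shape of the result changes with the number of `user` tags.

```python
import xml.etree.ElementTree as ET

def naive_to_dict(xml_text):
    """A tag seen once becomes a dict; a tag seen again becomes a list."""
    result = {}
    for child in ET.fromstring(xml_text):
        attrs = dict(child.attrib)
        if child.tag not in result:
            result[child.tag] = attrs
        elif isinstance(result[child.tag], list):
            result[child.tag].append(attrs)
        else:
            result[child.tag] = [result[child.tag], attrs]
    return result

one = naive_to_dict('<data><user name="gerry"/></data>')
two = naive_to_dict('<data><user name="gerry"/><user name="tom"/></data>')
print(one)  # user is a single hash
print(two)  # user is a list of hashes
```

Code that consumes the result would have to handle both shapes, which is exactly the problem a schema is meant to remove.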



XML schema to the rescue

The solution to this problem was to design a separate schema specification that describes how the document is formatted:

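<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="data">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="user" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:string" name="name" use="optional"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>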


The user tag is described as part of a sequence and may occur any number of times (maxOccurs="unbounded"). This enables XML parsers to store the information in a list construct.
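
As a rough sketch of how schema information removes the guesswork, the code below reads only the `maxOccurs` attribute from a simplified version of the schema and uses it to decide which tags become lists. The helpers `repeatable_elements` and `to_dict` are hypothetical, and a real schema-aware parser does far more (validation, types, namespaces, references).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Simplified schema, reduced to what this sketch needs.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="data">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="user" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="name" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

def repeatable_elements(schema_text):
    """Names of elements the schema allows to occur more than once."""
    names = set()
    for el in ET.fromstring(schema_text).iter(XS + "element"):
        if el.get("maxOccurs", "1") not in ("0", "1"):
            names.add(el.get("name"))
    return names

def to_dict(xml_text, repeatable):
    """Repeatable tags always become lists, even with a single occurrence."""
    result = {}
    for child in ET.fromstring(xml_text):
        attrs = dict(child.attrib)
        if child.tag in repeatable:
            result.setdefault(child.tag, []).append(attrs)
        else:
            result[child.tag] = attrs
    return result

repeatable = repeatable_elements(SCHEMA)
single = to_dict('<data><user name="gerry"/></data>', repeatable)
print(single)  # user is a list even though only one tag is present
```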


This is the approach taken by many web service frameworks which process XML data. The message format is described in the WSDL/XML schema and the programming code that processes the message is generated automatically.

Formats like JSON and YAML are specifically designed to serialize data. They don't require schema documents in order to parse data unambiguously.
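
A short illustration using Python's standard `json` module: the syntax itself says whether a value is a hash or a list, so the parser never has to guess. The two sample documents are made up for the example.

```python
import json

one = json.loads('{"user": {"name": "gerry"}}')
many = json.loads('{"user": [{"name": "gerry"}, {"name": "tom"}]}')

# The brackets in the text decide the type, not the number of entries.
print(type(one["user"]).__name__)
print(type(many["user"]).__name__)
```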


Even so, JSON and YAML don't solve every problem. While the data is more obvious at first glance, neither format comes with a built-in standard for describing data structures.

Earlier I vilified XML schemas, but they can be really useful for determining whether a piece of data is programmatically usable (valid) or not. Even so, an XML Schema does not tell me the relationship between one piece of data and another.


The Semantic web movement is an attempt to create a self-describing and collaborative internet. The problem is that (IMHO) the associated standards are complex and difficult to understand and apply. The place to start is RDF:


It's designed as a generic information interchange format and cleverly works in a manner that is independent of how the data is actually serialized.

Your simple example, expressed as RDF/XML:

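<?xml version="1.0"?>
<rdf:RDF xmlns:user="http://myspotontheweb.com/user/1.0/" xmlns:ex="http://myspotontheweb.com/example/user/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="http://myspotontheweb.com/example/user/1">
        <user:name>gerry</user:name>
        <user:likes>1</user:likes>
        <user:likes>2</user:likes>
        <user:likes>4</user:likes>
    </rdf:Description>
    <rdf:Description rdf:about="http://myspotontheweb.com/example/user/2">
        <user:name>tom</user:name>
        <user:likes>2</user:likes>
        <user:likes>4</user:likes>
        <user:likes>6</user:likes>
        <user:follows rdf:resource="http://myspotontheweb.com/example/user/1" />
    </rdf:Description>
    <rdf:Description rdf:about="http://myspotontheweb.com/example/user/3">
        <user:name>felix</user:name>
        <user:likes>3</user:likes>
        <user:likes>5</user:likes>
        <user:follows rdf:resource="http://myspotontheweb.com/example/user/1" />
    </rdf:Description>
</rdf:RDF>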


Each item of data has a unique identifier and a custom set of attributes:

  • name
  • likes
  • follows: used to link one RDF entity to another.

XML is just one way to express RDF. I prefer the more compact N3 RDF format:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix user: <http://myspotontheweb.com/user/1.0/> .
@prefix ex: <http://myspotontheweb.com/example/user/> .

ex:1 user:name "gerry" .
ex:1 user:likes "1" .
ex:1 user:likes "2" .
ex:1 user:likes "4" .

ex:2 user:name "tom" .
ex:2 user:likes "2" .
ex:2 user:likes "4" .
ex:2 user:likes "6" .
ex:2 user:follows ex:1 .

ex:3 user:name "felix" .
ex:3 user:likes "3" .
ex:3 user:likes "5" .
ex:3 user:follows ex:1 .


Again, note the custom prefix declarations at the top and the clear statement of what each piece of data ("triple" in RDF parlance) represents. I think this demonstrates that it's about information, not data format!
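
Each statement in the listing above is a (subject, predicate, object) triple. As a minimal sketch, with no RDF library involved, a subset of those triples can be held as plain Python tuples and queried directly. The helpers `objects` and `followers_of` are made up for illustration, and IRIs and literals are both kept as plain strings for brevity.

```python
EX = "http://myspotontheweb.com/example/user/"
USER = "http://myspotontheweb.com/user/1.0/"

# A subset of the statements above, one tuple per triple.
triples = [
    (EX + "1", USER + "name", "gerry"),
    (EX + "2", USER + "name", "tom"),
    (EX + "2", USER + "follows", EX + "1"),
    (EX + "3", USER + "name", "felix"),
    (EX + "3", USER + "follows", EX + "1"),
]

def objects(subject, predicate):
    """All objects stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def followers_of(subject):
    """Names of every entity that follows the given subject."""
    ids = [s for s, p, o in triples if p == USER + "follows" and o == subject]
    return [objects(s, USER + "name")[0] for s in ids]

print(followers_of(EX + "1"))
```

The query never looks at how the data was serialized; it only needs the triples.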

And for completeness, the same RDF information presented in JSON-LD format:

{
  "@graph": [
    {
      "@id": "http://myspotontheweb.com/example/user/3",
      "http://myspotontheweb.com/user/1.0/follows": {
        "@id": "http://myspotontheweb.com/example/user/1"
      },
      "http://myspotontheweb.com/user/1.0/likes": [
        "3",
        "5"
      ],
      "http://myspotontheweb.com/user/1.0/name": "felix"
    },
    {
      "@id": "http://myspotontheweb.com/example/user/2",
      "http://myspotontheweb.com/user/1.0/follows": {
        "@id": "http://myspotontheweb.com/example/user/1"
      },
      "http://myspotontheweb.com/user/1.0/likes": [
        "2",
        "4",
        "6"
      ],
      "http://myspotontheweb.com/user/1.0/name": "tom"
    },
    {
      "@id": "http://myspotontheweb.com/example/user/1",
      "http://myspotontheweb.com/user/1.0/likes": [
        "1",
        "2",
        "4"
      ],
      "http://myspotontheweb.com/user/1.0/name": "gerry"
    }
  ]
}


  • There are several ways to express RDF as JSON. See JSON + RDF
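
To illustrate that the serialization is secondary to the statements themselves, here is a hedged sketch that groups a few statements by subject into a JSON-LD-style `@graph`. The function `to_json_ld` and the fourth tuple field (a flag marking whether the object is an IRI) are assumptions made for this example; a real RDF library would handle contexts, datatypes, and blank nodes.

```python
import json

EX = "http://myspotontheweb.com/example/user/"
USER = "http://myspotontheweb.com/user/1.0/"

# (subject, predicate, object, object_is_iri)
triples = [
    (EX + "1", USER + "name", "gerry", False),
    (EX + "1", USER + "likes", "1", False),
    (EX + "1", USER + "likes", "2", False),
    (EX + "2", USER + "name", "tom", False),
    (EX + "2", USER + "follows", EX + "1", True),
]

def to_json_ld(triples):
    """Group statements by subject; repeated predicates become lists."""
    nodes = {}
    for s, p, o, is_iri in triples:
        node = nodes.setdefault(s, {"@id": s})
        value = {"@id": o} if is_iri else o
        if p not in node:
            node[p] = value
        elif isinstance(node[p], list):
            node[p].append(value)
        else:
            node[p] = [node[p], value]
    return {"@graph": list(nodes.values())}

doc = to_json_ld(triples)
print(json.dumps(doc, indent=2))
```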


Once the information is expressed as RDF, its relationships to other data entities can be graphed visually:


The Semantic web goes a lot further; it only starts with RDF. There are XML Schema-like standards for publishing well-understood relationships between triples. Using these, one can start to manipulate RDF data in very interesting ways.


I don't claim to be an expert in data processing. What I do acknowledge is that some very clever people have been looking at this problem for some time. The concepts are tough to learn, but worthwhile in order to better understand information theory.

