问题描述
如何将非结构化数据转换成结构化数据?例如,电子邮件联系人,从非结构化文本到结构化格式。
How can I convert unstructured data into structured data? For example email contacts, from an unstructured text, to a structured format.
有没有任何算法?
推荐答案
没有一种通用算法来取非结构化数据并将其转换为结构化数据,否则。它非常依赖于可能的输入范围,以及期望的结构以及需要应用的转换等。
There's no generic algorithm to "take unstructured data and convert it to structured data", no. It's highly dependent on what the possible range of input is, and what the desired structure is, and what conversions need to be applied, etc.
问题称为解析:您需要为您期望的特定输入构造一个解析器,并使用该解析器从您发现的输入中生成结构。
The class of problem is called "parsing": you need to construct a parser for the specific inputs you expect, and use that parser to generate structure from what it discovers about the input you get.
您的编程语言可能会有解析库可用于协助构建特定解析器。
Your programming language will likely have parsing libraries available to assist with constructing a specific parser.
这篇关于将非结构化数据转换为结构化数据?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!