Ruby: Parse simple markdown files (similar, but not identical structure) and populate an object's attributes with the contents

I have a folder full of markdown files. I want to read each of them into the following Ruby object:

    class File
      attr_accessor :title, :description, :content
    end

The markdown files usually look like this:

    # This is the title

    This is some description.

    And even more description.

    ## This is an h2

    Bla bla.

    ## This is another h2

    More bla bla.

    ### This is even an h3

    Again, more bla bla.

    ## Again, an h2

    etc. etc.

This should result in this Ruby object:

    File:
      h1: "This is the title"
      description: "This is some description.\n\nAnd even more description."
      content: "## This is an h2...etc. etc."

To assign the content of the file to the Ruby object's attributes, I could simply use a regular expression that extracts the title (the first H1), the description (the text between the H1 and the following H2), and the content (all the rest).

But the files do not always look exactly like this:

- Sometimes, there is no H1 (if so, the file name will be used for the title)
- Sometimes, there is no description
- Sometimes, there is no content

These exceptions can occur in combination, e.g. a file without H1 and description:

    ## This is an h2

    Bla bla.

    ## This is another h2

    More bla bla.

This should result in this Ruby object:

    File:
      h1: nil
      description: nil
      content: "## This is an h2...More bla bla."

Or a file with an H1 but no description:

    # This is the title

    ## This is an h2

    Bla bla.

This should result in this Ruby object:

    File:
      h1: "This is the title"
      description: nil
      content: "## This is an h2...Bla bla."

Or a file with no H1, but a description:

    This is a description.

    Some more description.

    ## This is an h2

    Bla bla.

This should result in this Ruby object:

    File:
      h1: nil
      description: "This is a description...Some more description."
      content: "## This is an h2...Bla bla."

I wonder whether I can do this using a single fancy regular expression (I'm no expert in that), or whether I should split it into several processing steps. I asked a similar question here: Markdown: Regex to find all content following an heading #2 (but stop at another heading #2), but I couldn't get the regex to run properly in Ruby with the exceptions described above.

Any idea how to solve this problem is highly welcome. Thank you.

PS: I also thought about parsing the markdown with a markdown parser and then using Nokogiri or something similar to parse the result. But this feels like way too much overhead for such a basically simple requirement.
Solution

Given your examples:

    examples = []

    examples << <<-EOS
    # This is the title
    This is some description.
    And even more description.
    ## This is an h2
    Bla bla.
    ## This is another h2
    More bla bla.
    ### This is even an h3
    Again, more bla bla.
    ## Again, an h2
    etc. etc.
    EOS

    examples << <<-EOS
    ## This is an h2
    Bla bla.
    ## This is another h2
    More bla bla.
    EOS

    examples << <<-EOS
    # This is the title
    ## This is an h2
    Bla bla.
    EOS

    examples << <<-EOS
    This is a description.
    Some more description.
    ## This is an h2
    Bla bla.
    EOS

You can do this:

    examples.each do |text|
      # $1: optional H1 line, $2: text up to the next heading, $3: everything else
      text =~ /\A(?:(?:^#(?!#)([^\n]*))?(.*?)(?=^#|\z))?(.*)\z/m
      title, description, content = [$1, $2, $3].map { |s|
        s.strip! if s
        s unless (s && s.empty?)  # turn empty captures into nil
      }

      puts <<-EOS
    File:
      title: #{title.inspect}
      description: #{description.inspect}
      content: #{content.inspect}
      EOS
    end

Note: The regexp doesn't care about the number of consecutive newlines.

Which gives you:

    File:
      title: "This is the title"
      description: "This is some description.\nAnd even more description."
      content: "## This is an h2\nBla bla.\n## This is another h2\nMore bla bla.\n### This is even an h3\nAgain, more bla bla.\n## Again, an h2\netc. etc."
    File:
      title: nil
      description: nil
      content: "## This is an h2\nBla bla.\n## This is another h2\nMore bla bla."
    File:
      title: "This is the title"
      description: nil
      content: "## This is an h2\nBla bla."
    File:
      title: "This is the title"
      description: "This is some description."
      content: nil
    File:
      title: nil
      description: "This is a description.\nSome more description."
      content: "## This is an h2\nBla bla."

The fourth result comes from an input with a title and a description but no content, which is not among the examples listed above.
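To connect the regexp back to the original goal (one object per file, with the file name as a fallback title), the following is a minimal sketch rather than a definitive implementation. The names `MarkdownDoc`, `presence`, `parse_markdown`, and `load_markdown_dir` are hypothetical and not part of the question or answer; a separate struct is assumed here so that Ruby's core `File` class is not reopened. The pattern is the same as above, only with named groups.

```ruby
# Hypothetical container (an assumption); avoids reopening Ruby's core File class.
MarkdownDoc = Struct.new(:title, :description, :content)

# Same pattern as in the answer, rewritten with named groups.
PATTERN = /\A(?:(?:^#(?!#)(?<title>[^\n]*))?(?<description>.*?)(?=^#|\z))?(?<content>.*)\z/m

# Returns the stripped string, or nil when it is missing or blank.
def presence(str)
  stripped = str.to_s.strip
  stripped.empty? ? nil : stripped
end

# Parses markdown text. fallback_title is used when the text has no H1.
def parse_markdown(text, fallback_title = nil)
  m = PATTERN.match(text)
  MarkdownDoc.new(
    presence(m[:title]) || fallback_title,
    presence(m[:description]),
    presence(m[:content])
  )
end

# Reads every .md file in dir; the file name (without extension)
# becomes the title when a file has no H1.
def load_markdown_dir(dir)
  Dir.glob(File.join(dir, "*.md")).sort.map do |path|
    parse_markdown(File.read(path), File.basename(path, ".md"))
  end
end
```

A call such as `load_markdown_dir("docs")` (the directory name is only an example) would return an array of `MarkdownDoc` objects. Keeping `parse_markdown` separate from the file reading makes the regexp easy to exercise on plain strings like the examples above.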
