Question
I have an HTML string, and I'm trying to generate an array of all the substrings that occur between two sets of characters.
My string looks something like this:
<h2>The Phantom Menace</h2>
<p>Two Jedi escape a hostile blockade to find allies and come across a young boy who may bring balance to the Force, but the long dormant Sith resurface to claim their original glory.</p>
<h2>Attack of the Clones</h2>
<p>Ten years after initially meeting, Anakin Skywalker shares a forbidden romance with Padmé Amidala, while Obi-Wan Kenobi investigates an assassination attempt on the senator and discovers a secret clone army crafted for the Jedi.</p>
<h2>Revenge of the Sith</h2>
<p>Three years into the Clone Wars, the Jedi rescue Palpatine from Count Dooku. As Obi-Wan pursues a new threat, Anakin acts as a double agent between the Jedi Council and Palpatine and is lured into a sinister plan to rule the galaxy.</p>
<h2>A New Hope</h2>
<p>Luke Skywalker joins forces with a Jedi Knight, a cocky pilot, a Wookiee and two droids to save the galaxy from the Empire's world-destroying battle station, while also attempting to rescue Princess Leia from the mysterious Darth Vader.</p>
<h2>The Empire Strikes Back</h2>
<p>After the Rebels are brutally overpowered by the Empire on the ice planet Hoth, Luke Skywalker begins Jedi training with Yoda, while his friends are pursued by Darth Vader and a bounty hunter named Boba Fett all over the galaxy.</p>
<h2>Return of the Jedi</h2>
<p>After a daring mission to rescue Han Solo from Jabba the Hutt, the Rebels dispatch to Endor to destroy the second Death Star. Meanwhile, Luke struggles to help Darth Vader back from the dark side without falling into the Emperor's trap.</p>
<h2>The Force Awakens</h2>
<p>As a new threat to the galaxy rises, Rey, a desert scavenger, and Finn, an ex-stormtrooper, must join Han Solo and Chewbacca to search for the one hope of restoring peace.</p>
<h2>The Last Jedi</h2>
<p>Rey develops her newly discovered abilities with the guidance of Luke Skywalker, who is unsettled by the strength of her powers. Meanwhile, the Resistance prepares for battle with the First Order.</p>
<h2>The Rise of Skywalker</h2>
<p>The surviving members of the resistance face the First Order once again, and the legendary conflict between the Jedi and the Sith reaches its peak bringing the Skywalker saga to its end.</p>
I want to create an array of the substrings found between <h2> and </h2> to get the following result:
["The Phantom Menace", "Attack of the Clones", "Revenge of the Sith", "A New Hope", "The Empire Strikes Back", "Return of the Jedi", "The Force Awakens", "The Last Jedi", "The Rise of Skywalker"]
Is there a variation of this code where I can specify both the opening and the closing tag, so that only the text between them is returned?
let titles = htmlInput.components(separatedBy:"<h2>")
This returns an array with elements like this:
"The Phantom Menace</h2>
<p>Two Jedi escape a hostile blockade to find allies and come across a young boy who may bring balance to the Force, but the long dormant Sith resurface to claim their original glory.</p>"
Any help is welcome.
Thanks
Recommended answer
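You can use the regular expression (?<=<h2>)(.*?)(?=</h2>). The lookbehind (?<=<h2>) and the lookahead (?=</h2>) require the tags to be present but keep them out of the match, so each match is only the text between them. By default . does not match line breaks, so this pattern fits titles that sit on a single line.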
Example:
import Foundation

let input: String = ...
// Lookbehind and lookahead keep the <h2> tags out of each match.
let expr = "(?<=<h2>)(.*?)(?=</h2>)"
do {
    let regex = try NSRegularExpression(pattern: expr)
    // NSRegularExpression works with NSRange, so bridge to NSString for lengths and substrings.
    let nsString = input as NSString
    let results = regex.matches(in: input, range: NSRange(location: 0, length: nsString.length))
    print(results.map { nsString.substring(with: $0.range) })
} catch let error {
    print("invalid regex: \(error.localizedDescription)")
}
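Since the question asks for a variation where the two delimiters can be passed in, the same idea could be wrapped in a small helper. This is only a sketch: the function name substrings(in:between:and:) and the sample string are made up for illustration and are not part of Foundation. It assumes the delimiters are plain literal strings, which is why they are escaped before being placed in the pattern, and it opts into .dotMatchesLineSeparators so that a match may span several lines.

```swift
import Foundation

/// Returns every substring of `text` that sits between `start` and `end`.
/// Hypothetical helper for illustration; not a Foundation API.
func substrings(in text: String, between start: String, and end: String) -> [String] {
    // Escape the delimiters so characters such as "[" or "." are matched literally.
    let open = NSRegularExpression.escapedPattern(for: start)
    let close = NSRegularExpression.escapedPattern(for: end)
    let pattern = "(?<=\(open))(.*?)(?=\(close))"

    // Let "." match line breaks too, so multi-line content is captured.
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.dotMatchesLineSeparators]) else {
        return []
    }
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)
    return regex.matches(in: text, range: fullRange).map { nsText.substring(with: $0.range) }
}

// Shortened sample input, assumed for the demonstration.
let sample = "<h2>The Phantom Menace</h2>\n<p>Two Jedi escape.</p>\n<h2>Attack of the Clones</h2>"
let titles = substrings(in: sample, between: "<h2>", and: "</h2>")
print(titles) // ["The Phantom Menace", "Attack of the Clones"]
```

Called as substrings(in: htmlInput, between: "<p>", and: "</p>"), the same helper would collect the paragraph text instead. For anything beyond simple, well-formed markup like the sample above, a real HTML parser is likely to be more robust than a regular expression.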