Problem description
I have the following list of HTML inputs. The list has a nested structure:
- Level 1 contains the names of the inputs (e.g. input1).
- Level 2 contains some information about each input: name, attribs, children.
- Level 3 branches off children, which is a list of length 2. The first element contains information about the input's label and the second contains information about the type of input. Since I need the input labels, I need to extract the first element of this list for each input.
The list:
library(purrr)
inputs = list(
  input1 = list(
    name = 'div',
    attribs = list(class = 'form-group'),
    children = list(list(name = 'label',
                         attribs = list(`for` = 'email'),
                         children = list('Email')),
                    list(
                      list(name = 'input',
                           attribs = list(id = 'email', type = 'text'),
                           children = list()))
    )))
str(inputs)
List of 1
$ input1:List of 3
..$ name : chr "div"
..$ attribs :List of 1
.. ..$ class: chr "form-group"
..$ children:List of 2
.. ..$ :List of 3
.. .. ..$ name : chr "label"
.. .. ..$ attribs :List of 1
.. .. .. ..$ for: chr "email"
.. .. ..$ children:List of 1
.. .. .. ..$ : chr "Email"
.. ..$ :List of 1
.. .. ..$ :List of 3
.. .. .. ..$ name : chr "input"
.. .. .. ..$ attribs :List of 2
.. .. .. .. ..$ id : chr "email"
.. .. .. .. ..$ type: chr "text"
.. .. .. ..$ children: list()
I am able to do this using keep() and has_element():
label = inputs %>%
  map_depth(2, ~keep(., ~has_element(., 'label'))) %>%
  map('children') %>%
  flatten %>%
  map('children') %>%
  flatten
Output:
str(label)
List of 1
$ input1: chr "Email"
When I was looking through the purrr help pages, keep seemed to be the function I was after, but I still had to use map and flatten twice to get to the label, which seems clumsy. Is there a more direct way to achieve the same output? I am less interested in the solution itself than in the thought process behind working with nested lists like these.
Recommended answer
If every input has the same structure, then you don't need keep, which is used to remove list elements that don't meet some condition. Instead, you can just map over the inputs with pluck, like this. Note that this approach discards all the other data associated with each input. If the end goal is "rectangling", i.e. getting all the information for each input into a flat structure, you may want to do something different.
library(purrr)
inputs = list(
  input1 = list(
    name = 'div',
    attribs = list(class = 'form-group'),
    children = list(
      list(
        name = 'label',
        attribs = list(`for` = 'email'),
        children = list('Email')
      ),
      list(
        list(
          name = 'input',
          attribs = list(id = 'email', type = 'text'),
          children = list()
        )
      )
    )
  )
)
inputs %>%
  map(~ pluck(., "children", 1, "name"))
#> $input1
#> [1] "label"
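The call above returns the tag name ("label"). To reach the label text that the question asks for ("Email"), the same idea can be extended by walking one level further down. The following is only a sketch, assuming every input has exactly the structure shown (label first, input tag nested one list deeper in the second child). The names labels and flat are illustrative and not part of the original answer, and the small data frame is just one possible way to "rectangle" the data.

```r
library(purrr)

# Same example data as above, repeated so this block runs on its own.
inputs <- list(
  input1 = list(
    name = "div",
    attribs = list(class = "form-group"),
    children = list(
      list(name = "label",
           attribs = list(`for` = "email"),
           children = list("Email")),
      list(
        list(name = "input",
             attribs = list(id = "email", type = "text"),
             children = list())
      )
    )
  )
)

# Path to the label text:
# children -> 1st child (the label tag) -> its children -> 1st element.
# map_chr() returns a named character vector instead of a list.
labels <- map_chr(inputs, ~ pluck(.x, "children", 1, "children", 1))

# A hypothetical flat ("rectangled") view, one row per input.
# The input tag sits one list deeper, hence the extra index 1 after 2.
flat <- data.frame(
  input = names(inputs),
  label = unname(labels),
  id    = map_chr(inputs, ~ pluck(.x, "children", 2, 1, "attribs", "id")),
  type  = map_chr(inputs, ~ pluck(.x, "children", 2, 1, "attribs", "type")),
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(labels)
print(flat)
```

The thought process is to write down the path from the top of each element to the value you want and hand that path to pluck, rather than filtering at every level. If the inputs did not all share the same shape, a positional path like this would no longer be safe, and a condition-based tool such as keep would be the better fit.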