This article covers how to handle patterns with multi-term entries in the IN attribute.

Problem description

I am extending a spaCy model using rules. While looking through the documentation, I noticed the IN property, which matches a token attribute against a list of allowed values. This is great; however, it only works on single tokens.

For example, this pattern: {"label":"EXAMPLE","pattern":[{"LOWER": {"IN": ["such as", "like", "for example"]}}]} matches only the term "like", not the others.

What is the best way to achieve the same result for multi-term entries?

Recommended answer

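It depends on how complicated the intended patterns are. Each dictionary in a token pattern describes exactly one token, so a multi-word string inside IN can never match. The PhraseMatcher can handle cases like the one above using the attribute LOWER: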

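import spacy
from spacy.matcher import PhraseMatcher

# A blank English pipeline is enough: only the tokenizer is needed.
nlp = spacy.blank("en")
# attr="LOWER" compares lowercased token text, so matching is case-insensitive.
pmatcher = PhraseMatcher(nlp.vocab, attr="LOWER")
phrases = ["such as", "like", "for example"]
# Each phrase becomes a Doc pattern and may span several tokens.
pmatcher.add("EXAMPLE", [nlp(x) for x in phrases])
# Each match is (match_id, start, end); match_id is the hash of "EXAMPLE".
assert pmatcher(nlp("Things Such As Books")) == [(15373972490796046842, 1, 3)]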
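
Since the pattern in the question uses the {"label": ..., "pattern": ...} format of the entity ruler, the same idea can probably be applied there directly. The following is a minimal sketch, assuming spaCy v3 or later (where components are added by string name) and assuming that entities in doc.ents are the desired output. String patterns are handled as phrases rather than token patterns, and the phrase_matcher_attr setting makes them case-insensitive.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained components needed.
nlp = spacy.blank("en")

# Assumes spaCy v3+. phrase_matcher_attr="LOWER" makes string (phrase)
# patterns match on lowercased token text.
ruler = nlp.add_pipe("entity_ruler", config={"phrase_matcher_attr": "LOWER"})

# String patterns are treated as phrases, so they may span several tokens.
phrases = ["such as", "like", "for example"]
ruler.add_patterns([{"label": "EXAMPLE", "pattern": p} for p in phrases])

doc = nlp("Things Such As Books")
ents = [(ent.text, ent.label_, ent.start, ent.end) for ent in doc.ents]
print(ents)  # [('Such As', 'EXAMPLE', 1, 3)]
```

If the patterns need per-token conditions beyond plain text, an alternative is to write one token pattern per phrase, for example [{"LOWER": "such"}, {"LOWER": "as"}], and add them all under the same label.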
