Problem description
Does anyone know how to resolve this file-reading error in TreeTagger, a common natural language processing tool used to POS-tag, lemmatize, and chunk sentences?
alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.
I didn't run into any of the installation problems hinted at in http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt. I followed the instructions on the webpage (http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/#Linux) and the installation appeared to complete properly:
alvas@ikoma:~$ mkdir treetagger
alvas@ikoma:~$ cd treetagger
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-linux-3.2.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tagger-scripts.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/install-tagger.sh
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/dutch-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/german-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/italian-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/spanish-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/french-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ sh install-tagger.sh
Linux version of TreeTagger installed.
Tagging scripts installed.
German parameter file (Linux, UTF8) installed.
German chunker parameter file (Linux) installed.
French parameter file (Linux, UTF8) installed.
French chunker parameter file (Linux, UTF8) installed.
Italian parameter file (Linux, UTF8) installed.
Spanish parameter file (Linux, UTF8) installed.
Dutch parameter file (Linux, UTF8) installed.
Path variables modified in tagging scripts.
You might want to add /home/alvas/treetagger/cmd and /home/alvas/treetagger/bin to the PATH variable so that you do not need to specify the full path to run the tagging scripts.
But when I try to test the software, I get the following errors:
alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.
alvas@ikoma:~/treetagger$ echo 'Das ist ein Test.' | cmd/tagger-chunker-german
ERROR: Can't open for reading: /home/alvas/treetagger/lib/german-chunker.par
aborted.
ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.
Recommended answer
I think there are two problems. First, the scripts should have "-utf8" in their name, e.g. cmd/tagger-chunker-german-utf8, because you downloaded the UTF-8 data. Second, tagging and chunking each require their own data file. See the homepage, which has the sections "Parameter files for PC" and "Chunker parameter files for PC": download the files from both sections, then re-execute install-tagger.sh.
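To confirm which parameter files are actually missing before downloading anything, a small check along these lines may help. It is only a sketch: it assumes the install directory is ~/treetagger as in the question, and both the TREETAGGER_HOME variable and the check_par helper are names made up here for illustration, not part of TreeTagger.

```shell
#!/bin/sh
# Assumed variable name; defaults to the install path used in the question.
TREETAGGER_HOME="${TREETAGGER_HOME:-$HOME/treetagger}"

# Hypothetical helper: report whether a parameter file is readable in lib/.
# $1 = parameter file name, e.g. english.par
check_par() {
    if [ -r "$TREETAGGER_HOME/lib/$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the files named in the error messages from the question.
for f in english.par german.par german-chunker.par; do
    check_par "$f" || true
done
```

Any file reported as missing is one the invoked script expects but install-tagger.sh never placed in lib/, either because the matching archive was not downloaded or because the script being run is not the variant that matches the downloaded data.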