Stanford Parser
- Constituency and dependency
- Java, with Python and Ruby interfaces
- GPL license
- By Chris Manning et al
| English, Chinese, German, Arabic, Italian, Bulgarian, and Portuguese | - Part of the Stanford CoreNLP toolkit
- It is a package of three kinds of parsers: a PCFG (probabilistic context-free grammar) parser, a lexicalized dependency parser, and a lexicalized PCFG parser
- Parsing accuracy ranks consistently high in surveys
- Good documentation
- The PCFG parser is based on the CKY algorithm
- However, the dependency parser is an exhaustive dependency parser with O(n^4) complexity, which makes it much slower than linear-time O(n) dependency parsers
| | Yes (frequent releases) |
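Several parsers in this list (the Stanford PCFG parser, the Bikel parser, pfp) are described as CKY-based. As a rough illustration of that algorithm, and not the code of any of these tools, the sketch below runs probabilistic CKY over a toy grammar in Chomsky normal form. The grammar, the probabilities, and the name `cky_best_prob` are all assumptions made up for this example.

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form (hypothetical rules and probabilities).
BINARY = {  # (left child, right child) -> [(parent, rule probability)]
    ("NP", "VP"): [("S", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
    ("Det", "N"): [("NP", 0.6)],
}
LEXICAL = {  # word -> [(preterminal, rule probability)]
    "the": [("Det", 1.0)],
    "dog": [("N", 0.5)],
    "cat": [("N", 0.5)],
    "saw": [("V", 1.0)],
    "she": [("NP", 0.4)],
}


def cky_best_prob(words, start="S"):
    """Return the probability of the best parse rooted in `start`, or 0.0."""
    n = len(words)
    # chart[i][j] maps a nonterminal to its best probability over words[i:j].
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]

    # Width-1 spans come straight from the lexicon.
    for i, word in enumerate(words):
        for tag, prob in LEXICAL.get(word, []):
            chart[i][i + 1][tag] = max(chart[i][i + 1][tag], prob)

    # Wider spans combine two adjacent, already-filled sub-spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point
                for left, p_left in list(chart[i][k].items()):
                    for right, p_right in list(chart[k][j].items()):
                        for parent, p_rule in BINARY.get((left, right), []):
                            candidate = p_rule * p_left * p_right
                            if candidate > chart[i][j][parent]:
                                chart[i][j][parent] = candidate
    return chart[0][n].get(start, 0.0)
```

Calling `cky_best_prob("she saw the dog".split())` returns a non-zero probability, while a word order the toy grammar cannot derive, such as `"dog the saw"`, yields `0.0`. The three nested loops over span width, start position, and split point are what give CKY its cubic running time in sentence length for a fixed grammar.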
Collins and Bikel Parser
- Constituency parser
- Java
- Free for research
- By Dan Bikel (UPenn) and Mike Collins (Columbia)
| English, Chinese, Arabic | - An improvement on the Collins parser
- Based on the CYK algorithm (source code)
- Lexicalized PCFG
- state-of-the-art performance for English
| | No (since 2008) |
Berkeley parser
- Constituency parser
- Java
- GPL
- Slav Petrov and Dan Klein
| English, Bulgarian, Arabic, Chinese, French, German | - Based on hierarchical coarse-to-fine parsing, where a sequence of grammars is considered
- No need for language-specific adaptations; the PCFG is induced automatically
- state-of-the-art performance for English on the Penn Treebank
| | Yes (infrequent changes) |
Charniak-Johnson Parser
- Constituency parser
- C
- Eugene Charniak (Brown Univ) and Mark Johnson
| English | - Based on discriminative reranking, dynamic programming
- Lexicalized n-best PCFG: for each sentence, constructs a set of 50-best parses using a heuristic coarse-to-fine generative parser
- Estimates the reranker feature weights using MaxEnt, Averaged Perceptron, etc.
- State of the art performance on English
| | Yes (infrequent changes) |
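The reranking idea in the Charniak-Johnson entry can be sketched in a few lines: a first-stage parser proposes n-best candidates, and a linear model with learned feature weights picks among them. The sketch below trains such weights with an averaged perceptron. It is only an illustration under assumptions: the feature names, the two-candidate toy data (the real system uses 50-best lists), and the function names are invented here and are not taken from the parser's code.

```python
from collections import defaultdict


def score(features, weights):
    """Linear score: dot product of a sparse feature dict with the weights."""
    return sum(weights[name] * value for name, value in features.items())


def rerank(candidates, weights):
    """candidates: list of (parse_id, feature dict). Return the best pair."""
    return max(candidates, key=lambda cand: score(cand[1], weights))


def train_averaged_perceptron(data, epochs=5):
    """data: list of (candidates, gold_parse_id). Return averaged weights."""
    weights = defaultdict(float)
    totals = defaultdict(float)  # running sum of weights after every step
    steps = 0
    for _ in range(epochs):
        for candidates, gold_id in data:
            pred_id, pred_feats = rerank(candidates, weights)
            if pred_id != gold_id:
                # Move weights toward the gold parse, away from the prediction.
                gold_feats = dict(candidates)[gold_id]
                for name, value in gold_feats.items():
                    weights[name] += value
                for name, value in pred_feats.items():
                    weights[name] -= value
            steps += 1
            for name, value in weights.items():
                totals[name] += value
    return defaultdict(float, {n: t / steps for n, t in totals.items()})


# Hypothetical n-best list: the generative model prefers p1, the gold is p2.
toy_candidates = [
    ("p1", {"log_prob": -1.0, "right_branching": 0.0}),
    ("p2", {"log_prob": -2.0, "right_branching": 1.0}),
]
toy_weights = train_averaged_perceptron([(toy_candidates, "p2")])
best_id, _ = rerank(toy_candidates, toy_weights)
```

After training on the single toy example, `best_id` is `"p2"`: the reranker has learned to override the first-stage ranking. Averaging the weights over all steps, rather than keeping the final vector, is the usual way to reduce the perceptron's sensitivity to the order of training examples.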
Link Grammar Parser
- Dependency parser
- C, with bindings for Ruby, Python, Perl, Java and OCaml
- BSD license
- Davy Temperley, John Lafferty and Daniel Sleator (CMU)
- Dom Lachowicz, Linas Vepstas (AbiWord)
| Persian, Arabic, Chinese, German, Russian | - Based on lexicons of link grammar (similar to IBM Watson’s English slot grammar parser). Its English dictionary has 70k+ words
- Produces both dependencies (labelled links connecting pairs of words) and constituents (Penn Treebank style phrase tree)
- Performance is comparable to the Stanford PCFG parsing model, and is 3+ times faster than the Stanford lexicalized model.
- 10+ extensions, including FrameNet-style framing, reference (anaphora) resolution and natural language generation
- However, it is rigid about grammar and may fail when a sentence is grammatically incomplete or ill-formed
- Very good documentation
| | Yes (frequent releases) |
NLTK Parser
- Constituency and dependency
- Python
- Apache License
- Steven Bird
| English, German, Chinese, Japanese | - Very good documentation, various books available. Widely adopted in education and web application development
- Very easy to use, clean API interface
- Part of a complete set of NLP tools covering major NLP needs
- Constituency parser with PCFG
- Dependency parser using a shift-reduce algorithm, based on a CFG
- However, its parser implementation is less optimized
| | Yes (very active) |
MiniPar
- Dependency parser
- C and Lisp, with Java binding in GATE
- free of charge for non-commercial use
- Dekang Lin
| English | - One of the early dependency parsers
- After 15+ years, it is slightly worse than state-of-the-art parsers
- Code is small and easy to extend
- Its dependency relations may be useful in designing a new parser
| | No (since 1994) |
RASP
- C and Common Lisp
- Constituency and dependency
- LGPL
- John Carroll et al (Sussex and Cambridge)
| English | - RASP = Robust Accurate Statistical Parsing
- fully domain-independent automated training
- integration of statistical techniques and incremental grammar rule induction
- state-of-the-art performance
| | Yes (infrequent releases) |
MaltParser
| English, French, Swedish | - Shift-reduce algorithm (automaton-based)
- Inductive dependency parsing that learns from a treebank
- Very fast: linear time parsing
- State-of-the-art performance on accuracy
| | Yes (frequent releases) |
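The shift-reduce, linear-time behaviour mentioned for MaltParser (and for DeSR) can be illustrated with the arc-standard transition system, one of several transition systems used in this family of parsers. The sketch below is not MaltParser's code: a real parser trains a classifier on a treebank to choose each transition, whereas here a gold-standard oracle stands in for the classifier. The function name and the example sentence are assumptions for illustration.

```python
from collections import deque


def arc_standard_oracle(heads):
    """Replay the arc-standard transitions that build a projective tree.

    heads[i] is the head of token i + 1 (tokens are numbered from 1 and
    0 is the artificial root). Returns (transitions, arcs), where each
    arc is a (head, dependent) pair.
    """
    n = len(heads)
    head = {i + 1: h for i, h in enumerate(heads)}
    # pending[t] = number of dependents of t that are not attached yet.
    pending = {t: 0 for t in range(n + 1)}
    for h in heads:
        pending[h] += 1

    stack, buffer = [0], deque(range(1, n + 1))
    transitions, arcs = [], []
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # LEFT-ARC: top becomes the head of the item below it.
            if below != 0 and head[below] == top and pending[below] == 0:
                arcs.append((top, below))
                pending[top] -= 1
                del stack[-2]
                transitions.append("LEFT-ARC")
                continue
            # RIGHT-ARC: the item below becomes the head of top, but only
            # once top has collected all of its own dependents.
            if head[top] == below and pending[top] == 0:
                arcs.append((below, top))
                pending[below] -= 1
                stack.pop()
                transitions.append("RIGHT-ARC")
                continue
        if not buffer:
            raise ValueError("tree is non-projective or not well-formed")
        stack.append(buffer.popleft())  # SHIFT the next word onto the stack
        transitions.append("SHIFT")
    return transitions, arcs


# "She saw the dog": She <- saw, saw <- ROOT, the <- dog, dog <- saw
transitions, arcs = arc_standard_oracle([2, 0, 4, 2])
```

For this four-word sentence the oracle emits eight transitions: each word is shifted exactly once and attached exactly once, so a sentence of n words always takes 2n steps. That fixed step count is where the linear-time claim comes from, provided each classifier decision takes constant time.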
DeSR
- Dependency parser
- C++ with Python binding
- GPL
- Giuseppe Attardi
| Italian, English, French, and 10+ others | - Part of the Tanl project
- Shift-reduce dependency parser that can handle non-projective dependencies
- Deterministic parsing, very fast (linear time)
- fully labeled dependency trees
- training with Multi Layer Perceptron, Averaged Perceptron, Maximum Entropy, SVM, memory-based learning using TiMBL
- Among the best on English labeled dependency parsing
| | Yes (frequent releases) |
MSTParser
- Dependency parser
- Java
- Jason Baldridge and Ryan McDonald (UPenn)
| English, Chinese and 10+ other languages | - MST = Maximum Spanning Tree, based on a graph algorithm
- Supports online learning
- State-of-the-art performance, comparable to MaltParser
- Outperforms MaltParser on longer dependencies, but is typically slower
| | No (since 2007) |
DepParse
- Dependency parser
- Python
- MIT License
- Leif Johnson (UT Austin)
| English | - Includes a maximum spanning tree (MST) parser and a stack-based, shift-reduce parser
- Supports data parallelism on multicore machines
- performance has not been evaluated
- Self-contained, easy to extend
| | No (since 2010) |
pfp
- Constituency parser
- C++ and Python
- GPL
- Erik Frey, Norman Casagrande et al (Wavii Inc)
| English | - pfp = pretty fast statistical parser
- Uses a PCFG and the CYK algorithm
- 3-4x faster than the Stanford parser, and uses 5-8x less resident memory
- Thread-safe/multi-core support
| | Yes [1] |
MBSP
- Shallow (dependency) parsing
- Python
- GPL and Commercial
| English | - Memory-Based Shallow Parser, based on the TiMBL and MBT memory-based learning applications
- No need for manual pattern or grammar definition
- Client-server architecture
- Does shallow parsing
- Shares an API with Pattern
- Can be used together with DeSR and NLTK
| | Yes |
OpenNLP Parser
- Constituency parser
- Java
- Apache License (An Apache project)
| English | - A chunking parser (relatively simple)
- Can be used with UIMA
| | Yes |
Senna
- Constituency parser
- C
- a non-commercial license
| English | - Uses deep learning
- Very small codebase (3500 lines)
- syntactic parsing
- State-of-the-art performance
| |