Pattern-based unsupervised parsing method.
Jesus Santamaria, Lourdes Araujo
Natural Language Engineering, 22: 397-422 (2016)

We have developed a heuristic method for unsupervised parsing of unrestricted text. Our method relies on
detecting certain patterns in the part-of-speech tag sequences of sentences. This detection is based on
statistical data obtained from the corpus and allows us to classify part-of-speech tags into classes that play
specific roles in the parse trees. These classes are then used to construct the parse tree of new sentences
via a set of deterministic rules. Aiming to assess the viability of the method on different languages, we have
tested it on English, Spanish, Italian, Hebrew, German, and Chinese. We have obtained a significant improvement
over other unsupervised approaches for some languages, including English, and provided, as far as we know,
the first results of this kind for others.