TalkBank MOR Grammars

The MOR program automatically tags corpora in the CHAT format, writing its analysis to the %mor line. This requires a separate MOR grammar for each language. After analysis with MOR, users can run the POST program to disambiguate the %mor line. We provide a POST disambiguation database for English, but for other languages, users will need to train a POST database themselves. The whole system is described in the MOR Manual as well as in a book chapter on morphosyntactic analysis in CLAN.

We have working MOR grammars for these languages:

  • Cantonese (yue): This grammar was built by Brian MacWhinney, Sam Po Law, and Anthony Kong with additional help from a Cantonese-English lexicon provided by K. K. Luke.
  • Chinese (zho): This grammar was built by Brian MacWhinney and Twila Tardif. Thanks to K. J. Chen and the CKIP Group of the Academia Sinica for the input lexicon.
  • Chinese segmenter and Trad<->Simp converter
  • Dutch (nld): This grammar was contributed by Steven Gillis and Jan Odijk.
  • English (eng): This grammar was built by Brian MacWhinney and Mitzi Morris.
  • French (fra): This is a revised version of the grammar contributed by Christophe Parisse.
  • German (deu): This grammar was developed by Nikolas Koch.
  • Hebrew (heb): This grammar was developed by Aviad Albert, Bracha Nir, Shuly Wintner, Brian MacWhinney, and Ruth Berman.
  • Japanese (jpn): This grammar was constructed by Norio Naka and Susanne Miyata. The distribution includes the Wakachi system from Susanne Miyata for grammatical reference.
  • Italian (ita): This grammar was built by Livia Tonelli and Brian MacWhinney.
  • Spanish (spa): This grammar was built by Brian MacWhinney.

These grammars also include POST databases created by Christophe Parisse's POSTTRAIN program. After MOR finishes, POST runs automatically to disambiguate the output of MOR. After this, the grammars for English, Hebrew, Japanese, Mandarin, and Spanish also run the MEGRASP program to automatically create a dependency grammar analysis on the %gra line. However, the accuracy of these analyses varies across languages because some still need more training data.
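
For readers who want to inspect the resulting tiers programmatically, the following is a minimal Python sketch of how %mor and %gra lines might be read. It is not part of CLAN. The sample tiers are invented for illustration, the helper names parse_mor and parse_gra are hypothetical, and the sketch assumes the common layout of part-of-speech|stem items on %mor and index|head|relation items on %gra. The tag sets actually produced depend on the grammar and MOR version in use.

```python
# Illustrative only: these sample tiers are made up for this sketch and are
# not taken from any TalkBank corpus. Tag names vary by grammar and version.
MOR_TIER = "%mor:\tpro:sub|I v|want n|cookie-PL ."
GRA_TIER = "%gra:\t1|2|SUBJ 2|0|ROOT 3|2|OBJ 4|2|PUNCT"


def tier_body(tier):
    """Return the text after the tier label (e.g. after '%mor:')."""
    return tier.split(":", 1)[1].strip()


def parse_mor(tier):
    """Split a %mor tier into (part_of_speech, stem_and_affixes) pairs.

    Items without a '|' (such as the final punctuation) are skipped.
    """
    words = []
    for item in tier_body(tier).split():
        if "|" not in item:
            continue
        pos, rest = item.split("|", 1)
        words.append((pos, rest))
    return words


def parse_gra(tier):
    """Split a %gra tier into (index, head_index, relation) triples."""
    triples = []
    for item in tier_body(tier).split():
        index, head, relation = item.split("|")
        triples.append((int(index), int(head), relation))
    return triples


if __name__ == "__main__":
    print(parse_mor(MOR_TIER))
    print(parse_gra(GRA_TIER))
```

In this layout a head index of 0 marks the root of the dependency analysis, so the root of an utterance can be found by filtering the triples for head == 0.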

To help those interested in building their own MOR grammars, we provide these two examples of minMOR grammars. One is the basic example and the other indicates how to build a grammar that targets only a few word forms, such as the German article.

We also have a few grammars in preparation: