Mikhail Korobov / russian-tagsets

Commits

Mikhail Korobov  committed 0b0328d

some pymorphy2-specific conversion rules for RNC

  • Parent commits b2faae4
  • Branches default

Files changed (2)

File russian_tagsets/rule_engine.py

     for from_, to_ in parsed_rules:
         from_set = set(from_)
 
-        if not from_set.issubset(grammemes):
+        if not from_set <= grammemes:
             # rule doesn't apply
             continue
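
The change above swaps `issubset()` for the `<=` operator. For two sets both forms give the same answer; the difference is that `issubset()` accepts any iterable, while `<=` requires a set on the right-hand side. A minimal sketch of that difference follows. The helper `rule_applies` is a hypothetical name used only for illustration, not a function from `rule_engine.py`.

```python
def rule_applies(from_, grammemes):
    """Return True if every grammeme in `from_` is present in `grammemes`.

    Hypothetical helper: `grammemes` must be a set (or frozenset),
    because the `<=` operator does not accept arbitrary iterables.
    """
    return set(from_) <= grammemes


tag = {"NUMB", "Romn"}

# For two sets, the operator and the method agree.
assert rule_applies(["NUMB"], tag)
assert set(["NUMB"]).issubset(tag)
assert not rule_applies(["NUMB", "intg"], tag)

# issubset() also accepts a plain list ...
assert {"NUMB"}.issubset(["NUMB", "Romn"])

# ... while `<=` raises TypeError when the right-hand side is not a set.
try:
    rule_applies(["NUMB"], ["NUMB", "Romn"])
except TypeError:
    print("`<=` needs a set on both sides")
```

So the operator form is equivalent here only as long as `grammemes` is already a set at this point in the loop.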
 

File russian_tagsets/ruscorpora.py

 # extra grammemes
 LATN => NONLEX
 
+# non-standard & pymorphy2-specific
+# (RNC doesn't tag punctuation marks)
+PNCT => PNCT
+NUMB,Romn => ANUM,ciph
+NUMB => NUM,ciph
+
 # hack to preserve whitespace info:
 | => =
 """)