Cross-lingual parsing at VarDial 2017

This repository includes the data sets and baseline models for the shared task on cross-lingual dependency parsing at VarDial 2017 in Valencia, Spain, held in connection with EACL. The task is to develop parsing models for selected target languages using no annotated training data in the target language, only annotated data in one or two closely related source languages. At VarDial 2017 we include the following language pairs:

  • target language = Croatian; source language = Slovenian
  • target language = Slovak; source language = Czech
  • target language = Norwegian; source languages = Danish and Swedish

The data sets use Universal Dependencies (UD) annotation and come from the UD distribution version 1.4. Submissions may not use any additional syntactically annotated data sets in the target language to train or tune the system. Please use the provided development data only to test your systems during development, not to train system parameters! We further divide submissions into constraint settings and open submissions:

Constraint settings

Submissions in this category use no additional data sets or linguistic resources besides the resources we provide from the official website of our shared task. The development set for the target language is not to be used for directly training model parameters.

Open settings

Submissions include additional resources or data sets that are not included in the official distribution of the shared task. This may include annotated data sets from other languages but no syntactically annotated data for the target language. In particular, it is not permitted to use UD data sets for the target languages for any training or tuning of the system.

Evaluation

The primary evaluation metric is the Labeled Attachment Score (LAS), as is common in dependency parsing. We will also look at unlabeled attachment scores (UAS), but we expect labeled output from the participating systems. The evaluation script is provided in the tools/ directory and is based on the eval07.pl script from CoNLL-07. Note that the development and test sets come with PREDICTED PoS labels and PREDICTED morphological information. We use UDPipe to label the data sets, and the tagging models are provided in the models/ directory.
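To make the two metrics concrete, here is a minimal Python sketch that computes LAS and UAS from a gold and a predicted CoNLL-U file. It is only an illustration and not the official scorer: use the script in tools/ for reported numbers. The function names and the toy sentence are hypothetical, and the sketch counts every token, which may differ from how the official script treats punctuation.

```python
from typing import List, Tuple


def read_arcs(conllu_text: str) -> List[Tuple[str, str]]:
    """Return (HEAD, DEPREL) for every syntactic word in a CoNLL-U string."""
    arcs = []
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # sentence boundary or comment line
        cols = line.split("\t")
        # Skip multiword token ranges (ID like "3-4") and empty nodes ("3.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        arcs.append((cols[6], cols[7]))  # columns 7 and 8: HEAD, DEPREL
    return arcs


def attachment_scores(gold_text: str, pred_text: str) -> Tuple[float, float]:
    """Return (LAS, UAS) in percent, counting all tokens."""
    gold, pred = read_arcs(gold_text), read_arcs(pred_text)
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and predicted files must contain the same tokens")
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head only
    las_hits = sum(g == p for g, p in zip(gold, pred))        # head and label
    return 100.0 * las_hits / len(gold), 100.0 * uas_hits / len(gold)


# Toy three-token sentence (made up for illustration).
GOLD = (
    "1\ta\ta\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tb\tb\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)
PRED = (
    "1\ta\ta\tNOUN\t_\t_\t2\tdobj\t_\t_\n"    # right head, wrong label
    "2\tb\tb\tVERB\t_\t_\t0\troot\t_\t_\n"    # fully correct
    "3\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"  # wrong head
)

las, uas = attachment_scores(GOLD, PRED)
print(f"LAS = {las:.2f}%, UAS = {uas:.2f}%")  # LAS = 33.33%, UAS = 66.67%
```

A token counts towards UAS when its head is correct, and towards LAS only when both head and dependency label are correct, so LAS can never exceed UAS.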

Data sets

Download the data sets and baseline models from Bitbucket:

git clone https://tiedemann@bitbucket.org/tiedemann/vardial2017.git

Training data (identical to UD data sets v1.4):

  • cs-ud-train.conllu
  • da-ud-train.conllu
  • sl-ud-train.conllu
  • sv-ud-train.conllu

Development data (based on devsets from UD but with predicted PoS and morphology):

  • hr-ud-predPoS-dev.conllu
  • no-ud-predPoS-dev.conllu
  • sk-ud-predPoS-dev.conllu

Parallel data sets (from OPUS)

If you want to run word alignment for any kind of annotation projection or transfer, we recommend

Test data (with predicted PoS and morphology)

  • hr-ud-predPoS-test.conllu
  • no-ud-predPoS-test.conllu
  • sk-ud-predPoS-test.conllu

Baselines

We provide two baselines: simple delexicalized models and lexicalized models without any target language adaptation. The latter are models trained on the source language with all features and applied without modification to the target language test and development sets. All models are trained using UDPipe and are available in the models/ directory.
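As an illustration of what delexicalization means in practice, the following Python sketch blanks the lexical columns of a CoNLL-U file so that a parser can only rely on PoS labels (and optionally morphological features). This README does not describe the exact preprocessing used for the provided baselines, so the function below, including its name and its `keep_feats` parameter, is an assumption about one common way to do it.

```python
def delexicalize(conllu_text: str, keep_feats: bool = False) -> str:
    """Blank FORM, LEMMA and XPOS in CoNLL-U data (hypothetical helper).

    With keep_feats=False only universal PoS labels remain as input
    features; with keep_feats=True the FEATS column is kept as well.
    """
    out = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Pass comments, blank lines and malformed lines through unchanged.
        if line.startswith("#") or len(cols) != 10:
            out.append(line)
            continue
        cols[1] = "_"  # FORM
        cols[2] = "_"  # LEMMA
        cols[4] = "_"  # XPOS (language-specific tag)
        if not keep_feats:
            cols[5] = "_"  # FEATS
        out.append("\t".join(cols))
    return "\n".join(out) + "\n" if out else ""


# Toy token line (made up for illustration).
example = "1\tPes\tpes\tNOUN\tNN\tCase=Nom\t2\tnsubj\t_\t_\n"
print(delexicalize(example), end="")
print(delexicalize(example, keep_feats=True), end="")
```

The same transformation would have to be applied to the target language input at parsing time, so that training and test data look alike. Comment lines are passed through unchanged here; if they contain the raw sentence text, that does not affect the token columns.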

Lexicalized models without adaptation (devset)

  • Czech model applied to Slovak: LAS = 54.61%, UAS = 66.62%
  • Slovenian model applied to Croatian: LAS = 56.85%, UAS = 66.00%
  • Danish model applied to Norwegian: LAS = 54.11%, UAS = 63.87%
  • Swedish model applied to Norwegian: LAS = 55.85%, UAS = 65.25%
  • Danish + Swedish model applied to Norwegian: LAS = 59.10%, UAS = 68.48%

Delexicalized models (devset)

Using universal PoS labels only:

  • Czech model applied to Slovak: LAS = 53.66%, UAS = 63.59%
  • Slovenian model applied to Croatian: LAS = 53.93%, UAS = 64.69%
  • Danish model applied to Norwegian: LAS = 54.54%, UAS = 64.48%
  • Swedish model applied to Norwegian: LAS = 56.71%, UAS = 66.47%
  • Danish + Swedish model applied to Norwegian: LAS = 57.84%, UAS = 67.60%

Also using morphological features of the source language:

  • Czech model applied to Slovak: LAS = 51.03%, UAS = 63.49%
  • Slovenian model applied to Croatian: LAS = 53.14%, UAS = 62.31%
  • Danish model applied to Norwegian: LAS = 50.54%, UAS = 63.43%
  • Swedish model applied to Norwegian: LAS = 53.98%, UAS = 65.36%
  • Danish + Swedish model applied to Norwegian: LAS = 56.22%, UAS = 66.71%

Supervised models (devset)

As an upper bound we can also look at fully supervised models trained on the annotated target language data. Default settings with UDPipe yield:

  • Croatian: LAS = 74.27%, UAS = 80.16%
  • Slovak: LAS = 70.27%, UAS = 78.18%
  • Norwegian: LAS = 78.10%, UAS = 82.11%

Baseline results on the testset

Cross-lingual without adaptation:

results/hr.sl.test.eval:  Labeled   attachment score: 3364 / 6306 * 100 = 53.35 %
results/no.da.test.eval:  Labeled   attachment score: 16453 / 29966 * 100 = 54.91 %
results/no.dasv.test.eval:  Labeled   attachment score: 17965 / 29966 * 100 = 59.95 %
results/no.sv.test.eval:  Labeled   attachment score: 16971 / 29966 * 100 = 56.63 %
results/sk.cs.test.eval:  Labeled   attachment score: 6999 / 13028 * 100 = 53.72 %

Delexicalized:

results/hr.sl-delex.test.eval:  Labeled   attachment score: 3166 / 6306 * 100 = 50.21 %
results/no.da-delex.test.eval:  Labeled   attachment score: 15172 / 29966 * 100 = 50.63 %
results/no.sv-delex.test.eval:  Labeled   attachment score: 16515 / 29966 * 100 = 55.11 %
results/no.dasv-delex.test.eval:  Labeled   attachment score: 17096 / 29966 * 100 = 57.05 %
results/sk.cs-delex.test.eval:  Labeled   attachment score: 6372 / 13028 * 100 = 48.91 %

Fully supervised:

results/hr.hr.test.eval:  Labeled   attachment score: 4320 / 6306 * 100 = 68.51 %
results/no.no.test.eval:  Labeled   attachment score: 23441 / 29966 * 100 = 78.23 %
results/sk.sk.test.eval:  Labeled   attachment score: 9008 / 13028 * 100 = 69.14 %
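The result lines above follow a fixed pattern (file name, number of correct tokens, total number of tokens, percentage), so they are easy to re-check or tabulate. Here is a small Python sketch, assuming lines in exactly the format shown; the regular expression and the function name are illustrative and not part of the repository.

```python
import re
from typing import Tuple

# Matches lines such as:
# results/hr.sl.test.eval:  Labeled   attachment score: 3364 / 6306 * 100 = 53.35 %
LAS_LINE = re.compile(
    r"^(?P<file>\S+):\s+Labeled\s+attachment score:\s+"
    r"(?P<hits>\d+)\s*/\s*(?P<total>\d+)"
)


def parse_las_line(line: str) -> Tuple[str, int, int, float]:
    """Return (file, correct tokens, total tokens, recomputed LAS in percent)."""
    match = LAS_LINE.match(line)
    if match is None:
        raise ValueError(f"not a labeled attachment score line: {line!r}")
    hits = int(match.group("hits"))
    total = int(match.group("total"))
    return match.group("file"), hits, total, round(100.0 * hits / total, 2)


line = "results/hr.sl.test.eval:  Labeled   attachment score: 3364 / 6306 * 100 = 53.35 %"
print(parse_las_line(line))  # ('results/hr.sl.test.eval', 3364, 6306, 53.35)
```

Recomputing the percentage from the two counts is a quick sanity check when copying scores into a table.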

Results of the final test set submissions

The submissions of all teams are in the sub-directory 'submissions'.

Croatian

Team            LAS     UAS     Notes
CUNI            60.70   69.73
Helsinki-CLP    57.98   69.57
tubasfs         55.20   66.75
baseline        53.35   63.94
delex (uPoS)    50.81   62.64
Croatian        68.51   75.61   trained on target data

Norwegian

Team            LAS     UAS     Notes
CUNI            70.21   77.13
Helsinki-CLP    68.60   76.77
tubasfs         65.62   74.61   from Swedish
tubasfs         64.91   73.50   Danish + Swedish
tubasfs         58.55   67.48   from Danish
baseline        59.95   69.02   Danish + Swedish
baseline        56.63   66.24   Swedish
baseline        54.91   64.53   Danish
delex (uPoS)    58.80   68.58   Danish + Swedish
delex (uPoS)    57.54   66.96   Swedish
delex (uPoS)    55.17   65.23   Danish
Norwegian       78.23   82.28   trained on target data

Slovak

Team            LAS     UAS     Notes
CUNI            78.12   84.92
Helsinki-CLP    73.14   82.87
tubasfs         64.05   73.16
baseline        53.72   65.70
delex           48.91   60.68
Slovak          69.14   76.57   trained on target data