Commit 7f105f3

Update README.md
1 parent 19415a8 commit 7f105f3

ud-data/mt/bm-to-nn/README.md

Lines changed: 11 additions & 11 deletions
@@ -1,19 +1,19 @@
-
+# Contents
 The files;
 
-mt-nn-to-bm-train.conllu
-mt-nn-to-bm-dev.conllu
-mt-nn-to-bm-test.conllu
+* mt-nn-to-bm-train.conllu
+* mt-nn-to-bm-dev.conllu
+* mt-nn-to-bm-test.conllu
 
 contains the result of automatically translating the corresponding sections (train, dev, test) of the Nynorsk UD NDT to Bokmål using Apertium. Note that only the full-forms are translated, not the lemma column. The only sanity checking performed was ensuring that the number of tokens in the target translation matched that of the source. Sentences where the token counts diverged are not included in files avbove. For these cases, the original Nynorsk version is preserved in the following files;
 
-skipped-mt-nn-to-bm-train.conllu
-skipped-mt-nn-to-bm-dev.conllu
-skipped-mt-nn-to-bm-test.conllu
+* skipped-mt-nn-to-bm-train.conllu
+* skipped-mt-nn-to-bm-dev.conllu
+* skipped-mt-nn-to-bm-test.conllu
 
 Overall, just below 4% of the sentences failed to yield a valid translation (in terms of identical token counts). Detailed numbers are provided below for the different sections;
 
-Train: 13617 sentences translated + 557 sentenes skipped (3.9%).
-Dev: 1826 translated + 64 skipped (3.4%).
-Test: 1445 translated + 66 skipped (4.4%).
-Total: 17575 translated + 687 skipped (3.9%).
+* Train: 13617 sentences translated + 557 sentenes skipped (3.9%).
+* Dev: 1826 translated + 64 skipped (3.4%).
+* Test: 1445 translated + 66 skipped (4.4%).
+* Total: 17575 translated + 687 skipped (3.9%).
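
The README describes a token-count sanity check and reports skip rates, but the commit does not include the script that produced them. The following is a minimal sketch of how such a check might look, not the authors' actual tooling. It assumes the source and target CoNLL-U files are sentence-aligned (same sentences in the same order); the function names `read_sentences`, `token_count`, `split_by_token_count` and `skip_rate` are hypothetical.

```python
def read_sentences(text):
    """Split CoNLL-U text into sentences, each a list of tab-split token rows."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comment lines carry no tokens.
            continue
        current.append(line.split("\t"))
    if current:
        sentences.append(current)
    return sentences


def token_count(sentence):
    """Count rows with a plain integer ID, ignoring ranges (1-2) and empty nodes (1.1)."""
    return sum(1 for cols in sentence if cols[0].isdigit())


def split_by_token_count(source, target):
    """Keep translations whose token count matches the source.

    Assumption: source and target are aligned sentence by sentence.
    For mismatches, the original source sentence is kept in `skipped`.
    """
    kept, skipped = [], []
    for src, tgt in zip(source, target):
        if token_count(src) == token_count(tgt):
            kept.append(tgt)
        else:
            skipped.append(src)
    return kept, skipped


def skip_rate(translated, skipped):
    """Percentage of sentences skipped out of all sentences in a section."""
    return 100.0 * skipped / (translated + skipped)


if __name__ == "__main__":
    # Figures taken from the README's Train line.
    print(round(skip_rate(13617, 557), 1))  # 3.9
```

Applying `skip_rate` to the Dev and Test figures gives 3.4 and 4.4, matching the README. The per-section counts sum to 16888 translated and 687 skipped, so the 17575 on the Total line appears to be the combined sentence count (translated plus skipped) rather than the number translated; 687 out of 17575 is the stated 3.9%.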
