Corpus Annotation Tools Explorer

Explore and compare tools for manual linguistic annotation


Distribution of features among tools

Data support

Input format

Raw text

16

XML

12

TSV

6

PDF

2

CONLL-U

13

CONLL-X

1

Corpus Workbench vertical format

1

SFM

1

ToolBox files

1

Praat files

1

CoNLL-U plus

1

.eaf

1

TextGrid

1

CHAT

1

Shoebox/Toolbox

1

CSV

1

FoLiA XML

1

.txt+.ann

1

CoNLL-U

1

VISL CG3

1

SDParse

1

Bracket notation

1

Integration of other resources

Audio

5

Video

5

Image

5

PDF

3

External databases (e.g. dictionaries)

5

Data export

The data can be exported.

21

The data can be stored with a third party storage platform (e.g. Git repository)

4

Diversity features

Diversity features

Language-independence

20

Writing script support

20

LTR

20

Unicode support

19

RTL

16

Accessibility

1

Language of the interface

English

21

German

1

French

2

Chinese

1

Annotation

Segmentation

word-level

12

sentence-level

8

word segmentation

1

Sentence segmentation

1

Annotation types

Tokens (e.g. part-of-speech tag, morphological features)

19

Spans (e.g. named entities)

15

Multi-tokens (e.g. multi-word expressions)

12

Relations (e.g. dependencies)

14

Chains (e.g. coreferences)

7

Interlinear glossed text annotations

1

Multi-tokens (e.g. multi-word expressions)

2

Spans (e.g. named entities)

1

Multi-tokens (e.g. multi-word expressions) - Relations (e.g. dependencies)

1

Chains (e.g. coreferences)

1

Tokens (e.g. part-of-speech tag, morphological features)

1

Annotation scope

Free span annotation

1

independent of sentences

1

Annotation inside sentences

19

Annotation across sentences

9

Annotation of full sentences/paragraphs/documents

8

Annotation of full sentences/paragraphs/documents

1

Tagset customization

Yes

19

No

1

Import custom

1

UD/PARSEME specifics

UD specific annotations

Lemmas

15

UPOS

17

FEATS

14

XPOS

15

Multi-word tokens

11

Basic dependencies

15

Enhanced dependencies

7

Metadata

13

Empty nodes

8

Multiword tokens

1

UD validation

No

16

Yes

5

PARSEME specific annotations

Nested expressions

7

Overlapping expressions

7

Discontinuous expressions

5

xs

1

Consistency check

No

17

partial

2

Yes

2