GENIA Corpus


The GENIA corpus is a collection of biomedical literature, compiled and annotated within the scope of the GENIA project. The goal of the project is to develop text mining (TM) systems for the domain of molecular biology, and the corpus was developed to provide reference material for building such bio-TM systems. It currently contains 1,999 Medline abstracts, collected using the three MeSH terms "human", "blood cells", and "transcription factors", and has been annotated with various levels of linguistic and semantic information.
