The Semedico search engine provides deeper semantic access to the contents of MEDLINE abstracts. All the semantic meta data accessible through Semedico are automatically generated by the JULIE Lab text mining engine, which is based on the UIMA middleware.
This engine automatically processes MEDLINE abstracts by recognizing and indexing (via the Lucene search engine library) terms from crucial biomedical terminologies, in particular MeSH, the hematological part of the Cell Ontology, and UniProt.
MeSH is a comprehensive and well-curated terminology with approximately 25,000 biomedical terms and 140,000 (non-curated) chemical substances. It covers subdomains that range from molecular biology and chemistry through translational medicine to applied health care.
UniProt is the most comprehensive terminology for proteins across several species (350,000 entries). These terminologies also contain hierarchies to varying degrees (ranging from depth 2 to depth 10), which makes them well suited to browsing. Technically, this is accomplished by adding all parent terms to the search index. Bibliographic meta data already provided in MEDLINE, such as author and journal names, publication dates, etc., are also added to the pool of meta data.
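The idea of adding all parent terms to the search index can be illustrated with a minimal sketch. It does not reproduce the actual Semedico or Lucene code: the hierarchy `PARENTS`, the functions `ancestors` and `build_index`, and the document ids are all assumptions made for illustration, and the toy hierarchy is not the real MeSH tree.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: term -> list of direct parent terms.
PARENTS = {
    "T-Lymphocytes": ["Lymphocytes"],
    "Lymphocytes": ["Leukocytes"],
    "Leukocytes": ["Blood Cells"],
    "Blood Cells": [],
}


def ancestors(term, parents):
    """Return every ancestor of `term` (the term itself is excluded)."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def build_index(docs, parents):
    """Index each recognized term together with all of its ancestors."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            for indexed_term in {term} | ancestors(term, parents):
                index[indexed_term].add(doc_id)
    return index


# Hypothetical documents with the terms recognized in each of them.
docs = {"doc1": ["T-Lymphocytes"], "doc2": ["Leukocytes"]}
index = build_index(docs, PARENTS)
```

With this expansion, a user who selects the broader term "Leukocytes" while browsing also retrieves `doc1`, although only the narrower term "T-Lymphocytes" was recognized in it.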

The Semedico search interface complements a classical ranked document list interface with the faceted search approach. It currently contains about 20 categories (facets) with over 900,000 hierarchically organized concepts based on the text-mining-generated semantic meta data and the bibliographic meta data.
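How a faceted interface complements a ranked result list can be sketched as follows. This is an illustration of the general faceted search approach under simplifying assumptions, not the Semedico implementation; the functions `facet_counts` and `refine`, the facet names, and the sample records are hypothetical.

```python
from collections import Counter


def facet_counts(results, facet):
    """Count how many documents in the result list carry each facet value."""
    counts = Counter()
    for doc in results:
        for value in doc.get(facet, []):
            counts[value] += 1
    return counts


def refine(results, facet, value):
    """Keep only documents tagged with `value`, preserving the ranking order."""
    return [doc for doc in results if value in doc.get(facet, [])]


# Hypothetical ranked result list with semantic and bibliographic meta data.
results = [
    {"id": "doc1", "Cells": ["Leukocytes"], "Journal": ["Blood"]},
    {"id": "doc2", "Cells": ["Leukocytes", "Erythrocytes"], "Journal": ["Cell"]},
    {"id": "doc3", "Cells": ["Erythrocytes"], "Journal": ["Blood"]},
]

cell_counts = facet_counts(results, "Cells")
narrowed = refine(results, "Journal", "Blood")
```

The counts are shown next to each facet value, and selecting a value narrows the ranked list without changing the order of the remaining documents.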
From a design perspective, Semedico inherited the facet metaphor from the Flamenco project, while it uses a standard layout inspired by major web search engines like Google or Yahoo!. Still, the interface has been considerably augmented by new features to handle the vast size of domain terminologies and category systems. Among these features are:



JULIE Lab Team

Tel.: +49 3641 944 323

The MHCO: An Ontology for Major Histocompatibility Complex Alleles and Molecules

The MHC ontology (MHCO) provides a formal, coherent and consistent representation of multi-species MHC alleles and molecules. In particular, it includes an HLA ontology representing all human leukocyte antigen (HLA) alleles and serological groups. In contrast to existing ontologies in the field of immunogenetics, which were built exclusively in support of immunogenetics databases, the MHCO achieves conceptual abstraction and higher expressiveness through explicit taxonomic, partonomic and additional conceptual relations and the incorporation of class restrictions.
The formal encoding of the MHCO in OWL DL, the description-logic-based sublanguage of the Web Ontology Language (OWL), makes it possible to run a classifier that computes (i.e. makes explicit) formally implicit knowledge, as needed for advanced user interaction with query processors and information extraction systems. This is a major advantage over database-oriented conceptual schemata.
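What it means to make implicit knowledge explicit can be illustrated with the simplest kind of inference, the transitivity of the subclass relation. The sketch below is not an OWL DL reasoner, which also evaluates class restrictions and other axioms. The class names in `ASSERTED` and the function `classify` are hypothetical and are not taken from the MHCO.

```python
# Hypothetical asserted axioms: class -> set of direct superclasses.
ASSERTED = {
    "HLA-A*02:01": {"HLA-A allele"},
    "HLA-A allele": {"HLA class I allele"},
    "HLA class I allele": {"MHC allele"},
    "MHC allele": set(),
}


def classify(asserted):
    """Return all superclasses of each class, asserted or implied."""
    inferred = {cls: set(supers) for cls, supers in asserted.items()}
    changed = True
    while changed:  # repeat until no new superclass can be derived
        changed = False
        for supers in inferred.values():
            for sup in list(supers):
                new = inferred.get(sup, set()) - supers
                if new:
                    supers |= new
                    changed = True
    return inferred


inferred = classify(ASSERTED)

# Knowledge that was only implicit in the asserted axioms.
implicit = {cls: inferred[cls] - ASSERTED[cls] for cls in ASSERTED}
```

In this toy example, the fact that "HLA-A*02:01" is an "MHC allele" is never stated directly but follows from the chain of asserted subclass axioms. A query processor working on the classified hierarchy can answer such a query without the user spelling out the intermediate classes.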

The MHCO is part of the StemNet knowledge management system for hematopoietic stem cell transplantation, which comes with the integrated semantic search engine Semedico. The ontology contributes to the system in various ways. On the user side, it provides new browsing facilities and supports query formulation for fact and document retrieval. On the back-end side, it serves as a coherent semantic backbone for text mining components and software applications involved in HLA typing.

The current version of the MHCO consists of 6755 classes, of which 6649 belong to the HLA ontology. The classes are interlinked by taxonomic relations and 7 different semantic relation types.

The ontology can be downloaded from this website. It is also accessible via the BioPortal website hosted by the United States National Center for Biomedical Ontology.



David DeLuca

Elena Beisswanger

Tel.: +49 3641 944 303
