Species name recognition and normalization software.
LINNAEUS is general-purpose dictionary-matching software capable of processing several document formats used in the biomedical domain (MEDLINE, PMC, BMC, OTMI, text, etc.). It can write its results as XML, HTML, or tab-separated values, or save them to a database. It can also run as a server (including load balancing across several servers), allowing clients to request matching over a network. A package of files for recognizing and identifying species names is available for LINNAEUS; it achieves 94% recall and 97% precision when evaluated against LINNAEUS-species-corpus.
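
To illustrate what dictionary matching with normalization means in practice, here is a minimal sketch. It is not the LINNAEUS implementation or its API: the class name, the `match` method, and the two-entry dictionary are hypothetical and chosen only for illustration. It scans the text for dictionary terms, ignoring case, keeps only hits that fall on word boundaries, and reports each hit with the identifier stored for that term.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of dictionary matching; not LINNAEUS code.
class DictionaryMatchSketch {

    /** One matched mention: character offsets, surface text, and dictionary identifier. */
    static final class Mention {
        final int start;
        final int end;
        final String text;
        final String id;

        Mention(int start, int end, String text, String id) {
            this.start = start;
            this.end = end;
            this.text = text;
            this.id = id;
        }

        @Override
        public String toString() {
            // Tab-separated, similar in spirit to a TSV output row.
            return start + "\t" + end + "\t" + text + "\t" + id;
        }
    }

    /** Returns every whole-word, case-insensitive occurrence of a dictionary term. */
    static List<Mention> match(String text, Map<String, String> dictionary) {
        List<Mention> mentions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (Map.Entry<String, String> entry : dictionary.entrySet()) {
                String term = entry.getKey();
                int end = i + term.length();
                if (term.isEmpty() || end > text.length()) {
                    continue;
                }
                if (!text.regionMatches(true, i, term, 0, term.length())) {
                    continue;
                }
                // Reject hits inside longer words, e.g. "human" inside "humans".
                boolean leftOk = i == 0 || !Character.isLetterOrDigit(text.charAt(i - 1));
                boolean rightOk = end == text.length() || !Character.isLetterOrDigit(text.charAt(end));
                if (leftOk && rightOk) {
                    mentions.add(new Mention(i, end, text.substring(i, end), entry.getValue()));
                }
            }
        }
        return mentions;
    }

    public static void main(String[] args) {
        // Illustrative entries only; a real dictionary holds many synonyms per species.
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("human", "9606");
        dictionary.put("mouse", "10090");
        for (Mention m : match("Human and mouse genes were compared.", dictionary)) {
            System.out.println(m);
        }
    }
}
```

The nested loop keeps the sketch short but checks every term at every position. A production matcher would more likely use an automaton or trie so that cost does not grow with dictionary size.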

LINNAEUS can be run in two different ways: using an internal dictionary, or using an external dictionary. The external dictionaries are available for download below. The internal dictionaries (subsets of the external dictionaries, containing the 10,000 most frequently mentioned species in MEDLINE, representing ~99% of mentions) are contained in the Java .jar archive, and do not need any configuration. Due to the small size of the internal dictionaries, they require very little memory.

LINNAEUS is the subject of the following paper: Gerner, M., Nenadic, G. and Bergman, C. M. (2010) LINNAEUS: a species name identification system for biomedical literature. BMC Bioinformatics 11:85.

For questions, suggestions or bug reports, please contact Martin Gerner or Casey Bergman.
For more information about the developers of this project see: Martin Gerner's personal page, the Nenadic group or the Bergman lab.
Remote web service availability

LINNAEUS can be accessed remotely through its web service, either as a SOAP endpoint (WSDL) or as a RESTful service by posting data as a 'text' argument to this location (example). The XML output should be self-explanatory, but for any questions, don't hesitate to contact me.
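
As a rough sketch of the RESTful route, the following Java 11+ client posts a document as a form field named `text` and returns the response body. Several things here are assumptions rather than documented behaviour: the endpoint URL is a placeholder (the real location is not reproduced in this text), the body is assumed to be sent as `application/x-www-form-urlencoded`, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client sketch; not part of the LINNAEUS distribution.
class LinnaeusRestSketch {

    // Placeholder only: substitute the actual service location.
    static final String ENDPOINT = "http://example.org/linnaeus";

    /** Encodes the document as a single form field called "text". */
    static String formBody(String text) {
        return "text=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    /** Builds the POST request without sending it. */
    static HttpRequest buildRequest(String endpoint, String text) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                // Assumption: the service accepts ordinary form encoding.
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(text)))
                .build();
    }

    /** Sends the text and returns the raw response body (expected to be XML). */
    static String tag(String endpoint, String text) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(endpoint, text), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Shows the encoded request body; no network call is made here.
        System.out.println(formBody("Homo sapiens and E. coli"));
        // Once ENDPOINT points at the real service:
        // System.out.println(tag(ENDPOINT, "Homo sapiens and E. coli"));
    }
}
```

Keeping `buildRequest` separate from `tag` lets the request be inspected or tested without contacting the server.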


