SciLite allows biological terms or relations, such as diseases, chemicals or protein interactions, to be highlighted for readers in abstracts and full-text articles. These terms are identified by text-mining algorithms developed by a variety of text-mining groups.
For readers, SciLite makes it easier to scan an article and get a quick overview. It helps in finding key concepts and discovering evidence, such as gene–disease associations or molecular interactions. SciLite enables users to locate the primary data in the text by linking text-mined entities to public life sciences and chemistry databases. The goal of SciLite is to support scientists and database curators in their literature research by harnessing the power of text mining, and to promote the contribution of text miners to the advancement of science.
SciLite provides annotations for core named entities (e.g. gene/protein names, organisms, diseases, chemicals, Gene Ontology terms), biological events (e.g. phosphorylation), functional relations (e.g. gene–disease associations, protein–protein interactions), as well as biological functions (e.g. gene function).
As a reader views an article in a web browser, any annotations associated with it are made available in a menu alongside the article, as shown below.
Users can control which concepts they see by checking the corresponding boxes, which highlights the colour-coded annotations in the text. To see a list of individual terms, users can click the right arrow next to the annotation type. A selection of the terms found most frequently in the text appears, together with up/down navigation buttons that let the user jump to selected terms in the text.
Clicking on the highlighted terms in the text opens a pop-up menu with information about the given annotation (below).
The pop-up menu displays a link to the related database record, the source of the annotation, and a feedback link. Where annotations overlap in a sentence, we highlight the longest annotation, and the individual annotations within the phrase can be seen in the pop-up window.
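The overlap rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SciLite's actual implementation: the `Span` class, the function names and the use of character offsets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical annotation span: [start, end) character offsets plus a label."""
    start: int
    end: int
    label: str

    @property
    def length(self) -> int:
        return self.end - self.start


def group_overlaps(spans):
    """Group spans into clusters whose members overlap, directly or transitively."""
    clusters = []
    cluster_end = None
    # Sort by start offset; for equal starts, put the longer span first.
    for span in sorted(spans, key=lambda s: (s.start, -s.length)):
        if clusters and span.start < cluster_end:
            clusters[-1].append(span)
            cluster_end = max(cluster_end, span.end)
        else:
            clusters.append([span])
            cluster_end = span.end
    return clusters


def choose_highlights(spans):
    """For each cluster, return (span to highlight, spans to list in the pop-up)."""
    result = []
    for cluster in group_overlaps(spans):
        longest = max(cluster, key=lambda s: s.length)
        others = [s for s in cluster if s is not longest]
        result.append((longest, others))
    return result
```

For example, given hypothetical spans for "breast cancer" (0 to 13), "cancer" (7 to 13) and "BRCA1" (30 to 35), `choose_highlights` selects "breast cancer" for highlighting with "cancer" listed beneath it, and "BRCA1" stands on its own.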
It is of critical importance that readers find the annotations useful. Readers can provide feedback on each annotation, e.g. mark incorrect annotations or endorse useful ones. This information is fed back to the Europe PMC team and will be acted upon, helping to improve the annotations overall. If you find an incorrect annotation, or the annotation is too generic and is highlighted too often, you can report it by clicking or tapping on the highlighted term and using the Feedback link in the pop-up window. You can also endorse annotations using the Feedback link, if they are useful to you.
Europe PMC is a community platform, open for contributions that enhance our interaction with the scientific literature. SciLite enables text miners to showcase their work to the wider public. We welcome contributions from text-mining and other associated communities and encourage them to share annotations on the SciLite platform. Any text-mining group can participate by providing their annotations in a specific format described below.
If you are a text-mining group and can supply annotations in the format we require (see below), please email us at firstname.lastname@example.org to set up an account. Annotations may be generated on your own local setup or on a virtual machine on the EBI Embassy Cloud.
We chose the W3C Web Annotation Data Model as an emerging generic standard for web annotations, meaning that any annotations displayed within SciLite can be shared, reused and integrated with other types of annotation such as comments. Once concepts of interest have been identified within the text, they are formatted accordingly, and stored in a triple store via the EMBL-EBI RDF Platform.
We strongly encourage the providers to pay close attention to the URI schemes used.
Data provenance vocabulary: https://www.w3.org/TR/void/.
GeneRIF annotation in RDF
@prefix annotations: <http://rdf.ebi.ac.uk/resource/europepmc/annotations/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix epmc: <http://europepmc.org/articles/> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix orb: <http://purl.org/orb/> .
@prefix provenance: <http://rdf.ebi.ac.uk/dataset/generif_> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix uniprot: <http://purl.uniprot.org/uniprot/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The annotation is given as a full IRI (annotations: namespace plus PMC2761928#8),
# because an unescaped '#' in a prefixed name would start a Turtle comment.
<http://rdf.ebi.ac.uk/resource/europepmc/annotations/PMC2761928#8> a oa:Annotation ;
    void:inDataset provenance:2016-04-29 ;
    oa:hasBody <http://europepmc.org/articles/PMC2761928#8-b> ;
    oa:hasTarget uniprot:B3CJ46 .

<http://europepmc.org/articles/PMC2761928#8-b> a oa:SpecificResource ;
    dcterms:isPartOf [ a orb:Header ; dcterms:hasPart dcterms:title ] ;
    dc:description "The killing activity of the McbC protein raises the possibility that it might serve to lyse other M. catarrhalis strains that lack the mcbABCI locus" ;
    oa:hasRole oa:highlighting ;
    oa:hasSelector <http://europepmc.org/articles/PMC2761928#line=0,148> ;
    oa:hasSource epmc:PMC2761928 .

<http://europepmc.org/articles/PMC2761928#line=0,148> a oa:FragmentSelector ;
    rdf:value "line=0,148" ;
    oa:confirmsTo <http://tools.ietf.org/rfc/rfc5147> .
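The fragment selector in the example carries its position as the string "line=0,148". A minimal Python sketch of how a consumer might read such a value follows. It assumes, without confirmation from the source, that the two numbers are zero-based start and end offsets into the text of the referenced section; the function names are hypothetical, and the `char` scheme is accepted only because RFC 5147 also defines it.

```python
import re

# Accepts values such as "line=0,148"; "char" is the other scheme named in RFC 5147.
_SELECTOR = re.compile(r"^(?P<scheme>line|char)=(?P<start>\d+),(?P<end>\d+)$")


def parse_fragment(value):
    """Split a selector value into (scheme, start, end)."""
    match = _SELECTOR.match(value)
    if match is None:
        raise ValueError(f"unsupported fragment selector: {value!r}")
    start, end = int(match["start"]), int(match["end"])
    if end < start:
        raise ValueError(f"end precedes start in {value!r}")
    return match["scheme"], start, end


def select(text, value):
    """Return the slice of `text` addressed by the selector (assumed offsets)."""
    _, start, end = parse_fragment(value)
    return text[start:end]
```

Under that assumption, applying `select` to the title text with "line=0,148" would return the first 148 characters of the title.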
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix purl: <http://purl.org/pav/2.0/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix provenance: <http://rdf.ebi.ac.uk/dataset/generif_> .

provenance:2016-04-29 a void:Dataset ;
    dc:description "GeneRIF produced by Bibliomics and Text Mining group at the HES-SO, Geneva and Europe PMC, EMBL-EBI, Hinxton" ;
    dc:publisher <http://www.europepmc.org/> ;
    dc:title "Gene Reference into Function (GeneRIF)" ;
    purl:importedBy <http://www.europepmc.org/> ;
    purl:importedOn "2016-05-03" ;
    purl:version "2016-04-29" ;
    void:triples "251602" .
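The annotation and its provenance record are linked through the prefixed name `provenance:2016-04-29`, which Turtle expands by concatenating the prefix IRI with the local part. The short Python sketch below illustrates that expansion for names taken from the examples above; the `expand` helper is hypothetical and handles only simple prefixed names, not the full Turtle grammar.

```python
# Prefix declarations copied from the examples above.
PREFIXES = {
    "provenance": "http://rdf.ebi.ac.uk/dataset/generif_",
    "uniprot": "http://purl.uniprot.org/uniprot/",
    "void": "http://rdfs.org/ns/void#",
    "oa": "http://www.w3.org/ns/oa#",
}


def expand(name, prefixes=PREFIXES):
    """Expand a simple prefixed name such as 'uniprot:B3CJ46' to a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


print(expand("provenance:2016-04-29"))
print(expand("uniprot:B3CJ46"))
```

The first call yields the dataset IRI that both the annotation's `void:inDataset` statement and the `void:Dataset` description refer to, which is why consistent URI schemes matter when contributing annotations.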