Consensus Annotation
This page was posted with the release of
version 1.6 and pertains only to this and subsequent versions.
This page steps through an example of how to use the 'consensus mode' in Knowtator. The example is distributed with Knowtator and is located at:
<protege-home>/plugins/edu.uchsc.ccp.knowtator/examples/consensus-set-example
We will refer to this directory as <example-dir>.
The example consists of a simplified version of an annotation task in which molecular transport events were annotated. Molecular transport is a process in which
proteins and other macro-molecules are moved (transported) from one part of a cell to another. The annotation schema in this example provides a very simple
model of molecular transport in which a class called 'transport' has slots 'origin' and 'destination' which are constrained to be of type 'cellular component'.
The transport class also has slots 'transported entities' and 'transporters' which are constrained to be of type 'molecule'. The annotated examples are not
necessarily biologically accurate (or even close). They are for example purposes only.
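As a rough illustration, the schema described above could be modeled with the following data classes. This is only a sketch and not Knowtator's or Protege's API: the class and field names are taken from the slot names above, while the cardinalities (single-valued 'origin' and 'destination', multi-valued 'transported entities' and 'transporters') and the example values 'cytoplasm' and 'nucleus' are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CellularComponent:
    name: str


@dataclass
class Molecule:
    name: str


@dataclass
class Transport:
    # 'origin' and 'destination' are constrained to 'cellular component'.
    origin: Optional[CellularComponent] = None
    destination: Optional[CellularComponent] = None
    # 'transported entities' and 'transporters' are constrained to 'molecule'.
    transported_entities: List[Molecule] = field(default_factory=list)
    transporters: List[Molecule] = field(default_factory=list)


# Hypothetical event; like the example data, it is not meant to be biologically accurate.
event = Transport(
    origin=CellularComponent("cytoplasm"),
    destination=CellularComponent("nucleus"),
    transported_entities=[Molecule("Grb2")],
)
```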
This example picks up in the middle of an example annotation effort. The annotation schema has already been defined, the text sources selected, and assignments
given to Jane and John to annotate the text sources individually. They have annotated the same 10 sentences individually and now we wish to create a consensus
set of annotations in which Jane and John's annotations are merged and their differences resolved.
The <example-dir> folder has the following directories:
- consensus-annotations - this is a nearly empty directory where we will create an annotation project that consists of Jane's and John's annotations and a
consensus set of annotations in which all differences between their annotations will be adjudicated. The file 'textsources' has been copied in as
a convenience.
- janes-annotations - contains an annotation project that has all of Jane's annotations.
- johns-annotations - contains an annotation project that has all of John's annotations.
- reference-project - contains the reference annotation project and contains no annotations. This project was copied into Jane's and John's folders
before they began annotating individually (using Protege->Menu->Save Project As...)
Before we start creating the consensus set, please open the three annotation projects in turn and observe the following:
- <example-dir>/reference-project/molecular-transport.pprj
- There are no annotations in this project. This project was created first before any annotation was performed by John or Jane. This project
can be thought of as a reference because all of the other annotation projects were copied from this one.
- Open the configuration dialog with Menu->Knowtator->Configure. Note that the 'default annotator' is not set and neither is 'default annotation
set'.
- <example-dir>/janes-annotations/janes-annotations.pprj
- Each text source has annotations created by Jane.
- Open the configuration dialog with Menu->Knowtator->Configure. Note that the 'default annotator' is set to 'Jane' and the 'default annotation
set' is set to 'Individually Annotated'. This configuration was done before Jane began annotating so that all annotations that she created would
have these two values filled in automatically.
- <example-dir>/johns-annotations/johns-annotations.pprj
- Each text source has annotations created by John.
- The configuration mirrors Jane's project, except that the 'default annotator' is set to 'John'.
Merge Jane and John's Annotations
The first step is to create a new annotation project from the reference project and import the annotations from Jane and John's annotation projects. Do the
following:
- open <example-dir>/reference-project/molecular-transport.pprj
- Menu->File->Save Project As... (see note on using this menu option)
- Save as <example-dir>/consensus-annotations/consensus-annotations.pprj
- Menu->Knowtator->Merge annotations
- select <example-dir>/janes-annotations/janes-annotations.pprj
- select the filter labeled "Jane's annotations"
- select all 10 instances of "file line text source"
- Menu->Knowtator->Merge annotations
- select <example-dir>/johns-annotations/johns-annotations.pprj
- select the filter labeled "John's annotations"
- select all 10 instances of "file line text source"
- save, close, reopen <example-dir>/consensus-annotations/consensus-annotations.pprj.
(see bug report)
Now Jane and John's annotations are in the same project sitting side by side.
Create a filter that defines the consensus set
It is possible that you could merge other annotations into this project, but
we want to create a consensus set on just these annotations. To do this we need to create a filter that defines the annotations to be added to the consensus set.
Do the following:
- Create an instance of "knowtator filter" by following the detailed instructions here.
- Name the filter "Jane and John's individually annotated annotations"
- The 'annotators' of the filter should be set to 'Jane' and 'John'
- The 'sets' of the filter should be set to 'Individually Annotated'
If you are following the instructions carefully, then this filter defines all of the annotations that are currently in the consensus-annotations project.
However, the consensus set creation step we are about to perform will create annotations that do not satisfy this filter (by design).
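The effect of this filter can be sketched in a few lines of code. The classes below are hypothetical stand-ins and not Knowtator's API; the sketch also assumes that an annotation satisfies the filter only when both its annotator and its annotation set are listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a Knowtator annotation."""
    text: str
    annotator: str
    annotation_set: str


@dataclass(frozen=True)
class Filter:
    """Hypothetical model of a 'knowtator filter'."""
    name: str
    annotators: frozenset
    sets: frozenset

    def accepts(self, annotation: Annotation) -> bool:
        # Assumption: both the annotator and the set must be listed.
        return (annotation.annotator in self.annotators
                and annotation.annotation_set in self.sets)


individual = Filter(
    name="Jane and John's individually annotated annotations",
    annotators=frozenset({"Jane", "John"}),
    sets=frozenset({"Individually Annotated"}),
)

annotations = [
    Annotation("NLS", "Jane", "Individually Annotated"),
    Annotation("NLS", "John", "Individually Annotated"),
    # An annotation produced later by consensus set creation.
    Annotation("NLS", "consensus set annotator team", "consensus set"),
]

selected = [a for a in annotations if individual.accepts(a)]
```

Here 'selected' holds only Jane's and John's annotations; the third annotation does not satisfy the filter.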
NOTE: Creating and using an annotation set is important for consensus mode. For consensus in your own annotation project, you should add
all of your individually annotated annotations to a set that is different from the consensus set (which is created below). This can be done in batch mode
as explained here.
Create a consensus set
To create the consensus set do the following:
- Select Menu->Knowtator->consensus mode
- A wizard labeled 'consensus set creation wizard' should appear if no consensus set has yet been created for this project. Click 'next'.
- Enter 'consensus set' and click 'next'.
- Click 'select filter' and select the filter labeled "Jane and John's individually annotated annotations". Click 'next'.
- Click 'add' and select all 10 text sources. Click 'create'.
The following is a listing of all the annotations that are now in the project:
- Jane's original annotations (those annotations whose annotator is 'Jane' and annotation set membership is 'Individually Annotated').
- John's original annotations (those annotations whose annotator is 'John' and annotation set membership is 'Individually Annotated').
- Consensus annotations (those annotations whose annotation set membership is 'consensus set'). This set was created by copying all of Jane's
and John's annotations, assigning the copies to the 'consensus set' and then consolidating pairs of annotations (one from Jane and one from John) where
there is 100% agreement. Consolidated annotations have the annotator 'consensus set annotator team'.
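The copy-then-consolidate step described above can be sketched as follows. This is a simplified model under stated assumptions, not Knowtator's implementation: the Annotation class and the create_consensus_set function are hypothetical, slot values are reduced to plain strings, and the spans and slot values in the example are made up.

```python
from dataclasses import dataclass, replace

TEAM = "consensus set annotator team"
CONSENSUS = "consensus set"
INDIVIDUAL = "Individually Annotated"


@dataclass(frozen=True)
class Annotation:
    span: tuple            # (start, end) character offsets
    annotated_class: str
    slots: tuple = ()      # simplified: (slot name, value) pairs
    annotator: str = ""
    annotation_set: str = ""


def content(a):
    """The parts of an annotation that must match for 100% agreement."""
    return (a.span, a.annotated_class, a.slots)


def create_consensus_set(individual):
    # Step 1: copy every individual annotation into the consensus set.
    copies = [replace(a, annotation_set=CONSENSUS) for a in individual]
    # Step 2: consolidate pairs from different annotators that agree fully.
    result, used = [], set()
    for i, a in enumerate(copies):
        if i in used:
            continue
        match = next((j for j in range(i + 1, len(copies))
                      if j not in used
                      and copies[j].annotator != a.annotator
                      and content(copies[j]) == content(a)), None)
        if match is None:
            result.append(a)                         # disagreement: keep as is
        else:
            used.add(match)
            result.append(replace(a, annotator=TEAM))  # consolidated
    return result


jane = [
    Annotation((0, 4), "protein", (), "Jane", INDIVIDUAL),
    Annotation((20, 27), "cellular component", (), "Jane", INDIVIDUAL),
    Annotation((8, 16), "transport",
               (("transported entities", "protein@0-4"),), "Jane", INDIVIDUAL),
]
john = [replace(a, annotator="John") for a in jane]

consensus = create_consensus_set(jane + john)
```

With identical annotations from both annotators, six annotations collapse to three, all assigned to the team annotator. The originals in 'jane' and 'john' are left untouched because only copies are modified.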
Consensus Mode UI
If you have successfully created a consensus set, then Knowtator should look like the following (partial) screenshot.
Here are a few things to observe about this image:
- There is a label 'consensus mode' which tells you that you are in consensus mode. If you do not see this label, then you are not in consensus mode.
- The button labeled 'create' allows you to create a new consensus set in the currently opened annotation project. This doesn't make too much sense in this
example because there are only annotations from two individuals.
- The button labeled 'restart' allows you to completely restart consensus work for the current text source. This may be useful if you have made
changes that you do not like and want to discard them all.
- The button labeled 'quit' will take you out of consensus mode by selecting the next active filter that is not a consensus set filter.
- The button labeled 'consolidate' allows you to delete an unwanted annotation and specify the preferred annotation in a single step. To delete the currently
selected annotation and bless another annotation, simply click the 'consolidate' button and choose the "blessed" annotation.
- The progress of consensus on the current text source is provided in a progress message.
The numerator is the number of annotations in the consensus set that have the annotator 'consensus set annotator team'; this number will increase as more
annotations are reviewed. The denominator is the total number of annotations in the consensus set; this number is not fixed and will typically decrease as
annotations are consolidated or deleted.
- The progress of consensus for a given class is provided in the annotation schema.
The numerator is the number of annotations of that type in the consensus set that have the annotator 'consensus set annotator team'. The denominator is the
number of annotations of that type in the consensus set.
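The two progress figures described above amount to simple counts. The sketch below assumes a hypothetical representation in which each consensus set annotation is an (annotator, annotated class) pair; the function names are illustrative and not part of Knowtator.

```python
from collections import Counter

TEAM = "consensus set annotator team"


def progress(annotations):
    """Return (done, total) for a list of (annotator, annotated_class) pairs."""
    done = sum(1 for annotator, _ in annotations if annotator == TEAM)
    return done, len(annotations)


def progress_by_class(annotations):
    """Return {annotated_class: (done, total)} for the same input."""
    totals = Counter(cls for _, cls in annotations)
    done = Counter(cls for annotator, cls in annotations if annotator == TEAM)
    return {cls: (done[cls], totals[cls]) for cls in totals}


# Before review: one consolidated protein, two unresolved transport annotations.
before = [(TEAM, "protein"), ("Jane", "transport"), ("John", "transport")]
# After consolidating the two transport annotations into one.
after = [(TEAM, "protein"), (TEAM, "transport")]

print(progress(before), progress(after))
```

In this example the numerator rises from 1 to 2 while the denominator falls from 3 to 2, because consolidation replaces two annotations with one.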
Working in Consensus Mode
Now Jane and John can sit together and resolve their differences (or an adjudicator can do the same). A text source is considered 'done' when all of the annotations
in the consensus set have been assigned the annotator 'consensus set annotator team'. An annotation can be assigned this annotator in one of four ways:
- direct assignment - the annotator of the currently selected annotation can be manually reassigned to the 'consensus set annotator team'.
- automatic consolidation - if an annotation is edited such that it becomes exactly the same as another annotation, then the two will automatically be
consolidated. Note that "exactly the same" means that the two annotations must have the same span, the same type, and exactly the same slot values.
- manual consolidation - this is accomplished with the button labeled 'consolidate' which is described above.
- a new annotation is created - if a new annotation is created and the default annotator is set to 'consensus set annotator team', then the new annotation
will be considered 'done.'
An unwanted annotation may also be deleted in which case there is one less annotation that has not yet been assigned the 'consensus set annotator team' annotator.
Because the original individual annotations are preserved, it is easy to restart consensus work for a text source by clicking the 'restart' button. This will
simply recreate the original consensus set annotations for that text source.
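Automatic consolidation can cascade: once two annotations are consolidated, other annotations that refer to them through slot values may become identical as well. The sketch below illustrates this under stated assumptions. It is not Knowtator's implementation: annotations are plain dictionaries keyed by a made-up id, slot values are strings (either literal values or ids of other annotations), and the spans and entrez_gene_id values are invented.

```python
TEAM = "consensus set annotator team"


def signature(ann):
    """Span, type, and slot values: the parts that must be exactly the same."""
    slots = tuple(sorted((name, tuple(sorted(values)))
                         for name, values in ann["slots"].items()))
    return (ann["span"], ann["type"], slots)


def consolidate_all(annotations):
    """Merge identical annotations repeatedly until nothing changes."""
    changed = True
    while changed:
        changed = False
        seen = {}
        for ann_id in sorted(annotations):
            sig = signature(annotations[ann_id])
            if sig not in seen:
                seen[sig] = ann_id
                continue
            keep = seen[sig]
            # Point every slot value that referenced the duplicate at the survivor.
            for other in annotations.values():
                for values in other["slots"].values():
                    values[:] = list(dict.fromkeys(
                        keep if v == ann_id else v for v in values))
            annotations[keep]["annotator"] = TEAM
            del annotations[ann_id]
            changed = True
            break  # signatures may have changed, so rescan from the start
    return annotations


anns = {
    "jane-nls": {"span": (0, 3), "type": "protein", "annotator": "Jane",
                 "slots": {"entrez_gene_id": ["1111"]}},
    "john-nls": {"span": (0, 3), "type": "protein", "annotator": "John",
                 "slots": {"entrez_gene_id": ["2222"]}},
    "jane-transport": {"span": (10, 19), "type": "transport", "annotator": "Jane",
                       "slots": {"transported entities": ["jane-nls"]}},
    "john-transport": {"span": (10, 19), "type": "transport", "annotator": "John",
                       "slots": {"transported entities": ["john-nls"]}},
}

consolidate_all(anns)                 # nothing matches yet: four annotations remain
count_before_edit = len(anns)

# Resolve the disagreement by editing one entrez_gene_id, then consolidate again.
anns["john-nls"]["slots"]["entrez_gene_id"] = ["1111"]
consolidate_all(anns)
```

After the edit, the two protein annotations are merged first; the two transport annotations then refer to the same protein, become identical, and are merged in the next pass.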
Default annotator and annotation set
When the consensus set is first created, Knowtator's configuration is updated such that the default annotator is 'consensus set annotator team' and the default annotation
set is the consensus set. However, after that Knowtator makes no attempt to keep the configuration synchronized with the filter that is currently selected.
For example, if you have the 'consensus set filter' selected and you change the filter to be "Jane's annotations", the default annotator and default annotation
set will not be changed. This may not be desirable if Jane is going to spend time editing her annotations. Whenever you change filters, make sure that the
configuration settings for the default annotator and default annotation set are what you expect them to be. To learn more about configuring Knowtator, see
the instructions here.
Examples
The following lists things to observe for each text source in our working example:
- 10364 - Jane and John's annotations are identical. Six annotations should collapse to three annotations in the consensus set.
- 10380 - The entrez_gene_id's of the two protein mentions differ. This propagates to the two transport mentions because the transported entity slots
are different because the protein mentions are different. Therefore, the resulting consensus set will have four annotations because none of the
annotations are consolidated. Changing one of the entrez_gene_id values on one of the NLS annotations so that both NLS annotations have the
same entrez_gene_id will cause the two NLS annotations to be consolidated into a single annotation. The annotator of this annotation
will be the corresponding team annotator. This change will propagate to the "processing" annotations, which are now identical.
These two will also be consolidated.
- 10470 - The protein annotations came in matching pairs and were consolidated when the consensus set was made. The transported entities of the
two transport events are different so the transport annotations will not be consolidated. However, note that the values for both annotations
point to the consolidated proteins. Changing one of the annotations so that 'transported entities' slots have the same values results in a
consolidation. For example, remove 'Grb2' from John Deer's transport annotation.
- 1060 - Entrez ids for DNup88 proteins differ so they will not be consolidated, nor will the transport events. Changing the values of the
entrez ids so that they are the same results in consolidation of the transport events. The annotations for 'DNup214' will collapse to a single
annotation during the creation of the initial set of consensus annotations.
- 10837 - Jane has nearly duplicate protein annotations, one of which is the transported entity of the transport annotation
(the one that matches the transported entity of John's transport annotation). Therefore, there will be three annotations
because four of the original annotations will collapse to two. Ideally, Jane's redundant protein annotation will be removed manually.
- 11030 - The D6 annotations differ because the mentioned classes differ.
The transport annotations differ because they have different spans, they have different mentioned classes and because their slot values differ.
The 'beta-arrestin' annotations differ because the mentioned classes differ.
The initial consensus set should have 7 annotations - one for each of the individually annotated annotations.
Changing the 'annotated class' for annotations in the consensus set will cause them to be consolidated if done so that the annotations match. However,
it is much quicker to use the 'consolidate' button. If the single 'ligand' annotation is deemed as being good, then the annotator will need to be manually
changed to the 'consensus set annotator team'.
- 11135 - This one demonstrates that slot values of slot values will consolidate correctly. If the annotations corresponding to '1' and 'F1' are consolidated,
then the two complex annotations will be consolidated, which in turn will cause the two transport annotations to be consolidated.
- 11143 - There was a bug that this set of annotations exposed. It is now fixed and consolidation works as expected.
- 1116 - I wanted to test the behavior of the 'consolidate' button when an annotation with slot values
(whose values don't match) is consolidated. Do the following:
- select Jane's 'import' annotation and click the button labeled 'consolidate'
- choose the other 'import' annotation to consolidate with
- observe the 'transported entities' slot of consolidated 'import' annotation
- select either 'bax' annotation and consolidate with the other using the 'consolidate' button.
- observe the 'transported entities' slot of the 'import' annotation.
Maintained by Philip V. Ogren.
This file last modified Wednesday, 17-Dec-2008 21:15:36 UTC