Search engine

BioUML workbench provides three types of search engine for working with databases:

· data search (filter) - this search engine maps the database content into Java objects and filters these objects according to a filtering condition on each property, for example name="TP53".
· full text search - this search engine uses the Lucene full text search engine. The database content is likewise mapped into Java objects, which are then indexed by Lucene. Because it uses an index, this search engine is much faster than data search using filters.
· graph search - this search engine finds interacting pathway components and displays the result as an editable graph.

Data search (filter)

This is the default search engine and is available for any database. It maps the database content into Java objects and filters them according to a filtering condition on each property, for example: the field Complete name contains 'acid' (Figure 1.6). Regular expressions can be used for text values. This search engine is slow because it scans all database objects of the corresponding type (for example, gene, protein or substance).
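
The filtering step can be pictured with a minimal sketch. It is not BioUML's implementation: the class `DataFilterSketch`, the property key `completeName`, and the sample records are all hypothetical, and database objects are simplified to maps of property names to text values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from BioUML.
class DataFilterSketch {
    /** Keeps the records whose property value contains a match for the regular expression. */
    static List<Map<String, String>> filter(List<Map<String, String>> records,
                                            String property, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        List<Map<String, String>> matches = new ArrayList<>();
        // Linear scan: every record is inspected, so the cost grows with the table size.
        for (Map<String, String> record : records) {
            String value = record.get(property);
            if (value != null && pattern.matcher(value).find()) {
                matches.add(record);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up records standing in for objects mapped from a database table.
        List<Map<String, String>> compounds = List.of(
            Map.of("id", "X1", "completeName", "Pyruvic acid"),
            Map.of("id", "X2", "completeName", "D-Glucose"),
            Map.of("id", "X3", "completeName", "Acetic acid"));
        for (Map<String, String> hit : filter(compounds, "completeName", "acid")) {
            System.out.println(hit.get("id") + " " + hit.get("completeName"));
        }
    }
}
```

The loop visits every record, which is why this kind of search slows down as the number of objects of the selected type grows.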

data-search-kegg

Figure 1.6. Data search dialog for KEGG/Compound. Left pane: search results; top right pane: filtering conditions; bottom right pane: detailed description of the substance selected in the table.

Full text search

This search engine uses the Lucene full text search engine (http://lucene.apache.org/).

Algorithm:

· the database content is mapped into Java objects;
· each Java object corresponds to a Lucene document, and object properties correspond to document fields. An administrator can specify which data types and properties are indexed by Lucene;
· the user query is parsed and executed by Lucene. The search result is a set of identifiers of database objects together with the values of the indexed fields;
· search results can be shown as a table.
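
To illustrate why an index makes lookups fast, the following is a toy inverted index in plain Java. It is a stand-in for the idea only, not Lucene and not BioUML code: the class `ToyFullTextIndex`, its methods, and the sample data are hypothetical, and real Lucene adds analyzers, a query parser, and relevance scoring on top of this.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index; it only mimics the object -> document -> fields mapping.
class ToyFullTextIndex {
    // term -> identifiers of the objects whose indexed fields contain that term
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Indexes the chosen fields of one object (one "document"). */
    void add(String id, Map<String, String> fields) {
        for (String text : fields.values()) {
            // Lower-case and split on non-word characters to get terms.
            for (String term : text.toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
                }
            }
        }
    }

    /** Returns the identifiers of objects containing the term: a map lookup, not a scan. */
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Set.of());
    }

    public static void main(String[] args) {
        ToyFullTextIndex index = new ToyFullTextIndex();
        // Made-up records; only the fields passed in here are searchable.
        index.add("X1", Map.of("name", "Pyruvic acid"));
        index.add("X2", Map.of("name", "D-Glucose"));
        index.add("X3", Map.of("name", "Acetic acid"));
        System.out.println(index.search("acid"));
    }
}
```

Indexing is paid for once, when objects are added; each query afterwards is a single lookup by term instead of a pass over all objects.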

BioUML workbench provides the following interfaces for working with the full text search engine:

1. Full text search dialog (Figure 1.7)
2. Full text search tab (Figure 1.8)

full-text-search-kegg
Figure 1.7. Full text search dialog for KEGG/Compound. Left pane: search results; top right pane: search conditions; bottom right pane: detailed description of the substance selected in the table.

full-text-search-kegg-tab

Figure 1.8. Full text search tab for the KEGG/Ligand database, Compound table (bottom right pane).

Graph search

The graph search engine finds interacting pathway components and displays the result as an editable graph.
 

To start the search, the user specifies a start node (for example, protein AP-1 in Figure 1.9) and the search conditions: which types of interaction should be found and the depth of the search.
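
One way to read "interaction type" and "depth" is as the parameters of a depth-limited breadth-first search. The sketch below takes that reading as an assumption; BioUML's actual algorithm is not described here. The class `GraphSearchSketch`, the `Interaction` type, the interaction type names, and the sample edges are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a depth-limited search; not BioUML's implementation.
class GraphSearchSketch {
    /** An interaction of a given type between two biological objects. */
    static final class Interaction {
        final String from;
        final String to;
        final String type;

        Interaction(String from, String to, String type) {
            this.from = from;
            this.to = to;
            this.type = type;
        }
    }

    /** Returns the start node plus every node reachable within maxDepth interactions of the given type. */
    static Set<String> search(List<Interaction> interactions, String start,
                              String type, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String node : frontier) {
                for (Interaction i : interactions) {
                    if (!i.type.equals(type)) {
                        continue; // skip interactions of other types
                    }
                    // Assumption: interactions are followed in both directions.
                    String neighbour = i.from.equals(node) ? i.to
                                     : i.to.equals(node) ? i.from : null;
                    if (neighbour != null && visited.add(neighbour)) {
                        next.add(neighbour);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }

    public static void main(String[] args) {
        // Made-up edges for illustration only; they are not taken from any database.
        List<Interaction> edges = List.of(
            new Interaction("AP-1", "IL-6", "regulation"),
            new Interaction("IL-6", "N1", "regulation"),
            new Interaction("AP-1", "N2", "binding"));
        System.out.println(search(edges, "AP-1", "regulation", 1));
        System.out.println(search(edges, "AP-1", "regulation", 2));
    }
}
```

Raising the depth by one adds the next ring of neighbours, so the result can grow quickly in a densely connected database.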
 
Graph search supports interactive search and incremental graph layout: the user can select any node on the graph (for example, gene IL-6 in Figure 1.9) and find other biological objects in the database that interact with it. These objects are shown in the bottom left pane in tabular form. The user can add these nodes and the corresponding edges to the diagram. The graph layout places the new nodes and edges automatically while preserving the location of the existing diagram elements (Figure 1.10).
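
The key property of incremental layout, that existing elements stay where they are, can be shown with a small sketch. The placement rule used here (new nodes spread on a circle around the selected node) is purely an assumption chosen for brevity and is not BioUML's layout algorithm; the class `IncrementalLayoutSketch`, its method, and the node names `N1` and `N2` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the circular placement is an assumption, not BioUML's layout.
class IncrementalLayoutSketch {
    /** Places only the nodes that are not yet on the diagram; existing positions are never modified. */
    static void addAround(Map<String, double[]> positions, String anchor,
                          List<String> candidates, double radius) {
        double[] centre = positions.get(anchor);
        if (centre == null) {
            throw new IllegalArgumentException("Unknown anchor node: " + anchor);
        }
        // Collect the nodes that still need a position, ignoring duplicates.
        List<String> toPlace = new ArrayList<>();
        for (String node : candidates) {
            if (!positions.containsKey(node) && !toPlace.contains(node)) {
                toPlace.add(node);
            }
        }
        // Spread the new nodes evenly on a circle around the anchor.
        for (int i = 0; i < toPlace.size(); i++) {
            double angle = 2 * Math.PI * i / toPlace.size();
            positions.put(toPlace.get(i), new double[] {
                centre[0] + radius * Math.cos(angle),
                centre[1] + radius * Math.sin(angle)});
        }
    }

    public static void main(String[] args) {
        Map<String, double[]> positions = new LinkedHashMap<>();
        positions.put("AP-1", new double[] {0, 0});
        positions.put("IL-6", new double[] {100, 0});
        // "AP-1" is already placed, so only the two new nodes receive coordinates.
        addAround(positions, "IL-6", List.of("AP-1", "N1", "N2"), 50);
        positions.forEach((node, p) -> System.out.println(node + " " + p[0] + "," + p[1]));
    }
}
```

A real layout would also avoid overlaps and route the new edges; the sketch only demonstrates that adding nodes leaves earlier coordinates untouched.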

graph-search

Figure 1.9. Graph search dialog. Top left pane: search results displayed as an editable graph; top right pane: search conditions; bottom left pane: results of the interactive search for the selected node; bottom right pane: detailed description of the selected node.

graph-search-2

Figure 1.10. Interactive graph search: the biological objects that interact with gene IL-6 have been added to the diagram.