About GExplore

The functions of most genes in the C. elegans genome (and in many other genomes) are currently unknown. GExplore is a tool for genome-scale mining of data related to gene and protein function in C. elegans. The database contains selected data sets that are important indicators of gene and protein function, such as protein domain organization, gene expression and phenotype data, homology information, and gene ontology terms. Data sets were obtained from WormBase and selected publications. GExplore is intended for quick survey-type queries and for planning genome-scale experiments.

The Beginning

The first seed for the development of GExplore was planted when the C. elegans genome was sequenced. The number of genes (~20,000) in this small organism was unexpectedly high, and for most genes, functional data were not available. The genome sequence provided the protein sequences. The presence of certain protein domains, such as a kinase domain, allowed a first functional grouping of genes and an overview of the sizes of various gene families. The initial version of GExplore (2009) contained data for the protein domain organization of proteins in combination with early genome-wide expression data (SAGE datasets) and phenotype information from genome-wide RNAi experiments.

Updates and New Databases

Over time, additional datasets were incorporated into the main "Gene" database, and additional databases were established as part of GExplore. Currently, the "Gene" database contains data about protein domain organization, gene description, phenotype, expression, interacting genes, gene ontology terms, and human disease associations.

  • Mutation Database: Large-scale genome sequencing efforts led to an explosion of 'variants' in WormBase, including many single nucleotide polymorphisms (SNPs) from natural isolates. The growing number of variants made it difficult to identify those likely to change gene function. The GExplore mutation database (as of WormBase release WS292) contains a curated list of 235,873 alleles that specifically affect the protein-coding parts of genes, allowing for efficient identification of function-altering mutations.
  • Protein Database: Many nematode genomes have now been sequenced. GExplore contains proteome data with domain annotations and homology data for 20 nematode species and five 'model' organisms (S. cerevisiae, D. melanogaster, D. rerio, M. musculus, H. sapiens), enabling quick survey-type searches for protein families that are evolutionarily conserved or unique to certain species.
  • Expression Databases: GExplore hosts raw data for three expression datasets, providing expression profiles of different developmental stages (embryo to adult), tissue-specific expression at the L2 stage, and major tissues in the embryo at different time points.

GExplore and WormBase

WormBase (now also available on the Alliance of Genome Resources website) is the central database for information about C. elegans genes. Most data in GExplore come from WormBase, and GExplore mainly offers an alternative user interface for selected data sets that are also available there. WormBase focuses mainly on DNA-related data and is optimized to display all information about one gene at a time. GExplore aims to complement WormBase by providing a user-friendly multi-gene display for quick access to data related to gene and protein function.

Contact

For any inquiries, feedback, or support, please contact Dr. Harald Hutter at:
hutter@sfu.ca


Simon Fraser University
Department of Biological Sciences
8888 University Drive
Burnaby, BC, V5A 1S6, Canada

How to Cite GExplore

If you use GExplore in your research, please cite the following publication:

Hutter H, Suh J.
GExplore 1.4: An expanded web interface for queries on Caenorhabditis elegans protein and gene function.
Worm. 2016 Sep 19;5(4):e1234659.