Genes, Genomes and Variation

Genome databases and browsers

Technologies
mysql, javascript, perl
Topics
web, genomics

Ensembl was created alongside the publication of the first draft of the human genome, in 2001, to distribute this goldmine of information to scientists across the world. It quickly became, and remains, one of the most important reference databases in genomics, keeping pace with the rapid development of the field. Its initial mission included finding all of the genes in the human genome. A year later, the mouse genome was published and we developed tools to directly compare genomes across species. Over the following decade, sequencing capacity increased exponentially (faster than Moore's Law, in fact) and large surveys started examining more species and more individuals within each species. Our mission therefore expanded to store these datasets and statistics efficiently. Finally, in recent years, sequencing has been used to study the biochemical activity of the DNA molecule within the different tissues of an individual, prompting us to extend our remit yet again.

At the same time, Ensembl is an evolving software development project. Over 15 years, we moved from a central relational MySQL database with a Perl API and static web pages, to an array of storage technologies with a RESTful interface and an interactive front-end. We have dedicated portals for the large clades on the tree of life (known as Ensembl Genomes). Our annotations are produced through centuries of CPU time, coordinated by our powerful eHive analysis workflow manager.

Today, we are a team of nearly 60 full-time staff, housed at the European Bioinformatics Institute, and we collaborate with many external contributors around the world, in particular via our GitHub repositories, where you can see us work day-to-day. We are at the intersection of two exciting and rapidly expanding fields, and there is no lack of interesting directions in which to push the project.

2017 Program

Successful Projects

Contributor
Chanaka De Silva
Mentor
Andy Yates
Organization
Genes, Genomes and Variation
Project Proposal: Ensembl Track Database (Mentors: Andy Yates, Stephen Trevanion)
Ensembl is a data warehousing and data sets project which helps to analyse advanced biological data. Simply put, this application displays a 1-dimensional...
Contributor
Stefan Dvoretskii
Mentor
Dan Staines
Organization
Genes, Genomes and Variation
Data file search API
The Ensembl and Ensembl Genomes projects provide genomic data for over 40 thousand genomes. Considering the ever-increasing volume of this data, it...
Contributor
Arpit Agarwal
Mentor
Paul Kersey
Organization
Genes, Genomes and Variation
Strain Differential
The objective of the project is to get a scrollable web view for gene variations after changing the reference strain of genome data in real time. The...
Contributor
Sourabhreddy Medapati
Mentor
Steve Trevanion
Organization
Genes, Genomes and Variation
Enabling large file format support in Genoverse
The aim is to enable support for large binary file formats in Genoverse, a genome browser that is written in JavaScript. This project would involve...
Contributor
Rachel Slater
Mentor
Carlos Garcia Giron, Fergal Martin
Organization
Genes, Genomes and Variation
Gene Visualisation in Mixed Reality with Microsoft HoloLens
This project would involve building an application allowing genes to be visualised in ‘mixed reality’ with the Microsoft HoloLens. Mixed reality...