PROVIDENCE, R.I. [Brown University] — Brown University computer scientists have developed a new interactive tool to help researchers and clinicians explore the genetic underpinnings of cancer.
The tool — dubbed MAGI, for Mutation Annotation and Genome Interpretation — is an open-source web application that enables users to search, visualize, and annotate large public cancer genetics datasets, including data from The Cancer Genome Atlas (TCGA) project.
“The main motivation for MAGI has been to reduce the computational burden required for researchers or doctors to explore and annotate cancer genomics data,” said Max Leiserson, a Ph.D. student at Brown who led the development of the tool. “MAGI lets users explore these data in a regular web browser and with no computational expertise required.”
In addition to displaying TCGA data, the portal lets researchers upload data they have collected themselves and compare their findings with those in the larger databases.
“Over the last decade, researchers working with TCGA have sequenced genes from thousands of tumors and dozens of cancer types in an effort to understand which mutations contribute to the development of cancer,” said Ben Raphael, director of Brown’s Center for Computational and Molecular Biology, who helped oversee the project. “At the same time, as sequencing has gotten faster and cheaper, individual researchers have begun sequencing samples from their own studies, sometimes from just a few tumors.”
By uploading their data to MAGI, researchers can leverage the large public datasets to help interpret their own data.
“In cancer genomics, there’s real value in large sample sizes because mutations are diverse and spread all over the genome,” Raphael said. “If I had just sequenced a few cancer genomes from my local tumor bank, one of the first things I’d want to do is compare my data to these big public datasets and look for similarities.”
MAGI comes with TCGA data already loaded. Users can search by cancer type, by individual gene, or by groups of genes. The results can be visualized in several ways, showing how often a given gene is mutated across samples, which types of mutations occur, and other information.
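As a rough illustration of the kind of summary described above, the following Python sketch computes how often one gene is mutated across a set of samples and tallies the mutation types. It is not MAGI's code: the record layout, the sample IDs, and the function name `summarize_gene` are all assumptions made for this example, and the data are invented rather than drawn from TCGA.

```python
from collections import defaultdict

# Hypothetical toy records: (sample_id, gene, mutation_type).
# These are invented for illustration and are not real TCGA data.
mutations = [
    ("S1", "TP53", "missense"),
    ("S1", "KRAS", "missense"),
    ("S2", "TP53", "nonsense"),
    ("S3", "TP53", "missense"),
    ("S3", "TP53", "frameshift"),
]
# Every sequenced sample, including one (S4) with no recorded mutations.
all_samples = {"S1", "S2", "S3", "S4"}


def summarize_gene(records, samples, gene):
    """Return (fraction of samples with a mutation in `gene`, counts per mutation type)."""
    mutated_samples = set()
    type_counts = defaultdict(int)
    for sample_id, record_gene, mutation_type in records:
        if record_gene == gene:
            mutated_samples.add(sample_id)    # count each sample once
            type_counts[mutation_type] += 1   # count every mutation event
    return len(mutated_samples) / len(samples), dict(type_counts)


frequency, types = summarize_gene(mutations, all_samples, "TP53")
print(f"TP53 mutated in {frequency:.0%} of samples; types: {types}")
```

Counting samples rather than mutation events for the frequency keeps a tumor with several mutations in the same gene from being counted more than once.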
Those same search and visualization capabilities are available for user-uploaded data, which enables researchers to look at their own data side-by-side with TCGA data. Users can also annotate TCGA data, appending new findings, academic papers and other relevant information.
“When someone uploads data to MAGI, they can use the public data to help them interpret their own dataset,” Raphael said. “But in the process, they might also be able to say something about the public data. We thought: wouldn’t it be great if users could record that information and share it?”
The MAGI project started as a means of looking at the output from algorithms that Raphael’s lab develops. Those algorithms comb through large genome datasets, helping to pick out the mutations that are important to cancer development and distinguishing them from benign mutations that are just along for the ride.
“As we were developing tools to visualize our own results, we realized that other researchers might also find these tools useful,” Raphael said. “We decided to develop a public portal for the cancer genomics research community.”
The lab is making MAGI available for free, with the hope that many in the cancer genomics community will take advantage of it.
“We think this could be a really useful piece of software,” Raphael said. “There’s great value in just being able to look at these data. We hope MAGI will lead to some new discoveries.”
Other contributors included Ph.D. students Hsin-Ta Wu and Connor Gramazio, undergraduate Jason Hu, and David Laidlaw, professor of computer science.
Raphael and his colleagues describe MAGI in a correspondence published in the June issue of Nature Methods. The work was supported by the National Institutes of Health (grants R01HG005690 and R01HG007069).