Genomics & Bioinformatics
The CBT works directly with the Institute for Genome Sciences (IGS) on its bioinformatics and genomics work, which involves identifying mutations and biological pathways that are “drivers” of disease progression. These analyses are used to generate experimental protocols and methods for identifying and targeting pathogens and/or for testing potential targets from the human genome for drug development. The relevance of these targets in disease is then tested directly in the TVS group. The IGS, established within the University of Maryland School of Medicine on May 1, 2007, houses an interdisciplinary, multi-departmental team of collaborative investigators with a broad spectrum of research programs related to the genomics of infectious disease agents, human microbial metagenomics, and bioinformatics. The Institute is led by Claire M. Fraser-Liggett, Ph.D., one of the world’s preeminent genome scientists and previously the Director and President of The Institute for Genomic Research (TIGR).
IGS is currently located at the UMB BioPark, a biomedical research park adjacent to the University of Maryland Medical Research Complex. Within the BioPark, the Institute occupies part of the fifth floor and the entire sixth floor, encompassing ~38,000 sq ft of total laboratory, office, conference, and interactive gathering space. The newly constructed facility represents a $40 million commitment to genomic research by the School of Medicine. IGS faculty have access to all other core facilities at the University of Maryland School of Medicine (UMSOM), including Biosafety Level-3 (BSL-3) laboratories, animal facilities, a proteomics core, a large bio-imaging facility, clinical specimens, and a General Clinical Research Center (GCRC).
IGS Genomics Resource Center (GRC)
The Genomics Resource Center is a high-throughput core laboratory and data analysis group that uses state-of-the-art technology to generate high-quality genomic data in a cost-effective manner. The GRC staff consists of scientists, bioinformatics software engineers, bioinformatics analysts, and research specialists who support these activities. The GRC occupies 6,500 sq ft of space within the newly completed BioPark II facility on the UMSOM campus. This BSL-2 research space consists of wet laboratory space, a sequencer facility, a cryo-storage facility, a cold room, a dark room, a reagent storage facility, conference rooms, and office space.
IGS Informatics Resource Center (IRC)
The Informatics Resource Center (IRC), under the direction of Owen White, Ph.D., provides genome annotation and analysis services as well as IT infrastructure, web, and database services to IGS investigators. The IRC is staffed by scientists, computational biologists, software engineers, and IT specialists who support these activities. The IRC has built a state-of-the-art computational infrastructure that includes a computational grid, an internal 10-gigabit network, clustered database servers, and a hierarchical storage management system. IGS is connected to the rest of the campus by a high-performance switched gigabit network built on Cisco equipment. All UMB buildings are connected to the LAN backbone and core switches via fiber cabling. IGS, as part of UMB, maintains a 10 Gbps link to the National Lambda Rail (NLR); a 1 Gbps connection to Internet2, the high-speed network designed to facilitate collaboration and communication among research institutions; and an aggregated bandwidth of 20 Mbps to the regular Internet. The computational grid is built primarily around six high-performance, high-memory multi-processor machines (64-256 GB RAM, four multi-core CPUs each) for memory- and compute-intensive applications and ninety high-throughput computational nodes (16 GB RAM, two multi-core Intel Xeon CPUs each) for running distributed applications such as BLAST, HMMsearch, and MCMC samplers. Grid scheduling is managed by the Sun N1 Grid Engine (SGE) distributed computing system.
To accommodate, at a reasonable cost, the ever-expanding data sets generated by next-generation genome sequencing technologies, we have deployed a hierarchical storage infrastructure consisting of three tiers of random-access storage and a fourth tier of serial-access tape storage for archiving and data backup. Tier 1 is grid-attached storage, where most computational activity occurs. It includes traditional NAS-based storage with a capacity of 30 TB that currently supports over 400 MB/s of throughput. In addition, we have deployed scalable, high-performance parallel file systems from Isilon and Panasas with a capacity of 350 TB that currently support an aggregate throughput of over 3 GB/s. Tier 2 is high-performance storage that hosts mission-critical data for active projects and has a current capacity of 75 TB with 125 MB/s of I/O throughput. Tier 3 is near-line storage used for hosting less frequently used data that still requires random access. This tier currently has a capacity of 118 TB with an aggregate throughput of 35 MB/s. Tier 4 is off-line storage used for data backups as well as archival data, such as the raw data generated by sequencers. This tier is built around a tape library and is integrated with Tier 2 and Tier 3 storage to provide daily, weekly, monthly, and annual backups.
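As a rough illustration of how the tier figures above relate to one another, the following Python sketch totals the stated random-access capacity and estimates how long a full sequential pass over one storage system would take at its quoted throughput. The class and function names are hypothetical, decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) are assumed, and the Tier 1 throughputs are quoted above as "over" 400 MB/s and 3 GB/s, so the scan times computed for those systems are upper bounds. Tier 4 is omitted because no capacity is stated for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageSystem:
    """One random-access storage system, using the figures quoted in the text."""
    tier: int
    name: str
    capacity_tb: float      # capacity in TB (assumed decimal: 1 TB = 1e12 bytes)
    throughput_mb_s: float  # throughput in MB/s (assumed decimal: 1 MB = 1e6 bytes)


# Figures taken from the description above; the labels are illustrative only.
SYSTEMS = [
    StorageSystem(1, "NAS", 30, 400),
    StorageSystem(1, "parallel file systems", 350, 3000),
    StorageSystem(2, "active project storage", 75, 125),
    StorageSystem(3, "near-line storage", 118, 35),
]


def total_capacity_tb(systems):
    """Sum the capacity of all listed systems, in TB."""
    return sum(s.capacity_tb for s in systems)


def full_scan_days(system):
    """Days needed to read the whole system once at its quoted throughput."""
    total_bytes = system.capacity_tb * 1e12
    bytes_per_second = system.throughput_mb_s * 1e6
    return total_bytes / bytes_per_second / 86_400


if __name__ == "__main__":
    print(f"Random-access capacity: {total_capacity_tb(SYSTEMS):.0f} TB")
    for s in SYSTEMS:
        print(f"Tier {s.tier} ({s.name}): {full_scan_days(s):.1f} days per full scan")
```

Under these assumptions the three random-access tiers total 573 TB, and the gap in scan time between Tier 1 and Tier 3 reflects the design intent: frequently used data sits on fast storage, while rarely used data sits on slower, cheaper storage.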
Institute for Genome Sciences
801 W. Baltimore Street
BioPark II, 6th floor
Baltimore, MD, 21201