Issues in Management and Analysis of Genomic Data at Large Scale
Mr. Matthew Trunnell
Acting Director, Advanced IT, Broad Institute of MIT and Harvard
One of the Broad Institute's core missions is to discover, develop, and optimize the critical technologies needed to obtain and analyze the massive amounts of genomic data being generated by scientists at the Broad and around the world. Our adoption of second-generation sequencing technologies over the past three years has driven a 25-fold growth in the size of our data repositories, placing new pressures on our IT infrastructure and motivating different approaches to data analysis. This talk will discuss our ongoing adjustment to this multi-petabyte scale, with an emphasis on data management and on incorporating data-intensive computing techniques such as map/reduce into our data production and analysis processes.