PALO ALTO, Calif., April 06, 2016 -- Cloudera, the global provider of the fastest, easiest, and most secure data management and analytics platform built on Apache Hadoop and the latest open source technologies, today announced a collaboration with the Broad Institute of MIT and Harvard, the world’s leading biomedical and genomic research center. The two organizations are working together this year to advance the development of Broad’s next-generation Genome Analysis Toolkit, GATK4.
Cloudera Enterprise accelerates life sciences research and drug discovery by putting real-time data into the hands of the clinicians, researchers, and providers focused on personalizing the patient experience. Building the fourth generation of GATK (GATK4) on Cloudera Enterprise and utilizing the Apache Spark distributed computing framework to speed research, the Broad Institute is facilitating better understanding of genomic sequencing, resulting in faster data exploration and ultimately empowering better clinical decisions.
Since the Human Genome Project produced the first draft sequence of the human genome in 2000, the cost of sequencing has dropped exponentially, from around $100 million per genome to around $1,000 today. Over the same period, we have seen massive growth in the storage and processing capabilities of big data technologies like Hadoop.
“This lower cost of genome sequencing and advancement in big data technologies means that we can afford to sequence the genome of patients very broadly and produce datasets that have never been available before,” said Shawn Dolley, industry leader of life sciences at Cloudera. “Building the next generation toolkit on Spark greatly accelerates in-memory computations and facilitates parallelism. Cloudera Enterprise expedites round-trips to access and compute data for data discovery, translating into significant reductions in R&D time. This will have a very meaningful scientific upside.”
Presently there are more than 31,000 registered users of the GATK. Broad Institute is working with collaborators to develop cloud-hosted options to expand access and facilitate usage of genome analysis tools for even more powerful insights and decision-making. Users could also more easily create best-practice pipelines and avoid duplicating infrastructures.
“Utilizing the Spark computing framework on Cloudera Enterprise gives us the ability to implement tools that were not possible in GATK3 due to their computational complexity,” said Dr. Eric Banks, senior director of Data Sciences and Data Engineering at Broad and a creator of the GATK software package. “On Cloudera Enterprise, we can now run analysis of genomic data two orders of magnitude faster than in previous versions of GATK, enabling faster iterative analysis for propelling genomic innovation.”
About Cloudera
Cloudera delivers the modern data management and analytics platform built on Apache Hadoop and the latest open source technologies. The world’s leading organizations trust Cloudera to help solve their most challenging business problems with Cloudera Enterprise, the fastest, easiest and most secure data platform available for the modern world. Our customers efficiently capture, store, process and analyze vast amounts of data, empowering them to use advanced analytics to drive business decisions quickly, flexibly and at lower cost than has been possible before. To ensure our customers are successful, we offer comprehensive support, training and professional services. Learn more at http://cloudera.com.
Connect with Cloudera
About Cloudera: cloudera.com/content/cloudera/en/about/company-profile.html
Read our blogs: blog.cloudera.com/ and vision.cloudera.com/
Follow us on Twitter: twitter.com/cloudera
Visit us on Facebook: facebook.com/cloudera
Join the Cloudera Community: community.cloudera.com
Cloudera, Cloudera's Platform for Big Data, Cloudera Enterprise Data Hub Edition, Cloudera Enterprise Flex Edition, Cloudera Enterprise Basic Edition, Cloudera Navigator Optimizer and CDH are trademarks or registered trademarks of Cloudera Inc. in the United States, and in jurisdictions throughout the world. All other company and product names may be trademarks of their respective owners.
###
Karina Babcock Cloudera [email protected] +1 (650) 644-3900





