SAP AG (NYSE: SAP) and Technical University Munich (TUM) today announced ProteomicsDB, a new offering based on the SAP HANA platform that stores protein and peptide identifications from mass spectrometry-based experiments. The proteomic data assembled in the new offering resulted in the identification of proteins mapping to over 18,000 human genes. This represents 90 percent coverage of the human proteome. Data stored and analyzed within ProteomicsDB can be used in basic and biomedical research for discovering therapeutic targets and developing new drugs as well as enhanced diagnosis methods.
As personalized medicine is on the rise, the healthcare field is discovering the opportunities of big data analysis. The result of a joint project between the TUM Chair of Proteomics and Bioanalytics, SAP and the SAP Innovation Center, ProteomicsDB is a major step forward in human proteomics. It currently contains more than 11,000 datasets from human cancer cell lines, tissues and body fluids. It enables real-time analysis of this high-dimensional data and creates instant value by allowing researchers to test analytical hypotheses.
ProteomicsDB is based on the SAP HANA platform for rapid data mining and visualization. It has been built to enable public sharing of mass spectrometry-based proteomic datasets as well as to allow users to access and review data prior to publication. The database is backed by 50 TB of storage, 2 TB of RAM and 160 processing units. A direct interface to the programming languages L, C++ and R allows more flexible calculations than are possible with standard SQL. The Web interface is built on a JavaScript framework for HTML5 and is optimized for Google Chrome but also works in Internet Explorer and Mozilla Firefox. This easy-to-use, fast Web interface allows users to browse the repository and upload data to it, as well as to browse the human proteome, including protein-level information such as protein function and expression.
ProteomicsDB will be available free of charge.