Translating the massive increases of data related to biomedical
research into improved health care will require improved methods
of data management and software tool development, according
to two National Institutes of Health reports: The Biomedical
Information Science and Technology Initiative, prepared by the
Working Group on Biomedical Computing of the Advisory Committee
to the NIH Director, and the NIH
Roadmap for Accelerating Medical Discovery to Improve Health:
Bioinformatics and Computational Biology. Both documents
recommend employing advanced data management technologies
for developing interoperable biomedical databases and software
engineering principles for delivering robust and reliable
tools for biomedical research.
Data management and bioinformatics software development challenges
are also discussed in the U.S. Department of Energy's (DOE)
Genomes to Life (GTL) program report User
Facilities for 21st Century Systems Biology: Providing Critical
Technologies for the Research Community. GTL envisions
different types of facilities generating data that would be
organized in a variety of databases, including expression,
proteomic, protein-function, chemistry, and pathway databases.
Data will be collected, archived, and passed through a number
of processing stages, including data annotation and integration.
A "seamless and effectively centralized capability to deal with
data," in the form of data centers that collect and effectively
integrate large-scale biological data, is seen as key to GTL's
success.
To help meet these needs, the Biological Data Management
and Technology Center (BDMTC) was established in January 2004
in the Computational Research Division (CRD) at DOE's Lawrence
Berkeley National Laboratory (LBNL). BDMTC will serve as a
source of expertise in, and provide support for, data management
and bioinformatics tool development projects at the Joint
Genome Institute (JGI), the Life Sciences and Physical Biosciences
Divisions at LBNL, the Biomedical Centers at UCSF, and other
similar organizations in the Bay Area. BDMTC will enable
collaborating organizations to
share experience, expertise, technology, and results across
projects. BDMTC will employ industry practices in developing
data management systems and bioinformatics tools, while maintaining
high academic standards for the underlying data generation,
interpretation, and analysis methods and algorithms.
BDMTC will provide support in addressing key data management
challenges, including:
- the massive increase in the amount and range of biological
data,
- the difficulty of quantifying the quality of data generated
using inherently imprecise tools and techniques, and
- the high complexity of integrating data residing in diverse
and sometimes poorly correlated repositories.
BDMTC's strategy involves using existing technology and methods,
adapted as needed to a specific application, in order to address
immediate data management and bioinformatics requirements.
Because these solutions are inherently evolving, they will be
designed to be cost-effective and to take advantage of rapid
technological advances without sacrificing quality, time, or cost.
Critical data management problems that cannot be resolved
using existing technology will be pursued as part of
longer-term R&D activities.
BDMTC is led by Victor M. Markowitz, who until recently was
CIO and Senior VP, Data Management Systems, at Gene Logic,
where he was responsible for the development and deployment
of the data management and analysis platform for the company's
gene expression data. Prior to joining Gene Logic in 1997,
he was a staff scientist at LBNL, where he led the development
of data management tools applied to biological databases.
BDMTC is currently involved in data management projects at
JGI and in the UC Berkeley proposal for an NIH National Center
for Biomedical Computing, where it provides the infrastructure,
data management, and software development cores. BDMTC aims
to become a QB3-affiliated center.