IBM unlocks secrets of life
Computing visits Big Blue's life sciences research centre in New York to see the latest developments
Over the next 10 years, IT is expected to herald a revolution in the drugs and medical industries.
IBM's life sciences team is at the forefront of developments that will play a major role in this new era of information-based medicine. Computing last week visited its New York research centre to see the latest work.
'More has happened, faster than expected, in turning data into practical solutions,' said IBM life sciences solutions general manager Carol Kovac.
'While this is mostly a long-term project, the last three years have seen an increase in activity, and many more solutions have come to market.'
Over the last decade, scientists have developed various tools for pattern matching and discovery, and created new techniques for data mining. The potential rewards are enormous.
'If you can add molecular data to existing clinical trials, you can make research more specific and then target drugs far more specifically to individual users' needs,' said IBM life sciences clinical genomics team leader Kareem Saad.
IBM is working with a range of universities and other partners to study a variety of diseases in depth.
'What was fascinating for me to discover was that in groups of patients with pulmonary conditions [lung disease], 50 per cent of them have a specific risk, such as smoking. But the other half had nothing, no specific risk that could lead to the same disease,' said Saad.
'The same, although in slightly less than half of the patients, holds true for cardiovascular disease and breast cancer. We hope to make a move to specific subsets of major diseases, rather than stay with our current "one size fits all" solution for drugs.'
Saad says that IT is a critical component in achieving more personalised medicine, but it also requires a change in the whole industry.
'IT is just an enabler to the new way of working with drugs,' he said.
Dealing with the genome
Ajay Royyuru, manager for IBM's computational biology centre, says the sequencing of the human genome has shifted the field of biology considerably.
'The old way for biologists to learn was top down, like high school students dissecting a frog to see what's inside. The new way is to look at an object's genome sequence and see what the precise makeup of it is,' he said.
But the genome is just the beginning of a long journey.
'Getting the genome sequence is like getting a book filled with all the part numbers for a Boeing. You have no idea what those parts are, what they do, or how they interact with other parts, but it does give you a starting point to work up from. It enables biologists to build models based on the basic information.'
The challenge for IBM is figuring out how IT can be used to help transform the massive amounts of data being discovered into meaningful knowledge.
Royyuru says IT is being used to run sequence assembly algorithms, perform molecular modelling and structure prediction, and generate complex cell and organ simulations.
IBM researchers are drawing on a diverse range of disciplines, such as knowledge management, visual data analysis, grid computing and data mining and management.
Pattern discovery
Bioinformatics and pattern discovery group manager Isidore Rigoutsos says biotech opens up a new set of challenges for IT.
'At every level of the biological hierarchy there's tons of data to be analysed. And as we aren't able to query for "like" text, because we often don't know what we're searching for, this kind of data requires a change in the way that queries are performed.'
Along with the numerous algorithms for pattern and association discovery that IBM has developed, it has also created an online library, called Bio-dictionary.
'These tools allow researchers to input an unknown genetic sequence, analyse it to find patterns and then compare those results to known sequences in the dictionary that have been discovered by others, which provides a tremendous help to researchers,' said Rigoutsos.
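The workflow Rigoutsos describes (take an unknown sequence, extract patterns, compare them with a library of known sequences) can be illustrated with a minimal Python sketch. This is not IBM's algorithm or the Bio-dictionary itself: it assumes the simplest possible notion of a "pattern", a fixed-length substring (k-mer), and the function names and toy sequences are invented for illustration.

```python
from collections import defaultdict


def kmers(sequence, k):
    """Return every overlapping substring of length k in the sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_dictionary(known_sequences, k):
    """Map each k-mer to the names of the known sequences that contain it."""
    index = defaultdict(set)
    for name, seq in known_sequences.items():
        for fragment in kmers(seq, k):
            index[fragment].add(name)
    return index


def match_unknown(unknown, index, k):
    """Count, per known sequence, the distinct k-mers it shares with the unknown one."""
    hits = defaultdict(int)
    for fragment in set(kmers(unknown, k)):
        for name in index.get(fragment, ()):
            hits[name] += 1
    return dict(hits)


# Toy data, invented for illustration only (not real genetic sequences).
known = {"seq_a": "ATGCGTAC", "seq_b": "TTGACCGA"}
index = build_dictionary(known, k=4)
result = match_unknown("GGATGCGT", index, k=4)
print(result)  # {'seq_a': 3}
```

Here the unknown sequence shares three 4-letter fragments with seq_a and none with seq_b. Real pattern discovery tools handle gaps, variable-length motifs and statistical significance, none of which this sketch attempts.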
IBM is making many of the tools freely available for use on its website.
IT infrastructure
All this raises the question of how IBM plans to make any revenue from its research.
'There is a major need for IT infrastructure, including high-performance hardware and more advanced software tools, which make up about half of IBM's life sciences revenues,' said Kovac.
The other half is mostly made up of specialist services that the company sells to firms in the biotechnology sector.
Biotech research is also particularly compute-intensive, which led IBM to start research dedicated to building Blue Gene, the world's most powerful supercomputer.
In time, the unit's research into advanced servers will pay off with new technologies that can be integrated with IBM's other server products.
And if, as predicted, life sciences become the hot technology for the 21st century, IBM should see major demand for its on-demand IT infrastructure, bespoke tools, and the substantial expertise it has developed.
Blue Gene
In 1999, IBM Research announced its five-year, $100m Blue Gene project to build a petaflop (1,000,000,000,000,000 operations per second) supercomputer to handle complex simulations such as protein folding.
The first release, Blue Gene/L, is planned for the end of 2004 and will feature 65,536 computing nodes, each containing two processors, four maths engines and 4MB of memory.
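To give a sense of scale, the per-node figures quoted for Blue Gene/L can be multiplied out. The short Python sketch below takes the numbers exactly as reported here and assumes 1GB = 1,024MB; the totals are simple arithmetic, not figures published by IBM.

```python
# Per-node figures as quoted in the article for Blue Gene/L.
NODES = 65_536
PROCESSORS_PER_NODE = 2
MATHS_ENGINES_PER_NODE = 4
MEMORY_PER_NODE_MB = 4

# Aggregate totals across the whole machine.
total_processors = NODES * PROCESSORS_PER_NODE
total_maths_engines = NODES * MATHS_ENGINES_PER_NODE
total_memory_gb = NODES * MEMORY_PER_NODE_MB / 1024  # assumes 1GB = 1,024MB

print(total_processors)     # 131072
print(total_maths_engines)  # 262144
print(total_memory_gb)      # 256.0
```

On those figures, the full system would contain 131,072 processors, 262,144 maths engines and 256GB of memory.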
IBM says the Linux-based Blue Gene/L will be equal to the power of all the top 500 existing supercomputers together - but only about a third as powerful as the Blue Gene/P release planned for late 2006.