Nature (440:7083) has a Microsoft-sponsored, freely available special feature on the future of scientific computing; the articles focus on sensor networks, database management, and automated learning.
Declan Butler writes in 2020 computing: Everything, everywhere
Data networks will have gone from being the repositories of science to its starting point. When researchers look back on the days when computers were thought of only as desktops and portables, our world may look as strange to them as their envisaged one does to us. Although we might imagine a science based so much on computing as being distanced from life’s nitty-gritty, future researchers may look back on today’s world as the one that is more abstracted. To them the science practised now may, ironically, look like a sort of virtual reality, constrained by the artificialities of data selection and lab analysis: a science not yet ready to capture the essence of the real world.
Computer scientist and science fiction writer Vernor Vinge writes on the future of the Internet in The Creativity Machine
All this points to ways that science might exploit the Internet in the near future. Beyond that, we know that hardware will continue to improve. In 15 years, we are likely to have processing power that is 1,000 times greater than today, and an even larger increase in the number of network-connected devices (such as tiny sensors and effectors). Among other things, these improvements will add a layer of networking beneath what we have today, to create a world come alive with trillions of tiny devices that know what they are, where they are and how to communicate with their near neighbours, and thus, with anything in the world. Much of the planetary sensing that is part of the scientific enterprise will be implicit in this new digital Gaia. The Internet will have leaked out, to become coincident with Earth.
How can we prepare for such a future? Perhaps that is the most important research project for our creativity machine. We need to exploit the growing sensor/effector layer to make the world itself a real-time database. In the social, human layers of the Internet, we need to devise and experiment with large-scale architectures for collaboration. We need linguists and artificial-intelligence researchers to extend the capabilities of search engines and social networks to produce services that can bridge barriers created by technical jargon and forge links between unrelated specialties, bringing research groups with complementary problems and solutions together — even when those groups have not noticed the possibility of collaboration. In the end, computers plus networks plus people add up to something significantly greater than the parts. The ensemble eventually grows beyond human creativity. To become what? We can’t know until we get there.
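Vinge’s idea of the sensor/effector layer as a real-time database is easier to picture with a toy sketch. Everything below is my own invention for illustration, not anything from his article: a node that knows its identity, its location and its near neighbours, and a query that floods through the mesh so the network can be read like a database of the physical world.

```python
# Toy sketch of Vinge's sensor layer (illustrative only): each node knows
# what it is, where it is, and who its near neighbours are; a query floods
# outward through the mesh and collects matching readings.

class SensorNode:
    def __init__(self, node_id, location):
        self.node_id = node_id
        self.location = location      # (lat, lon) the device knows about itself
        self.neighbours = []          # nearby nodes it can reach directly
        self.reading = None           # latest local measurement

    def query(self, predicate, seen=None):
        """Collect readings from this node and, via flooding, its neighbours."""
        seen = seen if seen is not None else set()
        if self.node_id in seen:
            return []
        seen.add(self.node_id)
        results = []
        if self.reading is not None and predicate(self):
            results.append((self.node_id, self.location, self.reading))
        for neighbour in self.neighbours:
            results.extend(neighbour.query(predicate, seen))
        return results

# A small chain of nodes with made-up positions and temperature readings.
nodes = [SensorNode(i, (51.5, float(i))) for i in range(20)]
for a, b in zip(nodes, nodes[1:]):
    a.neighbours.append(b)
    b.neighbours.append(a)
for i, node in enumerate(nodes):
    node.reading = 15.0 + 0.1 * i

# Ask the mesh, from any single node, for all readings east of longitude 10.
east = nodes[0].query(lambda n: n.location[1] > 10.0)
print(len(east), "matching readings, e.g.", east[:3])
```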
Ian Foster, director of the Computation Institute at the University of Chicago, writes of the interaction between science and computer science in A two-way street to science’s future, and argues that scientists need to develop their computer science skills:
… the scientist of 2020 will be adept in computing: not only will they know how to program, but they will have a solid grounding in, for example, the principles and techniques by which information is managed; the possibilities and limitations of numerical simulation; and the concepts and tools by which large software systems are constructed, tested and evolved. This knowledge has been picked up on the job by many pioneering scientists and will hopefully be instilled in the next generation by more formal training. The idea that you can be a competent scientist without such training will soon seem as odd as the notion that you need not have a solid grounding in seventeenth-century mathematics (such as calculus).
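Foster’s phrase about knowing “the possibilities and limitations of numerical simulation” is easy to make concrete. Here is a small example of my own, not from his article: the forward Euler method applied to a simple decay equation, which goes unstable once the step size is too large.

```python
# A tiny illustration (my example, not Foster's) of one "limitation of
# numerical simulation": forward Euler applied to dy/dt = -k*y is only
# stable when the step size h satisfies k*h < 2.

def euler_decay(h, steps, y0=1.0, k=50.0):
    """Integrate dy/dt = -k*y with the forward Euler method."""
    y = y0
    for _ in range(steps):
        y = y + h * (-k * y)
    return y

# Both runs simulate the same interval, t = 0 .. 1.
print(euler_decay(h=0.001, steps=1000))  # ~5e-23: decays, like the true exp(-50)
print(euler_decay(h=0.1, steps=10))      # k*h = 5 > 2: grows to ~1e6 instead
```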
In the article most novel to me, Stephen H. Muggleton, a computer scientist and systems biologist at Imperial College, writes of automated experiments in Exceeding human limits
Statistical and machine-learning approaches to building and updating scientific models typically use ‘open loop’ systems with no direct link or feedback to the collection of data. A robot-scientist project in which I was involved offers an important exception [7]. Here, laboratory robots conducted experiments on yeast (Saccharomyces cerevisiae) using a process known as ‘active learning’. The aim was to determine the function of several gene knockouts by varying the quantities of nutrient provided to the yeast. The robot used a form of inductive logic programming to select experiments that would discriminate between contending hypotheses. Feedback on each experiment was provided by data reporting yeast survival or death. The robot strategy that worked best (lowest cost for a given accuracy of prediction) not only outperformed two other automated strategies, based on cost and random-experiment selection, but also outperformed humans given the same task.
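The closed-loop ‘active learning’ strategy is easier to see in miniature. The sketch below is my own toy reconstruction, not the robot scientist’s actual inductive-logic-programming machinery: hypotheses are candidate gene-to-pathway assignments, an experiment is a (knockout, nutrient) pair, and each round runs the cheapest experiment that best splits the surviving hypotheses.

```python
# Toy closed-loop "active learning" sketch; gene names, nutrients and costs
# are invented placeholders, not data from the robot-scientist project.
import itertools

GENES = ["g1", "g2", "g3"]
PATHWAYS = ["his3", "trp1", "ura3"]                 # hypothetical roles
NUTRIENTS = ["histidine", "tryptophan", "uracil"]
REQUIRES = {"his3": "histidine", "trp1": "tryptophan", "ura3": "uracil"}

# Every assignment of distinct pathways to genes is a competing hypothesis.
hypotheses = [dict(zip(GENES, p)) for p in itertools.permutations(PATHWAYS)]

def predict(hypothesis, knockout, nutrient):
    """A knockout strain grows only if the nutrient its (hypothesised)
    broken pathway would have produced is supplied in the medium."""
    return REQUIRES[hypothesis[knockout]] == nutrient

def cost(nutrient):
    # Invented reagent costs; the real project weighed experiment costs too.
    return {"histidine": 1.0, "tryptophan": 2.0, "uracil": 1.5}[nutrient]

def choose_experiment(live):
    """Pick the experiment that best discriminates the surviving
    hypotheses per unit cost."""
    best, best_score = None, -1.0
    for knockout in GENES:
        for nutrient in NUTRIENTS:
            grows = sum(predict(h, knockout, nutrient) for h in live)
            split = min(grows, len(live) - grows)    # how evenly it divides them
            score = split / cost(nutrient)
            if score > best_score:
                best, best_score = (knockout, nutrient), score
    return best

# Hidden "ground truth" standing in for the real yeast.
truth = {"g1": "trp1", "g2": "ura3", "g3": "his3"}

while len(hypotheses) > 1:
    knockout, nutrient = choose_experiment(hypotheses)
    grew = predict(truth, knockout, nutrient)        # run the (simulated) assay
    hypotheses = [h for h in hypotheses
                  if predict(h, knockout, nutrient) == grew]
    print(f"{knockout} on {nutrient}: grew={grew}, "
          f"{len(hypotheses)} hypotheses left")

print("inferred gene functions:", hypotheses[0])
```

Even in this toy form the loop has the property Muggleton highlights: the next experiment is chosen, on discriminating power and cost, by the current set of hypotheses rather than being fixed in advance.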