Discovery science
Discovery science (also known as discovery-based science) is a scientific methodology which aims to find new patterns and correlations, and to form hypotheses, through the analysis of large-scale experimental data. The term “discovery science” encompasses various fields of study, including basic, translational, and computational science and research.[1] Discovery-based methodologies are commonly contrasted with traditional scientific practice, in which hypotheses are formed before experimental data is closely examined. Discovery science relies on inductive reasoning, that is, using observations to make generalisations, and can be applied to a range of science-related fields, e.g., medicine, proteomics, hydrology, psychology, and psychiatry.[2][3][4][5][6]
Overview
Purpose
Discovery science places an emphasis on 'basic' discovery, which can fundamentally change the status quo. For example, in the early years of water resources research, discovery science was demonstrated by efforts to elucidate phenomena that were, until that point, unexplained. It did not matter how unusual these ideas may have been perceived to be. In this sense, discovery science is based on the attitude that “we must not allow our concepts of the earth, in so far as they transcend the reach of observation, to root themselves so deeply and so firmly in our minds that the process of uprooting them causes mental discomfort” (as stated by Davis in 1926).[7] Utilising discovery science requires a return to creating and testing genuine hypotheses, rather than praising concepts that are already familiar.[2] While researchers commonly feel that new hypotheses will naturally emerge inductively from curiosity in the relevant field, it should be acknowledged that hypotheses can also be generated by models.[2] Additionally, deductive testing must involve field observation, so that imperfect answers can be replaced with more clearly defined questions.[2]
Tools
Hypothesis-driven studies can be transformed into discovery-driven studies with the help of newly available tools and technology-driven life science research.[5] These tools have allowed for new questions to be asked, and new paradigms to be considered, particularly in the field of biology. However, some of these required tools are limited in the sense that they are inaccessible or too costly because the related technology is still being developed.[5]
Data mining is the most common tool used in discovery science, and is applied to data from diverse fields of study such as DNA analysis, climate modelling, and nuclear reaction modelling. The use of data mining in discovery science follows a general trend of increasing use of computers and computational theory in all fields of science. Newer methods of data mining employ specialised machine learning algorithms for automated hypothesis forming and automated theorem proving.
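As a minimal illustration of hypothesis-free data mining, the following Python sketch screens every pair of variables in a small table for strong linear correlation and reports the strongest pairs as candidate hypotheses. The variable names, the toy values, the helper names `pearson` and `screen_pairs`, and the 0.8 threshold are all assumptions made for illustration; they are not drawn from any cited study, and real discovery pipelines use far larger datasets and more sophisticated methods.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def screen_pairs(table, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold,
    strongest first. No hypothesis is specified in advance."""
    hits = []
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            hits.append((a, b, r))
    return sorted(hits, key=lambda hit: -abs(hit[2]))


# Hypothetical toy measurements (illustrative values only).
measurements = {
    "rainfall": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "streamflow": [2.1, 3.9, 6.2, 8.1, 9.8, 12.1],
    "temperature": [15.0, 11.0, 16.0, 12.0, 15.5, 11.5],
}

candidates = screen_pairs(measurements)
for name_a, name_b, r in candidates:
    print(f"candidate hypothesis: {name_a} ~ {name_b} (r = {r:.3f})")
```

With these toy values, only the rainfall and streamflow pair exceeds the threshold. On large datasets the number of pairs grows quadratically, so some strong correlations will arise by chance; a screen of this kind generates candidate hypotheses and does not confirm them.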
Methodology
Discovery-based methodologies are often viewed in contrast to traditional scientific practice, where hypotheses are formed before close examination of experimental data. However, from a philosophical perspective in which all or most of the observable "low-hanging fruit" has already been plucked, examining the phenomenological world more closely than the senses alone allow (even augmented senses, e.g. via microscopes, telescopes, bifocals etc.) opens a new source of knowledge for hypothesis formation. This process is also known as inductive reasoning, or the use of specific observations to make generalisations.
Discovery science is usually a complex process, and consequently does not follow a simple linear cause and effect pattern.[1] This means that outcomes are uncertain, and disappointing results are to be expected as a fundamental part of discovery science.[1] In particular, this may apply to medicine for the critically ill, where disease syndromes may be complex and multi-factorial.[1] In psychiatry, studying complex relationships between brain and behaviour requires large-scale science. This calls for a conceptual switch from hypothesis-driven studies to hypothesis-generating research which is discovery-based.[4] Normally, discovery-based approaches to research are initially hypothesis-free; however, hypothesis testing can be elevated to a new level that effectively supports traditional hypothesis-driven studies.[11] Researchers hope that integrative analyses of data from a range of different levels can result in new classification approaches that enable personalised interventions.[3] Some biologists, such as Leroy Hood, have suggested that ‘discovery science’ is a model towards which certain research fields are heading. For example, it is believed that more information about gene function can be discovered through the evolution of data-mining tools.[4]
Discovery-based approaches are often referred to as “big data” approaches, because of the large-scale datasets that they analyse.[9] Big data includes large-scale homogeneous study designs and highly variant datasets, and can be further divided into different kinds of datasets.[9] For example, in neuropsychiatric studies, big data can be categorised as ‘broad’ or ‘deep’ data.[9] Broad data is complex and heterogeneous, as it is collected from multiple sources (e.g., labs and institutions) and uses different kinds of standards.[9] Deep data, on the other hand, is collected at multiple levels, e.g., from genes to molecules, cells, circuits, behaviours, and symptoms.[9] Broad data allows population-level inferences to be made; deep data is required for personalised medicine.[9] However, combining broad and deep data and storing them in large-scale databases makes it practically impossible to rely on traditional statistical approaches. Instead, discovery-based big data approaches can generate hypotheses and offer a high-throughput analytical tool for pattern recognition and data mining. In this way, discovery-based approaches can provide insight into the causes and mechanisms of the area of study.[9]
Although discovery-based and data-driven big data approaches can inform understanding of the mechanisms behind the topic of concern, the success of these approaches depends on integrated analyses of the various types of relevant data, and on the resultant insight provided.[9] For example, when researching psychiatric dysfunction, it is important to integrate vast and complex data such as brain imaging, genomic data and behavioural data, to uncover any brain-behaviour connections that are relevant to psychiatric dysfunction.[12] Integrating data and developing mining tools are therefore significant challenges. Furthermore, validation of results is a major challenge for discovery-based science. Although results can be statistically validated using independent datasets, ultimate validation depends on tests of functionality. Collaborative efforts are therefore critical for success.[9]
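The idea of statistical validation with an independent dataset can be sketched in Python as follows. A candidate association found in a discovery dataset is treated as replicated only if it is also strong, with the same sign, in a separately collected dataset. The variable names (`marker_x`, `marker_y`, `symptom_score`), the toy values, the helper `replicated`, and the 0.8 threshold are hypothetical and chosen only for illustration. A replicated correlation remains a statistical result; as noted above, functional tests are still needed for ultimate validation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def replicated(pair, discovery, replication, threshold=0.8):
    """True if the pair is strongly correlated, with the same sign,
    in both the discovery and the independent replication dataset."""
    a, b = pair
    r_disc = pearson(discovery[a], discovery[b])
    r_rep = pearson(replication[a], replication[b])
    same_sign = r_disc * r_rep > 0
    return same_sign and abs(r_disc) >= threshold and abs(r_rep) >= threshold


# Hypothetical discovery cohort (illustrative values only).
discovery = {
    "marker_x": [1, 2, 3, 4, 5],
    "marker_y": [1, 3, 4, 7, 9],
    "symptom_score": [2, 4, 5, 8, 10],
}

# Hypothetical independent cohort, e.g. collected at another institution.
replication = {
    "marker_x": [2, 3, 5, 6, 8],
    "marker_y": [6, 2, 7, 1, 4],
    "symptom_score": [3, 5, 9, 11, 15],
}

for pair in [("marker_x", "symptom_score"), ("marker_y", "symptom_score")]:
    status = "replicated" if replicated(pair, discovery, replication) else "not replicated"
    print(pair, status)
```

In this toy example both markers track the symptom score in the discovery data, but only `marker_x` does so in the independent data, so the `marker_y` association would be set aside as a likely chance finding.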