The Logic of Applied Statistics
History: A partial Who's Who of statisticians who worked in cross-disciplinary areas including statistics
Statistics provides the general principles for all scientific investigation. But statistics cannot develop (bootstrap) itself without stimulation from challenging scientific and industrial problems. (Nor can scientific experiments be conducted with maximum efficiency without proper statistical considerations and knowledge of the analysis and software tools at the experimenter's disposal.) Large amounts of data from scientific experimentation (such as new biological experiments) have provided the impetus and demand for new statistical theory, methodology, and computation, which may supersede existing ones. Exactly the same thing happened in the late 19th and early 20th centuries, when waves of new discoveries in biology, C. Darwin's Origin of Species and G. Mendel's discovery of the laws of heredity, led to quantitative developments that produced biometrics in the hands of F. Galton (1822-1911, biometrician) and K. Pearson (1857-1936, biometrician), and then the birth of modern statistics in the hands of R.A. Fisher (1890-1962, geneticist and statistician) and others. Interestingly, this may be the only time that statistics has developed hand in hand with the frontier areas of science.
Further theoretical developments led by J. Neyman (1894-1981, theoretical and applied statistician), E. Pearson (1895-1980, statistician and historian), A.N. Kolmogorov (1903-1987, mathematician and probabilist), Harold Hotelling (1895-1973, economist and mathematical statistician), Abraham Wald (1902-1950, econometrician, game theorist, and mathematical statistician), and others established a school of theoretical statistics (mathematical statistics and modern probability theory) that has formally separated from statistics in practice. Applied Statistics has been relegated to a less prominent place in mainstream statistics, and has been practiced within disparate disciplines such as biostatistics, econometrics, chemometrics, psychometrics, and environmetrics.
It is high time that Applied Statistics had a coherent and unifying principle and logic that practicing statisticians can use and apply every day, and to which teachers can refer their would-be Applied Statistician students.
Opportunities and Challenges
Now we are on the brink of new breakthroughs in the 21st century, with new opportunities offered in molecular biology, biotechnology, nanotechnology, materials science, and computer science.
This time, we are faced with much richer statistical problems of analyzing massive and high-resolution data sets, supported by rapidly improving computational tools. The biological and scientific questions may remain similar, in that we still try to understand and to predict based on data. But unfortunately we have to compete or work closely with computer scientists, electrical engineers, and other scientists in order to meet these challenges, which may require new thinking and breakthroughs in statistical research. Indeed, we have entered a new era of interdisciplinary science, in which many new growth areas demand genuine collaboration among several related disciplines, and a statistician can prove to be a valuable member of many important interdisciplinary projects.
What it takes to be a good Applied Statistician
Doing these new kinds of statistics in scientific investigation is not always easy. Traditionally, statistics has not been developed to cope with large and high-dimensional objects. To cope with the challenges raised by modern scientific and industrial problems, statisticians need to abandon their traditional comfort zone of developing from existing models and embrace the new problems and research areas required by modern science.
Understanding existing methods by providing new mathematical insights may be interesting, but statistics is not about proofs and convergence. It is about drawing inferences from the observed and relating them to the unknown phenomenon through inductive reasoning, which is the most exciting step in a scientific discovery.
The challenge and difficulty of statistics lie in translating a scientific problem into a proper statistical problem, which can then be solved using existing techniques or new ones yet to be developed. I think it is exactly this process of inductive and iterative learning that is the hardest part of statistics. It may take many years of training and experience to develop the statistical intuition and statistical sense of a successful statistical scientist. Breadth of statistical and scientific knowledge will be crucial. The statistician should probably be the most hard-working student of the sciences as well as of mathematics, for he or she needs to keep learning new areas of application while keeping up with developments in statistical computing and mathematics.
Because of the interdisciplinary nature of statistics, I think the most valuable qualities of a good applied statistician include an open and exploratory mind, an appreciation and good sense of many different areas of science, a long-cultivated instinct for where good statistical problems may arise, and patience and persistence in learning and understanding a new area of application. Ideally, the scientific problem to be solved should be important, and the statistics used should be relevant and most appropriate, and hopefully novel and interesting. What cutting-edge interdisciplinary research means is that you work on the frontier at the interface of both disciplines, contributing to top scientific problems while developing novel statistical methodology that has clear applications to other problems.
Copyright: Z.Q. John Lu, 2004, 2008.