# Notable Advances in Statistics: 1971-1976

By the early 1970s, the U.S. Department of Defense was supporting a small internet system (ARPANET) that connected computers at sites across the U.S. The first e-mail message was sent in 1971 and as ARPANET grew, e-mails constituted the bulk of network traffic.

This period saw the creation of more tools for statistical calculations. BMDP (short for Bio-Medical Data Package), a statistical package of computer programs, was developed at UCLA during 1965-1969; the full version became available at MSU in 1973. BMDP was widely used in the 1970s, but by the late 1990s it had been replaced by newer alternatives.

At my previous job, I (MAH) frequently used BMDP because it was the package of choice for data analysis at the National Cancer Institute. Because of that experience, I continued to use it frequently for data analysis. I (REL) remember bringing BMDP into my Math 527 *Regression Methods in Curve Fitting* class in the summer of 1973 for at least one lab problem. I used the preliminary version BMDO a few times during my first year at MSU, 1969-70, but not for classes.

The statistical software suite SAS (previously "Statistical Analysis System") was developed at North Carolina State University from 1966 until 1976, when SAS Institute was incorporated. Since then, it has been continuously extended, revised, and updated. SAS first became available at MSU on the DEC VAX 11/780 in 1983.

The Minitab statistics package was developed in 1972 at the Pennsylvania State University. It began as a light version of OMNITAB 80, a statistical analysis program by NIST for the IBM 7090 computer. Minitab has been continuously extended and updated. It is now distributed by Minitab, Inc., a privately owned company.

Statistical theory and method development included Nei’s 1972 measure of genetic distance between populations based on the analysis of genes, a topic of emerging importance in biology. D. R. Cox published his 1972 paper on the regression analysis of censored failure time data, in which he utilized the concept of partial likelihood and introduced the proportional hazards model. In 1974, cross-validation was popularized in the statistics world by M. Stone and discussants in a JRSS-B paper, although the concept had been used sporadically since the 1930s. Akaike published his criterion for choosing the appropriate explanatory variables in a linear model, now known as Akaike’s Information Criterion (AIC). Generalized linear models received attention due to the 1972 Nelder and Wedderburn article in JRSS-A and the 1975 textbook *Discrete Multivariate Analysis* by Bishop, Fienberg, and Holland.

In 1976, G. V. Glass introduced the term ‘meta-analysis’ for an analysis of analyses. The problem and suggested solutions had been studied for many decades prior to the 1970s, but the desire to combine information across studies, especially the call for ‘evidence-based medicine,’ stimulated statisticians to work on meta-analysis topics. The 1977 paper by Dempster, Laird, and Rubin utilized the Expectation-Maximization (EM) algorithm for maximum likelihood estimation from incomplete data. In 1977, John Tukey introduced the box plot and box-and-whiskers diagram for a concise display of a distribution. The potential for computationally intensive statistical analysis was realized by Bradley Efron, who introduced the bootstrap method for assessing uncertainty in 1977.

In 1973, the Institute of Mathematical Statistics replaced the *Annals of Mathematical Statistics* with two journals, the *Annals of Statistics* and the *Annals of Probability*. By 1976, two new sections of the ASA had been created: the Statistical Computing Section (COMP) and the Survey Research Methods Section (SRMS).

Last revised: 2020-06-20