Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

NSF recognizes math and statistics are not the same thing...kind of

There’s controversy brewing over at the National Science Foundation over names. Back in October 2011, Sastry Pantula, the Director of the Division of Mathematical Sciences at NSF (formerly the Chair of the NC State Statistics Department and President of the ASA), proposed that the name of the Division be changed to the “Division of Mathematical and Statistical Sciences”. Excerpting from his original proposal, Pantula says:

Extracting useful knowledge from the deluge of data is critical to the scientific successes of the future. Data-intensive research will drive many of the major scientific breakthroughs in the coming decades. There is a long-term need for research and workforce development in computational and data-enabled sciences. Statistics is broadly recognized as a data-centric discipline, thus having it in the Division’s name as proposed would be advantageous whenever “Big Data” and data-sciences investments are discussed internally and externally.

This bureaucratic move by Pantula created quite a reaction. A sub-committee of the Mathematical and Physical Sciences Advisory Committee (MPSAC) was formed to investigate the name change and to solicit feedback from the relevant communities. The sub-committee was chaired by Fred Roberts (Rutgers) and also included James Berger (Duke), Emery Brown (MIT), Kevin Corlette (U. of Chicago), Irene Fonseca (CMU), and Juan Meza (UC Merced). A number of organizations provided feedback to the sub-committee, including the American Statistical Association and the American Mathematical Society.

There was intense feedback both for and against the name change. Somewhat predictably, mathematicians were adamantly opposed to the name change and statisticians were for it. The final report of the sub-committee is both interesting and enlightening for those not familiar with the arguments involved.

First a little background for people (like me) who are not familiar with NSF’s organizational structure. NSF has a number of Directorates, of which Mathematical and Physical Sciences (MPS) is one, and within MPS is the Division of Mathematical Sciences (DMS). DMS includes 11 program areas ranging from algebra and number theory to topology. Statistics is one of those program areas. 

This should already give one pause. How exactly do statistics and topology end up in the same basket? I’m not exactly sure, but I’m guessing it’s the result of bureaucratic inertia. Statistics came later and it had to be stuck somewhere. DMS is not the only place at NSF to get funding for statistics, but a quick search through the currently active grants shows that the vast majority of statistics-related grants go through DMS, with a smattering coming through other Divisions.

The primary issue here, and the only reason it’s an issue at all, is money. Statistics is one of 11 program areas in DMS, which means that it gets roughly 9% of the funding allocated to DMS (1 of 11, assuming a more or less even split across program areas). This is worth noting: the entire field of statistics gets roughly as much funding as, say, topology. For example, one of the arguments against the name change in the sub-committee’s report is

3). Statistics constitutes a small (although significant) proportion of the DMS portfolio in terms of number of programs, number of grant applications, number of grants funded.

Well, yes, but I would argue that the reason for this is the historically low prioritization of statistics in the Division. This is a choice, not a fact. I believe statistics could play a much bigger role in the Division and perhaps within NSF more generally if there were an agreement on its importance. A key argument comes next, which is

If the name change attracts more proposals to the Division from the statistics community, this could draw funding away from other subfields and it could also increase the workload of the Division’s program officers.

Okay, so money’s important too, but let’s get to the main attraction, which comes in comment number 5:

5). Statistics is funded throughout the federal government. The traditional funding of statistics by DMS is appropriate: fund fundamental research in statistics. Broadening the mission of DMS to include more applied statistics would not benefit the overall funding of the mathematical sciences.

The first sentence is a fact: Many government agencies fund statistics research. For example, the National Institutes of Health funds many statisticians who develop and apply methods to problems in the health sciences. The EPA will occasionally fund statisticians to develop methods for environmental health applications.

But who is charged with funding the development and application of statistical methods to every other scientific field? The problem now is that you essentially have a group of NIH-funded (bio)statisticians doing biomedical research and a group of NSF-funded statisticians doing “fundamental” research in statistics (note that “fundamental” equals “mathematical” here). But that hardly represents all of the statisticians out there. So for the rest of the statisticians who are not doing biomedical research and are not doing “fundamental” research, where do they go for funding?

These days, statistics is “applied” to everything. NSF itself has acknowledged that we are in an era of big data—it’s clear that statistics will play a big role whether we call it “statistics” or not. If NSF decided to fund research into the application of statistics to all areas, it would likely overwhelm the funding of every other program area in DMS. This is why the “solution” is to resort to what is informally understood as the mission of NSF, which is to fund “fundamental” research.

But it’s not clear to me that NSF should limit itself in this manner. In particular, if NSF got serious about funding the application of statistics to all scientific areas (either through DMS or some other Division), it would incentivize statisticians to build stronger and closer collaborations with scientists all over. I see this as a win-win for everyone involved. 

As a statistician, I’m willing to admit I’m biased, but I think NSF should play a much bigger role in advancing statistics as one of the critical tools of the future. Perhaps the solution is not to rename the Division, but to create a separate division for statistical sciences independent of mathematics, one of the suggestions in the sub-committee report. This separation would mirror what has occurred in many universities over the past 50 years or so with the creation of independent departments of statistics and biostatistics.  

Ultimately, the name of the Division was not changed. Here’s the release from last week:

NSF is committed to supporting the research necessary to maximize the benefits to be derived from the age of data, and to promoting and funding research related to data-centric scientific discovery and innovation, and in particular, the growing role of the statistical sciences in all research areas. Recognizing both the complex composition of the various communities and the support of statistical sciences throughout NSF, and taking into account the various community views described in the very thoughtful report of the MPSAC, I have decided to maintain the name “Division of Mathematical Sciences (DMS)” within MPS, but to affirm strong commitment to the statistical sciences.

To demonstrate this commitment, (a) whenever appropriate, we will specifically mention “statistics” alongside “mathematics” in budget requests and in solicitations in order to recognize the unique and pervasive role of statistical sciences, and to ensure that relevant solicitations reach the statistical sciences community….

Well, I feel better already. I suppose this is progress of some sort.