
Current (or recent) PhilStat Articles of Relevance: Use comments to add your examples

1  COMPare: Qualitative analysis of researchers’ responses to critical correspondence on a cohort of 58 misreported trials

This is an important and illuminating study of the misreporting of medical trial results, and of the inaccurate explanations authors gave of what was done when responding to letters from Goldacre’s group.

articles/10.1186/s13063-019-3172-3

enhanced pdf:
https://rdcu.be/bmSZd


American Phil Assoc Blog: The Stat Crisis of Science: Where are the Philosophers?


The Statistical Crisis of Science: Where are the Philosophers?

This was published today on the American Philosophical Association blog. 

“[C]onfusion about the foundations of the subject is responsible, in my opinion, for much of the misuse of the statistics that one meets in fields of application such as medicine, psychology, sociology, economics, and so forth.” (George Barnard 1985, p. 2)

“Relevant clarifications of the nature and roles of statistical evidence in scientific research may well be achieved by bringing to bear in systematic concert the scholarly methods of statisticians, philosophers and historians of science, and substantive scientists…” (Allan Birnbaum 1972, p. 861).

“In the training program for PhD students, the relevant basic principles of philosophy of science, methodology, ethics and statistics that enable the responsible practice of science must be covered.” (p. 57, Committee Investigating fraudulent research practices of social psychologist Diederik Stapel)

I was the lone philosophical observer at a special meeting convened by the American Statistical Association (ASA) in 2015 to construct a non-technical document to guide users of statistical significance tests–one of the most common methods used to distinguish genuine effects from chance variability across a landscape of social, physical and biological sciences.

It was, by the ASA Director’s own description, “historical”, but it was also highly philosophical, and its ramifications are only now being discussed and debated. Today, introspection on statistical methods is rather common due to the “statistical crisis in science”. What is it? In a nutshell: high-powered computer methods make it easy to arrive at impressive-looking ‘findings’ that too often disappear when others try to replicate them with hypotheses and data analysis protocols fixed in advance.
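One simple way to see how impressive-looking ‘findings’ can arise by chance is to count how many analyses were tried. The sketch below is only an illustration under stated assumptions, not a description of any particular study: it assumes the tests are independent, that every null hypothesis is true, and that each test is run at the same level alpha. The function name is hypothetical.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests of
    true null hypotheses is nominally significant at level `alpha`.

    Assumptions (illustrative only): independent tests, all nulls true,
    and the same significance level for every test.
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them do so together with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        p = prob_at_least_one_false_positive(k)
        print(f"{k:>3} tests at alpha=0.05: P(at least one 'finding') = {p:.3f}")
```

Under these assumptions, searching through 20 analyses gives roughly a 64% chance of at least one nominally significant result even when nothing is going on, which is one reason such results can vanish once the hypothesis and analysis are fixed in advance.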

How should scientific integrity be restored? Experts do not agree and the disagreement is intertwined with fundamental disagreements regarding the nature, interpretation, and justification of methods and models used to learn from incomplete and uncertain data. Today’s reformers, fraudbusters, and replication researchers increasingly call for more self-critical scrutiny of philosophical foundations. Philosophers should take this seriously. While philosophers of science are interested in helping to clarify, if not also to resolve, matters of evidence and inference, they are rarely consulted in practice for this end. The assumptions behind today’s competing evidence reforms–issues of what I will call evidence-policy–are largely hidden to those outside the loop of the philosophical foundations of statistics and data analysis, or Phil Stat. This is a crucial obstacle to scrutinizing the consequences for science policy, clinical trials, personalized medicine, and a wide landscape of Big Data modeling.

Statistics has a fascinating and colorful history of philosophical debate, marked by unusual heights of passion, personality, and controversy for at least a century. Wars between frequentists and Bayesians have been so contentious that everyone wants to believe we are long past them: we now have unifications and reconciliations, and practitioners only care about what works. The truth is that both brand new and long-standing battles simmer below the surface in questions about scientific trustworthiness. They show up unannounced in the current problems of scientific integrity, questionable research practices, and in the swirl of methodological reforms and guidelines that spin their way down from journals and reports, the ASA Statement being just one. There isn’t even agreement as to what it means to say that a method “works”. These are key themes in my Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (2018, CUP).

Many of the key problems in today’s evidence-policy disputes inherit the conceptual confusions of the underlying methods for evidence and inference. They are intertwined with philosophical terms that often remain vague, such as inference, reliability, testing, rationality, explanation, induction, confirmation, and falsification. This hampers communication among various stakeholders, making it difficult to even recognize and articulate where they agree. The philosopher’s penchant for laying bare presuppositions of claims and arguments would let us cut through the unclarity that blocked the experts at the ASA meeting from clearly pinpointing where and why they agree or disagree. (As a mere “observer”, I rarely intervened.) We should put philosophy to work on the popular memes: “All models are false”, “Everything is equally subjective and objective”, “P-values exaggerate evidence”, and “most published research findings are false”.

So am I calling on my fellow philosophers (at least some of them) to learn formal statistics? That would be both too much and too little. Too much because it would be impractical; too little because despite technical sophistication, basic concepts of statistical testing and inference are more unsettled than ever. Proposals about P-values–whether to redefine them, lower them, or ban them altogether–are all the subject of heated discussion and journalistic debate. Megateams of seventy or more authors array themselves on either side of the debate (e.g., Benjamin et al. 2017, Lakens et al. 2018), including some philosophers (I was a co-author in Lakens et al., arguing that redefining significance would not help with the problem of replication). The deepest problems underlying the replication crisis go beyond formal statistics–into measurement, experimental design, communication of uncertainty. Yet these rarely occupy center stage in all the brouhaha. By focusing just on the formal statistical issues, the debates give short shrift to the need to tie formal methods to substantive inferences, to a general account of collecting and learning from data, and to entirely non-statistical types of inference. The goal becomes: who can claim to offer the highest proportion of “true” effects among those outputted by a formal method?

You might say my project is only relevant for philosophers of science, logic, formal epistemology and the like. While they are the obvious suspects, it goes further. Despite the burgeoning of discussions of ethics in research and in data science, the work is generally done by practitioners apart from philosophy, or by philosophers apart from the nitty-gritty details of the data sciences themselves. Without grasping the basic statistics, informed by understanding contrasting views of the nature and goals of using probability in learning, it’s impossible to see where the formal issues leave off and informal, value-laden issues arise or intersect. Philosophers in research ethics can wind up building arguments that forfeit a stronger stance that a critical assessment of the methods would afford (e.g., arguing for a precautionary stance, when there is evidence of genuine risk increase in the data, despite non-significant results). Experimental philosophy is another area that underscores the importance of a critical assessment of the statistical methods on which it is based. Formal methods, logic and probability are staples of philosophy; why not methods of inference based on probabilistic methods? That’s what statistics is.

Not only is PhilStat relevant to addressing some long-standing philosophical problems of evidence, inference and knowledge, it offers a superb avenue for philosophers to genuinely impact scientific practice and policy. Even a sufficient understanding of the inference methods together with a platform for raising questions about fallacies and pitfalls could be extremely effective. What is at stake is a critical standpoint that we may be in danger of losing. Without it, we forfeit the ability to communicate with, and hold accountable, the “experts,” the agencies, the quants, and all those data handlers increasingly exerting power over our lives. It goes beyond philosophical outreach–as important as that is–to becoming citizen scholars and citizen scientists.

I have been pondering how to overcome these obstacles, and am keen to engage fellow philosophers in the project. I am going to take one step toward exploring and meeting this goal, together with a colleague, Aris Spanos, in economics. We are running a two-week immersive seminar on PhilStat for philosophy faculty and post-docs who wish to acquire or strengthen their background in PhilStat as it relates to philosophical problems of evidence and inference, to today’s statistical crisis of replication, and to associated evidence-policy debates. The logistics are modeled on the NEH Summer Seminars for college faculty that I directed in 1999 (on Philosophy of Experiment: Induction, Reliability, and Error). The content reflects Mayo (2018), which is written as a series of Excursions and Tours in a “Philosophical Voyage” to illuminate statistical inference. Consider joining me. In the meantime, I would like to hear from philosophers interested or already involved in this arena. Do you have references to existing efforts in this direction? Please share them.

Barnard, G. (1985). A Coherent View of Statistical Inference, Statistics Technical Report Series. Department of Statistics & Actuarial Science, University of Waterloo, Canada.

Benjamin, D. et al. (2017). Redefine Statistical Significance, Nature Human Behaviour 2, 6–10.

Birnbaum, A. (1972). More on Concepts of Statistical Evidence, J. Amer. Statist. Assoc. 67, 858–861. MR0365793

Lakens, D. et al. (2018). Justify Your Alpha, Nature Human Behaviour 2, 168–71.

Levelt Committee, Noort Committee, Drenth Committee (2012). Flawed Science: The Fraudulent Research Practices of Social Psychologist Diederik Stapel (www.commissielevelt.nl/).

Mayo, D. (2018). Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (CUP). (The first chapter [Excursion 1 Tour I] is here.)

Wasserstein & Lazar (2016). The ASA’s Statement on P-values: Context, Process and Purpose, (and supplemental materials), The American Statistician 70(2), 129–33.

 

Credit for the ‘statistical cruise ship’ artwork goes to Mickey Mayo of Mayo Studios, Inc.

Deborah Mayo is Professor Emerita in the Department of Philosophy at Virginia Tech. She’s the author of Error and the Growth of Experimental Knowledge (1996, Chicago), which won the 1998 Lakatos Prize awarded to the most outstanding contribution to the philosophy of science during the previous six years. She co-edited, with Aris Spanos, Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability, and the Objectivity and Rationality of Science (2010, CUP), and co-edited, with Rachelle Hollander, Acceptable Evidence: Science and Values in Risk Management (1991, Oxford). Other publications are available here.

Many thanks to Nathan Oseroff for inviting me to submit this blogpost to the APA blog.

FAQS

FAQS (2/8)

(1) What are the Instructions for Applying? 

These are in the first post of this blog. The Cover sheet contents may be found in the menu of pages at the top of this blog. Please type your responses and save them as a PDF.

(2) Will I have to have a background in statistics before the seminar?

The main aim of the seminar is to provide such a background. So, the answer is no. You’ll have a chance to describe your background in your application. Some will have stat and no philo, and we might break out into groups with one group working on philo, the other on stat. You can follow our current graduate Seminar here.

It’s useful to know some probability, and I find it helpful to watch some of the seminars at the free Khan Academy: AP Statistics. Participants are expected to have read at least 3/4 of SIST (Mayo 2018) in advance so that they can participate and keep up with the discussion in our condensed schedule.

Marriott Residence Inn, Blacksburg

Participants will each have their own suite at the Marriott Residence Inn, Blacksburg. (We will pay for the housing directly.)
It’s new and really nice! Details on reserving your room will be given upon your acceptance to the seminar, and your agreeing to participate. For participants needing several bedrooms, we will help you search the many rental options in Blacksburg on Airbnb.


Welcome to Summer Seminar in Phil Stat

SUMMER SEMINAR IN PHIL STAT (2/5/19)

OVERVIEW

SELECTION CRITERIA

APPLICATION INSTRUCTIONS

STIPEND, EXPECTATIONS, AND CONDITIONS OF AWARD

CHECKLIST OF APPLICATION MATERIALS

 

OVERVIEW

We will offer a 2-week immersive seminar on Philosophy of Statistics (PhilStat) at Virginia Tech for faculty and post-docs in philosophy who wish to acquire or strengthen their background in PhilStat as it relates to philosophical problems of evidence and inference, to today’s statistical crisis of replication, and to associated evidence-policy debates. We also invite social scientists and methodology researchers interested in strengthening their philosophical scholarship in this arena. A total of 12-15 applicants will be selected. Given our goals, we anticipate approximately 2/3 will be philosophers, but we are not applying any rigid rules. (We will consider up to 2 advanced Ph.D. students working on a dissertation that is directly in this area.)

All accepted participants will receive private housing with kitchen facilities (Marriott Residence, Blacksburg) and a stipend of $1,000 (in 2 installments). See STIPENDS and [1].

Our primary goal is to strengthen research and teaching in Philosophy programs by incorporating Phil Stat. However, we wish also to enable statistical practitioners and researchers on methods to gain a greater understanding of the philosophical dimensions of statistical debates, as well as a facility in the conceptual and critical skills included under the umbrella of Phil Stat. Today’s debates are intertwined with philosophical terms that often remain vague, such as evidence, validity, inference, realism, reliability, rationality, explanation, induction, confirmation, and falsification. This hampers communication among various stakeholders, making it difficult to even see where they agree. Thus we also encourage interested social scientists and methodology researchers to apply.

Philosophy of Statistics (Phil Stat), broadly understood
Phil Stat includes ...

“The well-known definition of a statistician as someone whose aim in life is to be wrong in exactly 5 per cent of everything they do misses its target.” (Sir David Cox 2006a, p. 197)

Upon being accepted, participants will receive a copy of Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars (Mayo 2018, CUP). It is a good idea to read parts of it in advance. Applicants will be notified in early April.
