Wednesday, July 31, 7-8:30
Reader, School of Psychology
Presentation Abstract: Most scientific work happens behind the scenes: every published scientific result is an intermediate step in a process that may have taken years to realize. Opportunistic analyses, publication bias, and outright fraud may be important to consider when assessing the trustworthiness of a result, but they are, of course, not mentioned in the report. They might, however, leave statistical traces: Fisher (1936), for instance, pointed out that Mendel’s (1866) results with pea plants were too close to their theoretical values to be accounted for by chance variation, and may have been intentionally falsified, whether for didactic reasons or by an assistant trying to please Mendel. Similar methods have been at the center of assessing the credibility of more recent research, but the essential character of the modern methods is a straightforward extension of Fisher’s logic (which is itself significance testing). We can call these “statistical forensics”: methods whose goal is to shed light on whether a body of research is trustworthy, and perhaps to correct for issues that might cause doubt. I will outline some of these methods, describe where they have been used in practice, and discuss potential objections.
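The Fisher-on-Mendel logic can be sketched in a few lines: where an ordinary goodness-of-fit test asks whether the chi-squared statistic is too *large*, the forensic check asks whether it is too *small*, i.e. whether the left-tail probability is suspiciously near zero. The sketch below uses one of Mendel’s 3:1 ratios (round vs. wrinkled seeds) as a single illustrative experiment; note that Fisher’s actual suspicion rested on aggregating chi-squared across all of Mendel’s experiments, where the combined left-tail probability was tiny (about 0.00007), not on any one dataset.

```python
# Hedged sketch of a "too good to be true" check, not Fisher's full analysis.
# For 1 degree of freedom, P(chi2 <= x) = erf(sqrt(x/2)), so the standard
# library suffices; for general df one would use scipy.stats.chi2.cdf.
import math

def chi2_cdf_df1(x):
    """Lower-tail probability of a chi-squared variate with 1 df."""
    return math.erf(math.sqrt(x / 2.0))

# Mendel's round vs. wrinkled seed counts, expected in a 3:1 ratio.
observed = [5474, 1850]
n = sum(observed)
expected = [n * 3 / 4, n * 1 / 4]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
left_tail = chi2_cdf_df1(chi2)  # near zero => fit suspiciously good

print(f"chi-squared = {chi2:.3f}")
print(f"P(agreement this close or closer, by chance) = {left_tail:.3f}")
```

For this single experiment the left-tail probability comes out around 0.4 — unremarkable on its own, which is exactly why the forensic signal only emerges when many independent experiments all sit implausibly close to expectation.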
The Legal Sequelae of the 2016 American Statistical Association P-Value Statement
Nathan A. Schachtman
Monday, August 5, 7-8:30
In 2016, the American Statistical Association issued an unusual guidance document in which it attempted to redress its perception that p-values and statistical significance were widely misunderstood and misinterpreted. In addition to providing guidance on the meaning and use of attained significance probabilities, the ASA also encouraged the use of “other methods that emphasize estimation over testing,” including Bayesian methods. Although the ASA guidance document warned against misuses of p-values, it did not warn of the potential for misapplication of these “other methods.” The reaction of some segments of the legal community was prompt, interpreting the 2016 guidance both as a rejection of p-values and significance testing and as an encouragement to use “other methods,” with which the judiciary has far less experience, and less acumen to detect invalid inferences. In this presentation, I will discuss how the ASA Statement was used rhetorically to justify causal claims that had been rejected by the FDA and scientific organizations, and to advance a “Bayesian hypothesis test” supporting the claim that a meta-analysis showed an 85 percent probability that testosterone replacement therapy caused either heart attack or stroke. (Fuller Discussion (pdf))
Click on the title of a post or page to open it and see all the information. There is also a search box at the bottom of the front page. The latest (5th) schedule is here. Check back for updates (this is (i)).
- From the Airport: Remember, you need to reserve a shuttle IN ADVANCE if you plan to take one to the hotel. The shuttles are located right at the baggage area of the airport. (You can email them to schedule as well: firstname.lastname@example.org). Try to group with others. If you are taking the SmartWay bus (which goes to Squires, but not the hotel), you don’t need to schedule in advance. [UPDATE: The SmartWay bus does not run to the airport on Sunday, nor does it have any transfer arrangements. Arrangements for the shuttle from the Marriott or individual rides will have to be made later, but we have plenty of time.] Information (webpages and phone numbers) for both can be found here: https://summerseminarphilstat.com/2019/02/02/local-maps-transportation/
- At Squires Student Center: If you’re arriving at Squires Student Center (on Virginia Tech Campus) in the afternoon, and need someone to take you to the hotel (1 mile) please let us know fairly soon and we’ll arrange it. If it’s evening, and we’re at the dinner, we should still be able to find someone to fetch you. (walking map)
- Hotel: Information about the hotel can be found here: https://summerseminarphilstat.com/housing/
Stephen Senn, Consultant Statistician, Edinburgh
100 years ago, in 1919, Fisher arrived at Rothamsted Agricultural Research Station and began his programme of revolutionising statistics. He realised that it was not enough for the subject of statistics to concern itself with the analysis of data but that it also had to deal with the process of collecting and planning to collect data. Thus, statistics became, under his leadership, a subject not just about analysis of experiments but also about their design.
One of the design innovations he introduced was randomisation. Although it has proved a practical success in many fields, it has been a critical failure amongst many methodologists, in particular philosophers of science. In my opinion, much of the mistrust can be traced to a misunderstanding of how the statistical analysis of randomised experiments proceeds. In this talk I attempt to clear up the misunderstanding and show that many of the criticisms of randomisation turn out to be irrelevant.
Related blogs and articles
August 8: the 9th Summer Seminar PhilStat schedule. Material from Spanos (2019) will be provided during the Seminar itself.
Here is the current Participant Presentation Schedule, updated August 5: SSPS
Use the comments to share ideas and questions on SIST and related Phil Stat articles.
Barnett, V. (1999). Comparative Statistical Inference (Chapter 6: Bayesian Inference), John Wiley & Sons.
Benjamin, D., Berger, J., Johannesson, M., et al. (2017). Redefine Statistical Significance, Nature Human Behaviour 2, 6–10.
Berger, J. (2003). Could Fisher, Jeffreys and Neyman Have Agreed on Testing? Statistical Science 18(1): 1–12.
Berger, J. & Sellke, T. (1987). Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence (with Discussion and Rejoinder), Journal of the American Statistical Association 82(397), 112–122; 135–139.
This was published today on the American Philosophical Association blog.
“[C]onfusion about the foundations of the subject is responsible, in my opinion, for much of the misuse of the statistics that one meets in fields of application such as medicine, psychology, sociology, economics, and so forth.” (George Barnard 1985, p. 2)
“Relevant clarifications of the nature and roles of statistical evidence in scientific research may well be achieved by bringing to bear in systematic concert the scholarly methods of statisticians, philosophers and historians of science, and substantive scientists…” (Allan Birnbaum 1972, p. 861).
“In the training program for PhD students, the relevant basic principles of philosophy of science, methodology, ethics and statistics that enable the responsible practice of science must be covered.” (p. 57, Committee Investigating fraudulent research practices of social psychologist Diederik Stapel)
I was the lone philosophical observer at a special meeting convened by the American Statistical Association (ASA) in 2015 to construct a non-technical document to guide users of statistical significance tests–one of the most common methods used to distinguish genuine effects from chance variability across a landscape of social, physical and biological sciences.
It was, by the ASA Director’s own description, “historical”, but it was also highly philosophical, and its ramifications are only now being discussed and debated. Today, introspection on statistical methods is rather common, owing to the “statistical crisis in science”. What is it? In a nutshell: high-powered computational methods make it easy to arrive at impressive-looking ‘findings’ that too often disappear when others try to replicate them, once hypotheses and data analysis protocols are required to be fixed in advance.