The dialectic of science journalism, especially as refracted through social media, is to begin with the overenthusiastic claim “Study proves P.” Within a couple of days, the antithesis: “Study shows nothing of the kind, and anyway not-P.” Then there’s a storm, a cute puppy, or racism, and the matter is never resolved in a progressive synthesis.
Take the recent specimen of a study on the effectiveness of various kinds of face mask. Coverage popped up in my feeds from numerous friends.1 Most people took away the lesson that gaiters were worse than no face mask at all.
Yesterday, Slate ran the antithesis under the headline “For All We Know, Gaiter Masks Are Fine.”
The story makes the sensible point that the research doesn’t give detailed descriptions of the masks that were tested. The results that people interpret as being about gaiters were given under the label “neck fleece.”2 It is unclear whether the researchers were using whatever it is that readers have in mind when they think of gaiters.3
Although the researchers tested some of the masks with a few subjects, the neck fleece was only tested with one person. N was small.
Moreover, the study wasn’t designed to yield verdicts about kinds of face masks. It was primarily about a new, low-cost method for making these kinds of evaluations. Susan Matthews, writing in Slate, puts the point this way:
The purpose of the research was to establish that the testing method worked in principle—not to come up with meaningful or accurate verdicts about particular masks. … It was by no means an exhaustive study that lets us make conclusions about gaiters—it’s a study that incidentally happened to use a gaiter in the course of putting forth a methodology that other people might be able to use to in doing real experiments that would make conclusions about gaiters.
This shifts matters a bit too far, though. Although the study can’t be taken to provide final verdicts on types of masks, it must provide at least somewhat meaningful and somewhat accurate data about how the masks performed under the circumstances observed. More decisive evidence will test masks on multiple people under an array of conditions, but this study can’t establish the method in principle without also providing information about the actual subjects observed. The method is shown to be legitimate only insofar as the numbers it gets are plausible.
This is an instance of what Harry Collins called the experimenter’s regress. In developing a new instrument or method, experimenters rely on an array of theory and tacit knowledge. Without this background knowledge, it would be impossible to validate or calibrate the new method as a legitimate way of discovering things. Yet the results of a new method also serve as some evidence.
The way to keep the experimenter’s regress from becoming vicious is to do further experiments and develop further methods. That itself is a complicated business that relies on further theory and tacit knowledge, so there is no way out of it entirely. Regardless, it’s a slow synthesis that won’t show up in your social media feed.
This is an important point. Scientists and science popularizers don’t even share the same understanding of what scientific literature /is/. To the scholar, a publication is a contribution to an ongoing conversation, which has a backstory and certain assumptions, and the value of that contribution rests on the quality and validity of the methods used. Anyone can turn on a machine and make the buttons go “bloop beep!”; what separates the scholar from the novice is /judgment/, an understanding of what valid findings and valid /questions/ look like. Popular media look at science as though its primary purpose is to issue dispositive pronouncements, but that’s mostly not what the apparatus is equipped to do. A very nice post.