July 23, 2013

WARNING: Data and Narratives may lead to Bias

This post contains three features cross-posted to my micro-blog, Tumbld Thoughts. I am carving something at its joints here (see note #5 for reference), but am not sure exactly what. I guess it is the role of belief and human nature in things we usually (from a common-sense perspective, at least) consider to be objective and/or conceptually attractive (e.g. data analysis, scientific theory, and intuitions about complexity).

Featured are three loosely-related topics: the uncritical interpretation of big data (I), shortcomings of the narrative explanation (II), and subtle but important biases in scientific thinking (III). Nothing is sacred (or at least reverent) here, but then again it shouldn't be.

I. Uncritical Interpretations of "Big" Data

This is what a religion based on big data would look like. Or rather, this is what the uncritical interpretation of big data [1] currently looks like. A few readings on the topic:

1) a news article [2] on how the misinterpretation of big data (and over-reliance on its correlative relationships) threatens to undermine effective decision-making. Of greatest importance is distinguishing between correlation and causation, a data analysis (and logical reasoning) issue that predates big data.

2) an op-ed [3] featuring the work of Seth Stephens-Davidowitz, an economist from Google who is debunking the idea that child abuse and neglect decreased in the aftermath of the 2008 economic crisis. He accomplishes this using a novel methodology based on aggregate Google searches (unobserved, online activity), an example of how using different methods to address the same problem yields different results. It is also an example of how we might better extract causal relationships from "big" datasets.
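To make the correlation vs. causation point concrete, here is a minimal sketch of my own (it is not taken from either reading). It assumes a toy setup in which two variables, x and y, never influence each other but both depend on a hidden third variable z. All names, the sample size, and the noise levels are illustrative assumptions.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def residuals(vs, zs):
    """Remove the linear (least-squares) effect of zs from vs."""
    mv, mz = statistics.fmean(vs), statistics.fmean(zs)
    slope = (sum((v - mv) * (z - mz) for v, z in zip(vs, zs))
             / sum((z - mz) ** 2 for z in zs))
    return [(v - mv) - slope * (z - mz) for v, z in zip(vs, zs)]


def simulate(n=5000, seed=0):
    """Toy data: z is a hidden confounder; x and y each equal z plus noise."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]  # x depends on z only
    y = [zi + rng.gauss(0, 1) for zi in z]  # y depends on z only, never on x
    return z, x, y


z, x, y = simulate()
raw = pearson(x, y)  # sizeable (about 0.5 in theory) despite no x -> y link
adjusted = pearson(residuals(x, z), residuals(y, z))  # close to zero
print(f"raw correlation: {raw:.2f}")
print(f"after adjusting for z: {adjusted:.2f}")
```

In this toy world, the correlation between x and y is substantial even though neither causes the other, and it largely disappears once z is accounted for. In a real "big" dataset the confounder is usually unmeasured, which is exactly why the raw correlation alone is not enough.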

II. Going beyond "the narrative" explanation

Is "the narrative" the best way to convey complex ideas, especially when it comes to social or scientific explanation? Barry Ritholtz and Cullen Roche [4] remind us that in contemporary economics, the prevailing narratives often fail to capture the complexity or even the outcomes of real-world situations. 

This is a failure of explanations that have no capacity to take conflicting evidence into account. It ultimately results in cognitive dissonance, which becomes more pronounced as the narrative explanations continue to fail.

That being said, using a narrative structure to describe the function and complexity of scientific and social concepts is not always to be avoided. Here is my list of reasons why narratives "work" in this context:

1) narratives contain common structures that are shared across cultures.

2) narrative structures are consistent with naive models [5], which represent an intuitive view of the world [6].

3) narratives are compact ways to encode information, such as oral traditions or feature films. Compare the amount of potential information contained in these with a Wiki or a blog post.

Why narratives do not work in this context:

1) narratives can perpetuate naive models of the world in the face of contradictory evidence. Examples of this are given in [4].

2) naive models (and thus narratives) are often conservative, and do not encode nonlinear effects or the parallel progression of events. Narrative thinking favors simple cause-and-effect mechanisms over mechanisms that favor multiple causes or long-term, delayed outcomes [7].

3) like most specialized information, they often require intersubjectivity. Unlike most specialized information, they require a moral logic. This may or may not obfuscate the interpretation of events.

4) narratives often utilize allegorical arguments, in which a single string of text can be interpreted in many ways. For example, if a message is passed around a circle, it often changes due to intrasubjectivity (e.g. individual interpretations). While this is good for cultural diversity, it is not so good for scientific replication.

Whereas the narrative is valuable to a communicator, it may be less valuable from a technical standpoint. Overall, using "plain language" and "narrative structure" can actually undercut the scientific content. 

III. Bias in Scientific Thinking

Is it bias, or is it good science? A series of recent papers/talks may give us some insight [8]. The first is a Skepticon IV talk [9] and Measure of Doubt blog post by Julia Galef on the "Straw Vulcan" phenomenon. A Straw Vulcan is shorthand for the popular misconceptions surrounding logical decision-making and how what may seem logical may be co-opted by emotionally-driven biases.

Interesting enough, but what does this have to do with science? Well, recently published papers in PLoS Biology [10] suggest that bias is a natural feature of scientific thinking, which results from a tension between the need to shift paradigms (Kuhnian science) and the need to falsify hypotheses (Popperian science).


[1] Whitehorn, M.   The Parable of Beer and Diapers. The Register, August 15 (2006).

[2] Asay, M.   Big Data's Dehumanizing Impact on Public Policy. ReadWrite content aggregator, July 12 (2013).

[3] Stephens-Davidowitz, S.   How Googling Unmasks Child Abuse. NYT Opinion, July 13 (2013).

[4] Examples from so-called "common sense knowledge" in economic issues include:

a) Ritholtz, B.   The Narrative Fails. The Big Picture blog, July 19 (2013).

b) Roche, C.   The Fear Trade has been Demolished. Pragmatic Capitalism blog, July 19 (2013).

[5] For a structure learning perspective, please see: Gershman, S.J. and Niv, Y.   Learning latent structure: carving nature at its joints. Current Opinion in Neurobiology, 20, 251-256 (2010).

[6] for more information on the role of narratives vs. data in the wealth inequality debates, see the following:

a) Norton, M.I. and Ariely, D.   Building a better America -- one wealth quintile at a time. Perspectives on Psychological Science, 6(1), 9-12 (2011). Bottom image is taken from Figure 2.

b) Noah, T.   Theoretical Egalitarians. Slate.com, September 27 (2010).

[7] Wexler, M.   Invisible Hands: intelligent design and free markets. Journal of Ideology, 33 (2011).

[9] here is video of Julia Galef's lecture at Skepticon IV, and here is her post on the topic at Measure of Doubt blog.

[10] relevant papers from this issue include the following:

a) Chase, J.M.   The Shadow of Bias. PLoS Biology, 11(7), e1001608 (2013).

b) Tsilidis, K.K., Panagiotou, O.A., Sena, E.S., Aretouli, E., Evangelou, E., Howells, D.W., Al-Shahi Salman, R., Macleod, M.R., and Ioannidis, J.P.A.   Evaluation of Excess Significance Bias in Animal Studies of Neurological Diseases. PLoS Biology, 11(7), e1001609 (2013).
