December 7, 2013

Thought (Memetic) Soup, December edition

Here is the latest installment of assorted features from my micro-blog, Tumbld Thoughts. These include heuristic-based prediction of the future, system shock and recovery (human edition), and exposure vs. prestige. More from the intersection of human culture, technology, and complexity theory.

Heuristic-based Prediction of the Future

How do you predict the future? How does anyone predict the future? Perhaps they use heuristics such as the extrapolation of current trends, gradualistic change, or stasis in human value systems (see the "future prediction heuristics GUI", top picture). Here are two attempts at future approximation from academia and the technology industry, respectively.

"We have tended to see the professor as a single figure, but he is now a multiple being of many types, tasks, and positions". Circa 2013.

The article in [1] is a counter to the common argument that academia has undergone a period of "deskilling". Here, the author thinks sociological differentiation rather than deskilling is at the root of institutional change, and that this trend will continue into the future.

"Do our computer pundits lack all common sense? The truth in no online database will replace your daily newspaper, no CD-ROM can take the place of a competent teacher and no computer network will change the way government works". Circa 1995.

The article in [2] is a retrospective look at a critique of the internet written in 1995. The critique pushed back against the unbridled optimism of the time about how the internet would change society. Many of the changes it dismissed as unrealistic did come to pass, though not all of them unfolded in the way people expected in 1995 [3].

Systemic Shock and Recovery: human edition

For the end of the 2013 hurricane season, I provide some storm-related free association. The first picture is about what tends to happen socio-economically in the aftermath of a hurricane. This was inspired not only by the aftermath of Typhoon Haiyan [4], but also by the response of fault-tolerant computer systems.

The picture above shows a summary [5] of all hurricane tracks in the Atlantic during the 2013 season. This season was fairly quiet, with no major storms and a relatively small number of landfalls.

Exposure vs. Prestige

Randomly sampling Wikipedia entries and then using them to predict h-index scores may mean nothing. This is a play on the title of [6], but serves as a good, one-sentence critique of [7].

The authors of [7] suggest that personal profiles of scientists on Wikipedia should correspond with scientific impact (measured using the h-index). If they do not, then it suggests that Wikipedia is the source of distortion, artificially giving attention to lesser mortals (as it were). 
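For readers unfamiliar with the metric, the h-index is the largest number h such that a scientist has h papers with at least h citations each. Here is a minimal Python sketch of that definition. The function name and the citation counts are made up for illustration and are not taken from [7].

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for one scientist's papers.
print(h_index([10, 8, 5, 4, 3]))
# Two very different (hypothetical) records that receive the same score.
print(h_index([100, 100, 100, 100]), h_index([4, 4, 4, 4]))
```

The second line of output is the point: a single number can hide large differences between citation records.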

However, this assumes two things: that the properties of Wikipedia entries should reflect the scoring of citation indices, and that random samples of Wikipedia entries will correspond to the distributions of h-index values.

The first assumption is only valid if h-indices capture all possible information about scientific impact. Clearly, this is not the case, as many different indices have been developed [8] to characterize the various nuances inherent in scientific output and influence. 

The authors of [8] present a systematic review of various citation indices. Importantly, none of these produce a normal distribution centered around a mean. So when the mean h-index value of the Wikipedia sample is compared to the h-indices of different scientific fields, the comparison does not mean as much as one would assume at first glance.
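To illustrate why a mean can mislead when values are skewed, here is a small sketch using Python's standard library. The eight h-index values are hypothetical and chosen only to make the effect visible. They are not data from [7] or [8].

```python
import statistics

# Hypothetical h-index values for eight scientists; one has an unusually high score.
h_values = [2, 3, 3, 4, 5, 6, 8, 60]

mean_h = statistics.mean(h_values)
median_h = statistics.median(h_values)

# A single large value pulls the mean well above the typical (median) score.
print(f"mean = {mean_h}, median = {median_h}")
```

In this toy sample, seven of the eight scientists score below the mean, so the mean says little about a typical member of the group.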

This brings us to the second assumption, which regards the underlying distribution of scientific impact. While this is not clearly discussed in [7], we know from other studies [9] that scientific impact can be explained using Lotka's Law (which can be characterized using a Pareto distribution).
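In its classic form, Lotka's Law says that the number of authors with n papers falls off roughly as 1/n^2 relative to the number of authors with a single paper. The sketch below assumes that exponent of 2. The function name and the starting count of 100 authors are hypothetical, chosen only to show the shape of the long tail.

```python
def lotka_expected_authors(single_paper_authors, max_papers, exponent=2.0):
    """Expected number of authors with n papers, for n = 1..max_papers.

    Assumes the classic inverse-power form: count(n) = count(1) / n**exponent.
    """
    return {
        n: single_paper_authors / n ** exponent
        for n in range(1, max_papers + 1)
    }


# Hypothetical field in which 100 authors have exactly one paper.
counts = lotka_expected_authors(single_paper_authors=100, max_papers=5)
for n, expected in counts.items():
    print(n, round(expected, 1))
```

Most authors sit at the low end, while a small number account for a disproportionate share of the output.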

While this long tail can be mitigated using specialized metrics such as the x-index [10], this was not considered in [7]. In fact, one could argue that Wikipedia profiles and citation indices are statistically independent of one another.


[1] Williams, J.J. The Great Stratification. Chronicle of Higher Education, December 2 (2013).

[2] This was a piece by Clifford Stoll in Newsweek magazine. For more, please see: Yglesias, M. Predictions About the Web From 1995. Moneybox blog, December 2 (2013).

[3] For an interesting look at the technological way forward from around that time, check out: Gates, B. The Road Ahead. Penguin (1995).

[4] Langfitt, F. After the Storm: commerce returns to damaged Philippines city. Parallels NPR, November 25 (2013) AND Quismorio, E.A. Looters' goods sold on Tacloban streets. Tempo: news in a flash, November 21 (2013).

[5] Masters, J. The Unusually Quiet Atlantic Hurricane Season of 2013 Ends. Dr. Jeff Masters' WunderBlog, November 29 (2013).

[7] Samoilenko, A. and Yasseri, T. The distorted mirror of Wikipedia: a quantitative analysis of Wikipedia coverage of academics. arXiv:1310.8508 (2013).

[8] Alonso, S., Cabrerizo, F.J., Herrera-Viedma, E., and Herrera, F. h-index: a review focused in its variants, computation and standardization for different scientific fields. Journal of Informetrics, 3(4), 273-289 (2009). doi:10.1016/j.joi.2009.04.001.

[9] MacRoberts, M.H. and MacRoberts, B.R. A Re-Evaluation of Lotka's Law of Scientific Productivity. Social Studies of Science, 12(3), 443-450 (1982).
