Showing posts with label behavior-observation. Show all posts

November 27, 2020

Open-source Community-building Discussions

As part of my role as Community Manager at the Rokwire Initiative, I have been maintaining content on a personal blog. I am highlighting some of the systems-oriented posts here on Synthetic Daisies. The first re-post summarizes the functional aspects of open-source communities, while in the second re-post I propose that open-source communities are actually a form of collective intelligence. The third re-post, a revisitation of wicked problems, is of particular interest to this blog's audience.


Striving for Issue 0 (why do we have a community)

Why do we want to build a community in the first place? The answer may seem self-evident on its surface, even in the open-source context. But it is good to state the reasons why we might want to create a community. More importantly, it is good to think about communities as having explicitly stated benefits. Communities allow us to tap into a heterogeneous pool of expertise, build upon a network of contributors, recruit collaborators for future projects, and facilitate education.

What direction does the action flow? A countdown to the essential issue, or enumeration of the issues?

The first benefit of an open-source community is to allow for a single contributor to interact quickly with many other contributors with a wide range of skills and expertise. For example, if you are a technical writer and have a question about a new programming language, you can post a message to the Slack channel with your inquiry and receive a helpful answer much more quickly than making a blanket inquiry or searching for the answer yourself. This actually sounds like the guiding principle of a University, but with a much more decentralized and informal structure.

Unlike a University, open-source communities are inherently interconnected [1], and as such serve to form a social network of contributors. Our hypothetical Slack channel is part of a Slack team (with many channels), and the Slack team is in turn a part of the entire community. As with a traditional workplace or school, this involves the interactions of many people, often with different roles and interests. A community allows for people who might not otherwise interact to cross paths, and mutually benefit from this serendipity [2].

Example of network structure in an open source community. Map of openrpg bug report interactions. From [3].

By their nature, open-source community networks are broad and loosely integrated. This is necessary by design; many open-source contributors have other commitments and contribute erratically to the project. But this is a good thing! One benefit of this structure is that it enables participation by people who would otherwise not make the time commitment. Related to this is the existence of a talent pool that is poised to build out an existing initiative, or engage in a new initiative. This alleviates the need to recruit and incentivize people using conventional hiring mechanisms.

But these relationships should not be designed to be exploitative. In fact, one contributor to a healthy and stable open-source community is the feeling that contributors are able to benefit from their participation. Short of financial compensation, there are many other ways that contributors can benefit. One of these is becoming educated about the software platform itself. Educational incentives such as badging systems (microcredentials) serve as small-scale incentives. Being a contributor also allows one to learn about the latest features, and even have a hand in developing them. Another way open-source communities can reinforce education is through outreach, particularly in the form of an Ambassadors program [4]. A focus on learning opportunities centered on a specific software platform, in addition to learning and promoting associated skills, builds participation incentives into the community.

In these ways, we can take a step back from issues and builds and think about how we can leverage the community as a benefit of the platform. Then we can work towards addressing Issue 0: building and maintaining a community.

NOTES:

[1] Teixeira, J., Robles, G., and Gonzalez-Barahona, J.M. (2015). Lessons learned from applying social network analysis on an industrial Free/Libre/Open Source Software ecosystem. Journal of Internet Services and Applications, 6, 14.

[2] Achakulvisut, T., Ruangrong, T., Acuna, D.E., Wyble, B., Goodman, D., and Kording, K. (2020). neuromatch: Algorithms to match scientists. eLife Labs, May 18.

[3] Crowston, K. and Howison, J. (2005). The social structure of free and open source software development. First Monday, 10(2).

[4] Starke, L. (2016). Building an Online Community Ambassador Program. Higher Logic blog, July 28.

The Role of Collective Intelligence in Open-source

What do ants have to do with open-source?

Ants (and social insects more generally) are capable of building structures that feature great complexity and labor specialization. These complicated structures result from the small contributions of individual ants, each of which has a specialized job in its community [1]. In this sense, ant colonies exhibit parallels with open-source collaboration. Both types of organization (ant societies and open-source communities) rely upon collective intelligence, or the wisdom and power of crowds, to create an artifact of great complexity shaped by design principles that emerge from these interactions.

Collective intelligence can be defined as intentional behaviors based on a coordinated set of goals among multiple individuals, and emerges from various forms of collaboration, competition, and group effort. For systems ranging from insect colonies to human societies [2], collective intelligence is a prime enabler of coordinated social behaviors and movements [3]. Being aware of how this process works is important for making the most of an open-source effort.

Traditional crowdsourcing can be understood as cooperation between two groups of people: requesters and contributors [4]. Requesters can be thought of as people who want a functional artifact, but may not be able to implement the solution. Contributors are people with technical know-how who realize the initial specification. I have discussed the open-source ethos and related contributor motivations in other posts on this blog. In this post, the focus will be on how requesters and contributors work together to produce a coherent outcome.

The find-fix-verify (FFV) workflow [5] has been identified as a facilitator of collective intelligence in open-source projects. FFV involves three steps: 1) finding room for improvement, such as a new feature or an error to correct, 2) proposing specific solutions for said improvements, and 3) verifying that such changes are acceptable solutions. FFV allows for a division of labor in open-source communities based on expertise and technical skill. Yet each step is not executed by a dedicated employee reporting to a supervisor. Rather, each step is performed by groups of contributors who make their own contributions based on personal experiences and context.
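The three FFV steps can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the agreement threshold (`min_agreement`), the vote counts, and the issue and patch names are all hypothetical, and a real project would spread these steps across an issue tracker and code review rather than a single script.

```python
from collections import Counter

def find(flags_by_contributor, min_agreement=2):
    """Find: keep only issues flagged independently by enough contributors."""
    counts = Counter(
        issue for flags in flags_by_contributor for issue in set(flags)
    )
    return sorted(issue for issue, n in counts.items() if n >= min_agreement)

def fix(issues, proposals):
    """Fix: gather the candidate solutions proposed for each retained issue."""
    return {issue: proposals.get(issue, []) for issue in issues}

def verify(candidates, votes):
    """Verify: for each issue, accept the candidate with the most approvals."""
    accepted = {}
    for issue, options in candidates.items():
        if options:
            accepted[issue] = max(options, key=lambda option: votes.get(option, 0))
    return accepted

# Hypothetical data: four contributors flag issues independently.
flags = [
    ["typo in README", "slow startup"],
    ["typo in README"],
    ["slow startup", "missing tests"],
    ["typo in README"],
]
proposals = {"typo in README": ["patch-A", "patch-B"], "slow startup": ["patch-C"]}
votes = {"patch-A": 1, "patch-B": 3, "patch-C": 2}

found = find(flags)
accepted = verify(fix(found, proposals), votes)
print(found)
print(accepted)
```

Note that no single function plays the role of a supervisor: each stage simply aggregates what independent contributors did.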

In terms of organization, there are three ways in which open-source communities can be organized [4]. These are demonstrated graphically using Paul Baran’s networking diagram. The first is through direct leadership (centralized, A), which typically involves a single coordinator. This is most typically found in traditional corporate or academic organizations, and is often the least effective at harnessing the power of open-source.

Open-source communities can also be organized through collaboratives (decentralized, B), which involves the coming-together of people with a common interest. This form of organization is usually maintained over the long-term through expedient subgroups that form and dissolve given the immediate imperative. Finally, the passive mode of organization (distributed, C) is perhaps the most effective mode for facilitating open-source. In this type of organization, members of the crowd work independently, and in fact may never collaborate directly. This mode resembles so-called leaderless movements [6]. While collaborative and passive organizational modes have their advantages for facilitating open-source, there is no one-size-fits-all solution. The optimal organizational structure is often project- and goal-dependent.
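As a rough illustration of the three layouts in Baran's diagram, the sketch below builds each one as a set of links and measures how much of the network's connectivity runs through its busiest member. The specific shapes are assumptions chosen for simplicity (a star, a chain of hubs, and a ring standing in for a mesh), and `max_link_share` is a hypothetical measure introduced here, not one taken from [4].

```python
from collections import Counter

def centralized(n):
    """A: one coordinator (member 0) linked to every other member."""
    return {(0, member) for member in range(1, n)}

def decentralized(num_hubs, members_per_hub):
    """B: a chain of hubs, each with its own small group of members."""
    edges = {(hub, hub + 1) for hub in range(num_hubs - 1)}
    next_id = num_hubs
    for hub in range(num_hubs):
        for _ in range(members_per_hub):
            edges.add((hub, next_id))
            next_id += 1
    return edges

def distributed(n):
    """C: a ring as a crude stand-in for a mesh; no member is special."""
    return {(member, (member + 1) % n) for member in range(n)}

def max_link_share(edges):
    """Fraction of all links that touch the single most-connected member."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return max(degree.values()) / len(edges)

# Nine members in each layout.
for name, edges in [
    ("centralized", centralized(9)),
    ("decentralized", decentralized(3, 2)),
    ("distributed", distributed(9)),
]:
    print(name, round(max_link_share(edges), 2))
```

The share falls from the centralized to the distributed layout, which is one way to read the shift from a single coordinator toward members who work independently.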

Paul Baran’s Networking Diagram

A great majority of open-source contributors are transient [7]. Generally, these people contribute to a single project, and within that project only contribute to a small portion of the codebase. Most open-source projects have a few key leaders (occasionally not dissimilar to queen ants) who coordinate and facilitate project management [8]. But this is only a coordination tactic; across hundreds or even thousands of contributors, great things can happen.

NOTES:

[1] Bonabeau, E., Dorigo, M., and Theraulaz, G. (1999). Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press, New York.

[2] O’Bryan, L., Beier, M., and Salas, E. (2020). How Approaches to Animal Swarm Intelligence Can Improve the Study of Collective Intelligence in Human Teams. Journal of Intelligence, 8(1), 9.

[3] Sasaki, T. and Biro, D. (2017). Cumulative culture can emerge from collective intelligence in animal groups. Nature Communications, 15049.

[4] Bigham, J.P., Bernstein, M.S., and Adar, E. (2014). Human-Computer Interaction and Collective Intelligence. Chapter 2, Collective Intelligence Handbook.

[5] Bernstein, M.S., Little, G., Miller, R.C., Hartmann, B., Ackerman, M.S., Karger, D.R., Crowell, D., and Panovich, K. (2015). Soylent: A Word Processor with a Crowd Inside. Communications of the ACM, 58(8), 85-94. doi:10.1145/2791285.

[6] Alicea, B. (2012). Leaderless control: understanding unguided order. Synthetic Daisies, April 9.

[7] Cui, X. and Stiles, E. (2010). Workings of Collective Intelligence within Open Source Communities. Third International Conference on Social Computing, Behavioral Modeling, and Prediction, Bethesda, MD.

[8] Alicea, B. (2020). Building a Distributed Virtual Laboratory Adjacent to Academia. MetaArXiv, doi:10.31222/osf.io/4k3z6.


Infinite Issues (issue infinity): how to break down a wicked problem

In this blog post, I want to discuss how to break down a big problem into a set of smaller issues that are addressable in a short-term timescale. Previously, I discussed how to break a complex problem down into addressable action items. Now I want to do this in the context of wicked problems, or problems that are highly complex, hard to predict, and have multiple unintended or unforeseen outcomes.

Solving a basic problem requires you to take its most salient and/or interesting features, and then establish the outlines of a solution. This might involve breaking the problem up into more easily solved parts, or asking additional questions about the nature and context of the problem. Next is a consideration of what resources you might need. In an open-source project, many of these resources revolve around the time constraints of contributors. Therefore, one key to creating issues is to break down the problem into small pieces that are both relatively easy to solve and require a low time commitment. Individually, these might seem too small to matter. Taken together, however, they allow you to build large-scale applications.

Of course, this procedure works well for normal complex problems, such as “let’s build a smart watch”! But suppose we want to work on a problem that interacts strongly with social systems, such as how to mitigate a pandemic. Then we have to think about the problem in terms of so-called wicked problems.

Wicked problems have a very high degree of computational complexity, which translates into a system with so many moving parts that it cannot be analyzed in a way that provides an exact answer. They have four characteristic features:

1) ill-defined problems reign supreme: it is hard to define the problem or even a set of issues at least initially.

2) all solutions to a given problem are at best a guess. This includes the ability to break down a problem into salient issues. Most issues in wicked problems will require the approximation of the salient issues, as well as forms of rapid prototyping to refine issues as the problem domain becomes more familiar.


Wicked Problems from a design perspective. Image is from Figure 1 in [1].

3) no natural end-point: the system does not have a clearly defined stopping point.

4) so-called messes: interactions between subdomains whose problems cannot be easily broken up into discrete parts [for more, see 2]. This might be because your problem domain has porous boundaries/categories or because the problem itself is highly variable over time.

One example of a wicked problem is an institutional response to COVID. The University of Illinois has done this fairly effectively, but doing so has involved innovations on multiple fronts. Part of this has involved Safer Illinois, which is a normal complex problem. But Safer is necessary but not sufficient for solving this problem. The real success has been seen through multiple institutional components such as testing regimens and building ambassadors. While an app designed to manage one part of the pandemic response is a basic problem, the response as a whole is a wicked problem. This is something to consider when we think about how our contributions fit into larger issues.

NOTES:

[1] Jobst, B. and Meinel, C. (2013). How Prototyping Helps to Solve Wicked Problems. Design Thinking Research, 105-113.

[2] Yeh, R.T. (1991). System Development as a Wicked Problem. International Journal of Software Engineering and Knowledge Engineering, 1(2), 117-130.

September 6, 2016

Now Announcing the OpenWorm Open House

OpenWorm Browser. Courtesy Christian Grove, WormBase and Caltech.

About two years ago, I announced the start of the DevoWorm project to the OpenWorm community. Now both OpenWorm and DevoWorm have grown up a bit, with the former (OpenWorm) now being a Foundation and the latter (DevoWorm) resulting in multiple publications. Now we will be celebrating all of the projects that make up the OpenWorm Foundation in an Open House format, taking place in cyberspace and tentatively scheduled for October.

Image courtesy Matteo Farinella: http://matteofarinella.com/Open-Worm. These posters are the outcome of an OpenWorm Kickstarter campaign several years ago.

The details of the schedule are still being worked out, but the format is to include both short, 5-minute talks (Ignite-style) and longer tutorials (45-60 minutes, plus questions). The short talks will highlight the various ongoing projects within OpenWorm, while the tutorials will focus on specific methods or procedures employed by the projects. If you happen to be a project leader or major contributor, I have probably already asked you for content. Interested in either contributing content or attending? Please let me know.

Dr. Stephen Larson (pre-PhD), discussing the connection between Lt. Data and C. elegans at Ignite San Diego.

I have also been involved in committee work for the OpenWorm Foundation. One of the initiatives we are in the process of establishing is the OpenWorm badge system, which is being spearheaded by Dr. Chee-Wai Lee. Currently trendy in the online learning world, this is an experiment in open learning that provides micro-credentials to a global community. Badges are a great way to learn new skills, as well as a means to motivate people's contributions to different projects within OpenWorm. Currently, OpenWorm is offering tutorials on the Hodgkin-Huxley model, the Muscle Model builder, and the Muscle Model explorer. If there are any tutorials you would like to see us offer, or if you think there is a need for a particular skill to be highlighted, please let me know.

August 3, 2016

Slate and the Solitary Ethnographic Diagram

While his style and message do not resonate with me at all, I've always thought that Donald Trump's speeches were highly-structured rhetoric. He seems to be using a form of intersubjective signaling [1] understood by a number of constituencies as communicating their values in an authentic manner. Specifically, the speeches have a sentence structure and cadence that can be differentiated from the literalism of contemporary mainstream society or more traditional forms of doublespeak ubiquitous in American politics.

This is why the most recent challenge from Slate Magazine was too good to pass up. The challenge (which has the feel of a Will Shortz challenge): diagram a passage from a Donald Trump speech given on July 21 in Sun City, South Carolina. The passage is as follows:
"Look, having nuclear—my uncle was a great professor and scientist and engineer, Dr. John Trump at MIT; good genes, very good genes, OK, very smart, the Wharton School of Finance, very good, very smart—you know, if you’re a conservative Republican, if I were a liberal, if, like, OK, if I ran as a liberal Democrat, they would say I’m one of the smartest people anywhere in the world—it’s true!—but when you’re a conservative Republican they try—oh, do they do a number—that’s why I always start off: Went to Wharton, was a good student, went there, went there, did this, built a fortune—you know I have to give my like credentials all the time, because we’re a little disadvantaged—but you look at the nuclear deal, the thing that really bothers me—it would have been so easy, and it’s not as important as these lives are (nuclear is powerful; my uncle explained that to me many, many years ago, the power and that was 35 years ago; he would explain the power of what’s going to happen and he was right—who would have thought?), but when you look at what’s going on with the four prisoners—now it used to be three, now it’s four—but when it was three and even now, I would have said it’s all in the messenger; fellas, and it is fellas because, you know, they don’t, they haven’t figured that the women are smarter right now than the men, so, you know, it’s gonna take them about another 150 years—but the Persians are great negotiators, the Iranians are great negotiators, so, and they, they just killed, they just killed us"
Okay, here you go -- an ethnographic-style diagram [2] based on one man, but perhaps instructive of an entire American subculture (click to enlarge). The diagram focuses on the relationship between John and Donald Trump (context-specific braintrust) and a specific worldview of power wielded through nuclear weapons, financial ability, and persuasion.


NOTES:
[1] In this case, intersubjective signaling could be used as a mechanism to reinforce group cohesion, particularly when the group's belief structure is defined by epistemic closure.

[2] Perceived lack of agency shown as red arcs terminated with a dot.

October 22, 2015

Arriving at October 21, 2015...... and beyond

Last year I marked the date, and this year it became a "thing" (at least on the internet). So here are a few links to celebrate the famous date from the "Back to the Future" trilogy.

Billings, L.   Time Travel Simulation Resolves "Grandfather Paradox". Scientific American, September 2 (2015).

"What 'Back to the Future, Part II' Got Wrong (and right)", from the University of Illinois, Urbana-Champaign.


A welcome to the future, from Doc Brown himself:

COURTESY: Universal Studios.

And now..... a bit farther into the future...... The Economics of Star Trek, which is a really active area of internet scholarship:

Transcript of the recent New York ComicCon panel on Trekonomics.

Podcast on the "Economics of Star Trek", courtesy of FW: Thinking.

A few other takes on the Star Trek economy from Noahpinion, Joseph Dickerson, and Slate.

In the future, Spock is on the money. COURTESY: Rick Webb, The Economics of Star Trek: the proto-post scarcity economy.

September 29, 2015

Reconsidering the Model as a Unit of Regulation: cybernetics and the adaptive outcome

Here is a preview of an essay Robert Stone and I have been working on as part of the Orthogonal Research initiative during the course of the last year. The formal title is: "The Foundations of Control and Cognition: The Every Good Regulator Theorem". This essay takes a classical tool from the cybernetics literature and applies it to game theoretic and other problems of our interest.

Robert Stone, cybernetics enthusiast

Robert Stone and myself, bringing cybernetics back to the "soft" but immensely-complex (social, brain, and biological) sciences. The full version (with notes, definitions, and additional references) can be found here.

A seemingly simple discrete system with feedback (which makes it not so simple during future iterations). COURTESY: intgr, Wikimedia commons.

I. Introduction
In the history of scientific discovery, there have been examples of certain persons or facets of their work being considered ‘out of step’ with the dominant scientific or philosophical trends of the time. As such, they risk falling down a deep well in our cultural landscape, with their work’s efficacy lost to subsequent generations. If their work has merit, it may be considered ahead of its time by future generations. The timing of a given theory or great idea is largely determined by cultural and cognitive biases that favor the dominant paradigm [1]. In other cases, ideas at the paradigmatic vanguard end up resurrected in a more pragmatic way. The acceptance of such ideas occurs either gradually or in one fell swoop at a later point in time. Let us keep this in mind as we discuss Ronald C. Conant and W. Ross Ashby’s seminal work “Every Good Regulator Theorem” [2] (EGRT):

“[The EGRT is]….a theorem is presented which shows, under very broad conditions, that any regulator that is maximally both successful and simple must be isomorphic with the system being regulated…….Making a model is thus necessary.” [2]

The EGRT characterizes regulation with respect to cybernated control systems. In the case of Ashby and Conant [3], the EGRT developed within the context of several intersecting traditional fields. These include algorithmics, information theory, systems theory, and behavioral science. In such a context, models are exceedingly important. Given the reliance of the EGRT concept on inference and propositional thinking, there is an essential reliance on models. In fact, the EGRT exists at such a high level of abstraction that, even with a high degree of specification, it may not be directly applicable in the real world [4]. However, there are certain advantages of cybernetic modeling that make its cross-contextual application useful.

Ashby's graphical formulation of the EGRT Theorem with original notation. COURTESY: [2].


II. Background
Let us return to the notion of modeling as phenomenology. Systems engage in modeling not simply to purposely regulate their environments, but rather to reactively respond to input stimuli in a way that maintains higher-level states [5]. This ability to model becomes part of their structure at the most basic of levels, though it would be fair to say most modeling (in the way we will use the word) is the result of cognitive processes. The constructivist might argue that such metacognitive dynamics [6] would influence one’s proposed scientific model. Like Shakespeare’s Hamlet, however, the question of whether or not to model (or be) is one of survival, whether that survival be genetic or memetic. Rather than reviewing the proof step-by-step, let’s discuss its potential significance in a variety of use-cases. In the process, we will be transcending the traditional boundaries of autonomic, ‘choice’, or even cognitive.

          Simply put, the Every Good Regulator Theorem says that regulators operate on approximations (e.g. models) of the thing they are regulating. This requires a mapping of the natural world to the model. While one might consider the activities of encoding and translation to be inherently cognitive, genomic systems also perform biological control functions in the absence of cognition [7, 8]. In the biological control example, what matters is not intent, but accuracy. Rather than an actively goal-oriented criterion, what we observe here is passively goal-oriented system output. Accuracy of the approximated model influences the quality of regulation. Thus, there need not be agency on the part of any single system component. Indeed, to survive as a unit in an interrelated system, a regulating machine must construct an interactive model that includes inputs, outputs, and feedback.

Let us consider a couple of cases of regulatory dynamics, which may be valuable in understanding the importance of this theorem. We can then move on to what this could mean for both further theoretical development and practical application. A good place to begin in cognitive science is game theory [9]. One of the simplest, most effective, and most explanatory strategies in the Prisoners’ Dilemma game is the tit-for-tat strategy [10]. In this 2-player, 2x2 game, the tit-for-tat strategy is simple: ‘Do unto others as they have done unto you’ after an initial good faith move of cooperation. The strategy is simply to copy your opponent's behavior. If the opposing agent cooperates, so does the tit-for-tat strategizing agent; if they defect, the tit-for-tat strategist follows suit. The intended outcome of the strategy is to move the exchange towards an equilibrium (though this is not the only possible outcome, nor is the strategy perfect).

Of specific interest here is that the mechanics of the strategy require a model to be held in memory by the agent employing tit-for-tat (a 1-bit cooperate/defect model), regardless of the strategy employed by the other agent (whether that be a more sophisticated maximizing strategy, or random selections). While an economist might view this as free-riding behavior by one of the two agents, the selection of tit-for-tat by both players can produce a cooperative equilibrium, such as in the evolution of reciprocal altruism in biological systems. The EGRT suggests that the greater the memory for an agent, and the longer it has the opportunity to observe and integrate the moves of its opponent, the greater its potential for effective regulation.
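The 1-bit model is easy to make concrete. The following is a minimal sketch, not a reference implementation: the payoff values (5, 3, 1, 0) are the conventional illustrative ones rather than numbers taken from this essay, and the class and function names are hypothetical.

```python
COOPERATE, DEFECT = "C", "D"

# (my move, their move) -> my payoff. Assumed conventional values.
PAYOFF = {
    (COOPERATE, COOPERATE): 3,
    (COOPERATE, DEFECT): 0,
    (DEFECT, COOPERATE): 5,
    (DEFECT, DEFECT): 1,
}

class TitForTat:
    """Holds a 1-bit model of the opponent: their most recent move."""

    def __init__(self):
        # Starting from "cooperate" yields the good-faith opening move.
        self.last_seen = COOPERATE

    def move(self):
        return self.last_seen

    def observe(self, opponent_move):
        self.last_seen = opponent_move

class AlwaysDefect:
    """A model-free opponent that ignores everything it observes."""

    def move(self):
        return DEFECT

    def observe(self, opponent_move):
        pass

def play(agent_a, agent_b, rounds):
    """Run an iterated game and return the move history and both scores."""
    history, score_a, score_b = [], 0, 0
    for _ in range(rounds):
        move_a, move_b = agent_a.move(), agent_b.move()
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        agent_a.observe(move_b)
        agent_b.observe(move_a)
        history.append((move_a, move_b))
    return history, score_a, score_b

mutual = play(TitForTat(), TitForTat(), rounds=5)
exploited = play(TitForTat(), AlwaysDefect(), rounds=5)
print(mutual)     # five rounds of mutual cooperation, 15 points each
print(exploited)  # one exploited opening move, then mutual defection
```

Two tit-for-tat agents settle into cooperation, while against a constant defector the single bit of memory limits the loss to the opening round.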

Over time, this can lead to greater accuracy for the agent’s cognitive model and a more stable equilibrium game outcome. Further, this equilibrium state can be long-lasting, given extended memory capacity for more detailed models, and may evolve towards ‘a conspiracy of doves’, within a game of homo homini lupus. An agent with a greater memory capacity can also employ more elaborate (or deeper) strategies over time. This development of deeper strategies may also feed back into modifying its model of the external world [11]. Overall, the capability to regulate behavior of other players depends on the inferential and predictive capacities of each player’s model: in a highly complex competitive game environment, a good regulator has a superior model, or it will find itself regulated by a competing agent in the game, especially as the behaviors get more complex.

“The theorem has the interesting corollary that the living brain, so far as it is to be successful and efficient as a regulator for survival, must proceed, in learning, by the formation of a model (or models) of its environment.” [2]

An example of a basic 2x2 payoff matrix characterizing the Prisoner's Dilemma. COURTESY: "Extortion in Prisoner's Dilemma", Blank on the Map blog, September 19 (2012).

III. Further Considerations
Let us now consider a more complicated scenario where we might be able to uncover the universal components of the EGRT phenomenology. The context will be two people on a blind date (this can actually be a complicated scenario). If one has been in one of these (terrifying) contexts, then one can already see where we are going. The cognitive agents are continually competing to increase the efficacy of their models of the other agent, while also attempting to constrain the modeling of the other agent towards a compact image they prefer. Although rarely implemented successfully, winning strategies include accurately modeling the other actor and influencing the state of their mental model. This can include both elaborate, multi-step strategies, and simpler strategies, the complexity of which does not indicate their effectiveness. If the goal is a continuation of relations, the acquisition and intentional obfuscation of information occurs at appropriate times and in appropriate ways. Furthermore, this information has contextual value. As in most scenarios involving imperfect or asymmetrical information [12], your model must be superior to become the leader of the interaction [13], and thus to control regulation.

Does regulation even require what we would call cognition? This of course depends on our definition of cognition and regulation. However, let us consider that a bacterium does not have a “cognitive” or mental model of its environment, yet appears to have little trouble getting around and controlling some aspects of its landscape. The similarities between chemotactic sensation and mental models built upon multisensory stimuli serve as evidence for the universal character of the EGRT. In fact, Heylighen [14] has proposed that cybernetic regulation is a highly-generalized form of cognition. Yet do thermostats or other mechanical systems possess anything approaching what we consider cognition? While none of these has the cognitive capacity of a brain, they do have information processing capabilities from their physical or electronic structure, memory states, and crude models of how things ‘should’ be, towards which they regulate conditions. Non-cognitive systems possessing these characteristics are obviously still capable of rudimentary communication, control, decision making, and regulation, at least abstractly. We should also expect some degree of continuity that crosses the boundary of the cognitive and non-cognitive, since cognitive systems evolved from less intentional ones with more rudimentary forms of behavioral control.
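The thermostat case can be sketched directly. In this toy version the "model of how things should be" is a setpoint with a tolerance band, and the memory state is a single on/off bit. The heating and cooling rates are arbitrary assumptions, and the names `Thermostat` and `simulate` are hypothetical.

```python
class Thermostat:
    """A regulator whose entire model is a setpoint and a tolerance band."""

    def __init__(self, setpoint, tolerance=0.5):
        self.setpoint = setpoint    # crude model of how things "should" be
        self.tolerance = tolerance
        self.heating = False        # one bit of memory

    def step(self, temperature):
        """Compare the sensed temperature to the model and pick an action."""
        if temperature < self.setpoint - self.tolerance:
            self.heating = True
        elif temperature > self.setpoint + self.tolerance:
            self.heating = False
        # Inside the band, the remembered state is kept (hysteresis).
        return self.heating

def simulate(thermostat, temperature, steps, heat_gain=1.0, heat_loss=0.5):
    """Toy room: warms by heat_gain when heated, cools by heat_loss otherwise."""
    trace = []
    for _ in range(steps):
        heating = thermostat.step(temperature)
        temperature += heat_gain if heating else -heat_loss
        trace.append(temperature)
    return trace

trace = simulate(Thermostat(setpoint=20.0), temperature=15.0, steps=30)
print(trace[:6])   # warming up from a cold start
print(min(trace[10:]), max(trace[10:]))  # settles into a band around 20
```

Nothing here resembles cognition, yet the loop of sensing, comparing against a stored expectation, and acting is enough to hold the room near the setpoint.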

“...success in regulation implies that a sufficiently similar model must have been built, whether it was done explicitly, or simply developed as the regulator was improved.” [2]


IV. Conclusion
Earlier, we had touched upon the history of scientific discovery, and contextual model building. A scientific theory is simply a model, and its value lies in its efficacy and repeatability (thus its trustworthiness and ability to aid in regulation). Theoretical models have tended, historically, to shift from informal, conceptual models towards formal mathematical ones (consider Comte’s Philosophy of Science). As a given model acquires more data, and as those data create ever-more accurate model revisions with higher fidelity, the overall capacity to aid regulation increases via feedback. Thus, the model’s value to humans increases. However, as noted by the example of ahead-of-their-time thinking, scientific thought does not exist in a vacuum, and the landscape conditions need to be aligned so that the model can prove fruitful. Consider how we are witnessing an explosion in robust formal mathematical and/or computer models either aiding or besting human cognitive efforts [15, 16]. Informational revisions of the model often occur faster than the landscape conditions change, so adaptive cross-contextual models may prove more successful in dynamic situations, such as ones which are developed by human thought and human cultural systems.

This ability to cross the boundary between cognitive and non-cognitive with models may challenge either our informal, colloquial conception of cognition or the universality criterion of the formal EGRT. As both features of cognition and more universal mechanisms, information processing, memory, communication, and selection can occur without any kind of cognitive superstructure. Perhaps the context of what we call “cognition” is too limiting. What about human cognition then is truly universal, and what is unique to a certain set of mechanisms and representational models? For example, are models of so-called cellular decision-making [17] an unduly anthropomorphic representation of cellular differentiation and metabolism, or are they drawing upon a common set of universal properties that can only be abstracted from the system by an appropriate model?

Rather than trying to solve this philosophical puzzle now, let us take leave to consider that a deep truth like the one perhaps contained within the formalism of the EGRT should make us question scientific knowledge in a manner akin to reconsidering our firmly-held beliefs. It should make us reconsider how well we understand the relationship between nature and our own conceptual models. In that, it kindles the same spark from which all great scientific theories alight: It leads us to more questions, new ways of thinking about things, and guides us towards more accurate, repeatable, and otherwise ‘good’ models.

“Now that we know that any regulator (if it conforms to the qualifications given) must model what it regulates, we can proceed to measure how efficiently the brain carries out this process. There can no longer be question about whether the brain models its environment: it must.” [2]

References:
[1] Kuhn, T.   Structure of Scientific Revolutions. University of Chicago Press (1962). 

[2] Conant, R.C. and Ashby, W.R.   Every good regulator of a system must be a model of that system. International Journal of Systems Science, 1(2), 89–97 (1970).

[3] Ashby, W.R.   Introduction to Cybernetics. Chapman and Hall (1962).

[4] Fishwick, P.   The Role of Process Abstraction in Simulation. IEEE Transactions on Systems, Man, and Cybernetics, 18(1), 18-39 (1988).

[5] Brooks, R.   Intelligence Without Representation. Artificial Intelligence, 47, 139-159 (1991).

[6] Kornell, N. Metacognition in Humans and Animals. Current Directions in Psychological Science, 18(1), 11-15 (2009).

[7] Ertel, A. and Tozeren, A.   Human and mouse switch-like genes share common transcriptional regulatory mechanisms for bimodality. BMC Genomics, 23(9), 628 (2008).

[8] Gormley, M. and Tozeren, A.   Expression profiles of switch-like genes accurately classify tissue and infectious disease phenotypes in model-based classification. BMC Bioinformatics, 9, 486 (2008).

[9] Gintis, H.   Game Theory Evolving. Princeton University Press (2000).

[10] Imhof, L.A., Fudenberg, D., and Nowak, M.A.   Tit-for-tat or Win-stay, Lose-shift? Journal of Theoretical Biology, 247(3), 574–580 (2007).

[11] Liberatore, P. and Schaerf, M.   Belief Revision and Update: Complexity of Model Checking. Journal of Computer and System Sciences, 62(1), 43–72 (2001).

[12] Rasmussen, E.   Games and Information: an introduction to game theory. Blackwell Publishing (2006).

[13] Simaan, M. and Cruz, J.B.   On the Stackelberg Strategy in Nonzero-Sum Games. Journal of Optimization Theory and Applications, 11(5), 533-555 (1973).

[14] Heylighen, F.   Principles of Systems and Cybernetics: an evolutionary perspective. CiteSeerX, doi:10.1.1.32.7220 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.7220 (1992).

[15] LeCun, Y., Bengio, Y., and Hinton, G. Deep Learning. Nature, 521, 436-444 (2015).

[16] Ferrucci, D., Brown, E., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A.A., Lally, A., Murdock, J.W., Nyberg, E., Prager, J., Schlaefer, N., and Welty, C.   Building Watson: an overview of the DeepQA project. AI Magazine, Fall (2010).

[17] Kobayashi, T.J., Kamimura, A.   Theoretical aspects of cellular decision-making and information-processing. Advances in Experimental Medicine and Biology, 736, 275-291 (2012).


UPDATE (9/30): During the editorial process, Rob and I had a discussion about using the word "alight" (in the final paragraph). I was not sure about the correct word usage, but Rob assured me that it was being used correctly in this context. But to back this up even further (and to gratuitously insert an informatics Easter Egg), here is the Google Ngram history of "alight" usage since 1800. 




June 11, 2015

Slipping Down the Fluid Slope of Ethical Integrity

This post will focus in on the slippery slope of research ethics, particularly the consequences of strange things happening in the course of pursuing one's best intentions. Cheeky images and puns will help to accentuate the story.

I have been leery of the start-up Uber ever since I heard stories about their varied ethical breaches [1], but now I'm even more deeply skeptical. Uber is showing exactly what can be accomplished when the "technically not illegal" ethos runs amok. Apparently, the ridesharing service entered into a research partnership with Carnegie Mellon, only to poach a large number of staff members at will [2]. As I understand it, the reason for this is largely superfluous. Uber wanted to possess expertise in Artificial Intelligence and automation, but did not want to go through an intermediary. Generally speaking, academic-private sector partnerships are not supposed to work like patent troll litigation. But Uber is a wildly successful startup, so some non-zero percentage of the population is sure to overlook the ethical lapses.


Not illegal, not illegal....the Uber business model? REFERENCE: Family Guy.

Another example of the ethical slippery slope comes from the open-access journal troll and Alan Sokal wannabe John Bohannon. As a feature writer for the journal Science, he once did an updated version of the Sokal hoax where nonsensical papers were sent to a large number of open-access journals. The catch is that some of the journals published the articles for little more than a publication fee (and minimal editorial oversight) [3]. More recently, a new hoax involved an intentionally shady study that touted the health benefits of chocolate [4]. The thing is, the popular press picked up on the paper before it could be retracted. Awesomely delicious stuff (pun intended). The chocolate study has ignited a debate in the world of internet opinion, with voices both in support and in criticism. Aside from the rhetorical point that provocative results can often result from p-value hacking [5], the most obvious problem is that you are intentionally drawing questionable conclusions and having them published. Even though an important point is made, the purposeful dissemination of false findings can lead to serious unintended consequences [6]. In any case, I suppose if the pages of Science ever see a high-profile paper retraction [7], Bohannon's team will get right on the case.
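To make the p-value hacking point a bit more concrete, here is a minimal sketch of the arithmetic behind it. This is my own illustration, not an analysis of the hoax study: the function name `familywise_error_rate` is hypothetical, the tests are assumed to be independent, and every null hypothesis is assumed to be true. Under those assumptions, the chance of at least one "significant" result among m tests at threshold alpha is 1 - (1 - alpha)^m.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among independent tests.

    Assumes every null hypothesis is true and the tests are independent
    (both are simplifying assumptions made for illustration).
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** num_tests


# The more outcomes one measures, the likelier a spurious "finding" becomes.
for m in (1, 5, 10, 20):
    print(m, round(familywise_error_rate(m), 3))
# prints:
# 1 0.05
# 5 0.226
# 10 0.401
# 20 0.642
```

In other words, measuring twenty unrelated outcomes and reporting whichever one crosses p < 0.05 would produce a publishable-looking result more often than not, even when nothing real is going on.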

The iffy ethics of sting operations against questionable peer-review processes and journalistic hype machines. REFERENCE: Willy Wonka and the Chocolate Factory.

NOTES:
[1] Newton, C.   This is Uber's playbook for sabotaging Lyft. The Verge, August 26 (2014).

[2] Lowensohn, J.   Uber gutted Carnegie Mellon's top robotics lab to build self-driving cars. The Verge, May 19 (2015).

[3] Alicea, B.   Fireside Science: the Consensus-Novelty Dampening. Synthetic Daisies blog, October 22 (2013).

[4] Bastian, H.   Tricked: the ethical slipperiness of hoaxes. Absolutely Maybe blog, May 31 (2015).

[5] Gelman, A.   John Bohannon’s chocolate-and-weight-loss hoax study actually understates the problems with standard p-value scientific practice. Statistical Modeling, Causal Inference, and Social Science blog, May 29 (2015).

[6] Kassel, M.   John Bohannon's Chocolate Hoax and the Spread of Misinformation. Observer.com, June 6 (2015).

[7] Data “were destroyed due to privacy/confidentiality requirements,” says co-author of retracted gay canvassing study. Retraction Watch blog (2015).

May 31, 2015

Kuhnian Practice as a Logical Reformulation

Are 01110000 01100001 01110010 01100001 [1] shifts a loss, a gain, a mismatch, or an opportunity for intellectual integration and the birth of a new field?


In the Kuhnian [2] approach to empiricism, a well-known outcome observed across the history of science is the "paradigm shift". This occurs when a landmark finding shifts our pre-existing models of a given natural phenomenon. One example of this: Darwin's finches and their evolutionary history in the Galapagos. In this case, a model system confirmed previous intuitions and overturned old facts in a short period of time (hence the idea of a scientific revolution). 

During a recent lecture by W. Ford Doolittle at the Institute for Genomic Biology, I was introduced to a term called "Kuhn loss" [3]. Kuhn loss refers to the loss of accumulated knowledge due to a conceptual shift in a certain field. One might consider this to be a matter of housecleaning, or a matter of throwing out the baby with the bathwater. The context of this introduction was the debate between evolutionary genomicists [4] and the ENCODE consortium over the extent and nature of junk DNA. During the talk, Ford Doolittle presented the definitions of genome function proposed by the ENCODE consortium as a paradigm shift. The deeper intellectual history of biological function would suggest that junk DNA not only exists, but that overturning it requires a substantial and multidisciplinary set of results. Thus, rather than viewing the ENCODE results [5] as a paradigm shift, they can be viewed as a form of intellectual loss. The loss, paradigmatic or otherwise, provides us with a less satisfying and robust explanation than was previously the case.

A poster of the talk. COURTESY: IGB, University of Illinois, Urbana-Champaign

Whether or not you agree with Ford Doolittle's views of function, and I am of the opinion that you should, this introduces an interesting philosophy of science (PoS) issue. In the case of biological function, the caution is against a 'negative' Kuhn loss. But Kuhn loss (in a linear view of historical progress) usually refers to the loss of knowledge associated with folk theories or theories based on limited observational power. In some cases, these limited observations are augmented with deeper intuitive motivations. This type of intuition-guided theory usually becomes untenable given new observations and/or information about the world. Phlogiston theory [6] can be used to illustrate this type of 'positive' Kuhn loss. Quite popular in 17th- and 18th-century Europe, phlogiston theory holds that the physical act of combustion releases fire-like elements called phlogistons. Phlogistons operated in a manner opposite to the role we now know oxygen serves in combustion and other chemical reactions. Another, less clear-cut example of 'positive' Kuhn loss involves a pre-relativity idea called aether theory, which predicts that the aether (an all-enveloping medium) is responsible for the propagation of light in space.

In each of these cases, what was lost? Surely the conclusions that arose from a faulty premise needed to be re-examined. A new framework also swept away inadequate concepts (such as "the aether" and "phlogistons"). But there was also a deeper set of logical structures that needed to be reformulated. In phlogiston theory, the direction of causality was essentially reversed. In aether theory, we essentially have a precursor to a more sophisticated concept (spacetime). Scientific revolutions are not all equal, and so neither is the loss that results. In some cases, Kuhn losses can be recovered and contribute to the advancement of a specific theoretical framework. Midwinter and Janssen [7] introduce us to the physicist/chemist Van Vleck, who improved upon the Kuhn loss incurred when quantum theory replaced its antecedent theory. Van Vleck did this by borrowing mathematical formalisms from the theory of susceptibilities, and bringing them over to physics. While this was neither a restoration nor a paradigm shift, Van Vleck was able to improve upon the ability of quantum theory to make experimental predictions.

Tongue-in-cheek description of an empirical verification of phlogiston theory. COURTESY: [8]

Now let us revisit the Kuhnian content of the ENCODE kerfuffle vis-à-vis this framework of positive/negative Kuhn loss and Kuhn recovery. Is this conceptual clash ultimately a chance for a gain in theoretical richness and conceptual improvement? Does the tension between computational and traditional views of biological function necessitate Kuhn loss (positive or negative)? According to the standard dialectical view [9], the answer to the former would be yes. In such a case, we might expect a paradigm shift that results in an improved version of the old framework (e.g. 'positive' Kuhn loss). But perhaps there is also a cultural mismatch at play here [10] that could be informative for all studies of Kuhn loss. Since these differing perspectives come from very different intellectual and methodological traditions, we could say that any Kuhn loss would be negative due to a mismatch. This is a bit different from the phlogiston example in that while both approaches come from a scientific view of the world, they use different sets of assumptions to arrive at a coherent framework. However, what is more likely is that computational approaches (as new as they are to the biological domain) will infuse themselves with older theoretical frameworks, resembling Kuhn recovery (the quantum/antecedent theory example) more than a loss or gain.

It is this intellectual (and logical) reformulation that will mark the way forward in computational biology, using an integrative approach (as one might currently take for granted in biology) rather than reasoning through the biology and computation as parallel entities. While part of the current state of affairs involves a technology-heavy computation being used to solve theoretically-challenging biological problems, better logical integration of the theory behind computational analysis and the theory behind biological investigation might greatly improve both enterprises. This might lead to new subfields such as the computation of biology, in which computation would be more than a technical appendage. Similarly, such a synthetic subfield would view biological phenomena much more richly, albeit with the same cultural biases as previous views of life. Most importantly, this does not take a revolution. It merely takes a logical reformulation, one that could be put into motion with the right model system.


NOTES:
[1] the first four letters ("para") of the word "paradigmatic", translated into 8-bit binary. COURTESY: Ashbox Binary Translator.

[2] Kuhn, T.S.   The Structure of Scientific Revolutions. University of Chicago Press (1962).

[3] Hoyningen-Huene, P.   Reconstructing Scientific Revolutions. University of Chicago Press (1983).

[4] Doolittle, W.F.   Is junk DNA bunk? A critique of ENCODE. PNAS, 110(14), 5294-5300 (2013).

[5] The ENCODE Project Consortium   An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57-74 (2012).

[6] Vihalemm, R.   The Kuhn-loss Thesis and the Case of Phlogiston Theory. Science Studies, 13(1), 68 (2000).

[7] Midwinter, C. and Janssen, M.   Kuhn Losses Regained: Van Vleck from Spectra to Susceptibilities. arXiv, 1205.0179 [physics.hist-ph] (2012).

[8] DrKuha   The Phlogiston: Not Quite Vindicated. Spin One Half blog, May 19 (2009).

[9] what we should expect according to dialectical materialism: adherents of two ideologies struggle for dominance, with an eventual winner that improves upon both original ideologies. Not to be confused with the "argument to moderation".

[10] for more context (the difference between a scientific revolution and a scientific integration) please see: Alicea, B.   Does the concept of paradigm shift need a rethink? Synthetic Daisies blog, December 25 (2014).
