January 28, 2012

Representing rare events


In this post I will be discussing the occurrence of rare events [1] and how they might be represented in computational systems. Rare events are intriguing precisely because they occur infrequently. We are all familiar with rare events in nature: avalanches, rogue waves, freak storms, and developmental mutations are but a few examples. Furthermore, while these events are rare, they are not improbable. I posted a few months back on the context of such processes (overproductive systems) and their potential relevance to biology. While the occurrence of rare events themselves is sparsely distributed in time, they occur against a background process filled with many events we like to call "normal". This is related to a normal distribution: background events occur within a certain range of expected outcomes.


One way in which rare events have been addressed from a statistical standpoint is to model them as a distribution of noise. In processes where we expect no rare events, we can assume that events will unfold according to a white noise distribution [2]. For processes where these rare events occur at increasingly larger magnitudes, a model of colored (or 1/f) noise can be used to model the expectation of rare events embedded in a process. So-called pink noise provides a series of rare events with a near-uniform size many orders of magnitude above the size of background events, while black noise provides a series of rare events occurring against a background of near-inactivity.


The comparison of extreme 1/f noise to white noise can be made using a thought experiment. Suppose you enjoy sleeping with a white noise machine running in the background. Why do you enjoy this? One reason might be that a white noise machine provides a stochastic but uniform auditory stimulus that allows your relax enough to fall asleep. Yet suppose that you wanted to awake from your slumber. Most people use an alarm clock, which presents bursts of auditory stimuli at different magnitudes, depending on your preference.
This transition from a uniform stream of sound to a series of bursts embedded in a more uniform background is the transition from common events to rare events. Not only do rare events capture your attention, but they also provide a basis for large-scale transitions in coupled processes. In this case, your alarm clock is coupled to your sensory systems, which force you to wake up abruptly.


One also can think of a noisy distribution as the null model for the expectation and occurrence of rare events. Yet during a given natural process, not only are rare events expected to occur at a certain rate, but also recur at a certain rate. This is because rare events tend to be stochastic, and as a result are not evenly distributed in time. This connection to recursion suggests that such events are computable, if we only had the proper data structure for these events.


From the time of the difference engine to today, representing something for a computational system has involved using a discrete binary model that conforms to a logical data structure (e.g. logic gate, tree, matrix). Yet "natural" computation (or computable behavior) has been observed in systems such as chemical (B-Z and Turing) reactions [3], collective behavior, and gene expression, which may be neither discrete and binary nor explicitly logical. However, one could argue that rare and extreme events occur in such systems, and that it is a phenomenon very poorly captured by contemporary hardware and software.


LEFT: an example of a B-Z reaction. RIGHT: examples of reaction-diffusion seen in so-called Turing reactions (a key mechanism in biological development).

Why aren't contemporary data structures suitable for what I am describing? It is true that much of the work on fractal geometry was made possible using standard, existing computing logic [4]. Yet most of modern artificial intelligence and machine learning is based on the notion of normalization. For example, in machine learning and pattern recognition, a series of exemplars are used to identify discrete groups and isolated objects. A basis function used to delimit these groups is often based on the normal distribution, and the margin of these functions are generally associated with a deterministic amount of variance. This leads to a high rate of true positives for classifying objects, but can lead to catastrophic failures from time to time.


An alternate approach is to use a genetic algorithmic or artificial life representation. While these techniques are capable of generating rare and/or extreme events (through mutation and recombination), we must still understand the context of rare events. By observing systems such as rogue waves and avalanches, we can see that fluctuations are an important ingredient for generating rare events. Thus, systems that exhibit flux (where there are undulations in the acceleration and movement of individual particles) should be abstracted to a formal data structure in order for commercially-available computers to completely model rare events. Incorporating these features into a genetic algorithm or artificial life application would greatly help us understand natural instances of non-uniformity.

But can this even be done using conventional computational abstractions? It is hard to say. I have thought about this problem, and remain convinced that we need fundamentally new forms of computation to truly address these features of nature. One option might be "natural" computing, which is the use of natural phenomena (such as chemical reactions or bubble formation) to execute computations [5]. Another option might be in the area of biomimetics, where the goal is to abstract functional features from biological and other natural systems for purposes of computing and developing engineering applications.

An interesting area indeed. More news as it develops.

References
[1] Rare events have been characterized by the Black Swan phenomenon and the Poisson distribution.

[2] A white noise distribution is a normally-distributed Gaussian process.

[3] Adamatzky, A. (2001). Computing in Nonlinear Media and Automata Collectives. Institute of Physics, Bristol, UK.

[4] Mandelbrot, B. (1982). The Fractal Geometry of Nature. W.H. Freeman, New York.


Printfriendly