April 13, 2018

Video Game Complexity: a short overview

What inspired this post was both an interesting question from Twitter user @ID_AA_Carmack, CTO at Oculus VR and a long-suffering manuscript I have been trying to get into publication with some collaborators [1]. With reference to the former:

In the accompanying thread, someone mentioned that the classic game Tempest used a paddle controller, and therefore was an example of analogue control (and outside the scope of John's taxonomy). But in general, joystick and button-press control provide a measure of complexity for a wide variety of games.

Let's begin by highlighting a 2-bit game. Two examples of a 2-bit game (in terms of control) are Pong and Breakout. In both games, the player uses a single controller to move a virtual paddle back and forth across the screen [2].

A screenshot of Pong (left) and Breakout (right) and the degrees of freedom per player. Each degree of freedom (movement along the right-left axis) represents 2 bits of information.

Now let's discuss the mention of Joust as an example of  3-bit control and Pac-Man as an example of 4-bit control. For uninitiated readers, here are screen shots from the game (click to enlarge)

And here is a picture of the Joust arcade console and its 3 bit control mechanism.

The Joust joystick provides 2 bits of control (left on/off and right on/off), and the button is 1-bit of control (press or no press). This is sufficient complexity to control the action, but not enough complexity to account for all observed action in the game.

Pac-Man utilizes a maze rather than linear action, but has only one additional degree of control complexity. Here are some screenshots from the game (click to enlarge)

In terms of control complexity, Pac-Man is a 4-bit game (2 degrees of freedom), while the Atari 2600 controller can provide up to 5 bits of information (2 degrees of freedom and a button). These are arbitrary assessments of information content, and do not correspond with every possible degree of freedom in the game environment.


Examples of 4- and 5-bit control via joystick for a version of Pac-Man (left) and the Atari 2600 (right). 

There are two additional sources of complexity we should be concerned with: the combinatorial complexity of objects within the game, and perceptual complexity of human information processing during gameplay. The former can be characterized by computational complexity classes, while the latter can be characterized in conjunction with control complexity in the form of information bandwidth (measured in bits or bits per second).

A taxonomy of computational classes for Pac-Man and other games is described by [3-5]. While Pac-Man is considered an NP-hard game, not all video games are. Furthermore, games that are in the same computational class do not have the same types of objectives or in-game properties. The games Tetris [6] and Minesweeper [7] have been found to be NP-complete. This is likely based on the combinatorial properties of each game.

According to Aloupis et.al [8,9], Super Mario Brothers can be classified as NP-hard. Other Nintendo games, such as Donkey Kong Country, are PSPACE complete. The Nintendo examples are interesting from an approximation perspective, as several Nintendo games have been reclassified by this research group between 2012 and 2016 [8,10]. 

The criterion are based on things like how the avatar moves within the game's activities. A game like Pac-Man involves so-called "location traversal" [4], which in computational terms is similar to the travelling salesman problem (NP-hard) [3].  By contrast, a game like Breakout involves a balancing maneuver [see 2] that is solvable in LSPACE.

Video game screenshots and computational complexity for several example games. Clockwise from top: Super Mario Brothers (NP), Donkey Kong Country (PSPACE), Breakout (LSPACE), Minesweeper (NP), Tetris (NP), Pac-Man (NP).  

Human information processing (or performance) can be characterized in the form of informational complexity. One classic way to assess complexity in performance is the use of psychometric laws [11]. Invariants relevant to video game performance are Fitts Law (hand-eye coordination), the Weber-Fechner Law (just noticable differences in stimulus intensity), and Stevens Law (the perception of physical stimuli relative to intensity). In the case of Fitts Law, the index of difficulty is often expressed as a function of bandwidth (bits/second). This provides a real-time expression of complexity that can be compared across games. Similarly, the presence of signal relative to noise can be expressed in terms of information content (Shannon-Hartley theorem). The stimulus information available to the game player may or may not be congruent with control complexity and in-game combinatorial complexity [12]. Accounting for all three dimensions of this complexity is an avenue for further study.

[1] Weber, R., Alicea, B., Huskey, R., and Mathiak, K. (2018). Network Dynamics of Attention During a Naturalistic Behavioral Paradigm. Frontiers in Human Neuroscience, doi:10.3389/fnhum. 2018.00182.

[2] Many early video games were designed around simple programming benchmarks. Games such as Pong and Breakout take inspiration from the pole-balancing problem, while the game Qix takes inspiration from k-D partitioning. In this case, the human serves to solve the computational problem, however inefficiently.

[3] Pac-Man Proved NP-Hard By Computational Complexity Theory. Emerging technology from the arXiv blog, January 26 (2012). 

[4] Viglietta, G. (2013). Gaming Is A Hard Job, But Someone Has To Do It! arXiv, 1201.4995.

[5] Demaine, E.D., Lockhart, J., and Lynch, J. (2016). The Computational Complexity of Portal and Other 3D Video Games. arXiv, 1611.10319.

[6] Demaine, E.D., Hohenberger, S., and Liben-Nowell, D. (2003). Tetris is hard, even to approximate. Proceedings of the International Computing and Combinatorics Conference, 351–363.

[7] Kaye, R. (2000). Minesweeper is NP-complete. The Mathematical Intelligencer, 22(2), 9–15.

[8] Demaine, E.D., Viglietta, G., and Williams, A. (2016). Super Mario Brothers is harder/easier than we thought. Proceedings of the International Conference on Fun with Algorithms.

[9] Aloupis, G., Demaine, E.D., Guo, A., and Viglietta, G. (2014). Classic Nintendo games are (NP-) hard. arXiv, 1203.1895.

[10] Aloupis, G., Demaine, E.D., Guo, A., Viglietta, G. (2012). Classic Nintendo Games are (Computationally) Hard. arXiv, 1203.1895.

[11] One challenge in linking psychophysical complexity to video game perception is in capturing the entirely of what is cognitively extracted from the game environment.

For an example of a purely psychophysical aproach to game design, please see: Hussain, Z., Astle, A.T., Webb, B.S., and McGraw, P.V. (2014). The challenges of developing a contrast-based video game for treatment of amblyopia. Frontiers in Psychology, 5, 1210. doi:10.3389/fpsyg.2014.01210

[12] Forisek, M. (2010). Computational complexity of two-dimensional platform games. Proceedings of the International Conference on Fun with Algorithms, 214–227.

April 1, 2018

What is MS Fool for $1000, Alex?

Want more Jeopardy! answers in question form? Check out the J! archive.

There is a recurring insider reference among Very Serious Computer Users regarding using Microsoft products to perform sophisticated computational tasks [1]. While most people tend to think of these programs as not computationally sophisticated, programs such as Excel [2], PowerPoint [3], and even MS Paint can do some pretty powerful computing. One such example are 3-D Models enabled by object/shape manipulation in PowerPoint and Paint.

Around every April 1, Carnegie Mellon University students participate in an annual tongue-in-cheek conference (sponsored by the satirical Association for Computational Heresy) called SIGBOVIK (Special Interest Group on Harry Questionable Bovik). Not sure what where Harry Q. Bovik got his credentials, but if you enjoy the Ig Nobel Awards, this should be right up your alley.

SIGBOVIK often features highly interesting quasi-research [4] projects based on the aforementioned Microsoft program suite. But there are other groups creating programs and computational artifacts because they can. Here are a few examples of this collected from around the web:

1) Using MS Paint to create high-quality works of art, courtesy of Hal Lasko and "The Pixel Painter".

2) Tom Wildenhain's SIGBOVIK 2017 lecture on Turing-complete Power Point.

The (non-) Turing Completeness of PowerPoint. One of many computational artifacts that are Turing incomplete. COURTESY: Tom Wildenhain YouTube channel and HackerNoon.

3) Try out this version of Doom programmed in Excel, courtesy of Gamasutra. The game program runs on a series of equations that requires VBA to run. There are several files you need to download, and the blogpost goes through the full set of challenges.

Rasterized Graphics with Underlying Linear Equations [5]. Examples from the Excel Art Series. ARTIST: Oleksiy Say

4) The final example returns us to SIGBOVIK (2016), where David Fouhey and Daniel Maturana bring us ExcelNet. This is an architecture for Deep Learning in Excel, and has gone through the paces of the Kaggle MNIST competition. Another example features Blake West and his implementation of a deep learning architecture in Google Sheets.

[1] This is similar to advertising membership in the cult of LaTeX, touched upon in this discussion of LaTeX fetishes.

[2] the first spreadsheet (Autoplan) was developed in the form of a scripting language, which is a more general version of more formal programming languages (e.g. Java versus Javascript).

[3] presentation software can be pretty diverse. But is any of it Turing Complete (TC)? If you work out your design process using Wang Dominoes, then perhaps TC-ness can be realized.

Example of a Wang tile.

[4] definition of quasi-research: research projects that do not produce useful results directly, but often as a side-effects of the research process itself.

[5] by combining the world of image rasterization with interlinked linear equations, there are some exciting (but conceptually and technlogically undeveloped) opportunities in this space.

March 19, 2018

Google Summer of Code Application Deadline is Approaching!

A quick reminder as to what is near. Nothing to fear -- except that applications for Google Summer of Code 2018 are due on March 27th (in a little more than a week). Thanks to all the applicants to our projects so far. I am the mentor for the following projects (all sponsored by INCF):

Contextual Geometric Structures (Project 8)

Towards a k-D embryo (Project 10.2)

Physics-based XML (Project 10.1)

Apply to work work either the Orthogonal Research Lab community (Project 8) or the OpenWorm Foundation community (Projects 10.1, 10.2). Please contact me (Bradly Alicea) for more information.

There are several other projects hosted by the OpenWorm Foundation that combine work with Neuroscientific data and Computational Modeling. These include:

Advanced Neuron Dynamics in WormSim (Project 10.3)

Mobile application to explore C. elegans nervous system dynamics (Project 10.4)

Add support for Neurodata Without Borders 2.0 to Geppetto (Project 10.5)

All three of these projects are based in the OpenWorm Foundation community, and are lead by mentors Matteo Cantarelli, Giovanni Idili, and Stephen Larson.

March 3, 2018

Open Data Day 2018: Orthogonal Research Version

Time once again for International Open Data Day, an annual event hosted by organizations all around the world. For the Orthogonal Research contribution, I am sharing a presentation on the role of theory in data science (and the analysis of open data).

 Full set of slides are available on Figshare, doi:10.6084/m9.figshare.5483746

A theory of data goes back to before there were concepts such as "big data" or "open data". In fact, we can learn a lot from attempts to characterize regularities in scientific phenomena, particularly in the behavioral sciences (e.g. Psychophysics).

There are a number of ways to build a mini-theory, but one advantage of the approach we are working on is that (assuming partial information about the data being analyzed) a theoretical model can be built with very limited amounts of data. I did not mention the role of non-empirical reasoning [1] in the theory-building, but might be an important issue for future consideration.

 The act of theory-building is also creating generalized models of pattern interpretation. In this case, our mini-theory detects sheep-shaped arrays. But there are bottom-up and top-down assumptions that go into this recognition, and theory-building is a way to make those explicit.
 Naive theories are a particular mode of error in theory-building from sparse or incomplete data. In the case of human reasoning, naive theories result from generalization based on limited empirical observation and blind inference of mechanism. They are characterized in the Cognitive Science literature as being based on implicit and non-domain-specific knowledge [2].

Taken together, mini-theories and naive theories can help us not only better characterize unlabeled and sparsely labelled data, but also gain an appreciation for local features in the dataset. In some cases, naive theory-building might be beneficial for enabling feature engineering, ontologies/metadata [3] and other characteristics of the data.

In terms of usefulness, theory-building in data science lies somewhere in between mathematical discovery programs and epistemological models. 

[1] Dawid, R. (2013). Novel Confirmation and the Underdetermination of Scientific Theory Building. PhilSci Archive.

[2] Gelman, S.A., Noles, N.S. (2011). Domains and naive theories. WIREs Cognitive Science, 2, 490–502. doi:10.1002/wcs.124

[3] Rzhetsky, A., Evans, J.A. (2011). War of Ontology Worlds: Mathematics, Computer Code, or Esperanto? PLoS Computational Biology, 7(9), e1002191. doi:10.1371/journal.pcbi.1002191

February 12, 2018

Darwin as a Universal Principle

Background Diagram: Mountian-Sky-Astronomy-Big-Bang blog.

For this year's Darwin Day post, I would like to introduce the concept of Universal Darwinism. To understand what is meant by universal Darwinism, we need to explore the meaning of the term as well as the many instances Darwinian ideas have been applied to. The most straightforward definition of Universal Darwinism is a Darwinian processes that can be extended to any adaptive system, regardless of their suitability. Darwinian processes can be boiled down to three essential features:
1) production of random diversity/variation (or stochastic process).  
2) replication and heredity (reproduction, historical contingency). 
3) natural selection (selective mechanism based on some criterion). 
A fourth feature, one that underlies all three of these points, is the production and maintenance of populations (e.g. population dynamics). These features are a starting point for many applications of universal Darwinism. Depending on the context of the application,these four features may be emphasized in different ways or additional features may be added.

Taken collectively, these three features constitute many different types of process, encompassing evolutionary epistemology [1] to cultural systems [2], neural systems [3, 4], physical systems [5, 6], and informational/cybernetic systems [7, 8]. Many of these universal applications are explicitly selectionist, and do not have uniform fitness criteria. In fact, fitness is assumed in the adaptive mechanism. This provides a very loose analogy to organismal evolution indeed.

Universal computational model shaped by Darwinian processes. COURTESY: Dana Edwards, Universal Darwinism and Cyberspace.

Of these, the application to cybernetic systems is the most general. Taking inspiration from both cybernetics theory and the selectionist aspects of Darwinian models, Universal Selection Theory [7, 8] has four basic claims that can be paraphrased in the following three statements:
1) "operate on blindly-generated variation with selective retention". 
2) "process itself reveals information about the environment". 
3) "processes built atop selection also operate on variation with selective retention".
The key notions are that evolution acts to randomly generate variation, retains only the most fit solutions, then builds upon this in a modular and hierarchical manner. In this way, universal Darwinian processes act to build complexity. As with the initial list of features, the formation and maintenance of populations is an important bootstrapping and feedback mechanism. Populations and heredity underlie all Darwinian processes, even if they are not defined in the same manner as biological populations. Therefore, all applications of Darwinian principles must at least provide an analogue to dynamic populations, even at a superficial level.

There is an additional advantage of using universal Darwinian models: capturing the essence of Darwinian processes in a statistical model. Commonalities between Darwinian processes and Bayesian inference [3, 5] can be proposed as a mechanism for change in models of cosmic evolution. In the Darwinian-Bayesian comparison, heredity and selection are approximated using the relationship between statistical priors and empirical observation. The theoretical and conceptual connections between phylogeny, populations, and Bayesian priors is a post-worthy topic in and of itself.

At this point, we can step out a bit and discuss the origins of universal Darwinian systems. The origin of a Darwinian (or evolutionary) system can take a number of forms [9]. There are two forms of "being from nothingness" in [9] that could be proposed as origin points for Darwinian systems. The first is an origin in the lowest possible energetic (or in our case also fitness) state, and the other is what exists when you remove the governance of natural laws. While the former is easily modeled using variations of the NK model (which can be generalized across different types of systems), the latter is more interesting and is potentially even more universal.

An iconic diagram of Cosmic Evolution. COURTESY: Inflation Theory by Dr. Alan Guth.

An iconic diagram of Biological Evolution. COURTESY: Palaeontological Scientific Trust (PAST).

So did Darwin essentially construct a "theory of everything" over 200 years ago? Did he find "42" in the Galapagos while observing finches and tortoises? There are a number of features from complexity theory that might also fit into the schema of Darwinian models. These include concepts from self-organization not explicitly part of the Darwinian formulation: scaling and complexity, dependence on initial condition, tradeoffs between exploitation and exploration, and  order arising from local interactions in a disordered system. More explicitly, contributions from chaos theory might provide a bridge between nonlinear adaptive mechanisms and natural selection.

The final relationship I would like to touch on here is a comparison between Darwinian processes and Universality in complex systems. The simplest definition of Universality states that the properties of a system are independent of the dynamical details and behavior of the system. Universal properties such as scale-free behavior [10] and conformation to a power law [11] occur in a wide range of systems, from biological to physical and from behavioral to social systems. Much like applications of Universal Darwinism, Universality allows us to observe commonalities among entities as diverse as human cultures, organismal orders/genera, and galaxies/universes. The link to Universality also provides a basis for the abstraction of a system's Darwinian properties. This is the key to developing more representationally-complete computational models.

8-bit Darwin. COURTESY: Diego Sanches.

Darwin viewed his theory development of evolution by natural selection as an exercise in inductive empiricism [12]. Ironically, people are now using his purely observational exercise as inspiration for theoretical mechanisms for systems from the natural world and beyond.

[1] Radnitzky, G.,‎ Bartley, W.W., and Popper, K. (1993). Evolutionary Epistemology, Rationality, and the Sociology of Knowledge. Open Court Publishing, Chicago. AND Dennett, D. (1995). Darwin's Dangerous Idea. Simon and Schuster, New York.

[2] Claidiere, N., Scott-Phillips, T.C., and Sperber, D. (2014). How Darwinian is cultural evolution? Philosophical Transactions of the Royal Society B, 36(9), 20130368.

[3] Friston, K. (2007). Free Energy and the Brain. Synthese, 159, 417-458.

[4] Edelman, G.M. (1987). Neural Darwinism: the theory of neuronal group selection. Oxford University Press, Oxford, UK.

[5] Campbell, J. (2011). Universal Darwinism: the path to knowledge. CreateSpace Independent Publishing.

[6] Smolin, L. (1992). Did the universe evolve? Classical and Quantum Gravity, 9, 173-191.

[7] Campbell, D.T. (1974). Unjustified Variation and Selective Retention in Scientific Discovery. In "Studies in the Philosophy of Biology", F.J. Ayala and T. Dobzhansky eds., pgs. 139-161. Palgrave, London.

[8] Cziko, G.A. (2001). Universal Selection Theory and the complementarity of different types of blind variation and selective retention. In "Selection Theory and Social Construction", C. Hayes and D. Hull eds. Chapter 2. SUNY Press, Albany, NY.

[9] Siegal, E. (2018). The Four Scientific Meanings Of ‘Nothing’. Starts with a Bang! blog, February 7.

[10] Barab├ísi, A-L. (2009). Scale-Free Networks: a decade and beyond. Science, 325, 412-413.

[11] Lorimer, T., Gomez, F., and Stoop, R. (2015). Two universal physical principles shape the power-law statistics of real-world networks. Scientific Reports, 5, 12353.

[12] Ayala, F.J. (2009). Darwin and the Scientific Method. PNAS, 106(1), 10033–10039.