Category Archives: Maths

He Knew He Was Right

The crucial point in my conception of non-empirical theory confirmation is that all three arguments of non-empirical theory confirmation that I’ve described before rely on assessments of limitations to underdetermination. In effect, scientists infer the strength of limitations to underdetermination from observing a lack of known alternatives, the surprising explanatory extra value of their theory or a tendency of predictive success in the research field. Eventually, these observations amount to theory confirmation because strong limitations to underdetermination increase the probability that the known theory is viable. The connection between the number of possible alternatives and the chances of predictive success is intuitively most plausible when looking at the extreme cases: if there are infinitely many alternatives to choose from and just one of them is empirically viable, the chances to pick the correct one are zero. If there is just one possible consistent theory – and if I assume that there is a viable scientific theory at all –, the chance that the consistent theory I found will be predictively successful is 100 percent.

Richard Dawid, on String Theory and Post-Empiricism

This type of reasoning – from bounded probability, given an infinite search space – throws up all kinds of surprises. But there’s another theme here, slightly submerged: it turns out that theory choice, or axiom selection, isn’t really arbitrary (even if it is necessarily “ungrounded”). There are background reasons why a particular theory proposes itself, or a particular collection of axioms seems initially plausible.

The argument I’m familiar with is that such a choice “proves” itself through its own performativity: it’s retroactively validated (in the weak sense of “shown to be useful”, rather than a strong sense of “proven to be true”) by the results it makes available. But this may be a kind of rationalisation – see, we were right to start here after all! – of a choice that was already guided by criteria that aren’t formally specifiable (i.e. you couldn’t generate the “good” starting-points by following a computational procedure). We start out with a sense of the affordances and constructive capacities of particular forms and combinatorial operations, and pick out likely candidates based on practical intuitions.

This is certainly how it goes in programming – to the extent that I’m a “good” programmer, it’s because experience enables me to be consistently “lucky” in picking out a pragmatically viable approach to a problem. There’s usually an “experimental” stage where one sets up a toy version of a problem just to see how different approaches play out – but what one is experimenting with there is theoretical viability, not empirical confirmation.

Often the initial intuition is something like “this is likely either to turn out to be right, or to fall down quickly so we can discard it early and move on to something else”: what we dread, and become practised in avoiding, is floundering about with something which is wrong in subtle ways that only reveal themselves much later on.

Lather, Rinse, Repeat

The (possibly alarmist) claim recently surfaced on social media that it was only a matter of time before some enterprising hacker managed to connect the records held by porn sites of their users’ browsing histories to the individual identities of those users, creating considerable opportunities for individual blackmail or general mischief. My personal reaction to this scenario – oh god please no – was balanced by a tranquil sense that a great many people would be in the same boat, and that the likely social impact of mass disclosure was difficult to anticipate. It might be horrific and hilarious in about equal measure. However, sites such as Pornhub already occasionally release their own statistical analyses, showing which US states evince the greatest interest in teenagers, spanking, interracial couples and so on. Public access to their – suitably anonymised – access logs might yield much of sociological interest.

My review of Tim Jordan’s Information Politics: Liberation and Exploitation in the Digital Society is now up at Review 31.

Psychedelic Investigations (conversation with Trent Knebel)

Leonora Carrington: El Mundo Mágico de los Mayas

DF: The psychedelic (or phenoumenodelic) is a mode of investigation into perception, periodically renewed by new technical means – drugs, synthesizers, fractals, neural nets. We are now entering into a new phase of psychedelic investigation; that is, investigation into how we perceive what we perceive, what perception is “made out of” or “drawn from”, and what extensions or modifications it is susceptible to.

Psychedelic investigation is sometimes taken to be investigation into the ultimate nature of reality, which it is but not directly. In psychedelia, perception is relieved of its sufficiency and submitted once again to the real. That doesn’t mean that we see what’s “really always there”, but that what we see is other than what our standard frame of perception acknowledges as capable of “being there”. Givens appear outside of the established regime of givenness. The stranger enters into manifestation.

Slug-squirrel

(Give the neural net a picture of some sky, and ask it to extrapolate images of Lucy with diamonds…)

Computer-generated psychedelia
What androids actually dream of

TK: Re: neural networks: I’ll be impressed when a computer can uncover a new correspondence between apparently unconnected domains of reality; progressively deforming things is fairly trivial, and I don’t think it stretches much past ideas of what computers are capable of (even if it does generate some interesting visuals).

LSD Cat

DF: I agree, what we’ve seen so far with this is pattern recognition over-egged into hallucination, rather than pattern recognition uncovering previously undiscovered real structures. But I think that has always been true of psychedelia: it doesn’t bring insight into the real directly, but insight into the construction of illusions.

Edge detection

TK: I’ve never tried psychedelic drugs, so I can’t comment in that area; however, I do think Catren-style psychedelia uncovers real structures that are only glimpsed distortedly when seen from any particular perspective, and Zalamea’s oeuvre is filled with example realizations of synaesthetic glimpses of structural kernels.

fractal

DF: I think I have to modify my previous statement: psychedelia is essentially undecided between reality and illusion, it’s an investigation of areas for which there is as yet no decision procedure. The question of whether or not there’s any “there” there is temporarily suspended. Later on it may be possible to discern some “structural kernel”, but the psychedelic moment is about developing the intuition that there is a “there” where something might conceivably be.

60s Psychedelia

TK: Perhaps psychedelia is perceiving a correspondence in a synaesthetic manner without conscious grasp of the higher-order principles governing the correspondence – e.g. synaesthetically perceiving homotopy and type theory together without formally understanding homotopy type theory.

Bridget Riley

DF: Yes, it’s a kind of unchained synaesthesia, a synaesthesia that might always come to nothing.

Diagram from Zalamea

TK: Might, but I’d also say that ascension to higher-order structures (coupled with a rich fleshing out of those structures, which is obviously there in psychedelia – as opposed to hollow knowledge of higher-order structures without understanding the lower-level things they control) is one of the most fundamental types of progress, if not the most fundamental.

Protocol Duffers

A graph, yesterday

What can we tell by both the order and size of a graph? One of the basic theorems of graph theory states that for any graph G, the sum of the degrees of the nodes equals twice the number of edges of G. That is, if the degree of any node is the number of edges connected to it (for node n1 with two edges connected to it, its degree = 2), the sum of all the degrees of the graph will be double the size of the graph (the number of edges). In other words, a network is not simply made up of a certain number of elements connected to one another, but is constituted by, qualified by, the connectivity of the nodes. How connected are you? What type of connection do you have? For a square, the sum of the degrees is 8 (the nodes [the square’s corners] each have two edges [the square’s lines] connected to them), while the sum of the edges is 4. In the IT industries connectivity is purely a quantitative measure (bandwidth, number of simultaneous connections, download capacity). Yet, in a different vein, Deleuze and Guattari describe network forms such as the rhizome as, in effect, edges that contain nodes (rather than vice versa), or even, paradoxically, as edges without nodes. In graph theory we see that the connectivity of a graph or network is a value different from a mere count of the number of edges. A graph not only has edges between nodes but edges connecting nodes.

This paragraph (from Galloway and Thacker on protocols) is typical of the faults of this kind of writing. Nothing that it says is entirely incorrect; and yet it confuses and misleads where it ought to clarify.

It is certainly true that there is a relationship between the ratio of a graph’s order to its size and the degrees of its nodes. This can be stated precisely: since the sum of all the degrees of the graph is double the size of the graph, and the average degree of the graph’s nodes is that sum divided by the number of nodes, the average degree is twice the number of edges divided by the number of nodes. OK, so what? “A network is not simply made up of a certain number of elements connected to one another” – except that it still is. No extra information has been introduced by observing these ratios. There isn’t an additional property of “connectivity” (in the sense meant here, but see below) that is not inferable from what we already know about size, order and the degree of each node. Saying “a graph not only has edges between nodes but edges connecting nodes” is a little like saying “the sun not only warms sunbathers, but also increases their temperature”.
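
The identity the quoted paragraph is reaching for (the “handshake lemma”) can be checked mechanically. A minimal Python sketch of the square example, with node labels a–d of my own choosing:

    # The square as a graph: 4 nodes (the corners), 4 edges (the sides).
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

    # Degree of a node = number of edges incident to it.
    degree = {n: sum(n in e for e in edges) for n in nodes}

    sum_of_degrees = sum(degree.values())          # 8
    average_degree = sum_of_degrees / len(nodes)   # 2.0

    assert sum_of_degrees == 2 * len(edges)        # the "handshake" identity
    print(degree, sum_of_degrees, average_degree)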

The reference to “connectivity” as the term is used informally “in the IT industries” is largely a red herring here. The size, order and degrees of a graph are also “purely…quantitative” – what else would they be? As for Deleuze and Guattari, who can say? “Edges that contain nodes (rather than vice versa)” – who says that nodes “contain” edges? What could it possibly mean for either to contain the other? “Edges without nodes” do not exist in standard graph theory – there are no edges-to-nowhere or edges-from-nowhere. A rhizome’s structure is graph-like, in that nodes (in the botanical sense) put out multiple roots and shoots which connect to other nodes, but to map a rhizome as a graph we must introduce abstract “nodes” to represent the ends of shoots; only then can segments of the rhizome be considered “edges” between nodes (in the graph theoretical sense). None of this is particularly helpful in this context.

When we talk about “connectivity” in graph theory, we are typically talking about paths (traceable along one or more edges, e.g. from A to B and then from B to C) between nodes; the question that interests us is whether there are any nodes that are unreachable along any path from any other nodes, whether there are any disconnected subgraphs, how redundant the connections between nodes are, and so on. “Connectivity” in this sense is indeed not a function of the counts of nodes and edges (although if the number of edges is fewer than the number of nodes minus one, your graph cannot be connected…). But it is also not a matter of the degrees of nodes. A graph may be separable into multiple disconnected subgraphs, and yet every node may have a high degree, having multiple edges going out to other nodes within the subgraph to which it belongs. In this sense, it is indeed true that “the connectivity of a graph is…different from a mere count of the number of edges” (it is in fact the k-vertex-connectedness of the graph, a precise notion quite separate from that of degree). But the way in which it is really true is quite different from – and much more meaningful than – the way in which the above paragraph tries to suggest it is true.
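
To see the difference concretely, here is a small Python sketch of my own: two disjoint triangles, in which every node has degree 2 and yet the graph splits into two components (the component search below is an ordinary breadth-first traversal, not any particular library’s API):

    from collections import deque

    # Two disjoint triangles: every node has degree 2, yet the graph is disconnected.
    edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)]
    nodes = {n for e in edges for n in e}
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def components(nodes, neighbours):
        """Return the connected components, found by breadth-first search."""
        seen, comps = set(), []
        for start in nodes:
            if start in seen:
                continue
            comp, queue = set(), deque([start])
            while queue:
                n = queue.popleft()
                if n in comp:
                    continue
                comp.add(n)
                queue.extend(neighbours[n] - comp)
            seen |= comp
            comps.append(comp)
        return comps

    print({n: len(neighbours[n]) for n in nodes})   # every degree is 2...
    print(components(nodes, neighbours))            # ...but there are two components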

What has happened here? The authors have clearly done their reading, but they have not synthesized their knowledge at the technical level: they move from learned fact to learned fact without understanding the logical infrastructure that connects them, being content instead to associate at the level of figurative resemblance. If pressed, writers in this style will often claim that they are identifying “homologies” (abusing that word also in the process) between things, and that one thing’s having a similar sort of conceptual shape to another is sufficient reason to associate them. But the available connectives in that case are weak (“it is surely no coincidence that…”, and other rhetorical substitutes for being able to demonstrate a reliably traversable connection), and it is often impossible to move from the resulting abstract quasi-structure back to the level of the explanandum without falling into total incoherence. The required “aboutness” just isn’t there: there is no negotiable passage back from the talk-about-talk to the talk-about-the-things-the-original-talk-was-about.

In the analysis of literary texts (and other cultural artifacts) we are often looking for structures of similar-patterning: for things which “look like” one another, which share a field of associations or a way of relating elements within that field. It is usually quite legitimate to compare two poems and to say that both have a common “logic” in the way they relate temporality and subjective identity-formation, or something like that. But it is foolish to apply the tools of literary analysis to objects whose primary mode of organisation is not figurative. Skimming along the surface of the language used by technicians in the description of their tasks, one may well discover patterns of association that are “telling”, that reveal something at the level of ideology. I am not proposing that cultural studies give up the jouissance of unmasking – without it, the discipline would lose its entire raison d’être. But I would like to put in a plea for technical focus, of a kind appropriate to the domain, when dealing with technical subjects. You don’t have to ignore the things you’ve been trained to recognise, but you do need to be able to be undistracted by them. Get it right, then be clever. The payoffs may take longer in coming, but they’re so much realer.

Flat Ontology = One God Universe

Reza on flat ontology as a One God Universe:

In procedurality, we should understand that faraway global behaviors are not simply the similar or homothetic variations of local behaviors. Procedurality or the shift of the perspective according to the shift of landscape of rules is a response to this asymmetry between the global and the local. For example, contingency differs at different levels. We cannot overextend the concept of contingency at the level of the individual gambler to the contingency at the level of a collection of games to the contingency at the level of casino. These have different levels of probability which cannot be over-stretched to one another. By calling this hierarchy of gambles within gambles ‘contingency’ without any regard to the specifications of each distinct level, we are making a flat universe.

A flat universe is a trivial environment in which the content of a local domain is uniformly distributed across the entire horizon. It’s another variation of what Mark Wilson calls “the classical picture of concepts”. According to the classical picture, a concept fully and in one-to-one relationship covers the object. The speculative implications of such a universe are indeed appealing because everything can be applied all the way down, concepts can be overextended from one domain to another at will. But as Mark Wilson points out, this conceptual universe is precariously overloaded. It is akin to a house where the basement is leaking, in trying to fix the basement, the kitchen floor sinks in, in repairing the floor, some of the pipes burst. Everything always needs to be patched up because ultimately in this universe nothing works, the entire edifice is a house of cards.

It wouldn’t be too hard to detect this pattern in certain speculative philosophies [lol] where either the object or contingency is the crazy glue – the big idea – that holds everything at the levels of local and global together. Flatness is another name for the condition of triviality where the global structure has the same properties and/or behaviors of its local fields. But when there is an asymmetry between the global and the local – a non-triviality – we cannot solely resort to analysis (locally oriented) to produce or examine a global structure. Conceptual mapping for a non-trivial universe requires various conceptual maps or navigational atlases distributed at different elevations according to their different a priori statuses.

(That’s “One God Universe” as in Burroughs: “Consider the impasse of a one God universe. He is all-knowing and all-powerful. He can’t go anywhere since He is already everywhere. He can’t do anything since the act of doing presupposes opposition. His universe is irrevocably thermodynamic having no friction by definition. So, He has to create friction: War, Fear, Sickness, Death….to keep his dying show on the road…”)

The guiding (mathematical) metaphor here is that of the manifold, which patches together an “atlas” of local spaces, or the sheaf, which ensures the availability of “gluings” for consistent local data. Both entities have the property that local qualities are not globally preserved: a manifold is “locally Euclidean”, but globally may be very weirdly-shaped indeed; sheaves construct a sort of protocol of descent/ascent which determines how local consistency is globally represented and how global data is locally enriched or deformed. To put it another way: they schematise situations in which, to adapt a phrase of Geoffrey Bennington’s, you need more than one idea to think with.

(edit: I have as usual misremembered the phrase, which is from Bennington’s review of books by Gillian Rose and Peter Dews: “the ‘anarchy’ whose spectre is reported to be looming whenever Left or Right finds it needs more than three ideas to think with”)

Notes on “the digital”

It is mathematically demonstrable that the ontology of set theory and the ontology of the digital are not equivalent.

The realm of the digital is that of the denumerable: to every finite stream of digits corresponds a single natural number, a finite ordinal. If we set an upper bound on the length of a stream of digits – let’s say, it has to fit into the available physical universe, using the most physically compact encoding available – then we can imagine a “library of Boole”, finite but Vast, that would encompass the entirety of what can be digitally inscribed and processed. Even larger than the library of Boole is the “digital universe” of sequences of digits, D, which is what we get if we don’t impose this upper bound. Although infinite, D is a single set, and is isomorphic to the set of natural numbers, N. It contains all possible digital encodings of data and all possible digital encodings of programs which can operate on this data (although a digital sequence is not intrinsically either program or data).
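
The claimed correspondence is easy to exhibit. Here is a hedged Python sketch of one standard bijection between finite binary strings and the natural numbers (prefix the string with a “1”, read the result as a binary numeral, subtract one); the particular encoding is my own choice for illustration, not anything canonical:

    def string_to_nat(s: str) -> int:
        """Map a finite binary string (possibly empty) to a unique natural number."""
        return int("1" + s, 2) - 1

    def nat_to_string(n: int) -> str:
        """Inverse map: recover the binary string from the natural number."""
        return bin(n + 1)[3:]   # strip '0b' and the leading '1'

    # The empty string maps to 0, "0" to 1, "1" to 2, "00" to 3, and so on.
    for s in ["", "0", "1", "00", "01", "10", "11", "000"]:
        n = string_to_nat(s)
        assert nat_to_string(n) == s
        print(repr(s), "->", n)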

The von Neumann universe of sets, V, is generated out of a fundamental operation – taking the powerset – applied recursively to the empty set. It has its genesis in the operation which takes 0, or the empty set {}, to 1, or the singleton set containing the empty set, {{}}, but what flourishes out of this genesis cannot in general be reduced to the universe of sequences of 0s and 1s. The von Neumann universe of sets is not coextensive with D but immeasurably exceeds it, containing sets that cannot be named or generated by any digital procedure whatsoever. V is in fact too large to be a set, being rather a proper class of sets.
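
For reference, the standard construction by transfinite recursion (this is just the textbook definition, nothing specific to the argument here):

[latex]V_0 = \emptyset, \qquad V_{\alpha+1} = \mathcal{P}(V_\alpha), \qquad V_\lambda = \bigcup_{\alpha<\lambda} V_\alpha \text{ for limit ordinals } \lambda, \qquad V = \bigcup_{\alpha \in \mathrm{Ord}} V_\alpha[/latex]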

Suppose we restrict ourselves to the “constructible universe” of sets, L, in which each level of the hierarchy is restricted so that it contains only those sets which are specifiable using the resources of the hierarchy below it. The axiom of constructibility proposes that V=L – that no set exists which is not nameable. This makes for a less extravagantly huge universe; but L is still a proper class. D appears within L as a single set among an immeasurable (if comparatively well-behaved) proliferation of sets.
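
The constructible hierarchy is defined in exactly parallel fashion, with the full powerset replaced by the definable powerset [latex]\mathrm{Def}[/latex] (the subsets of a level definable by first-order formulas with parameters from that level) – again, the standard textbook formulation:

[latex]L_0 = \emptyset, \qquad L_{\alpha+1} = \mathrm{Def}(L_\alpha), \qquad L_\lambda = \bigcup_{\alpha<\lambda} L_\alpha \text{ for limit ordinals } \lambda, \qquad L = \bigcup_{\alpha \in \mathrm{Ord}} L_\alpha[/latex]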

A set-theoretic ontology such as Badiou’s, which effectively takes the von Neumann universe as its playground, is thus not a digital ontology. Badiou is a “maximalist” when it comes to mathematical ontology: he’s comfortable with the existence of non-constructible sets (hence his rejection of the axiom of constructibility), and the limitations of physical or theoretical computability are without interest for him. Indeed, it has been Badiou’s argument (in Number and Numbers) that the digital or numerical enframing of society and culture can only be thought from the perspective of a mathematical ontology capacious enough to think “Number” over and above the domain of “numbers”. This is precisely the opposite approach to that which seeks refuge from the swarming immensity of mathematical figures in the impenetrable, indivisible density of the analog.

We Shall Come Rejoicing

Trying to get my head around the interplay between the locality and gluing axioms in a sheaf. In brief, and given a metaphorical association of the “more global” with the “above” and of the “more local” with the “below”:

The locality axiom means that “the below determines the (identity of the) above”: whenever two sections over an open set U have the same restrictions to every member of some open cover of U, they are the same section. There is no way for data that are more-locally the same to correspond to data that are more-globally different. Our view can be enriched as we move from the global to the local, but not the other way around.

The gluing axiom means that “the above determines the (coherence of the) below”: every pairwise-compatible family of sections over an open cover of U has a representative among the sections over U, of which the sections in the glued assemblage are the restrictions. There is no coherent more-local assemblage that does not have such a more-global representation. The global provides the local with its law, indexing its coherence.
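
Stated in the usual notation – for a presheaf F, an open set U and an open cover [latex]\{U_i\}[/latex] of U; this is just the standard formulation, not anything specific to the metaphor of above and below – the two axioms read:

[latex]\text{Locality: } s, t \in F(U) \text{ and } s|_{U_i} = t|_{U_i} \text{ for all } i \ \Rightarrow\ s = t[/latex]

[latex]\text{Gluing: } s_i \in F(U_i) \text{ and } s_i|_{U_i \cap U_j} = s_j|_{U_i \cap U_j} \text{ for all } i, j \ \Rightarrow\ \exists\, s \in F(U) \text{ such that } s|_{U_i} = s_i \text{ for all } i[/latex]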

A theme of postmodernism, and particularly of Lyotard’s treatment of the postmodern, was “incommensurability”. Between distinct local practices – language games – there is no common measure, no universal metalanguage into which, and by means of which, every local language can be translated. The image of thought given by sheaves does not contradict this, but it complicates it. The passage from the local to the global draws out transcendental structure; the passage from the global to the local is one of torsion, enrichment, discrimination. The logics of ascent and descent are linked: we cannot “go down” into the local without spinning a web of coherence along the way, and we cannot “come up” into the global without obeying a strict rule of material entailment.

An emerging orientation

Why am I so excited about the HKW Summer School? Because it represents an attempt to take some cultural initiative: this is “us” showing what we’ve got and what we can do with it, and showing-by-doing that what can be done in this way is actually worth doing.

I don’t expect everyone to be convinced by such a demonstration – in fact, I expect quite a few people to be dismayed about it, to feel that this is an upstart, renegade movement with distinctly not-for-People-Like-Us values and practices (maths! logics! don’t we know Lawvere* was a worse fascist than Heidegger?). It’s likely that not a few leftish PLUs will be rocking up any moment now to tell us all to curb our enthusiasm. But a glance over the history of Marxist thought will show that there have been plenty of times and places in which the initiative has indeed been held by rationalists – albeit often by warring rationalists, who disagreed ferociously with each other about how a rational politics was to be construed and practised. It’s not at all clear that the present moment, which places such overriding importance on affective tone, is not in fact the anomaly. That’s not to say that we should ditch everything that has declared itself over the past decade – on the contrary, it represents a vast, complex, necessary and unfinished project to which we should aim to contribute meaningfully. But we can only do so by approaching that project from a perspective which it does not encompass, and is hugely unwilling – and perhaps unable – to recognise as valid. To do so requires confidence, of a kind that those who are already confident in their moral standing will find unwarranted and overweening. We are going to be talked down to a lot; we are going to be called names; we are going to have to develop strong memetic defences against the leftish words-of-power that grant the wielder an instant power of veto over unwelcome ideas. We have a lot to prove. Calculemus!

* a fairly hardcore Maoist, as it happens.

Sheaves for n00bs

What itinerary would a gentle introduction to sheaves have to take? I would suggest the following:

  • A basic tour of the category Set, introducing objects, arrows, arrow composition, unique arrows and limits. (OK, that’s actually quite a lot to start with).
  • Introduction to bundles and sections, with a nicely-motivated example.
  • Enough topology to know what open sets are and what a continuous map is, topologically speaking.
  • Now we can talk about product spaces and fiber bundles.
  • Now we can talk about the sheaf of sections on a fiber bundle.
  • Now we back up and talk about order structures – posets, lattices, Heyting algebras, and their relationship to the lattice of open sets in a topological space. We note in passing that a poset can be seen as a kind of category.
  • Functors, covariant and contravariant.
  • That’s probably enough to get us to a categorial description, first of presheaves (a contravariant functor from a poset category to e.g. Set) and then of sheaves (presheaves plus the gluing axiom, etc). Show how this captures the fundamental characteristics of the sheaf of sections we met earlier (a toy sketch in code follows this list).
  • Then to applications outside of the sheaf of sections; sheaf homomorphisms and sheaf categories; applications in logic and so on. This is actually where my understanding of the topic falls off the edge of the cliff, but I think that rehearsing all of the material up to this point might help to make some of it more accessible.
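
The toy sketch promised above: a deliberately tiny Python model of the sheaf of sections of the trivial bundle over a two-point space, just to make “restriction” and “gluing” concrete. Everything here – the choice of space, the names sections and restrict – is invented for illustration:

    from itertools import product

    # The base space: two points, with the discrete topology.
    X = {1, 2}

    # A "section" of the trivial bundle X x {0, 1} over an open set U is just a
    # function U -> {0, 1}, represented here as a dict.
    def sections(U):
        return [dict(zip(sorted(U), values)) for values in product((0, 1), repeat=len(U))]

    def restrict(s, V):
        """The restriction map F(U) -> F(V), for V an open subset of U."""
        return {x: s[x] for x in V}

    cover = [{1}, {2}]          # an open cover of X
    family = [{1: 0}, {2: 1}]   # a (trivially) pairwise-compatible family over the cover

    # Gluing: the family determines a section over X whose restrictions recover it.
    glued = {x: v for part in family for x, v in part.items()}
    assert all(restrict(glued, U) == part for U, part in zip(cover, family))

    # Locality: two sections over X that agree on every member of the cover are equal.
    for s, t in product(sections(X), repeat=2):
        if all(restrict(s, U) == restrict(t, U) for U in cover):
            assert s == t

    print(glued)   # {1: 0, 2: 1}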

Anything really essential that I’ve missed? Anything I’ve included that’s actually not that important?

What is the ontology of code?

If, as is sometimes said, software is eating the world, absorbing all of the contents of our lives in a new digital enframing, then it is important to know what the logic of the software-digested world might be – particularly if we wish to contest that enframing, to try to wriggle our way out of the belly of the whale. Is it perhaps object-oriented? The short answer is “no”, and the longer answer is that the ontology of software, while it certainly contains and produces units and “unit operations” (to borrow a phrase of Ian Bogost’s), has a far more complex topology than the “object” metaphor suggests. One important thing that practised software developers mostly understand in a way that non-developers mostly don’t is the importance of scope; and a scope is not an object so much as a focalisation.

Cover of "Object-Oriented Modeling and Design with UML"
No.

The logic of scope is succinctly captured by the untyped lambda calculus, which is one of the ways in which people who really think about computation think about computation. Here’s a simple example. Suppose, to begin with, we have a function that takes a value x, and returns x. We write this as a term in the lambda calculus as follows:

[latex]\lambda x.x[/latex]

The [latex]\lambda[/latex] symbol means: “bind your input to the variable named on the left-hand side of the dot, and return the value of the term on the right-hand side of the dot”. So the above expression binds its input to the variable named x, and returns the value of the term “[latex]x[/latex]”. As it happens, the value of the term “[latex]x[/latex]” is simply the value bound to the variable named x in the context in which the term is being evaluated. So, the above “lambda expression” creates a context in which the variable named x is bound to its input, and evaluates “x” in that context.

We can “apply” this function – that is, give it an input it can consume – just by placing that input to the right of it, like so:

[latex](\lambda x.x)\;5[/latex]

This, unsurprisingly, evaluates to 5.
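
(For comparison, here and below I’ll transcribe some of these terms into Python; the transcriptions are rough sketches of my own, not part of the calculus itself. The identity function and its application look like this:)

    # λx.x as a Python lambda, applied to 5.
    identity = lambda x: x
    print(identity(5))   # 5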

Now let’s try a more complex function, one which adds two numbers together:

[latex]\lambda x.\lambda y.x+y[/latex]

There are two lambda expressions here, which we’ll call the “outer” and “inner” expressions. The outer expression means: bind your input to the variable named x, and return the value of the term “[latex]\lambda y.x+y[/latex]”, which is the inner expression. The inner expression then means: bind your input to the variable named y, and return the value of the term “[latex]x+y[/latex]”.

The important thing to understand here is that the inner expression is evaluated in the context created by the outer expression, a context in which x is bound, and that the right-hand side of the inner expression is evaluated in a context created within this first context – a new context-in-a-context, in which x was already bound, and now y is also bound. Variable bindings that occur in “outer” contexts are said to be visible in “inner” contexts. See what happens if we apply the whole expression to an input:

[latex](\lambda x.\lambda y.x+y)\;5 = \lambda y.5+y[/latex]

We get back a new lambda expression, with 5 substituted for x. This expression will add 5 to any number supplied to it. So what if we want to supply both inputs, and get [latex]x+y[/latex]?

[latex]\begin{array} {lcl}((\lambda x.\lambda y.x+y)\;5)\;4 & = & (\lambda y.5+y)\;4 \\ & = & 5 + 4 \\ & = & 9\end{array}[/latex]

Some standard abbreviations in lambda calculus notation allow us to flatten the nested lambda expressions and – since application associates to the left – to drop the inner parentheses, so that the above can be more simply written as:

[latex](\lambda xy.x+y)\;5\;4 = 9[/latex]

There is not much more to the (untyped) lambda calculus than this. It is Turing-complete, which means that any computable function can be written as a term in it. It contains no objects, no structured data-types, no operations that change the state of anything, and hence no implicit model of the world as made up of discrete pieces that respond as encapsulated blobs of state and behaviour. But it captures something significant about the character of computation, which is that binding is a fundamental operation. A context is a focus of computation in which names and values are bound together; and contexts beget contexts, closer and richer focalisations.
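
The same shape can be written down directly in any language with first-class functions. Here, as a rough sketch, is the curried addition from above in Python (the names add and add_five are mine):

    # λx.λy.x+y as nested Python lambdas: the inner lambda is evaluated in the
    # context created by the outer one, in which x is already bound.
    add = lambda x: (lambda y: x + y)

    add_five = add(5)     # (λx.λy.x+y) 5  =  λy.5+y
    print(add_five(4))    # ((λx.λy.x+y) 5) 4  =  9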

So far we have considered only the hierarchical nesting of contexts, which doesn’t really make for a very exciting or interesting topology. Another fundamental operation, however, is the treatment of an expression bound in one context as a value to be used in another. Contexts migrate. Consider this lambda expression:

[latex]\lambda f.f\;4[/latex]

The term on the right-hand side is an application, which means that the value bound to f must itself be a lambda expression. Let’s apply it to a suitable expression:

[latex]\begin{array} {lcl}(\lambda f.f\;4) (\lambda x.x*x) & = & (\lambda x.x*x)\;4 \\ & = & 4*4 \\ & = & 16\end{array}[/latex]

We “pass” a function that multiplies a number by itself, to a function that applies the function given to it to the number 4, and get 16. Now let’s make the input to our first function be a function constructed by another function, one that binds one of its variables and leaves the other “free” – a “closure” that “closes over” its context, whilst remaining partially open to new input:

[latex]\begin{array} {lcl}(\lambda f.f\;4) ((\lambda x.\lambda y.x*y)\;5) & = & (\lambda f.f\;4) (\lambda y.5*y) \\ & = & (\lambda y.5*y) 4 \\ & = & 5* 4 \\ & = & 20\end{array}[/latex]

If you can follow that, you already understand lexical scoping and closures better than some Java programmers.
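
For comparison, the last two reductions once more as a Python sketch (apply_to_four and times are names invented for the example):

    # λf.f 4 : take a function and apply it to the number 4.
    apply_to_four = lambda f: f(4)

    print(apply_to_four(lambda x: x * x))   # (λf.f 4)(λx.x*x) = 16

    # λx.λy.x*y partially applied to 5 yields a closure, λy.5*y, which has
    # "closed over" the binding of x while remaining open to input for y.
    times = lambda x: (lambda y: x * y)
    print(apply_to_four(times(5)))          # (λf.f 4)((λx.λy.x*y) 5) = 20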

My point here is not that the untyped lambda calculus expresses the One True Ontology of computation – it is equivalent to Turing’s machine-model, but not in any sense more fundamental than it. “Functional” programming, a style which favours closures and pure functions over objects and mutable state, is currently enjoying a resurgence, and even Java programmers have “lambdas” in their language nowadays; but that’s not entirely the point either. The point I want to make is that even the most object-y Object-Oriented Programming involves a lot of binding (of constructor arguments to private fields, for example), and a lot of shunting of values in and out of different scopes. Often the major (and most tedious) effort involved in making a change to a complex system is in “plumbing” values that are known to one scope through to another scope, passing them up and down the call stack until they reach the place where they’re needed. Complex pieces of software infrastructure exist whose entire purpose is to enable things operating in different contexts to share information with each other without having to become tangled up together into the same context. One of the most important questions a programmer has to know how to find the answer to when looking at any part of a program is, “what can I see from here?” (and: “what can see me?”).

Any purported ontology of computation that doesn’t treat as fundamental the fact that objects (or data of any kind) don’t just float around in a big flat undifferentiated space, but are always placed in a complex landscape of interleaving, interpenetrating scopes, is missing an entire dimension of structure that is, I would argue, at least as important as the structure expressed through classes or APIs. There is a perspective from which an object is just a big heavy bundle of closures, a monad (in the Leibnizian rather than category-theoretical sense) formed out of folds; and from within that perspective you can see that there exist other things which are not objects at all, or not at all in the same sense. (I know there are languages which model closures as “function objects”, and shame on you).

It doesn’t suit the narrative of a certain attempted politicisation of software, which crudely maps “objects” onto the abstract units specified by the commodity form, to consider how the pattern-thinking of software developers actually works, because that thinking departs very quickly from the “type-of-thing X in the business domain maps to class X in my class hierarchy” model as soon as a system becomes anything other than a glorified inventory. Perhaps my real point is that capitalism isn’t that simple either. If you want a sense of where both capitalism and software are going, you would perhaps do better to start by studying the LMAX Disruptor, or the OCaml of Jane Street Capital.