
Conversational Context & Concurrency

Tony Garnock-Jones tonyg@ccs.neu.edu, 31 Jan 2017.

Paper list

[1] P. Brinch Hansen, “Monitors and Concurrent Pascal: A Personal History,” ACM SIGPLAN Not., vol. 28, no. 3, pp. 1–35, 1993.

[2] C. Hewitt, P. Bishop, and R. Steiger, “A universal modular ACTOR formalism for artificial intelligence,” in Proc. International Joint Conference on Artificial Intelligence, 1973, pp. 235–245. (Copy 1 is a scan I made from an original; Copy 2 is the official online version, which unfortunately has bad OCR and is hard to read.)

[3] G. A. Agha, “Actors: A Model of Concurrent Computation in Distributed Systems,” Technical Report 844, MIT Artificial Intelligence Laboratory, Jun. 1985.

[4] C. Fournet and G. Gonthier, “The Join Calculus: a Language for Distributed Mobile Programming,” Applied Semantics Summer School, Caminha, Portugal, Sep. 2000.

[5] N. J. Carriero, D. Gelernter, T. G. Mattson, and A. H. Sherman, “The Linda alternative to message-passing systems,” Parallel Comput., vol. 20, no. 4, pp. 633–655, 1994. (Very hard to find online; this is a copy I scanned myself. Heather Miller has an annotated version.)

Introduction: Conversational Context

When programs are written with concurrency in mind, the programmer reasons about the interactions between concurrent components or agents in the program. This includes exchange of information, as well as management of resources, handling of partial failure, collective decision-making and so on.

These components might be objects, or threads, or processes, or actors, or some more nebulous and loosely-defined concept; a group of callbacks, perhaps. The programmer has the notion of an agent in their mind, which translates into some representation of that agent in the program.

We think about the contexts (because there can be more than one) in which agents exist in two different ways.

From each agent’s perspective, the important thing to think about is the boundary between the agent and everything else in the system.

But from the system perspective, we often think about conversations between agents, whether it’s just two having an exchange, or a whole group collaborating on some task. Agents in a conversation play different roles, join and leave the group, and build shared conversational state.

In this lecture I’m going to use the idea of these conversational contexts as a lens through which to view the development of various metaphors and mechanisms of communication and coordination.

Basically, I’m going to present four computational models for concurrent interaction. These aren’t full programming languages, but there are many programming models that build upon them. In some cases, development of these ideas has progressed all the way up to system models including user interaction and so forth.

((Four whiteboards: (left) 1 2 3 4 (right).))

((WB4: Historical timeline))

The computational models I’m going to examine are monitors [1], actors [2, 3], channels [4], and tuplespaces [5].

((The (**)s mark the sources most similar to the little models I’ll be presenting.))

This area is absolutely huge. Another major strand I don’t have time for is transactions in all their various forms.

The ones I’ve chosen have an interesting relationship, though: in some ways, actors and channels separately are reactions to shared-memory with synchronization primitives; and tuplespaces are a reaction to message-passing designs like actors and channels.

But let’s come back to the idea of conversational context.

Let’s look at an example program to gain an intuition for the kinds of things we want to be thinking about as we look at these computational models.

((WB2: Chatroom example to build intuition))

We’ll think about a simplified chat service a bit like IRC.

((Diagram: the chatroom example.))

First of all, we can see some really interesting communication patterns here.

How well do our computational models support these patterns?

Second, we see that agents frequently engage in multiple conversations at once.

How do our computational models allow us to compose multiple conversations?

Third, we see that there are interesting pieces of conversational state held in common among participants in each conversation.

As an example of conversational state, consider the idea of presence, and how it is used.

In a multi-party conversation, such as this chat room, you get information about other participants as a set of names, and are kept informed as that set changes over time.

In a two-party conversation, such as a TCP connection, you get information about whether the other participant is still there or not based on whether the socket is still open or has closed.

In both cases, you use that knowledge to decide whether it’s worth making any particular utterance. If I need to contact either Alice or Bob but neither is in the chat room, I know I needn’t bother saying anything until they show up.

If what I have to communicate involves some actual work, I may be able to get away with delaying doing the work until Alice or Bob appears. In this way, I’ve used the conversational context to efficiently allocate resources: both not wasting my words, and not wasting my compute time.

So presence information is a piece of conversational state that’s shared among the group and kept up to date as things change.

How do our computational models support signalling of relevant changes to common state?

Notice that the user list is managed primarily in one place, the chat room, but has to be replicated all the way out to each client, and kept in sync. On a smaller scale, within the client program, the view of the user list has to be kept in sync with the model.

How do our computational models support maintenance of consistency of replicas of information?

How do our computational models support integration of newly-received signals of changes in state with local views on common state?

How do our computational models handle maintaining the integrity of both common state and any application-level invariants that might exist in the face of partial failure?

Finally, there are aspects of partial failure to consider, since we have concurrent activities that can fail independently. What are the consequences if some agent fails?

((WB1: Build up list of aspects of conversational context))

As I present the models that follow, also ask yourself how each one addresses the questions above.

Foundation: ISWIM

((WB4: ISWIM. Core syntax, evaluation contexts, beta. Note re: syntactic sugar, data types, PURITY.))

Monitors

In the early 1960s, people had started to grapple with the difficulties of multiprogramming. The main approach, being quite machine-oriented, was to think about many processes sharing a mutable memory.

Monitors, as described in Per Brinch Hansen’s 1993 HOPL-II retrospective [1], “evolved from the ideas of Ole-Johan Dahl, Edsger Dijkstra, Tony Hoare, and [himself]” between 1971 and 1973.

Dijkstra came up with the concept of semaphores around 1963, and in 1965 published on the idea of a critical region – a block of code accessing shared variables within which no more than one process could find itself at once – and showed how to implement critical regions using semaphores.
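To make that reduction concrete, here is a minimal sketch in Java (my illustration, not Dijkstra’s notation), using java.util.concurrent.Semaphore as the binary semaphore guarding the region; the shared counter stands in for any shared state:

import java.util.concurrent.Semaphore;

// A critical region implemented with a binary semaphore.
public class CriticalRegion {
    static final Semaphore mutex = new Semaphore(1); // one permit: at most one process inside
    static int shared = 0;                           // the shared variable being protected

    static void increment() throws InterruptedException {
        mutex.acquire();          // wait to enter the critical region
        try {
            shared = shared + 1;  // only one thread executes this at a time
        } finally {
            mutex.release();      // leave the critical region
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable worker = () -> {
            try { for (int i = 0; i < 1000; i++) increment(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        };
        Thread a = new Thread(worker), b = new Thread(worker);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(shared); // always 2000
    }
}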

Hoare proposed notation for identifying some variable as a shared resource in 1971. The idea was to have a mutable variable that, in some portions of code, was treated as guarded by critical regions, but in other portions, was left alone. At the same time, he came up with the idea of a conditional critical region, where entry to the region is delayed until some predicate holds.

data T = ...                  (* declare a data type T *)
var X: T = ...                (* X is a shared resource of type T *)
with X when P(X) do { ... }   (* enter only when P(X) holds, with exclusive access to X *)

Monitors were developed to fix weaknesses of this idea: with conditional critical regions, the operations on a shared variable can be scattered throughout the program, and a waiting process’s condition must be re-evaluated repeatedly, which is hard to implement efficiently.

Monitors “enable concurrent processes to share data and resources in an orderly manner” (Brinch Hansen 1993, p. 3).

The idea is that a monitor packages some shared state together with the procedures for operating on it; at most one process may be executing inside the monitor at a time, and a process may suspend itself inside the monitor until some condition holds, to be resumed later.

There are a number of roughly-equivalent (according to Brinch Hansen) formulations of the idea, all involving different ways of managing the conditional suspension and resumption of processes accessing the monitor in various ways.

Brinch Hansen works through lots of formulations, but I’m going to cut to the chase and show something a bit like the monitorish primitives that Java provides, although Java allows access to resources outside of synchronized blocks, and allows code outside an object to take the object’s lock. It’s interesting that in Java’s use of monitor-like constructs we see Hoare’s original idea: it’s possible to write code that accesses a monitored bundle of state outside the monitor!
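To make this concrete, here is a minimal monitor-style sketch in Java (the class and names are mine, not from the paper): a one-slot buffer whose synchronized methods form the critical regions, with wait/notifyAll playing the role of condition management, much as Hoare’s when P(X) does above:

// A minimal monitor sketch: a one-slot buffer. The object's lock makes
// each method a critical region; wait/notifyAll stand in for conditions.
public class OneSlotBuffer<T> {
    private T slot = null;

    public synchronized void put(T item) throws InterruptedException {
        while (slot != null) wait();  // wait until the slot is empty
        slot = item;
        notifyAll();                  // wake consumers waiting in get()
    }

    public synchronized T get() throws InterruptedException {
        while (slot == null) wait();  // wait until the slot is full
        T item = slot;
        slot = null;
        notifyAll();                  // wake producers waiting in put()
        return item;
    }
}

If slot were not private, code elsewhere could read and write it without taking the lock; that is exactly the escape hatch mentioned above.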

Simple model of Monitors.

Evaluation in terms of conversational context:

Actors

The original 1973 actor paper by Hewitt, Bishop and Steiger, from the International Joint Conference on Artificial Intelligence, is incredibly far out! It’s a position paper that lays out a broad and colourful research vision. It’s packed with amazing ideas.

The heart of it is that Actors are proposed as a universal programming language formalism ideally suited to building artificial intelligence.

The goal really was A.I., and actors and programming languages were a means to that end. Later researchers developed the model into a programming model in its own right, separating it from its A.I. roots.

In the mid-to-late 70s, Hewitt and his students Irene Greif, Henry Baker, and Will Clinger developed a lot of the basic theory of the actor model, inspired originally by SIMULA and Smalltalk-71. Irene Greif developed the first operational semantics for it as her dissertation work and Will Clinger developed a denotational semantics for actors.

In the late 70s through the 80s and beyond, Gul Agha made huge contributions to actor theory. His dissertation was published as a book on actors in 1986 and has been very influential.

Actor-based languages come in a wide variety; De Koster et al.’s 2016 taxonomy classifies actor languages into four groups. I won’t go into detail, but the model I’ll present is a process-style variation, more like Erlang than the more classic Actor languages. So if you go look at Agha’s work, it’ll look quite different. There was a HOPL-III paper from Joe Armstrong on Erlang.
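For intuition, here is a minimal sketch of such a process-style actor in Java (my illustration; the class and message names are invented): private state, a private mailbox, and a thread that handles one message at a time, with an asynchronous send as the only way in.

import java.util.concurrent.LinkedBlockingQueue;

// A process-style actor sketch: the actor's state is touched only by its
// own thread; other threads interact with it solely via send().
public class CounterActor {
    private final LinkedBlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    private int count = 0;  // private state

    public void send(String msg) { mailbox.add(msg); }  // asynchronous send

    public void start() {
        new Thread(() -> {
            try {
                while (true) {
                    String msg = mailbox.take();  // receive the next message
                    if (msg.equals("inc")) count++;
                    else if (msg.equals("print")) System.out.println(count);
                    else if (msg.equals("stop")) return;
                }
            } catch (InterruptedException e) { /* shut down */ }
        }).start();
    }

    public static void main(String[] args) {
        CounterActor a = new CounterActor();
        a.start();
        a.send("inc"); a.send("inc"); a.send("print"); a.send("stop");
    }
}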

Simple model of Actors.

Channels

Channels are conduits through which messages travel between processes.

They have a buffer associated with them: if the buffer has zero places, then the channel is synchronous; if it has an unbounded number of places, it is asynchronous; otherwise, it’s somewhere in between.
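As it happens, Java’s BlockingQueue family maps directly onto this spectrum, so a small sketch (my illustration, not drawn from the CSP or π-calculus literature):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

// The buffer-size spectrum for channels.
public class Channels {
    public static void main(String[] args) throws InterruptedException {
        // Zero places: a rendezvous; put() blocks until another thread take()s.
        BlockingQueue<String> sync = new SynchronousQueue<>();
        // Unbounded places: an asynchronous channel; put() never blocks.
        BlockingQueue<String> async = new LinkedBlockingQueue<>();
        // k places: somewhere in between; put() blocks only when full.
        BlockingQueue<String> bounded = new ArrayBlockingQueue<>(4);

        new Thread(() -> {
            try { System.out.println(sync.take()); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }).start();
        sync.put("hello");    // completes only once the receiver takes it
        async.put("hello");   // completes immediately
        bounded.put("hello"); // completes immediately unless 4 items are pending
    }
}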

As part of the road to monitors, in 1971 Hoare proposed an attempt at a monitor-like idea that involved a “description of parameter transfers as unbuffered input/output”, which “later became the basis for the concept of communicating sequential processes”. CSP involves a construct very similar to channels, alongside many other features and mechanisms. The first publication on CSP is from 1978, but the best introduction to CSP is Hoare’s book, first published in 1985 and frequently updated since.

Separately, Milner developed the calculus of communicating systems, first published in 1980. I don’t know how much influence from CSP there was on his thinking. The two are clearly comparable.

CCS led subsequently to the π calculus, which has been hugely influential. The π calculus is a very minimal computational model based entirely around channels and processes, with very little else involved. Milner, Parrow and Walker first published on π calculus in 1992.

Lots and lots of π-like derivatives sprang up during the ’90s and subsequently, including the paper on the Join calculus that I chose for this talk. It’s a lovely model and a really good paper, but the model I’ll present here is simpler than Join. It’s roughly equivalent to a simple synchronous-π-calculus extension of ISWIM.

Simple model of Channels.

Tuplespaces

In 1985, David Gelernter published his first paper on Linda, a programming language based around a new idea, tuplespaces.

A tuple space is a data store shared by a group of processes that interact with it by placing data items called tuples into it, and retrieving them by pattern matching. Once a tuple is placed into the store, it takes on independent existence. Tuplespaces are a little like relational databases in this way: an inserted row doesn’t just vanish by itself.
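As a sketch of the core operations, here is a toy tuplespace in Java (my illustration; Linda’s actual primitives are out, in, and rd embedded in a host language). A pattern here is a tuple in which null fields act as wildcards:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// A toy tuplespace: out() places a tuple, in() removes a matching tuple,
// rd() reads one without removing it; in() and rd() block until a match exists.
public class TupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();

    public synchronized void out(Object... tuple) {
        tuples.add(tuple);
        notifyAll();  // wake any process blocked in in() or rd()
    }

    public synchronized Object[] in(Object... pattern) throws InterruptedException {
        while (true) {
            for (Object[] t : tuples) {
                if (matches(pattern, t)) { tuples.remove(t); return t; }
            }
            wait();  // block until a matching tuple appears
        }
    }

    public synchronized Object[] rd(Object... pattern) throws InterruptedException {
        while (true) {
            for (Object[] t : tuples) {
                if (matches(pattern, t)) return t;
            }
            wait();
        }
    }

    private static boolean matches(Object[] pattern, Object[] tuple) {
        if (pattern.length != tuple.length) return false;
        for (int i = 0; i < pattern.length; i++) {
            if (pattern[i] != null && !pattern[i].equals(tuple[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        TupleSpace space = new TupleSpace();
        space.out("point", 1, 2);
        Object[] t = space.in("point", null, null);  // matches ("point", 1, 2)
        System.out.println(Arrays.toString(t));
    }
}

Note that the tuple placed by out() sits in the store regardless of whether anyone is waiting for it; that is the independent existence mentioned above.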

Gelernter and his colleagues continued to work on this throughout the 80s and 90s, and lots and lots of Linda-inspired systems have appeared over the years. In 1994, Carriero, Gelernter, Mattson and Sherman published the paper I listed, where Linda is explicitly positioned against message-passing systems. I think the contrast they’re aiming for is with message-passing approaches to parallelism in scientific computing, rather than with message-passing for general concurrency.

The basic Linda model has a couple of severe shortcomings that I’ll talk about in a moment, and lots of the variations over the years have been attempts to fix the model. Here, I’ll be presenting the original basic Linda tuplespace model.

Simple model of Tuplespaces.