Daniel S. Fava

You get what you measure

2024-04-25T08:50:00+00:00

This is a short post on how measuring performance can backfire. So you are trying to foster a culture and you decide to measure how close (or far) your organization is from that culture. Although your intentions are good, your efforts can backfire. You can end up with a culture that is the opposite of what you intended in the first place.

We will walk through two ideas which, put together, lead to the concept of “you get what you measure.” We then close with what this could mean for your organization.

Measurement influences behavior

Think about the difference between measuring the length of a house versus measuring people’s kindness. Unlike measuring inanimate objects, measuring adaptive systems is difficult. An adaptive system is a system that reacts to being measured. So, the act of measuring influences the outcome.

Unlike the length of an inanimate object, people’s kindness changes when you measure it: people will tend to act kinder. You can never get to the actual truth; there will always be a bias in the measurement. That can be good. By measuring kindness, you get kinder people. Win!

A measurement is a proxy, not the actual attribute

The second idea is the idea that a measurement is a proxy, and a proxy is not the actual attribute.

Going back to kindness, how would you measure it? Kindness is an abstract concept. You can’t measure it directly, but you can try to quantify it by measuring a proxy for kindness.
For example, maybe you measure whether a person makes charitable contributions, whether they sent you a message on your birthday, or whether they tend to hold the door for the next person coming in line.

The Zen Buddhist saying “don’t mistake the finger pointing at the moon for the moon” can help us remember the distinction between an attribute and its proxy.

You get what you measure

So, the combination of (1) adaptive system and (2) measurement-by-proxy has a catchphrase: you get what you measure. Here is an exaggerated example:

You have a customer support center where people look into tickets created when a customer calls with a complaint. You want to measure and improve the quality of this organization.

If you measure quality by how fast tickets are closed, analysts will find the easiest answers to the complaints and close the ticket with “xyz is likely the culprit” without doing due diligence. So customers will have to call back with the same complaint, and new tickets will be opened for the same issue. On paper, you are closing tickets faster than ever! In practice, your customers are getting pretty frustrated because their problems are not being addressed.

Say you decide to measure quality by how fast tickets are responded to (as opposed to how fast they are closed). Then someone will create a bot that fills the new tickets with messages like “Looking into it..” All tickets get an initial response, never mind the fact that the response is useless.

Once you have put a system in place to measure (1) an adaptive system that (2) can only be quantified via a proxy, then you get what you measure. And what you measure is not necessarily what you want to foster. No matter how complex you make the metrics, there will come a point where optimizing for the metric will clash with simply “doing the right thing.”

Over time

You are probably thinking: “That’s a pretty cynical view on people!” or “The people I work with are not like you describe”. You are right! Your employees or co-workers are not like that.. at least not now. But they may become more selfish once you put a performance measurement system into practice. The reason is that performance measurement systems put people on the spot. They have to solve a dilemma: do I improve the metric or do I “do the right thing”?

Many people will be altruistic. They want to do the right thing. Plus, if the company isn’t doing well, no one will do well. So it’s logical to look after the health of the company. But one bad apple is all it takes to shift behavior.

Although you may not notice when one person in your organization puts benefiting themselves over “the right thing”, others close by will. People have beliefs and values (call it a compass), and they will not change their behavior because of one or two more self-centered individuals. Many of your employees will keep doing “the right thing”, but a few of them may decide to “do the right thing” somewhere else. Over time, the frustration of seeing a bad apple benefiting themselves will erode your employees’ motivation. They will get tired of being put on the spot and forced to choose between “the right thing” and the metric. You can’t blame them. It’s not a fair situation to be in in the first place.

So, over time, the system will select for the less altruistic of us, and eventually the company will look the opposite of what the performance-measurement system intended to foster.

What then

Okay.. but if we shouldn’t measure performance, what can we do instead?

If the goal is to foster behavior, then create a culture that allows for this behavior to flourish. The shift is from measuring people to enabling them. Embrace the fact that none of this is a hard science, and stop trying to put a number on something that can’t be properly quantified. Encourage discussions on culture instead.

Go, Scala and case classes

2023-03-14T12:00:00+00:00

In this blog post, we will take a look at Go and Scala, and specifically, at their approach to case classes.

One of Scala’s key features is its support for case classes. Case classes are meant for holding immutable data. They are similar to regular classes, but they come with a number of useful features out of the box, such as generated toString and copy methods and support for pattern matching.

Here’s an example of two case classes, Dollar and Euro, that extend a common Currency class in Scala:

abstract class Currency(val name: String, val alpha: String, val symbol: String)

case class Dollar() extends Currency("US Dollar", "USD", "$")
case class Euro() extends Currency("Euro", "EUR", "€")

val dollar = Dollar()
val euro = Euro()

println(dollar.name)   // Output: US Dollar
println(dollar.alpha)  // Output: USD
println(dollar.symbol) // Output: $
println(euro.name)     // Output: Euro
println(euro.alpha)    // Output: EUR
println(euro.symbol)   // Output: €

Go, on the other hand, does not have built-in support for case classes. We will look at three different approaches that can be used instead:

1. structs
2. enums
3. interfaces and “empty” type definitions

Structs

We use structs to hold data. Going back to the currency example, we can define a Currency as such:

package main

import (
	"fmt"
)

type Currency struct {
	Name   string
	Alpha  string
	Symbol string
}

type dollar struct {
	Currency
}

type euro struct {
	Currency
}

var (
	Dollar = dollar{Currency{"US Dollar", "USD", "$"}}
	Euro   = euro{Currency{"Euro", "EUR", "€"}}
)

func main() {
	fmt.Println(Dollar.Name)   // Output: US Dollar
	fmt.Println(Dollar.Alpha)  // Output: USD
	fmt.Println(Dollar.Symbol) // Output: $
	fmt.Println(Euro.Name)     // Output: Euro
	fmt.Println(Euro.Alpha)    // Output: EUR
	fmt.Println(Euro.Symbol)   // Output: €
}

Enums

Enums, short for enumerations, are a type in programming that allows you to define a set of named values. They are used to represent a fixed set of possible values for a variable, parameter, or property.

Here is an alternative implementation to our “currency” example. In this implementation, we are removing information from the structs and putting it in methods. If you squint, you can see the interplay between data and code: Functions can be implemented as table look-ups, where the function’s behavior is precomputed and stored in a data structure, rather than being computed on the fly. Conversely, data can sometimes be computed or generated on-the-fly by code, rather than being stored.

The struct here would only have to hold one integer for each currency… so there is no point of a struct at all! Instead of a struct holding an integer, we just have the integer. And we have one integer value for each currency. In other words, the struct collapses into an enum.

package main

import (
    "fmt"
)

type Currency int

const (
    Dollar Currency = iota
    Euro
)

func (c Currency) String() string {
    switch c {
    case Dollar:
        return "US Dollar"
    case Euro:
        return "Euro"
    default:
        return "Unknown"
    }
}

func (c Currency) Alpha() string {
    switch c {
    case Dollar:
        return "USD"
    case Euro:
        return "EUR"
    default:
        return "Unknown"
    }
}

func (c Currency) Symbol() string {
    switch c {
    case Dollar:
        return "$"
    case Euro:
        return "€"
    default:
        return "?"
    }
}

func main() {
    dollar := Dollar
    euro := Euro

    fmt.Println(dollar.String()) // Output: US Dollar
    fmt.Println(dollar.Symbol()) // Output: $
    fmt.Println(euro.String())   // Output: Euro
    fmt.Println(euro.Symbol())   // Output: €
}
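The interplay between data and code mentioned above can also run in the other direction: the switch statements can be turned back into a table look-up keyed by the enum value. The following is a minimal sketch of that idea; the `names` table and the `Name` method are illustrative names, not part of the original example.

```go
package main

import "fmt"

type Currency int

const (
	Dollar Currency = iota
	Euro
)

// names is a hypothetical look-up table indexed by the enum value.
var names = [...]string{
	Dollar: "US Dollar",
	Euro:   "Euro",
}

// Name returns the table entry for c, or "Unknown" when c is out of range.
func (c Currency) Name() string {
	if c < 0 || int(c) >= len(names) {
		return "Unknown"
	}
	return names[c]
}

func main() {
	fmt.Println(Dollar.Name())      // US Dollar
	fmt.Println(Euro.Name())        // Euro
	fmt.Println(Currency(7).Name()) // Unknown
}
```

The behavior is the same as a switch with a default branch; only the place where the information lives has changed.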

Empty struct

We can push this interplay between data and code further. We don’t need an integer for each currency, we can have that information in the type itself. With this approach, the struct is empty.

We define the Currency type as an interface, and each currency then implements this interface accordingly:

package currency

type Currency interface {
	String() string
	Alpha() string
	Symbol() string
	Valid() bool
}

type dollar struct{}
type euro struct{}
type empty struct{}

var (
	Dollar = dollar{}
	Euro   = euro{}
	Empty  = empty{}
)

// Dollars
func (dollar) String() string {
	return "US Dollar"
}

func (dollar) Alpha() string {
	return "USD"
}

func (dollar) Symbol() string {
	return "$"
}

func (dollar) Valid() bool {
	return true
}

// Euros
func (euro) String() string {
	return "Euro"
}

func (euro) Alpha() string {
	return "EUR"
}

func (euro) Symbol() string {
	return "€"
}

func (euro) Valid() bool {
	return true
}

// A zero value for the class
func (empty) String() string {
	panic("attempting to access an invalid currency")
}

func (empty) Alpha() string {
	panic("attempting to access an invalid currency")
}

func (empty) Symbol() string {
	panic("attempting to access an invalid currency")
}

func (empty) Valid() bool {
	return false
}
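Since the compiler only checks that a type satisfies an interface at the point where the type is used as that interface, a common Go idiom is to add an explicit compile-time assertion next to the type. The sketch below uses a trimmed-down version of the interface; the `describe` helper is a hypothetical name added for illustration.

```go
package main

import "fmt"

type Currency interface {
	String() string
	Alpha() string
	Symbol() string
}

type dollar struct{}

func (dollar) String() string { return "US Dollar" }
func (dollar) Alpha() string  { return "USD" }
func (dollar) Symbol() string { return "$" }

// Compile-time check: the build fails on this line if dollar ever stops
// satisfying Currency, for example after a method is renamed.
var _ Currency = dollar{}

// describe is a hypothetical client that only sees the interface.
func describe(c Currency) string {
	return fmt.Sprintf("%s (%s, %s)", c.String(), c.Alpha(), c.Symbol())
}

func main() {
	fmt.Println(describe(dollar{})) // US Dollar (USD, $)
}
```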

Comparison

The three approaches above have their pros and cons.

struct
- pros: the variable structs.currency.Dollar cannot be redefined; the variable is of type dollar and these types are unexported.
- cons: we expose the internal structure to library clients
enums
- pros: currencies like enum.currency.Dollar are defined as const, so they cannot be re-assigned
- cons: we need to implement accessor methods (those methods don’t come for free)
empty struct; information in the type
- pros: the variable emptystruct.currency.Dollar cannot be redefined; the variable is of type dollar and these types are unexported. We have well defined interfaces.
- cons: we need to implement accessor methods (those methods don’t come for free in Go, like they do in Scala)

To me, option (3) is what comes the closest to case classes. Unfortunately, option (3) is not very idiomatic Go. That option also relies more heavily on the capabilities of the type system. And that’s where things start to go wrong for Go. To illustrate, let’s look at marshaling.

Marshaling and Unmarshaling

Marshaling is the process of converting an object or data structure from its in-memory representation to a format that can be stored or transmitted. In other words, marshaling takes an object in a program’s memory and converts it to a format that can be written to disk, sent over a network, or otherwise persisted.

It is trivial to marshal and unmarshal structs as in example (1) and enums, like in example (2). When marshaling a struct, you do structural decomposition of the struct’s elements until you get to elementary data types. Thankfully, the json package does that for us, so we don’t even think about this decomposition.
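As a rough illustration of how little work is needed for the first two approaches, here is a self-contained sketch. It assumes the struct layout from example (1) and an integer enum like example (2); the enum is named `Code` here only to avoid a name clash within one file, and `toJSON` and `roundTrip` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Currency and dollar mirror the struct-based example.
type Currency struct {
	Name   string
	Alpha  string
	Symbol string
}

type dollar struct {
	Currency
}

// Code mirrors the enum-based example (renamed to avoid a clash).
type Code int

const (
	Dollar Code = iota
	Euro
)

// toJSON marshals v and returns the JSON text.
func toJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

// roundTrip marshals d and unmarshals the result into a fresh value.
func roundTrip(d dollar) (dollar, error) {
	b, err := json.Marshal(d)
	if err != nil {
		return dollar{}, err
	}
	var out dollar
	err = json.Unmarshal(b, &out)
	return out, err
}

func main() {
	d := dollar{Currency{"US Dollar", "USD", "$"}}
	fmt.Println(toJSON(d))    // {"Name":"US Dollar","Alpha":"USD","Symbol":"$"}
	fmt.Println(toJSON(Euro)) // 1
	fmt.Println(roundTrip(d))
}
```

The embedded struct is flattened into the outer JSON object, and the enum is written as its underlying integer.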

Marshaling for (3) is also easy:

jsonData, err := json.Marshal(currency.Dollar)
if err != nil {
    println(err.Error())
}

How about unmarshal?

var d currency.Currency
err = json.Unmarshal(jsonData, &d)
if err != nil {
    fmt.Println("Error unmarshaling JSON:", err)
}

The code above gives a runtime error: Error unmarshaling JSON: json: cannot unmarshal object into Go value of type currency.Currency. If we trace through the json package, we see the following stack trace:

json.Unmarshal
  json.(*decodeState).unmarshal
    json.(*decodeState).value
      json.(*decodeState).object

The object() function is trying to figure out what we are unmarshaling into. It can unmarshal to the empty interface:

if v.Kind() == reflect.Interface && v.NumMethod() == 0 {
    oi := d.objectInterface()
    v.Set(reflect.ValueOf(oi))
    return nil
}

Otherwise, the object() function checks if the kind of the target is a map or a struct. If it is neither, it returns the error we’ve seen:

default:
    d.saveError(&UnmarshalTypeError{Value: "object", Type: t, Offset: int64(d.off)})
    d.skip()
    return nil

Maybe you are thinking… “wait a minute, how can the unmarshaler be able to tell what object it’s dealing with?” Let’s make the example simpler. Say we write a marshaler that marshals our currency into strings like “Dollar” and “Euro”. Then what we want the unmarshaler to do is simple:

Parse the JSON, unmarshal it as a string, then check:
If the string is “Euro”, return euro{}. If the string is “Dollar” return dollar{}. Otherwise, return empty{}.

Unfortunately, if we try to force the default unmarshaler down this path, we again get a json: cannot unmarshal string into Go value of type currency.Currency error. The call stack is a bit different:

json.Unmarshal
  json.(*decodeState).unmarshal
    json.(*decodeState).value
      json.(*decodeState).object
        json.(*decodeState).literalStore

We get a bit deeper into the decoder. The error message is now coming from literalStore(). At this point, the unmarshaler has determined that it’s unmarshaling a string and it is trying to put it into an interface. Inside literalStore() we see this:

case reflect.Interface:
    if v.NumMethod() == 0 {
        v.Set(reflect.ValueOf(string(s)))
    } else {
        d.saveError(&UnmarshalTypeError{Value: "string", Type: v.Type(), Offset: int64(d.readIndex())})
    }

Again, when trying to unmarshal into an interface, the unmarshaler checks the number of methods declared by the interface. If there is one or more methods declared, the unmarshaler errors out. So it’s possible to unmarshal to an empty interface, but we can’t unmarshal to Currency because Currency declares several methods.
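The difference between the two cases can be reproduced in a few lines. This is a sketch under the assumption of a one-method interface standing in for Currency; `intoEmptyInterface` and `intoCurrency` are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Currency stands in for the interface from the post; one method is
// enough to make it non-empty.
type Currency interface {
	Alpha() string
}

// intoEmptyInterface decodes into interface{}, which declares no methods.
func intoEmptyInterface(data []byte) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal(data, &v)
	return v, err
}

// intoCurrency decodes into an interface that declares methods and reports
// whether the resulting error is a *json.UnmarshalTypeError.
func intoCurrency(data []byte) (bool, error) {
	var c Currency
	err := json.Unmarshal(data, &c)
	var typeErr *json.UnmarshalTypeError
	return errors.As(err, &typeErr), err
}

func main() {
	v, err := intoEmptyInterface([]byte(`"USD"`))
	fmt.Println(v, err) // USD <nil>

	isTypeErr, err := intoCurrency([]byte(`"USD"`))
	fmt.Println(isTypeErr, err)
}
```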

We want to somehow tell the unmarshaler to use custom logic when dealing with the Currency interface. We could then implement this logic in the currency package alongside the interface. The algorithm would work for all known currencies (USD, EUR, etc). By “known” I mean currencies known at compile time. The function would check whether the string denotes the dollar and return the dollar type, and similarly for the euro and so on. We want something like this:

func UnmarshalJSON(data []byte) (Currency, error) {
    var s string
    if err := json.Unmarshal(data, &s); err != nil {
        return Empty, err
    }
    switch s {
    case "USD":
        return Dollar, nil
    case "EUR":
        return Euro, nil
    default:
        return Empty, fmt.Errorf("invalid currency: %q", s)
    }
}

Even if this were possible in Go and the json package, what happens if someone creates a new currency by implementing the interface methods?

Sealed interfaces

Say we managed to ship the currency package out. Then someone comes along and implements another currency; the Brazilian Real, for example. What should our custom unmarshaler do when it encounters this new currency? Since the unmarshaler was implemented before this new currency, it will not recognize that currency and will return the empty one instead along with an error. What can the currency package managers do about this?

Scala faces a similar issue, and it offers a couple of solutions:

we can prevent classes from being extended by declaring them as final, or
we can define a sealed trait. A sealed trait can only be extended in the same source file it is declared.

In Go, we can get a similar effect as sealed traits by adding an unexported function signature to the interface. For example:

type Currency interface {
	String() string
	Alpha() string
	Symbol() string
	Valid() bool
	sealed()
}

Because of sealed(), only the currency package will be able to implement currencies. (Note that there is nothing special about the name “sealed”. We could have used any lowercase function name.)

I learned about sealed interfaces in Go from Chewxy. And by using sealed interfaces, BurntSushi has built a tool for checking exhaustive pattern matching in Go. Neat :-)
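To make the connection to pattern matching concrete, here is a small sketch of a type switch over a sealed interface. It uses a trimmed-down interface, and `symbolOf` is a hypothetical function. Note that the Go compiler itself does not check the switch for exhaustiveness, which is why the default branch is there.

```go
package main

import "fmt"

// Currency is sealed: types in other packages cannot declare the
// unexported sealed method themselves.
type Currency interface {
	Alpha() string
	sealed()
}

type dollar struct{}
type euro struct{}

func (dollar) Alpha() string { return "USD" }
func (dollar) sealed()       {}
func (euro) Alpha() string   { return "EUR" }
func (euro) sealed()         {}

// symbolOf plays the role of a match expression over the known cases.
func symbolOf(c Currency) string {
	switch c.(type) {
	case dollar:
		return "$"
	case euro:
		return "€"
	default:
		// Reached for nil or for a case added later and forgotten here.
		return "?"
	}
}

func main() {
	fmt.Println(symbolOf(dollar{})) // $
	fmt.Println(symbolOf(euro{}))   // €
}
```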

Sealed interfaces solve one of our problems: that of safely extending currencies inside the currency package. But we are still left with one problem: how to plumb the implementation of our unmarshaling function into the encoding/json package. Concretely, we want to use func UnmarshalJSON(data []byte) (Currency, error) inside literalStore() in the encoding/json package.

The problem boils down to associating a “static” function with an interface. Look at the signature of our unmarshaler: it has no receiver.

func UnmarshalJSON(data []byte) (Currency, error) {

The custom unmarshal logic doesn’t operate on an instance; it “operates on the class.” For better or worse, there is no way in Go to express a “static” function of an interface. At least not as far as I know.
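One workaround that is sometimes used, and which is not what the post is asking for, is to wrap the interface in a concrete type and hang the json.Unmarshaler method on the wrapper. The sketch below assumes a trimmed-down sealed interface; `Box`, `encode`, and `decode` are hypothetical names. The cost is that clients must unmarshal into the wrapper rather than into Currency itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Currency interface {
	Alpha() string
	sealed()
}

type dollar struct{}
type euro struct{}

func (dollar) Alpha() string { return "USD" }
func (dollar) sealed()       {}
func (euro) Alpha() string   { return "EUR" }
func (euro) sealed()         {}

// Box is a hypothetical concrete wrapper that gives encoding/json a
// receiver to call; it is not a static method on the interface.
type Box struct {
	Currency
}

// MarshalJSON writes the currency as its alphabetic code.
func (b Box) MarshalJSON() ([]byte, error) {
	if b.Currency == nil {
		return nil, fmt.Errorf("empty currency box")
	}
	return json.Marshal(b.Currency.Alpha())
}

// UnmarshalJSON maps a known code back to its type.
func (b *Box) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	switch s {
	case "USD":
		b.Currency = dollar{}
	case "EUR":
		b.Currency = euro{}
	default:
		return fmt.Errorf("invalid currency: %q", s)
	}
	return nil
}

func encode(c Currency) (string, error) {
	out, err := json.Marshal(Box{c})
	return string(out), err
}

func decode(data string) (Currency, error) {
	var b Box
	if err := json.Unmarshal([]byte(data), &b); err != nil {
		return nil, err
	}
	return b.Currency, nil
}

func main() {
	fmt.Println(encode(dollar{}))
	fmt.Println(decode(`"EUR"`))
	fmt.Println(decode(`"BRL"`))
}
```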

Conclusion

There are many ways to implement something like case classes in Go. None of them seems to do case classes justice, in my opinion. As a Go programmer, I miss algebraic data types. If we had them, error handling in Go would look nicer. But I am getting ahead of myself! Before you think that I’m advocating for changes to Go or the json package, you should watch this talk by Rob Pike. Languages are different. I too get annoyed with the lack of this or that. But I’m also glad that Go isn’t like Scala! Scala is great in many ways, I am glad it exists, and I love functional programming in general. But Go’s goals are different from Scala’s, and I welcome their difference.

Yesterday I defended the PhD

2021-06-15T12:00:00+00:00

As cliché as it sounds, the story of my PhD started in my childhood. Growing up, I wanted to be a scientist. In my teenage years, I had my eyes on a joint math and physics program at the Universidade Estadual de Campinas (UNICAMP), Brazil. I started the program in 2001 and still remember Professor Alcibíades Rigas’s class on linear algebra and analytical geometry. Rigas used a graduate level book written in English as the main resource for this freshman level course, in a country where most don’t speak English. It’s an understatement to say that the class was hard. But rather than an exercise in gratuitous punishment, Rigas helped us build a solid foundation. I fell in love with the campus and the program, but I left midway through my freshman year. While taking entrance exams in Brazil, I had also submitted applications for college in the US. When notified of my acceptance at the Rochester Institute of Technology (RIT), I chose the unknown over a life I had been falling in love with.

Moving to the US was a momentous decision for me. Leaving a liberal, public (free) university that is strong in theory and the sciences and going to a paid, conservative school with a focus on industrial application… had I made a big mistake? The feeling of isolation, the cold, and the political climate post 9/11 weighed hard. But I also made life-long friends during this time, and learned to embrace the engineer in me. In the end, RIT did prepare us well for industry. After college, I worked at Intel on one of the first many-core CPU architectures. At Apple, I worked on the first 64-bit cellphone processor to hit the market. But my childhood dream of being a scientist looked far away in the rear-view mirror. So on the verge of becoming a father, with the encouragement and support of my wife, I took an order of magnitude pay-cut and made a u-turn into graduate school.

I enrolled in the PhD program in Computer Science at the University of California, Santa Cruz (UCSC). Wanting to find my way back to math and science, I took classes in machine learning and in the theory of programming languages. I became interested in logic and was exposed to formal methods. But I struggled to find my footing, and life in the US was not easy for two graduate students with a kid. With the help of the Good Country Index, I made a list of potential places to live. A serendipitous e-mail from Olaf (now my PhD co-advisor) and the support from amazing friends put us in motion towards Norway.

At the University of Oslo (UiO), I continued studying programming languages and formal methods. In this thesis you may sense the push and pull of a person with mixed interests. The operational semantics and the proof by simulation that appear early in the document come from wanting to deepen my mathematical background. The work of manipulating symbols in a formal system, however, is more fitting to a theoretician than to the engineer who I had become. So I am grateful to Martin, my advisor, for taking my interest and curiosity seriously, for encouraging me to develop my own research style, and for helping me bridge my knowledge gap.

I also wanted to build a modest trail, starting with real world source code and veering towards math. A trail that someone like my past self (a programmer who aspires to learn more but who does not yet have graduate-level training) might find useful. With the goal of bringing the thesis’ work to practice, I began looking at source code again. My exposure to industrial code bases and my experience dealing with their complexities helped me a lot. I studied the thread sanitizer library (TSan), the Go data race detector, and the implementation of channels in the Go runtime. What started as tinkering developed into the latter part of the thesis. In the process I was exposed to open-source development, which I have been interested in since my undergraduate studies.

I am tremendously grateful for the journey. Risking opening and finishing with a cliché: I hope you will find the work interesting. Thank you.

That was the preface to my thesis. Below is a technical blurb. If you find it interesting, you can check out the PDF.

Thesis blurb

Go is an open-source programming language developed at Google that has become the underpinning of large amounts of virtual infrastructure, especially in the area of cloud computing. Inspired by Go, this thesis analyzes a programming environment where threads synchronize with each other via the exchange of messages over channels. Go specifies this interaction in a document called the memory model, written in English. We present a mathematical interpretation of the Go memory model document. Our mathematization brings benefits. For example, it allows us to more easily relate the language’s design to the language’s implementation. As evidence, we were able to find and fix a non-trivial bug in the Go data-race detector. Rooted in theory, our improvements to the Go data-race detector were incorporated into release 1.16 of the language. In this thesis (PDF), we share our experience applying formal methods to a large, real-world software project.

Concurrency, Distribution, and the Go memory model

2020-07-07T12:00:00+00:00

The Go memory model specifies two main ways in which channels are used for synchronization:

1. A send onto a channel happens-before the corresponding receive from that channel completes.
2. The $k^{th}$ receive from a channel with capacity $C$ happens-before the $(k+C)^{th}$ send onto that channel completes.

Recall that happens-before is a mathematical relation, as discussed here.

Rule (1) above has been around for a while. The rule is very similar to what was originally proposed by Lamport in 1978. It establishes a happens-before relation between a sender and its corresponding receiver. Rule (2) is a bit more esoteric. It was not present in Lamport’s study of distributed systems. There is a good reason for that absence.

Go is a language in between concurrency and distribution.

Both concurrency and distribution speak of independent agents cooperating on a common task. For that to happen, agents need to coordinate, to synchronize. Although similar in many ways, concurrency and distribution are fundamentally different. Because of this difference, synchronization in the setting of distribution differs from synchronization for concurrency.

In a concurrent system, we assume that the agents are under/within a single environment. In Go, for example, all agents (goroutines) are under a single umbrella, in this case the Go runtime. This overarching environment allows us to assume that no messages are lost during transmission.

In a distributed system, however, there is no such point of authority, at least not without making lots of extra assumptions about the system. For example, it may be impossible to tell whether a message was received. A network delay may be indistinguishable from a crashed/failed node. This impossibility exists even if we label some node as the “authoritative source of information about the state of the system.” After all, what if we are unable to reach this special node? In a distributed system, communication is no longer perfect, and we are forced to deal with this fact at some point.

Locks are often used to program concurrent systems, where the agents are located under a central resource manager. This manager can be the operating system, or a language runtime with the help of the OS. Different from locks, channels are a step towards synchronization in the setting of distribution.

Go borrowed rule (1) from Lamport’s research on distribution. On the other hand, rule (2) comes from the realization that Go is not all the way there. Rule (2) allows for the use of channels as locks, with send acting as acquire and receive as release (see previous post for details):

  T0            T1
c <- 0     |  c <- 0
z := 42    |  z := 43
<- c       |  <- c

Rule (1) gives us an order, while rule (2) is related to mutual exclusion (an order exists, but we don’t know which). In a sense, rule (1) is constructive or intuitionistic, while rule (2) is classical. If you are interested, you can find more in Section 3.5 of our paper Ready, set, Go! Data-race detection and the Go language.
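The capacity-1 example generalizes: with capacity $C$, at most $C$ sends can complete before a receive frees a slot, so the channel behaves like a counting semaphore. The following is a sketch of that reading, not code from the paper; `maxInside` is a hypothetical helper, and the mutex in it is only there for bookkeeping.

```go
package main

import (
	"fmt"
	"sync"
)

// maxInside starts n goroutines that each hold one slot of a channel with
// capacity c while they are "inside", and returns the largest number of
// goroutines observed inside at the same time.
func maxInside(n, c int) int {
	sem := make(chan struct{}, c)
	var (
		mu     sync.Mutex // protects the two counters below, nothing else
		inside int
		peak   int
		wg     sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{} // acquire: blocks while c slots are taken

			mu.Lock()
			inside++
			if inside > peak {
				peak = inside
			}
			mu.Unlock()

			mu.Lock()
			inside--
			mu.Unlock()

			<-sem // release: frees a slot for a later send
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// The peak depends on scheduling, but it never exceeds the capacity.
	fmt.Println(maxInside(50, 3) <= 3) // true
}
```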

Conclusion

While channels are typically used to program distributed systems, Go has a slightly different angle on message passing. Go introduces rule (2), which takes into account the channels’ capacity:

The $k^{th}$ receive from a channel with capacity $C$ happens-before the $(k+C)^{th}$ send onto that channel completes.

With rule (2), we can program channels as locks. This puts the language in the spectrum between concurrency and distribution.

Channels vs Locks

2020-06-24T12:00:00+00:00

With channels, we typically establish that two things are synchronized because A happens-before B. We often know the order in which they happen. In this post, we’ll see a use (or “misuse”) of channels. We will be able to establish that two things are synchronized, but we won’t know the order between them. We won’t know which happened first. We will then relate and contrast channels with locks: how are they different, how are they similar.

The typical example of channel communication is:

  T0            T1
z := 42    |  
c <- 0     |  <- c
           |  z := 43

Thread T0 does something (writes to the shared variable z) and informs a partner, T1, by sending on a channel. T1 learns about the write to z by receiving from the channel. T1 can then use z without causing a data race.
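As a runnable sketch of this diagram, the function below plays T0 and starts T1 as a goroutine. The extra `done` channel is an addition for the sake of the example: it lets T0 wait for T1 so that the final value of z can be read safely. The name `handoff` is hypothetical.

```go
package main

import "fmt"

// handoff runs the T0/T1 exchange and returns the final value of z.
func handoff() int {
	var z int
	c := make(chan int)         // the channel from the diagram
	done := make(chan struct{}) // added so that T0 can wait for T1

	go func() { // T1
		<-c    // the receive completes after T0's send
		z = 43 // ordered after T0's write by rule (1)
		close(done)
	}()

	z = 42 // T0
	c <- 0
	<-done // wait for T1 before reading z
	return z
}

func main() {
	fmt.Println(handoff()) // 43
}
```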

The Go memory model specifies two main ways to synchronize with channels:

1. A send onto a channel happens-before the corresponding receive from that channel completes.
2. The $k^{th}$ receive from a channel with capacity $C$ happens-before the $(k+C)^{th}$ send onto that channel completes.

Rule (1) is the rule used to reason about the example above: T0’s send happens-before T1’s receive. Rule (2) is a bit more esoteric. It allows us to use (or “misuse”, depending on your point of view) channels as locks.

Channels as locks

Here is an example of channels being used as locks.

  T0            T1
c <- 0     |  c <- 0
z := 42    |  z := 43
<- c       |  <- c

In this example, the channel c has capacity 1. Threads T0 and T1 are both trying to access some shared resource, say z. Before accessing z, a thread sends a message on the channel c, and receives from the channel afterwards.

Note that the send and its corresponding receive do not contribute to synchronization in this example. The send is matched by a receive from the same thread; nothing new is learned from this exchange. Rule (1) is moot here: the send happens-before the receive not just because of rule (1) but, more obviously, because of program order. Yet, this program is properly synchronized.

This program is properly synchronized because the channel is acting like a lock: the send acts as the acquire operation, and the receive as the release. What allows for this interaction is Rule (2). To see that, let us assume that T1 is the first thread to perform the send operation. (We could easily apply the same logic assuming T0 was first.) Since the channel has capacity 1, T0 will not be able to send onto the channel until T1 has received from it. Rule (2) then links the reception by T1 to the send by T0: the $0^{th}$ receive happens-before the $1^{st}$ send completes. By this reasoning, we know that the write z := 43 by T1 is in the past of T0. Therefore, T0 can access z without causing a data race.
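The reasoning above can be exercised with a small program in which several goroutines increment a shared counter while holding the "lock". This is a sketch; `count` is a hypothetical helper, and the WaitGroup is only used to wait for the goroutines to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// count lets several goroutines increment a shared variable, using a
// channel of capacity 1 as the lock around each increment.
func count(workers, rounds int) int {
	c := make(chan int, 1) // capacity 1, as in the example
	var wg sync.WaitGroup
	z := 0
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < rounds; i++ {
				c <- 0 // acquire
				z++    // access to the shared resource
				<-c    // release
			}
		}()
	}
	wg.Wait()
	return z
}

func main() {
	fmt.Println(count(4, 1000)) // 4000
}
```

No increment is lost, because rule (2) orders each release before the next acquire.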

For a rigorous discussion, see Section 3.5 of our paper Ready, set, Go! Data-race detection and the Go language.

Conclusion

Go channels are a bit more than channels in their pure sense. In particular, rule (2), which allows us to use channels as locks, gives us extra power. In the next post I argue that this power is not necessarily a good thing. Spoiler alert. The possibility of using channels as locks is a good thing when it comes to concurrency. However, this power makes Go less of a language for distribution.

Also, although we can make channels behave as locks, in this post I discuss how synchronization through locks is fundamentally different from synchronization via channels. The neat thing about the post is that we’ll get to explore the Go runtime. We’ll also bring together many of the concepts we’ve talked about in this blog so far.

What makes Go special

2020-06-23T12:00:00+00:00

What stands out the most in Go, to me, are goroutines and channels. The language was built for the many-core. It sprung from the observation that processors are not getting much faster at doing one thing, but are becoming faster by doing many things at once. To benefit from this trend, we ought to write multithreaded applications. Go makes it super easy to have multiple threads of execution. By prepending the keyword go at invocation, any function can run asynchronously (on its own “thread”). For example, the program below has only one goroutine, the main goroutine.

package main
var a string
func setA() { a = "hello" }

func main() {
  setA()
  print(a)
}

If we prepend the call to setA with go, the function setA will run on its own “thread of execution”, or as we say in Go, goroutine.

package main
var a string
func setA() { a = "hello" }

func main() {
  go setA()
  print(a)
}

The ease with which we can turn sequential code into concurrent code is staggering. With great power, however, comes great responsibility. By making setA run asynchronously, we broke the above program. It is now possible for the print in main to execute before setA has a chance to set a to hello. As a consequence, it is possible for this program to print the empty string. This situation is an example of a data race.

A data race constitutes two unsynchronized accesses to the same memory location, with at least one of the accesses being a write access.

Note that read accesses are never in conflict. In other words, we can’t have a data race between two read accesses. (There is also a definition of data races in terms of traces, and being able to put the two conflicting operations side-by-side in a trace. That definition is in line with the idea of races as two conflicting accesses occurring “at the same time”. In a future post, I’ll analyze the difference between these definitions.)
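To make the definition concrete, here is a minimal sketch of the "conflict" part of it. The Access type and the Conflict function are hypothetical names made up for this illustration. The sketch only checks whether two accesses conflict; whether they are also unsynchronized (and hence a data race) is a separate question that it does not model.

```go
package main

import "fmt"

// Access is a hypothetical record of one memory access made by a goroutine.
type Access struct {
	Goroutine int    // which goroutine performed the access
	Location  string // name of the memory location, e.g. "a"
	IsWrite   bool   // true for a write, false for a read
}

// Conflict reports whether two accesses conflict: they touch the same
// location and at least one of them is a write. Accesses made by the same
// goroutine are ordered by program order, so we only consider accesses
// coming from different goroutines.
func Conflict(x, y Access) bool {
	return x.Goroutine != y.Goroutine &&
		x.Location == y.Location &&
		(x.IsWrite || y.IsWrite)
}

func main() {
	write := Access{Goroutine: 1, Location: "a", IsWrite: true} // a = "hello" in setA
	read := Access{Goroutine: 0, Location: "a", IsWrite: false} // print(a) in main
	other := Access{Goroutine: 2, Location: "a", IsWrite: false}

	fmt.Println(Conflict(write, read)) // a write and a read of the same location
	fmt.Println(Conflict(read, other)) // two reads never conflict
}
```

The write in setA and the read in main conflict, while two reads of the same variable do not.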

Instead of locks, Go advocates synchronization via channel communication, that is, sending and receiving messages on channels. We can repair the program as follows. We’ll create a channel, called done, that is shared between the two goroutines. We’ll send a message after writing to the shared variable in setA, and we’ll receive a message before reading from the shared variable in main.

package main
var done = make(chan bool, 10)
var a string
func setA() { a = "hello"; done <- true }

func main() {
  go setA()
  <- done
  print(a)
}

We can think about the repair as follows. The reception of the message is blocking, meaning that the main goroutine will block until a message is available to be received. Recall from a previous post that, according to the Go memory model, a send on a channel happens-before the corresponding receive completes. So the send and its corresponding receive work to place the setting of a in setA in happens-before relation to the reading of a in main. (You can brush up on the happens-before relation here.)

In the next post, we discuss the difference between concurrency and distribution, relating the two concepts to different types of synchronization. We make a connection between concurrency and locks and between distribution and channels. After that, we look at how channels can be used as locks and later argue that these synchronization primitives are actually fundamentally different.

Rhubarb tang

2020-06-04T10:00:00+00:00

If you are following the blog, you have been busy learning about memory models. We started from the basics: what are memory models? What’s interesting and challenging about them? Then we covered weak memory and got to the point of introducing the concept of the happens-before relation. With that, we visited a real-world memory model specification, that of the Go programming language.

Pat yourself on the back. Great job! It is time to take a refreshing break. In my welcome post, I said I would share some (drink) recipes. Here is one for the summer. I call it “rhubarb tang”.

We’ll use:

Orange juice
Apple juice (optional)
Rhubarb
Sugar
Tequila
Ice

This recipe yields a drink and a dessert. Two for the price of one.

Roasted sweetened rhubarb makes for a nice dessert. We’ll use the liquid that remains to make a tangy drink. Here we go:

Cut the rhubarb into pieces about 3 cm long (about 1 inch).
Place the rhubarb in an oven pan.
Add the juice so as to cover the bottom of the pan.
Top the rhubarb with lots of sugar. Be generous.
Put it in the oven for about 20 minutes at around 200C (390 F).

Once out of the oven and cooled, you can eat the rhubarb for dessert—possibly adding a bit more sugar. We will use the liquid that remained at the bottom of the oven pan to make a drink:

Pour the sweetened juicy rhubarb liquid into a glass.
Add a splash of water (sparkling water is even better).
Add one shot of tequila.
Top it with lots of ice.

You get extra credit if the rhubarb comes from your home garden.

Enjoy!

The Go memory model

2020-03-12T13:00:00+00:00

The Go memory model starts with the following ominous trespassing sign:

If you must read the rest of this document to understand the behavior of your program, you are being too clever.

Don’t be clever.

Feeling clever? Then come on in! Make sure you have brushed up on the concept of happens-before relation from the previous post.

The Go memory model lays out the rules for what values can be read from memory given previous writes to memory from different threads (or goroutines as they are called in Go). It uses the happens-before relation to precisely describe when a read operation can observe a given write. If you remember the previous post and are paying super close attention, you will notice that Go defines the happens-before relation slightly differently from Lamport. The Go specification says:

(0) Instructions within a thread are in happens-before relation. (Same as Lamport)
(1) A send onto a channel happens-before the corresponding receive from that channel completes. (Almost the same as Lamport)
(2) The $k^{th}$ receive from a channel with capacity $C$ happens-before the $(k+C)^{th}$ send onto that channel completes. (New compared to Lamport)

There are a few other rules not mentioned above, such as rules about spawning new threads (or goroutines). But we don’t need them right now. Rules (0), (1), and (2) above will give us plenty to think about.
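Rule (2), the one about the $k^{th}$ receive and the $(k+C)^{th}$ send, is the least familiar of the three, so here is a small sketch of what it buys us. With capacity $C = 1$, a send can act as "acquire" and a receive as "release". The function name countWithChannelLock and the channels lock and done are made up for this illustration; this is one possible way to use the rule, not something prescribed by the Go specification.

```go
package main

import "fmt"

// countWithChannelLock starts n goroutines that each increment a shared
// counter once. A channel of capacity 1 plays the role of a lock:
// sending acquires it, receiving releases it.
func countWithChannelLock(n int) int {
	lock := make(chan bool, 1) // capacity C = 1
	done := make(chan bool)    // used only to wait for the workers
	counter := 0

	for i := 0; i < n; i++ {
		go func() {
			lock <- true // acquire: blocks while another goroutine holds the lock
			counter++    // critical section
			<-lock       // release: the k-th receive happens-before the (k+1)-th send completes
			done <- true // tell the caller that this worker has finished
		}()
	}
	for i := 0; i < n; i++ {
		<-done // each worker's send happens-before the corresponding receive completes
	}
	return counter
}

func main() {
	fmt.Println(countWithChannelLock(100)) // prints 100
}
```

Because the $k^{th}$ release happens-before the $(k+1)^{th}$ acquire completes, the increments are ordered one after the other and none is lost.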

Reasoning with the Go memory model

Let us revisit our good old example from our first post on memory models. In case you don’t remember, here you go again:

     T1         |    T2
z    = 42       |   if (done)
done = true     |     print(z)

We saw that this program is properly synchronized given a strong or sequentially consistent memory model. However, under weak memory, T2 can observe T1’s instructions as if they were executed out of program order. Given this swapping of T1’s instructions, the program is not properly synchronized and T2 may print an uninitialized value of z.

Let us reason about the example from the perspective of the Go memory model. Again, we label the statements in the program as A, B, C and D:

     T1            |    T2
z    = 42     (A)  |   if (done)     (C)
done = true   (B)  |     print(z)    (D)

Given rule (0) from the Go memory model, we infer that A $\rightarrow_{hb}$ B and that C $\rightarrow_{hb}$ D. But that is all we get! There is no way to relate events from thread T1 to events of thread T2. Given the fact that the instructions in T1 and T2 are not related by happens-before, it is possible for D to occur before A in an execution.

If we want to fix this program and make sure T2 will necessarily print 42, then we must ensure that A happens-before D. How can we do that?! In Go, we use channels. Let us replace the setting of done and the checking of done with a send on a channel c and a receive from that channel:

     T1            |    T2
z    = 42     (A)  |   <- c        (C)
c   <- true   (B)  |   print(z)    (D)

The statement c <- true, with the arrow pointing into the channel, means that we are sending the value of true on the channel. The statement <- c, with the arrow pointing away from the channel, means we are receiving a value from the channel.

Because the receive operation is blocking, meaning, it will block until there is something to be received, we know that the receive will only occur once the send has occurred. Indeed, given rule (1) from the Go memory model, which says that “a send happens-before the corresponding receive completes”, we now have B $\rightarrow_{hb}$ C. Thanks to rule (1), we are able to relate events from different threads.

At this point we have:

A $\rightarrow_{hb}$ B by rule (0), program order
B $\rightarrow_{hb}$ C by rule (1), send happens-before completion of receive
C $\rightarrow_{hb}$ D by rule (0), program order

By transitivity of the happens-before relation, we can then conclude that A $\rightarrow_{hb}$ D. We have thus reasoned that this program will necessarily print 42. Neat!
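For readers who want to run the repaired example, here is one way to write it as an actual Go program. It is a sketch: the function name run is made up, T1 becomes a goroutine, T2 is the calling goroutine, and the value is returned instead of printed so that it is easy to check.

```go
package main

import "fmt"

// run executes T1 as a new goroutine and T2 as the calling goroutine,
// and returns the value of z that T2 observes.
func run() int {
	var z int
	c := make(chan bool) // the channel shared by T1 and T2

	go func() { // T1
		z = 42    // (A)
		c <- true // (B)
	}()

	<-c      // (C) blocks until T1's send
	return z // (D) stands in for print(z)
}

func main() {
	fmt.Println(run()) // prints 42
}
```

The labels (A) through (D) match the ones used in the reasoning above.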

Much ado about something

We used the Go memory model and the happens-before relation to analyze a program, and we saw that the original program was not properly synchronized. We then used channels to obtain a new program and we employed the happens-before relation to reason about the correctness of this new program.

You may be wondering… that was a lot of work! Indeed. We are building concepts and mathematical tools for reasoning about programming languages and programs. It takes effort. But the effort can have big pay-offs. Later we will see how these same concepts and tools helped expose a bug in Go that was undetected for many years.

The happens-before relation

2020-03-11T13:00:00+00:00

At the end of the previous post, we saw that a compiler can change the order of instructions in a binary (as long as single-threaded semantics is preserved). These changes can break synchronization, especially if we expect to synchronize by reading and writing to variables. When pondering about synchronization, it would be unreasonable to expect a programmer to peek into the compiler. Luckily, languages come with a memory model: a document that serves as a contract between the programmer and the compiler.

Memory models are often defined in terms of the happens-before relation. Lamport introduced the concept in 1978 when studying the relative order of events in distributed systems. Today, the happens-before relation has become a vehicle for speaking of memory, for defining data races, etc.

The happens-before relation is a mathematical construct. You shouldn’t read “happens-before” and think in terms of natural language. Here are two ways in which our intuition about the words “happens before” breaks. First, happens-before does not necessarily imply an order of execution. Even if A is in happens-before relation to B, it is still possible for B to occur before A in an execution. Hum… Second, even if A necessarily occurs before B, it is still possible for A and B to not be related by happens-before. Double “hum…”

To illustrate, let us go back to our good old example from a previous post.

     T1             |    T2
z    = 42       (A) |   if (done)    (C)
done = true     (B) |     print(z)   (D)

We saw in that same previous post that statements A and B can be swapped without breaking single-threaded semantics. So it is possible for A to be executed after B. This observation holds despite the fact that A is in happens-before relation to B. As will become clear later, A and B are in happens-before relation because they are in program order.

Also, note that B occurs before D in every execution: we can only print when the guard of the if-statement succeeds, and the if-guard can only succeed when done is set to true. Even though B occurs before D in every execution, it does not mean that B is in happens-before relation to D. These facts may seem confusing now, but they should become clearer by the time you finish reading this and the next post.

For now, instead of using our intuition when thinking about happens-before, we will apply a definition.

Definition

Let us write happens-before as $\rightarrow_{hb}$. Similar to how $<$ is a relation between numbers, the happens-before relation is a relation between events. To be more precise, $\rightarrow_{hb}$ is a relation between events emanating from a program’s execution.

Let $a$, $b$, and $c$ be events. For example, $a$ could be the reading of a variable by a thread. Event $b$ could be a write to another variable by another thread. Event $c$ could be some synchronization operation. The happens-before relation was originally defined in terms of message-passing as follows:

(1) If $a$ occurs before $b$ within the same process, then $a \rightarrow_{hb} b$,
(2) If $a$ is the sending of a message and $b$ is the message reception, then $a \rightarrow_{hb} b$,
(3) If $a \rightarrow_{hb} b$ and $b \rightarrow_{hb} c$ then $a \rightarrow_{hb} c$.

Notice that rule (1) above captures the preservation of single-threaded semantics. Single-threaded semantics means that instructions within a thread must appear to be executed in program order. The compiler may still reorder reads and writes within a thread, as long as the thread can’t tell the difference.

Note also that, although happens-before was originally defined by Lamport in terms of message passing, today, the concept is used to describe various types of systems. It is common, for example, to describe locks in terms of the happens-before relation.

Finally, rule (3) means that the happens-before relation, like the less-than relation, is transitive. For example, since $3 < 5$ and $5 < 10$ then $3 < 10$.
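Since happens-before is just a relation between events, we can compute it. The sketch below is a toy: the edge type and the closure function are names made up for this illustration. We feed it the pairs we get from rules (1) and (2), and it applies rule (3) until nothing new can be derived.

```go
package main

import "fmt"

// edge records that event From happens-before event To.
type edge struct{ From, To string }

// closure returns the transitive closure of the given edges, that is, it
// applies rule (3) until no new pair can be added.
func closure(edges []edge) map[edge]bool {
	hb := make(map[edge]bool)
	for _, e := range edges {
		hb[e] = true
	}
	for {
		var found []edge
		for ab := range hb {
			for bc := range hb {
				ac := edge{ab.From, bc.To}
				if ab.To == bc.From && !hb[ac] {
					found = append(found, ac)
				}
			}
		}
		if len(found) == 0 {
			return hb
		}
		for _, e := range found {
			hb[e] = true
		}
	}
}

func main() {
	// The example above: only program order relates the events (rule 1).
	programOrder := []edge{{"A", "B"}, {"C", "D"}}
	hb := closure(programOrder)
	fmt.Println(hb[edge{"A", "B"}], hb[edge{"B", "D"}]) // true false

	// Hypothetical variant: suppose B sent a message that C received (rule 2).
	withMessage := append(programOrder, edge{"B", "C"})
	hb = closure(withMessage)
	fmt.Println(hb[edge{"A", "D"}]) // true
}
```

In the first run, B and D are not related by happens-before even though B occurs before D in every execution. In the second run, the hypothetical message from B to C is enough for rule (3) to derive that A happens-before D.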

We are now prepared to look into the Go memory model. But we will have to do a little trespassing… Follow along in the next post.

Weak memory models

2020-03-06T09:00:00+00:00

In the previous post, we touched on consequences of our quest for performance. We saw that, by relaxing the order of execution of instructions, compilers are able to produce faster binaries, and processors are able to execute these binaries faster. We want this rearranging to be done for us because efficient instruction scheduling is a science in itself. Also in the previous post, we touched on the concept of sequential consistency.

In this post, we will discuss single-threaded semantics, compositionality, and their relation to weak memory models.

Compositionality and single-threaded semantics

Single-threaded semantics means that each thread must not be able to tell that the order of its instructions has been messed with. So, if I am a thread, I must not be able to tell that my instructions have been rearranged. But if you are an external thread, then you may be able to tell the difference: you may be able to notice that my instructions are being executed out of program order.

What is interesting is that single-threaded semantics is not compositional: You can make program transformations that preserve single-threaded semantics. While each thread behaves the same before and after the transformations, the overall program behavior changes. In other words, the whole is not always the same as the sum of its parts.

For example, let x and y be shared variables initialized to 0, and r1 and r2 be registers local to the threads.

   T1               T2
x  = 1       |   y  = 1
r1 = y       |   r2 = x
print r1     |   print r2

You should convince yourself that this program can print the following pairs of values: {(0,1), (1,0), (1,1)}.

I will now swap the first two instructions in T1 to obtain a new program. Maybe this new program executes faster.

   T1               T2
r1 = y       |   y  = 1
x  = 1       |   r2 = x
print r1     |   print r2

The swap is not noticeable from the point of view of T1: before the swap, T1 could print 0 or 1 and, after the swap, T1 can still print 0 or 1. No change. However, these are now the possible pairs of values printed by the program: {(0,0), (0,1), (1,0), (1,1)}. Wait a minute! The pair (0,0) is new. This pair didn’t exist before the swapping of T1’s instructions! Is this allowed?!

In many relaxed memory models, the swap described above is totally acceptable. Single-threaded semantics has not changed: individually, T1 and T2 behave exactly the same before and after the reordering of instructions. However, it is possible to preserve single-threaded semantics and still obtain a different program as a whole.

While sequential consistency imposes program-order across the board, relaxed memory models only preserve single-thread semantics.
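If you would rather not enumerate the interleavings by hand, the two sets of pairs above can be checked by brute force. The sketch below is a toy model, not a real memory model checker: the instr type and the outcomes and step functions are names made up for this illustration. It runs every interleaving of the two threads that respects each thread's instruction order and collects the (r1, r2) pairs.

```go
package main

import (
	"fmt"
	"sort"
)

// instr is either a store of 1 to a shared variable or a load of a shared
// variable into the executing thread's register.
type instr struct {
	store bool
	v     string // shared variable: "x" or "y"
}

var (
	storeX = instr{store: true, v: "x"}
	storeY = instr{store: true, v: "y"}
	loadX  = instr{v: "x"}
	loadY  = instr{v: "y"}
)

// step executes one instruction on a copy of memory and returns the new
// memory together with the new register value.
func step(in instr, mem map[string]int, reg int) (map[string]int, int) {
	next := map[string]int{}
	for k, v := range mem {
		next[k] = v
	}
	if in.store {
		next[in.v] = 1
	} else {
		reg = next[in.v]
	}
	return next, reg
}

// outcomes explores every interleaving of t1 and t2 that keeps each thread's
// instructions in the given order, and returns the sorted (r1,r2) pairs.
func outcomes(t1, t2 []instr) []string {
	seen := map[string]bool{}
	var walk func(i, j int, mem map[string]int, r1, r2 int)
	walk = func(i, j int, mem map[string]int, r1, r2 int) {
		if i == len(t1) && j == len(t2) {
			seen[fmt.Sprintf("(%d,%d)", r1, r2)] = true
			return
		}
		if i < len(t1) { // let T1 take the next step
			m, r := step(t1[i], mem, r1)
			walk(i+1, j, m, r, r2)
		}
		if j < len(t2) { // let T2 take the next step
			m, r := step(t2[j], mem, r2)
			walk(i, j+1, m, r1, r)
		}
	}
	walk(0, 0, map[string]int{"x": 0, "y": 0}, 0, 0)

	var pairs []string
	for p := range seen {
		pairs = append(pairs, p)
	}
	sort.Strings(pairs)
	return pairs
}

func main() {
	// Original program: T1 = {x = 1; r1 = y}, T2 = {y = 1; r2 = x}.
	fmt.Println(outcomes([]instr{storeX, loadY}, []instr{storeY, loadX}))
	// After swapping T1's first two instructions: T1 = {r1 = y; x = 1}.
	fmt.Println(outcomes([]instr{loadY, storeX}, []instr{storeY, loadX}))
}
```

The first line of output lists three pairs and the second lists four, with (0,0) showing up only after the swap.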

M for model and for meaning

There are different sources of weakness in a memory model. Reordering of instructions can happen at the hardware level (at execution time) and at the software level (at compilation time). Also, in hardware, memory hierarchy (like processor store buffers, load buffers, caches) can lead to weaknesses in the memory model. In fact, a bewildering number of models exist.

How can I then tell how my program really behaves?! One answer is, a compiler makes sure a program behaves the same across different architectures. The compiler does so by inserting the proper synchronization primitives for the given compilation target. When pondering about program behavior and synchronization, however, it would be unreasonable to expect a programmer to peek into the compiler. Luckily we don’t have to do that. Languages come with a memory model: a document that serves as a contract between the programmer and the compiler.

Note that there exists a tension here. Programmers expect clear constructs and simple explanations. Compiler writers want to implement complex optimizations and want freedom to evolve a language. These two camps can find themselves at odds. It is the job of a language’s memory model to bring these camps together. Regardless of the complexity of the optimizations, a program shall behave as described by the model.

Defining a reasonable memory model is a juggling act. We will look at a real-world memory model soon. But first, in the next post we cover concepts used in the specification of these models.