Go, Scala and case classes
In this blog post, we will take a look at Go and Scala, and specifically, at their approach to case classes.
One of Scala’s key features is its support for case classes. Case classes are meant for holding immutable data. They are similar to regular classes, but they come with a number of useful features out of the box, such as generated toString and copy methods and support for pattern matching.
Here’s an example of two case classes, Dollar and Euro, that extend a common Currency class in Scala:
abstract class Currency(val name: String, val alpha: String, val symbol: String)
case class Dollar() extends Currency("US Dollar", "USD", "$")
case class Euro() extends Currency("Euro", "EUR", "€")
val dollar = Dollar()
val euro = Euro()
println(dollar.name) // Output: US Dollar
println(dollar.alpha) // Output: USD
println(dollar.symbol) // Output: $
println(euro.name) // Output: Euro
println(euro.alpha) // Output: EUR
println(euro.symbol) // Output: €
Go, on the other hand, does not have built-in support for case classes. We will look at three different approaches that can be used instead:
- structs
- enums
- interfaces and “empty” type definitions
Structs
We use structs to hold data. Going back to the currency example, we can define a Currency as such:
package main
import (
	"fmt"
)
type Currency struct {
	Name   string
	Alpha  string
	Symbol string
}
type dollar struct {
	Currency
}
type euro struct {
	Currency
}
var (
	Dollar = dollar{Currency{"US Dollar", "USD", "$"}}
	Euro   = euro{Currency{"Euro", "EUR", "€"}}
)
func main() {
	fmt.Println(Dollar.Name) // Output: US Dollar
	fmt.Println(Dollar.Alpha) // Output: USD
	fmt.Println(Dollar.Symbol) // Output: $
	fmt.Println(Euro.Name) // Output: Euro
	fmt.Println(Euro.Alpha) // Output: EUR
	fmt.Println(Euro.Symbol) // Output: €
}
Enums
Enums, short for enumerations, are a type in programming that allows you to define a set of named values. They are used to represent a fixed set of possible values for a variable, parameter, or property.
Here is an alternative implementation to our “currency” example. In this implementation, we are removing information from the structs and putting it in methods. If you squint, you can see the interplay between data and code: Functions can be implemented as table look-ups, where the function’s behavior is precomputed and stored in a data structure, rather than being computed on the fly. Conversely, data can sometimes be computed or generated on-the-fly by code, rather than being stored.
The struct here would only have to hold one integer for each currency… so there is no point in having a struct at all! Instead of a struct holding an integer, we just have the integer. And we have one integer value for each currency. In other words, the struct collapses into an enum.
package main
import (
	"fmt"
)
type Currency int
const (
	Dollar Currency = iota
	Euro
)
func (c Currency) String() string {
	switch c {
	case Dollar:
		return "US Dollar"
	case Euro:
		return "Euro"
	default:
		return "Unknown"
	}
}
func (c Currency) Alpha() string {
	switch c {
	case Dollar:
		return "USD"
	case Euro:
		return "EUR"
	default:
		return "Unknown"
	}
}
func (c Currency) Symbol() string {
	switch c {
	case Dollar:
		return "$"
	case Euro:
		return "€"
	default:
		return "?"
	}
}
func main() {
	dollar := Dollar
	euro := Euro
	fmt.Println(dollar.String()) // Output: US Dollar
	fmt.Println(dollar.Symbol()) // Output: $
	fmt.Println(euro.String()) // Output: Euro
	fmt.Println(euro.Symbol()) // Output: €
}
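The table look-up view of this enum can also be made literal. The following is a minimal sketch, not the post's implementation: the `info` type, the `currencyInfo` table and the `lookup` helper are hypothetical names introduced only for illustration. The switch statements are replaced by a single array indexed by the enum value.

```go
package main

import "fmt"

// Currency is an enum; each value doubles as an index into the table below.
type Currency int

const (
	Dollar Currency = iota
	Euro
)

// info is a hypothetical row type holding the precomputed answers
// for one currency. It is not part of the original code.
type info struct {
	name, alpha, symbol string
}

// currencyInfo is the look-up table, keyed by the enum constants.
var currencyInfo = [...]info{
	Dollar: {"US Dollar", "USD", "$"},
	Euro:   {"Euro", "EUR", "€"},
}

// lookup returns the row for c, or a placeholder row for values
// outside the table, mirroring the default branches of the switches.
func (c Currency) lookup() info {
	if c < 0 || int(c) >= len(currencyInfo) {
		return info{"Unknown", "Unknown", "?"}
	}
	return currencyInfo[c]
}

func (c Currency) String() string { return c.lookup().name }
func (c Currency) Alpha() string  { return c.lookup().alpha }
func (c Currency) Symbol() string { return c.lookup().symbol }

func main() {
	fmt.Println(Dollar, Dollar.Alpha(), Dollar.Symbol()) // US Dollar USD $
	fmt.Println(Euro, Euro.Alpha(), Euro.Symbol())       // Euro EUR €
	fmt.Println(Currency(42))                            // Unknown
}
```

Whether the data lives in a table or in switch statements is mostly a matter of taste here; the table keeps all the facts about a currency on one line.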
Empty struct
We can push this interplay between data and code further. We don’t need an integer for each currency; we can have that information in the type itself. With this approach, the struct is empty.
We define the Currency type as an interface, and each currency then implements this interface accordingly:
package currency
type Currency interface {
	String() string
	Alpha() string
	Symbol() string
	Valid() bool
}
type dollar struct{}
type euro struct{}
type empty struct{}
var (
	Dollar = dollar{}
	Euro   = euro{}
	Empty  = empty{}
)
// Dollars
func (dollar) String() string {
	return "US Dollar"
}
func (dollar) Alpha() string {
	return "USD"
}
func (dollar) Symbol() string {
	return "$"
}
func (dollar) Valid() bool {
	return true
}
// Euros
func (euro) String() string {
	return "Euro"
}
func (euro) Alpha() string {
	return "EUR"
}
func (euro) Symbol() string {
	return "€"
}
func (euro) Valid() bool {
	return true
}
// A zero value for the class
func (empty) String() string {
	panic("attempting to access an invalid currency")
}
func (empty) Alpha() string {
	panic("attempting to access an invalid currency")
}
func (empty) Symbol() string {
	panic("attempting to access an invalid currency")
}
func (empty) Valid() bool {
	return false
}
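To see how client code might consume this interface, here is a small self-contained sketch. It puts everything in package main so it can run on its own, it uses Valid() as the method name (as the implementations above do), and the `describe` helper is a hypothetical function added for illustration. The blank-identifier assignments make the compiler verify that each type really implements Currency.

```go
package main

import "fmt"

type Currency interface {
	String() string
	Alpha() string
	Symbol() string
	Valid() bool
}

type dollar struct{}
type empty struct{}

func (dollar) String() string { return "US Dollar" }
func (dollar) Alpha() string  { return "USD" }
func (dollar) Symbol() string { return "$" }
func (dollar) Valid() bool    { return true }

func (empty) String() string { panic("attempting to access an invalid currency") }
func (empty) Alpha() string  { panic("attempting to access an invalid currency") }
func (empty) Symbol() string { panic("attempting to access an invalid currency") }
func (empty) Valid() bool    { return false }

// Compile-time checks: the build fails if a method is missing or misnamed.
var (
	_ Currency = dollar{}
	_ Currency = empty{}
)

// describe is a hypothetical client helper. It checks Valid() first so
// that the panicking accessors of the empty currency are never reached.
func describe(c Currency) string {
	if !c.Valid() {
		return "invalid currency"
	}
	return fmt.Sprintf("%s (%s, %s)", c.String(), c.Alpha(), c.Symbol())
}

func main() {
	fmt.Println(describe(dollar{})) // US Dollar (USD, $)
	fmt.Println(describe(empty{}))  // invalid currency
}
```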
Comparison
The three approaches above have their pros and cons.
- struct
  - pros: the variable structs.currency.Dollar cannot be redefined; the variable is of type dollar and these types are unexported.
  - cons: we expose the internal structure to library clients
- enums
  - pros: currencies like enum.currency.Dollar are defined as const, so they cannot be re-assigned
  - cons: we need to implement accessor methods (those methods don’t come for free)
- empty struct; information in the type
  - pros: the variable emptystruct.currency.Dollar cannot be redefined; the variable is of type dollar and these types are unexported. We have well defined interfaces.
  - cons: we need to implement accessor methods (those methods don’t come for free in Go, like they do in Scala)
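The "exposed internal structure" drawback of the struct approach is easy to demonstrate. In the sketch below, the `rename` function is a hypothetical stand-in for client code; it lives in the same file only to keep the example self-contained. The same assignment would compile from another package, because the promoted field Name is exported even though the type dollar is not.

```go
package main

import "fmt"

type Currency struct {
	Name   string
	Alpha  string
	Symbol string
}

type dollar struct {
	Currency
}

var Dollar = dollar{Currency{"US Dollar", "USD", "$"}}

// rename plays the role of a library client. It cannot name the
// unexported type dollar, but it can still overwrite the exported,
// promoted fields of the package-level variable.
func rename() string {
	Dollar.Name = "Not a dollar"
	return Dollar.Name
}

func main() {
	fmt.Println(Dollar.Name) // US Dollar
	fmt.Println(rename())    // Not a dollar
}
```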
To me, option (3) is what comes the closest to case classes. Unfortunately, option (3) is not very idiomatic Go. That option also relies more heavily on the capabilities of the type system. And that’s where things start to go wrong for Go. To illustrate, let’s look at marshaling.
Marshaling and Unmarshaling
Marshaling is the process of converting an object or data structure from its in-memory representation to a format that can be stored or transmitted. In other words, marshaling takes an object in a program’s memory and converts it to a format that can be written to disk, sent over a network, or otherwise persisted.
It is trivial to marshal and unmarshal structs as in example (1) and enums, like in example (2). When marshaling a struct, you do structural decomposition of the struct’s elements until you get to elementary data types. Thankfully, the json package does that for us, so we don’t even think about this decomposition.
Marshaling for (3) is also easy:
jsonData, err := json.Marshal(currency.Dollar)
if err != nil {
	println(err.Error())
}
How about unmarshal?
var d currency.Currency
err = json.Unmarshal(jsonData, &d)
if err != nil {
	fmt.Println("Error unmarshaling JSON:", err)
}
The code above fails at run time with: Error unmarshaling JSON: json: cannot unmarshal object into Go value of type currency.Currency. If we trace through the json package, we see the following call stack:
json.Unmarshal
json.(*decodeState).unmarshal
json.(*decodeState).value
json.(*decodeState).object
The object() function is trying to figure out what we are unmarshaling into. It can unmarshal to the empty interface:
if v.Kind() == reflect.Interface && v.NumMethod() == 0 {
	oi := d.objectInterface()
	v.Set(reflect.ValueOf(oi))
	return nil
}
Otherwise, the object() function checks if the kind of the target is a map or a struct. If it is neither, it returns the error we’ve seen:
default:
	d.saveError(&UnmarshalTypeError{Value: "object", Type: t, Offset: int64(d.off)})
	d.skip()
	return nil
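The failure can be reproduced in a single file. This is a minimal sketch under a few assumptions: the types live in package main (so the error names main.Currency rather than currency.Currency), the interface is cut down to one method, and `roundTrip` and `typeErrorValue` are hypothetical helpers added for the example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

type Currency interface {
	String() string
}

type dollar struct{}

func (dollar) String() string { return "US Dollar" }

// roundTrip marshals a dollar and then tries to unmarshal the result
// into a variable of the (non-empty) interface type Currency.
func roundTrip() (string, error) {
	data, err := json.Marshal(dollar{})
	if err != nil {
		return "", err
	}
	var c Currency
	return string(data), json.Unmarshal(data, &c)
}

// typeErrorValue reports which kind of JSON value the decoder
// refused to store, or "" if err is not an UnmarshalTypeError.
func typeErrorValue(err error) string {
	var typeErr *json.UnmarshalTypeError
	if errors.As(err, &typeErr) {
		return typeErr.Value
	}
	return ""
}

func main() {
	data, err := roundTrip()
	fmt.Println(data)                // {}
	fmt.Println(typeErrorValue(err)) // object
}
```

Marshaling succeeds because an empty struct has no fields to encode, so the output is simply {}. The decoder then has an object in hand and nowhere to put it.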
Maybe you are thinking… “wait a minute, how can the unmarshaler be able to tell what object it’s dealing with?” Let’s make the example simpler. Say we write a marshaler that marshals our currency into strings like “Dollar” and “Euro”. Then what we want the unmarshaler to do is simple:
- Parse the JSON, unmarshal it as a string, then check:
- If the string is “Euro”, return euro{}. If the string is “Dollar”, return dollar{}. Otherwise, return empty{}.
Unfortunately, if we try to force the default unmarshaler down this path, we again get a json: cannot unmarshal string into Go value of type currency.Currency error. The call stack is a bit different:
json.Unmarshal
json.(*decodeState).unmarshal
json.(*decodeState).value
json.(*decodeState).object
json.(*decodeState).literalStore
We get a bit deeper into the decoder: the error message now comes from literalStore(). At this point, the unmarshaler has determined that it is unmarshaling a string and is trying to store it in an interface. Inside literalStore() we see this:
case reflect.Interface:
	if v.NumMethod() == 0 {
		v.Set(reflect.ValueOf(string(s)))
	} else {
		d.saveError(&UnmarshalTypeError{Value: "string", Type: v.Type(), Offset: int64(d.readIndex())})
	}
Again, when trying to unmarshal into an interface, the unmarshaler checks the number of methods declared by the interface. If one or more methods are declared, the unmarshaler errors out. So it’s possible to unmarshal to an empty interface, but we can’t unmarshal to Currency because Currency declares several methods.
We want to somehow tell the unmarshaler to use custom logic when dealing with the Currency interface. We could then implement this logic in the currency package alongside the interface. The algorithm would work for all known currencies (USD, EUR, etc.). By “known” I mean currencies known at compile time. The function would check whether the string is USD and return the dollar value, similarly for EUR, and so on. We want something like this:
func UnmarshalJSON(data []byte) (Currency, error) {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return Empty, err
	}
	switch s {
	case "USD":
		return Dollar, nil
	case "EUR":
		return Euro, nil
	default:
		return Empty, fmt.Errorf("invalid currency: %q", s)
	}
}
Even if this were possible in Go and the json package, what happens if someone creates a new currency by implementing the interface methods?
Sealed interfaces
Say we managed to ship the currency package out. Then someone comes along and implements another currency; the Brazilian Real, for example. What should our custom unmarshaler do when it encounters this new currency? Since the unmarshaler was implemented before this new currency, it will not recognize that currency and will return the empty one instead, along with an error. What can the maintainers of the currency package do about this?
Scala faces a similar issue, and it offers a couple of solutions:
- we can prevent classes from being extended by declaring them as final, or
- we can define a sealed trait. A sealed trait can only be extended in the same source file it is declared.
In Go, we can get an effect similar to sealed traits by adding an unexported method to the interface. For example:
type Currency interface {
	String() string
	Alpha() string
	Symbol() string
	Valid() bool
	sealed()
}
Because of sealed(), only the currency package will be able to implement currencies. (Note that there is nothing special about the name “sealed”. We could have used any lowercase function name.)
I learned about sealed interfaces in Go from Chewxy. And by using sealed interfaces, BurntSushi has built a tool for checking exhaustive pattern matching in Go. Neat :-)
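Here is a minimal sketch of what a sealed interface buys us in practice. Everything sits in package main so the example runs on its own, which means the sealing itself is not exercised (it only matters across package boundaries). The `alpha` function is a hypothetical helper showing a type switch over the closed set of implementations.

```go
package main

import "fmt"

type Currency interface {
	String() string
	sealed() // unexported: other packages cannot declare this method
}

type dollar struct{}
type euro struct{}

func (dollar) String() string { return "US Dollar" }
func (dollar) sealed()        {}

func (euro) String() string { return "Euro" }
func (euro) sealed()        {}

// alpha switches over the known implementations. The compiler does not
// check that the cases are exhaustive, so a default branch is still
// needed, for instance for a nil Currency.
func alpha(c Currency) string {
	switch c.(type) {
	case dollar:
		return "USD"
	case euro:
		return "EUR"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(alpha(dollar{})) // USD
	fmt.Println(alpha(euro{}))   // EUR
	fmt.Println(alpha(nil))      // unknown
}
```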
Sealed interfaces solve one of our problems: that of safely extending currencies inside the currency package. But we are still left with one problem: how to plumb the implementation of our unmarshaling function into the encoding/json package. Concretely, we want to use func UnmarshalJSON(data []byte) (Currency, error) inside literalStore() in the encoding/json package.
The problem boils down to associating a “static” function with an interface. Look at the signature of our unmarshaler; it has no receiver:
func UnmarshalJSON(data []byte) (Currency, error)
The custom unmarshal logic doesn’t operate on an instance; it “operates on the class.” For better or worse, there is no way in Go to express a “static” function of an interface. At least not as far as I know.
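One workaround that stays within encoding/json is to wrap the interface in a concrete type and hang the custom logic on that wrapper instead. To be clear, this is not something the post proposes, and it does not attach anything to the interface itself. It is only a sketch: `Holder`, `Price`, `parseCurrency` and `decodePrice` are hypothetical names, and the interface is reduced to a single method. The wrapper implements json.Unmarshaler, which the decoder does look for on concrete types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Currency interface {
	Alpha() string
}

type dollar struct{}
type euro struct{}

func (dollar) Alpha() string { return "USD" }
func (euro) Alpha() string   { return "EUR" }

// parseCurrency plays the role of the "static" function: it has no receiver.
func parseCurrency(data []byte) (Currency, error) {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	switch s {
	case "USD":
		return dollar{}, nil
	case "EUR":
		return euro{}, nil
	default:
		return nil, fmt.Errorf("invalid currency: %q", s)
	}
}

// Holder is a hypothetical concrete wrapper around the interface.
type Holder struct {
	Value Currency
}

// UnmarshalJSON makes *Holder a json.Unmarshaler.
func (h *Holder) UnmarshalJSON(data []byte) error {
	c, err := parseCurrency(data)
	if err != nil {
		return err
	}
	h.Value = c
	return nil
}

// Price is a hypothetical message that carries a currency.
type Price struct {
	Amount   int    `json:"amount"`
	Currency Holder `json:"currency"`
}

func decodePrice(s string) (Price, error) {
	var p Price
	err := json.Unmarshal([]byte(s), &p)
	return p, err
}

func main() {
	p, err := decodePrice(`{"amount":5,"currency":"EUR"}`)
	fmt.Println(p.Amount, p.Currency.Value.Alpha(), err) // 5 EUR <nil>

	_, err = decodePrice(`{"amount":5,"currency":"BRL"}`)
	fmt.Println(err != nil) // true
}
```

The cost is that every struct that carries a currency has to use the wrapper type rather than Currency directly, which is a long way from what case classes give us for free.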
Conclusion
There are many ways to implement something like case classes in Go. None of them seems to do the idea justice, in my opinion. As a Go programmer, I miss algebraic data types. If we had them, error handling in Go would look nicer. But I am getting ahead of myself! Before you think that I’m advocating for changes to Go or the json package, you should watch this talk by Rob Pike. Languages are different. I too get annoyed with the lack of this or that. But I’m also glad that Go isn’t like Scala! Scala is great in many ways, I am glad it exists, and I love functional programming in general. But Go’s goals are different from Scala’s, and I welcome the difference.