Open Credo

October 13, 2016 | Data Analysis

From Java to Go, and Back Again

WRITTEN BY

Dominic Fox

Introduction: fitness landscapes, Cassandra, and making things worse

 

In Lisp, you don’t just write your program down toward the language, you also build the language up toward your program. As you’re writing a program you may think “I wish Lisp had such-and-such an operator.” So you go and write it. Afterward you realize that using the new operator would simplify the design of another part of the program, and so on. Language and program evolve together…In the end your program will look as if the language had been designed for it. And when language and program fit one another well, you end up with code which is clear, small, and efficient.
– Paul Graham, Programming Bottom-Up

Visualization of two dimensions of an NK fitness landscape

In evolutionary biology, a fitness landscape is a way of visualising the relationship between genotypes and reproductive success. Each possible variation of a genotype is projected onto a position on a map, and the height of the terrain at that position represents the reproductive success associated with that variation. We can then imagine a sequence of genetic variations as a walk on this map. The rules for walking a fitness landscape are that you can only move around the map in small steps – no sudden leaps from one position to another – and steps in the “wrong” direction, i.e. those that decrease your reproductive success, are punished with a lower chance of being able to continue your walk around the map. A walker placed near a peak on the map will therefore tend to move “uphill” towards that peak, rather than heading down into the valleys to go and search for a higher peak elsewhere.

This way of visualising the behaviour of genetic algorithms enables us to see how it is that such algorithms will tend to settle on “locally optimal” solutions rather than finding a “global optimum”, and highlight a problem with purely incremental approaches to exploring a range of possible designs: if each increment must deliver a quantifiable improvement, and sudden leaps to another point on the map are forbidden, then we may get stuck on a locally optimal design and never reach the peaks that might be found elsewhere. If we’re already sitting at the top of a locally optimal solution, there may be nowhere we can go. Sometimes, things have to get worse before they can get better.
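The walk described above can be sketched in a few lines of Go. This is a toy one-dimensional landscape with made-up fitness values (purely illustrative): a walker that only takes single steps, and never steps downhill, gets stuck on whichever peak is nearest its starting point.

```go
package main

import "fmt"

// A toy 1-D "fitness landscape": a local peak at x=2 (fitness 3)
// and a higher global peak at x=8 (fitness 10).
var fitness = []int{0, 1, 3, 1, 0, 2, 5, 8, 10, 4}

// hillClimb takes only single steps and never accepts a move that lowers
// fitness, so it halts at the first local optimum it reaches.
func hillClimb(x int) int {
	for {
		best := x
		for _, next := range []int{x - 1, x + 1} {
			if next >= 0 && next < len(fitness) && fitness[next] > fitness[best] {
				best = next
			}
		}
		if best == x {
			return x // no uphill neighbour: a local optimum
		}
		x = best
	}
}

func main() {
	fmt.Println(hillClimb(1)) // 2 - stuck on the local peak
	fmt.Println(hillClimb(6)) // 8 - reaches the global peak only if started nearby
}
```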

In OpenCredo’s October Cassandra Webinar, I described the distributed key-value store Cassandra as “unsolving” the problems solved by relational database management systems, in order to achieve performance and scalability characteristics that were beyond the reach of such systems. Cassandra throws out almost all of the data management features of the RDBMS, replacing them with a simple yet powerful model of data addressing based on two-dimensional keys. If we understand how that model works, and how best to work with it, then we can get exceptional performance out of Cassandra; but if we try to use Cassandra like an RDBMS, we will quickly discover that Cassandra is a worse RDBMS in almost every possible way.

If we visualise the space of possible data management systems as a fitness landscape, we can see that Cassandra and an RDBMS sit at (or near) different “peaks” in the landscape: by very loose analogy with evolution, they occupy different “niches”, being optimised for different things. Of course, Cassandra didn’t evolve from the RDBMS by small increments: its developers didn’t begin with an RDBMS and progressively make it worse until the Cassandra data model hove into view. Rather, Cassandra is the result of choosing a different starting point altogether in the design space.

When I first started learning Google’s Go language, a few months ago, my immediate impression was of a language that had deliberately been made worse in multiple ways. I’ve heard fellow developers complain bitterly about how retrograde its design is, how wilfully ugly and restrictive. For them, using Go was like rolling down into a trough in the fitness landscape. They couldn’t wait to scramble back up to the heights of elegance and productivity offered by Java. But even as I winced at some of the things Go was taking away from me, I wondered what it was optimising for instead. What if Go wasn’t a worse Java, but a better something else? I decided to stick with the language, and see what it might have to teach me about an area of the programming language design space I hadn’t explored before.

Go: What Is It Good For?

A programming language is low level when its programs require attention to the irrelevant.
– Alan Perlis, Epigrams on Programming

As much as it has its detractors, Go also has many fans: people who love programming in it, who find it satisfyingly productive, who read their own Go code with pleasure and confidence. Let’s assume they’re not simply deluded. What is Go doing for these developers?

A commonplace among Go programmers is that other people’s code written in Go is usually easy to read and understand. When I was given the task of implementing a new Terraform provider, I went to the Terraform codebase and read through the implementation code for other providers. I was able to follow it almost immediately, and had a working prototype of my own provider up and running the same day. While “readability” is a very subjective notion, Go code has some specific properties which seem to aid comprehension.

Chiefly, Go sacrifices expressiveness for uniformity. I like to think of programming as a “creative” activity, and as part of that I like to develop within each system an idiom which fits the way I think about the domain. Paul Graham described this approach many years ago as one in which you first build up the language you want to write your program in, then write the program itself, so that the latter seems to fall naturally and expressively out of the domain abstractions you have created for it. Go provides very limited means for doing this: no macros, no generics, no concise lambda syntax, an approach to error handling which makes “fluent” APIs impracticable, and so on. Almost all the tricks I’ve built up over the years for creating embedded DSLs in Java or Ruby or Clojure are simply unavailable. You basically have no choice but to write your program in more or less “vanilla” Go.

On the other hand, if there’s one complaint I’ve heard consistently from my fellow developers over the years, it’s that they can’t understand my code until they also understand the set of abstractions I’ve introduced in order to make the logic of it easier to express. Once they do understand them, they’re usually comfortable (once or twice I’ve had to roll something back out because nobody else could see the point of it; they were usually right, in retrospect). Je ne regrette rien. But you do have to work to get people over the initial hurdle; and that means that they have to work, too. This is practicable in a small, close-knit team where you can just wander over to someone’s desk and talk them through something unfamiliar, but it presents a challenge when your codebase will be read and potentially worked on by hundreds or thousands of people, situated in different timezones, across a period of decades. That’s Google’s reality, and Go is optimised for that scenario.

Here’s a concrete example. Suppose we wanted to append an exclamation mark to each of the strings in a collection, creating a new collection with the modified strings. In Kotlin, that would be
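a single `map` call (a sketch; the list contents are illustrative):

```kotlin
fun main() {
    val strings = listOf("foo", "bar")       // illustrative sample data
    val exclaimed = strings.map { it + "!" } // map builds a new list
    println(exclaimed)                       // [foo!, bar!]
}
```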

In Go, you could write
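something along these lines (a sketch, with illustrative sample values; the `mapStringToString` helper is the one discussed below):

```go
package main

import "fmt"

// mapStringToString maps a []string to a new []string by applying f to
// each element. Without generics, both element types are fixed to string.
func mapStringToString(in []string, f func(string) string) []string {
	out := make([]string, 0, len(in))
	for _, s := range in {
		out = append(out, f(s))
	}
	return out
}

func main() {
	strings := []string{"foo", "bar"} // illustrative sample data
	exclaimed := mapStringToString(strings, func(s string) string {
		return s + "!"
	})
	fmt.Println(exclaimed) // [foo! bar!]
}
```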

but it’s terribly clunky. Go’s lack of generics means that you have to write mapStringToString as a mapper between collections of two fixed types – you can’t write a generic map from an array of T to an array of R, as you can in almost any other strongly-typed modern language. The lack of a shorthand lambda syntax means that you have to write out the function signature in full just to pass the string-modifying function into the mapper. It hardly seems worth it; you might as well just write:
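the loop out longhand (again a sketch with illustrative values):

```go
package main

import "fmt"

func main() {
	strings := []string{"foo", "bar"} // illustrative sample data
	// Concrete and inline: the logic sits right where it happens.
	exclaimed := make([]string, 0, len(strings))
	for _, s := range strings {
		exclaimed = append(exclaimed, s+"!")
	}
	fmt.Println(exclaimed) // [foo! bar!]
}
```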

While Go does support passing around functions (and this is occasionally useful), it doesn’t really support the use of functions to build up composable abstractions that feel like extensions to the language. It favours the concrete, inline expression of logic, with function calls acting primarily as a mechanism for structured programming. “Concrete and inline” means “you can see it in front of you, right in the place where it’s happening” – which makes life a lot easier on the programmer who has to follow the execution path of an unfamiliar piece of code.

The same philosophy is at work in Go’s approach to library inclusion, which is to compile and link everything statically. It means that whenever you’re using a library you have the library’s source code to hand, in a predictable location in the path Go uses to locate external dependencies. Ctrl-click in an IDE will take you straight to the declaration site of the function you’re using. The general principle is to favour the transparent and ready-to-hand over the remote and opaque, the concrete and literal over the abstract and magical. It’s like speaking a language without metaphors.

Once you accept that “vanilla” Go is all you have, it may be best to think of it as a sort of DSL for performing disk and network I/O (the standard libraries are comprehensive and immediately useful), enriching and transforming data (Go’s struct types are extremely good for this) and efficiently parallelising work (goroutines are a well-thought-out low-level abstraction for parallel execution). These features make it excellent for “systems” programming, where you’re typically concerned with discrete, chainable operations – read in this text file, parse its contents as CSV, emit a sum of the values in the third column – and a concrete, imperative style of programming naturally fits the problem space. It’s less well-suited to “applications” development, where you need a wider range of metaphors to express multiple perspectives on the application’s domain. I would not, for example, want to use Go to write software which relied heavily on an “Actor”-based approach – you can do it, but it’s pretty clunky compared to the expressiveness offered to Akka by Scala’s case classes and pattern matching.

What Java Developers Can Learn From Go

A language that doesn’t affect the way you think about programming, is not worth knowing.
– Alan Perlis, Epigrams on Programming

Returning to Java (and Kotlin) development after a couple of months of Go development, I found myself more willing to consider that the concrete and immediate expression of some piece of program logic might serve better than trying to pull out a more generic version of the same logic that might never be re-used in practice. I discovered the value of a certain sort of “plain speaking” in code.

I also found myself feeling just a little less patient with the amount of behind-the-scenes magic that the Spring Boot framework puts into play, and wishing for something a little more lightweight and transparent. When you work in a language that makes different trade-offs, for different purposes, then the trade-offs involved in your familiar, comfortable environment become more apparent: you see what you are sacrificing in return for the things you value.

Learning Go won’t teach you any exciting new computer science concepts, or introduce you to a whole new paradigm of software development (for that, try Idris). But it will give you a better understanding of the breadth and variety of the design space for programming languages, at a time when mainstream languages generally seem to be converging (Kotlin is rather like Swift, is rather like TypeScript, etc.). Sometimes it isn’t the innovative new features that distinguish a language, but its choice of restrictions.
