Meta trips

Introducing the error loop


A common point of discussion among Go programmers, especially those new to the language, is how to handle errors.

Rob Pike on the Go blog

Two days ago Rob Pike published the article Errors are values, in which he promotes an error handling pattern that can help keep a code base from being cluttered with statements like this:

if err != nil {
    return err
}

I recently discovered an alternative way to deal with this problem and want to share it with you.

First of all, the approach proposed by Rob Pike and used inside the standard library (bufio.Scanner) is perfectly fine for some cases and may be overkill for others. However, I’ll leave it to the reader to judge if and when she sees one pattern or the other as the better fit, and just present an alternative here.

Motivation

Some time ago, I developed a library to simplify error handling in Go, but it had to use the reflect package and the empty interface everywhere, circumventing the type safety of the compiler. So I don’t consider it the best solution to the problem.

However, when I announced it, Dustin responded with a nice use case that I want to reuse here.

Ok, so here is the use case:

  1. Create request.
  2. Set some headers.
  3. Have client execute it.
  4. On err, do error thing.
  5. On success, verify HTTP status, consider it an error if not 200.
  6. If 304, reuse existing content
  7. Parse JSON response.
  8. Handle JSON errors.
  9. Close request if we got past 4.

So the typical code might look somewhat like this

func getUser(user *User) error {
    req, err := http.NewRequest("GET", "https://api.github.com/users/"+user.Name, nil)

    if err != nil {
        return err
    }

    req.Header.Add("X-Somekey", "some value")

    client := &http.Client{}
    resp, err := client.Do(req)

    if err != nil {
        return err
    }

    if user.Name == "metakeule" || user.Name == "misterx" {
        // fake 304 situation
        resp.StatusCode = 304
    }

    var body []byte
    
    switch resp.StatusCode {
    case 304:
        body, err = fromCache(user.Name)
        if err != nil {
            return err
        }
    case 200:
        body, err = ioutil.ReadAll(resp.Body)
        if err != nil {
            return err
        }
    default:
        return fmt.Errorf("wrong status code: %v", resp.StatusCode)
    }

    // close the response body (req.Body is nil for this GET request)
    err = resp.Body.Close()
    if err != nil {
        return err
    }

    return json.Unmarshal(body, user)
}

type User struct {
    Name     string `json:"name"`
    Location string `json:"location"`
}

func fromCache(name string) ([]byte, error) {
    switch name {
    case "metakeule":
        return []byte(`{"name": "cached metakeule", "location": "Cologne / Germany"}`), nil
    default:
        return nil, fmt.Errorf("nothing in cache for %#v", name)
    }
}

The interesting function here is getUser where we have 7 opportunities to return an error and 5 blocks of

if err != nil {
    return err
}

Besides being far from DRY, a function like this has the following drawbacks:

  • It is hard to see the flow, i.e. where in the whole picture errors occur and are returned.
  • It is unclear whether it is better to reuse a single error variable or to introduce new ones.
  • When reusing the same error variable, variable shadowing can happen unintentionally.
  • All of this makes it error-prone to move or add code inside the function (for example, to reorder function calls that return errors).
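The shadowing problem from the list above is easy to reproduce. In this minimal sketch (the function names and inputs are hypothetical), the only difference between the two functions is := versus = inside the if block.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

var errNegative = errors.New("negative number")

// parseShadowed tries primary first and fallback second, but the
// ":=" inside the block declares NEW n and err variables that
// disappear at the closing brace.
func parseShadowed(primary, fallback string) (int, error) {
	n, err := strconv.Atoi(primary)
	if err != nil {
		n, err := strconv.Atoi(fallback) // shadows the outer n and err
		if err != nil {
			return 0, err
		}
		if n < 0 {
			return 0, errNegative
		}
	}
	return n, err // still the values from parsing primary
}

// parseFixed assigns to the outer variables with "=" instead.
func parseFixed(primary, fallback string) (int, error) {
	n, err := strconv.Atoi(primary)
	if err != nil {
		n, err = strconv.Atoi(fallback)
		if err != nil {
			return 0, err
		}
		if n < 0 {
			return 0, errNegative
		}
	}
	return n, err
}

func main() {
	fmt.Println(parseShadowed("x", "7")) // 0 and the stale parse error for "x"
	fmt.Println(parseFixed("x", "7"))    // 7 <nil>
}
```

Both versions compile without complaint, which is what makes this kind of mistake easy to miss when code is moved around.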

Certainly there are better ways to organize your code for a particular problem. But the situation where a function needs to return errors it got from multiple other function calls is not that uncommon. And the more function calls are involved the more confusing this kind of code gets.

The error loop

So I had an idea: why not reuse a single error variable and have a loop that stops as soon as the error is not nil? Something like this:

var err error

for err == nil {
  // do your stuff, setting err if needed
}

return err

Now we have another problem to solve: how can we organize our function calls so that one call comes after another?

How about that:

var err error

steps:
    for i := 0; err == nil; i++ {
        switch i {
        default:
            break steps
        case 0:
            err = fn1()
        case 1:
            err = fn2()
        ...
        }
    }

return err

Since the default case breaks out of the for loop, we leave the loop as soon as i has no matching case statement (or err != nil). We just have to make sure that the following conditions hold:

  • The case statements map to the execution order of the functions, i.e. case 0 maps to the first called function, case 1 to the second, and so on.
  • There must be no “holes” in the case statements: we can’t skip a number, e.g. leave out case 1 and still expect case 2 to run.
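Both conditions can be checked with a small runnable sketch. The run and runWithHole functions and the step helper are hypothetical and only record which steps were executed.

```go
package main

import "fmt"

// run executes three steps in order and records which ones ran.
// failAt names the step that should return an error (-1 for none).
func run(failAt int) (ran []int, err error) {
	step := func(n int) error {
		ran = append(ran, n)
		if n == failAt {
			return fmt.Errorf("step %d failed", n)
		}
		return nil
	}

steps:
	for i := 0; err == nil; i++ {
		switch i {
		default:
			break steps
		case 0:
			err = step(0)
		case 1:
			err = step(1)
		case 2:
			err = step(2)
		}
	}
	return ran, err
}

// runWithHole has no case 1, so the loop ends before case 2.
func runWithHole() (ran []int) {
	var err error
steps:
	for i := 0; err == nil; i++ {
		switch i {
		default:
			break steps
		case 0:
			ran = append(ran, 0)
		case 2: // never reached: i == 1 already hits default
			ran = append(ran, 2)
		}
	}
	return ran
}

func main() {
	fmt.Println(run(-1))       // [0 1 2] <nil>
	fmt.Println(run(1))        // [0 1] step 1 failed
	fmt.Println(runWithHole()) // [0]
}
```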

Ok, with that in place there are some open questions left. First of all, what if we want to share variables between the function calls?

Well, the solution is easy, we just have to define them outside the loop:

var (
    err error
    body []byte
)

steps:
    for i := 0; err == nil; i++ {
        switch i {
        default:
            break steps
        case 0:
            body, err = fn1()
        case 1:
            err = fn2(body)
        ...
        }
    }

return err

This has the nice side effect that we can see all shared variables in one place and don’t have to fear shadowing effects.
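As a runnable illustration of sharing variables between steps, here is a minimal sketch. The portFrom function and its "port" configuration key are made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// portFrom looks up, parses, and validates a port number.
// All values shared between steps are declared once, up front.
func portFrom(cfg map[string]string) (int, error) {
	var (
		err  error
		raw  string
		port int
	)

steps:
	for i := 0; err == nil; i++ {
		switch i {
		default:
			break steps
		case 0:
			var ok bool
			if raw, ok = cfg["port"]; !ok {
				err = errors.New("no port configured")
			}
		case 1:
			port, err = strconv.Atoi(raw) // uses raw from step 0
		case 2:
			if port < 1 || port > 65535 { // uses port from step 1
				err = fmt.Errorf("port %d out of range", port)
			}
		}
	}
	return port, err
}

func main() {
	fmt.Println(portFrom(map[string]string{"port": "8080"}))
	fmt.Println(portFrom(map[string]string{"port": "70000"}))
	fmt.Println(portFrom(map[string]string{}))
}
```

Because every step assigns with = to the variables declared at the top, there is no place where := could silently introduce a second err.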

The added bonus: goto in error loops

It is not strictly needed here but may be nice to have.

Yes, we all know goto is evil, but sometimes it is handy. Several times I found myself wanting to jump to some place inside a single function depending on whether an error had happened or not. While you could argue that such code should be refactored out into its own function, sometimes this feels like overkill. On the other hand, goto statements can make it hard to follow the function flow.

Since we already have a position indicator inside our error loop, we can use it to build our own little error loop goto:

var (
    err error
    body []byte
)

steps:
    for jump := 1; err == nil; jump++ {
        switch jump - 1 {
        default:
            break steps
        case 0:
            body, err = fn1()
            if len(body) == 0 {
                jump = 4
            }
        case 1:
            err = fn2(body)
        case 4:
            err = fn3()
        ...
        }
    }

Here jump starts at 1 and the switch is over jump - 1. Because the loop increments jump after every iteration, assigning the target number to jump makes the next iteration land exactly on the corresponding case statement.

Please note that you might want to leave “holes” in your loop before case statements that should only be reached by jumps.
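To see a jump and a deliberate hole working together, consider this minimal sketch. The resolve function, its trace slice, and the default value 8080 are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// resolve parses input as a number, or jumps to a default branch
// when input is empty. trace records the visited steps.
func resolve(input string) (val int, trace []string, err error) {
steps:
	for jump := 1; err == nil; jump++ {
		switch jump - 1 {
		default:
			break steps
		case 0:
			trace = append(trace, "check")
			if input == "" {
				jump = 10 // the next iteration runs case 10
			}
		case 1:
			trace = append(trace, "parse")
			val, err = strconv.Atoi(input)
		// There is no case 2: the normal path ends here, so
		// case 10 can only be reached by the jump above.
		case 10:
			trace = append(trace, "default")
			val = 8080
		}
	}
	return val, trace, err
}

func main() {
	fmt.Println(resolve("42")) // 42 [check parse] <nil>
	fmt.Println(resolve(""))   // 8080 [check default] <nil>
}
```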

How does it work in practice?

So let’s rewrite the getUser function with an error loop:

func getUser(user *User) error {

    var (
        req  *http.Request
        resp *http.Response
        body []byte
        err  error
    )

steps:
    for jump := 1; err == nil; jump++ {
        switch jump - 1 {
        default:
            break steps
        case 0:
            req, err = http.NewRequest("GET", "https://api.github.com/users/"+user.Name, nil)
        case 1:
            req.Header.Add("X-Somekey", "some value")
            client := &http.Client{}
            resp, err = client.Do(req)
        case 2:
            if user.Name == "metakeule" || user.Name == "misterx" {
                resp.StatusCode = 304 // fake 304 situation
            }
            switch resp.StatusCode {
            case 304:
                body, err = fromCache(user.Name)
                jump = 4 // jump to case 4
            case 200:
                body, err = ioutil.ReadAll(resp.Body)
            default:
                err = fmt.Errorf("wrong status code: %v", resp.StatusCode)
            }
        case 3:
            // close the response body (req.Body is nil for this GET request)
            err = resp.Body.Close()
        case 4:
            err = json.Unmarshal(body, user)
        }
    }

    return err
}

I think that it is really easy to see what is happening here and when. Also, the execution order can be changed easily, and new error-returning function calls can be added without having to fear shadowing or other side effects. As a rule for what a case statement represents, you could go with “every code block that might return an error” or with “every processing step, with or without returned errors”, just as you like.

Performance

I did some simple micro benchmarks. They show an overhead of roughly 18% to 36% for the error loop compared to conventional error handling:

Results:

// 5 functions called that might return errors
BenchmarkErrLoop5     20000000     69.5 ns/op   +18%
BenchmarkErrBench5    30000000     59.0 ns/op

// 10 functions called that might return errors
BenchmarkErrLoop10    20000000    115.0 ns/op   +36%
BenchmarkErrBench10   20000000     84.6 ns/op

// 20 functions called that might return errors
BenchmarkErrLoop20    10000000    164.0 ns/op   +27%
BenchmarkErrBench20   10000000    129.0 ns/op
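The benchmark code itself is not shown here, so the following is only a sketch of how such a comparison could be set up for the five-function case. The step, conventional, and errLoop functions are assumptions, and the numbers it prints will differ by machine and Go version.

```go
package main

import (
	"fmt"
	"testing"
)

// step stands in for a cheap call that might fail (hypothetical; it
// always succeeds, so only the control-flow overhead is measured).
//
//go:noinline
func step() error { return nil }

// conventional calls step five times with the usual error blocks.
func conventional() error {
	if err := step(); err != nil {
		return err
	}
	if err := step(); err != nil {
		return err
	}
	if err := step(); err != nil {
		return err
	}
	if err := step(); err != nil {
		return err
	}
	if err := step(); err != nil {
		return err
	}
	return nil
}

// errLoop calls step five times from an error loop.
func errLoop() error {
	var err error
steps:
	for i := 0; err == nil; i++ {
		switch i {
		default:
			break steps
		case 0:
			err = step()
		case 1:
			err = step()
		case 2:
			err = step()
		case 3:
			err = step()
		case 4:
			err = step()
		}
	}
	return err
}

func main() {
	testing.Init() // registers the testing flags when run outside "go test"
	loop := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = errLoop()
		}
	})
	conv := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = conventional()
		}
	})
	fmt.Println("error loop:  ", loop)
	fmt.Println("conventional:", conv)
}
```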

Comfort

I realize that the error loop is a lot of code to remember as a design pattern. However, some editors have shortcuts for snippets. I am using Sublime Text 3, so I made a snippet for it. Enjoy!

Summary

Error loops provide an alternative way to organize error-returning function calls that has less code duplication and helps to see the code flow. They make it easy to add and change function calls that return errors, don’t obfuscate the code, and add only modest overhead (18% to 36% in the micro benchmarks above).

Update

Always break out of the “steps” loop.
