Vagrant and GNU Core Utilities

I recently had the need to install GNU’s Core Utilities on my mac. After a bit of Googling, the easiest and most straightforward way to proceed was, of course, with Brew.

Here are the steps I took to install the GNU packages:

1) Install coreutils and gnu-sed:

$ brew install coreutils
$ brew install gnu-sed

2) Edit my .bashrc file to add the corresponding paths:

...
# Add alias for nice color output.
alias ll="ls -lAFh --color"
...

export PATH=$(brew --prefix coreutils)/libexec/gnubin:$(brew --prefix gnu-sed)/libexec/gnubin:$PATH

# Bonus points for adding the man page references
export MANPATH=$(brew --prefix coreutils)/libexec/gnuman:$(brew --prefix gnu-sed)/libexec/gnuman:$MANPATH

3) Verify:

$ source ~/.bashrc

$ which ls
/usr/local/opt/coreutils/libexec/gnubin/ls

$ which sed
/usr/local/opt/gnu-sed/libexec/gnubin/sed

That’s all there was to it. The commands I needed worked perfectly. That is, until I attempted to load up one of my Vagrant boxes. Then that annoying NFS exports prompt came up asking me for my password! I fixed this before, didn’t I?

Vagrant NFS Synced Folders

As most of you know, you can add a little snippet to your /etc/sudoers file so vagrant up doesn’t require your password. It’s covered in the Vagrant documentation, and looks like this for macs:

Cmnd_Alias VAGRANT_EXPORTS_ADD = /usr/bin/tee -a /etc/exports
Cmnd_Alias VAGRANT_NFSD = /sbin/nfsd restart
Cmnd_Alias VAGRANT_EXPORTS_REMOVE = /usr/bin/sed -E -e /*/ d -ibak /etc/exports
%admin ALL=(root) NOPASSWD: VAGRANT_EXPORTS_ADD, VAGRANT_NFSD, VAGRANT_EXPORTS_REMOVE

After some debugging to see what was happening, it became clear what the problem was. Vagrant was running the tee and sed commands from my newly installed GNU packages. Luckily there’s an easy fix:

Cmnd_Alias VAGRANT_EXPORTS_ADD = /usr/bin/tee -a /etc/exports
Cmnd_Alias VAGRANT_EXPORTS_ADD_GNU = /usr/local/opt/coreutils/libexec/gnubin/tee -a /etc/exports
Cmnd_Alias VAGRANT_NFSD = /sbin/nfsd restart
Cmnd_Alias VAGRANT_EXPORTS_REMOVE = /usr/bin/sed -E -e /*/ d -ibak /etc/exports
Cmnd_Alias VAGRANT_EXPORTS_REMOVE_GNU = /usr/local/opt/gnu-sed/libexec/gnubin/sed -E -e /*/ d -ibak /etc/exports
%admin ALL=(root) NOPASSWD: VAGRANT_EXPORTS_ADD_GNU, VAGRANT_EXPORTS_ADD, VAGRANT_NFSD, VAGRANT_EXPORTS_REMOVE_GNU, VAGRANT_EXPORTS_REMOVE


TestMain

I almost missed an awesome new feature in Go 1.4, but thankfully Justinas Stankevičius caught it and wrote a great blog post about it. Of course I’m talking about the new TestMain functionality in the testing package. I encourage you to check out Justinas’ post for all the details.

How it works

I’ve done a lot of Django development in the past, and when writing unit tests you leverage the setUp and tearDown methods a lot for preparing fixtures or cleaning up. Go’s TestMain enables you to do this, but for an entire package (not just a single test). So it’s actually closer to Python’s setUpClass and tearDownClass workflow. Here are a few things to remember when using TestMain:

  1. You can define this function in your test file: func TestMain(m *testing.M).
  2. Since function names need to be unique in a given package, you can only define TestMain once. If your package has multiple test files, choose a logical place for your single TestMain function.
  3. The testing.M struct has a single defined method named Run(). As you can guess, it runs all the tests within the package. Run() returns an exit code that can be passed to os.Exit.
  4. A minimal implementation looks like this: func TestMain(m *testing.M) { os.Exit(m.Run()) }.
  5. If you do not call os.Exit with the return code, your test command will return 0. Yes, even if a test fails!

Example

So if you add the minimal implementation as defined above, your tests will continue to run exactly as they used to. The real power comes with things you can do before and after you run your tests.

package somepackagename

import (
    "log"
    "os"
    "testing"
)

func TestMain(m *testing.M) {
    log.Println("This gets run BEFORE any tests get run!")
    exitVal := m.Run()
    log.Println("This gets run AFTER any tests get run!")

    os.Exit(exitVal)
}

func TestOne(t *testing.T) {
    log.Println("TestOne running")
}

func TestTwo(t *testing.T) {
    log.Println("TestTwo running")
}

Now if you run go test, you will see this output:

2014/12/29 12:48:32 This gets run BEFORE any tests get run!
2014/12/29 12:48:32 TestOne running
2014/12/29 12:48:32 TestTwo running
PASS
2014/12/29 12:48:32 This gets run AFTER any tests get run!

This will come in handy for creating some dummy data, or switching databases to point at a test instance. Just remember, the code before and after m.Run() only gets called once! So you may need to clear out a few database tables or files in your /tmp folder after each test. See Justinas’ post for a good example of setting up a test database and clearing it in every test.
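
For example, if the tests in this package needed a shared database connection, TestMain could handle the setup and teardown. This is only a rough sketch assuming a Postgres test database; the database name, driver, and testDB variable are my own illustration, not from Justinas’ post:

package somepackagename

import (
    "database/sql"
    "log"
    "os"
    "testing"

    _ "github.com/lib/pq" // Any driver works; pq is just used as an example here.
)

var testDB *sql.DB

func TestMain(m *testing.M) {
    // Runs once, before any tests: open a handle to a dedicated test database.
    var err error
    testDB, err = sql.Open("postgres", "dbname=myapp_test sslmode=disable")
    if err != nil {
        log.Fatalln(err)
    }

    exitVal := m.Run()

    // Runs once, after all tests: release the shared connection.
    testDB.Close()
    os.Exit(exitVal)
}

Individual tests can then use testDB directly, clearing any tables they touch so the next test starts from a known state.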



Developing on a Mac

I’m a mac user. I like the idea of switching to a Linux system for my everyday computing; some laptop running Ubuntu would be cool. But I just can’t seem to give up this retina screen. That and the fact that I do a fair bit of iOS development. It’s also hard to give up something you’re so intimately familiar with. In production environments, the majority of my experience is with CentOS (versions 5 and 6). But lately I’ve been running my services on Google Compute Engine using Debian. So far I have no complaints, and I’m enjoying the more up-to-date packages (if you’ve used CentOS before, you know what I mean). Though I have to admit, CentOS 7 has some interesting changes like systemd and Docker support baked in. But I think if I’m going to switch to another OS, it’ll be CoreOS.

So, I develop on a mac, but deploy to a Debian environment. Sounds like a perfect use case for Vagrant! If you’ve never heard of or used Vagrant before, I highly suggest you take a peek at the documentation. A simple explanation of Vagrant according to their site:

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

In my case, I’m using VirtualBox as my provider. It’s free and works exactly as you would expect, without any noticeable performance problems. I’m not going to explain how to get everything set up; I’ll leave that as an exercise for you to tackle on your own.

What are we solving?

One of the key problems I have, specifically, is that I don’t like cluttering up my mac with every service I may need for every project. In theory I could just install everything locally with Brew:

brew install nginx
brew install postgresql
brew install mongodb
brew install redis
brew install beanstalkd

Along with those packages, Brew would also install the needed dependencies: openssl, readline, and pcre.

So that’s not terrible, and besides, how much CPU could be sucked up with all those processes sitting idle? But here’s another question: what if some of your projects run on different OSs, or more importantly, different versions? If you attempt to install MongoDB with apt-get install mongodb on a vanilla Debian, it’ll install version 2.0.6. Brew will install 2.6.5. That’s a pretty big gap in versions, and who knows what’s been fixed or broken between them. Another big difference most of us forget about is OS-specific implementations like kqueue vs epoll.

How Vagrant can solve this

So let’s say we have a simple blogging web app written in Go that persists its data in postgres. Previously, we would have had a code snippet like this to hook it up:

package main

import (
    "database/sql"

    _ "github.com/lib/pq" // The postgres driver registers itself with database/sql.
)

func main() {
    db, err := sql.Open("postgres", "dbname=unrolled user=root host=127.0.0.1 port=5432 sslmode=disable")
    if err != nil {
        panic(err)
    }
    // Do something with db.
}

This assumes postgres is running on our localhost (127.0.0.1). So let’s fix this up so we don’t need postgres installed on our local system.

Vagrant setup

I have created two Gists: a bootstrap.sh provisioning script and a Vagrantfile. Let’s go over the bootstrap.sh file first:

This bootstrap installs most of the services I need to use on a daily basis. Nothing too special, but I’d like to draw your attention to lines 38, 53, 58, and 72. These simple configuration tweaks tell the services to listen on any network interface, not just the localhost. If we didn’t apply these changes, the services would only accept connections from within the virtual machine, which wouldn’t help us much.

Now we can change our above Go snippet to something like this:

package main

import (
    "database/sql"

    _ "github.com/lib/pq" // The postgres driver registers itself with database/sql.
)

func main() {
    db, err := sql.Open("postgres", "dbname=unrolled user=root host=unrolled.vm port=5432 sslmode=disable")
    if err != nil {
        panic(err)
    }
    // Do something with db.
}

Notice the host is now pointing at our VM. Obviously this assumes you have added your virtual machine’s name to your /etc/hosts file on your mac.
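
An example entry might look something like this (the IP address is just a placeholder; use whatever address your Vagrantfile assigns to the VM):

192.168.33.10    unrolled.vm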

But we can go one step further to make this an even more seamless transition. Check out the Vagrantfile:

Specifically, check out lines 11 to 16. We can forward the VM’s ports and map them to our local system! We don’t even need to change our code now. You can leave the host as 127.0.0.1 and it will actually be using the VM. This also comes in handy when using GUI applications on your mac.

Happy developing!



Three Questions From dotGo

With all the dotGo 2014 videos out now, I took an interest in the three questions Andrew Gerrand would ask each speaker in a backstage-style interview. Every time I heard the questions, I would rack my brain to find the best answer I could come up with. So without further ado, here are the questions and my responses.

What is the first Go program you wrote?

The first true Go program that I wrote eventually turned into my first package, called gapless. Its purpose was to listen to a Redis list, process the message, and send push notifications to Apple (which in turn pushed them to iOS devices). The original implementation was in Python, and sending over 100,000 notifications a day would backlog the queue for long periods of time as it was a single-threaded process. So after investigating Go for a while, I realized how goroutines could solve my current pains. This worked so well that I ended up rewriting the API server in Go as well.

What’s your favorite bug?

This bug has nothing to do with Go, but it’s one of my favorites. I was working on an API that took a datetime like 2014-11-12 12:59:59.997 as a parameter. The consuming app was written in VB.NET and backed by a MSSQL 2008 database, and yes, this is an important detail. This consumer app would look at the last element it had received, and send the datetime of that element to my API. My API would then return anything newer based on the datetime (and yes, there is a good reason we didn’t use a unique identifier for this). So it turns out MSSQL has a bit of an accuracy issue when dealing with datetimes. The consuming app would send a datetime like 2014-11-12 12:59:59.997 when it actually meant to send 2014-11-12 12:59:59.998, and thus my API would send a duplicate element back! Took a lot of back and forth to solve this little gem.

What is the worst code you have ever written?

This is an example of embarrassing code. The first site I was ever tasked to write needed a way to update the content without editing the source files. Today we would call this a CMS, and use some out-of-the-box solution like Drupal, Django or WordPress. But being new to this crazy web world, I figured writing something from scratch would be the way to go. After creating a very crude administrative section, I decided to implement the most amazing and horrifying feature… a form with a textarea that would take any data, and with no validation checking or escaping, run it on the database as a query. I’m happy to say I take security much more seriously these days.



Test Tables in Go

I dove into the time standard library the other week to fix an outstanding bug. Turns out the bug is not quite a bug, but it might be… let’s just say its status is complicated. Anyway, while diving through the code I found a new (to me) method for doing repetitive tests. Let’s set up a fictitious function to test so I can show you why it’s so awesome.

The function

Let’s say the function input is a list of integers. Given this list, return the lowest integer, starting from 1, that is not in the list. That is to say, given the list [1,2,3,5], our function should return 4. Also of note: duplicates may exist, order is not guaranteed, and an empty list should return 1. Got it? Awesome, so here is our function signature:

func LowestAvailableInt(input []int) int

Normal tests

We know the function’s signature, so let’s do a little TDD. Here is how I would usually start:

func TestEmptyList(t *testing.T) {
    input := []int{}
    expected := 1
    result := LowestAvailableInt(input)
    if result != expected {
        t.Errorf("Result should have been %d, but it was %d [Input: %#v]", expected, result, input)
    }
}

Run go test and see that it fails. Fix the function implementation by making it return 1 and continue adding more test cases:

...

func TestSingleItem(t *testing.T) {
    input := []int{1}
    expected := 2
    result := LowestAvailableInt(input)
    if result != expected {
        t.Errorf("Result should have been %d, but it was %d [Input: %#v]", expected, result, input)
    }
}

func TestUnorderedItems(t *testing.T) {
    input := []int{5, 1, 4, 2}
    expected := 3
    result := LowestAvailableInt(input)
    if result != expected {
        t.Errorf("Result should have been %d, but it was %d [Input: %#v]", expected, result, input)
    }
}

func TestDuplicateItems(t *testing.T) {
    input := []int{1, 2, 4, 5, 6, 1, 4, 5, 2, 3}
    expected := 7
    result := LowestAvailableInt(input)
    if result != expected {
        t.Errorf("Result should have been %d, but it was %d {Input: %#v}", expected, result, input)
    }
}

As you can imagine, this could drag out for a while depending on the complexity of the function. So how can we improve this repetitive testing process?

Enter test tables

Here’s where it gets awesome. I found this method of testing in the src/pkg/time/time_test.go file. Basically, what you do is set up a data structure with your input and expected result. This could be a map or, better yet, a slice of structs. Let’s convert the above test cases into a test table:

var testTable = []struct {
    input    []int
    expected int
}{
    {[]int{}, 1},
    {[]int{1}, 2},
    {[]int{5, 1, 4, 2}, 3},
    {[]int{1, 2, 4, 5, 6, 1, 4, 5, 2, 3}, 7},
}

func TestLowestAvailableInt(t *testing.T) {
    for _, data := range testTable {
        if result := LowestAvailableInt(data.input); result != data.expected {
            t.Errorf("Result should have been %d, but it was %d", data.expected, result, data.input)
        }
    }
}

Conclusion

How awesome is that!? This has turned into one of my favorite testing methods in Go. It’s concise, readable, and so easy to extend. If we ever find a new edge case or want to expand our testing, we can just add a new case to our testTable.

...
{[]int{9}, 1},
{[]int{9223372036854775807}, 1},
{[]int{2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4}, 1},
...
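
For completeness, here is one possible implementation that satisfies the table above. This is just my own sketch of the idea, not code pulled from the time package:

func LowestAvailableInt(input []int) int {
    // Record every integer we have seen.
    seen := make(map[int]bool, len(input))
    for _, value := range input {
        seen[value] = true
    }

    // Walk upwards from 1 until we hit a gap.
    for i := 1; ; i++ {
        if !seen[i] {
            return i
        }
    }
}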


Code Organization - Part Two

In part one I discussed workspaces in Go. To reiterate, the best way to manage workspaces in Go is not to manage them at all. Create a single workspace for all your work, and do not worry about it. If you are worried about dependency management, use a tool like godep. Also, you may want to check out gopkg.in, as it looks quite promising. I hope to explore it more in a future post.

So we know that Go code is kept in a workspace, and that your code should live in a package within the src directory of that workspace. But what is a package, and how do we organize them?

Packages

In Go, a package is simply an application or a library. You can think of it as a collection of code in a single directory. A package can have a subpackage (subdirectory), but Go still views it as an independent package (net/http does exactly this). There are two important parts to think about when creating and dealing with packages: paths and names.

Paths

The package path is where your code lives. The best example of laying out packages is the standard library. Here we can see that fmt and time are packages we can import into our code:

package main

import (
    "fmt"
    "time"
)

func main() {
    fmt.Println("Current time: ", time.Now().Format(time.RFC1123))
}

When Go compiles the above example, it looks in the standard library for the imports first. But what happens after that? If Go cannot find an import in the standard library, it will then traverse the $GOPATH/src directory looking for a path that matches your import. This is where you need to pay special attention to your package’s path.

In theory we could create any arbitrary path in $GOPATH/src and Go would be happy to import our code. For example, if we had a package in $GOPATH/src/batman, this would compile and run:

package main

import (
    "fmt"
    "batman"
)

func main() {
    fmt.Println("Batman says: ", batman.ReturnSomeCoolString())
}

But this is not ideal for a few reasons:

  1. The Go standard library might introduce a package named the same as the one we created (unlikely, but it could happen). If that were to happen, it would break our code because it would be found in the standard library first, thus skipping the package we created.
  2. It is not unique enough. Any other user in the Go ecosystem could create a package with the same name, but do something completely different. This makes sharing code difficult.
  3. This is not go gettable!

go get

The go get command is awesome. If we want to integrate some third party code into our application, we can run a single command to retrieve the code for use:

$ go get github.com/codegangsta/negroni

This will download the negroni package into its canonical path at $GOPATH/src/github.com/codegangsta/negroni. Notice that it uses the full github.com/username path. This is a best practice outlined in the Go docs. Not only is it unique to this package, but it makes the code sharable via the go get command. To use the package we just downloaded we would import it like this:

package main

import (
    "net/http"
    "github.com/codegangsta/negroni"
)

func main() {
    ...
}

Even if you do not plan on sharing your code, or even version controlling your code, use your github.com/username as the base of your package paths. Of course, this does not have to be GitHub; you could use your bitbucket.org/username or any other unique endpoint. The priority here is a unique path that you can use. The added benefit is that if you decide to publish your code, it will be go gettable.

Names

There are two types of packages we can create. The first is an executable application. In order to tell Go that you want your code to be executable, you give it the package name main:

package main

// Code lives here.
...

You may have noticed that the majority of examples on this blog and the Go Tour all use package main. This is because we want to execute the code and see the output.

The second type of package is a library. This is code that cannot be directly executed, but can be imported by an application. An example of this is middleware packages in web applications. I have created a package called secure, which has a package name of secure:

package secure

// Secure package code lives here.
...

This is how we would use the secure library:

package main

import (
    "net/http"
    "github.com/unrolled/secure"
)

func main() {
    m := http.NewServeMux()
    m.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
        w.Write([]byte("Hello world."))
    })

    s := secure.New(secure.Options{
        ContentSecurityPolicy: "default-src 'self'",
    })

    http.ListenAndServe("0.0.0.0:3000", s.Handler(m))
}

Conventionally, package names are the same as the last element of their import path. In my above example the package path is github.com/unrolled/secure, and if you view the source, the package name is secure. This is Go’s convention, and you should try your best to follow it. But it is not a requirement. If I changed my secure package name to something different, like gowebsecurity:

package gowebsecurity

// Secure package code lives here.
...

The above example would only need one line changed, and the import path would stay the same:

...
import (
    "net/http"
    "github.com/unrolled/secure"
)
...
s := gowebsecurity.New(gowebsecurity.Options{
...

This is an important concept to understand. Package paths and package names do not need to match; keeping them the same is purely a best practice (and one you should follow). The path is simply where the code lives. When we import a package into our code, the Go compiler finds the files at the package path, loads the package name into scope, and continues compiling our code.

General Tips

There are not many requirements when organizing your package; for the most part, it is completely up to you. But there are a couple of things to keep in mind.

A Go package can (and probably should) have multiple files. This is great for organizing your code into logical groups. No one likes looking at 2,000 lines of code in a single file. Just make sure your package name is the same across all the files within your package. Go will only allow you to have one package name per directory.

If you feel the need to have multiple package names, create a subpackage, but do so sparingly. See Ben Johnson’s post for more on using subpackages sparingly. A great example of properly using a subpackage is the bson subpackage within the mgo package (the MongoDB driver package). Since they created bson as a subpackage, it can be tested independently and imported by others who only need the bson functionality.
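
For instance, someone who only needs BSON serialization can import just the subpackage and ignore the rest of the driver. Here is a quick sketch (the gopkg.in import path is one common way to reference mgo; yours may differ):

package main

import (
    "fmt"

    "gopkg.in/mgo.v2/bson" // Only the bson subpackage, not the full mgo driver.
)

func main() {
    doc := bson.M{"title": "Code Organization", "published": true}

    data, err := bson.Marshal(doc)
    if err != nil {
        panic(err)
    }

    fmt.Printf("Encoded %d bytes of BSON\n", len(data))
}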

That brings me to my last point. If you create subpackages, be aware that you cannot have cyclic dependencies. Anyone coming from Python will be well aware of this, but it is just something to keep in mind.
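
To make that concrete, here is a hypothetical pair of packages (the paths and names are made up) that Go will refuse to compile because each one imports the other:

// $GOPATH/src/github.com/unrolled/alpha/alpha.go
package alpha

import "github.com/unrolled/beta"

const Name = "alpha"

func Hello() string { return beta.Decorate("hello") }

// $GOPATH/src/github.com/unrolled/beta/beta.go
package beta

import "github.com/unrolled/alpha" // Compile error: import cycle not allowed.

func Decorate(s string) string { return alpha.Name + ": " + s }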