2013 08 06

Filesystems are for work and Dropbox is the de facto iOS Filesystem

Many people are getting work done with iOS.

One thing that most have in common is the reliance on Dropbox as part of their workflow. The big difference between work and casual computing use is that people pay humans to accomplish a task. More often than not, these tasks involve multiple tools. Any time multiple tools are required, you need some kind of staging area, where intermediate results can be persisted, organized and routed to the correct tool.

Traditional PCs have always used the desktop and its underlying filesystem for this task. The simplicity and flexibility of a hierarchical folders & files system has worked well enough for many people, even though it does assume a bit of computer literacy.

If it’s not yet obvious, iOS has no such built-in staging area. Users informally rely on email as an awkward replacement; it’s like trying to draw while wearing rubber gloves filled with Jello. People who get work done rely on Dropbox. The rapid adoption of the Dropbox API across many 3rd party apps is further testimony to its importance.

There are many reasons to abandon the Files & Folders system as we know it. The computer literacy it requires places an artificial limit on your available market. Theoretically, you could reduce the conceptual burden with some kind of iTunes.app-like interface that leans heavily on metadata, or a search & tag based approach akin to Gmail.

However, Apple’s replacement for the filesystem in iOS is iCloud, which solves the persistence requirement, but not the organization or routing requirements. iCloud is a fundamentally inappropriate architecture for getting work done with multiple tools. Vertical data silos per app (or per vendor) are nowhere near flexible enough to solve the infinitely varied requirements of the job market.

Apple has some other data-specific silos, like the contacts API and the photo library, but these don’t solve the organization problem. If I need documents organized by client, including a mix of outlines, photos, text, contact vCards, code and layout files, then my only reasonable solutions are Dropbox or stabbing my eyes with a spork.

Steve Jobs famously called Dropbox “just a feature”. While hard drives in the sky aren’t the only way to solve the organization and routing issues, these are issues that need to be solved, or Apple will find itself beholden to a 3rd party for a fairly critical piece of the computing mindshare: “Getting Work Done”.

Apple should be thinking about the organization and routing of documents and data. For many people, Dropbox is already the de facto filesystem on iOS. If “many” becomes “majority”, then Apple has a big problem.

2013 08 13

Cognitive Offloading and the Productivity of Go

Minimizing The Time from Idea to Production

There are many steps in the making of software. Conceptually, we can organize them in stages of a metaphoric pipeline as idea, architecture, prototype, and production-ready product.

I’ve been writing software for over three decades and Go is the best tool I’ve ever had for getting from idea to production.

There are many small reasons and two big reasons for this kind of efficiency and productivity.

Go Is Inherently Productive and Efficient

The small reasons are fairly well documented, but here are some highlights:

Go is a compiled, statically-typed language that feels more like a dynamic language than its peers. The syntax has some convenience sugar sprinkled in, but the bulk of the credit is due to the compiler. Primarily via type inference, the compiler is smart enough to enforce static typing with minimal developer hand-holding. Also, the compiler is faster than an ADHD squirrel marinated in Red Bull.
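A quick sketch of that inference at work (the names here are just illustrative):

name := "gopher" // inferred as string
count := 42      // inferred as int
ratio := 1.5     // inferred as float64

// count = ratio // still a compile error: the types remain static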

The built-in library covers a large surface area for such a young language, and the overall ecosystem is flourishing.

Error handling seems overwrought and full of boilerplate, but my experience is that the idiomatic style of inline error handling makes programs faster and easier to debug. The end result is being able to zero in on problematic lines of code quickly, which reduces the overall time to solution. (Russ Cox talks about the philosophy of Go and errors here.)
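A minimal sketch of that inline style (parsePort is a hypothetical helper; it assumes the standard fmt and strconv packages):

func parsePort(s string) (int, error) {
    n, err := strconv.Atoi(s)
    if err != nil {
        // handled at the call site, so a failure points
        // directly at this line and this input
        return 0, fmt.Errorf("invalid port %q: %v", s, err)
    }
    return n, nil
}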

There are too many other small reasons to continue, but assume for now the language was crafted with productivity in mind.

Cognitive Offloading

Go’s primary advantage in facilitating fast time-to-product is a high level of positive cognitive offloading.

Making software involves quite a bit of mental juggling. You have to keep many disparate thoughts, concepts, requirements and goals in working memory simultaneously. The reason that Paul Graham coined the term Maker’s Schedule and the concept of half-day chunks is that typically, in order to write software, you need to load your working memory with the context of the problem you are solving and the existing state of the solution. This “ramp up” takes time, and an interruption can wipe out a good chunk of that working memory.

Positive Cognitive Offloading can be thought of as a juggling partner you can hand off items to. If you trust them not to drop things, it frees your working memory for other items or allows you to juggle fewer items faster. Since there’s less to load, you move into the productive state faster.

Language features such as static typing, interfaces, closures, composition over inheritance, lack of implicit integer conversion, defer, fallthrough, etc. all result in a compiler that tells you when your code is likely to be buggy. The lack of warnings enforces discipline on the weak, squishy, analog life-forms who would otherwise allow ambiguity to deploy to production.
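For instance, the compiler flatly refuses to convert integer types behind your back; a minimal sketch:

var a int32 = 7
var b int64 = a        // compile error: cannot use a (type int32) as type int64
var c int64 = int64(a) // fine: the conversion must be spelled out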

defer is an absolutely brilliant pattern that clearly illustrates this kind of offloading:

// assumes: import "os"
func work(x, y bool) error {
    file, err := os.Open("some.file")
    if err != nil {
        // don't forget to handle this!
        return err
    }
    // Close is now scheduled to run on every return path below
    defer file.Close()

    if x {
        return nil
    } else if y {
        return nil
    }
    // otherwise whatever
    return nil
}

The explicit offloading that occurs is that you don’t have to worry about if/else chains or intermediate returns. In practice you can write code that needs cleanup without having to constantly be on alert for exit points or wrapping the work in a closure. As a matter of practice, the odds of leaving a dangling file are much lower when Close sits right next to Open.

The time and energy required to keep code bug-free is lower with defer, allowing you to progress faster.

This kind of mindset is even more apparent in the toolchain. One of the nicest features of Go is the fmt tool. By outsourcing all code formatting standards to a command line tool, a surprising amount of weight is lifted from the task of writing code. Wasted time aside, the reduction of social friction (or worse, check-in ping pong) over coding standards makes the whole world feel a bit more civilized.
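A typical invocation from the root of a project (-l lists the files whose formatting differed, -w rewrites them in place):

gofmt -l -w .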

Other parts of the toolchain which reduce cognitive weight include vet (a lint-like tool), test and even the GOPATH mechanism, which forces a high-level folder structure across all Go projects.

It’s also important to contrast negative cognitive offloading. If your juggling partner drops things occasionally, this is arguably worse than having no juggling partner at all. If your ORM occasionally produces poor SQL that takes down your database, suddenly your cognitive overhead every time you use ORM methods skyrockets, because you have to ensure your ORM code doesn’t negatively impact the system.

In Go, the current state of the garbage collector can potentially cause negative cognitive offloading, but the existing profiler, improvements to the GC itself, and some library additions in the upcoming Go 1.2 offer some relief. Needless to say, the other benefits far outweigh the cost.

Pipeline reversal penalty

One of the most common tasks a developer does is rewriting code that already exists. Thinking again about our metaphoric pipeline (idea, architecture, prototype, and production-ready product), rewriting code is essentially backing up through the pipeline.

There are many good and bad reasons to go in reverse, but there is always a short-term efficiency penalty to doing so. Good developers tend to offset that penalty by achieving longer-term benefits in maintainability, correctness and/or business goals.

Go, via intentional design and compiler implementation, has the shortest pipeline reversal penalty of any development ecosystem I’ve ever used. In practice, this means you are able to refactor more often with fewer regressions.

If you change an interface, the compiler tells you every single place that needs to be modified. Change a type and you are notified by line number everywhere your round peg no longer fits in the old, square hole.
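A minimal sketch of what that looks like (Store and diskStore are hypothetical):

type Store interface {
    Save(key string, value []byte) error
}

type diskStore struct{}

func (diskStore) Save(key string, value []byte) error { return nil }

// add a method to Store and this line stops compiling until
// diskStore catches up; the compiler reports the exact location
var s Store = diskStore{}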

If you take advantage of unit testing and benchmarking infrastructure, then you are residing near the magical Stuff Just Works Zone™. Even if you are not a developer, it should also be obvious that Go codebases are more easily adapted to changing business requirements.
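A minimal sketch of that infrastructure (add is a hypothetical function; this lives in a _test.go file, assumes import "testing", and runs with go test -bench=.):

func TestAdd(t *testing.T) {
    if got := add(2, 3); got != 5 {
        t.Errorf("add(2, 3) = %d, want 5", got)
    }
}

func BenchmarkAdd(b *testing.B) {
    for i := 0; i < b.N; i++ {
        add(2, 3)
    }
}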

Not Perfect

There is some tarnish on the generally shiny Go. Most crashes in Go are due to nil pointer dereferences. John Carmack very concisely explains why:

The dual use of a single value as both a flag and an address causes an incredible number of fatal issues.

Something like Haskell’s Maybe type would be nice, or possibly some kind of Guaranteed-Good-Pointer type. In the meantime, there is the cognitive overhead of nil checking.
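A minimal sketch of that overhead (findUser and its *User result are hypothetical; nil doubles as the “not found” flag):

func greet(id string) string {
    u := findUser(id) // returns *User, or nil if missing
    if u == nil {
        // forget this check and u.Name panics at runtime
        return "hello, stranger"
    }
    return "hello, " + u.Name
}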

The concurrency model is great, but has a bit of a learning curve to it. If you identify a performance bottleneck, you end up implementing & profiling both the traditional way, with mutexes/locks, and the idiomatic way, with channels and goroutines.
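A sketch of the two shapes a shared counter can take (assumes the sync package; the names are illustrative):

// traditional: guard shared state with a mutex
var (
    mu    sync.Mutex
    count int
)

func inc() {
    mu.Lock()
    count++
    mu.Unlock()
}

// idiomatic: one goroutine owns the state and everyone else
// talks to it over a channel (close the channel to stop it)
func startCounter() chan<- int {
    deltas := make(chan int)
    go func() {
        total := 0
        for d := range deltas {
            total += d
        }
    }()
    return deltas
}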

Some things that seem like negatives aren’t. The lack of generics is unfortunate, but the language designers are not willing to give up any of the other good stuff in Go in order to shoehorn them into the language. So far I’m convinced it’s the right decision. If they manage to pull it off in the future, their track record suggests that they will have found the right tradeoffs.

Net Win

Nearly two years ago, I said the following and I still believe it to be true:

Go is a tremendous productivity multiplier. I wish my competitors to use other lesser means of craft.

Go may not be for everyone, but there is more and more evidence that others are coming to similar conclusions.

One last point of interest: many of the posts linked above discuss how much fun Go is, and my own experience upholds this. Not just in programming, but in any domain, better tools that reduce friction nearly always make the process more entertaining. It turns out that when you help people avoid some of the irritations in their craft, they have a more enjoyable time with it.

Many thanks to @jkubicek, @yokimbo and Daniel Walton for reading drafts and providing feedback.

2013 08 29

Installing MariaDB/MySQL with Docker

Simply put, I think Docker is going to change the game. Anyone who has any interest whatsoever in devops had better be paying attention.

The best part for me about Docker is that I can iterate very, very quickly on getting images the way I want. If I am installing a VM and I screw up a step, reinstalling Ubuntu from scratch is no fun. Spinning up a new Docker container or image takes less than half a second.

Derek from Scout put it simply and concisely: “Docker is git for deployment”

Here’s what I’ve learned over the past week:

Typical docker usage:

Create a new container by loading a fresh image

sudo docker run -i -t ubuntu /bin/bash

Start populating the container with whatever:

apt-get update
apt-get install <WHATEVER>

Leave the container

exit

Create a snapshot image of the current state of the container

# grab the container id (this will be the first one in the list)
docker ps -a  
docker commit <CONTAINER_ID> <YOU>/<CONTAINER_NAME>

Run the container as necessary, configuring ports, using detached mode and whatnot.

At this point you have a docker image (like a snapshot) of a clean install of WHATEVER. Running a container will load the image as a starting point and then allow you to configure as necessary. Screw up the conf file? Just exit the container and start over. The pipeline reversal penalty is minimal.
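For example, assuming the image committed above, a throwaway retry is one line:

sudo docker run -i -t <YOU>/<CONTAINER_NAME> /bin/bash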

MariaDB

Create container and do the basic install

Launch a fresh container:

sudo docker run -i -t ubuntu:precise /bin/bash

Inside the docker container, just do the regular install dance:

apt-get update
# a mounted file systems table to make MySQL happy
cat /proc/mounts > /etc/mtab

MySQL:

apt-get install mysql-server

MariaDB:

apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xcbcb082a1bb943db

# if you want v10.0.x
echo "deb http://ftp.osuosl.org/pub/mariadb/repo/10.0/ubuntu precise main" >> /etc/apt/sources.list
# if you want v5.5
echo "deb http://ftp.osuosl.org/pub/mariadb/repo/5.5/ubuntu precise main" >> /etc/apt/sources.list
apt-get update
apt-get install mariadb-server

# exit the container
exit

Commit the basic install. I like to note which version in the image name. Everything after this assumes 5.5, but the basic directions work for 10.0.x as well.

# one of the following:
docker commit <container_id> <YOU>/mariadb55
docker commit <container_id> <YOU>/mariadb100

FIRST RUN:

First, launch back into a new container, but this time with a data directory mounted. This example uses a host folder at $HOME/mysqldata. Inside the container, the directory is mapped to /data. This makes it easy to spin up different instances of MySQL without having to constantly configure new data dirs.

sudo docker run -v="$HOME/mysqldata":"/data"  -i -t -p 3306 <YOU>/mariadb55 /bin/bash

THE FOLLOWING FIRST RUN STEPS ASSUME THE -v="$HOME/mysqldata":"/data" FLAG!

Backup my.cnf

cp /etc/mysql/my.cnf /etc/mysql/my.cnf.orig

Allow access from any IP address. This is obviously not secure for production, but more than enough for this example.

sed -i '/^bind-address*/ s/^/#/' /etc/mysql/my.cnf

Change the data dir

sed -i '/^datadir*/ s|/var/lib/mysql|/data/mysql|' /etc/mysql/my.cnf
rm -Rf /var/lib/mysql

Set up new data tables

mysql_install_db

Startup:

/usr/bin/mysqld_safe &

Follow the prompts (typically, set a root password and answer Y for everything else):

mysql_secure_installation

Allow the docker user to log in from wherever (again, obviously not secure for production):

mysql -p --execute="CREATE USER 'docker'@'%' IDENTIFIED BY 'tester';"
mysql -p --execute="GRANT ALL PRIVILEGES ON *.* TO 'docker'@'%' WITH GRANT OPTION;"

Bail from the container, back to the host:

mysqladmin -p shutdown
exit

In the host, commit and tag:

sudo docker commit -m "mariadb55 image w/ external data" -author="<YOU>" <CONTAINER_ID> amattn/mariadb55 <SOME_TAG>

Running

We can then run with:

sudo docker run -v="$HOME/mysqldata":"/data" -d -p 3306 amattn/mariadb55:<SOME_TAG> /usr/bin/mysqld_safe

See which port is being forwarded with:

sudo docker ps -a

At this point, you can access MariaDB at the host IP address and the forwarded port (usually 49xxx).
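For example, with the standard mysql client (substitute your host IP and the forwarded port from docker ps):

mysql -h <HOST_IP> -P <FORWARDED_PORT> -u docker -p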

Tips

Iterate! Did you screw up the mysql_secure_installation step? Just exit the container and start from your last docker commit. Docker is all about iterating repeatable steps until you have an image ready to go.

There are lots of docker commands. Spend some time browsing the docs:

http://docs.docker.io/en/latest/commandline/cli/

The two most useful tutorials I came across were:

http://hotcashew.com/2013/07/lemp-stack-in-a-docker-io-container/
http://zaiste.net/2013/08/docker_postgresql_how_to/

And an older list of tutorials:

http://blog.docker.io/2013/06/14-great-tutorials-on-docker/

Use tags. The repo format is <USERNAME>/<IMAGENAME>:<TAG>. I tend to name them like this:

amattn/component:webappname
amattn/postgres92:favstarclone
amattn/postgres92:flickrclone
amattn/mariadb55:bookmarker
amattn/mariadb55:accountmgr

