Matt Good

Random musings on code & other stuff

Context Managers FTW

In their most basic form, Python’s context managers make it easier to do the right thing by getting rid of some boilerplate. For example, it was sometimes tempting to skip the try/finally needed to close a file object correctly, since Python would usually clean it up anyways when it went out of scope. But it’s much easier to do it right this way:

with open('filename') as fp:
  contents = fp.read()

instead of this way:

fp = open('filename')
try:
  contents = fp.read()
finally:
  fp.close()

However, the more I work with context managers, the more they change the way I think about many different problems.

It’s easy to extend this pattern into other places where you have a resource to clean up when you’re done. I use this one a lot to create a temporary directory to work with, and then it’s cleaned up at the end:

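import contextlib
import shutil
import tempfile

@contextlib.contextmanager
def temp_directory(*args, **kwargs):
  # arguments are passed straight through to tempfile.mkdtemp
  path = tempfile.mkdtemp(*args, **kwargs)
  try:
    yield path
  finally:
    # remove the directory and everything in it, even after an error
    shutil.rmtree(path)

with temp_directory() as tmp:
  # work with files in tmp, and they'll be cleaned up when you're done!
  pass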
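
For readers on Python 3.2 or later, the standard library's tempfile.TemporaryDirectory follows the same pattern out of the box. The sketch below is only an illustration of that stdlib class; the function name write_scratch_file and the file name scratch.txt are made up for the example.

```python
import os
import tempfile


def write_scratch_file():
    # TemporaryDirectory yields the path of a fresh directory
    with tempfile.TemporaryDirectory() as tmp:
        scratch = os.path.join(tmp, 'scratch.txt')
        with open(scratch, 'w') as fp:
            fp.write('temporary data')
        existed_inside = os.path.exists(scratch)
    # the directory and its contents are removed once the block exits
    return existed_inside, os.path.exists(tmp)
```

Calling write_scratch_file() reports that the file existed inside the block and that the directory is gone afterwards.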

In many of these cases, the context manager isn’t a revolutionary shift from just writing the try/finally code inline, but as you start building on the basic patterns, you can start to come up with some more interesting solutions.

I have a few places where I need to replace an existing file in a safe way. For example, I have a running web app that uses a file-based cache. I have a background process that updates the data for the cache, so I want to write the new data into a new file, and if it completes successfully, swap it in place of the old cache. I could do all this inline where I’m updating the cache, but it’s much easier to abstract the details away so that it’s easy to code once and reuse elsewhere.

import contextlib
import os

@contextlib.contextmanager
def overwriting(path, prefix='', suffix='.tmp'):
  dirname, filename = os.path.split(path)
  tmp_path = os.path.join(dirname, prefix + filename + suffix)
  # O_EXCL makes this fail if the temporary file already exists
  fd = os.open(tmp_path, os.O_EXCL | os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
  f = os.fdopen(fd, 'w')
  try:
    yield f
    if not f.closed:
      f.flush()
    # only swap the new file into place once the block has succeeded
    os.rename(tmp_path, path)
  except:
    # on any error, discard the partial file and leave the original alone
    os.remove(tmp_path)
    raise
  finally:
    f.close()

with overwriting('filename') as fp:
  fp.write('Hello world!')

In the case of the cache, it was critical to get the logic right so that I didn’t swap out the cache until it was fully written, and didn’t swap out the cache if there was an error somewhere in the middle, etc. But, this has also made it easy to take advantage of this in places I probably wouldn’t have bothered with making as robust before. Soon after writing the overwriting context manager I needed to write some code to make some tweaks to an XML document and re-write it. This is what I ended up with:

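import contextlib
from xml.etree import ElementTree

@contextlib.contextmanager
def xml_mutator(filename):
  xml = ElementTree.parse(filename)
  yield xml
  # only reached if the with block finished without raising
  with overwriting(filename) as fp:
    # on Python 3, pass encoding='unicode' here since fp is a text-mode file
    xml.write(fp)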

It simply parses an XML file, gives you the ElementTree document to manipulate, and then writes back any changes you made to the document. Normally I would have just overwritten the original file directly, but now I get the nice behavior that if something goes wrong, I won’t overwrite the original file with a broken file.
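
The "original survives a failure" behavior is easy to check with a small self-contained sketch. This is not the overwriting function above: atomic_write is a hypothetical, simplified stand-in that assumes Python 3 and uses tempfile.NamedTemporaryFile plus os.replace, and the file name cache.txt is made up for the demonstration.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def atomic_write(path):
    """Hypothetical stand-in for overwriting(): write a temp file, then swap it in."""
    dirname = os.path.dirname(os.path.abspath(path))
    # create the temp file next to the target so the swap stays on one filesystem
    tmp = tempfile.NamedTemporaryFile('w', dir=dirname, suffix='.tmp', delete=False)
    try:
        with tmp:
            yield tmp
        # reached only if the caller's block finished without raising
        os.replace(tmp.name, path)
    except BaseException:
        # discard the partial file and leave the original untouched
        os.remove(tmp.name)
        raise


def demo():
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, 'cache.txt')
    with open(target, 'w') as fp:
        fp.write('old data')

    # a failed update must not replace the existing file
    try:
        with atomic_write(target) as fp:
            fp.write('partial')
            raise RuntimeError('update failed halfway')
    except RuntimeError:
        pass
    with open(target) as fp:
        after_failure = fp.read()

    # a successful update swaps the new contents in
    with atomic_write(target) as fp:
        fp.write('new data')
    with open(target) as fp:
        after_success = fp.read()

    leftovers = os.listdir(workdir)
    shutil.rmtree(workdir)
    return after_failure, after_success, leftovers
```

Running demo() shows the old contents surviving the failed update, the new contents appearing after the successful one, and no stray temporary files left behind.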

Other good examples I’ve encountered include mmapping files, or setting SIGALRM to run some code with a timeout. Being able to abstract some of the messy details of these APIs makes it easier to just get the pattern right once and reuse it. This has simplified my code quite a bit, and has made context managers one of my favorite language features the more I’ve started factoring them into the way I think about my code.
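
As a rough illustration of the SIGALRM case, here is a minimal sketch of a timeout context manager. The names time_limit, TimeLimitExceeded and run_with_limit are hypothetical, and the sketch assumes a Unix system and code running in the main thread, since that is where Python allows signal handlers to be installed.

```python
import contextlib
import signal
import time


class TimeLimitExceeded(Exception):
    """Hypothetical exception raised when the block runs too long."""


@contextlib.contextmanager
def time_limit(seconds):
    def on_alarm(signum, frame):
        raise TimeLimitExceeded('block exceeded %s seconds' % seconds)

    # install our handler, remembering the one that was there before
    previous = signal.signal(signal.SIGALRM, on_alarm)
    # setitimer accepts fractional seconds, unlike signal.alarm
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # cancel the timer and restore the previous handler
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def run_with_limit(seconds, work_seconds):
    try:
        with time_limit(seconds):
            time.sleep(work_seconds)
        return 'finished'
    except TimeLimitExceeded:
        return 'timed out'
```

The finally clause is what makes this worth wrapping up once: the timer is cancelled and the old handler restored whether the block finishes, times out, or fails for some other reason.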

Released Jprops Library

I’ve uploaded a new Python library jprops to GitHub. It reads and writes Java’s .properties file format.

We were using pyjavaproperties at work, but ran into a few limitations. One of our developers had patched it to be able to read from a StringIO object instead of a file object. I had also run into an issue where it extends the standard property parsing to interpolate patterns like {var}, expanding them to the value of another property named var. This interfered with some of our properties files that use ${var} as a pattern recognized by our configuration system.

Since pyjavaproperties was also missing some other features like unicode support, I decided to go ahead and read over the Java documentation of the file format and build a new cleanroom implementation. I have a decent set of unit tests and I think this implementation should cover all of the features documented in the spec. It has full support for reading and writing unicode values, though it will return Python str objects by default when your key or value contains only ASCII. This was more convenient, but I may revisit that to always return unicode or have a switch for that behavior.

So, if you need to work with Java .properties files from Python, just pip install jprops and check out the docs. If it’s missing something let me know.

Simple JavaScript Namespaces

Namespaces are one honking great idea — let’s do more of those!

Tim Peters, The Zen of Python

After a few years of mostly server-side development I’m getting back into JavaScript, and for the first time I feel like I’m approaching it as an actual engineering challenge rather than a bunch of quick hacks. As our JS code base is growing I’m breaking things up into namespaces to keep things manageable.

Though JavaScript has no built-in notion of “modules” like many other languages, you can use objects to create namespaces, which in their simplest form would look something like:

var util = {
  my_func: function() {
    alert('my_func');
  }
};
util.my_func();

However, creating nested namespaces like “myapp.util.text” gets a bit ugly with that approach. I liked the basic API of Namespace.js, though I didn’t want any of the additional features like remote loading, so I figured I could make something much simpler and smaller.

You can first “declare” a namespace and then assign attributes like:

Namespace('myapp.ui');
myapp.ui.ToggleButton = function() {
  // ...
};

Or pass in the attributes to include in that namespace:

Namespace('myapp.util.text', {
  some_func: function() {
  }
});

Here’s the code:

function Namespace(name, attributes) {
  var parts = name.split('.'),
      ns = window,
      i = 0;
  // find the deepest part of the namespace
  // that is already defined
  for(; i < parts.length && parts[i] in ns; i++)
    ns = ns[parts[i]];
  // initialize any remaining parts of the namespace
  for(; i < parts.length; i++)
    ns = ns[parts[i]] = {};
  // copy the attributes into the namespace
  for (var attr in attributes)
    ns[attr] = attributes[attr];
}

This is of course much smaller than the 16kb of Namespace.js, so as an additional challenge I decided to see if I could compact it to fit the 140 character limit of a Twitter status message :) (Note: the Twitter version used jQuery’s $.extend function to copy the attributes into the namespace, which I later replaced with the framework-independent version above.)

I’ll follow up later with some tricks I’ve been thinking about to “import” stuff from other namespaces.

Safari 3

I think I’ve tried just about every web browser available for the Mac. The included Safari 2 didn’t feel quite right, Firefox was too slow and a memory hog, and Flock, Shiira and some others just didn’t really fit what I was looking for. Camino was nice and light, and better integrated with the Mac than Firefox, but I was missing a good search box and it felt a little too limited in features. Now I’m using the Safari 3 beta and couldn’t be happier. There are lots of little things that just feel “right” and it fits with the Mac OS nicely. After downloading an application you are warned that the download contains an app, and if you approve, the DMG file is mounted so you can run or install the app. But the one thing that seems simple, yet has felt missing in browsers for a while, is tab management. On Linux I used Epiphany, which has supported rearranging tabs for a while, and Firefox now supports it, but Safari gets one more thing right: dragging tabs between windows. So far I haven’t seen another browser do this, but in Safari you can drag a tab up or down to detach it from the current window and either drag it to a different window, or out into a new window. It’s not a feature I use constantly, but I do really like being able to keep my tabs organized in a logical manner instead of having a random assortment of tabs all open in the same window. Thanks, Apple, for getting this right.

Scripts: Svndiff

To make the output of the “svn diff” command more readable, here’s a small script that pipes it through pygmentize, the command-line tool from the Pygments library, to colorize it:

#!/bin/bash
svn diff "$@" | pygmentize -ldiff

Mac Migration (Part 1)

A week ago I started my new job at YouTube. Most people here use Macs so I got a nice shiny new MacBook Pro to work on. I’ve been using Linux (Debian and Ubuntu) almost exclusively for about 4 years now and Windows before that, so I’ve been quickly getting up to speed on using my new Mac.

One of my first major annoyances was that some form controls weren’t keyboard navigable. Filling out web forms was frustrating since hitting Tab would skip past drop-down fields, and when dialogs popped up and I didn’t want to respond with the default button, I had to switch over to the mouse instead of just tabbing to the right one.

Fortunately I found these instructions on changing this behavior. Now I can use the keyboard to quickly navigate these inputs.

More on my Mac switch to come.

PythonPaste

Ian Bicking has just added a new package to the Paste suite for WSGI utilities