Hexagonal Django

The last few weeks I've been thinking about the architectural pattern known as Clean, Onion, Hexagonal, or Ports'n'Adaptors. I'm curious if many people are applying it in the Django world.

The premise is that your core application entity classes and business rules should be plain old objects, with no outward dependencies. In particular, they must not depend on the interfaces between your application and external systems, such as your persistence mechanism or your web framework. Instead, those external interface components depend upon your core business objects. This essentially moves the database from the 'bottom' layer of the traditional three-layer architecture to form part of the topmost layer - a sibling of the UI.

For inbound messages (e.g. handling a web request) this is straightforward: Django calls your view code, which calls your business layer. Keep your business layer separate from your Django code, so it is stand-alone and unit-testable. For outbound messages, such as then rendering the web page in response, it's slightly more complicated: your business logic must pass the result (typically a pure data structure) back to your web-aware code, but without your business logic depending on the web-aware component. This requires an inversion of control.
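To make that concrete, here's a minimal sketch of the shape (all the names are invented for illustration): the core computes a pure data structure and hands it to whichever 'presenter' callable was injected, so the dependency arrow points from the web adapter to the core, never the other way.

```python
# Core business logic: plain Python, no Django imports.
# All names here are illustrative, not from any real project.

def order_summary(order_lines, present):
    """Compute a pure data structure from (name, unit_price, quantity)
    lines, then hand it to whatever 'present' callable the caller
    injected. The core calls outward through this port, but never
    imports the web-aware code."""
    total = sum(price * qty for _, price, qty in order_lines)
    return present({'line_count': len(order_lines), 'total': total})

# Web adapter: in a real Django project this would be a view building an
# HttpResponse; here it just formats a string.
def web_present(summary):
    return '{line_count} lines, total {total}'.format(**summary)
```

The view-level code wires the two together, e.g. `order_summary(lines, web_present)`; a unit test injects a trivial presenter instead and never touches the web layer.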

That way, all your business logic can easily be tested in unit tests, with no mocking required. You still need some end-to-end tests to verify integration, but you shouldn't need to involve your UI or database in testing every detail of your business logic.

Also, you can easily switch out your external system interfaces, such as persistence, to use another RDBMS, another ORM, a NoSQL store, or an in-memory version for testing. Since the core of your application doesn't have any dependency on these components, it is oblivious to the change. The business logic, because it doesn't depend on Django, is no longer riddled with Django's convenient ORM database access.
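A minimal sketch of such a persistence 'port' (the names are my own invention, not any real project's API): the core defines the interface it needs, and each storage technology supplies an adapter.

```python
# A 'port' for persistence: the core owns this interface, and each
# storage technology supplies an adapter. Names are illustrative.

class OrderRepository(object):
    """What the business logic requires of any storage back end."""

    def save(self, order_id, order):
        raise NotImplementedError

    def get(self, order_id):
        raise NotImplementedError


class InMemoryOrderRepository(OrderRepository):
    """Adapter used in unit tests -- no database, no Django."""

    def __init__(self):
        self._orders = {}

    def save(self, order_id, order):
        self._orders[order_id] = order

    def get(self, order_id):
        return self._orders[order_id]

# A Django-backed adapter would live beside the models, implementing the
# same interface in terms of the ORM; the core only ever sees
# OrderRepository, so swapping adapters leaves it untouched.
```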

The same goes for switching out your web framework, or for invoking the same logic from both web UI and web API calls. And again for switching out your UI: add a command-line application, or a console UI. The core application logic is unaffected, and your new interface components contain only the code specific to that interface's concerns.

Another side effect is that your web framework, if you're using one, becomes a peripheral detail which depends upon your core application, rather than the other way round. Your Django project would become a subdirectory, rather than dominating your directory structure. Since the business logic formerly contained within it now lives elsewhere (in your core business objects), the Django project is very thin: views, for example, are mere delegations to single business-layer functions. The Django project now contains just the web-oriented aspects of your project, as it should.

These ideas all seem like relatively straightforward software engineering, and I feel a bit foolish for not having been aware of them all these years. I console myself that I'm not alone.

Uncle Bob's Ruby Midwest keynote "Architecture - The Lost Years" attributes one source of this idea to Ivar Jacobson's 1992 book Object-Oriented Software Engineering: A Use Case Driven Approach (2nd-hand hardbacks cheap on Amazon.)

I see a few people applying these ideas to Rails, but are many people out there doing this in Django? I plan to refactor a small vertical slice of our monster Django app into this style, to try and prove the idea for myself.

Encrypted zip files on OSX

Update: I've since switched to KeePassXC, the community fork of KeePassX, an open source, cross-platform, local-first, encrypted password storage program.

My passwords and other miscellany are in a plain text file within an encrypted zip. Since starting to use OSX I've been looking for a way to access my passwords such that:

  • I get prompted for the decryption password.
  • The file gets unzipped, but not in the same directory, because that's synced to Dropbox, so would send my plaintext passwords to them every time I accessed them. Maybe to /tmp?
  • The plaintext file within the zip is opened in $EDITOR.
  • Wait for me to close $EDITOR, then remove my plaintext passwords from the filesystem.
  • Before deleting the passwords, check if I've updated them. If so, put the new updated version back into the original zip file.
  • Don't forget to keep the updated zip file encrypted, using the same password as before, without prompting for it again.

I failed to find an existing app which would do all this (although I had no trouble on Linux or even on Windows.) Hence, resorting to good old Bash:



read -s -p "Password:" key
echo

# remember where the zip lives before we cd away from it
ZIPDIR="$PWD"

unzip -P "$key" passwords.zip passwords.txt -d "$TMPDIR"
if [[ $? != 0 ]] ; then
    exit 1
fi

cd "$TMPDIR"
touch passwords.datestamp
$EDITOR passwords.txt
# only re-zip if the file was modified after the datestamp
if [[ passwords.txt -nt passwords.datestamp ]] ; then
    zip -P "$key" -r "$ZIPDIR/passwords.zip" passwords.txt
fi

rm passwords.txt
rm passwords.datestamp

I don't expect this to be watertight, but seems good enough for today. I'm happy to hear suggestions.

Compiling MacVim with Python 2.7

I love the brilliant Vim plugin pyflakes-vim, which highlights errors & warnings, and since I got a MacBook for work, I've been using MacVim a lot.

This combination has a problem: MacVim uses the OSX system default Python 2.6, so pyflakes is unable to handle Python 2.7 syntax, such as set literals. These are marked as syntax errors, which prevents the rest of the file from being parsed.

The solution is to compile your own MacVim, using Python 2.7 instead of the system Python. The following commands got MacVim compiled for me:

git clone git://github.com/b4winckler/macvim.git
cd macvim/src
export LDFLAGS=-L/usr/lib
./configure \
    --with-features=huge \
    --enable-rubyinterp \
    --enable-perlinterp \
    --enable-cscope \
    --enable-pythoninterp \
    --with-python-config-dir=/usr/local/lib/python2.7/config
make
open MacVim/build/Release
echo Drag MacVim.app to your Applications directory

Without the LDFLAGS setting, I was missing some symbols at link time. The --with-python-config-dir entry came from typing 'which python' to find where my Python 2.7 install lives, and modifying that result to find its 'config' directory (whatever that is) near the binary - so your path may differ from the one shown above.

As indicated, install by dragging the resulting macvim/src/MacVim/build/Release/MacVim.app into your Applications directory.

Open up MacVim, and check out the built-in Python version:

:python import sys; print sys.version
2.7.1 (r271:86882M, Nov 30 2010, 10:35:34)

And files with set literals are now correctly parsed for errors.

Update: This only works if Python 2.7 is your default 'python' executable. If it isn't, or if you get "ImportError: No module named site", see Richard's comments below.

Python 2.7 regular expression cheatsheet

Couldn't find one of these, so I whipped one up.

A bit of reStructuredText:


Install some Python packages:


Invoke rst2pdf:


Get a nice PDF out:

Python 2.7 regular expression cheatsheet (click this link or the image for the most up-to-date PDF from github.)

Django testing 201 : Acceptance Tests vs Unit Tests

I'm finding that our Django project's tests fall into an uncomfortable middle-ground, halfway between end-to-end acceptance tests and proper unit tests. As such they don't exhibit the best qualities of either. I'd like to fix this.

We're testing our Django application in what I believe is the canonical way, as described by the excellent documentation. We have a half-dozen Django applications, with a mixture of unittest.TestCase and django.test.TestCase subclasses in each application's tests.py module. They generally use fixtures or the Django ORM to set up data for the test, then invoke the function-under-test, and then make assertions about return values or side-effects, often using the ORM again to assert about the new state of the database.

Not an Acceptance Test

Such a test doesn't provide the primary benefit of an acceptance test, namely proof that the application actually works, because it isn't quite end-to-end enough. Instead of calling methods-under-test, we should be using the Django testing client to make HTTP requests to our web services, and maybe incorporating Selenium tests to drive our web UI. This change is a lot of work, but at least the path forward seems clear.

However, an additional problem is that acceptance tests ought to be associated with features that are visible to an end user. A single user story might involve calls to several views, potentially spread across different Django apps. Because of this, I don't think it's appropriate for an acceptance test to live within a single Django app's directory.

Not a Unit Test

On the other hand, our existing tests are also not proper unit tests. They hit the (test) database and the filesystem, and they currently don't do any mocking out of expensive or complicated function calls. As a result, they are slow to run, and will only get slower as we ramp up our feature set and our test coverage. This is a cardinal sin for unit tests, and it discourages developers from running the tests frequently enough. In addition, tests like this often require extensive setup of test data, and are therefore hard to write, so it's very labour-intensive to provide adequate test coverage.

My Solution

1) I've created a top-level acceptancetests directory. Most of our current tests will be moved into this directory, because they are closer to acceptance tests than unit tests, and will gradually be modified to be more end-to-end.

These acceptance tests need to be run by the Django test runner, since they rely on lots of things it does, such as creating the test database and rolling back after each test method. However, the Django test runner won't find these tests unless I make 'acceptancetests' a new Django application, and import all acceptance test classes into its tests.py. I'm considering doing this, but for the moment I have another solution, which I'll describe in a moment.

We also need to be able to create unit tests for all of our code, regardless of whether that code is within a Django model, somewhere else in a Django app, or in another top-level directory that isn't a Django app. Such unit tests should live in a 'tests' package right next to the code they test. I'm puzzled as to why Django's test runner doesn't look for unit tests throughout the project and just run them all, along with the Django-specific tests.

2) My solution to this is to augment the Django test runner, by inheriting from it. My test runner, instead of just looking for tests in each app's models.py and tests.py, looks for subclasses of unittest.TestCase in every module throughout the whole project. Setting Django's settings.TEST_RUNNER causes this custom test runner to be used by 'manage.py test'. Thanks to the Django contributors for this flexibility!

So the new test runner finds and runs all the tests which the default Django runner runs, and it also finds our unit tests from all over the project, and it also includes our new top-level 'acceptancetests' directory. This is great!

One surprise is that the number of tests which get run has actually decreased. On closer inspection, this is because the standard Django test runner includes all the tests for every Django app, and this includes not just my project's apps, but also the built-in and middleware Django apps. We are no longer running these tests. Is this important? I'm not sure: After all, we are not modifying the code in django.contrib, so I don't expect these tests to start failing. On the other hand, maybe those tests help to demonstrate that our Django settings are not broken?

An appeal for sanity

My solutions seem to work, but I'm suspicious that I'm swimming against the current, because I haven't found much discussion about these issues, so maybe I'm just well off the beaten path. Have many other people already written a similar extension to Django's test runner? If so, where are they all? If not, why not? How else is everyone running their Django project tests in locations other than models.py or tests.py? Or do they not have tests outside these locations? If not, why not? I'd love to hear about it if I'm doing it wrong, or if there's an easier approach.

Update: My fabulous employer has given permission to release the test runner as open source:


Update2: I like this post's numeric ID (check the URL)

£ key in Windows on a US laptop keyboard, done right.

The usual solution to typing non-US characters on a US keyboard in Windows is to hold left-alt, then type on the numeric keypad:

£   Left-alt + 0163

€   Left-alt + 0128

This is a pain on my (otherwise fabulous) Thinkpad laptop, because the numeric keypad is accessed by holding the blue 'Fn' key while you tap ScrLk, to toggle numeric keypad mode, and then doing the same again afterwards to turn it off.

One inadequate alternative (on WindowsXP, YMMV) is to go into control panel; Regional and Language Options; Languages; Details; Settings. Add a new keyboard configuration, "United States-International", which should be grouped under your existing language ("English (United Kingdom)" for me.) OK all the dialogs, restart your applications.

Now you can simply type:

£   Right-alt + Shift + 4

€   Right-alt + 5

The downside of this solution is that the "United States-International" keyboard setting adds a bunch of other features, including 'dead keys', whereby quotes and other punctuation are used to add accents to letters, which is overly intrusive if, like me, you hardly ever use accents.

The ultimate solution, then, is to define your own personal keyboard layout. Download the Microsoft Keyboard Layout Creator from here: http://msdn.microsoft.com/en-us/goglobal/bb964665.

My end result is an MSI with which I can install a new keyboard layout, which is exactly like 'US', but with the addition of £ on the key right-alt + 3:


The source .klc file is in there, so you could add your own tweaks on top of that.

We'll parallelise your long-running test suite on EC2

Another brainstorming project idea:

Some projects have a suite of tests which take a long time to run. This hinders agility.

We could run these test suites for clients across EC2 instances. We've had great success at Resolver Systems in slicing a test run across several machines and then recombining the results; no doubt some people have similar solutions. By taking the hassle of configuring these machines out of your hands, we could cut the test suite execution time to a fraction of that taken running it serially on a single machine.
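The slicing itself is simple - something like this round-robin sketch (illustrative only, not Resolver Systems' actual code), with recombining reduced to concatenating each worker's results:

```python
def partition(tests, n_workers):
    """Deal the tests out round-robin, one slice per worker machine.
    Each worker runs its slice independently; recombining the run is
    then just concatenating each worker's result set. Sketch only."""
    slices = [[] for _ in range(n_workers)]
    for i, test in enumerate(tests):
        slices[i % n_workers].append(test)
    return slices
```

The hard parts, as noted below, are not the arithmetic but the per-project configuration each slice's machine needs before it can run anything.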

Additionally, we could run tests against multiple versions of Python, or run acceptance / system tests, on multiple browsers. These can all be run in parallel across as many instances as it takes.

The downside of this idea is that unit tests run very quickly, so we provide no value there. Only for long-running acceptance tests is this a useful service - and those are tricky because they so often require so much custom configuration and resources in order to run. Also, like our own acceptance tests, they may be running on Windows clients, which is a more expensive VM to run on EC2.

I don't think it could be made to work with an arbitrary set of projects' tests - the differences between how each project writes and runs their tests are just too great. But we could provide a sort of 'snakebite light' - an available bunch of servers with a variety of Python and browser versions which people could write tests to run against.

Do many people have long-running acceptance test suites that would run on a cheap Linux-based VM? How comfortable would people be outsourcing a service so fundamental to their project? Do you think we'd be overwhelmed by the difficulties inherent in the custom configuration each project would need to run its tests? Again, if you love or hate this idea, I'd love to hear about it.

Your Python, Our Servers : What could possibly go wrong?

We've been brainstorming for products to build. What do you think of this one?

Python Interactive Console in the Browser

One idea is a CPython interactive console in the browser. The client is JavaScript, which sends the user's Python to execute in a sandbox on our EC2 instances. Dirigible already has the infrastructure to support running our users' arbitrary Python in a sandbox across our grid - so we'd only need to strip off the spreadsheet UI to expose this functionality to end-users.

Python in the browser

Your console state is persistent - so you could start some investigation, or a long-running statement, then log off, and then reconnect to it later from another machine.

Since the Python runs on our servers, all the client needs is a working browser with JavaScript. You could run code from any device - even funky things like iPads, which don't traditionally make installing and running Python very easy. Your only hindrance would be the keyboard. Since our servers could be beefy, your code could run substantially faster. We could help to support running in parallel across several servers.

We could provide multiple versions of Python - and keep them fully loaded with packages from PyPI - so you could use any combo without having to install anything locally.

There are already some examples of in-browser Python consoles like this, but none of them quite do what we'd hope to do.

We're also thinking of providing really simple tools so you could work on scripts or programs locally, and then run them in the cloud with a single command / button.

Generally we're thinking of running your stuff on our own EC2 servers. Other applications might involve executing stuff on your own servers, instead of ours, but thus far we haven't come up with applications of this that offer much over fabric or a combination of an SSH session and Unix program 'screen'.

Would you use this? What for? Do you think it's totally dumb? I'd love to hear about it.

Most interestingly, would you pay for it? We're thinking free for trivial uses, a few dollars for some advanced features (access to networking, persistence), scaling up to support teraflop computing.

Launch Gitk displaying all branches

Update: All of the below is made totally redundant by simply using gitk --all. Thanks Russel!

When I launch Gitk, it just displays the current branch. To display other branches, you must name them on the command line. To display all existing branches, you need to find out all the branch names:

$ git branch
  create-sql-dev
  formula-rewrite
* master

Then laboriously type them in to the gitk command line:

$ gitk create-sql-dev formula-rewrite master

Alternatively, save this Bash snippet in a script on your PATH. I call mine gitka, for 'all branches':

# run gitk, displaying all existing branches
git branch | tr -d '*' | xargs gitk

Gitk displaying all branches, not just the current ('master' in bold)

This works on Windows too, if you save it as 'gitka.sh', have Cygwin installed, and associate the .sh filename extension with the Cygwin Bash executable. You can then run it as 'gitka' from a Windows command prompt thingy. If you also do 'ln -s gitka.sh gitka', then you can run it as just 'gitka' from a Cygwin bash prompt too - without this you'd have to type out the full 'gitka.sh'.

'Go to Definition' in Vim for Python using Ctags, Done Right

How to set up and configure Vim to use tags for Python development so that it doesn't suck.

Install Ctags

Get the latest version of ctags, put it on your PATH. Recent releases are much improved for Python.

Creating or updating tags files

You'll probably want one tags file at the root of your project, which will need to be created or updated whenever you make significant changes. Either get used to manually running the following command a lot:

ctags -R .

or bind it to a key in your ~/.vimrc:

map <f12> :!start /min ctags -R .<cr>

I like to set Vim's current working directory equal to the root of whatever project I'm working in, so now I can press f12 to update the tags file for the project. The 'start /min' part is a Windows-specific way to run the command in the background, so Vim isn't locked up waiting for it to finish.

Test it out

Now, in Vim, ctrl-] will jump to the definition of the symbol under your text cursor. Hooray, etc. If there is more than one definition of that symbol, it presents a menu for you to choose from.

Turn off useless tags

By default, ctags generates tags for Python functions, classes, class members, variables and imports. The last two are useless to me, and they actually make ctrl-] more inconvenient, because they increase the likelihood of finding duplicate definitions of a tag, causing the menu to inconveniently pop up, rather than just jumping to the tag you want.

To fix this, create a ~/.ctags file:


The first line turns off tags generation for variables and imports. The second and third lines turn off generation of tags in the named dirs, since you almost certainly want to ignore source code in those directories.

Case insensitive tag matching

If your .vimrc requests case-insensitive searching by setting ignorecase (aka ic), then the above tag matching will also be case insensitive. This is irksome, because searching for the definition of a property named 'matrix' will present you with a menu asking you to choose between the property 'matrix' and the class 'Matrix', rather than just jumping to the property.

To fix this, add this to your .vimrc:

" go to defn of tag under the cursor
fun! MatchCaseTag()
    let ic = &ic
    set noic
    try
        exe 'tjump ' . expand('<cword>')
    finally
        let &ic = ic
    endtry
endfun
nnoremap <silent> <c-]> :call MatchCaseTag()<cr>

Update: This Vim script was suggested in a comment by James Vega, in order to reliably restore the state of 'ignorecase' after doing the tag jump. Many thanks!

This maps your ctrl-] key to turn off case-insensitivity while it does the jump to tag, then turn it back on again. Now pressing ctrl-] will jump directly to your property, only presenting menus on the occasion when the tag you search for is defined in more than one place using precisely the same name.

Much better.

Update: Also see this post about adding stdlib and venv contents to your tags: https://www.fusionbox.com/blog/detail/navigating-your-django-project-with-vim-and-ctags/590/