Using Redis SORT and GET to save on roundtrips

A common pattern when storing more complex data in Redis is to use plain strings or hashes to store a representation of some data object, and then store references to those keys in Redis’ other data structures (lists, sets, and sorted sets). A contrived example might look like this:

redis> LRANGE FooBar|list 0 -1
1) "1"
2) "2"
3) "4"

redis> GET FooBar|id|1
"One"

redis> GET FooBar|id|2
"Two"

redis> GET FooBar|id|4
"Four"

redis>

A common access pattern for this type of data seems to be to first retrieve the ids contained in the list, and then retrieve the keys one by one as shown in the example above. The primary deficiency of this approach is that it requires a network operation for each item referenced in the list. I’m a fan of trying to retrieve data with as few I/Os as possible.

If you were to wrap the GETs in a MULTI/EXEC transaction you could retrieve the contents in two operations, but one oft-overlooked feature of the SORT command lets you retrieve the entire structure in a single I/O: the GET clause.

Using SORT and GET is fairly straightforward. We can point the SORT command at lists, sets, and sorted sets and retrieve the contents in a specified order. If we add the GET clause, we can specify a pattern for keys external to the list that we would like to retrieve instead of the sorted list contents; the * in the pattern is replaced by each element in turn. So now we can finally dereference our references!

Using the contrived example above, the magic command would be:

redis> SORT FooBar|list GET FooBar|id|*
1) "One"
2) "Two"
3) "Four"

redis>

Nice! By default, though, SORT orders the elements numerically, which here happens to match the insertion order. What if you want to retrieve the contents of the list in the order they were inserted and bypass the sorting altogether? Redis provides a way to do that: sort by a nonexistent key (have a look at the SORT … BY syntax for more details):

redis> SORT FooBar|list BY nonexistentkey GET FooBar|id|*
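
For comparison, here is a rough Python sketch of both access patterns. It assumes a redis-py style client created with decode_responses=True (so values come back as str), and the function names are made up for illustration. The client is passed in rather than created here, so nothing in the sketch depends on a particular connection setup.

```python
def fetch_one_by_one(client, list_key, key_prefix):
    """Naive pattern: one LRANGE, then one GET per id in the list."""
    ids = client.lrange(list_key, 0, -1)
    return [client.get(key_prefix + item_id) for item_id in ids]


def fetch_in_one_round_trip(client, list_key, key_prefix):
    """SORT ... BY <nonexistent key> GET <prefix>* in a single command.

    Sorting BY a key that does not exist skips the sort, so the values
    come back in the list's own order.
    """
    return client.sort(list_key, by="nonexistentkey", get=key_prefix + "*")


# Hypothetical usage, assuming a local server and the keys from the example:
#   import redis
#   client = redis.Redis(decode_responses=True)
#   fetch_in_one_round_trip(client, "FooBar|list", "FooBar|id|")
```

With the three-item list from the example, the first function issues four commands and the second issues one.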

Hopefully someone else finds this as useful as I do. Sometimes it pays to peruse the docs!

Where do you store your Django signals?

If you’re using Django to build your web application, more often than not there is already a convention for organizing your modules and packages. A typical Django application is structured something like this:

project/
    __init__.py
    app1/
        __init__.py
        forms.py
        models.py
        views.py
        signals.py
        utils.py
        urls.py
    app2/
        ...
    templates/
    settings.py
    urls.py
    ...

And so forth. Overall, this structure is pretty manageable, though minor deviations are common.

If there is significant interaction between your various relational models, you will often run into the requirement of performing an action based on an event related to a model, such as a “save” event. Keeping these actions decoupled from the actual model is a textbook case where the Observer pattern comes in handy. This is exactly what Django’s signals infrastructure aims to provide.
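
To make the Observer idea concrete, here is a toy sketch in plain Python. The Signal class below is a stand-in written for this post, not Django's implementation; it only borrows the connect/send naming. Article, log_save, and audit_log are likewise invented for the example.

```python
class Signal:
    """Toy observer registry; a stand-in, not Django's Signal class."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        # Register each receiver at most once.
        if receiver not in self._receivers:
            self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Call every receiver and collect (receiver, response) pairs.
        return [(r, r(sender=sender, **kwargs)) for r in self._receivers]


post_save = Signal()


class Article:
    def __init__(self, title):
        self.title = title

    def save(self):
        # Persistence would happen here; the model only announces the event.
        post_save.send(sender=Article, instance=self)


audit_log = []


def log_save(sender, instance, **kwargs):
    # The handler lives apart from the model and reacts to its event.
    audit_log.append(f"saved {sender.__name__}: {instance.title}")


post_save.connect(log_save)
Article("Hello").save()
print(audit_log)
```

The model never mentions log_save, which is the decoupling that signals are meant to buy you.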

As shown in the layout above, signal handlers related to an application’s models are typically placed in the application package itself, in a module aptly named “signals”.

The same reasoning that pulls your actions out of model save methods can be applied to your handlers. As your application grows in complexity, your signal handlers tend to start interacting with models from several different applications. In this case it is not completely clear (at least to me) where the signals should live. Because signal logic is by definition decoupled from the models, and a handler module can belong to only one application, there are few clues to tell us which handlers a given model’s event might trigger.

The structure I’m experimenting with right now is to create a single signals package underneath the project package to organize all signals code. I can then add as many modules to the signals package as I like, with informative names like “user_signals” or “sluggable_model_signals”. The file structure of the signals package looks like:

signals/
    __init__.py
    user_signals.py
    sluggable_model_signals.py
    ...

signals/__init__.py is defined as:

""" Module for registering centralized signals """

__all__ = [
    'user_signals',
    'sluggable_model_signals',
    ]


from example.signals import *

The __all__ list tells Python which submodules to load when the package is imported with “import *”, so the star import in __init__.py pulls in every signals module named there. Finally, the example.signals package is added to INSTALLED_APPS, before any of our own applications, to ensure the signals are included only once and early enough to register properly.
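
If you want to convince yourself that the star import really loads the submodules, the following sketch builds a throwaway package in a temporary directory and imports it. The package name demo_signals is hypothetical and stands in for example.signals; the submodules are empty placeholders rather than real handlers.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name; the post's real package is example.signals.
PACKAGE = "demo_signals"
SUBMODULES = ("user_signals", "sluggable_model_signals")

# Same shape as the __init__.py in the post: __all__ plus a star self-import.
INIT_SOURCE = (
    f"__all__ = {list(SUBMODULES)!r}\n"
    f"from {PACKAGE} import *\n"
)


def build_and_import():
    """Write the throwaway package to disk and import it."""
    root = Path(tempfile.mkdtemp())
    package_dir = root / PACKAGE
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(INIT_SOURCE)
    for name in SUBMODULES:
        # Placeholder modules; real ones would connect signal handlers.
        (package_dir / f"{name}.py").write_text("LOADED = True\n")
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    try:
        return importlib.import_module(PACKAGE)
    finally:
        sys.path.remove(str(root))


package = build_and_import()
loaded = sorted(name for name in sys.modules if name.startswith(PACKAGE + "."))
print(loaded)
```

Importing only the package is enough for both submodules to appear in sys.modules, which is the behavior the INSTALLED_APPS entry relies on.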