Recursively walking all of a widget’s descendants in PyGTK

The other day, I had a need to walk through all the descendants of a dialog box in PyGTK, so that I could save the contents of each text-entry field to the appropriate database record. After a bit of poking around in the PyGTK manual and not finding the recursive get_children() function that I wanted, I decided to write my own.

def walk_descendants(root):
    """
    Walk through a tree of this object's children, their own children,
    and so on, yielding each object in depth-first order.
    """
    yield root
    if not hasattr(root, 'get_children'):
        return # No children, so we're done
    children = root.get_children()
    if not children:
        return # No children, so we're done
    for child in children:
        for widget in walk_descendants(child):
            yield widget

I use this function to build a dict listing all the widgets in my dialog box, keyed by their names. Then when I need to do something with the OK button, I can use something like self.widgets['button_OK'] and no matter where it is in the hierarchy, even nested inside several VBoxes and HBoxes, it’s easy to use.
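Building that dict can be sketched roughly as follows. To keep the example runnable without PyGTK, it uses a hypothetical FakeWidget stand-in; with real widgets the assumption is that each one's get_name() returns the name you gave it, and that names are unique (a duplicate name would silently replace the earlier entry).

```python
def walk_descendants(root):
    """Yield root, then each of its descendants, in depth-first order."""
    yield root
    if not hasattr(root, 'get_children'):
        return  # Not a container, so we're done
    for child in root.get_children() or []:
        for widget in walk_descendants(child):
            yield widget


class FakeWidget(object):
    """Hypothetical stand-in for a GTK widget (not part of PyGTK)."""

    def __init__(self, name, children=()):
        self._name = name
        self._children = list(children)

    def get_name(self):
        return self._name

    def get_children(self):
        return self._children


def build_widget_dict(root):
    """Map each widget's name to the widget object itself."""
    return dict((w.get_name(), w) for w in walk_descendants(root))


# A dialog with the OK button nested inside a VBox and an HBox.
ok_button = FakeWidget('button_OK')
dialog = FakeWidget('dialog', [
    FakeWidget('vbox', [
        FakeWidget('hbox', [FakeWidget('entry_name'), ok_button]),
    ]),
])
widgets = build_widget_dict(dialog)
```

After this, widgets['button_OK'] is the button object, however deeply it was nested.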

In case others might find this useful, I hereby release this function into the public domain. Use it however you like.

Learning to create Debian/Ubuntu packages

I’m starting to learn to create .deb packages for Debian or Ubuntu. A quick breadcrumb trail for myself, to remind me of where I’ve found useful information:

Making SVN trust a new root CA certificate

If you’re using Subversion to connect to an HTTPS repository that’s signed by a non-standard root certificate — such as a CACert.org certificate, for example — here’s how to do it on Linux or OS X. (Windows users: sorry, you’re out of luck. I haven’t developed on Windows since 1999, and I don’t ever want to go back. So the only way this post will ever be updated with Windows instructions is if someone else figures out how to do it and leaves a comment.)

  • First, download the certificate you’re interested in, e.g. “wget http://www.cacert.org/certs/class1.crt”. I suggest storing it in /etc/ssl/certs with an appropriate name, such as “cacert-root-ca.crt”. You’ll need to have root privileges (use “sudo”) to get write access to the /etc/ssl/certs directory.
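  • Run “openssl x509 -noout -fingerprint -md5 -in /etc/ssl/certs/cacert-root-ca.crt” and/or “openssl x509 -noout -fingerprint -sha1 -in /etc/ssl/certs/cacert-root-ca.crt” and compare the results against the certificate fingerprint given on the website. (A fingerprint is a hash of the certificate’s binary DER encoding, so hashing a PEM file directly with “openssl md5” or “openssl sha1” won’t match.) The website you’re downloading this certificate from does give you its MD5 and/or SHA1 fingerprints, right? (If not, what the heck are you doing trusting a certificate you haven’t verified?!?)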
  • Run “openssl x509 -text -in /etc/ssl/certs/cacert-root-ca.crt” to verify that the certificate’s data (company name and so on) looks correct.
  • If the above fails, add “-inform der” to the command above: maybe you accidentally downloaded the DER-encoded certificate instead of the PEM-encoded certificate.
  • If you have the DER version, you’ll need to convert it to PEM. Run “sudo openssl x509 -inform der -outform pem -in /etc/ssl/certs/cacert-root-ca.crt -out /etc/ssl/certs/cacert-root-ca.pem”. Note the “sudo” in front of that command: you’re writing to the /etc/ssl/certs directory, so you need to be root.
  • Now that you’ve got a certificate in PEM format and verified it, it’s time to edit your “~/.subversion/servers” file. In the “[global]” section, add the line “ssl-authority-files = /etc/ssl/certs/cacert-root-ca.crt” (or the “.pem” path, if you had to convert from DER). The “ssl-authority-files” option is a semicolon-delimited list, so if you already have something there and are adding the second certificate to it, use a semicolon to separate the two paths. If you’re adding a third certificate to the list, then you should already see the semicolon and be able to figure it out. :-)
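The fingerprint check from the steps above can also be done in Python, if that's handier than the openssl command line. This is a minimal sketch using only the standard library; the function name cert_fingerprint is made up for illustration, and it assumes the file is already PEM-encoded.

```python
import hashlib
import ssl


def cert_fingerprint(pem_text, algorithm="sha1"):
    """Return the colon-separated hex fingerprint of a PEM certificate.

    A fingerprint is a hash of the DER (binary) encoding, so the PEM
    text is decoded to DER before hashing.
    """
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    digest = hashlib.new(algorithm, der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Hypothetical usage, with the path from the steps above:
# with open("/etc/ssl/certs/cacert-root-ca.pem") as f:
#     print(cert_fingerprint(f.read(), "sha1"))
```

Compare the printed value against the fingerprint published on the CA's website, exactly as you would with the openssl output.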

I mostly figured this out from the “SSL Certificate Management” section of the Subversion book. Which I highly recommend reading, BTW.

I hope this helps someone else spend a little less time on Google figuring out how to trust a new root CA.

Interesting ideas for human-computer interaction

There’s a research project at the University of Toronto that’s exploring different ideas for how people interact with computers. Here’s an interesting new way of looking at the “desktop” metaphor. There are some rather clever ideas there.

A step-by-step SQLAlchemy tutorial

SQLAlchemy is a very useful database-access library for Python. It’s got excellent documentation; but what it was missing until recently was a tutorial. I wrote a step-by-step SQLAlchemy tutorial to fill in the gap. Of course, the day after I wrote it, SQLAlchemy’s author posted the tutorial that he’d been working on, so I just duplicated his efforts. :-)

Nevertheless, it might be useful to someone, so I put it up anyway.

PCI ID viewer for Windows

I often end up working on Windows machines that don’t have the right drivers installed for this or that piece of hardware. And since Windows makes it difficult to get at the actual PCI IDs for its devices, all you have to go on is the “Unknown Device” entry in the Device Manager control panel. Thanks, Microsoft, that’s real helpful.

What I need is a tool to list the PCI IDs for all devices, preferably one that’s free. I found such a tool at http://members.datafast.net.au/dft0802/. Craig Hart has written a program called PCI32 (for Win2000, NT, XP, etc) and PCI (for Win 9x/ME) that can list the PCI IDs of all your devices, along with the manufacturer name and model name. Very useful for grabbing exactly the right driver.

Some useful programming links

A few useful programming links gleaned from various sources:

Some light geekery for a change

The original Star Wars trilogy holds a special place in the heart of many geeks. It has grandiose space battles, a rip-roaring plot full of daring escapes and noble sacrifices, and memorable lines (C-3PO: “Sir, the possibility of successfully navigating an asteroid field is approximately 3,720 to 1!” Han: “Never tell me the odds.”) The movie is clearly not meant to be taken as realistic — after all, it’s got planets being blown up by lasers — but it can be great fun to take it at face value.

One article that does exactly that is “Endor Holocaust”, which asks, “What happens when you detonate a spherical metal honeycomb over five hundred miles wide just above the atmosphere of a habitable world? Regardless of specifics, the world won’t remain habitable for long.” The Rebels may have saved the galaxy, but it seems they doomed the Ewoks in the process.

Another such article is “On the Implausibility of the Death Star’s Trash Compactor”. What was that thing doing in the cell block, anyway?

Finally, there’s the ultimate in geekery: the article entitled “How Lightsabers Work”. Filled with handy safety tips (“NEVER point the blade emitter of a lightsaber toward your own body. NEVER look down the ‘barrel’ of a lightsaber, even if you are ‘sure’ it is in safe mode.”), this article also contains photos of other handy uses for a lightsaber, such as hedge trimming or slicing bagels (“The big advantage of using a lightsaber, of course, is that you can both cut and toast the bagel in one stroke.”) A must-read for anyone considering purchasing this handy device.

If you have a particular favorite Star Wars article that I didn’t mention, let me know and I’ll update this post.

You paid HOW MUCH for that?!!?

I couldn’t believe it until I saw it with my own eyes. Oracle — the 800-lb gorilla of database servers, the mighty Oracle — cannot distinguish between an empty string (“”) and a NULL! From the current Oracle documentation, available on-line at their Web site (although you have to register and give them your E-mail address to get access to it):

“Note: Oracle Database currently treats a character value with a length of zero as null. However, this may not continue to be true in future releases, and Oracle recommends that you do not treat empty strings the same as nulls.”

That last sentence sounds promising until you realize that it’s been there since version 7, and Oracle is now at version 10.

From now on, when someone’s extolling the virtues of Oracle to me, I’ll just ask them, “Oh, and have they fixed their empty string == NULL problem yet?”
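For contrast, here is how a database that keeps the two apart behaves. This is a minimal sketch using Python's sqlite3 module; SQLite is used purely for illustration (it is not mentioned in the Oracle documentation), and the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, s TEXT)")
conn.execute("INSERT INTO t (s) VALUES ('')")    # an empty string
conn.execute("INSERT INTO t (s) VALUES (NULL)")  # a genuine NULL

# Each query matches only its own row: the empty string is not NULL.
null_count = conn.execute(
    "SELECT COUNT(*) FROM t WHERE s IS NULL").fetchone()[0]
empty_count = conn.execute(
    "SELECT COUNT(*) FROM t WHERE s = ''").fetchone()[0]
conn.close()
```

Here both counts come out as one. Going by the Oracle note quoted above, the empty string would be stored as NULL there, so the two rows would be indistinguishable.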

Logix: Lisp with Python syntax

New discovery of the day: Logix. Logix is Python, with macros. Logix is Lisp, with Python syntax. Logix is a programming language that lets you create programming languages. Logix is whatever language you need it to be.

If you’re familiar with Lisp, you already grasp how powerful macros can be. If all you’ve used is Python, let’s take a look at a practical example. Let’s say you’re writing a threaded program, and throughout your code is scattered constructions like:

_lock.acquire()
try:
    print "We have the lock, now doing some work"
finally:
    _lock.release()

Wouldn’t it be nice to have a synchronized construct that would do the work for you? You could write it as a function, but then you’d have to pass in the code you want to run, which means turning it into another function or a lambda, and that just creates more boilerplate. With macros, you’d be able to do something like this:

defmacro synchronized(lock, codeblock):
    lock.acquire()
    try:
        codeblock
    finally:
        lock.release()

Some magic would be required to pass in the codeblock, of course.

Well, in Logix, this is how that macro would be written:

defop 0 "synchronized" "(" $lock:expr ")" ":" $code:block
    macro lock code
        lock.acquire()
        `try:
            *code
        `finally:
            lock.release()

If you’re wondering about the backslashes and backquotes, read the Logix tutorial. It will explain what those do, and in the process introduce you to what Logix is capable of (hint: quite a lot!).

What makes Logix so powerful? The same thing that makes Lisp so powerful: the idea that code is data, and data is code. Which means you can pass around chunks of code to functions, save them in variables, and generally do whatever you want with them. I’ve been meaning to learn Lisp for a while now, but I’ve always been put off by its syntax. That’s because the basic unit of Lisp is the list, and lists inside lists inside lists mean Lots of Irritating Superfluous Parentheses :-). And I could never get used to writing addition as (+ 1 2 3) instead of 1 + 2 + 3. But Logix is based on Python, and inherits a lot of Python’s syntax. So instead of parens inside parens, you’ve got blocks based on indentation, and colons at the end of lines signalling that a block is about to start. It’s got the same “clean”-feeling syntax that makes Python so easy to read, but it’s also got the power of Lisp: macros, and the ability to pass code objects (of any kind, not just functions) around. I’m looking forward to playing with it and learning it.

UPDATE, 2005-06-06: The code samples I gave above are for Logix 0.4. 0.5 is going to introduce some changes to the underlying structure (in particular, what “chunks of code” are made of is changing), and so the macro syntax may end up looking a little different. If the code above ends up being wrong, I’ll post an update correcting it.

WSGI demystified

One of the most exciting things to come out of PyCon 2005 was the WSGI spec. WSGI is a standard that specifies a common interface between Web servers and Python Web applications or frameworks.

For a full understanding of the interface there’s no substitute for reading the spec. But in my experience, many people are too busy to read specs. So here’s an executive summary of how WSGI works.

WSGI actually specifies two interfaces, one for the server to talk to the application, and one for the application to talk to the server. For the server to talk to the application, it calls a function (which the application supplies) which should accept two arguments, environ and start_response. The environ argument will be a dictionary containing the HTTP environment (including variables like REQUEST_METHOD and PATH_INFO), and the start_response argument will be a function.

For the application to talk to the server, it should first prepare any headers it wants to send, then call the start_response function that it was handed, with a status string (such as '200 OK') and a list of headers. Then it should prepare the body of the response as a list of strings. (An iterator can be used instead of a list.) To pass the response body back to the server, it should simply return it.

Once the server receives the list (or iterator) from the application, it simply sends those strings, one at a time, to the client (i.e., the user’s Web browser).

Got that? Here’s what it looks like in Python code:

def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type','text/plain')]
    start_response(status, response_headers)
    return ['Hello world!\n']

There you have it — the WSGI interface in a nutshell. The server invokes the application by calling a function, then sends the function’s return value(s), one at a time, back to the client. Pretty simple, eh?
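The server's half of the conversation can be sketched just as briefly. The call_app function below is a made-up name that plays the server's role: it calls the application, records what start_response was given, and collects the body. It is a simplification, not a conforming server: a real one sends each chunk to the client as it arrives, and its start_response returns a write() callable. The app here returns bytes because that is what current Python 3 WSGI expects.

```python
def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    # Body chunks are bytes under Python 3; in 2005 they were plain strings.
    return [b'Hello world!\n']


def call_app(app, environ):
    """Play the server's role: call the app and collect what it returns."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Simplified: a real server would also return a write() callable.
        captured['status'] = status
        captured['headers'] = list(headers)

    body_chunks = []
    result = app(environ, start_response)
    try:
        for chunk in result:  # a real server sends each chunk to the client
            body_chunks.append(chunk)
    finally:
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], captured['headers'], b''.join(body_chunks)


# A bare-bones environ; a real server fills in many more keys.
environ = {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}
status, headers, body = call_app(simple_app, environ)
```

Because call_app iterates over whatever the application returns, it works the same way whether the app returns a list or any other iterable.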

There are a lot of subtleties that I haven’t covered, of course, such as the contents of the environ dictionary, or the fact that the application doesn’t have to be a function, but can be any callable object (such as a class). To grasp those, you’ll have to read the spec. But just understanding the concepts presented here is a good jumping-off point to make the spec a lot less mysterious.

Update, 2005-08-15: There’s a simpler way to write my simple_app example: as a generator. In fact, that’s probably a simpler way to write any WSGI app. Instead of building a list of strings and returning the list, simply use yield statements, thus:

def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type','text/plain')]
    start_response(status, response_headers)
    yield 'Hello world!\n'
    yield 'Second line goes here\n'

Simple, eh?

Long filenames in Windows

Long file names in Windows are a hacked-in kludge, and sometimes it shows. Here’s an example of how, if you’re not careful, you can lose data:

C:\TEMP>mkdir robintemp

C:\TEMP>cd robintemp

C:\TEMP\robintemp>dir
 Volume in drive C has no label.
 Volume Serial Number is C899-6ADC

 Directory of C:\TEMP\robintemp

01/28/2004  02:26p      <DIR>          .
01/28/2004  02:26p      <DIR>          ..
               0 File(s)              0 bytes
               2 Dir(s)  34,996,314,112 bytes free

C:\TEMP\robintemp>echo Hi > longfi~1

C:\TEMP\robintemp>dir
 Volume in drive C has no label.
 Volume Serial Number is C899-6ADC

 Directory of C:\TEMP\robintemp

01/28/2004  02:26p      <DIR>          .
01/28/2004  02:26p      <DIR>          ..
01/28/2004  02:26p                   5 longfi~1
               1 File(s)              5 bytes
               2 Dir(s)  34,996,314,112 bytes free

C:\TEMP\robintemp>dir /x
 Volume in drive C has no label.
 Volume Serial Number is C899-6ADC

 Directory of C:\TEMP\robintemp

01/28/2004  02:26p      <DIR>                          .
01/28/2004  02:26p      <DIR>                          ..
01/28/2004  02:26p                   5                 longfi~1
               1 File(s)              5 bytes
               2 Dir(s)  34,996,314,112 bytes free

C:\TEMP\robintemp>echo Hi > longfilename01

C:\TEMP\robintemp>dir /x
 Volume in drive C has no label.
 Volume Serial Number is C899-6ADC

 Directory of C:\TEMP\robintemp

01/28/2004  02:27p      <DIR>                          .
01/28/2004  02:27p      <DIR>                          ..
01/28/2004  02:27p                   5 LONGFI~2        longfilename01
01/28/2004  02:26p                   5                 longfi~1
               2 File(s)             10 bytes
               2 Dir(s)  34,996,314,112 bytes free

C:\TEMP\robintemp>mkdir ..\foo

C:\TEMP\robintemp>copy *.* ..\foo
longfilename01
longfi~1
Overwrite ..\foo\longfi~1? (Yes/No/All): n
        1 file(s) copied.

C:\TEMP\robintemp>

See what happened? The file longfilename01 was copied first, thus creating the short filename longfi~1 according to Microsoft’s filename-creation rules. But the next step was to copy the file longfi~1 from the source directory, and oops! It already exists in the destination!

What should have happened here is that the file longfilename01 should have kept its short file name longfi~2 when it got copied; then there wouldn’t have been a filename collision.

And lest you think this is a contrived example: I ran into exactly this problem about a year and a half ago when trying to do some backups. The solution I came up with was to hack together a quick script that would first copy all the 8.3 filenames, and second copy all the long file names. Then I could be sure that the shortnames chosen by Windows wouldn’t accidentally conflict with a file that was going to get copied over later.
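The original script isn't shown here, but the idea can be sketched in Python along these lines. Everything below is an assumption made for illustration: the function names are invented, and the regular expression is only a rough test for "fits in 8.3", not Microsoft's exact rule set.

```python
import os
import re
import shutil

# Rough test for an 8.3-style name: up to 8 characters, then an optional
# dot and up to 3 more. An approximation, not the exact Windows rules.
_SHORT_NAME = re.compile(r'^[^.\s]{1,8}(\.[^.\s]{1,3})?$')


def looks_like_short_name(name):
    """Return True if name would fit in an 8.3 directory entry."""
    return _SHORT_NAME.match(name) is not None


def copy_order(names):
    """Return names with 8.3-style names first and long names after."""
    return sorted(names,
                  key=lambda n: (not looks_like_short_name(n), n.lower()))


def copy_short_names_first(src_dir, dst_dir):
    """Copy the regular files in src_dir to dst_dir, short names first."""
    names = [n for n in os.listdir(src_dir)
             if os.path.isfile(os.path.join(src_dir, n))]
    for name in copy_order(names):
        shutil.copy2(os.path.join(src_dir, name),
                     os.path.join(dst_dir, name))
```

With the two files from the transcript, copy_order puts longfi~1 ahead of longfilename01, so the real longfi~1 is already in place before Windows has to pick a short name for the long one.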

I hope this spares someone some pain someday.

SQLite embedded in Tiger

In the middle of Apple’s Web site, in a discussion of their new Core Data framework, comes the following quote:

SQLite is an open source embedded database that is included in Tiger [...]

This makes me happy. SQLite is one of the most useful embedded-database solutions I’ve come across. It’s open-source, has Python bindings, and works with the very cool SQLObject module.

Vim Outliner

I’ve just discovered a very useful outliner for Vim. I’d never really been happy with the todo list management programs I’d tried before — well, now I can use my favorite text editor to track to-do lists and other miscellaneous items for Getting Things Done. I may have to write a longer post about it once I’ve used it for a little while.