17 June 2006

Python Macros

OK, I should post this on comp.lang.python, but I wanted to think it through first by posting it somewhere nobody's going to read. Also, I wanted something techy on my blog.

For the last 300 years or so, Python fans with Lisp envy have been arguing that Python needs macros, while Python fans with lispophobia have been arguing that Python should avoid macros like the plague.

The problem is simple: How do we make macros usable (as opposed to the C preprocessor) and implementable, while still being Pythonic?

The only answer anyone ever comes up with is "sort of Lisp-like" or "sort of Scheme-like" or "anything but the C preprocessor," or at best, "sort of Lisp-like but with Python doc strings." Without a good proposal, Guido tells everyone to shut up, and Python stays macro-free.

I think the answer has to come from Dylan. Dylan has a macro system that is both hygienic (like Scheme) and powerful (unlike Scheme). But Dylan isn't a syntaxless Lisp derivative; it's a language with a quasi-Algol syntax (like Python). And the macros preserve the syntax. Pretty cool. The classic article is D-Expressions: Lisp Power, Dylan Style (Jonathan Bachrach and Keith Playford, 1999).* (Watch out, that's a PDF.) You can also read The Dylan Reference Manual.**

I'm not going to explain Dylan macros in detail. There are three secrets.

(1) Pattern-matching rewrite rules are sufficiently powerful for almost all purposes. Full-scale code-generating procedures break syntax, but fortunately, they're unnecessary. While there are a few annoying limitations to Dylan macros, they have to do with the implementation, not the concept.

(2) Algol-like languages have useful Skeleton Syntax Trees (SSTs), just like Lisp. Macros based on raw or tokenized text either break syntax, or are too restricted (like cpp). Macros based on ASTs are useless unless you can modify the AST parser on the fly. With macros based on SSTs, a single one-time change to the parser is needed to allow macros, but then you're done.

(3) Hygienic macros need an escape clause. One reason that Lisp and C macros are dangerous is that they can accidentally inject names into the scope of the calling code, possibly overriding the caller's names. One reason that Scheme macros suck is that there is no way to inject names into the calling scope. Dylan solves this by making new names hygienic by default, but providing the "?=" form to circumvent this.
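To make the name-capture problem in (3) concrete, here is a rough Python sketch. It fakes macros as functions that return source text, which is exactly the kind of text-level expansion that (2) warns against, but it keeps the example short. The names gensym, swap_unhygienic, swap_hygienic and run are all made up for illustration; a real hygienic system would rename identifiers in the syntax tree rather than glue strings together.

```python
import itertools

_counter = itertools.count()


def gensym(base):
    """Return a fresh name that caller code is unlikely to contain."""
    return f"_{base}_{next(_counter)}"


def swap_unhygienic(a, b):
    # The template always uses the helper name "tmp", so it collides
    # with any caller variable that is also called "tmp".
    return f"tmp = {a}\n{a} = {b}\n{b} = tmp\n"


def swap_hygienic(a, b):
    # The helper gets a fresh name on every expansion.
    tmp = gensym("tmp")
    return f"{tmp} = {a}\n{a} = {b}\n{b} = {tmp}\n"


def run(code, **env):
    """Execute expanded code against the caller's variables."""
    exec(code, env)
    return env


# The caller happens to own a variable named "tmp".
bad = run(swap_unhygienic("tmp", "other"), tmp=1, other=2)
good = run(swap_hygienic("tmp", "other"), tmp=1, other=2)

print(bad["tmp"], bad["other"])    # swap silently fails
print(good["tmp"], good["other"])  # swap works
```

The unhygienic version breaks as soon as the caller has a variable called tmp; the hygienic one doesn't care what the caller's names are. An escape hatch like "?=" roughly amounts to skipping the renaming for a name the macro author deliberately wants the caller to see.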

Here's an example of a Dylan macro:

define macro with-open-file
  { with-open-file (?stream:name, ?options:*) ?:body end }
    =>
  { let ?stream = #f;
    block()
      ?stream := make(, ?options);
      ?body
    cleanup
      ?stream & close(?stream)
    end block }
end macro

with-open-file(stream, locator: "phonenumbers")
  process-phone-numbers(stream);
end with-open-file;

Here's what the same thing could look like in Python:

defmacro withfile:
    """withfile(stream, *options, **kw) -> Open a file for use within the enclosing scope"""
    withfile(?stream:name, ?options:*, ?kw:**):
        ?:body
    =>
        ?stream = file(*?options, **?kw)
        if (?stream):
            try:
                ?body
            finally:
                ?stream.close()

withfile(stream, name="phonenumbers"):
    processPhoneNumbers(stream)

Of course this is a pretty bad example for Python--but it's short.
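For reference, here is roughly what the withfile call would turn into after expansion, written out by hand as ordinary Python. This is a sketch under a few assumptions: open() stands in for the old file() builtin, processPhoneNumbers is a hypothetical stand-in that just returns the non-empty lines, the result is stored in numbers only so there is something to look at, and the temp file exists only so the code runs anywhere.

```python
import os
import tempfile


def processPhoneNumbers(stream):
    """Hypothetical stand-in: return the non-empty lines of the file."""
    return [line.strip() for line in stream if line.strip()]


# Create a throwaway file so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "phonenumbers")
with open(path, "w") as f:
    f.write("555-0100\n555-0199\n")

# Hand-expanded form of:
#     withfile(stream, name="phonenumbers"):
#         processPhoneNumbers(stream)
stream = open(path)  # assumption: open() replaces file()
if stream:
    try:
        numbers = processPhoneNumbers(stream)
    finally:
        # Runs whether or not the body raised.
        stream.close()

print(numbers)
```

Note that stream is still bound (and closed) after the block. That is deliberate: the name comes from the caller through ?stream:name, so hygiene has no reason to hide it.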

Anyway, there are a few obvious differences. In Python, indentation is part of the SST. There are docstrings. The list of matchable forms is somewhat different--in particular, "*" matches an argument list, rather than arbitrary syntax. (We might need a form to match arbitrary syntax, but the "*" and "**" forms should obviously keep their usual Python meaning. Maybe "***"?)

Why do macros need names? Not just for debugging, but so you can undefine them with "del withfile" the same way you can with functions.

I haven't written an implementation. The obvious first step is to build a preprocessor that converts Python source with macros into macro-expanded standard Python source. But I'm not sure that's sufficient to show off the idea, because it doesn't show how macros will integrate with the parser, or how macros can be used in interactive code.
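Just to show the shape of that first step, here is a toy preprocessor, and I do mean toy. It hard-codes the one withfile rewrite instead of reading macro definitions, works on lines of text rather than an SST, assumes the header fits on one line, and doesn't handle nesting. The names HEADER, indent_of and expand_withfile are invented for this sketch, and open() stands in for file().

```python
import os
import re
import tempfile

# Assumed header shape: "<indent>withfile(<name>, <args>):" on one line.
HEADER = re.compile(r"^( *)withfile\((\w+)\s*,\s*(.*)\):\s*$")


def indent_of(line):
    """Count the leading spaces of a line."""
    return len(line) - len(line.lstrip(" "))


def expand_withfile(source):
    """Rewrite each withfile block in source as plain try/finally code."""
    lines = source.splitlines()
    out = []
    i = 0
    while i < len(lines):
        match = HEADER.match(lines[i])
        if match is None:
            out.append(lines[i])
            i += 1
            continue
        pad, stream, args = match.groups()
        i += 1
        # The body is every following line that is blank or indented
        # more deeply than the header.
        body = []
        while i < len(lines) and (
            not lines[i].strip() or indent_of(lines[i]) > len(pad)
        ):
            body.append(lines[i])
            i += 1
        out.append(f"{pad}{stream} = open({args})")
        out.append(f"{pad}if {stream}:")
        out.append(f"{pad}    try:")
        # Push the body two levels deeper so it sits under "try:".
        out.extend("        " + line if line.strip() else line for line in body)
        out.append(f"{pad}    finally:")
        out.append(f"{pad}        {stream}.close()")
    return "\n".join(out) + "\n"


# Demo: expand a tiny program, then run the standard Python that results.
path = os.path.join(tempfile.mkdtemp(), "phonenumbers")
with open(path, "w") as f:
    f.write("555-0100\n")

source = "withfile(stream, path):\n    data = stream.read()\n"
expanded = expand_withfile(source)
namespace = {"path": path}
exec(expanded, namespace)
print(expanded)
```

This is enough to see macro-expanded source come out the other end, but it says nothing about the parser or the interactive prompt, which is the part I'm actually worried about.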

* Look, an actual blog-style link in my blog!
** And another one!
