Python: is using "..%(var)s.." % locals() a good practice?


Question

I discovered this pattern (or anti-pattern) and I am very happy with it.

I feel it is very agile:

def example():
    age = ...
    name = ...
    print "hello %(name)s you are %(age)s years old" % locals()

Sometimes I use its cousin:

def example2(obj):
    print "The file at %(path)s has %(length)s bytes" % obj.__dict__

I don't need to create an artificial tuple and count parameters and keep the %s matching positions inside the tuple.

Do you like it? Do/Would you use it? Yes/No, please explain.

1
70
10/11/2009 11:36:38 AM

Accepted Answer

It's OK for small applications and allegedly "one-off" scripts, especially with the vars enhancement mentioned by @kaizer.se and the .format version mentioned by @RedGlyph.

However, for large applications with a long maintenance life and many maintainers this practice can lead to maintenance headaches, and I think that's where @S.Lott's answer is coming from. Let me explain some of the issues involved, as they may not be obvious to anybody who doesn't have the scars from developing and maintaining large applications (or reusable components for such beasts).

In a "serious" application, you would not have your format string hard-coded -- or, if you had, it would be in some form such as _('Hello {name}.'), where the _ comes from gettext or similar i18n / L10n frameworks. The point is that such an application (or reusable modules that can happen to be used in such applications) must support internationalization (AKA i18n) and locatization (AKA L10n): you want your application to be able to emit "Hello Paul" in certain countries and cultures, "Hola Paul" in some others, "Ciao Paul" in others yet, and so forth. So, the format string gets more or less automatically substituted with another at runtime, depending on the current localization settings; instead of being hardcoded, it lives in some sort of database. For all intents and purposes, imagine that format string always being a variable, not a string literal.

So, what you have is essentially

formatstring.format(**locals())

and you can't trivially check exactly what local names the formatting is going to be using. You'd have to open and peruse the L10N database, identify the format strings that are going to be used here in different settings, verify all of them.

So in practice you don't know what local names are going to get used -- which horribly crimps the maintenance of the function. You dare not rename or remove any local variable, as it might horribly break the user experience for users with some (to you) obscure combinaton of language, locale and preferences

If you have superb integration / regression testing, the breakage will be caught before the beta release -- but QA will scream at you and the release will be delayed... and, let's be honest, while aiming for 100% coverage with unit tests is reasonable, it really isn't with integration tests, once you consider the combinatorial explosion of settings [[for L10N and for many more reasons]] and supported versions of all dependencies. So, you just don't blithely go around risking breakages because "they'll be caught in QA" (if you do, you may not last long in an environment that develops large apps or reusable components;-).

So, in practice, you'll never remove the "name" local variable even though the User Experience folks have long switched that greeting to a more appropriate "Welcome, Dread Overlord!" (and suitably L10n'ed versions thereof). All because you went for locals()...

So you're accumulating cruft because of the way you've crimped your ability to maintain and edit your code -- and maybe that "name" local variable only exists because it's been fetched from a DB or the like, so keeping it (or some other local) around is not just cruft, it's reducing your performance too. Is the surface convenience of locals() worth that?-)

But wait, there's worse! Among the many useful services a lint-like program (like, for example, pylint) can do for you, is to warn you about unused local variables (wish it could do it for unused globals as well, but, for reusable components, that's just a tad too hard;-). This way you'll catch most occasional misspellings such as if ...: nmae = ... very rapidly and cheaply, rather than by seeing a unit-test break and doing sleuth work to find out why it broke (you do have obsessive, pervasive unit tests that would catch this eventually, right?-) -- lint will tell you about an unused local variable nmae and you will immediately fix it.

But if you have in your code a blah.format(**locals()), or equivalently a blah % locals()... you're SOL, pal!-) How is poor lint going to know whether nmae is in fact an unused variable, or actually it does get used by whatever external function or method you're passing locals() to? It can't -- either it's going to warn anyway (causing a "cry wolf" effect that eventually leads you to ignore or disable such warnings), or it's never going to warn (with the same final effect: no warnings;-).

Compare this to the "explicit is better than implicit" alternative...:

blah.format(name=name)

There -- none of the maintenance, performance, and am-I-hampering-lint worries, applies any more; bliss! You make it immediately clear to everybody concerned (lint included;-) exactly what local variables are being used, and exactly for what purposes.

I could go on, but I think this post is already pretty long;-).

So, summarizing: "γνῶθι σεαυτόν!" Hmm, I mean, "know thyself!". And by "thyself" I actually mean "the purpose and scope of your code". If it's a 1-off-or-thereabouts thingy, never going to be i18n'd and L10n'd, will hardly need future maintenance, will never be reused in a broader context, etc, etc, then go ahead and use locals() for its small but neat convenience; if you know otherwise, or even if you're not entirely certain, err on the side of caution, and make things more explicit -- suffer the small inconvenience of spelling out exactly what you're going, and enjoy all the resulting advantages.

BTW, this is just one of the examples where Python is striving to support both "small, one-off, exploratory, maybe interactive" programming (by allowing and supporting risky conveniences that extend well beyond locals() -- think of import *, eval, exec, and several other ways you can mush up namespaces and risk maintenance impacts for the sake of convenience), as well as "large, reusable, enterprise-y" apps and components. It can do a pretty good job at both, but only if you "know thyself" and avoid using the "convenience" parts except when you're absolutely certain you can in fact afford them. More often than not, the key consideration is, "what does this do to my namespaces, and awareness of their formation and use by the compiler, lint &c, human readers and maintainers, and so on?".

Remember, "Namespaces are one honking great idea -- let's do more of those!" is how the Zen of Python concludes... but Python, as a "language for consenting adults", lets you define the boundaries of what that implies, as a consequence of your development environment, targets, and practices. Use this power responsibly!-)

87
10/11/2009 5:14:09 PM

Regarding the "cousin", instead of obj.__dict__, it looks a lot better with new string formatting:

def example2(obj):
    print "The file at {o.path} has {o.length} bytes".format(o=obj)

I use this a lot for repr methods, e.g.

def __repr__(self):
    return "{s.time}/{s.place}/{s.warning}".format(s=self)

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon