Countdown to Backpedaling Widget

Over on the right, in the sidebar, you should see a countdown clock entitled “Countdown to Backpedaling”. (If not, then something went wrong.)

If you’ve been listening to Ask an Atheist, then you should recognize this as a widget version of the Countdown to Backpedaling clock. And if not, then you should definitely be listening to them. Because they’re cool.

At any rate, it’s a clock that counts down to May 22, the day after Jesus’ return and the Day of Judgment, when the backpedaling and excuses begin.

So anyway, now you want to know a) where to download this, b) how to install it, and c) how to complain to me about all the problems you’ve had with (a) and (b).

Download

The main download page is .

If you’re using WordPress, you can download , put it in your wp-content/plugins directory, and with any luck, a “Countdown to Backpedaling” widget should magically show up in your “Widgets” control panel. You can then drag it into position, and it should work.

If you’re using some other software, you’ll want . Installation depends on what you’re using, of course, but you should be able to insert it anywhere that takes HTML.

Configuration

The main configuration option is the “show_secs” variable at the top. If you want to see seconds in the countdown, set it to true. If you find the seconds’ flashing annoying, set it to false.

You can also look through the CSS part, and edit as you please. You might need to change the width.

I might improve on this, if time permits and I don’t get raptured before getting around to it.

If you have any comments or complaints, leave a comment. Bug reports accompanied by a rum and Coke will get higher priority. Bug reports accompanied by a patch will get even higher priority.

Bourne Shell Introspection

So I was thinking about how to refactor our custom Linux and Solaris init scripts at work. The way FreeBSD does it is to have the scripts in /etc/rc.d define variables with the commands to execute, e.g.,

start_cmd='/usr/sbin/foobard'
stop_cmd='kill `cat /var/run/foobar.pid`'

run_rc_command "$1"

where $1 is “start”, “stop”, or whatever, and run_rc_command is a function loaded from an external file. It can check whether $stop_cmd is defined, and if not, take some default action.

This is great and all, but I was wondering whether it would be possible to check whether a given shell function exists. That way, a common file could implement a generic structure for starting and stopping daemons, and the daemon-specific file could just set the specifics by defining do_start and do_stop functions.

The way to do this in Perl is to iterate over the symbol table of the package you’re looking for, and see whether each entry is a function. The symbol table for Foo::Bar is %Foo::Bar::; for the main package, it’s %::. Thus:

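while (my ($k, $v) = each %::)
{
	# Each symbol table entry is a glob; its CODE slot is
	# only filled in if a function by that name exists.
	if (defined(*{$v}{CODE}))
	{
		print "$k is a function\n";
	}
}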

sub test_x() {}
sub test_y() {}
sub test_z() {}

But I didn’t know how to do it in the Bourne shell.

Enter type, which tells you exactly that:

#!/bin/sh

# List of all known commands
STD_CMDS="start stop restart status verify"
MORE_CMDS="graceful something_incredibly_daemon_specific"

do_start="This is a string, not a function"

do_restart() {
	echo "I ought to restart something"
}

do_graceful() {
	echo "I am so fucking graceful"
}

for cmd in ${STD_CMDS} ${MORE_CMDS}; do
	if type "do_$cmd" >/dev/null 2>&1; then
		echo "* do_$cmd is defined"
	else
		echo "- do_$cmd is not defined"
	fi
done

And yes, this works not just in bash, but in the traditional, bourne-just-once shell, on every platform that I care about.

So yay, it turns out that the Bourne shell has more introspection than
I thought.
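
To tie this back to the init-script idea, here’s a rough sketch of what the common file might look like: call do_<cmd> if the daemon-specific file defined one, and fall back to a default otherwise. The names run_cmd and default_action are made up for this example, as is the foobard restart message; none of this comes from FreeBSD’s rc framework.

```shell
#!/bin/sh
# Sketch of a generic dispatcher. run_cmd and default_action are
# hypothetical names, not part of any existing rc framework.

# Fallback used when the daemon-specific file defines no do_<cmd>.
default_action() {
	echo "no handler for '$1'; doing nothing"
	return 1
}

# Call do_<cmd> if something by that name is callable, else fall back.
run_cmd() {
	cmd=$1
	if type "do_$cmd" >/dev/null 2>&1; then
		"do_$cmd"
	else
		default_action "$cmd"
	fi
}

# Daemon-specific part: only "restart" is customized here.
do_restart() {
	echo "restarting foobard"
}

run_cmd restart
run_cmd status || echo "(status fell through to the default)"
```

One caveat: type also succeeds for an executable in $PATH that happens to be called do_status, so pick a prefix that’s unlikely to collide with real commands.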

Pre-Compressing Web Content

This was definitely a “D’oh!” type of problem.

One thing I’d been meaning to figure out for a while was how to send gzip-compressed files to a browser. That is, if I have a large HTML file, it’d be nice if the server could compress it to save bandwidth and transmission time. Yes, Apache has mod_deflate, which takes foo.html and gzips it on the fly, setting all the appropriate HTTP headers. But for static content, I should just be able to compress the file in advance. If the browser asked for foo.html, I wanted Apache to see that there’s a foo.html.gz and send that instead, with headers saying that it’s a text/html file that happens to be compressed.

mod_mime seemed like just the thing: just add

AddEncoding x-gzip .gz

to .htaccess. But every time I did that, Apache sent back “Content-Type: application/x-gzip”, so my browser treated it as a random file of unknown type that happened to be compressed.

Then I noticed that my vanilla-ish site-wide Apache config had

AddType application/x-gzip .gz .tgz

so that when Apache saw foo.html.gz, it ignored the .html extension, and saw only the .gz one.

The fix was to add RemoveType to my .htaccess:

RemoveType .gz
AddEncoding x-gzip .gz

And voilà! .gz stops being a file type and becomes an encoding, allowing .html to shine through.

I’ll add that this plays nice with AddLanguage as well. In my test setup, I have foo.html.en.gz, for which Apache returns the headers

Content-Type: text/html
Content-Encoding: x-gzip
Content-Language: en

I.e., it’s an HTML file, it’s gzip-encoded, and it’s in English.

Just as importantly, this works with other file types (e.g., CSS files and JavaScript scripts), and XMLHttpRequest does the Right Thing with them on all of the browsers I care about.
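
For the “compress the file in advance” part, something along these lines should do. It’s only a sketch: precompress_dir is a hypothetical helper name, and the list of extensions is just the file types mentioned above.

```shell
#!/bin/sh
# Sketch: write a .gz copy next to each static file, keeping the original.
# precompress_dir is a hypothetical helper name.

precompress_dir() {
	dir=$1
	find "$dir" -type f \( -name '*.html' -o -name '*.css' -o -name '*.js' \) |
	while IFS= read -r f; do
		# gzip -c writes to stdout, so the uncompressed file is left alone.
		gzip -9 -c "$f" > "$f.gz"
	done
}

# Usage: precompress_dir /path/to/htdocs
```

Whether you keep or delete the uncompressed copies depends on how your server picks between the two, which is outside the scope of this sketch. It also assumes there are no newlines in your file names.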

Evil Hack of the Day

MacOS plist XML files are evil; even more so than regular XML. For instance, my iTunes library file consists mostly of entries like:

<key>5436</key>
<dict>
	<key>Track ID</key><integer>5436</integer>
	<key>Name</key><string>Getting Better</string>
	<key>Artist</key><string>The Beatles</string>
	<key>Composer</key><string>Paul McCartney/John Lennon</string>
	<key>Album</key><string>Sgt. Pepper's Lonely Hearts Club Band</string>
	…
</dict>

You’ll notice that there’s no connection between a key and its value, other than proximity. There’s no real indication that these are fields in a data record, and unlike most XML files, you have to consider the position of each element compared to its neighbors. It’s almost as if someone took a file of the form

Track_ID = 5436
Name = "Getting Better"
Artist = "The Beatles"
Composer = "Paul McCartney/John Lennon"

and, when told to convert it to XML in the name of buzzword-compliance, did a simple and quarter-assed search and replace.

But of course, what was fucked up by (lossless) text substitution can be unfucked by text substitution. And what’s the buzzword-compliant tool for doing text substitution on XML, even crappy XML? XSLT, of course. The template language that combines the power of sed with the terseness of COBOL.

So I hacked up an XSLT template to convert my iTunes library into a file that can be required in a Perl script. Feel free to use it in good or ill health. If you spring it on unsuspecting developers, please send me a photo of their reaction.

Monthly Reports with Org-Mode

Like a lot of people, I have to submit a monthly “bullet” report, listing the things I’ve done in the previous month.

Since I use Org-Mode as my planning, scheduling, and organizing tool (or rather: I tend to throw a bunch of notes into a file and tell this love child of a day planner and a wiki to tell me what I should do next), I figured I should use that.

I could use the timeline feature (C-c a L), but that only works for the current buffer, and I want a report that covers all buffers, just like the agenda.

What I’ve done in the past is to use C-c a a to get the agenda view, go back a month, toggle displaying completed/archived/whatever items, and go through that to make my bullet list.

But I finally got around to encapsulating that into a single M-x bullets command:

; Make it easier to generate bullets for $BOSS
(defvar bullet-entry-types
  '(:closed)
  "Org-mode agenda types that we want to see in the monthly bullet report
See `org-agenda-entry-types'."
  )

(defun bullets ()
  "Show a list of achievements for the past month, for monthly reports.
Uses `org-agenda'.
"
  (interactive)
  (require 'org-agenda)
  ; All we're doing here, really, is calling `org-agenda' with
  ; arguments giving a start date and a number of days. But to do
  ; that, we need to figure out
  ; - the date of the first of last month
  ; - the number of days in last month
  (let* ((now (current-time))
	 ; Figure out when last month was. Assuming that I run this
	 ; close to the beginning of a month, then `now' minus two
	 ; weeks was some time in the previous month. We can use that
	 ; to extract the year and month that we're interested in.
	 (2weeks-ago
	  (time-subtract now
			 (days-to-time 14)))
	 ; We'll also need to know when the first of this month was,
	 ; to find out how long last month was. If today is the 12th
	 ; of the month, then the first of the month was `now' minus
	 ; 11 days.
	 (1st-of-this-month
	  (time-subtract now
			 (days-to-time
			  (- (nth 3 (decode-time now))
			     1))))
	 ; Ditto to find the first of last month.
	 (1st-of-last-month
	  (time-subtract 2weeks-ago
			 (days-to-time
			  (- (nth 3 (decode-time 2weeks-ago))
			     1))))
	 ; The length of last month is the difference (in days)
	 ; between the first of last month, and the first of this
	 ; month.
	 (len-last-month
	  (time-to-number-of-days
	   (time-subtract 1st-of-this-month
			  1st-of-last-month)))
	 (start-date (decode-time 1st-of-last-month))
	 (start-year (nth 5 start-date))	; Year number
	 (start-mon (nth 4 start-date))		; Month number
	 ; Restrict the agenda to only those types of entries we're
	 ; interested in. I think this takes advantage of dynamic
	 ; scoping, which is normally an abomination unto the lord,
	 ; but is useful here.
	 (org-agenda-entry-types bullet-entry-types)
	 )
    ; Create an agenda with the stuff we've prepared above
    (org-agenda-list nil
		     (format "%04d-%02d-01"
			     start-year
			     start-mon)
		     len-last-month)
    ))

I hope this proves useful to someone.

Network Problems Fixed?

As far as I can tell, FreeBSD 8 tickled something in the driver for my ethernet card, and caused it to behave unreliably. Rather than muck around with half-tested kernel patches or ifconfig settings, I slapped in a $30 Whatevertheyhadontheshelf-3000 (read: common chipset that’s been debugged by a lot of people), and as far as I can tell, things are now working as they should. If the site stays up for a year, I guess we’ll know.

I also took the opportunity to add some memory. So whoo-hoo all around.

And while I’m at it, I should point out that FreeBSD is like a VW Bug: not the prettiest thing to look at, especially compared to various Apple or Linux offerings, but in a crunch it’s nigh-indestructible. Wanna run with a root partition that’s over 100% full? Sure thing. Boot a 7.2 kernel with an 8.0 /usr? No problem.

Quick and Dirty Perl Hack: Is Foo::Bar Installed?

Every so often, I need to find out whether I have a certain Perl module installed. Usually it’s either because of a security alert, or because I’m wondering how much of a pain it would be to install some package that has Some::Obscure::Module as a prerequisite.

I don’t know how y’all do it, what with the plethora of package-management utilities out there, but one way that works for sure is simply:

perl -MSome::Module -e ''

If this command succeeds, that means Perl successfully loaded Some::Module, then executed the (empty) script, printing nothing. If Some::Module is missing, it’ll print an error message and fail.

This is short enough that it should be aliased, but I haven’t gotten around to that yet.
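
A plain alias can’t put its argument in the middle of a command, so a shell function is probably the way to go. Here’s a minimal sketch; has_perl_module is a name I made up, and note that it throws away Perl’s error message, so all you get back is the exit status.

```shell
#!/bin/sh
# has_perl_module is a hypothetical name; put it in your shell startup file.
# Succeeds (exit status 0) if perl can load the named module.
has_perl_module() {
	perl -M"$1" -e '' 2>/dev/null
}

# Example:
if has_perl_module Some::Obscure::Module; then
	echo "installed"
else
	echo "missing"
fi
```

Drop the 2>/dev/null if you’d rather see why the module failed to load.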

Just A Little Bit of Planning

One thing I’ve noticed about my code is that an awful lot of the
comments are of the form

call_some_function();
	// XXX - Error-checking

(where XXX is an easily-grepped marker for something that needs to be
fixed.)

The proximate reason for this accumulation of “do something smart if
something goes wrong” to-do items is that a lot of the time, the
function in which this appears doesn’t have a mechanism for reporting
errors, so I don’t know what to do if I detect a run-time error.

This leads to the other big problem, the one where I’m calling another
function of mine, which doesn’t report errors, so I can’t even tell if
something went wrong.

So if a() calls b() which calls c(), all three are likely to have
XXX - Error-checking comments; but c() doesn’t know what to do in case
of an error, and b() and a() don’t even know how to detect errors. And
so the XXX comments accumulate.

For me, this is often caused by the fact that experimental programming
and production programming are quite different: when I’m learning a
new system, such as a new graphics or math library, I want to figure
out which functions I’m supposed to call to get the results I want,
what the various data structures do, which one of multiple approaches
to the problem works best, and so forth.

If I set up a test environment (e.g., a database server so I can play
around with database-manipulation code), I’ll be keeping an eye on it
to make sure that everything is sane, so my experimental code needn’t
worry about checking whether the server is up. And if it can’t make a
connection, it’ll likely dump core; but since I don’t have any
precious data or users who’ll yell at me if things go wrong, it
doesn’t matter. It’s best to just set up a working environment and
hammer at the code until it works. I tend to accumulate a large number
of ad hoc modules, functions, and data files with names like foo, bar,
foo2, and so forth.

In production, of course, this isn’t good enough. All sorts of things
can go wrong: servers go down, network connections get broken, runaway
processes suck up all available memory, users try to open nonexistent
files, viruses try to overflow buffers, and so forth. Code needs not
only to detect errors, but deal with them as gracefully as possible.

But if I’m working on a larger project, and working on adding new
functionality that I’m not familiar with, the temptation is strong to
take the “learning” code that’s been hammered into some kind of shape,
and plop it in the middle of existing production code.

Neither approach is inherently wrong. Each is appropriate in certain
contexts. You want to keep things loose and fluid and unstructured
while learning, because by definition you don’t know what’s going on and
what’s best. And you want to have things organized, structured, and
regimented in production, to make it easier to avoid and find bugs, to
ensure code quality and stability.

But this difference does mean that it can be hard to integrate new
code into production code.

Often, test code is so messy that there’s really no choice but to do a
complete rewrite. Neater test code is worse in this regard, since it
may trick you into thinking that with just a little bit of cleaning
up, you can use it in production.

So it’s important, when moving from test to production code, to ask
oneself how errors should be reported. This takes a bit more planning
ahead of time, but like security, it’s easier to build it in from the
start than to retrofit it onto existing code.

The thing that works for me is:

  1. Think of a function to add.
  2. Write a comment describing the function: what it does, which
    arguments it takes, what value(s) it returns, and what it does in case
    of error.
  3. Write the function.

In that order.

This may be more structured than you want in the playing-around phase,
in which case you should definitely consider using it during the
hammering-into-shape phase, after you have the basic functionality
working, and before you’ve started moving test code into production.
Over time, though, this kind of forethought may become automatic
enough that it doesn’t get much in the way of experimentation.
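
Here’s roughly what that looks like for the a() calls b() calls c() chain from earlier, sketched as shell functions. The names read_config, load_settings, and start_service are hypothetical stand-ins for c(), b(), and a(); the point is only that each comment says what happens on error, and each caller checks for it.

```shell
#!/bin/sh
# Sketch only: read_config, load_settings and start_service are
# hypothetical names standing in for c(), b() and a().

# read_config FILE
#   Prints the contents of FILE on stdout.
#   Returns 0 on success, 1 if FILE is missing or unreadable.
read_config() {
	if [ ! -r "$1" ]; then
		echo "read_config: cannot read $1" >&2
		return 1
	fi
	cat "$1"
}

# load_settings FILE
#   Prints the lines of FILE that are not comments.
#   Returns 0 on success, 2 if FILE could not be read.
load_settings() {
	contents=$(read_config "$1") || return 2
	printf '%s\n' "$contents" | sed '/^#/d'
}

# start_service FILE
#   Returns 0 if the settings loaded, 1 otherwise (after reporting why).
start_service() {
	if ! settings=$(load_settings "$1"); then
		echo "start_service: giving up, no usable settings" >&2
		return 1
	fi
	echo "starting with: $settings"
}

# Usage: start_service /path/to/some.conf || exit 1
```

Because the lowest-level function reports failure through its return status, the two above it have something to detect, and there’s no XXX comment left to write.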

Don’t Put Information in Two Places

While playing around with a Perl script to look up stock quotes, I
kept getting warning messages about uninitialized values, as well as
missing data in the results.

I eventually tracked it down to a bug in an old version of the
Finance::Quote Perl module, specifically to these lines:

# Yahoo uses encodes the desired fields as 1-2 character strings
# in the URL.  These are recorded below, along with their corresponding
# field names.

@FIELDS = qw/symbol name last time date net p_change volume bid ask
             close open day_range year_range eps pe div_date div div_yield
	     cap ex_div avg_vol currency/;

@FIELD_ENCODING = qw/s n l1 d1 t1 c1 p2 v b a p o m w e r r1 d y j1 q a2 c4/;

Basically, to look up a stock price at Yahoo! Finance, you fetch a URL
with a parameter that specifies the data you want to retrieve: s for
the ticker symbol (e.g., AMZN), n for the company name (“Amazon.com,
Inc.”), and so forth.

The @FIELDS array lists convenient programmer-readable names for the
values that can be retrieved, and @FIELD_ENCODING lists the short
strings that have to be sent as part of the URL.

At this point, you should be able to make an educated guess as to what
the problem is. Take a few moments to see if you can find it.

The problem is that @FIELDS and @FIELD_ENCODING don’t list the data in
the same order: “time” is the 4th element of @FIELDS ($FIELDS[3]), but
t1, which is used to get the time of the last quote, is the 5th
element of @FIELD_ENCODING ($FIELD_ENCODING[4]). Likewise, “date”
($FIELDS[4]) is at the same position as t1, while d1 lines up with
“time”.

More generally, this code has information in two different places,
which requires the programmer to remember to update it in both places
whenever a change is made. The code says “Here’s a list of names for
data. Here’s a list of strings to send to Yahoo!”, with the unstated
and unenforced assumption that “Oh, and these two lists are in
one-to-one correspondence with each other”.

Whenever you have this sort of relationship, it’s a good idea to
enforce it in the code. The obvious choice here would be a hash:

our %FIELD_MAP = (
	symbol	=> 's',
	name	=> 'n',
	last	=> 'l1',
	…
);

Of course, it may turn out that there are perfectly good reasons for
using an array (e.g., perhaps the server expects the data fields to be
listed in a specific order). And in my case, I don’t particularly feel
like taking the time to rewrite the entire module to use a hash
instead of two arrays. But that’s okay; we can use an array that lists
the symbols and their names:

our @FIELD_MAP = (
	[ symbol	=> 's' ],
	[ name	=> 'n' ],
	[ last	=> 'l1' ],
	…
);

We can then generate the @FIELDS and @FIELD_ENCODING
arrays from @FIELD_MAP, which allows us to use all of the old
code, while preserving both the order of the fields, and the
relationship between the URL string and the programmer-readable name:

our @FIELDS;
our @FIELD_ENCODING;

for my $datum (@FIELD_MAP)
{
	push @FIELDS,         $datum->[0];
	push @FIELD_ENCODING, $datum->[1];
}

With only two pieces of data, it’s okay to use arrays inside
@FIELD_MAP. If we needed more than that, we should probably
use an array of hashes:

our @FIELD_MAP = (
	{ sym_name	=> 'symbol',
	  url_string	=> 's',
	  case_sensitive	=> 0,
	},
	{ sym_name	=> 'name',
	  url_string	=> 'n',
	  case_sensitive	=> 1,
	},
	{ sym_name	=> 'last',
	  url_string	=> 'l1',
	  case_sensitive	=> 0,
	},
	…
);

Over time, the amount of data stored this way may rise, and the cost
of generating useful data structures may grow too large to be done at
run-time. That’s okay: since programs can write other programs, all we
need is a utility that reads the programmer-friendly table, generates
the data structures that’ll be needed at run-time, and writes those to
a separate module/header/include file. This utility can then be run at
build time, before installing the package. Or, if the data changes
over time, the utility can be run once a week (or whatever) to update
an existing installation.
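
As a rough illustration of such a utility, here’s a sketch that reads a table of name/code pairs and prints the two Perl arrays, so that they can’t get out of step. The table format and the function name gen_field_lists are assumptions made for this example; the three pairs are the ones from the hash above.

```shell
#!/bin/sh
# Sketch of a build-time generator. The table format and the function name
# gen_field_lists are assumptions made for this example.

# Reads "name code" pairs on stdin, one per line, and prints two Perl array
# definitions whose elements are guaranteed to line up.
gen_field_lists() {
	fields=
	codes=
	while read -r name code; do
		# Skip blank lines and comments.
		case $name in
			''|'#'*) continue ;;
		esac
		fields="$fields $name"
		codes="$codes $code"
	done
	printf 'our @FIELDS = qw/%s/;\n' "${fields# }"
	printf 'our @FIELD_ENCODING = qw/%s/;\n' "${codes# }"
}

gen_field_lists <<'EOF'
# name   code
symbol   s
name     n
last     l1
EOF
```

Redirect the output into a file at build time and have the module load it; adding a field then means adding one line to the table.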

The real moral of the story is that when you have a bunch of related
bits of information (the data field name and its URL string, above),
and you want to make a small change, it’s a pain to have to remember
to make the change in several places. It’s just begging for someone to
make a mistake.

Machines are good at anal-retentive manipulation of data. Let them do
the tedious, repetitive work for you.

Another Reason Not to Write csh Scripts

In case you haven’t read Tom Christiansen’s Csh Programming Considered
Harmful, here’s another reason not to write csh/tcsh scripts if you
can avoid it.

The C shell exits with “Undefined variable” if you reference an
undefined variable, instead of expanding that variable to the empty
string the way the Bourne shell and Perl do.

But there’s the $?VAR construct, which allows you to
tell whether a variable is set.

I needed to check this in a script that, for reasons I won’t go into
here, needed to be written in Csh. So I had

if ( !$?FOO ) then
    # Stuff to do if $FOO isn't set

and got an error. It turns out that csh started by evaluating $?FOO.
In this case, $FOO wasn’t set, so $?FOO expanded to 0. Csh then tried
to evaluate !0 and parsed it as “event number zero”, which failed.
Grrr.

Putting a space between the bang and the dollar sign fixed that.