Ansible: Roles, Role Dependencies, and Variables

I just spent some time banging my head against Ansible, and thought I’d share in case anyone else runs across it:

I have a Firefox role that allows you to define a Firefox profile with various plugins, config settings, and the like. And I have a work-from-home (WFH) role that, among other things, sets up a couple of work profiles in Firefox, with certain proxy settings and plugins. I did this the way the documentation says to:

dependencies:
  - name: Profile Work 1
    role: firefox
    vars:
      - profile: work1
        addons:
          - ublock-origin
          - privacy-badger17
          - solarize-fox
        prefs: >-
          network.proxy.http: '"proxy.host.com"'
          network.proxy.http_port: 1234
  - name: Profile Work 2
    role: firefox
    vars:
      - profile: work2
        addons:
          - ublock-origin
          - privacy-badger17
          - solarized-light
        prefs: >-
          network.proxy.http: '"proxy.host.com"'
          network.proxy.http_port: 1234

The WFH stuff worked fine at first, but then I added a new profile.

- name: Roles
  hosts: my-host
  roles:
    - role: wfh
    - role: firefox
      profile: third
      addons:
        - bitwarden-password-manager
        - some fancy-theme

This one didn’t have any prefs, but Ansible was applying the prefs from the WFH role.

Eventually, I found that the problem lay in the two vars blocks in the wfh role’s dependencies: apparently those get set as variables for the entire task or play, not just for that invocation of the firefox role. The solution turned out to be undocumented: drop the vars blocks and pull the role parameters up a level:

dependencies:
  - name: Profile Work 1
    role: firefox
    profile: work1
    addons:
      - ublock-origin
      - privacy-badger17
      - solarize-fox
    prefs: >-
      network.proxy.http: '"proxy.host.com"'
      network.proxy.http_port: 1234
  - name: Profile Work 2
    role: firefox
    profile: work2
    addons:
      - ublock-origin
      - privacy-badger17
      - solarized-light
    prefs: >-
      network.proxy.http: '"proxy.host.com"'
      network.proxy.http_port: 1234

I do like Ansible, but it’s full of fiddly stupid crap like this.

Recursive grepping

Sometimes you just need to grep recursively in a directory. If you’re using find $somedir -type f -exec grep $somestring {} \;, don’t:

Use xargs to avoid creating a bazillion grep processes:
find $somedir -type f -print | xargs grep $somestring

But spaces in the output of find (i.e., spaces in filenames) will confuse xargs, so use -print0 and xargs -0 instead:
find $somedir -type f -print0 | xargs -0 grep $somestring

Except that you can achieve the same effect with find alone, using \+ instead of \;:
find $somedir -type f -exec grep $somestring {} \+

Or you can just use recursive grep:
grep -r $somestring $somedir
except that find allows you to filter by file type, name, age, etc.
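If you want grep with find-style filtering in one place, you can also roll your own. Here's a rough Python sketch (the function name and filter parameters are mine, purely for illustration) that walks a directory tree and greps files matching a name suffix and maximum age, loosely mimicking find's -name and -mtime tests:

```python
import os
import re
import time

def recursive_grep(root, pattern, suffix=None, max_age_days=None):
    """Yield (path, line_number, line) for lines matching pattern.

    suffix and max_age_days loosely mimic find's -name and -mtime filters.
    """
    regex = re.compile(pattern)
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if suffix is not None and not fname.endswith(suffix):
                continue
            path = os.path.join(dirpath, fname)
            if max_age_days is not None:
                age = (now - os.path.getmtime(path)) / 86400
                if age > max_age_days:
                    continue
            try:
                with open(path, errors="replace") as f:
                    for lineno, line in enumerate(f, start=1):
                        if regex.search(line):
                            yield path, lineno, line.rstrip("\n")
            except OSError:
                pass  # skip unreadable files, like grep -s
```

Something like recursive_grep("/var/log", "error", suffix=".log", max_age_days=7) then finds recent matches without spawning a single grep process.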

Campaign Manager: Serializing Recursive Objects

There’s an idea for a game that’s been rattling around my brain for a while, so I finally decided to learn enough Unity to birth it.

To get an idea of what I have in mind, think Sid Meier’s Civilization, but for elections: you play a campaign manager, and your job is to get your candidate elected. This is complicated by the fact that the candidate may be prone to gaffes (think Joe Biden) or unmanageable (think Donald Trump); union leaders, state governors, CEOs, etc. can all promise their support, but they’ll all want something in return.

The thing I learned this week is the Unity JSON Serializer. It’s used to store prefabs and other data structures in the editor, so I figured it would be the natural tool for storing canned scenarios, as well as for saving games. It has a few quirks, though: for one thing, it can only serialize basic types (int, string, and the like), plus a few constructs such as structs, and arrays and List<T>s of basic types. If you’re used to the introspection of languages like Python and Perl, you’re likely to be disappointed in C#. For another, in order to avoid runaway recursion, it only serializes to a depth of 7.

To start with, I defined class Area to represent an area such as a country, like the US. An Area is divided in various ways: politically, it’s divided into states. Culturally, it’s divided into regions like the Pacific Northwest or the Bible Belt. Other divisions are possible, depending on what you’re interested in. So I added a Division class:

using System;
using System.Collections.Generic;

namespace Map {
    [Serializable]
    public class Area
    {
        public string id;
        public string name;
        public List<Division> divisions = new List<Division>();
    }

    public class Division
    {
        public string id;
        public string name;
        public List<Area> children;
    }
}

As you can see, this is recursive: an Area can have multiple Divisions, each of which can contain other Areas with their own Divisions. This allows us to divide the United States into states, which are in turn divided into Congressional Districts, which in turn are divided into precincts.

Since neither of our classes is an elementary type, they can’t be serialized directly. So let’s add struct SerializableArea and struct SerializableDivision to hold the structures that will actually be stored on disk, as opposed to Area and Division, which will be used in memory at run time. We’ll also use the ISerializationCallbackReceiver interface, which gives our classes hooks that are called when an object is serialized or deserialized.

Without wishing to repeat what’s in the docs: in order to get around the various limitations of the Serializer, the way to serialize a tree of objects is to serialize an array of objects, and use identifiers to refer to particular objects. Let’s say our in-memory Area tree looks something like:

  • Area United States
    • Division Political
      • Area Alabama
      • Area Alaska
    • Division Regions
      • Area Bible Belt
      • Area Pacific Northwest

(That’s just an outline, of course. Each node has more members than are shown here.) We can serialize this as two arrays: one with all of the Areas, and one with all of the Divisions:

  • Area United States
    • List<SerializableArea> childAreas:
      • SerializableArea Alabama
      • SerializableArea Alaska
      • SerializableArea Bible Belt
      • SerializableArea Pacific Northwest
    • List<SerializableDivision> serialDivisions:
      • SerializableDivision Political
      • SerializableDivision Regions

We don’t want to recurse, but we do want to be able to rebuild the tree structure when we deserialize the above. So SerializableArea contains, not a list of Divisions, but a list of identifiers that we can find in serialDivisions. Likewise, SerializableDivision contains not a list of Areas, but a list of identifiers that we can look up in childAreas.

Naturally, Area and Division each contain a Serialize() method that recursively serializes the object and its children.

The next question is: so you’ve been asked to serialize an Area. How do you know whether you’re supposed to add it to an existing childAreas list or start your own?

Answer: if the serialization was invoked through OnBeforeSerialize(), then you’re serializing the top level object and should allocate a list to put the children in. Otherwise, append to an existing list, which should be passed in as a parameter to Serialize().

If anyone’s interested in what it looks like when all is said and done, here it is:

using System;
using System.Collections.Generic;
using UnityEngine;

namespace Map {

    [Serializable]
    public class Area : ISerializationCallbackReceiver
    {
        public string id;
        public string name;

        public List<Division> divisions = new List<Division>();

        [Serializable]
        public struct SerializableArea
        {
            public string id;
            public string name;
            public List<string> divisions;
        }

        public List<Division.SerializableDivision> serialDivisions;
        public List<SerializableArea> childAreas = new List<SerializableArea>();

        public void OnBeforeSerialize()
        {
            serialDivisions =
                new List<Division.SerializableDivision>(divisions.Count);
            childAreas = new List<SerializableArea>();

            for (int i = 0; i < divisions.Count; i++)
                divisions[i].Serialize(ref serialDivisions, ref childAreas);
        }

        // Required by ISerializationCallbackReceiver. Rebuilding the
        // in-memory tree from serialDivisions and childAreas goes here.
        public void OnAfterDeserialize()
        {
        }

        // Look up a Division by id, so we can avoid adding it twice
        // to serialDivisions.
        private int FindDivisionById(string id,
                                     ref List<Division.SerializableDivision> dlist)
        {
            for (int i = 0; i < dlist.Count; i++)
                if (dlist[i].id == id)
                    return i;
            return -1;
        }

        public void Serialize(ref List<SerializableArea> alist,
                              ref List<Division.SerializableDivision> dlist)
        {
            SerializableArea sa = new SerializableArea();
            sa.id = id;
            sa.name = name;
            sa.divisions = new List<string>(divisions.Count);

            alist.Add(sa);

            for (int i = 0; i < divisions.Count; i++)
            {
                // Refer to the child Division by id; its full record
                // lives in dlist.
                sa.divisions.Add(divisions[i].id);

                int d = FindDivisionById(divisions[i].id, ref dlist);
                if (d < 0)
                    // Don't add a Division to dlist twice.
                    divisions[i].Serialize(ref dlist, ref alist);
            }
        }
    }

    // Division's serialization is driven from Area, so it doesn't need
    // to implement ISerializationCallbackReceiver itself.
    public class Division
    {
        public string id;
        public string name;
        public List<Area> children;

        [Serializable]
        public struct SerializableDivision
        {
            public string id;
            public string name;
            public List<string> areas;
        }

        public void Serialize(ref List<SerializableDivision> dlist,
                              ref List<Area.SerializableArea> alist)
        {
            SerializableDivision sd = new SerializableDivision();
            sd.id = id;
            sd.name = name;
            sd.areas = new List<string>(children.Count);

            dlist.Add(sd);

            for (int i = 0; i < children.Count; i++)
            {
                // Refer to the child Area by id; its full record
                // lives in alist.
                sd.areas.Add(children[i].id);

                int a = FindAreaById(children[i].id, ref alist);
                if (a < 0)
                    // Don't add an Area to alist twice.
                    children[i].Serialize(ref alist, ref dlist);
            }
        }

        private int FindAreaById(string id,
                                 ref List<Area.SerializableArea> alist)
        {
            for (int i = 0; i < alist.Count; i++)
                if (alist[i].id == id)
                    return i;
            return -1;
        }
    }
}
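For what it’s worth, the flatten-with-identifiers scheme is easy to prototype outside Unity. Here’s a Python sketch of the same idea (class and function names are mine, not taken from the C# above): each record refers to its children by id rather than by nesting, so the serialized form never goes more than one level deep, and the tree can be rebuilt by resolving the ids:

```python
class Area:
    def __init__(self, id, name, divisions=None):
        self.id, self.name = id, name
        self.divisions = divisions or []

class Division:
    def __init__(self, id, name, children=None):
        self.id, self.name = id, name
        self.children = children or []

def serialize(root):
    """Flatten an Area tree into two flat, id-keyed dicts."""
    areas, divisions = {}, {}

    def walk_area(a):
        if a.id in areas:       # don't add an Area twice
            return
        areas[a.id] = {"id": a.id, "name": a.name,
                       "divisions": [d.id for d in a.divisions]}
        for d in a.divisions:
            walk_division(d)

    def walk_division(d):
        if d.id in divisions:   # don't add a Division twice
            return
        divisions[d.id] = {"id": d.id, "name": d.name,
                           "areas": [c.id for c in d.children]}
        for c in d.children:
            walk_area(c)

    walk_area(root)
    return {"root": root.id, "areas": areas, "divisions": divisions}

def deserialize(data):
    """Rebuild the tree by resolving ids back into object references."""
    areas = {a["id"]: Area(a["id"], a["name"]) for a in data["areas"].values()}
    divisions = {d["id"]: Division(d["id"], d["name"])
                 for d in data["divisions"].values()}
    for a in data["areas"].values():
        areas[a["id"]].divisions = [divisions[i] for i in a["divisions"]]
    for d in data["divisions"].values():
        divisions[d["id"]].children = [areas[i] for i in d["areas"]]
    return areas[data["root"]]
```

Because the output is two flat tables, the serializer’s depth limit never comes into play, no matter how deeply the Areas and Divisions nest.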

WFHing with Emacs: Work Mode and Command-Line Options

Like the rest of the world, I’m working from home these days. One of the changes I’ve made has been to set up Emacs to work from home.

I use Emacs extensively both at home and at work. So far, my method for keeping personal stuff and work stuff separate has been to, well, keep separate copies of ~/.emacs.d/ on work and home machines. But now that my home machine is my work machine, I figured I’d combine their configs.

To do this, I just added a -work command-line option, so that emacs -work runs in work mode. The command-switch-alist variable is useful here: it allows you to define a command-line option, and a function to call when it is encountered:

(defun my-work-setup (arg)
  ;; Do setup for work-mode
  )
(add-to-list 'command-switch-alist
             '("work" . my-work-setup))

Of course, I’ve never liked defining functions to be called only once. That’s what lambda expressions are for:

(add-to-list 'command-switch-alist
  '("work" .
    (lambda (arg)
      ;; Do setup for work-mode
      (setq my-mode 'work)
      )))

One thing to bear in mind about command-switch-alist is that an option’s handler function gets called as soon as that option is seen on the command line. So let’s say you have a -work option and a -logging option, and the logging-related code needs to know whether work mode is turned on. That means you would always have to remember to put the -work option before the -logging option, which isn’t very elegant.

A better approach is to use the command-switch-alist entries to just record that a certain option has been set. The sample code above simply sets my-mode to 'work when the -work option is set. Then do the real startup stuff after the command line has been parsed and you know all of the options that have been passed in.

Unsurprisingly, Emacs has a place to put that: emacs-startup-hook:

(defvar my-mode 'home
  "Current mode. Either home or work.")
(add-to-list 'command-switch-alist
  '("work" . (lambda (arg)
                (setq my-mode 'work))))

(defvar logging-p nil
  "True if and only if we want extra logging.")
(add-to-list 'command-switch-alist
  '("logging" . (lambda (arg)
                  (setq logging-p t))))

(add-hook 'emacs-startup-hook
  (lambda nil
    (if (eq my-mode 'work)
      (message "Work mode is turned on."))
    (if logging-p
      (message "Extra logging is turned on."))
    (if (and (eq my-mode 'work)
             logging-p)
      (message "Work mode and logging are both turned on."))))

Check the *Messages* buffer to see the output.

Popular Vote: Majority Rule Is Disenfranchisement

Here’s a rather breathless letter to the editor of the Brattleboro (VT) Reformer, promising dire consequences if we start electing presidents the same way we elect governors, senators, mayors, and school board members:

The Los Angeles Times editorial (Feb. 17 in the Reformer) would like to disenfranchise more than half our nation by ending the electoral college, validating only “mob rule” elections dominated by metropolitan area voters and perhaps a portion of their rural allies. Vermont’s legislature endorsed the “National Popular Vote Interstate Compact” which would effectively disenfranchise a sizable portion of Vermont voters, as actually happened prior to that in the 2016 election when our three electors all voted for one popular candidate, even though the Vermont voters were divided 2 to 1. So one third of our voters were ignored entirely.

Dave Garrecht, Guilford, VT

Mr. Garrecht seems confused with worry. For one thing:

disfranchise [ dis-fran-chahyz ]:
verb (used with object), dis·fran·chised, dis·fran·chis·ing.
1. to deprive (a person) of a right of citizenship, as of the right to vote

dictionary.com

No one is talking about taking away anyone’s right to vote. What he’s upset about is not getting his way, as he demonstrates in the rest of the sentence:

as actually happened prior to that in the 2016 election when our three electors all voted for one popular candidate, even though the Vermont voters were divided 2 to 1. So one third of our voters were ignored entirely.

No, no one ignored anyone. It’s just that the minority lost. That’s how it works in a democracy. Get used to it.

He mentions “mob rule”, as do a lot of other people, so that’s worth addressing. Wikipedia’s definition seems as good as any I’ve seen:

Ochlocracy […] or mob rule is the rule of government by mob or a mass of people, or, the intimidation of legitimate authorities. As a pejorative for majoritarianism, it is akin to the Latin phrase mobile vulgus, meaning “the fickle crowd”, from which the English term “mob” originally was derived in the 1680s.

Now, no one is condoning, or even suggesting, intimidating anyone. So really, the biggest fear worth addressing is that the majority might vote to take away the rights of the minority, as in the old quotation about how “Democracy is two wolves and a lamb voting on what to eat for lunch.” It’s worth reading the context, in Marvin Simkin’s article in the Los Angeles Times:

Democracy is not freedom. Democracy is two wolves and a lamb voting on what to eat for lunch. Freedom comes from the recognition of certain rights which may not be taken, not even by a 99% vote. Those rights are spelled out in the Bill of Rights and in our California Constitution. Voters and politicians alike would do well to take a look at the rights we each hold, which must never be chipped away by the whim of the majority.

This problem has been recognized for a long time, and that’s why the first ten amendments to the US Constitution spell out a list of rights that can’t be taken away by a simple vote. And nobody’s trying to take that away here.

In short, Mr. Garrecht is upset over nothing. No one’s taking away anyone’s vote. If anything, the popular vote would make more people’s vote significant. And really, if you’re supporting an unfair system for fear of what other people might do to you if their vote counts the same as yours, what does that say about you?

Counting Every Vote Will Nullify Your Vote

An editorial at Fredericksburg.com warns of dire consequences if the National Popular Vote Interstate Compact passes:

The House of Delegates voted to have the Old Dominion join the National Popular Vote Interstate Compact, an agreement that would force Virginia’s 13 electors to vote for the candidate chosen by the national popular vote.


Joining this compact would have nullified Virginia’s voice in this most important of all elections, enabled the tyranny of the majority, and upended a system that has worked well for 233 years.

This is an argument I’ve seen elsewhere; particularly the “nullify” part of it. As I see it, it’s just seizing on the more eyebrow-raising part of the NPVIC–the idea that a state’s Electoral Votes will sometimes be given to a candidate who didn’t win that state’s popular vote–and presenting it as though it were a new idea.

The word “nullified” makes it sound as though Virginia’s voters would be deprived of their chance to participate in the election. In fact, if this situation were to arise, it would simply mean that the majority of Virginia’s voters would have lost the election, something that has happened any number of times.

In fact, a popular vote would achieve the reverse of what the author fears. Let’s say that the Compact passes, with states totaling 270 Electoral Votes signing on, but Virginia holds out and doesn’t sign on.

In the next election, the winning strategy changes: instead of looking at states as blocks and assuming that California will vote Democratic and Texas will vote Republican, candidates will have to appeal to broad swaths of people: suddenly the Republicans in California and the Democrats in Texas are worth reaching out to. And so are the voters in Virginia, be they Democrat, Republican, third-party, or independent. Unlike the present system, every one of those votes will move the dial a little bit to the left or to the right.

So far from nullifying anyone’s vote, the popular vote would make everyone’s vote count.

Silly Objections to the Popular Vote

Of late, I’ve been taking an interest in the National Popular Vote Interstate Compact. For those who haven’t heard about it, the basic idea is that the US’s system of electing presidents through the Electoral College is archaic and convoluted, and too often doesn’t elect the person who won the most people’s votes.

Since the Electoral College is in the Constitution, it would take a constitutional amendment to get rid of it, and that’s notoriously difficult. However, there’s a workaround: have states allocate their Electoral Votes not to the candidate who won the state’s popular vote, but to the one who won the national popular vote. Yes, it means that if most people in a state vote for candidate A, but candidate B wins the vote nationwide, then that state will allocate its Electoral votes to candidate B. Naturally, it would be crazy for a state to go it alone in this. So this would only kick in once enough states signed up to determine the outcome of the election, i.e., states with 270 Electoral Votes between them.

It sounds weird at first, but it could work. And it’s because it sounds crazy that I’ve started following the issue. But one thing that struck me is just how few good arguments there are against it.

Take, for instance, this letter to the editor of the Richmond Times-Dispatch:

As far as presidential elections go, contenders should learn some things from sports. Know the rules. More importantly, play by the rules. They should broaden their support and try to win more states. They know what the rules are. A contender should need to win more than 11 states (the most populous of which total 270 electoral votes), potentially negating the other 78% (39 states).

For starters, presidential candidates and their campaign managers do know the rules, and do play by them; that’s why they only really campaign in a handful of swing states: everyone knows that, say, New Jersey will vote Democratic no matter what, so the Republicans can just write it off and spend their campaign money elsewhere, where it’ll do more good (i.e., where advertising is likely to get them some more Electoral Votes). The Democrats, meanwhile, can take New Jersey pretty much for granted, and spend their campaign money elsewhere, where it’ll do more good.

Secondly, the author makes a mistake that many opponents of the NPVIC make: that of thinking in terms of states instead of people: states don’t vote; individual voters do. Under a popular vote system, it doesn’t make sense to talk about “how New Jersey voted”, except as a broad trend. The more important question is, how many people in New Jersey voted for each candidate?

Note, too, that the 11 most populous states (also the ones with the most Electoral Votes) are California, Texas, Florida, New York, Pennsylvania, Illinois, Ohio, Georgia, North Carolina, Michigan, and New Jersey. The author’s fear rests on the premise of everyone in those 11 states voting the same way. I can’t imagine an election in which everyone in California votes the same way, let alone one where everyone in Texas votes the same way as everyone in California. If a candidate comes along who’s such a uniter that they can get a majority of the vote in those 11 states, then I think that person deserves the presidency.

Furthermore, while those 11 states add up to 270 Electoral Votes, they also have 52% of the population. So what this person is arguing against is the idea of majority rule.

But the biggest mistake this writer makes, in my opinion, is the one I mentioned first: it’s in part because of the Electoral College that presidential candidates only try to win a handful of states and ignore the rest. If he wants them to reach out to, say, Republicans in rural California and New York, or Democrats in Austin and Salt Lake City, then a popular-vote approach is the way to go.

Ansible As Scripting Language

Ansible is billed as a configuration manager similar to Puppet or cfengine. But it occurred to me recently that it’s really (at least) two things:

  1. A configuration manager.
  2. A scripting language for the machine room.

Mode 1 is the normal, expected one: here’s a description; now make the machine(s) look like the description. Same as Puppet.

Mode 2 is, I think, far more difficult to achieve in Puppet than it is in Ansible. This is where you make things happen in a particular order, not just on one machine (you’d use /bin/sh for that), but on multiple hosts.

For instance, adding a new user might involve:

  1. Generate a random password on localhost.
  2. Add a user on the Active Directory server.
  3. Create and populate a home directory on the home directory server.
  4. Add a stub web page on the web server.

This is something I’d rather write as an Ansible play than as a Puppet manifest or module.

Which brings me to my next point: it seems that for Mode 1, you want to think mainly in terms of roles, while for Mode 2, you’ll want to focus on playbooks. A role is designed to encapsulate the notion of “here’s a description of what a machine should look like, and the steps to take, if any, to make it match that description”, while a playbook is naturally organized as “step 1; step 2; step 3; …”.

These are, of course, just guidelines. And Mode 2 suffers from the fact that YAML is not a good way to express programming concepts.  But I find this to be a useful way of thinking about what I’m doing in Ansible.

Buying Lunch with Bitcoin?

This weekend, I had occasion to eat at a food truck that had this intriguing sign in the window:

Sign reading "Crypto accepted here, with the icons of various cryptocurrencies" in the window of a food truck.

This intrigued me, because the way I understand it, it takes 10-20 minutes for Bitcoin transactions to go through. That’s why it doesn’t work for everyday purchases like, for instance, lunch at a food truck.

I asked the vendor about this, and he told me that “Oh, no. If you have $APP, it goes through right away!” There was a line, so I wasn’t able to ask more questions, but when I read the reviews for the app, a lot of the complaints involved how long it took for transactions to be processed: up to an hour, in some cases.

So I suspect that the food truck guy is confused: yes, you can instantly get a hash that says to transfer X amount of Bitcoin from address A to address B. But the real question is, will that transaction actually get recorded in the consensus ledger? This is similar to the way it only takes a few seconds to write someone a check; but you can’t really spend that money until the check clears, which can take a week. But if I went around to two or three food trucks in the same ten-minute window, I think I could “spend” the same money, and only one of the transactions would clear, in the end.
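To make the check analogy concrete, here’s a toy Python model (deliberately simplified; real Bitcoin validates blocks of transactions against unspent outputs, and this sketch keeps only the one property that matters here): each coin can clear only once, so of two conflicting spends, only the first to reach the ledger confirms:

```python
class ToyLedger:
    """A drastically simplified consensus ledger.

    Each named coin may be spent exactly once; a second attempt to
    spend the same coin is rejected, like a check that bounces.
    """
    def __init__(self):
        self.spent = set()

    def confirm(self, coin):
        """Try to confirm a payment spending `coin`.
        Returns True if it clears, False if the coin was already spent."""
        if coin in self.spent:
            return False
        self.spent.add(coin)
        return True

ledger = ToyLedger()
# Both food trucks instantly get a signed "payment" of the same coin,
# but only the first spend to reach the ledger clears:
first = ledger.confirm("coin-1")   # lunch at truck A
second = ledger.confirm("coin-1")  # same coin at truck B
```

Both vendors see an instant payment at the counter, but the second one is left holding the equivalent of a bounced check.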

In short, I think this guy is lucky that he’s only dealt with honest people. Either that, or he figures that there’ll be some amount of theft, and that’s just the price of doing business.

Ansible: Running Commands in Dry-Run Mode in Check Mode

Say you have an Ansible playbook that invokes a command. Normally, that command executes when you run ansible normally, and doesn’t execute at all when you run ansible in check mode.

But a lot of commands, like rsync, have a -n or --dry-run argument that shows what would be done without actually making any changes. So it would be nice to combine the two.

Let’s start with a simple playbook that copies some files with rsync:

- name: Copy files
  tasks:
    - name: rsync the files
      command: >-
        rsync
        -avi
        /tmp/source/
        /tmp/destination/
  hosts: localhost
  become: no
  gather_facts: no

When you execute this playbook with ansible-playbook foo.yml, rsync runs; when you run it in check mode, with ansible-playbook -C foo.yml, rsync doesn’t run.

This is inconvenient, because we’d like to see what rsync would have done before we commit to doing it. So let’s force it to run even in check mode, with check_mode: no, but also run rsync in dry-run mode, so we don’t make changes while we’re still debugging the playbook:

- name: Copy files
  tasks:
    - name: rsync the files
      command: >-
        rsync
        --dry-run
        -avi
        /tmp/source/
        /tmp/destination/
      check_mode: no
  hosts: localhost
  become: no
  gather_facts: no

Now we just need to remember to remove the --dry-run argument when we’re ready to run it for real. And turn it back on again when we need to debug the playbook.

Or we could do the smart thing, and try to add that argument only when we’re running Ansible in check mode. Thankfully, there’s a variable for that: ansible_check_mode, so we can set the argument dynamically:

- name: Copy files
  tasks:
    - name: rsync the files
      command: >-
        rsync
        {{ '--dry-run' if ansible_check_mode else '' }}
        -avi
        /tmp/source/
        /tmp/destination/
      check_mode: no
  hosts: localhost
  become: no
  gather_facts: no

You can check that this works with ansible-playbook -v -C foo.yml and ansible-playbook -v foo.yml.