I’m thinking this will want to be two or three consecutive weeks of our Ann Arbor-based Tuesday evening “Craftsman Guild” meeting, but I could be convinced to run it on a couple of consecutive weekends, as well. The gaps in between sessions are actually useful, since I think they give folks a chance to think about stuff that should be bothering them as we proceed.
Developing an awesome Genetic Programming system… from scratch
The Point:
We often run these agile coding exercises as if user stories and acceptance tests drop from the sky. In real projects, they’re typically the biggest source of confusion and pain—even in projects we’re working on by ourselves. The subject matter we’ll explore here, Genetic Programming, is hugely sexy, technically simple, and offers only trivial coding challenges.
You might wonder, then, why so few people use it after 20 years, and why it hasn’t changed the world and made artificial intelligence part of our everyday lives.
The answers to those questions have nothing to do with the computer.
The Structure:
Two or three sessions, each about 2 hours.
We’ll run the sessions in a CodingDojo format, much like the “coding randori” we’ve seen in earlier CraftsmanGuild meetings, where there’s one “driver” and one “navigator” pairing on a laptop connected to a projector, with the entire “audience” helping them along the way as they write code (and do chores).
If it seems practical in a later session, we may split into two teams (still with one customer and one project).
I’ll role-play “the customer representative” for a customer who’s off-site, with the rest of the group acting as “the dev team”.
In addition to the coding computer, we’ll set up a second projector showing a live PivotalTracker instance where we can collect, sort and make progress on stories as an integral part of the development process.
During the first iteration we’ll decide on language and infrastructure, based on who’s there and what they want and know.
As code is written it’ll be committed to the GitHub project (so the audience can fork it and work along). After every iteration we’ll also hold a formal review session, with “the customer” accepting or declining particular solutions, looking at the stories we worked on and gathering new ones as they crop up.
Participants:
…should have used some modern testing framework, but they don’t need to be experts (or evangelists) at either TDD or BDD. They should be comfortable, but don’t need to be fluent, in at least one modern programming language like Java, Ruby, Python, &c. They should at least have looked at pivotaltracker.com to familiarize themselves with the feature set and story-sorting idiom.
The language we pick should be the one most participants are most comfortable using when they do real work. Whatever language and infrastructure we decide on, it shouldn’t get in the way of taking a simple user story like “Adding two numbers together should return their sum”, writing the acceptance test for it, and then running it.
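To give a sense of how small that first step ought to be, here is a minimal sketch of the “adding two numbers” story, assuming (purely for illustration) that the group picks Python and its built-in unittest module. The names add, AddingTwoNumbersTest and run_acceptance_tests are made up for this example; the real ones are up to the dev team on the night.

```python
import unittest


def add(a, b):
    """Hypothetical code under test: return the sum of two numbers."""
    return a + b


class AddingTwoNumbersTest(unittest.TestCase):
    """Acceptance test for: 'Adding two numbers together should return their sum'."""

    def test_adding_two_numbers_returns_their_sum(self):
        self.assertEqual(add(2, 3), 5)


def run_acceptance_tests():
    """Load and run the story's test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddingTwoNumbersTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_acceptance_tests()
```

If getting from the story to a passing run of something like this takes more than a few minutes in the chosen language, that’s a sign the infrastructure is in the way.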
The Project:
The Customer’s overall goal is to build a Genetic Programming system that can accept a set of data and a set of mathematical primitives, and will evolve mathematical equations that fit the data. Here’s an (antique!) Java applet that does something along those lines already.
This sort of GP project typically breaks down into five chunks:
- build a simple but full-featured interpreter for a domain-specific language (DSL) intended for mathematical modeling
- build an evaluator that determines how well an arbitrary DSL script matches target data
- write methods to create random programs, and also mutate and cross over DSL scripts
- build a simple symbolic regression system that fits numerical data with arbitrary mathematical models
- adapt to some minor problems that may arise along the way
These may sound like big, ambitious steps, but in fact they’re all technically simple.
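As a rough illustration of how little code the first three chunks can take, here is a toy sketch in Python. Everything in it is an assumption made for the example, not the customer’s spec: expressions are nested tuples over a single variable x, the primitive set is just +, - and *, and fit is scored as a sum of squared errors. Crossover and the search loop itself are left out.

```python
import operator
import random

# Assumed primitive set; the customer would supply the real one.
PRIMITIVES = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, x):
    """Interpret a tree: 'x' is the variable, numbers are constants, and a
    tuple (op, left, right) applies a primitive to two subtrees."""
    if tree == "x":
        return x
    if isinstance(tree, tuple):
        op, left, right = tree
        return PRIMITIVES[op](evaluate(left, x), evaluate(right, x))
    return tree  # numeric constant


def error(tree, data):
    """Sum of squared differences between the tree's output and the target data."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data)


def random_tree(rng, depth=3):
    """Grow a random tree no deeper than `depth`."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", rng.randint(-5, 5)])
    op = rng.choice(sorted(PRIMITIVES))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def mutate(tree, rng, depth=2):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth), right)
    return (op, left, mutate(right, rng, depth))


if __name__ == "__main__":
    # Target data sampled from y = x*x + 1 (an arbitrary example).
    data = [(x, x * x + 1) for x in range(-3, 4)]
    exact = ("+", ("*", "x", "x"), 1)
    print(error(exact, data))  # 0, since this tree reproduces the data exactly

    rng = random.Random(0)
    candidate = random_tree(rng)
    print(candidate, error(candidate, data))
    print(mutate(candidate, rng))
```

None of this is hard to type. The interesting questions are the ones the sketch quietly answers without asking anybody, such as what counts as a good fit and which primitives are allowed.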
The goal of the dojo is not to learn to type something quickly or get as much done as possible.
It’s designed so that we keep adapting both the emerging codebase and our collective understanding of the problem the customer is asking us to solve, in a context where there are no “tricks” (I won’t be lying to you, except maybe by omission), but where there are plenty of traps.