I’d hammer in the morning

Why don’t more people know about (and use) genetic programming, especially for symbolic regression? GP is an approach that can be useful in all sorts of domains, for problems ranging from exploratory data analysis to design automation. SR can be a subtly informative complement to statistical modeling projects, or it can be used as a monstrously powerful open-ended exploratory machine learning engine. It rocks.

So. Do you know anything about it? [Cheating has been discouraged by eliminating outbound links from this post.]

This has become a problem for me. In seven conversations in two weeks with colleagues about work, including bosses and peers, I’ve mentioned or advised or absolutely insisted they consider GP/SR. In one case my opposite knew about GP but hadn’t considered it because he only knew about pole-balancing and stuff; in four cases they thought I was talking about genetic algorithms for parameter optimization (not that there’s anything wrong with that, but… no); in two cases I suppose they thought “symbolic regression” meant something ickily statisticky, and didn’t want to go down that road, so they played like it was some fancy newfangled numerical regression fad-of-the-day. Then, yesterday, I was in a room full of people using fast but utterly opaque SVMs to do machine learning, where the goal is to understand the system. They had thought about neither Bayesian networks nor GP/SR, both of which could tell them important things about how the system works. And in this latter case they hadn’t ever heard of SR.
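For the four who heard “genetic algorithm for parameter optimization”: the difference is in what gets evolved. A GA of that kind tunes numbers inside a formula you already picked; symbolic regression searches over the formulas themselves. Here is a toy sketch of that idea, not any particular GP system. Every name in it (`random_tree`, `mutate`, `evolve`, and so on) is made up for illustration, the target data is invented, and it uses mutation with truncation selection only, where a real GP system would normally add crossover and a good deal more.

```python
import random

# Expression trees are nested tuples: ("x",), ("const", c), or (op, left, right).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
SYMBOLS = {"add": "+", "sub": "-", "mul": "*"}


def random_tree(rng, depth):
    """Grow a random expression over x and small integer constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return ("x",)
        return ("const", rng.randint(-3, 3))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Compute the value of the expression at a given x."""
    if tree[0] == "x":
        return x
    if tree[0] == "const":
        return tree[1]
    return OPS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))


def to_string(tree):
    """Render the expression as a readable, fully parenthesized formula."""
    if tree[0] == "x":
        return "x"
    if tree[0] == "const":
        return str(tree[1])
    left, right = to_string(tree[1]), to_string(tree[2])
    return "(" + left + " " + SYMBOLS[tree[0]] + " " + right + ")"


def mutate(tree, rng, depth=2):
    """Replace a randomly chosen subtree with a fresh random one."""
    if len(tree) < 3 or rng.random() < 0.3:
        return random_tree(rng, depth)
    if rng.random() < 0.5:
        return (tree[0], mutate(tree[1], rng, depth), tree[2])
    return (tree[0], tree[1], mutate(tree[2], rng, depth))


def error(tree, data):
    """Sum of squared errors of the expression over (x, y) pairs."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data)


def evolve(data, rng, pop_size=200, generations=30):
    """Keep the best quarter each generation and refill with mutants."""
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        survivors = population[: pop_size // 4]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return min(population, key=lambda t: error(t, data))


# Invented target for the demo: y = x*x + x, sampled at integer points.
data = [(x, x * x + x) for x in range(-5, 6)]
best = evolve(data, random.Random(0))
print(to_string(best), error(best, data))
```

The point of the sketch is the output: a formula you can read and argue with, not a vector of fitted weights. Whether a run this small recovers the exact target depends on the seed and settings.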

I suppose now I have to do something about it.

Sigh. More in a while.

This entry was posted in Uncategorized by Tozier.

5 thoughts on “I’d hammer in the morning”

  1. Just out of curiosity, do you really think that any reasonably complex Bayesian network, whose parameter set or structure is learned from the data, is likely to be any less opaque than the results of a well-designed SVM?

    Never mind the fact that the graph representation of a Bayesian network often invites people to consider the edges which are drawn (when in fact, it’s what isn’t drawn that’s important, graphically) and, at the same time, to consider the edges singly (when in fact, it’s the “set of edges” forming the incoming arcs to a node that is the irreducible element in a BN), which is totally wrong.

    Bleh. (What I’m saying is, it’s not about your machine-learning tool-of-choice, it’s about the auxiliary tools you have available to summarize and visualize the results of whatever method you’ve chosen.)

    But I’m intrigued… what is GP/SR? And how is it different from a genetic algorithm?

  2. This is true… but you can’t really make strong statements about the role of, say, a qualitative relationship between mass and time variables, as you can with symbolic regression approaches.

    In many cases, the customer (the person who wants and pays to see the analysis done) may not realize initially that they don’t just want a classifier or a model, but also want one that provides insight to drive further first-principles modeling and experimentation. That’s the shortcoming of most NN-like and PCA-like approaches: the intermediate calculations are some function of almost all the inputs, and as a result it’s nigh impossible to tease out meaningful insight.

    Anyway… I’m working on it…

  3. What are the alternatives for doing Symbolic Regression?
    Do a web search for the term. You will see that there is no definition for it outside the GP world.
    My guess is that most people don’t even know that such a problem can be solved by a machine.
    Take a look at http://www.it.lut.fi/mat/EcmiNL/ecmi35/node70.html

  4. It’s what you do when mixtures of logistic functions aren’t flexible enough, isn’t it? You tend to get laughed out of the shop in econometrics when you get to this level of complexity, but that’s mainly because we’re usually looking for patterns that probably don’t exist.
