<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Search algorithms</title>
	<atom:link href="http://williamtozier.com/slurry/2008/04/02/search-algorithms/feed" rel="self" type="application/rss+xml" />
	<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms</link>
	<description>Pontification without all the gritty gravitas</description>
	<lastBuildDate>Mon, 08 Mar 2010 04:16:04 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Maarten Keijzer</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52353</link>
		<dc:creator>Maarten Keijzer</dc:creator>
		<pubDate>Tue, 15 Apr 2008 22:04:43 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52353</guid>
		<description>Wow Bill, a post worthy of a blog entry in its own right. But this thread is getting interesting, so I&#039;m not letting it die yet.

First of all, I want to re-emphasize that I&#039;m not advocating linearly combining objectives in order to reduce the objective space. I am simply saying that usually, one is interested in the convex hull of the Pareto set, not the entire set. This is still a large set, and the set needs to be discovered. This may sound like a dangerous simplification to you, but I think it is warranted, especially in the error / size trade-off I sketched. 

Multi-objective problems are multi-dimensional, and prematurely reducing them to a single objective will lead to poor results. I completely agree with you that you want the algorithm to search out the alternatives, and let the decision maker pick the best (combination of) solutions *after* the set has been produced.

If you present it to the &#039;customer&#039; in this way, even with a full Pareto-based approach, you will notice that the decision maker will always choose a solution on the convex hull, never one in a concave bend. (Yes, I too can be very confident in my assertions without any evidence to back me up.)

Yes, these concave bends are very interesting, almost as interesting as the first front of *dominated* solutions. That is, potentially important, but given that we&#039;ve got a full convex hull to go through for analysis, they can wait.

I think we&#039;re on the same page generally, as I too have stared and stared at what this genetic programming is producing. I violently agree that one needs to examine and re-examine everything that is produced, and try out different things. I love to inspect the evolutionary trees that produce a solution (I believe that Nic is one of the few who has that same hobby). I have gotten great results by mixing objectives. I&#039;m just interested in making my life a bit easier by reducing the solution set to this convex hull. Why?

I&#039;m into this symbolic regression thing. From the way I view it, each and every fitness case has an unknown associated with it, the noise of that point (even when assuming Gaussian noise). Calculating something as trivial as the MSE is already cheating, as you put in the assumption that every point has exactly the same noise. You&#039;re reducing the objective space from all points to an unweighted average. We don&#039;t know if this is the case; it&#039;s an assumption.

So now, if I have 100 datapoints, I&#039;m theoretically confronted with at least 101 objectives: one objective for each datapoint, and one (or more) objectives for the succinctness of the solution. For starters, each constant that is a target value is part of the convex hull, and thus part of the Pareto set. Now, if I were to do a full Pareto dominance optimization in this 101-dimensional space, this would become a nightmare. Not only does the size of the solution set become intractable (also the convex hull would be huge), the algorithm itself would come to a screeching halt simply because of the sheer complexity of calculating the dominance relation. Convexity can however be used to at least cull the space a bit, and can be used for quicker algorithms to determine if the point is on the hull or not. I would also not want to waste my time examining these concave bends when there are more promising points.

No, I&#039;m currently not tackling problems this way (although I would like to), but just making the point that with a growing number of objectives, utilizing convexity is going to be a help, both in getting the algorithms to perform, and in culling the solution set to the most interesting points. 

Also given the strong theoretical support for convex hulls as objects of interest next to Pareto sets, I&#039;ve been amazed for years now that the Multi-Objective community in EC has completely ignored it, and is fully focused on Pareto sets alone (with additional assumptions (eta-dominance) to manage more-than-two objective problems). Pareto sets are more difficult to find and to manage than convex hulls, and the additional benefit of these concave bends that they can induce is at best marginal. You would find them by examining the second convex hull.

As for assumptions, the additional assumption of convexity over Pareto dominance is that you can indeed stochastically mix solutions. 
As far as assumptions go, this is a much better assumption than assuming that the noise is distributed equally over all your datapoints. Now that&#039;s a dangerous one! ROC curves are also based on the principle of stochastic mixing. It&#039;s sound, it works in many cases, it&#039;s what your decision maker wants.

The convexity assumption usually works (as a first-order approximation) for error measures (actually it only works exactly for absolute error; for squared error, the line between two points has a little bend based on the covariance of the two solutions), and for speed it is quite correct. 

For size it would be incorrect, but given that size itself is already a hopeless measure of complexity (the Kolmogorov kind), Pareto concavity will just as easily lead one astray. 

So, what am I saying here? Possibly: better to add a few more objectives and keep things manageable by mainly considering the convex hull, than to use fewer objectives so that you get this concave stuff as well.

And now I&#039;ve got to quit. Need to get some sleep. Got a scrum first thing tomorrow morning.</description>
		<content:encoded><![CDATA[<p>Wow Bill, a post worthy of a blog entry in its own right. But this thread is getting interesting, so I&#8217;m not letting it die yet.</p>
<p>First of all, I want to re-emphasize that I&#8217;m not advocating linearly combining objectives in order to reduce the objective space. I am simply saying that usually, one is interested in the convex hull of the Pareto set, not the entire set. This is still a large set, and the set needs to be discovered. This may sound like a dangerous simplification to you, but I think it is warranted, especially in the error / size trade-off I sketched. </p>
<p>Multi-objective problems are multi-dimensional, and prematurely reducing them to a single objective will lead to poor results. I completely agree with you that you want the algorithm to search out the alternatives, and let the decision maker pick the best (combination of) solutions *after* the set has been produced.</p>
<p>If you present it to the &#8216;customer&#8217; in this way, even with a full Pareto-based approach, you will notice that the decision maker will always choose a solution on the convex hull, never one in a concave bend. (Yes, I too can be very confident in my assertions without any evidence to back me up.)</p>
<p>Yes, these concave bends are very interesting, almost as interesting as the first front of *dominated* solutions. That is, potentially important, but given that we&#8217;ve got a full convex hull to go through for analysis, they can wait.</p>
<p>I think we&#8217;re on the same page generally, as I too have stared and stared at what this genetic programming is producing. I violently agree that one needs to examine and re-examine everything that is produced, and try out different things. I love to inspect the evolutionary trees that produce a solution (I believe that Nic is one of the few who has that same hobby). I have gotten great results by mixing objectives. I&#8217;m just interested in making my life a bit easier by reducing the solution set to this convex hull. Why?</p>
<p>I&#8217;m into this symbolic regression thing. From the way I view it, each and every fitness case has an unknown associated with it, the noise of that point (even when assuming Gaussian noise). Calculating something as trivial as the MSE is already cheating, as you put in the assumption that every point has exactly the same noise. You&#8217;re reducing the objective space from all points to an unweighted average. We don&#8217;t know if this is the case; it&#8217;s an assumption.</p>
<p>So now, if I have 100 datapoints, I&#8217;m theoretically confronted with at least 101 objectives: one objective for each datapoint, and one (or more) objectives for the succinctness of the solution. For starters, each constant that is a target value is part of the convex hull, and thus part of the Pareto set. Now, if I were to do a full Pareto dominance optimization in this 101-dimensional space, this would become a nightmare. Not only does the size of the solution set become intractable (also the convex hull would be huge), the algorithm itself would come to a screeching halt simply because of the sheer complexity of calculating the dominance relation. Convexity can however be used to at least cull the space a bit, and can be used for quicker algorithms to determine if the point is on the hull or not. I would also not want to waste my time examining these concave bends when there are more promising points.</p>
<p>No, I&#8217;m currently not tackling problems this way (although I would like to), but just making the point that with a growing number of objectives, utilizing convexity is going to be a help, both in getting the algorithms to perform, and in culling the solution set to the most interesting points. </p>
<p>Also given the strong theoretical support for convex hulls as objects of interest next to Pareto sets, I&#8217;ve been amazed for years now that the Multi-Objective community in EC has completely ignored it, and is fully focused on Pareto sets alone (with additional assumptions (eta-dominance) to manage more-than-two objective problems). Pareto sets are more difficult to find and to manage than convex hulls, and the additional benefit of these concave bends that they can induce is at best marginal. You would find them by examining the second convex hull.</p>
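<p>A minimal sketch of the difference between the Pareto set and its convex hull, assuming just two minimized objectives (say error and size). The names <code>dominates</code>, <code>pareto_front</code> and <code>convex_front</code> are hypothetical, and the dominance check is the plain pairwise one, not something tuned for many objectives.</p>
<pre>
```python
from typing import List, Tuple

Point = Tuple[float, float]  # (error, size), both to be minimized


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b on both objectives and differs from b."""
    return b[0] >= a[0] and b[1] >= a[1] and a != b


def pareto_front(points: List[Point]) -> List[Point]:
    """Nondominated points, found with the plain O(n^2) pairwise check."""
    return sorted(p for p in set(points)
                  if not any(dominates(q, p) for q in points))


def convex_front(points: List[Point]) -> List[Point]:
    """Pareto points that also lie on the lower-left convex hull."""
    hull: List[Point] = []
    for p in pareto_front(points):  # sorted by increasing first objective
        # Drop the last hull point while it sits on or above the chord
        # from hull[-2] to p, i.e. while it lies in a concave bend.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull
```
</pre>
<p>With the points (1, 10), (3, 4), (6, 3.5), (10, 1) and (7, 7), the last one is dominated and drops out of both sets, while (6, 3.5) is nondominated but sits in a concave bend, so it stays in the Pareto front and is left out of the convex front.</p>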
<p>As for assumptions, the additional assumption of convexity over Pareto dominance is that you can indeed stochastically mix solutions.<br />
As far as assumptions go, this is a much better assumption than assuming that the noise is distributed equally over all your datapoints. Now that&#8217;s a dangerous one! ROC curves are also based on the principle of stochastic mixing. It&#8217;s sound, it works in many cases, it&#8217;s what your decision maker wants.</p>
<p>The convexity assumption usually works (as a first-order approximation) for error measures (actually it only works exactly for absolute error; for squared error, the line between two points has a little bend based on the covariance of the two solutions), and for speed it is quite correct. </p>
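<p>A rough sketch of that bend for squared error, under the assumption that &#8220;mixing&#8221; here means averaging the two models&#8217; predictions with weight p, so that the residuals blend the same way. The helper names are hypothetical. The gap between the blended MSE and the straight line between the two MSE values works out to -p(1-p) times the mean squared difference of the residuals, which is where the cross term between the two solutions enters.</p>
<pre>
```python
from typing import Sequence


def mse(residuals: Sequence[float]) -> float:
    """Mean squared error of a list of residuals."""
    return sum(r * r for r in residuals) / len(residuals)


def blend_gap(res_a: Sequence[float], res_b: Sequence[float], p: float) -> float:
    """MSE of the blended model minus the straight-line value p*MSE_A + (1-p)*MSE_B."""
    # Residuals of the blended prediction p*A + (1-p)*B blend the same way.
    blended = [p * ra + (1 - p) * rb for ra, rb in zip(res_a, res_b)]
    line = p * mse(res_a) + (1 - p) * mse(res_b)
    return mse(blended) - line


def predicted_gap(res_a: Sequence[float], res_b: Sequence[float], p: float) -> float:
    """Closed form of the same gap: -p(1-p) * mean((rA - rB)^2)."""
    diff = [ra - rb for ra, rb in zip(res_a, res_b)]
    return -p * (1 - p) * mse(diff)
```
</pre>
<p>The gap is never positive, and it vanishes only when the two models make identical errors, so under this assumption the blended model sits on or below the straight line in MSE.</p>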
<p>For size it would be incorrect, but given that size itself is already a hopeless measure of complexity (the Kolmogorov kind), Pareto concavity will just as easily lead one astray. </p>
<p>So, what am I saying here? Possibly: better to add a few more objectives and keep things manageable by mainly considering the convex hull, than to use fewer objectives so that you get this concave stuff as well.</p>
<p>And now I&#8217;ve got to quit. Need to get some sleep. Got a scrum first thing tomorrow morning.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tozier</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52351</link>
		<dc:creator>Tozier</dc:creator>
		<pubDate>Tue, 15 Apr 2008 14:17:20 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52351</guid>
		<description>Good points, Maarten. True enough when we&#039;re talking about offline search and optimization, and when the constraint set is relatively tractable.

In &lt;a href=&quot;http://joechip.net/nudge/&quot; rel=&quot;nofollow&quot;&gt;working on this project&lt;/a&gt;, and writing these posts, I&#039;ve come to realize that my real complaint is that most of offline optimization is short-sighted. &lt;i&gt;Rude&lt;/i&gt; to a real decision-maker. Inagile, and frequently maladaptive.

A lot of my rants through the years, where I ask people why they did something dumb with multiobjective search, are not &lt;i&gt;really&lt;/i&gt; about the convexity structure of solution spaces, or even about premature convergence to misleading subsets of pseudo-optima.

They&#039;re actually questions about evidence. Practices. Whether their decision was informed by the actual structure of the &lt;i&gt;actual solution space being explored&lt;/i&gt;, or if they just did it because it&#039;s the way they learned in school, or because nobody complained before, or because it never broke in earlier problems. Or because it&#039;s &lt;i&gt;easier&lt;/i&gt; that way.

And then I ask: Who told them &quot;easier&quot; was an objective of their project?

And as it happens, I&#039;ve rarely met people who have explored the solution set and mapping to objective space of real problems. And fewer people who have solved problems or run optimization &quot;online&quot; --- paying attention to what&#039;s happening, and changing their goals or definitions.

But your point is absolutely right, and helps clarify my thinking: Of course there are convex projections of high-dimensional multiobjective search spaces onto lower dimensions. One can use an arithmetic combination model in many common cases, whether it&#039;s an affine combination of objectives or even a nonlinear mapping. There are plenty of cunning statistical methods one could use to simplify all kinds of fun stuff in hard problems; principal components analysis leaps to mind.

So yes, you&#039;re right. I&#039;m using hyperbole. What I&#039;m always yelling about is &lt;b&gt;sometimes&lt;/b&gt; unimportant.

When a technician, or even the decision-maker herself, decides to collapse objectives &lt;i&gt;without first modeling the objective space&lt;/i&gt; by sampling or first-principles modeling, that&#039;s a mistake. They may be correct in doing it, in their problem, but the way they did it was a mistake.

Not always a mistake in terms of &lt;i&gt;results&lt;/i&gt;. In the case of many simple search space structures, like the one you point out, it&#039;s AOK.

If simplification tended to give a wrong answer, then the mythology of single-objective search wouldn&#039;t have so much traction. People would know better if they encountered problems sooner.

My point is that for many practitioners, especially &quot;in theory&quot; and in the Academy, life has given them an easy ride. They&#039;re making assumptions based on a string of early successes. People raised in a culture of models assume they can get away with this crap in real life, and I admit that bugs me. Look at Economics for analogies, if you want. Premature simplification without a rigorous reality check is a mistake in terms of &lt;b&gt;project risk&lt;/b&gt;.

Let&#039;s change gears briefly. Look at software development.

Suppose I&#039;m a project manager. Some cowboy solo coder tells me &lt;i&gt;not to worry&lt;/i&gt; about the code he wrote by himself, in the middle of the night, without testing, without having the customer in the room at the time. Without any unit tests. Without a pair programmer. Sounds like a student programmer I used to &lt;strike&gt;know&lt;/strike&gt; be.

That&#039;s project risk: The customer is at risk because the coder might have failed to do what they thought they were doing (&quot;made mistakes&quot;). &lt;i&gt;And also&lt;/i&gt; because the customer might have seen the intermediate results and changed her mind about something she specified earlier in the project. Or might have discovered new business cases that weren&#039;t specified as goals.

And the project is at risk in future because other programmers &lt;i&gt;cannot trust code&lt;/i&gt; like this black box, not without going to the trouble of writing their own suite of automated unit tests. Worst: the rest of the world wasn&#039;t involved in the conversation at all, didn&#039;t share the knowledge that this code represents; I&#039;ll bet even the cowboy and the client will have to start over if they try to do similar work again.

That&#039;s a bunch of risk. More risk if it&#039;s an interesting project. All risks that the project will fail---where success is defined as &lt;i&gt;delivering what the client needs&lt;/i&gt;.

And yes, I know the cowboy coder I described is &quot;doing it the normal way.&quot; That&#039;s often the wrong way.

Is what the client specifies always what she needs? Suppose she came in and handed a three-ring binder of specifications to a smart programming team. Wrote down what she wanted, all the way to the function call structure, walked away, and came back to get the results when the team had written all the software specified in the stupid binder. No questions about language, about typos in the binder, about better ways to write the code that a team might know, no insight gained by actually making the project work. Just up-front specification of the exact solution she wants.

I&#039;ll argue that project is also at risk. For many of the same reasons as the first one. It&#039;s inagile, and any of us who have worked on software development projects know there&#039;s &lt;i&gt;always&lt;/i&gt; a better way to solve a problem than the first one you consider. You always adapt.

Now. Think about any complex search and optimization project. Stock trading. Pharmaceutical design. Supply chain management. Transportation scheduling. Anything more interesting than sticking integer-sized boxes in a backpack so a salesman can deliver them to the Towers of Hanoi.

Think about what you just said about modeling. About your assumption that we &lt;i&gt;know ahead of time&lt;/i&gt; the relationship between variables. Think about what that implies about solutions&#039; interaction with complex nonlinear constraints.

Think about when we make contingent decisions to combine objectives &lt;i&gt;or not&lt;/i&gt;, depending on the way they interact with one another in a given problem.

So for the sake of the other readers, consider the design pattern you just spelled out: Take two separate nondominated solutions (both are elements of the solution set mapped to points in the objective space) and create a new model that&#039;s a simple stochastic mixture. You&#039;ve defined a new point in the solution set---assuming I allowed you to include stochastic combination in the solution set&#039;s definition---and this maps to a new point in objective space.

Your point is that for certain aggregate error measures, this new point will lie between the first two in the objective space.

If we (whoever &quot;we&quot; is) are &lt;i&gt;certain&lt;/i&gt; we&#039;ll never, ever care about anything but the average error, never care about that little constant you mention,  or the description length of your second-order solution, but only the number of states it passes through in executing its algorithm... sure. You&#039;re right.

How&#039;s that work if we ever consider the variance of the errors, though? What if the nondominated &quot;simple error-prone&quot; model is a coin-flip itself? What if I didn&#039;t let you have stochastic elements when I specified the problem, if all models were deterministic? What if the data we&#039;re testing your models on is itself full of structure, of the type which biases the log-likelihood errors?

What if, what if? 

That&#039;s really the basis of all these years of complaint about multiobjective search. &lt;b&gt;There are always contingencies.&lt;/b&gt; Always reasons to change, to improve, to adapt any model, any algorithm, any approach.

Linear programming is a stupid way to solve problems more often than it&#039;s the right way... but it&#039;s right sometimes. Neural nets are stupid more often than they&#039;re useful, but sometimes they&#039;re the right way. Single-objective search is stupid more often than not. Nonadaptive search, offline optimization: stupid. Blind application of any methodology, or making assumptions about objectives or constraints or parameters or relations or constants: stupid.

We all know this.

So let&#039;s go back to that software analogy I was using, just for one minute. If I were running a programming team, they would be Agile. They&#039;d be using XP or Scrum or something to address that risk I mentioned. There would be a customer on the team. There would be tests. There would be objective, communicative tests of all design decisions.

So: &lt;i&gt;Write me a test that will detect the conditions under which a mapping from solution set to objective space can be assumed, with high probability, to be convex.&lt;/i&gt; Alternately, and more usefully, write me a test that will detect a failure in that assumption. Something I can attach to any search method I choose, as a monitor or a decorator, and see when things are getting out of hand. A diagnostic. You can do it statistically, or analytically. I don&#039;t care.

When a decision-maker can &lt;i&gt;ask&lt;/i&gt; me to reduce the number of objectives for her problem, and we can run some tests and she can decide with confidence whether it&#039;s safe or not, then I&#039;ll be happy. When I am in the middle of a project and we&#039;ve decided to collapse some objectives and a little alarm goes off saying &quot;FAIL: convexity assumption violated&quot;, then I&#039;ll be happy.

If I have to &lt;i&gt;remember to check all the time&lt;/i&gt;... well, that&#039;s stupid. Just as stupid as if I had started using Linear Programming at the beginning without checking to see whether the problem was well suited to it, and spent hours running LP algorithms with no way of knowing if the solutions were converging.

See: you and I, and &lt;a href=&quot;http://www.morris.umn.edu/~mcphee/&quot; rel=&quot;nofollow&quot;&gt;Nic&lt;/a&gt; and &lt;a href=&quot;http://cswww.essex.ac.uk/staff/rpoli/&quot; rel=&quot;nofollow&quot;&gt;Riccardo&lt;/a&gt; and &lt;a href=&quot;http://www.genetic-programming.com/johnkoza.html&quot; rel=&quot;nofollow&quot;&gt;Koza&lt;/a&gt; and &lt;a href=&quot;http://joechip.net/nudge&quot; rel=&quot;nofollow&quot;&gt;my friends on the Nudge project&lt;/a&gt; and all the rest, we&#039;re in worse shape than almost every other search geek. Think about it. Go back to that software project analogy. Not only are we trying to model complicated problems at the same time we&#039;re searching through them---in the analogy, specify and refine business deliverables at the same time they&#039;re being developed. We&#039;re using &lt;i&gt;genetic fucking programming&lt;/i&gt;.

We get solutions, but we have to figure out how the solutions &lt;i&gt;work&lt;/i&gt;. Think about it. A million cowboy programmers writing a million black boxes, all trying to write programs &lt;i&gt;that we specify up front&lt;/i&gt;?

Or are we better off if we watch what search is making them discover, and reach in and interact, and drive solutions toward what we &lt;i&gt;mutually discover&lt;/i&gt; are not only better solutions to what we specified initially, but also better-specified?

How many times have we seen a GP search spit out an answer that optimizes everything we asked for... but which nonetheless &quot;cheats&quot;? Fails because of something we didn&#039;t spell out.

I prefer dialog. Agility. Constant interactivity. On-line search and optimization, only, all the way down. &lt;b&gt;Because I don&#039;t even trust myself&lt;/b&gt; to specify or solve interesting problems. I don&#039;t trust myself to know what I want, or how to describe it, or to know beforehand how my toolkit will interact with it. I don&#039;t even trust myself to stick to the plan. I don&#039;t trust myself programming.

&lt;i&gt;If I don&#039;t even trust myself&lt;/i&gt;, even though I so obviously think highly of myself, consider how much I trust programmers writing programs for me, or running an optimization for me, or optimizers solving other complex problems  for me.

Especially when the &quot;programmer&quot; or the optimizer isn&#039;t even human.

So that&#039;s why I continue advising people never to simplify prematurely. I think I&#039;d advise them never to do anything algorithmic at all unless they&#039;ve got fine-grained automated tests in place to cover the details, and reliable acceptance tests in place to interrogate prospective &quot;solutions&quot;.

But I don&#039;t know what those tests are, yet. You sketched one, though you didn&#039;t present a reliable automated version yet. Until then, every time we step ahead without those explicit tests in place---so we know when something complicated happens---our project is at greater risk of failure.

And combining objectives is exactly the sort of step I mean.
</description>
		<content:encoded><![CDATA[<p>Good points, Maarten. True enough when we&#8217;re talking about offline search and optimization, and when the constraint set is relatively tractable.</p>
<p>In <a href="http://joechip.net/nudge/" rel="nofollow">working on this project</a>, and writing these posts, I&#8217;ve come to realize that my real complaint is that most of offline optimization is short-sighted. <i>Rude</i> to a real decision-maker. Inagile, and frequently maladaptive.</p>
<p>A lot of my rants through the years, where I ask people why they did something dumb with multiobjective search, are not <i>really</i> about the convexity structure of solution spaces, or even about premature convergence to misleading subsets of pseudo-optima.</p>
<p>They&#8217;re actually questions about evidence. Practices. Whether their decision was informed by the actual structure of the <i>actual solution space being explored</i>, or if they just did it because it&#8217;s the way they learned in school, or because nobody complained before, or because it never broke in earlier problems. Or because it&#8217;s <i>easier</i> that way.</p>
<p>And then I ask: Who told them &#8220;easier&#8221; was an objective of their project?</p>
<p>And as it happens, I&#8217;ve rarely met people who have explored the solution set and mapping to objective space of real problems. And fewer people who have solved problems or run optimization &#8220;online&#8221; &#8212; paying attention to what&#8217;s happening, and changing their goals or definitions.</p>
<p>But your point is absolutely right, and helps clarify my thinking: Of course there are convex projections of high-dimensional multiobjective search spaces onto lower dimensions. One can use an arithmetic combination model in many common cases, whether it&#8217;s an affine combination of objectives or even a nonlinear mapping. There are plenty of cunning statistical methods one could use to simplify all kinds of fun stuff in hard problems; principal components analysis leaps to mind.</p>
<p>So yes, you&#8217;re right. I&#8217;m using hyperbole. What I&#8217;m always yelling about is <b>sometimes</b> unimportant.</p>
<p>When a technician, or even the decision-maker herself, decides to collapse objectives <i>without first modeling the objective space</i> by sampling or first-principles modeling, that&#8217;s a mistake. They may be correct in doing it, in their problem, but the way they did it was a mistake.</p>
<p>Not always a mistake in terms of <i>results</i>. In the case of many simple search space structures, like the one you point out, it&#8217;s AOK.</p>
<p>If simplification tended to give a wrong answer, then the mythology of single-objective search wouldn&#8217;t have so much traction. People would know better if they encountered problems sooner.</p>
<p>My point is that for many practitioners, especially &#8220;in theory&#8221; and in the Academy, life has given them an easy ride. They&#8217;re making assumptions based on a string of early successes. People raised in a culture of models assume they can get away with this crap in real life, and I admit that bugs me. Look at Economics for analogies, if you want. Premature simplification without a rigorous reality check is a mistake in terms of <b>project risk</b>.</p>
<p>Let&#8217;s change gears briefly. Look at software development.</p>
<p>Suppose I&#8217;m a project manager. Some cowboy solo coder tells me <i>not to worry</i> about the code he wrote by himself, in the middle of the night, without testing, without having the customer in the room at the time. Without any unit tests. Without a pair programmer. Sounds like a student programmer I used to <strike>know</strike> be.</p>
<p>That&#8217;s project risk: The customer is at risk because the coder might have failed to do what they thought they were doing (&#8220;made mistakes&#8221;). <i>And also</i> because the customer might have seen the intermediate results and changed her mind about something she specified earlier in the project. Or might have discovered new business cases that weren&#8217;t specified as goals.</p>
<p>And the project is at risk in future because other programmers <i>cannot trust code</i> like this black box, not without going to the trouble of writing their own suite of automated unit tests. Worst: the rest of the world wasn&#8217;t involved in the conversation at all, didn&#8217;t share the knowledge that this code represents; I&#8217;ll bet even the cowboy and the client will have to start over if they try to do similar work again.</p>
<p>That&#8217;s a bunch of risk. More risk if it&#8217;s an interesting project. All risks that the project will fail&#8212;where success is defined as <i>delivering what the client needs</i>.</p>
<p>And yes, I know the cowboy coder I described is &#8220;doing it the normal way.&#8221; That&#8217;s often the wrong way.</p>
<p>Is what the client specifies always what she needs? Suppose she came in and handed a three-ring binder of specifications to a smart programming team. Wrote down what she wanted, all the way to the function call structure, walked away, and came back to get the results when the team had written all the software specified in the stupid binder. No questions about language, about typos in the binder, about better ways to write the code that a team might know, no insight gained by actually making the project work. Just up-front specification of the exact solution she wants.</p>
<p>I&#8217;ll argue that project is also at risk. For many of the same reasons as the first one. It&#8217;s inagile, and any of us who have worked on software development projects know there&#8217;s <i>always</i> a better way to solve a problem than the first one you consider. You always adapt.</p>
<p>Now. Think about any complex search and optimization project. Stock trading. Pharmaceutical design. Supply chain management. Transportation scheduling. Anything more interesting than sticking integer-sized boxes in a backpack so a salesman can deliver them to the Towers of Hanoi.</p>
<p>Think about what you just said about modeling. About your assumption that we <i>know ahead of time</i> the relationship between variables. Think about what that implies about solutions&#8217; interaction with complex nonlinear constraints.</p>
<p>Think about when we make contingent decisions to combine objectives <i>or not</i>, depending on the way they interact with one another in a given problem.</p>
<p>So for sake of the other readers, consider the design pattern you just spelled out: Take two separate nondominated solutions (both are elements of the solution set mapped to points in the objective space) and create a new model that&#8217;s a simple stochastic mixture. You&#8217;ve defined a new point in the solution set&#8212;assuming I allowed you to include stochastic combination in the solution set&#8217;s definition&#8212;and this maps to a new point in objective space.</p>
<p>Your point is that for certain aggregate error measures, this new point will lie between the first two in the objective space.</p>
<p>If we (whoever &#8220;we&#8221; is) are <i>certain</i> we&#8217;ll never, ever care about anything but the average error, never care about that little constant you mention, or the description length of your second-order solution, but only the number of states it passes through in executing its algorithm&#8230; sure. You&#8217;re right.</p>
<p>How&#8217;s that work if we ever consider the variance of the errors, though? What if the nondominated &#8220;simple error-prone&#8221; model is a coin-flip itself? What if I didn&#8217;t let you have stochastic elements when I specified the problem, if all models were deterministic? What if the data we&#8217;re testing your models on is itself full of structure, of the type which biases the log-likelihood errors?</p>
<p>What if, what if? </p>
<p>That&#8217;s really the basis of all these years of complaint about multiobjective search. <b>There are always contingencies.</b> Always reasons to change, to improve, to adapt any model, any algorithm, any approach.</p>
<p>Linear programming is a stupid way to solve problems more often than it&#8217;s the right way&#8230; but it&#8217;s right sometimes. Neural nets are stupid more often than they&#8217;re useful, but sometimes they&#8217;re the right way. Single-objective search is stupid more often than not. Nonadaptive search, offline optimization: stupid. Blind application of any methodology, or making assumptions about objectives or constraints or parameters or relations or constants: stupid.</p>
<p>We all know this.</p>
<p>So let&#8217;s go back to that software analogy I was using, just for one minute. If I were running a programming team, they would be Agile. They&#8217;d be using XP or Scrum or something to address that risk I mentioned. There would be a customer on the team. There would be tests. There would be objective, communicative tests of all design decisions.</p>
<p>So: <i>Write me a test that will detect the conditions under which a mapping from solution set to objective space can be assumed, with high probability, to be convex.</i> Alternately, and more usefully, write me a test that will detect a failure in that assumption. Something I can attach to any search method I choose, as a monitor or a decorator, and see when things are getting out of hand. A diagnostic. You can do it statistically, or analytically. I don&#8217;t care.</p>
<p>When a decision-maker can <i>ask</i> me to reduce the number of objectives for her problem, and we can run some tests and she can decide with confidence whether it&#8217;s safe or not, then I&#8217;ll be happy. When I am in the middle of a project and we&#8217;ve decided to collapse some objectives and a little alarm goes off saying &#8220;FAIL: convexity assumption violated&#8221;, then I&#8217;ll be happy.</p>
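<p>A minimal sketch of the kind of monitor I mean, assuming just two objectives that are both minimized. The names (<code>nondominated</code>, <code>lower_hull</code>, <code>convexity_violations</code>) are hypothetical, not from any library. It flags nondominated points that sit above the lower convex hull of the front, which is where collapsing objectives into a weighted sum would silently lose solutions.</p>

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective_1, objective_2), both minimized


def nondominated(points: List[Point]) -> List[Point]:
    """Return the Pareto front, sorted by the first objective."""
    front: List[Point] = []
    for p in sorted(set(points)):
        # In sorted order, p is nondominated only if it improves on the
        # best second objective seen so far.
        if not front or p[1] < front[-1][1]:
            front.append(p)
    return front


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(front: List[Point]) -> List[Point]:
    """Lower convex hull of a front already sorted by the first objective."""
    hull: List[Point] = []
    for p in front:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_height(hull: List[Point], x: float) -> float:
    """Second-objective value of the hull at first-objective value x."""
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return hull[0][1]  # hull is a single point


def convexity_violations(points: List[Point], tol: float = 1e-9) -> List[Point]:
    """Nondominated points lying above the convex hull of the front."""
    front = nondominated(points)
    if not front:
        return []
    hull = lower_hull(front)
    return [p for p in front if p[1] - hull_height(hull, p[0]) > tol]


if __name__ == "__main__":
    archive = [(0, 4), (3, 3), (4, 0)]
    bad = convexity_violations(archive)
    if bad:
        print("FAIL: convexity assumption violated at", bad)
```

<p>Hook something like this onto the archive of any search as it runs, and the alarm goes off the moment a nondominated solution turns up in a concave region. It says nothing about more than two objectives, or about noisy objective estimates; those would need a statistical version.</p>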
<p>If I have to <i>remember to check all the time</i>&#8230; well, that&#8217;s stupid. Just as stupid as if I had started using Linear Programming at the beginning without checking to see whether the problem was well suited to it, and spent hours running LP algorithms with no way of knowing if the solutions were converging.</p>
<p>See: you and I, and <a href="http://www.morris.umn.edu/~mcphee/" rel="nofollow">Nic</a> and <a href="http://cswww.essex.ac.uk/staff/rpoli/" rel="nofollow">Riccardo</a> and <a href="http://www.genetic-programming.com/johnkoza.html" rel="nofollow">Koza</a> and <a href="http://joechip.net/nudge" rel="nofollow">my friends on the Nudge project</a> and all the rest, we&#8217;re in worse shape than almost every other search geek. Think about it. Go back to that software project analogy. Not only are we trying to model complicated problems at the same time we&#8217;re searching through them&#8212;in the analogy, specify and refine business deliverables at the same time they&#8217;re being developed. We&#8217;re using <i>genetic fucking programming</i>.</p>
<p>We get solutions, but we have to figure out how the solutions <i>work</i>. Think about it. A million cowboy programmers writing a million black boxes, all trying to write programs <i>that we specify up front</i>?</p>
<p>Or are we better off if we watch what search is making them discover, and reach in and interact, and drive solutions toward what we <i>mutually discover</i> are not only better solutions to what we specified initially, but also better-specified?</p>
<p>How many times have we seen a GP search spit out an answer that optimizes everything we asked for&#8230; but which nonetheless &#8220;cheats&#8221;? Fails because of something we didn&#8217;t spell out.</p>
<p>I prefer dialog. Agility. Constant interactivity. On-line search and optimization, only, all the way down. <b>Because I don&#8217;t even trust myself</b> to specify or solve interesting problems. I don&#8217;t trust myself to know what I want, or how to describe it, or to know beforehand how my toolkit will interact with it. I don&#8217;t even trust myself to stick to the plan. I don&#8217;t trust myself programming.</p>
<p><i>If I don&#8217;t even trust myself</i>, even though I so obviously think highly of myself, consider how much I trust programmers writing programs for me, or running an optimization for me, or optimizers solving other complex problems for me.</p>
<p>Especially when the &#8220;programmer&#8221; or the optimizer isn&#8217;t even human.</p>
<p>So that&#8217;s why I continue advising people never to simplify prematurely. I think I&#8217;d advise them never to do anything algorithmic at all unless they&#8217;ve got fine-grained automated tests in place to cover the details, and reliable acceptance tests in place to interrogate prospective &#8220;solutions&#8221;.</p>
<p>But I don&#8217;t know what those tests are, yet. You sketched one, though you didn&#8217;t present a reliable automated version yet. Until then, every time we step ahead without those explicit tests in place&#8212;so we know when something complicated happens&#8212;our project is at greater risk of failure.</p>
<p>And combining objectives is exactly the sort of step I mean.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Maarten Keijzer</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52350</link>
		<dc:creator>Maarten Keijzer</dc:creator>
		<pubDate>Tue, 15 Apr 2008 10:39:18 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52350</guid>
		<description>Hi Bill,

I think your comment about convexity/concavity of the solution set is a bit off the mark. Using MDL-type arguments, you can actually show that the front of non-dominated solutions for &#039;log-likelihood&#039;-type error functions is necessarily convex. I.e., you will try to minimize

error + constant * complexity

or alternatively (as a likelihood function):

p(D &#124; M) * P(M)

(where &#039;constant&#039; is hidden as an exponent in the size prior in P(M)).

Of course, fixing the constant before the run is no solution, as you state; you need a multi-objective search to find the trade-off. However, strict Pareto dominance is a bit of overkill in this situation, as you can use convexity to cull the solution space.

For the sake of argument, assume that we use &#039;speed&#039; instead of &#039;size&#039; as a measure of model complexity. Apart from this being a more natural measure, it also allows one to highlight the convexity.

Suppose one has two solutions, each running at a different speed. They also have different errors (MSE). The slower solution has a smaller error than the faster solution (they are non-dominated).

Now create a &#039;combined model&#039;, which selects a model with a probability for the fitness cases. Flip a biased coin using this probability and select the first or second model based on this outcome. This means that each solution will be used randomly, biased by the coin.

Then, the error/speed trade-off of this model will necessarily lie on the line between the two original models: its expected error is a weighted average of the errors of the two original models, while its expected speed is the weighted average of the speeds of the two original models (plus a small constant for tossing the coin, which we ignore).

Hence, any two non-dominated models dominate the entire area that lies above the line spanned by the combined model, and the solution set is convex. The Pareto set that is usually used is larger than the true solution set.

(Of course, this whole argument depends on the fact that we assume that the &#039;size&#039; of a model is commensurable with the &#039;complexity&#039; of a model. In general, Kolmogorov disagrees. The speed argument however holds.)</description>
		<content:encoded><![CDATA[<p>Hi Bill,</p>
<p>I think your comment about convexity/concavity of the solution set is a bit off the mark. Using MDL-type arguments, you can actually show that the front of non-dominated solutions for &#8216;log-likelihood&#8217;-type error functions is necessarily convex. I.e., you will try to minimize</p>
<p>error + constant * complexity</p>
<p>or alternatively (as a likelihood function):</p>
<p>p(D | M) * P(M)</p>
<p>(where &#8216;constant&#8217; is hidden as an exponent in the size prior in P(M)).</p>
<p>Of course, fixing the constant before the run is no solution, as you state; you need a multi-objective search to find the trade-off. However, strict Pareto dominance is a bit of overkill in this situation, as you can use convexity to cull the solution space.</p>
<p>For the sake of argument, assume that we use &#8216;speed&#8217; instead of &#8216;size&#8217; as a measure of model complexity. Apart from this being a more natural measure, it also allows one to highlight the convexity.</p>
<p>Suppose one has two solutions, each running at a different speed. They also have different errors (MSE). The slower solution has a smaller error than the faster solution (they are non-dominated).</p>
<p>Now create a &#8216;combined model&#8217;, which selects a model with a probability for the fitness cases. Flip a biased coin using this probability and select the first or second model based on this outcome. This means that each solution will be used randomly, biased by the coin.</p>
<p>Then, the error/speed trade-off of this model will necessarily lie on the line between the two original models: its expected error is a weighted average of the errors of the two original models, while its expected speed is the weighted average of the speeds of the two original models (plus a small constant for tossing the coin, which we ignore).</p>
<p>Hence, any two non-dominated models dominate the entire area that lies above the line spanned by the combined model, and the solution set is convex. The Pareto set that is usually used is larger than the true solution set.</p>
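<p>To make the arithmetic concrete, here is a small sketch of the combined model. The numbers and the name <code>mixture</code> are made up for illustration, and the constant cost of the coin toss is ignored as above. Each model is written as a pair (MSE, run time), and the mixture uses the first model with probability <code>p</code>.</p>

```python
from typing import Tuple

Model = Tuple[float, float]  # (mean squared error, run time); illustrative values only


def mixture(model_a: Model, model_b: Model, p: float) -> Model:
    """Expected (error, run time) when model_a is picked with probability p."""
    err_a, time_a = model_a
    err_b, time_b = model_b
    return (p * err_a + (1 - p) * err_b, p * time_a + (1 - p) * time_b)


fast = (0.40, 1.0)  # quick but error-prone
slow = (0.10, 5.0)  # accurate but slow

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    err, time = mixture(fast, slow, p)
    # Every mixture point is a convex combination of the two endpoints,
    # so it falls on the straight line joining them in objective space.
    print(f"p={p:.2f}  expected error={err:.3f}  expected time={time:.2f}")
```

<p>Any candidate model whose (error, time) point lies above that line is beaten, in expectation, by some coin-flip mixture of the two.</p>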
<p>(Of course, this whole argument depends on the fact that we assume that the &#8216;size&#8217; of a model is commensurable with the &#8216;complexity&#8217; of a model. In general, Kolmogorov disagrees. The speed argument however holds.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Short Notes: Quantitative Hedge Funds, Google App Engine, DTH, itimes &#171; Blue Screen Of Duds</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52342</link>
		<dc:creator>Short Notes: Quantitative Hedge Funds, Google App Engine, DTH, itimes &#171; Blue Screen Of Duds</dc:creator>
		<pubDate>Wed, 09 Apr 2008 07:39:06 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52342</guid>
		<description>[...] more than blondes and brunettes. In a somewhat-related topic we have a fascinating entry on search algorithms that also mentions Pareto in the same breath. Most of the math in it flies like a supersonic above my head (confession time, [...]</description>
		<content:encoded><![CDATA[<p>[...] more than blondes and brunettes. In a somewhat-related topic we have a fascinating entry on search algorithms that also mentions Pareto in the same breath. Most of the math in it flies like a supersonic above my head (confession time, [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: things to look at (april 1st - april 8th) &#124; stimulant - changing things around. . .</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52341</link>
		<dc:creator>things to look at (april 1st - april 8th) &#124; stimulant - changing things around. . .</dc:creator>
		<pubDate>Wed, 09 Apr 2008 02:03:58 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52341</guid>
		<description>[...] Notional Slurry &#187; Search algorithms [...]</description>
		<content:encoded><![CDATA[<p>[...] Notional Slurry &raquo; Search algorithms [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tozier</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52339</link>
		<dc:creator>Tozier</dc:creator>
		<pubDate>Tue, 08 Apr 2008 12:10:03 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52339</guid>
		<description>Daniel,

Please post only publicly readable links to scholarly works. JSTOR links are not readable outside the academy.</description>
		<content:encoded><![CDATA[<p>Daniel,</p>
<p>Please post only publicly readable links to scholarly works. JSTOR links are not readable outside the academy.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tozier</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52338</link>
		<dc:creator>Tozier</dc:creator>
		<pubDate>Tue, 08 Apr 2008 11:34:55 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52338</guid>
		<description>Nobody ever said anything about enumerating the entirety of the Pareto front. And did I say the decision-maker has no right to minimize computation effort (or clock time spent) as an objective?

&quot;My position remains&quot; is what&#039;s throwing me off; did I come across as trying to convince you of something? Or are you chatting with Adan?

In any case, you&#039;re saying things I agree with entirely, in a tone I&#039;m not sure I get. Yes, you&#039;re right. On all counts.</description>
		<content:encoded><![CDATA[<p>Nobody ever said anything about enumerating the entirety of the Pareto front. And did I say the decision-maker has no right to minimize computation effort (or clock time spent) as an objective?</p>
<p>&#8220;My position remains&#8221; is what&#8217;s throwing me off; did I come across as trying to convince you of something? Or are you chatting with Adan?</p>
<p>In any case, you&#8217;re saying things I agree with entirely, in a tone I&#8217;m not sure I get. Yes, you&#8217;re right. On all counts.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52334</link>
		<dc:creator>Daniel</dc:creator>
		<pubDate>Tue, 08 Apr 2008 05:04:53 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52334</guid>
		<description>My position remains that advocating one way to perform all multiple objective optimisation problems is a gross simplification. It is better to understand what the decision maker wants and allow that to guide our problem solving approach. 

From http://www.jstor.org/pss/2581556:
A Priori Preference Articulation: Combines all objectives into a single objective through the use of an aggregation function, turning the multi-objective problem into a single-objective problem to search. 
Progressive Preference Articulation: May have partial preference information, and this preference information is adjusted as the search continues by interpreting the results of the search.
A Posteriori Preference Articulation: No preference information, instead the decision maker is presented with a set of candidate solutions (generated by some search process) to choose from.

If we know that the decision maker wants a specific trade-off, it would be useless to search the full extent of the Pareto front (wasted computation). Similarly, if we don&#039;t know what the decision maker wants, it would be silly to suggest one specific trade-off as being &#039;optimal&#039;.</description>
		<content:encoded><![CDATA[<p>My position remains that advocating one way to perform all multiple objective optimisation problems is a gross simplification. It is better to understand what the decision maker wants and allow that to guide our problem solving approach. </p>
<p>From <a href="http://www.jstor.org/pss/2581556" rel="nofollow">http://www.jstor.org/pss/2581556</a>:<br />
A Priori Preference Articulation: Combines all objectives into a single objective through the use of an aggregation function, turning the multi-objective problem into a single-objective problem to search.<br />
Progressive Preference Articulation: May have partial preference information, and this preference information is adjusted as the search continues by interpreting the results of the search.<br />
A Posteriori Preference Articulation: No preference information, instead the decision maker is presented with a set of candidate solutions (generated by some search process) to choose from.</p>
<p>If we know that the decision maker wants a specific trade-off, it would be useless to search the full extent of the Pareto front (wasted computation). Similarly, if we don&#8217;t know what the decision maker wants, it would be silly to suggest one specific trade-off as being &#8216;optimal&#8217;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tozier</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52332</link>
		<dc:creator>Tozier</dc:creator>
		<pubDate>Mon, 07 Apr 2008 23:25:53 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52332</guid>
		<description>lorg,

I wouldn&#039;t call it an &quot;article&quot;. It&#039;s a post in a rambling story that&#039;s really happening &lt;a href=&quot;http://joechip.net/nudge&quot; rel=&quot;nofollow&quot;&gt;over here&lt;/a&gt;. We&#039;re building a[nother] open-source genetic programming system, and hopefully one that&#039;s not as stupid as all the rest in the world, and maybe as useful eventually as the good ones like &lt;a href=&quot;http://www.cs.gmu.edu/~eclab/projects/ecj/&quot; rel=&quot;nofollow&quot;&gt;ECJ&lt;/a&gt;.

Point of order, though: Genetic Programming is &lt;i&gt;not&lt;/i&gt; just a genetic algorithm. Genetic Programming is search through a &lt;b&gt;set of algorithms&lt;/b&gt;, complex grammatical structures, or functions, not search through a parameter space. The &quot;genetic&quot; part of the title is essentially a red herring, too: what I described above is random search, and a couple of population-based metaheuristics that are somewhat kind of like GAs, but not really.

Have a look at &lt;a href=&quot;http://www.lulu.com/content/2167025&quot; rel=&quot;nofollow&quot;&gt;Riccardo, Bill and Nic&#039;s book [a free download]&lt;/a&gt;, for more information on GP vs. GAs.</description>
		<content:encoded><![CDATA[<p>lorg,</p>
<p>I wouldn&#8217;t call it an &#8220;article&#8221;. It&#8217;s a post in a rambling story that&#8217;s really happening <a href="http://joechip.net/nudge" rel="nofollow">over here</a>. We&#8217;re building a[nother] open-source genetic programming system, and hopefully one that&#8217;s not as stupid as all the rest in the world, and maybe as useful eventually as the good ones like <a href="http://www.cs.gmu.edu/~eclab/projects/ecj/" rel="nofollow">ECJ</a>.</p>
<p>Point of order, though: Genetic Programming is <i>not</i> just a genetic algorithm. Genetic Programming is search through a <b>set of algorithms</b>, complex grammatical structures, or functions, not search through a parameter space. The &#8220;genetic&#8221; part of the title is essentially a red herring, too: what I described above is random search, and a couple of population-based metaheuristics that are somewhat kind of like GAs, but not really.</p>
<p>Have a look at <a href="http://www.lulu.com/content/2167025" rel="nofollow">Riccardo, Bill and Nic&#8217;s book [a free download]</a>, for more information on GP vs. GAs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tozier</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52331</link>
		<dc:creator>Tozier</dc:creator>
		<pubDate>Mon, 07 Apr 2008 23:05:45 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52331</guid>
		<description>Adan,

Multiobjective optimization is, indeed, subjective. That&#039;s the point. Any idiot who linearizes two objectives, sets up an affine weighting scheme that says &quot;0.8 safety + 0.2 speed&quot;, they&#039;re... well, they&#039;re an idiot. They&#039;re stepping on the domain of the decision-maker: (a) they&#039;re presupposing that the objectives are reconcilable (meaning, the solution set is convex, not concave) and (b) they&#039;re presuming that externalities not captured by those apples and oranges won&#039;t skew things in another direction. Many&#039;s the time when a customer/decision-maker has said, &quot;Give me all the optimal solutions, and I&#039;ll pick the best one.&quot; That&#039;s not dumb on their part; it&#039;s dumb on your part if you prematurely limit their choices.

Best &lt;b&gt;always&lt;/b&gt; to keep all objectives orthogonal, and let the decision-maker take her pick.

So, in sum: You don&#039;t determine weighting. Weighting is &lt;i&gt;not your job&lt;/i&gt;, it&#039;s a domain-specific expert&#039;s job. Odds are any affine weighting or even nonlinear combination functional is in practice a lie or a misstep &lt;b&gt;every time&lt;/b&gt; it&#039;s assigned before searching. Weight &lt;i&gt;after&lt;/i&gt; you search; let the decision-maker consider all the Pareto-optimal solutions, and take their pick.

Try it my way first, and see what clients---actual people making actual decisions---think. And maybe some of the things you&#039;re assuming about multiobjective optimization are misapprehensions of the terminology, the methodology, and the intent. If a customer wants to optimize for both apples and oranges, then why insist they do the &quot;simpler&quot; thing and pick one, or make up some stupid fake thing that supposedly combines them? Why not just do the relatively easy math, use the correct algorithm for the problem at hand, and optimize for both objectives simultaneously?

I have a strong feeling your answer will involve fictions like &quot;tractability&quot; and &quot;standard algorithms&quot;. Which are, no matter how important one&#039;s department is in a field, lazy BS.

That&#039;s my opinion, of course. You&#039;re welcome to carry on however you feel is appropriate for your customers&#039; needs. :)</description>
		<content:encoded><![CDATA[<p>Adan,</p>
<p>Multiobjective optimization is, indeed, subjective. That&#8217;s the point. Any idiot who linearizes two objectives, sets up an affine weighting scheme that says &#8220;0.8 safety + 0.2 speed&#8221;, they&#8217;re&#8230; well, they&#8217;re an idiot. They&#8217;re stepping on the domain of the decision-maker: (a) they&#8217;re presupposing that the objectives are reconcilable (meaning, the solution set is convex, not concave) and (b) they&#8217;re presuming that externalities not captured by those apples and oranges won&#8217;t skew things in another direction. Many&#8217;s the time when a customer/decision-maker has said, &#8220;Give me all the optimal solutions, and I&#8217;ll pick the best one.&#8221; That&#8217;s not dumb on their part; it&#8217;s dumb on your part if you prematurely limit their choices.</p>
<p>Best <b>always</b> to keep all objectives orthogonal, and let the decision-maker take her pick.</p>
<p>So, in sum: You don&#8217;t determine weighting. Weighting is <i>not your job</i>, it&#8217;s a domain-specific expert&#8217;s job. Odds are any affine weighting or even nonlinear combination functional is in practice a lie or a misstep <b>every time</b> it&#8217;s assigned before searching. Weight <i>after</i> you search; let the decision-maker consider all the Pareto-optimal solutions, and take their pick.</p>
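<p>Here is a toy illustration of point (a), a sketch with made-up numbers rather than anything from a real client. Three nondominated solutions are scored on two objectives, both minimized; the middle one sits in a concave part of the front. The helper <code>weighted_sum_winners</code> is a hypothetical name, and it simply sweeps the weight from 0 to 1 and records which solution a weighted sum would pick.</p>

```python
from typing import List, Set, Tuple

Solution = Tuple[float, float]  # (objective_1, objective_2), both minimized


def weighted_sum_winners(front: List[Solution], steps: int = 101) -> Set[Solution]:
    """Solutions that minimize w*f1 + (1-w)*f2 for at least one weight w in [0, 1]."""
    winners: Set[Solution] = set()
    for i in range(steps):
        w = i / (steps - 1)
        winners.add(min(front, key=lambda s: w * s[0] + (1 - w) * s[1]))
    return winners


# All three are Pareto-optimal: none is at least as good as another on both objectives.
front = [(0.0, 4.0), (3.0, 3.0), (4.0, 0.0)]

winners = weighted_sum_winners(front)
missed = [s for s in front if s not in winners]
print("picked by some weighting:", sorted(winners))
print("never picked by any weighting:", missed)
```

<p>The compromise solution scores 3 under every weighting, while one of the two extremes always scores 2 or less, so no choice of weights made before the search will ever show it to the decision-maker.</p>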
<p>Try it my way first, and see what clients&#8212;actual people making actual decisions&#8212;think. And maybe some of the things you&#8217;re assuming about multiobjective optimization are misapprehensions of the terminology, the methodology, and the intent. If a customer wants to optimize for both apples and oranges, then why insist they do the &#8220;simpler&#8221; thing and pick one, or make up some stupid fake thing that supposedly combines them? Why not just do the relatively easy math, use the correct algorithm for the problem at hand, and optimize for both objectives simultaneously?</p>
<p>I have a strong feeling your answer will involve fictions like &#8220;tractability&#8221; and &#8220;standard algorithms&#8221;. Which are, no matter how important one&#8217;s department is in a field, lazy BS.</p>
<p>That&#8217;s my opinion, of course. You&#8217;re welcome to carry on however you feel is appropriate for your customers&#8217; needs. <img src='http://williamtozier.com/slurry/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adan</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52330</link>
		<dc:creator>Adan</dc:creator>
		<pubDate>Mon, 07 Apr 2008 22:33:01 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52330</guid>
		<description>I&#039;ve never been a fan of multi-objective functionals.  In fact, I would say they are somewhat meaningless.  How does one determine the weighting between any particular objective (usually completely subjective)?  This is similar to adding apples and oranges.  For a cost functional to make sense it must be composed of common units.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve never been a fan of multi-objective functionals.  In fact, I would say they are somewhat meaningless.  How does one determine the weighting between any particular objective (usually completely subjective)?  This is similar to adding apples and oranges.  For a cost functional to make sense it must be composed of common units.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lorg</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52329</link>
		<dc:creator>lorg</dc:creator>
		<pubDate>Mon, 07 Apr 2008 20:55:19 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52329</guid>
		<description>Excellent and very interesting read. I&#039;ll be sure to try that out when I get the opportunity.

A few months ago I tried solving some small problem with what I remembered about GAs and had no luck. After reading your article, I&#039;ll be smarter the next time around. It really has been quite some time since I read about the subject.

Also, good reading links. You&#039;ve been added to my rss reader :)</description>
		<content:encoded><![CDATA[<p>Excellent and very interesting read. I&#8217;ll be sure to try that out when I get the opportunity.</p>
<p>A few months ago I tried solving some small problem with what I remembered about GAs and had no luck. After reading your article, I&#8217;ll be smarter the next time around. It really has been quite some time since I read about the subject.</p>
<p>Also, good reading links. You&#8217;ve been added to my rss reader <img src='http://williamtozier.com/slurry/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tozier</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52319</link>
		<dc:creator>Tozier</dc:creator>
		<pubDate>Fri, 04 Apr 2008 00:36:54 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52319</guid>
		<description>Bah. Don&#039;t flatter me. And let&#039;s take the discussion to the &lt;a href=&quot;http://joechip.net/nudge&quot; rel=&quot;nofollow&quot;&gt;Nudge blog&lt;/a&gt;. Post future discussions there, anyway; the guys would all love to chat.

Seems to me you should talk with Guido and Katya about that question. I have a vague memory of her possibly doing some real experiments with ParetoGP size dynamics for her thesis. And I have vivid memories of Guido dismissing bloat out of hand, since it &quot;never happens if you just maintain model complexity as a distinct objective.&quot; I know they focus 99% on symbolic regression, so they&#039;re firmly in your theorists&#039; space of Koza-style trees.

But you know all that, of course, having spent time sitting and talking with Kotanchek, Smits and all. So I&#039;m not sure I follow your question. You asking whether it works?

I&#039;d imagine theory is almost trivial: The expected offspring size change after crossover must surely decrease if there&#039;s been selection pressure for smaller parents. Doesn&#039;t that make sense?

Further, as the population matures there are small, high-error parents and large, low-error parents in the population (look at any of Kotanchek&#039;s &amp; Smits&#039;s graphs, say in GPTP IV). Under those circumstances, I can&#039;t see how crossover can &lt;i&gt;ever&lt;/i&gt; exceed a certain steady-state limit. It&#039;s just a matter of the innate tendency of unequal crossover to grow bigger code trees, offset by the innate selection for smaller ones.

Or are you asking something else I&#039;m missing?

In any case, it&#039;s worth a look. You wanna do it? We&#039;ll help.</description>
		<content:encoded><![CDATA[<p>Bah. Don&#8217;t flatter me. And let&#8217;s take the discussion to the <a href="http://joechip.net/nudge" rel="nofollow">Nudge blog</a>. Post future discussions there, anyway; the guys would all love to chat.</p>
<p>Seems to me you should talk with Guido and Katya about that question. I have a vague memory of her possibly doing some real experiments with ParetoGP size dynamics for her thesis. And I have vivid memories of Guido dismissing bloat out of hand, since it &#8220;never happens if you just maintain model complexity as a distinct objective.&#8221; I know they focus 99% on symbolic regression, so they&#8217;re firmly in your theorists&#8217; space of Koza-style trees.</p>
<p>But you know all that, of course, having spent time sitting and talking with Kotanchek, Smits and all. So I&#8217;m not sure I follow your question. You asking whether it works?</p>
<p>I&#8217;d imagine the theory is almost trivial: the expected offspring size change after crossover must surely decrease if there&#8217;s been selection pressure for smaller parents. Doesn&#8217;t that make sense?</p>
<p>Further, as the population matures there are small, high-error parents and large, low-error parents (look at any of Kotanchek&#8217;s &#038; Smits&#8217;s graphs, say in GPTP IV). Under those circumstances, I can&#8217;t see how tree size under crossover can <i>ever</i> exceed a certain steady-state limit. It&#8217;s just a matter of the innate tendency of unequal crossover to grow bigger code trees, offset by the innate selection for smaller ones.</p>
<p>Or are you asking something else I&#8217;m missing?</p>
<p>In any case, it&#8217;s worth a look. You wanna do it? We&#8217;ll help.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nic McPhee</title>
		<link>http://williamtozier.com/slurry/2008/04/02/search-algorithms/comment-page-1#comment-52317</link>
		<dc:creator>Nic McPhee</dc:creator>
		<pubDate>Thu, 03 Apr 2008 15:00:50 +0000</pubDate>
		<guid isPermaLink="false">http://williamtozier.com/slurry/2008/04/02/search-algorithms#comment-52317</guid>
		<description>A really cool summary - one of the best blog posts I&#039;ve ever seen about GP, to be honest.

Some recent work by Stephen Dignum and Riccardo Poli shows that things like size limits actively speed up bloat - the average program sizes grow faster with limits; they just stop growing when they hit the limit.  Do you know if anyone&#039;s looked at the impact of the kind of Pareto approaches you&#039;re suggesting?  My guess is that they wouldn&#039;t have that effect (which would be cool), but no promises.</description>
		<content:encoded><![CDATA[<p>A really cool summary &#8211; one of the best blog posts I&#8217;ve ever seen about GP, to be honest.</p>
<p>Some recent work by Stephen Dignum and Riccardo Poli shows that things like size limits actively speed up bloat &#8211; the average program sizes grow faster with limits; they just stop growing when they hit the limit.  Do you know if anyone&#8217;s looked at the impact of the kind of Pareto approaches you&#8217;re suggesting?  My guess is that they wouldn&#8217;t have that effect (which would be cool), but no promises.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
