Archive for Uncategorized
April 24, 2009 at 11:18 am · Filed under Project, Tidbits of nanohistory, Uncategorized
From John Dewey’s Logic: The Theory of Inquiry, by way of John J. McDermott’s The Philosophy of John Dewey: The Structure of Experience, this summary of Dewey’s own chapter on the nature of inquiry.
In particular, this strikes me as something that bears on many discussions I’ve had about machine learning and modern statistics. And it reminds me of a cultural problem I’ve been wrestling with among genetic programming researchers and operations research people for some time. And it would be useful in explaining the pedagogy and practice of engineering “craftsmanship”, and more specifically that of software development.
Oh, and complex systems research and emergence, too. That’s in there, somehow.
So you can see why I might think it’s important to understand.
I can’t quite put my finger on it, but something in here—perhaps obfuscated by what today we might perceive as a difficult style, but which is an attempt to convey very specific concepts in a way that tries to avoid misunderstanding—is vital to many threads in modern life. In particular, something deeply important happens down in the last paragraph, where I’ve highlighted it.
I would love to have a correspondent who could discuss this productively. Perhaps one might be found to read the original Dewey, or even the few surrounding pages extracted in McDermott’s summary, and tell me just what it is I’m responding to?
…Inquiry is the directed or controlled transformation of an indeterminate situation into a determinately unified one. The transition is achieved by means of operations of two kinds which are in functional correspondence with each other. One kind of operations deals with ideational or conceptual subject-matter. This subject-matter stands for possible ways and ends of resolution. It anticipates a solution, and is marked off from fancy because, or, in so far as, it becomes operative in instigation and direction of new observations yielding new factual material. The other kind of operations is made up of activities involving the techniques and organs of observation. Since these operations are existential they modify the prior existential situation, bring into high relief conditions previously obscure, and relegate to the background other aspects that were at the outset conspicuous. The ground and criterion of the execution of this work of emphasis, selection and arrangement is to delimit the problem in such a way that existential material may be provided with which to test the ideas that represent possible modes of solution. Symbols, defining terms and propositions, are necessarily required in order to retain and carry forward both ideational and existential subject-matters in order that they may serve their proper functions in the control of inquiry. Otherwise the problem is taken to be closed and inquiry ceases.
One fundamentally important phase of the transformation of the situation which constitutes inquiry is central in the treatment of judgement and its functions. The transformation is existential and hence temporal. The pre-cognitive unsettled situation can be settled only by modification of its constituents. Experimental operations change existing conditions. Reasoning, as such, can provide means for effecting the change of conditions but by itself cannot effect it. Only execution of existential operations directed by an idea in which ratiocination terminates can bring about the re-ordering of environing conditions required to produce a settled and unified situation. Since this principle also applies to the meanings that are elaborated in science, the experimental production and re-arrangement of physical conditions involved in natural science is further evidence of the unity of the pattern of inquiry. The temporal quality of inquiry means, then, something quite other than that the process of inquiry takes time. It means that the objective subject-matter of inquiry undergoes temporal modification.
Terminological. Were it not that knowledge is related to inquiry as a product to the operations by which it is produced, no distinctions requiring special differentiating designations would exist. Material would merely be a matter of knowledge or of ignorance and error; that would be all that could be said. The content of any given proposition would have the values “true” and “false” as final and exclusive attributes. But if knowledge is related to inquiry as its warrantably assertible product, and if inquiry is progressive and temporal, then the material inquired into reveals distinctive properties which need to be designated by distinctive names. As undergoing inquiry, the material has a different logical import from that which it has as the outcome of inquiry. In its first capacity and status, it will be called by the general name subject-matter. When it is necessary to refer to subject-matter in the context of either observation or ideation, the name content will be used, and, particularly on account of its representative character, content of propositions.
The name objects will be reserved for subject-matter so far as it has been produced and ordered in settled form by means of inquiry; proleptically, objects are the objectives of inquiry. The apparent ambiguity of using “objects” for this purpose (since the word is regularly applied to things that are observed or thought of) is only apparent. For things exist as objects for us only as they have been previously determined as outcomes of inquiries. When used in carrying on new inquiries in new problematic situations, they are known as objects in virtue of prior inquiries which warrant their assertibility. In the new situation, they are means of attaining knowledge of something else. In the strict sense, they are part of the contents of inquiry as the word content was defined above. But retrospectively (that is, as products of prior determination in inquiry) they are objects.
[Latter emphasis is mine.]
February 27, 2009 at 2:26 pm · Filed under Uncategorized
We have a Citi MasterCard that was one of the (apparently) hundreds of thousands whose security was compromised in the recent Heartland Security Breach.
I’d heard the news about the breach, but the first sign I had that we were involved was when I tried to use the card for an online purchase. No email, no phone call, nothing from Citi regarding the problem. When the transaction failed three or four times I knew it wasn’t the vendor website’s fault, so I checked my Citi account online. There I saw a bright red warning that my account had been shut down because of risk of compromise.
When I called (this was back on February 20th or so, I think) to complain about the lack of notice, the customer service representative explained that Citi had no time or resources to notify all the cardholders, especially given the scale of the possible breach, but had rather acted to place all the possibly compromised accounts on hold as soon as they could. I was told they had issued new cards with new account numbers, at no charge to any of us, and that the new card would be here shortly.
Well, we got the new card, and we activated it and set up online access.
Interesting thing we discovered, which (aside from the general lack of coverage of the Heartland fiasco in the press and blogosphere) is why I’m bothering to write this: a strange charge we didn’t recognize, with code TOTAL SEC BALANCE TRANSFR-ITEMIZED. The amount charged ($99) was the same as the new charges that had accrued on the old account before the transfer, but “99” is one of those numbers that makes you wonder about intentional design. In any case, this clearly implied we had either been double-charged, or charged an extra and unauthorized $99 fee.
So I got back on the phone and called customer service just now, and spoke with Jim. He explained to me that TOTAL SEC BALANCE TRANSFR-ITEMIZED was a “system message”, which represented (as it seemed) the sum of items booked to the old closed account just before the new one was set up. He explained it was an “accounting quirk in their system”, and that it would disappear at the beginning of the next billing cycle. Merchants had authorized $99 worth of charges right before the account was closed and balances were transferred, and the mysterious line item indicated the transition from “authorization” to actual charge. Jim explained that generally this transition removes the authorization charge from the billing system, but because the account changed in the interim period, the charge accrued on the new account but the authorization couldn’t be removed from the old one (or something like that). He pointed out (very helpfully) that if my card had been misplaced or stolen, the same dynamics would have kicked in there, too, and the same sort of transactions would have happened.
This got me thinking. It may be ephemeral, a “quirk of the system”, but nonetheless on the books and until the authorization is cleared I owe an extra $99 to Citi. It’s mere coincidence of timing that our account came to $99. But it seems highly likely (given the several-days typical delay between authorization and charge in many merchants’ transactions) that any regular cardholder might have one or more transactions spanning a period like this.
So here we have hundreds of thousands, or millions of credit card accounts, all compromised and all synchronously being transferred to new accounts. What fraction of those had interrupted transactions spanning the synchronized transfer, resulting in these TOTAL SEC BALANCE TRANSFR-ITEMIZED “system messages”?
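To make the arithmetic concrete, here is a toy ledger in Python. It is only a sketch of the situation as Jim described it to me, not a model of how Citi's systems actually work: the `Account` class, its field names, and the idea that a naive roll-up sums both fields are all my own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical card account: holds (authorizations) and posted charges."""
    number: str
    open_authorizations: float = 0.0
    posted_charges: float = 0.0

    def apparent_balance(self) -> float:
        # Assumption: a naive report counts open holds and posted charges alike.
        return self.open_authorizations + self.posted_charges


old = Account("old-card")
new = Account("new-card")

# Merchants authorize $99 on the old card just before it is shut down.
old.open_authorizations += 99.00

# The card is reissued; the real charge then posts against the new account.
new.posted_charges += 99.00

# Normally posting a charge would release the matching hold. Here the hold
# sits on a closed account, so (in this toy model) nothing releases it.
naive_total = old.apparent_balance() + new.apparent_balance()
actually_owed = new.posted_charges
overstated_by = naive_total - actually_owed

if __name__ == "__main__":
    print(f"naive receivable: ${naive_total:.2f}")
    print(f"actually owed:    ${actually_owed:.2f}")
    print(f"overstated by:    ${overstated_by:.2f}")
```

Multiply the last number by however many accounts had a transaction in flight during the transfer, and you have the sum I am wondering about.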
The numbers are hard for me to even estimate with the information I have on hand (though Jim did allow it was “really a lot” of cards). Seems big.
The thing I have to wonder about is: just at this crucial juncture in the financial crisis, when the company is under the closest scrutiny in decades and the stock is suffering from massive loss of investor faith, Citi has double-booked a sizable Accounts Receivable sum.
And probably not just Citi….
February 2, 2009 at 11:40 am · Filed under Project, Uncategorized
Still trying to put my finger on something bothering me. Very subjective, no doubt ill-considered… but still there and not quite stated clearly enough.
This is something about business, project management, planning and implementation: About how a certain class of manager views the specification of goals, the sense that goals met create business value, and how those people deal with the real people whose work it is to connect the two idealizations (goal, value) to one another by applying their experience, insight, and ability to communicate.
In my experience, software developers are appropriate to the task; “computer programmers” cannot as a rule reliably deliver value from their work.
This is something about pedagogy, graduate training, the Academy and specialization: About how grant applications are written years before monies are acquired; how “real” academic projects are spelled out in grant applications as if foresight were perfect and exploration was rational, while the work is done by substitutable and inexperienced students and young faculty; how “homework” projects and evaluations are treated as if individual people can learn in a vacuum of reading and self-direction and wordy lecture, as if textbooks were helpful without conversation; as if the cost, utility, quality and duration of scholarship were all perfectly fungible with one another, perfectly liquid… subject to insignificant exchange costs not worthy of note.
In my experience, students learn when they work collectively on a shared goal, supporting one another, and in the process learn by discovering and sharing their nonoverlapping skills: when they “cheat”. “Stars” who cannot explain their work, who cannot collaborate, who disdain “cheating” (by the standards of most modern Honor Pledges and tenure review committees) by sitting quietly by themselves and doing what their massive insight has revealed is the path to what you (mere people) need… these folks cannot as a rule reliably deliver value from their work.
This is something about the theory and practice of artificial intelligence, operations research, machine learning, and metaheuristics: About the unwillingness or inability to treat techniques prescriptively except as a form of self-promotion of one’s own research or personal bias; about the strangely persistent shortfall in communicating the utility of those thousand variant methods from linear programming to fictitious play to genetic programming or graphical model learning, any one of which might potentially answer questions, identify patterns, and help people invent software or physical engineering designs; about a culture of “practitioners” who cannot be bothered to learn enough theory to explain why their approach is sufficient for their particular tasks, and a separate culture of “theorists” who cannot be bothered to learn enough of best practice to explain why their approach is necessary for any task.
In my experience, the average time an algorithm is expected to run may be of interest, but as far as my particular problem is concerned it has no bearing until I have run it for a while to see some results, see how it’s going, suss out what “kind” of problem this specific instance is—to see what value comes from “how long” it will take to run, as opposed to seeing any answer at all. I do work, I create stuff, to better understand the path from idealized goal to realized value. Things like speed, accuracy, ease of use and understandability, these are things I try to measure, not assume beforehand for some combination of problem and approach, and I want information with which to update my assessments as quickly and accurately as possible. Because for some strange reason I am unable to tell beforehand how difficult an interesting instance of a problem will be, even with the most familiar approach.
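A minimal sketch of what "measure, don't assume" might look like, in Python. Everything in it is hypothetical: `hill_climb`, `run_with_watcher`, the `patience` threshold and the toy objective are illustrative names of my own, not part of any real system. The point is only the shape: the search reports what it has found at every step, and the decision to stop lives outside the algorithm, where a person (or, here, a crude stand-in rule) can make it.

```python
import random


def hill_climb(score, start, step_size=0.5, seed=0):
    """Yield (iteration, best_x, best_score) forever; the caller decides when to stop."""
    rng = random.Random(seed)
    best_x, best = start, score(start)
    i = 0
    while True:
        i += 1
        candidate = best_x + rng.uniform(-step_size, step_size)
        s = score(candidate)
        if s < best:  # lower score is better in this sketch
            best_x, best = candidate, s
        yield i, best_x, best


def run_with_watcher(score, start, patience=50, max_iters=10_000):
    """Stand-in for a person watching the run: stop once progress stalls."""
    history = []
    stalled = 0
    last = None
    for i, x, s in hill_climb(score, start):
        history.append(s)
        stalled = stalled + 1 if s == last else 0
        last = s
        if stalled >= patience or i >= max_iters:
            break
    return x, s, history


if __name__ == "__main__":
    x, s, history = run_with_watcher(lambda v: (v - 3.0) ** 2, start=0.0)
    print(f"stopped after {len(history)} steps at x={x:.3f}, score={s:.5f}")
```

The `history` list is the useful part: it is the record of how this particular instance behaved, which is exactly what an average-case figure cannot give me.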
I have a great deal of both practical experience and theoretical backing in these matters, and all that has happened for me (your mileage may vary) is that I am more uncertain about my prejudices, and yours, all the time.
On average, doing something small, immediately, is better than talking a long time about the many things you could do, about potentialities and limits and average behavior. And perhaps better than doing “just anything” is considering the small set of simple incremental improvements, selecting the one that seems it will provide the most value for that scale of effort, and trying it.
In too many domains we conflate rationality with rigor, and treat the straightest path between them as a recipe for success. But isn’t “rationality” an intentionally bounded thought process? a strategy of fully dismissing alternatives as greedily and thoroughly as possible?
But I don’t want to spend my time with rigorous people. They’re fucking annoying, when you get right down to it. When I’m actually trying to solve a problem, I would prefer to collaborate with ten experienced people (some “theorists”, some “practitioners”) who can speak quickly, approximately, and explore oh so many alternatives. I want people who can use simple, stupid, non-optimal tools all of us poor fools can understand… but who in using those tools discover many paths by which we might collectively trace our way—any goddamned way as long as we arrive—from our immediate goal to our desired value.
Because value trumps method.
And value (as I’ve said) is something that may not be rationally predictable. Value comes along the way, it emerges. Value in so many cases is contingent on multiple scales of experience, long and short term, on constantly revised and discarded models, on alternative hypotheses easily exchanged. Achieving value depends on my tools, my inclination, my habit. On what I’ve done so far.
And all these change from person to person, from problem to problem. From moment to moment. In my experience, on a shorter scale than any—any—problem-solving method, whether it’s a business project, a thesis or grant, a single “simple” application of heuristic to instance.
Something deep is missing out there.
February 1, 2009 at 2:47 pm · Filed under Project, Uncategorized
I’m working on a presentation and a chapter for the forthcoming GPTP Workshop, and trying to capture something that’s bothered me for… well, as long as I’ve been writing computer simulations and doing algorithmic search and optimization, which is (jesus) like 3/4 of my life. And more so recently, when I went back to graduate school in Industrial & Operations Engineering, and was exposed to a suite of cultural norms I had only experienced indirectly when I was a biologist.
And I’m not sure how best to put my finger on it or sum it up, so let me just dump a little pile here to fester while I try to think more: A core myth of “modern” computer science and applied mathematics—a foundational one it seems—is that algorithms are autonomous and atomic.
And yes, this probably seems like a “yeah, so?” realization. But I sit here working on the Nudge system and designing it to be used interactively in exploratory settings (unlike, as far as I know, any other GP system). And I found myself rolling my eyes (again) at the senseless folderol a computer science graduate was saying about software development the other day at lunch, about how anything that “answers the question as fast as possible” is the best programming solution, QED. And so on.
I can’t think of a single example of a search, optimization, machine learning, neural net training, agent-based simulation, AMPL optimization or other programming project and “run”, in a 25 year span, where I didn’t watch what was happening, see a problem, stop the “run”, make changes, and re-start it. Not one. I’ve fiddled with training/test data breakdowns, seen symptoms of bugs and model deficiencies and statistical anomalies that led me to intervene, or seen slowness (or over-eagerness) to converge that led me to improve my code, or seen transient patterns that were more useful or interesting than the “real” program paid attention to.
Well, OK: Maybe I’m not a very good programmer. This is a thing I would agree with.
I note that I haven’t written a paper or even an email without revision. I haven’t had an earnest conversation on a technical topic without some minor argument and restatement and analysis. I haven’t willingly programmed in maybe a decade without unit tests and a dynamic notion of “requirements” and “goals”. And I haven’t been in a seminar without questioning the direction of the research, asking about tangents and parallel tracks and the roads untraveled.
It’s what people do.
Yet in AI research, and in not just the little byway that’s genetic programming but also that broader world of computer science and operations research and machine learning and datamining and so on, people still act as if analysis, modeling, design and programming were something utterly, distantly separate from execution of code. As if there were a “right” algorithm in a general case, as if faster was always better, as if it is not the job of an engineer to know anything about domain, or to adapt in any way to “externalities”.
As if you could specify a problem up front, spell out everything in a nice three-ring binder, and “hand” this specification to some plug-compatible mechanistic “solver” or “programmer” that was optimally fast and provably convergent and correct in the limit, and the lights would flash and the bell would go “ding” and a little punch card would poke out at you like the pert tongue of Athena herself with the answer.
This is a problem for me.
Quite literally, since I gladly walked away from my last Ph.D. program (which was an excellent one in its field) for essentially this core difference. There’s something wrong, and I increasingly believe dangerous, about… well, something I can’t quite name. Call it “hubris” or “cowboy culture” or “objectivism” if you really want to get nasty. That suite of traits that includes financial engineering’s unquestioning reliance on stupid “simplifying” assumptions; and computer science’s interest in algorithmic complexity at the expense of finding answers to questions; and almost all of operations research, where “be wise, linearize” is a mantra; and my own technical specialty of metaheuristics, where even today people hand me charts labeled “average performance vs. time” no matter how many times I reject their papers and yell at them in print because I have never cared about average performance.
There’s a stink of mind:body duality in there. A kind of biased religio-mathematization that imagines there is a best, an ideal, a way of delimiting an idealized set of problems that is better and more tractable and more elegant than any single instance.
Than the real world, for example.
And increasingly, I think Herb Simon is the antichrist because of it.
When I’m designing a genetic programming system, or a multiagent simulation, or a software development (not computer science) project, or a meeting or a story for that matter, I’m not looking for autonomy.
The basis of my interest in genetic programming (and machine learning and statistics more generally) is how it aids people. The C programming language, as far as I’m concerned, is not automatically “faster” than Python, because I count the time it takes to think and write and debug and understand a C program and a Python program. If the same algorithm will take ten times longer to code in C than Python, and may hide secret bugs behind stupid pointer errors or strange type handling, and which blocks my ability to use test-driven development and emergent software design… that’s worse, not better.
And that same shortcoming is true, I realize, about the way academics approach nonlinear programming and bioinformatics and swarm-based computing and stuff, too. Papers are written, projects undertaken, grant monies spent, and graduates pooped out into the workplace as if people who haven’t even met me could determine what I wanted in a given situation.
They piss me off like the worst marketers do, in other words. [Ironically, the most beloved of my academic friends never watch TV, and the most beloved of my marketing friends never pay attention to the math....]
Here: No matter what your professor tells you, people still have to analyze and model a problem; spend time typing C or Python or AMPL code somewhere; debug semicolons or memory management or matrix definitions or recursion stacks; spend hours staring at results trying to concoct rules from their intuitions for acceptability (or risk re-running their experiments tenfold with different parameters in an attempt to “get better results”).
I count the conversations, the lab meetings, the code review and unit test writing, the peer review and the conferences and the late nights spent working waiting to see—like Kekule—the devils dance in a circle before we understand benzene’s structure. I count how hard it is to talk about something, how long it takes to see a way of solving a problem, how hard it is to understand what you have in the end, to tell whether you’re “done” or not. And how hard it is to do it again, to re-use what you’ve learned. I count that as wall-clock time, as my own measure of “net computational complexity”.
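As a back-of-the-envelope illustration of that accounting, here is a small Python sketch. The function `net_hours` and every number in it are made up for the example; the only claim is the arithmetic, namely that a program which runs ten times slower can still cost less wall-clock time once the human hours are counted.

```python
def net_hours(coding: float, debugging: float, run_hours: float, reruns: int) -> float:
    """Total wall-clock hours: human time plus machine time over all reruns."""
    return coding + debugging + run_hours * reruns


# Hypothetical figures, chosen only to illustrate the trade-off.
c_total = net_hours(coding=40, debugging=20, run_hours=0.5, reruns=10)
python_total = net_hours(coding=4, debugging=2, run_hours=5.0, reruns=10)

if __name__ == "__main__":
    print(f"C:      {c_total} hours")
    print(f"Python: {python_total} hours")
```

With these invented inputs the C version comes to 65 hours and the Python version to 56, even though each Python run takes ten times as long. Different inputs will tip it the other way, which is rather the point: it is something to measure, not assume.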
I suppose my mental model is much more a kind of heuristic conversation, a partnership between mathematics, man and machine. Where software and mathematics are simply ways of framing special parts of a conversation.
Value does not automatically come with speed, or even with rigor. I do not value rigor in my conversations; I find it cloying. I prefer exploration (of ideas and errors) and exploitation (of good ideas and cliches) in balance, not just one or the other.
Why do you think I blame Herb? Hint: pragmatism. And if not Herb, who should I blame?
update: Part of why I want to blame Herb Simon comes from conversations with Michael Cohen, some years ago. See, for example, his “Reading Dewey: Reflections on the Study of Routine” in Organization Studies (2007) vol 28 pg 773.
January 19, 2009 at 11:59 am · Filed under Uncategorized

…, and finally American Magazine Journalists, 1741-1850 (Dictionary of Literary Biography) Volume 73, which pisses me off because it’s so goddamned expensive. Gale Research (now displaying your name in the skyline near my house), you are increasingly becoming an obstacle.
[He said, waving a fist at the sky, not realizing that Gale Research might well be a different animal from the Thomson Reuters sign he was indicating. And also failing to connect in any way his disappointment in finding how expensive the Dictionary of Literary Biography actually is to his failure in reading the one he had been hosting in his own home for several weeks.]
December 1, 2008 at 11:56 am · Filed under Uncategorized
Dear CRC Press,
I am writing to say that I am forced to interpret your recent postal offer of “deep discounts” on several dozen poorly-written, shoddily-manufactured, untimely and obsolete vanity press offerings in fields for the most part unrelated to my work or interests, said “deep discounts” reducing the price to a mere $99 per useless volume, as a telling symptom of some sort of neurodegenerative or psychopathological disorder.
In you, I mean.
I hope you take the time to see a doctor, and you may want to set your affairs in order.
My deepest condolences, in any event, to the authors of the poorly-written, shoddily-manufactured, untimely vanity press offerings you have been foisting for decades. They will inevitably be forced to seek actual editors among their peers and readers, and rework their peculiarly isolated notions in order to appeal to an actual audience, and I am saddened to observe that you have given them little or no cultural support in those skills through the years.
That said, if you have any more 1980s-vintage Handbook of Chemistry and Physics, I am interested in taking them off your hands. Ever since I graduated (with my first degree in the sciences), I have found they are excellent for pressing flowers and leaves for art & craft projects.
The latter offered by way of solace, if you were feeling completely useless. Buck up!
November 30, 2008 at 9:32 am · Filed under Uncategorized
Google Maps must have purchased a new suite of road information recently. Or maybe they algorithmically tried to “improve” the dataset they had. Used to be it knew local geography pretty well; now, not so much.
When asking for directions from our farm (on Walsh Road, Webster Township) to the Dairy Queen in Hamburg Township, the driving (not walking!) algorithm suggests we stay on northbound Scully. If you saw it from a distance, the satellite image would lead you to believe that, yes.

Except that many years’ fierce argument at the county border has left a nasty but potent gate blocking the road, which will persist into the foreseeable future.

If you were to drive up the rough, mainly untended Scully Road on a snowy day, trying to get (say) to a hospital in Pinckney or something, the least that would happen is you’d waste a half-hour trying to back out of the last few hundred yards without ending up in a ditch… once you arrived at the impassable gate at the border, and well after you had trespassed on a private road at the end.
The De Lorme Michigan Atlas & Gazetteer, a nice old printed book I keep in my car, and which is so obsolete that it shows little red lines for roads of all sizes and characters, manages to catch the gap.
Now every dataset contains errors or missing information. But every time that dataset is used to make a single, summary statement, based on a single model? Badness can happen in unexpected ways. In fact, I am obliged to be curmudgeonly about it because of my professional experience in these matters: it is always wrong to present a single answer for any multi-objective or highly constrained decision-making problem. Big, fat period.
I can’t complain, in all honesty, about advice given by a black box operations-research algorithm that on inspection I knew was incorrect. You get what you pay for. But I can complain about cultivating a misleading user experience in a ubiquitous data-driven decision support system that presents only one solution at a time to the decision-maker. Hell, every iPhone in the world has one of these on it; they’re all wrong, too.
No, I don’t think I am feeling lucky, Google. And you didn’t even ask.
I want to see a sheaf of routes. The little “adjust the route and recalculate a new one using my milestones” handles Google introduced a few years back are a beautiful thing, a cunning artifact and a useful tool! And of course, the standard “avoiding highway” or “fastest” toggles let me reach in and fiddle with the search method. But only indirectly.
I want the objectives right there, not combined. I want not just to surface the meter (to use a phrase Dan Cooney’s taught me), but surface all of them. I want choices coupled to clearly differentiable supporting arguments.
Like the basic Google Search results themselves: ten routes at a time, ranked somehow. Or not even ranked, but handed to me as a Pareto-equivalent set of alternatives, some faster, some bumpier, some with bigger roads, some with more gas stations, some more scenic. Heck, maybe I just want to know there are at least ten ways to go back and forth, so I can stage a race, or not get bored on my commute, or defend against unwanted SUV invasion by a foreign county or something.
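For the record, handing back a Pareto-equivalent set is not hard. Here is a minimal Python sketch, with the loud caveat that the routes, their scores and the three objectives are invented for illustration, and this is in no way how Google (or anyone) actually computes directions. A route stays in the set unless some other route is at least as good on every objective and strictly better on at least one.

```python
from typing import List, NamedTuple


class Route(NamedTuple):
    name: str
    minutes: float    # travel time, lower is better
    roughness: float  # bumpiness score, lower is better
    scenery: float    # scenic score, higher is better


def costs(r: Route) -> tuple:
    """Put every objective in 'lower is better' form."""
    return (r.minutes, r.roughness, -r.scenery)


def dominates(a: Route, b: Route) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    ca, cb = costs(a), costs(b)
    return all(x <= y for x, y in zip(ca, cb)) and ca != cb


def pareto_set(routes: List[Route]) -> List[Route]:
    """Keep every route that no other route dominates."""
    return [r for r in routes if not any(dominates(o, r) for o in routes)]


# Hypothetical routes and scores.
ROUTES = [
    Route("highway", 25, 1.0, 1),
    Route("back roads", 32, 3.0, 6),
    Route("river road", 40, 2.0, 9),
    Route("gravel shortcut", 22, 8.0, 4),
    Route("detour", 45, 3.5, 5),
]

if __name__ == "__main__":
    for r in pareto_set(ROUTES):
        print(r)
```

In this made-up example only "detour" drops out (the back roads beat it on all three counts). The other four each win on something, so all four get shown, and I get to choose.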
At least sometimes. Stop assuming I’m feeling lucky.
Next time, we can bitch about the misleading user experience and illusory authority created by the fuckin’ weather forecast format. Everybody complains about the weather forecast, but nobody does anything about it.
Lather, rinse, repeat.
November 3, 2008 at 11:31 am · Filed under Uncategorized
Barbara bought me two inexpensive packs of simple black socks the other day at the local department chain. Name brand. You’d recognize them.
Both came in zip-lock bags. Six pairs of socks sold in a bag with a sliding reclosure mechanism.
In both cases the package labels called out the fact of the “RESEALABLE BAG”. So clearly it’s important.
However, in both cases the bags were permanently sealed by fusing, above the line of the resealer. Without perforations, there was no way to open the bags the first time without damaging the resealer mechanism.
So: point?
I’m not the first person to have asked the question. But no clear answer is forthcoming.
October 5, 2008 at 7:30 pm · Filed under Uncategorized
What would happen if a government (not this one, surely, but some US government) declared all credit default swap contracts null and void?
Just killed them all. No more payments, no more indemnification.
What would happen?
September 23, 2008 at 5:55 pm · Filed under Uncategorized
I’m still reading Woody Holton’s Unruly Americans, and I’m struck by the incessant similarities between Recent Events and those in the period immediately after the Revolutionary War in the US.
More detail to follow, but without being pessimistic about it at all, I find myself returning to a sequence of questions that has popped into my Highly Paid Futurist’s head in each of the last five socioeconomic shocks:
- What are the chances that a Constitutional Convention will be convened in the next decade?
- Under what circumstances will state or regional secession be considered by one or more US states?
- How many “nations” (scare-quoted on purpose, since I’m suspicious the term may come to mean something new) will there be in North America in 12 years?
As I said, I’m not too worried about these things. But the increasing scale of these shocks, ranging from the Oil Crisis, to the Savings & Loan debacle, to the LTCM bailout, and now this… the HPF in me keeps wanting to use the word disintermediation.
And I don’t mean in the context of a business plan.
August 31, 2008 at 8:07 am · Filed under Uncategorized
Oh, you two know each other already?
[Happy Labor Day, people. Especially those of you in the Academy, who don't imagine those two fellows are involved in your lofty endeavor.]