Notional Slurry

Archive for April, 2009

links for 2009-04-24

Dewey’s “Pattern of Inquiry”: money shot

From John Dewey’s Logic: The Theory of Inquiry, by way of John J. McDermott’s The Philosophy of John Dewey: The Structure of Experience, this summary of Dewey’s own chapter on the nature of inquiry.

In particular, this strikes me as something that bears on many discussions I’ve had about machine learning and modern statistics. And it reminds me of a cultural problem I’ve been wrestling with among genetic programming researchers and operations research people for some time. And it would be useful in explaining the pedagogy and practice of engineering “craftsmanship”, and more specifically that of software development.

Oh, and complex systems research and emergence, too. That’s in there, somehow.

So you can see why I might think it’s important to understand.

I can’t quite put my finger on it, but something in here—perhaps obfuscated by what today we might perceive as a difficult style, but which is an attempt to convey very specific concepts in a way that tries to avoid misunderstanding—is vital to many threads in modern life. In particular, something deeply important happens down in the last paragraph, where I’ve highlighted it.

I would love to have a correspondent who could discuss this productively. Perhaps one might be found to read the original Dewey, or even the few surrounding pages extracted in McDermott’s summary, and tell me just what it is I’m responding to?

…Inquiry is the directed or controlled transformation of an indeterminate situation into a determinately unified one. The transition is achieved by means of operations of two kinds which are in functional correspondence with each other. One kind of operations deals with ideational or conceptual subject-matter. This subject-matter stands for possible ways and ends of resolution. It anticipates a solution, and is marked off from fancy because, or, in so far as, it becomes operative in instigation and direction of new observations yielding new factual material. The other kind of operations is made up of activities involving the techniques and organs of observation. Since these operations are existential they modify the prior existential situation, bring into high relief conditions previously obscure, and relegate to the background other aspects that were at the outset conspicuous. The ground and criterion of the execution of this work of emphasis, selection and arrangement is to delimit the problem in such a way that existential material may be provided with which to test the ideas that represent possible modes of solution. Symbols, defining terms and propositions, are necessarily required in order to retain and carry forward both ideational and existential subject-matters in order that they may serve their proper functions in the control of inquiry. Otherwise the problem is taken to be closed and inquiry ceases.

One fundamentally important phase of the transformation of the situation which constitutes inquiry is central in the treatment of judgement and its functions. The transformation is existential and hence temporal. The pre-cognitive unsettled situation can be settled only by modification of its constituents. Experimental operations change existing conditions. Reasoning, as such, can provide means for effecting the change of conditions but by itself cannot effect it. Only execution of existential operations directed by an idea in which ratiocination terminates can bring about the re-ordering of environing conditions required to produce a settled and unified situation. Since this principle also applies to the meanings that are elaborated in science, the experimental production and re-arrangement of physical conditions involved in natural science is further evidence of the unity of the pattern of inquiry. The temporal quality of inquiry means, then, something quite other than that the process of inquiry takes time. It means that the objective subject-matter of inquiry undergoes temporal modification.

Terminological. Were it not that knowledge is related to inquiry as a product to the operations by which it is produced, no distinctions requiring special differentiating designations would exist. Material would merely be a matter of knowledge or of ignorance and error; that would be all that could be said. The content of any given proposition would have the values “true” and “false” as final and exclusive attributes. But if knowledge is related to inquiry as its warrantably assertible product, and if inquiry is progressive and temporal, then the material inquired into reveals distinctive properties which need to be designated by distinctive names. As undergoing inquiry, the material has a different logical import from that which it has as the outcome of inquiry. In its first capacity and status, it will be called by the general name subject-matter. When it is necessary to refer to subject-matter in the context of either observation or ideation, the name content will be used, and, particularly on account of its representative character, content of propositions.

The name objects will be reserved for subject-matter so far as it has been produced and ordered in settled form by means of inquiry; proleptically, objects are the objectives of inquiry. The apparent ambiguity of using “objects” for this purpose (since the word is regularly applied to things that are observed or thought of) is only apparent. For things exist as objects for us only as they have been previously determined as outcomes of inquiries. When used in carrying on new inquiries in new problematic situations, they are known as objects in virtue of prior inquiries which warrant their assertibility. In the new situation, they are means of attaining knowledge of something else. In the strict sense, they are part of the contents of inquiry as the word content was defined above. But retrospectively (that is, as products of prior determination in inquiry) they are objects.

[Latter emphasis is mine.]

links for 2009-04-23

links for 2009-04-22

links for 2009-04-21

  • "Based on an article originally published in Technicalities (v. 25, no. 5, Sept./Oct. 2003), this pamphlet provides a brief overview of the Functional Requirements for Bibliographic Records (FRBR) as developed by the International Federation of Library Associations (IFLA). Using full-color graphics, What is FRBR? outlines the background of the development of the Functional Requirements, the concepts involved and their potential impact on cataloging rules, bibliographic structures and systems design for cataloging applications."
  • "In the short run, the Google Book Search settlement will unquestionably bring about greater access to books collected by major research libraries over the years. But it is very worrisome that this agreement, which was negotiated in secret by Google and a few lawyers working for the Authors Guild and AAP (who will, by the way, get up to $45.5 million in fees for their work on the settlement—more than all of the authors combined!), will create two complementary monopolies with exclusive rights over a research corpus of this magnitude. Monopolies are prone to engage in many abuses.
    The Book Search agreement is not really a settlement of a dispute over whether scanning books to index them is fair use. It is a major restructuring of the book industry’s future without meaningful government oversight. The market for digitized orphan books could be competitive, but will not be if this settlement is approved as is."

links for 2009-04-20

links for 2009-04-19

links for 2009-04-16

  • "I’ve heard a variation on Benton’s phrase “good people get good jobs” in a number of venues, some from clueless, privileged wankers who said it in earnest, sometimes from professors who said that the phrase was the extent of the advice they got from their grad school advisors shortly before being turned out to the wolves of the mid-70s job market. Don’t believe it. Sometimes good people get bad jobs or no jobs at all; sometimes terrible people get great jobs. Not only is there a shortage of jobs, the search process is totally capricious and inscrutable."

If I titled this the way I wanted, it might crash my server

A bit more on rspec and cucumber, since last night, while wrangling with the directory structure in a project we’re doing, I finally put my finger right on what was bugging me.

As I think I said, cucumber is for people, an almost-natural-seeming domain-specific language parser that smooths the conversation between a non-programmer customer and a programmer implementing particular features. Better yet (and safer, in my case) it can help frame the notion of utility, so that a programmer writing code which may be peripheral or unrelated to actual features can discover the actual value to a customer… or get rid of the chaff.

I tend to be wordy, see. Tend to add features glibly, or type in extra methods without being called to do so. So for me, with my own special needs as such an unusual creature as a programmer who doesn’t pay attention (or as a customer who doesn’t notice how many features are being demanded), I think of cucumber and rspec as my affordances. Keep me from falling down as much as I am wont to do.

Cucumber provides an elegant interface between those two modes of thinking: between the linguistic description of a feature and the technical definition of the corresponding behavior. Not just by exercising all the cool and elegant code in its arsenal, but also in modifying the typical practice of conversation between a programmer and a customer: because it demands a degree of linguistic precision that is lacking in many conversations between these worlds. You “get” to use “plain language” to describe your features. But success requires thoughtful wording.

For example, this is the cucumber code I’m using to manage a feature in an image-processing system I’m playing with. I use it to capture the over-arching “meaning” of this particular feature, and also to state clearly when I’ll accept this code as being “done”:

Feature: Process an image in memory
  In order to collect training data
  As a statistical artist
  I want to reduce example files to database entries

  Scenario Outline: base-level block scanning
    Given an image in memory with <height> x <width> pixels
    And a maximum block depth of <maxDepth>
    When I ask for all blocks
    Then I should a list of all the <total> blocks I expect

  Scenarios: low-level blocks
    | height | width | maxDepth | total |
    | 1      | 1     | 1        | 1     |
    | 10     | 4     | 2        | 67    | # 40+27
    | 8      | 9     | 4        | 200   | # 72+56+42+30
    | 1      | 10    | 3        | 10    | # 10 (can't make higher depth than 1x1)

These four scenarios I’ve described are exercising a bunch of edge cases and standard behavior, and when they’re implemented in functional code they will imply all sorts of unmentioned things about exception testing and error checking and all kinds of libraries, and in that code I’ll have to pass all kinds of messages here and there. And note also that they don’t specify what a lot of the words mean, nor specify the names of functions (there’s no explicit mention of the method Scanner#allBlocks, for instance).

They may look like a flat text file, but in fact they’re very formal bases of very particular chunks of executable code. They say what I want, and when the cucumber story-runner turns them all green, I’ll feel this feature has been finished.
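
For what it's worth, the totals in that scenario table can be checked with a few lines of Ruby. This is only a minimal sketch, assuming (as the inline comments like 40+27 suggest) that a block of depth d is a d x d square of pixels and that every position where such a square fits gets counted. The name expected_block_count is made up for illustration; it is not part of the project's Scanner class.

```ruby
# Hypothetical helper, not the project's actual Scanner code.
# Assumes a block of depth d is a d x d square, and counts every
# position where such a square fits inside the image.
def expected_block_count(height, width, max_depth)
  (1..max_depth).sum do |depth|
    rows = [height - depth + 1, 0].max  # vertical positions at this depth
    cols = [width - depth + 1, 0].max   # horizontal positions at this depth
    rows * cols
  end
end

# The four rows of the scenario table: [height, width, maxDepth, total]
table = [[1, 1, 1, 1], [10, 4, 2, 67], [8, 9, 4, 200], [1, 10, 3, 10]]
table.each do |h, w, d, total|
  puts "#{h} x #{w}, depth #{d}: #{expected_block_count(h, w, d)} (table says #{total})"
end
```

Under that assumption the last row comes out at 10 because a one-pixel-high image has no room for any block deeper than 1 x 1.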

And elsewhere in my project code, in other files for “me, the programmer”, there are definitions that map these formal lines into runnable code:

Given /^an image in memory with (\d+) x (\d+) pixels$/ do |height, width|
  ... [code that sets up the infrastructure for the acceptance test]
end
Given /^a maximum block depth of (\d+)$/ do |maxDepth|
  ... [code that sets up the infrastructure for the acceptance test]
end
When /^I ask for all blocks$/ do
  ... [code that creates the test output]
end
Then /^I should a list of all the (\d+) blocks I expect$/ do |total|
  ... [code that compares the test output to the expected result]
end
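
The mechanism behind those definitions is less magical than it looks. Here is a toy version in plain Ruby, assuming nothing about cucumber's real internals: each definition stores a regular expression alongside a block, and running a line of feature text means finding the pattern that matches and handing its captures to the block. STEPS, WORLD, define_step and run_step are names made up for this sketch.

```ruby
# A toy step registry; a sketch of the idea, not cucumber's implementation.
STEPS = {}
WORLD = {}  # shared state between steps (a stand-in, not cucumber's World)

# Remember a block of code under the pattern that should trigger it.
def define_step(pattern, &body)
  STEPS[pattern] = body
end

# Find the first pattern matching the text and call its block with the captures.
def run_step(text)
  STEPS.each do |pattern, body|
    match = pattern.match(text)
    return body.call(*match.captures) if match
  end
  raise ArgumentError, "no step definition matches: #{text}"
end

define_step(/^an image in memory with (\d+) x (\d+) pixels$/) do |height, width|
  # Captures arrive as strings, so convert them before use.
  WORLD[:height] = height.to_i
  WORLD[:width]  = width.to_i
end

define_step(/^a maximum block depth of (\d+)$/) do |max_depth|
  WORLD[:max_depth] = max_depth.to_i
end

run_step("an image in memory with 10 x 4 pixels")
run_step("a maximum block depth of 2")
p WORLD
```

The point of the sketch is that the "plain language" line and the regular expression have to agree character for character, which is where the demand for precise wording comes from.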

With all these plain-text strings floating around, it may seem as if that's a bunch of wobbly, ill-defined vagueness in action. But it turns out that the very specific framing of detailed and focused concepts is critical for making the jump from notions to code, or (if by some chance you're doing things ass-backwards) for taking already written code and capturing the pure function that it implements. It focuses attention on the boundaries between "system under [integration] test" and the rest of the typing you're doing.

Indeed, now I think about it, a lot of the negative comments I've heard from programmers who don't like the style of rspec or cucumber (or test-driven development, for that matter) story-writing focus on how wishy-washy "plain words" are. But when you push them, it turns out the problem really is about the painful rigor one feels when forced to map what somebody actually wants to what one is actually typing.

Being meaningful is hard work.

Now the code I show above, that's from cucumber files, and cucumber is for functional, integration, acceptance testing. That junction between what somebody actually wants, and what is actually done. Down inside the development process itself, where those little chunks of function are transformed into classes and methods and calls and results, is where rspec comes into play for me, as a unit testing framework. And what I was saying the other day, about the difference (which some consider spurious) between traditional assertion-based unit testing and specification-based unit testing? That's a matter of semantics. (As evidenced by, for example, shoulda, the unit testing extension for Rails that lets you frame your standard unit tests in spec-like language.)

See, for me the preference for specification-driven as opposed to assertion-driven unit testing is not really about being wordy, and wanting to always use a paragraph when a line of elegant one-character APL code will do. Though I am. Wordy.

No, it's really about more comfortably capturing the little intuitions one gets, when building some complex object. Because when I'm programming I constantly feel, or I hear, or I know, "This should be over there. This shouldn't happen this way. This should never come up again." And whether I frame my actions as unit tests or as specifications, creating new ones should be simple and communicative, so I know when I'm done, and so I can pose and solve each little concern as quickly and with as much focused attention as possible.

How one states these little problems is, at its essence, how I (as an amateur) see the difference between behavior-driven and test-driven work: that behavior-driven work more clearly surfaces teleology.

For example, here's an extract of rspec code from that same project, where I am implementing code that will (eventually) get those cucumber feature description steps to work:

module Scanning
  describe TrainingImage do
    describe "when initializing" do
      it "should contain a new image with the given width and height" do
        newImage = TrainingImage.new(10,10)
        newImage.width.should == 10
      end

      it "shouldn't complain when given no parameters" do
        lambda {TrainingImage.new}.should_not raise_error(ArgumentError)
      end

      it "should have a default backgroundColor of white" do
        newImage = TrainingImage.new
        newImage.bgColor.should == '#FFFFFF'
      end
    end
  ... [lots more]

Each one of those "should" phrases is a little increment of work I consciously set up---or that I discovered along the way---as I worked on this complex object.

As you can see, this rspec code is more... code-y. But it still has those framing statements of purpose staring you in the eye, the describe and it "should..." blocks. And when you run the specs, you get a list of passing and failing and pending phrases, not function names or some other kind of code that would be another step removed from explicit description.

And, yes, somewhere else in the project hierarchy---frankly it's pretty much buried---is the actual code that makes it work. It's a little, tiny fraction of the bytes in a project. But that's OK. Because (if I do my jobs well enough), every distilled character of that code is simpler, easier to understand, tested, described, and tied directly to the functionality I actually need to produce business value.
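
To give a sense of scale, here is roughly the smallest class that would satisfy the three specs shown above. It is a guess for illustration, not the project's real TrainingImage: the default size of 1 x 1 and the height reader are assumptions, since those specs only pin down width, bgColor, and the fact that new can be called with either zero or two arguments.

```ruby
module Scanning
  # Hypothetical minimal implementation; the real class is not shown in the post.
  class TrainingImage
    attr_reader :width, :height, :bgColor

    # The defaults are assumed: the specs only require that calling new
    # with no arguments does not raise, and that bgColor starts out white.
    def initialize(width = 1, height = 1, bgColor = '#FFFFFF')
      @width = width
      @height = height
      @bgColor = bgColor
    end
  end
end

image = Scanning::TrainingImage.new(10, 10)
puts image.width
puts Scanning::TrainingImage.new.bgColor
```

A dozen lines of class against three named expectations is about the ratio described here.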

Now, as I mentioned above, last night I finally put my finger on what it was that was bothering---paining---me the other day about both cucumber and rspec, about my experiences in using them, and about introducing these techniques to other people. And it's these intrusions:

require File.join(File.dirname(__FILE__), "/../spec_helper")

and

$: << File.join(File.dirname(__FILE__), "/../lib")

Ouch. No. Please make it stop.

These little nuggets of dense, inscrutable Ruby are the links between the many files in a standard cucumber- or rspec-driven project directory. They crop up at the top of files now and then (and often with no apparent cause in the manuals and tutorials past "this has to be here for it to work"). Some files need them, some files don't. To understand which files need these lines at the top, and where the links need to point, and what the links have to specify and the syntax to use... you need to understand what the cucumber and rspec libraries are actually doing.
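
For anyone else staring at those two lines, the pieces are at least individually simple: $: is Ruby's built-in alias for $LOAD_PATH, the list of directories that require searches, and __FILE__ is the path of the file currently being read. A more spelled-out sketch of the second line might look like this, with the variable names here and lib_dir chosen purely for illustration:

```ruby
# $: and $LOAD_PATH are two names for the same array of directories
# that `require` searches.
here    = File.dirname(__FILE__)                          # directory holding this file
lib_dir = File.expand_path(File.join(here, "..", "lib"))  # the sibling lib/ directory

# Same idea as `$: << ...`, but with a normalized path and no duplicates.
$LOAD_PATH << lib_dir unless $LOAD_PATH.include?(lib_dir)

puts $LOAD_PATH.last
```

Whether a given file needs such a line depends on whether something loaded earlier has already put that directory on the load path, which is why some files seem to need it and others don't.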

You need to figure out what it is the algorithms want.

Now, if you're familiar with Rails programming, you'll recognize the pattern because it also appears in most Rails projects, what with the deep many-branched magical directory structure in that framework. And there, as here, they are a frickin' pain in the butt.

Why am I railing against such a little inconvenience of some nasty perlish Ruby code? Well, think about it: In the midst of these two rather elegant domain-specific systems (and for that matter in the midst of magickal mystical Rails, of all places), where it almost seems like natural language parsing is happening right there before your eyes, and which exist only to surface the link between desire and implementation as comfortably as possible, you have to say some shit like "$: <<"? This is not peanut butter on my chocolate bar, this cuts the eye.

OK, so work with me, here. One has already created a suite of pleasant, linguistically-smooth function names like Given and When and should_not and should have. And the mystical libraries that interpret them are stuffed down inside some gems so real people need never bother themselves with the ugly stuff they actually do. Yay!

But before you can use those elegant libraries, you need to say these magic phrases? This is what, some kind of reminder of our mortality, some kind of intentional flaw in the symmetry of a beautiful Persian rug?

Bah. As an amateur programmer myself, who has gone out on a limb so far as to insist my beloved wife pair program with me as I learn to use these frameworks better and remind myself of Ruby code after many months in Python land... let me tell you that the last frickin' thing I want is for the first line of the code that makes the wonderful thing I'm touting actually run to be something I have to look up every character in the manual to understand, and I still can't figure out which files have to have it and why.

Now, maybe I don't understand enough about the structure of these libraries. I'm not sure it's possible to use the same kind of parsing language they already use---you know, the ones they already have baked-in to take clauses of text and transform them into object instances? Whether maybe those could be brought to bear to create functions like Collect all_specifications.from "/../lib", or maybe FirstLoad "spec_helper.rb"?

I'm not really a very good programmer, as it happens, and I suspect doing anything to write these myself is way past my abilities. But I have this weird feeling it's feasible to add these little affordances for stupid people like me. And that maybe, since they crop up all over in big multi-file projects, they might actually be useful.
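
Just to show the shape of the thing, here is a hedged sketch of what such affordances could look like in plain Ruby. Nothing here exists in cucumber or rspec: CollectFrom and FirstLoad are invented names (a simplification of the phrases imagined above), calling_dir is an invented helper, and the optional second argument, the directory to resolve paths against, is an assumption added so the helpers can be tried outside a real project.

```ruby
# Invented helpers; neither cucumber nor rspec ships anything like this.
# Capitalized method names are legal Ruby, the same trick that lets
# Given and When read like keywords.

# Directory of the file at the given call-stack location.
def calling_dir(location)
  File.dirname(File.expand_path(location.path))
end

# Add a directory, given relative to the calling file, to the load path.
def CollectFrom(relative_dir, base = nil)
  base ||= calling_dir(caller_locations(1, 1).first)
  # Drop leading slashes so "/../lib" is treated as relative, not absolute.
  dir = File.expand_path(relative_dir.sub(%r{\A/+}, ""), base)
  $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)
  dir
end

# Require a file, given relative to the calling file.
def FirstLoad(relative_file, base = nil)
  base ||= calling_dir(caller_locations(1, 1).first)
  require File.expand_path(relative_file.sub(%r{\A/+}, ""), base)
end

# Intended use at the top of a spec file (commented out here because
# the files would have to exist):
#   CollectFrom "/../lib"
#   FirstLoad "/../spec_helper"
```

This only hides the ugly lines behind friendlier names; it does nothing about the harder question of which files need them.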

links for 2009-04-15

links for 2009-04-14
