Digitization access licensing and scholarship’s “best before” date

Ralph Luker at History News Network points out:

Last week, I noted the agreement between the National Archives and Footnote.com to digitize millions of documents from the Archives and make them available to researchers on the net. Dan Cohen takes a closer look at the agreement and compares it to the agreement between the Smithsonian and Showtime. “From now until 2012 it will cost you $100 a year, or even more offensively, $1.99 a page,” Cohen points out, “for online access to critical historical documents such as the Papers of the Continental Congress.”

I’m reminded of what was described to me as the “incredibly liberal” license and usage agreements of EEBO-TCP, in which the participating Universities have “allowed” private for-profit online media companies to digitize their microfilm records and serve them back to their faculty and students for a fee. But that fee only lasts a couple of decades, as I recall, expiring around 2010. After that, those digitized documents from the 14th through 18th Centuries will become free.

Well, free to registered Library users.

Of course, the microfilm that’s being digitized was itself created from the physical library books in the 1970s and 80s, often by the same companies, and back then access to the physical images was sold back to the faculty and students and other libraries for a hefty fee as well. I can still buy a microfilm, a grainy, over-developed lith print of something I want to read, for $120 or so.

Nice work if you can get it. Almost as smooth a business plan as forcing Ph.D. students to publish their theses with a particular company, and then forcing them to buy them back, and pay for long-term storage…

I gripe. I honestly have no major problem with the business model. After all, I can still (thank the founders of actually-expiring copyright law, and damn you Sonny Bono) go to the library and check out the original book and scan it and republish an open-access, fully public-domain, proofread version of the text, if it comes to that.

What I want to know, though, is: what makes delayed access seem like a fair trade?

Sure, I see that it “costs lots of money” to digitize millions of books. I can follow the logic (much as I despise the sentiments) that makes it seem reasonable to encumber access to public-domain works owned by public institutions and bought with public funds with obstructive private license agreements and heavy fees, in order to repay that effort and offer up a bit of profit for the digitizing/microfilming firm. An amazing success of compromise, conflict of interest and greed over public stewardship, sure… but not stupid.

But why should it expire? Recall we’re not talking about a copyright protection here: the original works are generally in the public domain, and physically owned by public institutions, and so any reproduction of the work must also be in the public domain, regardless of medium. You can’t re-copyright a 16th-century book, just because you made a microfilm or electronic version of the page images.

We’re talking about contractual license agreements. The microfilm and digitized page images are encumbered by an agreement covering terms for access. The re-re-re-publishers have access, they charge you for access, and you agree not to let anybody else have access.

There is no natural termination of license agreements. They could charge forever, like they do with microfilm, to let you see their pictures of your stuff. And they could forever encumber that access with whatever terms they want.

What does 2012 have to do with it?

And of course it’s not just manuscripts and Very Early Old Stuff for Historians that have this sort of ephemeral encumbrance. The same thing has happened with certain Northwestern European Publishers’ back-catalogs of modern technical journals. You can search and read and see full text for all sorts of research published in the 1980s and before. These have “opened up” under vocal pressure from librarians and scholars, who are busy, and pressed for time and resources, and pissed… and now armed with their own printing and distribution network.[1] So now, if you work at a University that pays big fees, you can look at many published works that are old enough, but not yet out of copyright. Again, a decade-or-two delay.

What’s ten or fifteen years, in the life of the mind?

Unlike many of my peers, I’ve read both old science and technical research papers from the 70s and 80s and 90s, and also public domain books and manuscripts from way earlier. And I can vouch for the quality: It’s not spoiled yet. There’s good stuff in there, people.

But nonetheless, the deal with all the re-re-re-publishers is: pay lots of money now for licensed access, or sit and twiddle your thumbs for 20 years for open access. There’s a tacit trade-off, somehow, and clearly that’s a concession. How can this few years’ delay for openness balance the huge profits being made on license agreements? Why allow any open access at all?

It costs money to maintain burdensome license-protecting infrastructure. So surely it benefits the companies to let it all go when the costs outweigh the revenue generated. It’s interesting, though, to see how quickly the revenue-generating value disappears. Ten, maybe 20 years. Why?

I can make some stories up. We might explain the lack of interest among scholars in older works by some combination of these handy myths and tropes:

  • Dust in the Wind: Beyond a certain point, every article published in an old journal becomes obsolete. Every one of them has been read to death, and cited appropriately for its intrinsic value, and has thus had its fair effect and impact on the state of modern scholarship. Nothing left to follow up on, except for a certain type of navel-gazing meta-scholar. And besides, they will surely tell us what they think they’ve seen.
  • Computer Revolution Now!: Even historians and Early Modernists have laptops! With Word! With PowerPoint! Sometimes with e-mail! Unstoppable wave of the future, computer-aided scholarship. Stop all this pedantic nattering about inconsequential individual fiddly things like individual articles or paragraphs or poems or authors; it’s corpora or nothing, these days, or you can’t get the Big Picture [Of the Fraction of Stuff in the Online Index]. Summarize a swathe, graph it, provide a sweeping 50,000-foot view. Look: are those ants down there?
  • Stroke your Betters: You’re an up-and-coming academic wannabe. You need the people you’re begging for a “real”, “tenured” life to see you’re listening to their every word. Cite them; forget their old enemies and mentors from the 70s and 80s. The last thing they want to be reminded of is the stuff they missed (ignored?) when they wrote their theses. That’s tantamount to calling them lazy. Look to their most recent bibliographies, and do that selectively to point out the insights they’ve provided you.
  • Facts are Cold and Hard: Libraries and physical books are slow, and far away, and the fluorescent lights just hum and suck the life out of a carrel, and ferchrissakes it’s snowing in Winter Semester. I can do this at home in my SnugSack, with a cat and some Earl Grey Hot by my side.

Haha, hyperbole; look at the amusing Straw Men dancing.

No, really. I’m done.

It all comes down to one thing: Successful scholars do what other scholars do. Their work is exactly conversation among themselves; it is not about what’s useful or powerful but also unremarked. There are very few back-catalog “wildcatters”. Nobody looks at old content except when there’s a Gold Rush already in progress, as in the 18C these last few years… since the 18C documents have been digitized.

In the end, nobody in Computer Science these days ever cites the 1970s papers of Lindenmayer and his colleagues; few Victorianists ever read 19C magazine reviews; and who reads the steam-era mathematical journals, or even 1980s issues of Cell? You buy them at the library sale so you can use them to press flowers in.

The value of academic raw materials (manuscripts, published works, articles) for scholars is realized only through the conversations they spark. Scholarship is conversation. Many conversations are going on at once, surely. Now and then a conversation will shift, when somebody with a carrying voice brings up a new topic, and some of the time this new topic is something rediscovered and revived from long ago.

Of course the back-catalogs are full of interesting, useful, intriguing topics. The journals are full of untraveled roads and research ideas left unexplored; the musings and odd references and preceding conversations of a dozen generations are written down there for all to see.

But in the modern University, one is rewarded only for following and contributing to the ongoing conversation with one’s peers. It’s worth it, to pay a fee to join the club. Access to what everybody else is saying: that’s what’s worth the money. That’s how re-re-re-publishers can charge for access, and also why they need not charge for old stuff. Old stuff, that’s the record of silenced conversation. And being silent, it is no longer scholarship.

Besides, now we blog. Oh, wait…


[1] If only they had time to organize and use it, they might offer up a real threat to those old arms-dealing family publishing ventures from the 17th century…

Posted by Tozier.

One thought on “Digitization access licensing and scholarship’s ‘best before’ date”

  1. I really hope there are some University librarians reading this who can shed some light on the matter. I got the impression that there was some triangulation between the scholars (demanding content from the library), the librarians (sending the scholars after the admin), and the administration (slashing the library budget to keep classrooms from being overcrowded). Nobody takes a long view in such circumstances, so we ended up with the unpleasant license agreements.