What’s Wrong With Asking the Crowd?

When I was in library school, about half an Internet generation ago [0], we were warned, very specifically and repeatedly, against relying on Google or Wikipedia or any other online resource authored by “non-authoritarian” sources. Instead, we were directed toward proprietary academic and professional databases: EDGAR, Dialog, LexisNexis, Westlaw, and the like.

I chalked up a great deal of this propagandizing to existential angst amongst an older generation of library professionals. The unwashed public having direct access to raw information without the kindly and professional intermediation provided by suitably indoctrinated gatekeepers? Quelle horreur! [1]

I had been an IT professional for about 10 years by then [2] and had nothing but the deepest respect for the free-form conversation that is the Internet [3]. Sure, plenty of the information you find may be wrong, but if so, somebody else will be along shortly shouting at the top of their CAPS LOCK key precisely how wrong it is, with illustrative asides and digressions into the quality of the original poster’s intelligence, reading comprehension, research methodology, and parentage, including hyperlinked footnotes to, for example, the website of the guy who invented whatever the heck you’re talking about. If you have a high tolerance for alpha-geek posturing, the Internet can give you one hell of an education.

On the other hand, I once spent three hours attending a seminar about how to construct Dialog [4] search strings. The Dialog rep was all enthusiastic about the up-to-date bright! shiny! web-based! coat of paint they had just slapped over their “state-of-the-art” [5] interface.

I finally raised my hand and asked what was, to me, the obvious question.

“Instead of requiring your users to memorize a series of unintuitive and non-standardized tags and switches in order to construct a search string that reads like cuneiform, why not build a web form and use a piece of middleware to do the translation into Dialog syntax?”

The Dialog rep got that deer-in-the-headlights look. “Our techs say that’s impossible.” [6]

“Well, since all of your content is already digitized, why not just build a comprehensive index and write a search-engine-like interface that can handle natural-language queries?”

“We can’t do that, either.”

“You could hire Google to do it. They’ve already written the search algorithms; all they would have to do is customize the bots for your environment.”

“…… Moving on— here’s the exact syntax for searching by author in this database….”

I knew then that the proprietary pay-ware information database model was doomed.

Hackers are fond of saying that information wants to be free. They’re not entirely correct. I like the way Charles Stross put it: “The dirty little secret of the intelligence-gathering job is that information doesn’t just want to be free— it wants to hang out on street corners wearing gang colors and terrorizing the neighbors.” [7] Information wants to be loved, it wants to be discussed, to be picked apart, to be an active participant in our day-to-day lives. It resides in our collective headspace and feeds on conversation. It can’t do that locked behind a paywall and proprietary interface. But it can flourish on the open Internet.

Go forth and be a part of the conversation.

[0] That is, four years.
[1] Those of us who had staked out our email addresses pre-1995 had a somewhat different attitude— we were the ones coining the terms “crowdsourcing,” “open source,” and “folksonomy.”
[2] Job description: Googling for error messages to see who else had bushwhacked their way through this particular stretch of misconfiguration hell.
[3] With the possible exception of those with aol.com email addresses. I thought for years America Online users were the nadir of networked human potential— then WebTV made its debut. At least America Online made you learn to use a computer.
[4] At the time Dialog was considered the platinum standard of academic research databases. I considered it a dinosaur.
[5] For values of “state-of-the-art” that fell into a cryogenic freezing chamber in the early 1980s and had only recently been reanimated as the result of a lightning strike or nuclear accident.
[6] I had spent my week writing exactly that kind of middleware, and knew it could easily be done with PHP, Perl, Python, or any of a dozen other scripting languages beginning with the letter P.
[7] From his excellent Laundry Files novel The Jennifer Morgue. Go read it. It’s Lovecraftian horror meets James Bond meets selected specimens from the O’Reilly bestiary, and if you understood that description, you will understand all the jokes.
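
For the curious, the middleware described in [6] might look something like the following. This is a minimal sketch in Python, not anything Dialog ever shipped: the field tags (AU, TI, PY), the `AND` joiner, and the function names are all invented for illustration and should not be read as real Dialog syntax. The point is only that turning a web form into a tagged search string is a lookup table and a loop.

```python
# Hypothetical field tags, invented for illustration (not real Dialog syntax).
FIELD_TAGS = {"author": "AU", "title": "TI", "year": "PY"}


def quote(value: str) -> str:
    """Wrap multi-word values in double quotes so they stay one search term."""
    value = value.replace('"', "").strip()
    return f'"{value}"' if " " in value else value


def build_query(form: dict[str, str]) -> str:
    """Translate submitted web-form fields into one tagged search string."""
    clauses = []
    for field, tag in FIELD_TAGS.items():
        value = form.get(field, "").strip()
        if value:  # skip fields the user left blank
            clauses.append(f"{tag}={quote(value)}")
    if not clauses:
        raise ValueError("at least one search field is required")
    return " AND ".join(clauses)


print(build_query({"author": "Stross", "title": "Jennifer Morgue"}))
```

The user fills in plain form fields; the cuneiform gets generated where nobody has to look at it. A real version would need the vendor's actual tag list and escaping rules, which is exactly the part I am assuming here.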
