some thoughts on open data (1): movements and communities

Back in May Tom Slee started what turned out to be a really interesting conversation on open government data when he wrote a couple of blog posts that were fairly critical of the open data movement as seen from the perspective of progressive politics. The thread then got picked up by Crooked Timber, which ran a seminar on open data in late June/early July. The conversation pulled in a number of open data advocates and critics, and I guess it got a bit heated at times, but I found the whole thing helpful in drawing out a lot of the things people should consider when thinking about open data. If you missed the discussion and are interested in reading more, I put up a whole bunch of links in my last post.

I meant to write some comments of my own while the discussion was still going, but various things came up and I’m finally just now getting back to it. The conversation was broad enough that it would be impossible to respond to everything; besides, I’m still at a “gathering my thoughts” stage about all of this. So what I’m going to try to do in this series of posts is just pull out some of the more salient points that came up about open data, and then add a few of my own comments. I’m numbering the posts, but just for convenience – I’m coming up with the order as I go.

1. Movements and communities

One point of dispute was whether the “open data movement” can really be considered a movement – in the sense of having a unified goal or goals – or if, as Slee argues, there are just too many different groups and individuals involved, representing too many different goals and interests for it to be considered just one thing. If pressed, I’d probably shy away from using the term “movement” to describe the broader phenomenon, but for now I’m just going to sidestep the issue. Like some other commenters, I’m not sure how important it is to come to a strict conclusion on this point.

I do think that you can, arguably, identify an open data community. Now, you might object, didn’t I just substitute one seemingly-precise-but-not-strictly-defined term (“community”) for another (“movement”)? It’s a valid question. I just find the concept of a community more helpful in thinking about open data advocates. Different groups might have different ultimate goals and interests beyond their open data work, but while they’re working on open data, they generally seem to use similar language and to have similar needs. This still could prove to be a rather transient community as people move on to other things, but they’re neighbors while in town.

Or maybe that’s still too simplified a way of looking at things. Most of the discussion of open data I’ve been referring to has been about open government data: data collected, produced, and disseminated by government bodies at any jurisdictional level. But that’s just one category of data that could be made more open.

I can think of a few others: there’s the open access movement, which centers on academic research. It isn’t just about data, but it includes a call for open access to the data produced/collected as part of research activities. Some of this data comes from studies that have received government funding, but it’s not really government data in the sense of census data. A second example would be LODLAM – linked open data in libraries, archives, and museums. Again, many of these institutions are public, but this also doesn’t really fall into the government data category as conventionally understood.

While there are probably some people who are working on all of these types of data at once, it would be hard to say that there’s just one large community that includes everything under these umbrellas. So it might be more accurate to think of multiple open data communities, each with points in common with the others but working in different domains.

One final note: I’ve been reading Crooked Timber for years, and I got the impression that the open data seminar didn’t generate as much activity in the comment threads as some other topics have in the past. Now, I could just be mistaken about the comment volume, but I did feel like there was a real mismatch on some posts between the level of engagement of the posters and that of the commenters. (However, a few posts got a lot of comments, like Slee’s “Seeing Like a Geek”.) This impression is another reason I think it would be possible to identify an open data community: within it, people are eager to talk, debate, share ideas, but outside of it the level of interest is apparently much lower. Or maybe this was just a result of the seminar being held during the mid-summer holiday season.

Next post [not written yet]: government information and information about government.

My instructor was Mr. Langley, and he taught me to sing a song. If you’d like to hear it I can sing it for you.

[Note: This post originally appeared on my old blog on June 15, 2008 and I am posting it here under that date. I didn’t want to carry over that whole blog, but I did want to keep this post with me. For the record, today is actually July 24, 2012.]

————

I read this

Over the past few years I’ve had an uncomfortable sense that someone, or something, has been tinkering with my brain, remapping the neural circuitry, reprogramming the memory. My mind isn’t going—so far as I can tell—but it’s changing. I’m not thinking the way I used to think. I can feel it most strongly when I’m reading. Immersing myself in a book or a lengthy article used to be easy. My mind would get caught up in the narrative or the turns of the argument, and I’d spend hours strolling through long stretches of prose. That’s rarely the case anymore. Now my concentration often starts to drift after two or three pages. I get fidgety, lose the thread, begin looking for something else to do. I feel as if I’m always dragging my wayward brain back to the text. The deep reading that used to come naturally has become a struggle.

and thought that I’ve been having the same experience for a few years now, except that when I lose a thread while reading a book or article online and look for something else, that something else is more text in another tab or window. Then I remembered that I’ve always had to put energy into concentrating on what I’m reading, even if I find it interesting. The only exceptions are things I find engrossing – even if I don’t find them interesting. What makes something engross me? I don’t exactly know. I’d say “good writing” but that’s hardly a satisfying explanation.

I read this

Research that once required days in the stacks or periodical rooms of libraries can now be done in minutes. A few Google searches, some quick clicks on hyperlinks, and I’ve got the telltale fact or pithy quote I was after.

and thought that the seeming thinness of research aimed mainly at gathering “telltale fact”s or “pithy quote”s resides more in its goals than in its methods.

I read this

I’m not the only one. When I mention my troubles with reading to friends and acquaintances—literary types, most of them—many say they’re having similar experiences. The more they use the Web, the more they have to fight to stay focused on long pieces of writing. Some of the bloggers I follow have also begun mentioning the phenomenon. Scott Karp, who writes a blog about online media, recently confessed that he has stopped reading books altogether. “I was a lit major in college, and used to be [a] voracious book reader,” he wrote. “What happened?” He speculates on the answer: “What if I do all my reading on the web not so much because the way I read has changed, i.e. I’m just seeking convenience, but because the way I THINK has changed?”

and the sentiments felt familiar. I may have always had to work to keep focused on long writing, but I used to finish books at a much higher rate. Outside of required readings, I used to start multiple books at once until I found one that held my interest until I finished it, at which point I re-started the process. Now it seems like I’m always beginning books.

I read this

“I can’t read War and Peace anymore,” he admitted. “I’ve lost the ability to do that. Even a blog post of more than three or four paragraphs is too much to absorb. I skim it.”

and thought, who can read War and Peace in any sort of “normal” way at all? I read it in bunches over a period of about a month, quickly at first when I was into it, more slowly when I began to get frustrated with the plot about halfway through, lethargically as I approached the end, determinedly as I read the final few hundred pages in one sitting, knowing that if I put it down I was in danger of never picking it up again. I reflected that reading fiction has always been a different experience for me than reading non-fiction. I can’t skim fiction. I might read blog posts quickly, but I don’t skim them unless I’m deciding whether or not to then read them.

I read this

As part of the five-year research program, the scholars examined computer logs documenting the behavior of visitors to two popular research sites, one operated by the British Library and one by a U.K. educational consortium, that provide access to journal articles, e-books, and other sources of written information. They found that people using the sites exhibited “a form of skimming activity,” hopping from one source to another and rarely returning to any source they’d already visited. They typically read no more than one or two pages of an article or book before they would “bounce” out to another site. Sometimes they’d save a long article, but there’s no evidence that they ever went back and actually read it.

and wondered if there was also evidence that they never went back and actually read those articles. I wondered if the authors considered that people may be exhibiting “a form of skimming activity” because they were skimming to see which of their search results were useful, if any. Or because they were curious about something they found but weren’t looking for. I wondered if browsing nearby books in the stacks is “a form of skimming activity.” I wondered if this says something about how people search as well as about how people read.

I read this

“We are not only what we read,” says Maryanne Wolf, a developmental psychologist at Tufts University and the author of Proust and the Squid: The Story and Science of the Reading Brain. “We are how we read.” Wolf worries that the style of reading promoted by the Net, a style that puts “efficiency” and “immediacy” above all else, may be weakening our capacity for the kind of deep reading that emerged when an earlier technology, the printing press, made long and complex works of prose commonplace. When we read online, she says, we tend to become “mere decoders of information.” Our ability to interpret text, to make the rich mental connections that form when we read deeply and without distraction, remains largely disengaged.

and tried to remember where I saw Wolf’s work discussed recently. I resisted searching for it right then and there. [I later looked and found it: Caleb Crain’s essay “Twilight of the Books” on the future of reading.]

I read this

Reading, explains Wolf, is not an instinctive skill for human beings. It’s not etched into our genes the way speech is. We have to teach our minds how to translate the symbolic characters we see into the language we understand. And the media or other technologies we use in learning and practicing the craft of reading play an important part in shaping the neural circuits inside our brains. Experiments demonstrate that readers of ideograms, such as the Chinese, develop a mental circuitry for reading that is very different from the circuitry found in those of us whose written language employs an alphabet. The variations extend across many regions of the brain, including those that govern such essential cognitive functions as memory and the interpretation of visual and auditory stimuli. We can expect as well that the circuits woven by our use of the Net will be different from those woven by our reading of books and other printed works.

and thought, that may be true, but doesn’t that mean we have the flexibility to re-wire if we change our behavior? So to the extent that there’s a change taking place, it might not be a permanent one.

I read this

Sometime in 1882, Friedrich Nietzsche bought a typewriter—a Malling-Hansen Writing Ball, to be precise. His vision was failing, and keeping his eyes focused on a page had become exhausting and painful, often bringing on crushing headaches. He had been forced to curtail his writing, and he feared that he would soon have to give it up. The typewriter rescued him, at least for a time. Once he had mastered touch-typing, he was able to write with his eyes closed, using only the tips of his fingers. Words could once again flow from his mind to the page.

and thought of Francis Parkman, who needed a special tool to help him hand-write along straight lines as his vision worsened.

I read this

In Technics and Civilization, the historian and cultural critic Lewis Mumford described how the clock “disassociated time from human events and helped create the belief in an independent world of mathematically measurable sequences.” The “abstract framework of divided time” became “the point of reference for both action and thought.”

and thought that maybe if I finish reading Mumford’s two best-known Cities books, I might read some of his other work. I remembered that I decided not to get a used copy of Technics and Civilization recently because I wasn’t sure how it stood in relation to his other work – and, more importantly, because it was kind of heavy and I didn’t want to carry it when I moved.

I read this

When the Net absorbs a medium, that medium is re-created in the Net’s image.

and was reminded of Marx writing that the bourgeoisie creates the world in its own image.

I read the rest of that paragraph

It injects the medium’s content with hyperlinks, blinking ads, and other digital gewgaws, and it surrounds the content with the content of all the other media it has absorbed. A new e-mail message, for instance, may announce its arrival as we’re glancing over the latest headlines at a newspaper’s site. The result is to scatter our attention and diffuse our concentration.

and thought: you can change a lot of those settings, you know.

I read this

The Net’s influence doesn’t end at the edges of a computer screen, either. As people’s minds become attuned to the crazy quilt of Internet media, traditional media have to adapt to the audience’s new expectations. Television programs add text crawls and pop-up ads, and magazines and newspapers shorten their articles, introduce capsule summaries, and crowd their pages with easy-to-browse info-snippets. When, in March of this year, The New York Times decided to devote the second and third pages of every edition to article abstracts, its design director, Tom Bodkin, explained that the “shortcuts” would give harried readers a quick “taste” of the day’s news, sparing them the “less efficient” method of actually turning the pages and reading the articles. Old media have little choice but to play by the new-media rules.

and wondered if the author thought these were all bad developments. More ads and shorter articles certainly don’t seem like a positive step, but abstracts and snippets, done well, could be quite helpful. Assuming abstracts aren’t all that people ever read.

I read this

Taylor’s system is still very much with us; it remains the ethic of industrial manufacturing. And now, thanks to the growing power that computer engineers and software coders wield over our intellectual lives, Taylor’s ethic is beginning to govern the realm of the mind as well. The Internet is a machine designed for the efficient and automated collection, transmission, and manipulation of information, and its legions of programmers are intent on finding the “one best method”—the perfect algorithm—to carry out every mental movement of what we’ve come to describe as “knowledge work.”

Google’s headquarters, in Mountain View, California—the Googleplex—is the Internet’s high church, and the religion practiced inside its walls is Taylorism. Google, says its chief executive, Eric Schmidt, is “a company that’s founded around the science of measurement,” and it is striving to “systematize everything” it does. Drawing on the terabytes of behavioral data it collects through its search engine and other sites, it carries out thousands of experiments a day, according to the Harvard Business Review, and it uses the results to refine the algorithms that increasingly control how people find information and extract meaning from it. What Taylor did for the work of the hand, Google is doing for the work of the mind.

The company has declared that its mission is “to organize the world’s information and make it universally accessible and useful.” It seeks to develop “the perfect search engine,” which it defines as something that “understands exactly what you mean and gives you back exactly what you want.” In Google’s view, information is a kind of commodity, a utilitarian resource that can be mined and processed with industrial efficiency. The more pieces of information we can “access” and the faster we can extract their gist, the more productive we become as thinkers.

and had a few questions:

  1. What happened to labor? Is Google’s workforce organized along Taylorized lines? Reports suggest that the answer is “no,” at least for some subset of employees.
  2. How can the claim that the internet is encouraging Taylor-like efficiency be reconciled with an article premised on distraction and lack of concentration? It sounds like it is the search engine itself that’s being Taylorized.

I read this

“The ultimate search engine is something as smart as people—or smarter,” Page said in a speech a few years back. “For us, working on search is a way to work on artificial intelligence.” In a 2004 interview with Newsweek, Brin said, “Certainly if you had all the world’s information directly attached to your brain, or an artificial brain that was smarter than your brain, you’d be better off.” Last year, Page told a convention of scientists that Google is “really trying to build artificial intelligence and to do it on a large scale.”

and thought it sounded like marketing.

I read this

The idea that our minds should operate as high-speed data-processing machines is not only built into the workings of the Internet, it is the network’s reigning business model as well. The faster we surf across the Web—the more links we click and pages we view—the more opportunities Google and other companies gain to collect information about us and to feed us advertisements. Most of the proprietors of the commercial Internet have a financial stake in collecting the crumbs of data we leave behind as we flit from link to link—the more crumbs, the better. The last thing these companies want is to encourage leisurely reading or slow, concentrated thought. It’s in their economic interest to drive us to distraction.

and thought it was a good point. I wondered if it would have been better to build the article around this observation rather than around reading. Page layouts, column widths, displaying articles on one or on multiple pages, print versions, linking within the same site or set of sites – all of these things affect the way we read and are affected by the way we read (since site designers have to try to grab and hold our attention). The internet is not just some undifferentiated entity known as “the internet”; search engines don’t just pull up “the best” or “the most efficient” results at the top. There is a sense in which technology “uses” us, sure, but that shouldn’t obscure the ways technology mediates the way people interact with or act upon each other. That’s one of the reasons we use the word “media,” right? (Or is that a false etymology?)

I read this

In Plato’s Phaedrus, Socrates bemoaned the development of writing. He feared that, as people came to rely on the written word as a substitute for the knowledge they used to carry inside their heads, they would, in the words of one of the dialogue’s characters, “cease to exercise their memory and become forgetful.” And because they would be able to “receive a quantity of information without proper instruction,” they would “be thought very knowledgeable when they are for the most part quite ignorant.” They would be “filled with the conceit of wisdom instead of real wisdom.” Socrates wasn’t wrong—the new technology did often have the effects he feared—but he was shortsighted. He couldn’t foresee the many ways that writing and reading would serve to spread information, spur fresh ideas, and expand human knowledge (if not wisdom).

and thought that what Plato and the article both leave out is the unreliability of memory and the ability to check it against a documentary record (which itself isn’t always reliable).

I read this

The arrival of Gutenberg’s printing press, in the 15th century, set off another round of teeth gnashing. The Italian humanist Hieronimo Squarciafico worried that the easy availability of books would lead to intellectual laziness, making men “less studious” and weakening their minds.

and thought of Ann Blair’s article about Early Modern information overload, which you can find summarized here [dead link removed] and here [dead link removed] (the latter link points to an ungated version on this page).

I read this

The kind of deep reading that a sequence of printed pages promotes is valuable not just for the knowledge we acquire from the author’s words but for the intellectual vibrations those words set off within our own minds. In the quiet spaces opened up by the sustained, undistracted reading of a book, or by any other act of contemplation, for that matter, we make our own associations, draw our own inferences and analogies, foster our own ideas. Deep reading, as Maryanne Wolf argues, is indistinguishable from deep thinking.

and was reminded, despite my skepticism about much of the article, how much I too value sustained reading. But having read some books online in the past two years, I don’t know that it has to be print-based.

I finished reading the article. I tracked down some links and planned to post on it in a day or two. I read some other things online. I turned off the computer and began reading this book, which I’ve been meaning to read since I mentioned it months ago. It could be years before I finish it.

[The original post can still be viewed here.]