(This blog post builds upon some of the observations I made in an earlier blog post on Google, Wikipedia and the blogosphere, but unlike that post, has a more substantive part dedicated to analysis. It also builds on the previous post, Can the Internet destroy the University?.)
I recently came across Michael Nielsen’s website. Michael Nielsen was a quantum computation researcher — he’s the co-author of Quantum computation and quantum information (ISBN 978-0521632355). Now, Nielsen is working on a book called The Future of Science, which discusses how online collaboration is changing the way scientists solve problems. Here’s Nielsen’s blog post describing the main themes of the book.
Journals — boon to bane?
Here is a quick simplification of Nielsen’s account. In the 17th century, inventors such as Newton and Galileo did not publish their discoveries immediately. Rather, they sent anagrams of these discoveries to friends, and continued to work on their discoveries in secret. Their main fear was that if they widely circulated their idea, other scientists would steal the idea and take full credit for it. By keeping the idea secret, they could develop it further and release it in a more ripe form. In the meantime, the anagram could be used to prove precedence in case somebody else also came up with the idea.
Nielsen argues that the introduction of journals, combined with public funding of science and the recognition of journal publications as a measure of academic achievement, led scientists to publish their work and thus divulge it to the world. However, today, journal publishing competes with an even more vigorous and instantaneous form of sharing: the kind of sharing done in blogs, wikis, and online forums. Nielsen argues that this kind of spontaneous sharing of rough drafts of ideas, of small details that may add up to something big, opens up new possibilities for collaboration.
In this respect, the use of online tools allows for a “scaling up” of the kind of intense, small-scale collaboration that formerly occurred only in face-to-face contact between trusted friends or close colleagues. However, Nielsen argues that academics, eager to get published in reputable journals, may be reluctant to use online forums to ask and answer questions of distant strangers. Two factors are at play here: first, the system of academic credit and tenure does little to reward online activity as opposed to publishing in journals. Second, scientists may fear that other scientists can get a whiff of their idea and beat them in the race to publish.
(Nielsen develops “scaling up” more in his blog post, Doing Science Online).
Nielsen says that this is inefficient. Economists do not like deadweight losses (Market wiki entry, Wikipedia entry) in markets: situations where one person has something to sell to another, and the other person is willing to pay the price, but the deal doesn’t occur. Nielsen says that such deadweight losses occur routinely in academic research. Somebody has a question, and somebody else has an answer. But due to the high search cost (Market wiki entry, English Wikipedia entry), i.e., the cost of finding the right person with the answer, the first person never gets the answer, or has to struggle a lot to get it. This means a lot of time lost.
Online tools can offer a solution to the technical problem of information-seekers meeting information-providers. The problem, though, isn’t just one of technology. It is also a problem of trust. In the absence of enforceable contracts or a system where the people exchanging information can feel secure about not being “cheated” (in this case, by having their ideas stolen), people may hesitate to ask questions to the wider world. Nielsen’s suggestions include developing robust mechanisms to measure and reward online contribution.
Blogging for mathies?
Some prominent mathematical bloggers that I’ve come across: Terence Tao (Fields Medalist and co-prover of the Green-Tao theorem), Richard E. Borcherds (famous for his work on Moonshine), and Timothy Gowers. Tao’s blog is a mixed pot of lecture notes, updates on papers uploaded to the arXiv, and his thoughts on things like the Poincaré conjecture and the Navier-Stokes equations. In fact, in his post on doing science online, Nielsen uses the example of a blog post by Tao explaining the hardness of the Navier-Stokes equations. In Nielsen’s words:
The post is filled to the brim with clever perspective, insightful observations, ideas, and so on. It’s like having a chat with a top-notch mathematician, who has thought deeply about the Navier-Stokes problem, and who is willingly sharing their best thinking with you.
Following the post, there are 89 comments. Many of the comments are from well-known professional mathematicians, people like Greg Kuperberg, Nets Katz, and Gil Kalai. They bat the ideas in Tao’s post backwards and forwards, throwing in new insights and ideas of their own. It spawned posts on other mathematical blogs, where the conversation continued.
Tao and others, notably Gowers, also often throw around ideas about how to make mathematical research more collaborative. In fact, I discovered Michael Nielsen through a post by Timothy Gowers, Is massively collaborated mathematics possible?, which mentions Nielsen’s post on doing science online. (Nielsen later critiqued Gowers’ post.) Gowers considers alternatives such as a blog, a wiki, and an online forum, and concludes that an online forum best serves the purpose of working collaboratively on mid-range problems: problems that aren’t too easy and aren’t too hard.
My fundamental disagreements
A careful analysis of Nielsen’s thesis will take more time, but off-the-cuff, I have at least a few points of disagreement about the perspective from which Nielsen and Gowers are looking at the issue. Of course, my difference in perspective stems from my different (and altogether considerably more limited) experience compared to theirs.
I fully agree with Nielsen’s economic analysis with regard to research and collaboration: information-seekers and information-providers not being able to get in contact often leads to squandered opportunities. I’ve expressed similar sentiments myself in previous posts, though not as crisply as Nielsen.
My disagreement is with the emphasis on “community” and “activity”. Community and activity could be very important to researchers, but in my view they can obscure the deeper goal of growing knowledge. And it seems that in the absence of strong clusters, community and activity can result in a system that is almost as inefficient.
In the early days of the Internet, mailing lists were a big thing (they continue to be a big thing, but their relative significance in the Internet has probably declined). In those days, the Usenet mailing lists and bulletin board systems often used to be clogged with the same set of questions, asked repeatedly by different newbies. The old hands, who usually took care of answering the questions, got tired of this repetition of the same old questions. Thus was born the “Usenet FAQ”. With this FAQ, the mailing lists stopped getting clogged with the same old questions and people could devote attention to more challenging issues.
Forums (such as Mathlinks, which uses PHPBB) are a little more advanced than mailing lists in terms of the ability to browse by topic. However, they are still fundamentally a collection of questions and answers posted by random people, with no overall organizing framework that aids exploration and learning. In a situation where the alternative to a forum is no knowledge, a forum is a good place. In fact, a forum can be one input among many for building a systematic base of knowledge. But when a forum is built instead of a systematic body of knowledge, the result could be a lot of duplication and inefficiency and the absence of a bigger picture.
Systematic versus creative? And the irony of Wikipedia
Systematic to some people means “top-down”, and top-down carries negative connotations for many; or at any rate, non-positive connotations. For instance, the open source movement, which includes Linux and plenty of “free software”, prides itself on being largely a bottom-up movement, with uncoordinated people working of their own volition to contribute small pieces of code to a large project. Top-down direction could not have achieved this. In economic jargon, when each person is left to make his or her own choices, the outcome is invariably more efficient, because people have more “private information” about their interests and strengths. (In many of his posts, Nielsen uses open source as an example of where science might go by being more open; see, for instance, this one on connecting scientists to scientists.)
But when I’m saying systematic, I don’t necessarily mean top-down. Rather, I mean that the system should be such that people know where their contributions can go. The idea is to minimize the loss that may happen because one person contributes something at one place, but the other person doesn’t look for it there. This is very important, particularly in a large project. A forum to solve mathematical questions has an advantage over offline communication: the content is available for all to see. But this advantage is truly meaningful only if everybody who is interested can locate the question easily.
Systematic organization does not always mean less of a sense of community and activity, but this is usually the case. When material is organized through internal and logical considerations, considerations of chronological sequence and community dynamics take a backseat. The ultimate irony is that Wikipedia, which is often touted as the pinnacle of Web 2.0 achievement, seems to prove exactly the opposite: the baldness, anti-contextuality, canonical naming, and lack of a “time” element in Wikipedia’s entries are arguably its greatest strengths.
Through choices of canonical naming (the name of an article is precisely its topic), extensive modularization (a large number of individual units, namely the separate articles), a neutral, impersonal, no-credit-to-author-on-the-article style, and extensive strong internal linking, Wikipedia has managed to become an easy reference for all. If I want to read the entry on a topic, I know exactly where to look on Wikipedia. If I want to edit it, I know exactly what entry to edit, and I’m guaranteed that all future people reading the Wikipedia entry looking for that information will benefit from my changes. In this respect, the Wikipedia process is extraordinarily efficient. (It is inefficient in many other ways, namely, the difficulty of quality control, measured by the massive amount of volunteer hours spent combating obvious and non-obvious spam, as well as the tremendous amount of time spent in inordinate battle over control and editing of particular entries).
The power of the Internet is its perennial and reliable availability (for people with reliable access to electricity, machines, and Internet connections). And Wikipedia, through the ease with which one can pinpoint and locate entries, and the efficiency with which it funnels the efforts both of readers and contributors to edit a specific entry, builds on that power. And I suspect that, for a lot of us, a lot of the time we’re using the Internet, we aren’t seeking exciting activity, a sense of community, or personal solidarity. We want something specific, quickly. Systematic organization and good design and architecture that gets us there fast is what we need.
What can online resources offer?
A blog creates a sense of activity, of time flowing, of comments ordered chronologically, of a “conversation”. This is valuable. At the same time, a systematically organized resource, one that arranges material not by a timeline of discovery but by intrinsic characteristics of the underlying knowledge, is usually better for quick lookup and user-directed discovery (where the user is in charge of things).
It seems to me that the number of successful “activity-based online resources” will continue to remain small. There will be few quality blogs that attract high-quality comments, because the effort and investment that goes into writing a good blog entry is high. There may be many mid-ranging blogs offering random insights, but these will offer little of the daily adventure feeling from a high-traffic, high-comment blog.
On the other hand, the market for quick “pinpoint references” (the kind of resources that you can use to quickly look something up) seems huge. A pinpoint reference differs from a forum in this obvious way. In a forum you ask a question and wait for an answer, or you browse through previously asked questions. In a pinpoint reference, you decide you want to know about a topic, and go to the page, and BANG, the answer’s already there, along with a lot of stuff you might have thought of asking but never got around to, all neatly organized and explorable.
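For readers who think in code, here is a rough Python sketch of that contrast. It is only an illustration under simplifying assumptions, not a description of how any real forum or wiki works: the names `Post`, `search_forum`, `canonical`, and `PinpointReference` are all hypothetical, and the sample posts are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """One forum message, kept in the order it was written (hypothetical)."""
    author: str
    text: str


def search_forum(posts, term):
    """Scan every post in chronological order; the work grows with the archive."""
    term = term.lower()
    return [p for p in posts if term in p.text.lower()]


def canonical(title):
    """Normalize a topic name so that each topic has exactly one key."""
    return " ".join(title.lower().split())


class PinpointReference:
    """Toy reference: one page per canonical topic name (hypothetical)."""

    def __init__(self):
        self._pages = {}

    def contribute(self, title, text):
        # Every contribution on a topic lands on the same page.
        self._pages.setdefault(canonical(title), []).append(text)

    def lookup(self, title):
        # A reader goes straight to the page; no scanning, no waiting.
        return self._pages.get(canonical(title), [])


# Made-up sample data for illustration only.
forum = [
    Post("newbie1", "What is a normal subgroup?"),
    Post("oldhand", "Question about cosets"),
    Post("newbie2", "Is a normal subgroup the same as an invariant one?"),
]

reference = PinpointReference()
reference.contribute("Normal subgroup", "A subgroup invariant under conjugation.")
reference.contribute("normal  SUBGROUP", "Equivalently, the kernel of a homomorphism.")

forum_hits = search_forum(forum, "normal subgroup")
page = reference.lookup("Normal Subgroup")
```

In the forum, the same question turns up in two separate posts that a reader has to find and stitch together. In the toy reference, both contributions end up on the single page a reader would look for first.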
Fortunately or unfortunately, the notion of “community” and “activity” is more appealing in a naive, human sense than the notion of pinpoint references. “Chatting with a friend” has more charm to it than having electricity. But my experience with the way people actually work seems to suggest that people value self-centered, self-directed exploration quite a bit, and may be willing to sacrifice a sense of solidarity or “being with others in a conversation” for the sake of more such exploration. Pinpoint resources offer exactly that kind of a self-directed model to users.
My experiment in this direction: subject wikis
I started a group theory wiki in December 2006, and have since extended it to a general subject wikis website. The idea is to have a central source, the subject wikis reference guide, from where one can search for terms and get short general definitions, with links to more detailed entries in individual subject wikis. See, for instance, the entry on “normal”.
I’ve also recently started a blog for the subject wikis website, that will describe some of the strategies and approaches and choices involved in the subject wikis.
It’s not clear to me how this experiment will proceed. At the very least, my work on the group theory wiki is helping me with my research, while my work on the other wikis (which has been very little in comparison) has helped me consolidate the standard knowledge I have in these subjects along with other tidbits of knowledge or thoughts I’ve entertained. Usage statistics seem to indicate that many people are visiting and finding useful the entries on the group theory subject wiki, and there are a few visitors to each of the other subject wikis as well. What isn’t clear to me is whether this can scale to a robust reference where many people contribute and many people come to learn and explore.