What Is Research?

April 18, 2009

Tricki goes live

Filed under: Tricki, Wikis — vipulnaik @ 3:42 pm

A short while ago, I blogged about the Tricki, mentioning what I considered an unnecessary subservience to Wikipedia. At the time, Tricki was still pre-live. Recently, Tim Gowers announced that the Tricki is “fully live”, which means that anybody can create a login and add and edit entries.

The Tricki already has a reasonable number of decent articles. An interesting one that I located in group theory is: first pretend that a normal subgroup is trivial. Here’s a link to the list of group theory-related Tricki articles. Here is a list of tags that can be used to navigate the Tricki.

The concept of the Tricki is interesting, and I suspect that a lot of interesting stuff will go on there in the next few months. On the other hand, the main problem I see with the Tricki is that, while it has a lot of useful tricks, there is no easy standard reference it can use for looking up more details of the terms and facts used. The current consensus seems to be to link to Wikipedia, but relying on a single resource, especially one of dubious value, seems restrictive, since the specific Wikipedia article pointed to may not be very good. Consider, for instance, this page where, in the comments, Gowers notes that he “wouldn’t mind being reminded” what a characteristic subgroup is.

Here is what I suggest. For each of the concepts, Tricki should have a short page giving the definition and useful links, and the links from the Tricki article pages should point to this short page. This way, article writers can concentrate on writing their articles rather than on defining all the auxiliary concepts involved (particularly in cases where these auxiliary concepts are reasonably standard and known to many of the readers of the article).

The concept page could serve three roles:

  • A concise definition.

  • A list of related Tricki articles (this could be automatically generated by backlinks, or manually created and organized).

  • A list of references/weblinks for more details. This list should be carefully curated, and people making such links should actually read the reference being linked to! Wikipedia could, of course, be one of the references typically given, but it need not be the only one.
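
To make the backlinks idea in the second role concrete, here is a minimal sketch in Python. It assumes we already know, for each article, which concept pages it links to. The function name, the data layout and the second article title are hypothetical and purely illustrative; this says nothing about how the Tricki software actually stores its links.

```python
from collections import defaultdict


def build_backlinks(article_links):
    """Invert an article -> concepts mapping into a concept -> articles mapping.

    article_links is a hypothetical dict whose keys are article titles and
    whose values are lists of the concept pages each article links to.
    """
    backlinks = defaultdict(set)
    for article, concepts in article_links.items():
        for concept in concepts:
            backlinks[concept].add(article)
    # Sort the titles so that each concept page lists its articles in a stable order.
    return {concept: sorted(articles) for concept, articles in backlinks.items()}


# Illustrative data only: the link lists and the second title are made up.
article_links = {
    "First pretend that a normal subgroup is trivial": [
        "normal subgroup",
        "characteristic subgroup",
    ],
    "Hypothetical article on quotient arguments": ["normal subgroup"],
}

related = build_backlinks(article_links)
for concept in sorted(related):
    print(concept, "->", related[concept])
```

A manually curated list could still override or reorder what such a script produces; the automatic version only guarantees that no linking article is missed.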

Why do I think that separate concept pages are important? To take a totally different example, consider newspaper and magazine websites. Many of these websites have their own reference pages on a number of important topics — these reference pages provide a short summary, external links, as well as details of past coverage of the topic. For instance, The New York Times has Times Topic pages on almost all the things it reports on regularly, and its news articles have internal links to these Times Topic pages. While this is partly a tactic to maintain link juice, it also helps provide entry points for people seeking to get information on topics and get an overview of how the NYT has covered the topic. See, for instance, the Times Topic pages on Wikipedia, Twitter, and The Amazon Kindle. It seems to me that despite the presence of Wikipedia pages that may carry a lot more information, these Times Topic pages serve a useful function to readers who are interested in newspaper coverage of the material.

Also, there is something — a lot, in fact — to be said for keeping links largely internal, particularly when building up an extensively cross-referenced body of knowledge, information or insight. Part of Wikipedia’s success can be attributed to its policy of strong internal linking — extensive linking to other Wikipedia pages from within Wikipedia articles.

UPDATE: Gowers has started a forum discussion on some of the issues raised in this blog post.

April 6, 2009

Tricki salutes Wikipedia

Filed under: Tricki, Wikipedia, Wikis — vipulnaik @ 12:41 am

Tim Gowers of Polymath fame announced in this blog post the release of a Prelive version of Tricki. Tricki stands for “tricks wiki”, and Gowers has been working on it for quite some time along with Olof Sisask and Alex Frolkin, as he mentioned in this earlier blog post. The eventual aim is to make the wiki open to general editing, though in the current “pre-live” stage, people can only view content and add comments, rather than edit the actual content.

As of now, it seems an interesting experiment, but the current scope appears too broad and vague. A more detailed review will take some time coming, but one thing already caught my eye: this page. All the editing, except page creation and a minor formatting change, seems to have been done by Gowers (view the revision history to confirm), so I’ll attribute the writing to Gowers.

Titled Why have a separate site rather than simply use Wikipedia?, the page tries to provide a “justification” for the Tricki. Some of the statements here depress me.

To begin with, the very premise of the heading seems mistaken. There are very good reasons why the kind of content on Tricki cannot and should not be on Wikipedia, and these reasons are Wikipedia’s own clearly stated policies, such as No original research, Notability, Reliable Sources, and verifiability. Or, just take a look at What Wikipedia Is Not, and it is clear that the kind of content that the tricks wiki currently contains and plans to expand into is not the kind of content allowed on Wikipedia.

Thus, Gowers’ statement:

In principle, it would be possible to write Tricki articles and put them on Wikipedia.

is just false! Yes, it would be possible technically, or “in practice”, but most of these articles would become candidates for deletion as per Wikipedia’s deletion policy. In other words, Gowers’ suggestion that Tricki could in principle be a part of Wikipedia but that they’re making a different choice for their own sake is misleading: Tricki does not have the choice to become a part of Wikipedia, and going it alone is the only practical alternative. (more…)

March 24, 2009

Concluding notes on the polymath project — and a challenge

Filed under: Wikis, polymath — vipulnaik @ 4:27 pm

In this previous blog post, I gave a quick summary of the polymath project, as of February 20, 2009. The project, which began around February 2, 2009, has now been declared successful by Gowers. While the original aim was to find a new proof of a special case of an already proved theorem, the project seems to have managed to find a new proof of the general case. There’s still discussion on how to clean up, prepare the draft of the final paper, and wrap up various loose ends.

In a subsequent blog post, Gowers gave his own summary of the project as well as what he thinks about the future potential of open collaborative mathematics. Michael Nielsen, who hosted the Polymath1 wiki where much of the collaboration occurred, also weighed in with this blog post.

In Gowers’ assessment, the project didn’t have the same kind of massive participation as he had hoped for. People like Gowers and Terence Tao participated quite a bit, and there were also a number of other people who made important contributions (my own estimate of this is around eight or nine, based on the comment threads, with an additional three or four who made a few crucial comments but did not participate extensively). But it still wasn’t “massive” the way that Gowers had envisaged it. Nielsen felt that, for a project just taking off, it did pretty well. He compared it to the early days of Wikipedia and the early days of Linux, and argued that the polymath project did pretty well compared to these other two projects, even though those projects probably had a lot larger appeal.

Good start, but does it scale?

Before the polymath project began (or rather, before I became aware of it), I wrote this blog post, where my main point was that while forums, blogs and “activity” sound very appealing, the greater value creation lies in having reliable online reference material that people can go to.

Does that make me a critic of polymath projects?

Well, yes and no. I had little idea at the time (just like everybody else) about whether the particular polymath project started by Gowers would be a success. Moreover, because Ramsey theory is pretty far from the kind of math I have a strong feel for, I had no idea how hard the problem would be. Nonetheless, a solution within a month for any nontrivial problem does seem very impressive. More important than the success in the project, what Gowers and the many others working on it should be congratulated for is the willingness to invest a huge amount of time into this somewhat new approach to doing math. Only through experimentation with new approaches can we get a feel for whether they work, and Gowers has possibly kickstarted a new mode of collaboration.

The “no” part, though, comes from my strong suspicion that this kind of thing doesn’t scale. (more…)

March 17, 2009

Academic and journalistic support for Wikipedia

Filed under: Wikipedia — vipulnaik @ 3:24 pm

Seth Finkelstein kindly responds to my blog post Wikipedia criticism, and why it fails to matter. Seth agrees with my basic point — it is hard to influence people away from either reading or editing Wikipedia. However, he entertains the hope that criticism might affect what he calls the “public intellectual” perspective of Wikipedia.

This is probably more likely, but the chances of this are dismal too. The digging up I’ve been doing of references to Wikipedia in books and writings suggests a fairly formulaic description of Wikipedia even by academic intellectuals who should know a lot more.

Here are the typical elements of a description of the online encyclopedia:

  • History beginning: Jimbo Wales started out in 2000 with the idea of a free encyclopedia. They started a “top-down” “expert-led” project called Nupedia that produced few articles. Then, “somebody told Wales about wiki software” and they implemented it and the ordinary people started contributing. (The writers who research facts better usually highlight the co-founder controversy and the fact that the proposal for Wikipedia was originally made by Larry Sanger; many writers omit to mention this.)
  • Surprise, surprise, it works: Here’s the paradox. An encyclopedia with nobody in charge, with nobody getting paid and with people supplying volunteer time, is as accurate as Britannica, the expert-written encyclopedia (quote a Nature study by Jim Giles). Surprise, surprise, surprise. Then, use this to prove the favorite theory the author is expounding (this could be “intellectual commons”, “creative commons”, “produsage”, “commons-based peer production”, or some variant of that). (The writers who do more research usually mention that the study was investigative journalism rather than scientific research, and also mention Britannica’s repudiation and Nature’s response to those repudiations.)
  • Wikipedia is not without its flaws (surprise, surprise, surprise). The most popular example here is the John Seigenthaler story. Some more serious researchers go so far as to mention other controversies such as the Essjay controversy.
  • But the fact that anybody can edit Wikipedia is its greatest strength, because what can be done can be undone. Quote this study by researchers at MIT and IBM to prove the point, or talk about some specific anecdotal example.

There are few books written about the phenomenon of Wikipedia as a whole. Most references to Wikipedia occur in books on something else, such as Web 2.0, user-generated content, or the greatness of the Internet, and the above kind of treatment fits in well with the points the author is trying to make.

Among the authors who have praised Wikipedia but done a more in-depth analysis than the above, I can think of Clay Shirky (in his book Here Comes Everybody), Axel Bruns (in his book Blogs, Wikipedia, Second Life and Beyond: From Production to Produsage) and Chris Anderson (in his book The Long Tail). Though I think the presentations of each of these authors have ample deficiencies, I believe each of them has made some original contribution in analysing Wikipedia. But the same cannot be said of a lot of people who give this simplistic presentation of Wikipedia, often stretching it out over several pages.

Why so much academic support for Wikipedia?

Contrast the legions of academics who write gloriously about Wikipedia, following and adding to the outline I’ve discussed above: Clay Shirky, Axel Bruns, Chris Anderson, Jonathan Zittrain (The Future of the Internet: and how to stop it), Lawrence Lessig (Code v 2), Cory Doctorow (Content), Yochai Benkler (Wealth of Networks), Tapscott and Williams (Wikinomics), Tom Friedman (The World Is Flat) and many others. Contrast this with the much smaller number of authors who take a genuinely critical stand on Wikipedia: Andrew Keen (who describes his own book, The Cult of the Amateur, as a “polemic” as opposed to a serious and balanced presentation) and Nicholas Carr (The Big Switch). Other critics of Wikipedia, many of whose names I outlined in my earlier blog posts, typically content themselves with writing blog posts. It may also be noted that most of the critics aren’t academic intellectuals who work and teach at universities. There are some librarian critics, such as Karen Schneider, who wrote this piece among others critical of Wikipedia; there are also subject experts who write critically of Wikipedia’s treatment of their particular subjects. But this isn’t the same as academics who’re supposed to be experts on the Internet, collaboration, and emerging trends and on other things that Wikipedia is arguably about, offering cogent negative analyses of Wikipedia.

So why is there such an overwhelmingly strong academic support for Wikipedia?

I suspect this comes down to the way academics write books. A book by an academic, whether written for a select academic audience or for a broader audience, typically has a “point” or a “theme”, and most of the themes of books that include Wikipedia are supportive of some of the ideas for which Wikipedia is a poster child. For instance, if Yochai Benkler wants to write a book arguing for the power of commons-based peer production (CBPP), or Axel Bruns wants to describe the power of produsage, the best way to use Wikipedia is as a publicly visible and easy-to-appreciate example of this. This creates a selection bias for the author to pick those aspects of Wikipedia that further the point and ignore the aspects that do not. To add the appearance of a fair and balanced treatment, things like the Seigenthaler episode can be thrown in.

On the other hand, criticism of Wikipedia doesn’t generally add up to any big point or theme that is exciting to write a book about. At least, not yet. Perhaps, ten years down the line, somebody may write a book on how corporations and organizations systemically exploit free labor to produce results, and once this kind of narrative starts gaining a foothold in academia, the standard Wikipedia tale will morph: instead of “Wikipedia is subject to vandalism; however, its openness to editing is its greatest strength”, it might become “although openness to editing is a strength, Wikipedia is subject to vandalism, edit wars, and a lot of unproductive disputes”. But in order for such a book to be written, the theme has to be big enough, or thick enough, to fill an academic book.

Another related factor is “academic herding” — the tendency of academics and intellectuals to herd together. Better wrong together than right alone, as the saying goes. We’ve often been told how financial herding (where different investors, brokers, and fund managers prefer to do the same things their peers are doing, for fear of standing out) precipitates market crises. I suspect that academics herd too. Currently, the herding tendency is towards singing the virtues of an uneasy amalgam of open source, free culture, user-generated content, participation, bottom-up, and a lot of buzzwords. I say “uneasy” because in principle, many of these are independent and a supporter of one may very well choose not to be a supporter of the other. In practice, they come in a bunch and self-appointed progressives like to bundle disparate things such as “the fight against restrictive copyright”, “enabling ordinary users to create content”, “the fight for opening up source code”, and the success of that specific thing called Wikipedia.

These things aren’t always bundled. Seth Finkelstein and Jason Scott, among many others, while ardent critics of Wikipedia, have been supporters of many of the causes that usually come bundled with it, such as open source and using Creative Commons licenses. Nonetheless, I suspect that the bundling effects, along with herding effects, could be pretty strong.

Journalism on Wikipedia

Journalists such as Tom Friedman may be excused for giving a shallow treatment of Wikipedia in the four pages he devoted to it in his long book The World Is Flat. Nonetheless, some of the mistakes that Friedman makes are echoed sadly too often. One of these is to spend too much time interviewing the “person in charge” or the “people at the helm”. I’m guessing that journalists typically need to do this to get up to scratch, but interviewing people at the helm can be tricky for something like Wikipedia where nobody really is in charge.

Sincere and hardworking journalists, including Pulitzer Prize winners such as Friedman and New Yorker writer Stacy Schiff, make painstaking efforts to interview people in charge of the encyclopedia. For this New Yorker piece, Stacy Schiff did an amazing amount of work interviewing people that the Wikimedia Foundation directed her to. Only, it later emerged that one of these people, whom Jimbo Wales vouched for, had faked his identity (this person was Essjay, who claimed to have several academic degrees but later turned out to be a college drop-out). That said, Schiff’s piece was probably among the best in a mainstream publication that I’ve seen, along with this New York Times Magazine piece, and some pieces in the Chronicle.

Less accomplished journalists (as well as some academics) make the usual gaffe of interviewing Jimbo Wales and then saying something like “Wales has the following plans for Wikipedia …”, as if Wales is responsible for the success of Wikipedia the same way a corporate entrepreneur is responsible for the success of his or her own enterprise. The following lines are only a slight exaggeration: “The sky is the limit for Wales. Having created the world’s biggest encyclopedia for free, Wales is now working on a free dictionary, free news, and a free resource for books and source text in the public domain.”

Conclusion

The implicit support that Wikipedia enjoys from academics and journalists will last for some time, despite the excellent efforts by some journalists and academics to go beyond the surface. To get a new academic perspective on Wikipedia, what is needed is a coherent theme or theoretical framework in which a negative assessment of Wikipedia can fit, and such a theme needs to overcome the herding and bundling tendencies seen in academia. To get a new journalistic perspective that is reflected in more than just a handful of thoroughly researched articles, we need enough prominent academics and other people who hold such a perspective to be in a position where they are likely to get interviewed by journalists looking to write an article on Wikipedia. Neither prospect seems immediately forthcoming.

March 7, 2009

More on Wikipedia criticism

Filed under: Wikipedia — vipulnaik @ 1:54 am

My previous blog post on Wikipedia criticism generated quite a few comments. This was partly because it got covered in this forum post at The Wikipedia Review. There were several points that I made during the post that some of the commentators disagreed with, so I’ll try to elaborate the rationale behind those points a bit here, as well as what might be new insights.

Sausage: eating and making

The perspective from which I’m analysing Wikipedia is primarily an end-user perspective. The central question I explored last time was, “Does criticism of Wikipedia ultimately affect whether people read Wikipedia articles?” My rough conclusion was that there is unlikely to be any direct effect. This is particularly true of criticism that is aimed at Wikipedia’s process, because end-users care more about the end product, rather than the process.

One opposing viewpoint to this is that criticism of Wikipedia may make people less comfortable with using Wikipedia, because it might change the perception of whether the Wikipedia entry is reliable, accurate, or unbiased enough to be used. For instance, if potential web users are aware that Wikipedia entries can be edited by anybody, they may rely less on Wikipedia. I think this is plausible, but my personal experience suggests that if, even after all the possible unreliability is taken into the balance, Wikipedia is still the easiest to use, people will go to Wikipedia.

One argument that I mentioned, and that some commentators also mentioned, was that criticism of Wikipedia may have an indirect effect by discouraging people from contributing. After all, knowing the bad conditions in a sausage factory may not be that much of a disincentive for eating sausage, as long as the sausage is good. But it may discourage people from joining the sausage factory. If new contributors fail to arrive to replace the old ones who leave, then, the argument goes, Wikipedia entries will decay to the point where they get so visibly bad that even the end-users start noticing a quality decline. The argument doesn’t claim that end users care about process, but it does claim that contributors care about process.

In the last blog post, I pointed out one problem with this argument. Namely, if the most conscientious editors are the ones who are put off editing the most by criticism, the people who’re left may be the ones who are most likely to have agendas to peddle. This may result in a decline in quality — but not an obvious or visible one. That’s because the information-peddlers who are still left after some people get put off the sausage factory may also be the people who are most skilled at masking disinformation as information.

Here, I’ll try to elaborate on this, as well as give my understanding of how Wikipedia entries actually evolve.

Improve over time?

The naive belief among wiki-utopians is that Wikipedia entries keep getting better with time. The improvement may not be completely monotone, but it is largely so, with the bulk of the edits in the positive direction, punctuated by occasional vandalistic edits and good-faith edits that go against Wikipedia policy. For instance, consider this piece by Aaron Krowne, written way back in March 2005, as a critical response to this article by Robert McHenry. Another article in the Free Software Magazine gloats about the rapid growth of Wikipedia.

Wikinomics, a book by Tapscott and Williams, says that [Wikipedia] is built on the premise that collaboration among users will improve content over time, and later continues on this theme:

Unlike a traditional hierarchical company where people work for managers and money, self-motivated volunteers like Elf are the reason why order prevails over chaos in what might otherwise be an impossibly messy editorial process. Wales calls it a Darwinian evolutionary process, where content improves as it goes through iterations of changes and edits. Each Wikipedia article has been edited an average of twenty times, and for newer entries that number is higher.

In his book The Long Tail, Chris Anderson writes:

What makes Wikipedia really extraordinary is that it improves over time, organically healing itself as if its huge and growing army of tenders were an immune system, ever vigilant and quick to respond to anything that threatens the organism. And like a biological system, it evolves, selecting for traits that help it stay one step ahead of the predators and pathogens in the ecosystem.

Not everybody believes that Wikipedia articles keep increasing in quality. In this philosophical essay, Larry Sanger (a co-founder of Wikipedia) makes the following interesting hypothesis: the quality of a Wikipedia entry does a random walk about the best possible value that the most difficult of the editors watching it can allow. This is merely a hypothesis, born out of Sanger’s experience, and Sanger makes no attempts to provide quantitative or even anecdotal verification of the hypothesis. Others have pointed out that many of the Wikipedia articles that receive featured article status (some of which even make it to the Wikipedia front page) later revert to being middling articles. Consider, for instance, this excerpt from an article by Jason Scott on the general failure of Wikipedia:

It is not hard, browsing over historical edits to majorly contended Wikipedia articles, to see the slow erosion of facts and ideas according to people trying to implement their own idea of neutrality or meaning on top of it. Even if the person who originally wrote a section was well-informed, any huckleberry can wander along and scrawl crayon on it. This does not eventually lead to an improved final entry.

My view: precarious equilibrium

There is exactly one Wikipedia article on every topic. Given Wikipedia’s dominance, this is the canonical source of information on the subject for hundreds of thousands (perhaps millions) of people. This canonicity is part of what makes Wikipedia so appealing to end-users, but it also means that even minor disagreements among potential editors of the article can become pretty significant when it comes to controlling that scarce and extremely valuable resource: the content of the Wikipedia article. After all, it feels like it pays off to put up a fight if fighting a little more can affect what thousands of people will learn about the topic.

A Wikipedia article on a controversial topic typically settles into a precarious equilibrium between different factions or interest groups that want to take the article in different directions (or prevent it from going in different directions). Consider, for instance, articles such as evolution, intelligent design, abortion, and alternative medicine topics ranging from well-known topics such as homeopathy to relatively lesser known topics such as Emotional Freedom Techniques.

The Emotional Freedom Techniques article is an example of such an equilibrium. There are roughly two camps. On one side are the proponents/believers, or people who, for other reasons, feel that the article should contain more details about the subject. On the other side are the skeptics/disbelievers, or people who otherwise feel that putting too much information on an unproven therapeutic approach may in fact amount to an endorsement by Wikipedia. At some point long past, the article was an editing hotbed (relative to its current status). It was much longer, with a lot of discussion on the talk page; sample, for instance, this revision dated 9 January, 2006.

For the next year, till around February 2007, the article remained in roughly the same state, with proponents adding in positive details and removing negative details and critics doing the opposite. On 30 January 2007, the article was nominated for deletion (here is an archived copy of the deletion discussion). A sequence of edits in the next three weeks gradually reduced the scope of the article to a considerably smaller one. The idea was that in order to “save” the article, it needed to be reduced in scope. The critics had managed to disturb the past equilibrium. By March 2007, a new equilibrium had been established, and modulo the addition and removal of a few references, this new equilibrium has been maintained for the past two years.

My experience suggests that for controversial topics, this is typical: there are two or more camps of editors, and depending on their relative strengths, the article enters a certain state around which it varies a bit but roughly remains the same. Once this equilibrium has been established, it is not easy to break. One way of breaking the equilibrium is using a “war of attrition”: keep making changes in your direction until the other person gets tired and walks away. Another one is to recruit other forces to help you, and a third approach is to threaten drastic measures, such as deletion.

Of course, the story of conflicting agendas plays out even in relatively non-controversial topics. Even editors who aren’t ideologically opposed to each other can find a lot of different things to quibble about, thus barring progress of a Wikipedia article. In the less controversial and more low-profile cases, it isn’t so much blood-curdling fights that create an impasse but simply a lack of common vision on how to take the article forward. Different editors come to Wikipedia with their own baggage and agendas, even simple agendas on how mathematics or physics should be written or what restaurants should be mentioned in the article on a local community. Typically, editors join feeling enthusiastic that they’ll be able to share their ideas and knowledge with the rest of the world, and also learn from how others are sharing. Once they realize that others holding opposing views are going to work in orthogonal or at times opposing directions, they either get put off, or they enter a wiki-war. In some cases, I think there is a clear situation of some people seeking to do constructive editing and others trying to obstruct. In most cases, it is a bunch of minor ideological mismatches that lead to people either getting put off Wikipedia or choosing to get aggressive to defend the articles.

Thus, there are roughly two kinds of articles: the controversial ones where there is a precarious equilibrium between different interest groups trying to pull it in their direction, and the relatively non-controversial ones where competing agendas and views on how things “should be” written lead, not so much to warring, as to a simple lack of activity. The former happens in cases where the stakes are more significant, and where people can feel good and hot about taking particular stands. The latter happens in more mundane things such as normal subgroup or T. Nagar where people simply couldn’t be bothered to fight.

The precarious equilibrium exists at levels higher than the level of the individual article. For instance, Wikipedia has a long history of a battle between inclusionists and deletionists. In the beginning, when the encyclopedia was small, deletionists hardly existed. As the encyclopedia became larger, deletionists started gaining the upper hand, as the need for keeping the encyclopedia free of garbage began to be appreciated. The deletionists had their heyday in 2005-2006, but inclusionists have started gaining ground again. In a recent Guardian column, Seth Finkelstein describes some of the battles and underlying agendas.

Stagnation is not the same as death

The precarious equilibrium for controversial topics and the relative stagnation for non-controversial topics may suggest that Wikipedia article quality could well go into decline, to the point where users start noticing. However, this is probably not going to happen, for many reasons. First, encyclopedia articles on non-controversial topics are often the kind of thing that does not get outdated anyway. For instance, the definition of a normal subgroup is unlikely to ever change. Similarly, basic definitions such as friction in physics and historical articles such as Aristotle aren’t likely to get outdated either.

For controversial topics, it may well happen that a precarious equilibrium may inhibit development, but then again, the whole problem is that nobody can define “development” of an article in clear terms.

But the bigger point, both for controversial and non-controversial topics, is that it is highly unlikely that Wikipedia article quality will actually decline significantly. So, a stagnation or stabilization in article quality can spell doom for Wikipedia only if some competing resource is trying to improve. Stagnation or stability equals death only in the presence of serious competition.

Okay, so can competition succeed?

The problem here is that the success of an encyclopedia effort requires a large amount of collation of people’s efforts, and this kind of collation doesn’t happen easily. What Wikipedia has managed to do is give enough people the impression that it can be a useful place to pool their efforts. I remain unconvinced that Wikipedia has actually managed to solve the many inherent problems of large-scale collaboration, but the very fact that it can convince a lot of people, even if for a short period of time for each person, to spend some time on Wikipedia, is impressive.

People often join Wikipedia sold on the idea of working together with others, sharing their ideas, and learning from others. But not too many people are really willing to learn or change their worldviews, and the one-article paradigm of Wikipedia really forces a lot of conflicts out into the open. Thus, after trying a bit to make their voice heard amidst the din, a number of people leave.

Of course, most people who leave are bound to think of themselves as in the “right” and stubborn other Wikipedians as in the wrong, which creates a temporary bonhomie between disgruntled ex-Wikipedians. Such temporary good feelings towards one’s fellow wronged might result in a new idealistic commitment to create something better than Wikipedia. But beneath that is still the fact that many of the disgruntled ex-Wikipedians have agendas that compete with and are incompatible with each other, and new efforts that seek to do better than Wikipedia haven’t yet found a way to overcome this problem. (Fundamentally, I don’t think there is any way to overcome these problems.) There are many examples of Wikipedia forks that have rapidly settled down into obscurity. Examples are Veropedia, run by ex-Wikipedian Danny Wool, and Cassiopedia, The True Encyclopedia (this already seems to have vanished and been replaced by some new wiki). Other encyclopedias that try to do Wikipedia right include, for instance, Conservapedia, intended as the replacement for Wikipedia for conservative Christians.

An example of a could-be-better-than-Wikipedia encyclopedia effort is Citizendium, founded by Wikipedia co-founder Larry Sanger. Sanger, and many of the others who work on Citizendium, seem bent on avoiding the many edit wars and other conflict situations that arise in Wikipedia. So far, so good — the Citizendium has survived and has been growing slowly for the last one and a half years. Yet, beyond the substantially greater civility and the substantially lesser activity, there is little to distinguish Citizendium from Wikipedia in terms of the competing agendas of its users. The main difference right now, as far as I can see, is that the Citizendium articles typically settle into an equilibrium of inactivity (which is similar to most Wikipedia articles on non-controversial topics) as opposed to a precarious equilibrium born of warring parties.

I personally do not think that there is room for another Wikipedia-like endeavor, at least in the near future. This does not mean that everything that seeks to do Wikipedia better will necessarily fail outright. It is probable that Citizendium will continue to grow over the next few years, and may at some stage become good enough as a general-purpose encyclopedia. Nonetheless, it is unlikely to become seriously competitive with Wikipedia in the near future.

The direction in which competition to Wikipedia could indeed be dangerous is the direction of an increased number of more specialist sites that help provide answers to people’s queries in somewhat more specialized topics. These specialized sites, of course, have their own conflict problems, but may be able to overcome these problems better simply because there is no single one of them. This allows people with competing agendas to work for competing specialist sites, rather than battle needlessly on the same turf. The best example of a somewhat specialized wiki-based site is Wikitravel, which is a great site for travel information. Since this is a more narrowly focused site, it has clearer policies that reduce conflict over the structuring of articles. There are many others at varying ends of the spectrum between general and special. For instance, there’s WikiHow, a wiki-based how-to manual, which also seems to be doing pretty well for itself. This sacrifices some of the canonicity of the Wikipedia entry by allowing different how-to articles to be written by different people. There are a lot of substantially more specialized and narrow efforts, ranging from the hastily conceived to the well-planned. My own efforts at a group theory wiki, followed by an effort to generalize this to subject wikis in general, are one small example.

Yes, but how can diversified competition succeed?

If, as I believe, a challenge to Wikipedia can be presented only through a large number of specialist sites that compete healthily with each other and with Wikipedia, we have a bit of a paradox. The paradox is that the very reason people go first to Wikipedia is so that they do not need to navigate through or remember a bunch of different sites — Wikipedia is useful as a one-word solution to the problem of finding information.

The paradox isn’t all that big once we remember that for each specialized topic, there’s likely to be only one, or a few, places to go to. And the other important point is that within each resource, locating the article or piece of information that’s needed is pretty fast. Imagine, for instance, that people surfing casually for medical information, instead of going to the Wikipedia entry, were guided towards a collection of competing medical information websites. At first sight, a random surfer may just pick one website, and look up information there. If the surfer found that information well-presented and useful, the surfer may continue to visit that specialized site for medical information of that sort. Another surfer with somewhat different tastes may not like that first pick and may try a different medical information site. Since the medical information sites need to compete for users, they strive to provide better information that answers users’ needs more effectively.

There are two differences with Wikipedia: first, different people use different competing sites. Second, each site restricts itself to something it can specialize in, so the choices a person makes with regard to medical information sites can be independent of the choices the person makes with regard to sports information sites or glamour/fashion sites.

The idea that diverse competition can succeed against Wikipedia is also possibly an affront to people who view the collaborative principles behind Wikipedia as morally superior to the cut-throat competition that characterizes much of the messy market. Somehow, competition seems to be inherently more destructive and wasteful than the “working together” that Wikipedia engenders. Be that as it may, I think that a more diverse range of knowledge offerings can actually help reduce wasted effort, as people spend less effort fighting each other and canceling each other’s efforts, and more effort building whatever things they believe in.

Further, there are ways to ensure that a spirit of competition coexists with a spirit of sharing of ideas and knowledge. Academic research and software development often follow extremely open sharing principles, and yet can be fiercely competitive. The key thing here is that since a lot of independent entities are separately pursuing visions and borrowing ideas from each other, there is little destructive warring for the control of a single scarce resource. A spirit of sharing and openness can be backed by open content licenses such as the Creative Commons licenses.

What about search engine and link dominance?

Back at the beginning of the 21st century, when Google was nascent and Wikipedia non-existent, there were a number of books highlighting possibly disturbing tendencies that might develop on the Internet. Among these was Republic.com by Cass R. Sunstein (also co-author of Nudge, and now a member of the Obama administration). Sunstein warned of the dangers of group polarization on the Internet, with extremely personalized surfing, linking only to similar sites. This, Sunstein argued, could potentially lead to two bad outcomes: the absence of a public space, where issues of general interest could be addressed, and the total non-exposure to opposing or different thoughts and ideas.

The problem today seems to be of a somewhat different nature: the presence of a few sites that dominate much of Internet surfing. Wikipedia is increasingly becoming a destination for information-seekers, both as a direct destination and as a destination via search engines and links. As I explained earlier, I believe that the canonicity of Wikipedia as an information source is what makes it so attractive to edit and control.

That is why some people have suggested that Google and other search engines that place Wikipedia highly are largely responsible for Wikipedia’s success. Some research has shown that about half of the visits to Wikipedia still come through search engines. This suggests a “solution” to the problem of Wikipedia dominance: demote Wikipedia in the search engines.

I don’t think such a solution will either work or make sense.

First, unlike Wikipedia, which faces no serious competition, search engines face tremendous competition. Google may be a market leader but it cannot afford to sit back and relax. Search engines also have a strong incentive to please their users. This means that if Google is placing Wikipedia high up in its entries, then it has a strong incentive to do so: that’s what its users want. Whether this placement is due to conscious tweaking by Google employees or simply an unforeseen consequence of Google’s PageRank algorithms is unclear, but if it were something that displeased users, Google would fix it.

I suspect that a lot of the traffic that comes to Wikipedia through search engines actually comes through algorithms of the sort: “Do a search and pick a Wikipedia entry if it shows up in the top five, otherwise pick whatever seems relevant.” Here, search engines are being used as a “Wikipedia+”: Wikipedia, plus the rest of the web if Wikipedia failed. If the search engine failed to turn up Wikipedia in the top five, and the user later found a Wikipedia entry on the topic that he or she felt should have been up there, the user may start bypassing the search engine and go directly to Wikipedia.
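The “Wikipedia+” habit described above can be written out as a minimal Python sketch. The function names and the sample URLs are hypothetical, and “whatever seems relevant” is modeled, as a simplifying assumption, by simply taking the top-ranked result.

```python
from urllib.parse import urlparse


def is_wikipedia(url: str) -> bool:
    """True if the URL points at wikipedia.org or one of its language subdomains."""
    host = urlparse(url).hostname or ""
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")


def pick_result(ranked_urls, top_n=5):
    """Return the link a 'Wikipedia+' reader would follow, or None if there are no results.

    ranked_urls is a list of result URLs in the order the search engine shows them.
    """
    # Prefer a Wikipedia entry if one appears among the first top_n results.
    for url in ranked_urls[:top_n]:
        if is_wikipedia(url):
            return url
    # Assumption: "whatever seems relevant" is approximated by the top result.
    return ranked_urls[0] if ranked_urls else None


# Hypothetical search results, for illustration only.
results = [
    "https://example.com/groups",
    "https://en.wikipedia.org/wiki/Normal_subgroup",
    "https://example.org/algebra",
]
print(pick_result(results))
```

If the Wikipedia entry is pushed below the first five results, this reader falls back to the rest of the web, which is exactly the case where he or she might start going to Wikipedia directly instead.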

Second, even if search engines stop favoring Wikipedia, the default-to-Wikipedia rule runs through many things other than search engines. Many smartphone applications and other Internet-connected devices such as Amazon’s Kindle give Wikipedia privileged status: for instance, the Kindle enables special access to Wikipedia for no extra charge. Some smartphone lookup services utilize Wikipedia articles as a knowledge base.

But even if all these big guns decided to disfavor Wikipedia, the link dominance of Wikipedia is too widespread: there are any number of blogs that provide links to Wikipedia articles on topics, many of them probably based not so much on the particular qualities of the Wikipedia entry as on the brand name.

So yes, I do believe that if all linkers, commentators, search engines, and smartphone applications suddenly revolted against Wikipedia, people, finding a lot fewer links to Wikipedia, may start forgetting Wikipedia, or at any rate, make it less of a default. I just don’t see any reason for such a collective epiphany to occur.

February 23, 2009

Wikipedia criticism, and why it fails to matter

Filed under: Wikipedia — vipulnaik @ 10:50 pm

Over the past few months, I’ve been collecting newspaper and magazine articles about the phenomenon of Wikipedia. (I’ve myself written two blog posts on Wikipedia here and here). Prominent among the Wikipedia critics is Seth Finkelstein, a consulting programmer who does technology journalism on the side and publishes columns in the Guardian. Seth’s criticism is largely related to the politics of getting people to work for free. The Register has published many news and analysis articles critical of Wikipedia, such as this, this, this, and many others. The Register points out the many flaws in Wikipedia’s editing system, and has been critical of what it terms the cult of Wikipedia.

A critic who takes a somewhat different and perhaps more holistic view is Jason Scott, famous for running TEXTFILES.COM. Jason Scott has written many critical pieces on Wikipedia, such as this and this. He’s given three famous speeches about Wikipedia: The Great failure of Wikipedia (transcript), Mythapedia, and Brickipedia. Scott, who gave Wikipedia a try for some time and has experience with the MediaWiki software, says that Wikipedia employs “child labor” and compares it to a casino. Scott also hits on a powerful point: that it is precisely the canonicity and first-go reference nature of Wikipedia combined with the speed at which edits become visible that forms the “crack” for people to edit the site (a point he explores in depth in his Mythapedia speech).

A somewhat more distanced critic of Wikipedia is Nicholas Carr. Carr occasionally talks about Wikipedia on his blog, and his entries on Wikipedia are rarely full of undiluted optimism and admiration. For instance, his blog post on the centripetal web talks about how the Web, instead of becoming decentralized, is becoming systematically more concentrated towards fewer sites — the prime example of such a site being Wikipedia. In a later blog post titled All hail the information triumvirate!, Carr talks about how the Web, Google, and Wikipedia have come to acquire a fairly dominant position in many people’s daily life and work.

And then there’s Wikipedia’s co-founder, Larry Sanger, who left the project in 2002 after spearheading it for a little over a year. In October 2007, Sanger started a new encyclopedia project called Citizendium, The People’s Compendium, that recently crossed 10,000 articles. Sanger, who did a doctorate in philosophy, has been watching and writing about Wikipedia, and he recently came out with this philosophical paper in the Episteme journal. The paper uses what appears to be epistemological reasoning, at least part of which boils down to the idea that since experts are needed to judge the accuracy of Wikipedia, Wikipedia hasn’t managed to get rid of expertise. (Brock Read at the Chronicle wrote a short piece mentioning Sanger’s paper).

Of course, this hardly completes the list of Wikipedia critics. There’s Wikipedia Watch, started by Daniel Brandt, a confirmed Wikipedia critic. There’s the Wikipedia Review. There’s Robert McHenry, former Britannica editor-in-chief, who has written pieces critical of Wikipedia such as this and this. And there’s the self-described anti-Web-2.0 polemicist Andrew Keen, author of The Cult of the Amateur. One of the Wikipedia-critical pieces that often gets quoted is Digital Maoism: The hazards of the new online collectivism by Jaron Lanier.

Does the criticism matter?

Does criticism of Wikipedia serve any purpose (constructive or destructive) other than being an excuse to fill journal columns and blog space (I might note that the critical articles I wrote about Wikipedia have driven the most traffic to my blog)? It is hard to say. I want to argue here that, at the very least, it does not serve the obvious purpose of keeping potential readers away from Wikipedia.

My reason here is simple: the costs, in terms of time, money, and effort, of accessing and using Wikipedia are so low that any kind of cost-benefit analysis is too much of a stretch to be done credibly. Secondly, the hidden costs of using Wikipedia are rarely borne by the user himself or herself, or are borne only with an extremely low probability. Even if I were to believe the point made by Jason Scott and Nick Carr that the growing monopoly of Wikipedia in terms of information is not a good thing and our own laziness is what gives Wikipedia that power, such a belief is rarely enough to stop me from going and looking up the Wikipedia entry anyway.

I don’t have access to Wikipedia’s usage logs, but approximate statistics are provided by Wikigeist and various other Internet usage measurement services. At the time of writing, Wikigeist claims that the Wikipedia main page was viewed more than 200,000 times in the past hour, while the hundredth entry, one on Google Earth, was viewed 475 times. Various estimates of Wikipedia usage put the number of daily pageviews in the hundreds of millions, and some studies have indicated that for a given topic with both a Britannica entry and a Wikipedia entry, the Wikipedia entry is consulted 200 times more often. The stats.grok.se service reveals how many times a particular article was viewed; the Wikipedia article on Barack Obama, for instance, was viewed four million times in January 2009, while the Wikipedia article on “normal subgroup” (a mathematical term) was viewed 2962 times in January 2009 (for some contrast, the groupprops article on normal subgroup has been viewed fewer than 1000 times in the past year).

More telling than the sheer number of pageviews, though, is the increasing extent to which I find people not even bothering to remember information, knowing that they can “find it on Wikipedia.” Here are some anecdotal examples: in many recent discussions, a friend took out an iPhone to consult a Wikipedia entry to check a point; in a discussion where a friend told me about a certain kind of mollusk that eats its own brain, he told me that for reference I could Google it and follow the link to Wikipedia; even mathematical talks seem to have parts that say, “Wikipedia defines … as …”, despite the admittedly poor treatment of mathematics in Wikipedia. Again, I think this is largely because it is so easy and quick to use Wikipedia that its many obvious disadvantages pale in comparison to the speed and ease of use.

And yet the criticism may help

The criticism of Wikipedia does little to deter potential users from using it for quick reference. By and large, it does little to change Wikipedia’s policies either, in so far as what the critics are critical of is not something that some single entity at Wikipedia can change. However, such criticism can go some way in dampening the enthusiasm of people who edit Wikipedia, and in preventing people from citing Wikipedia.

For instance, Middlebury College forbade students from citing Wikipedia for history articles. This measure was severely criticized in the blogosphere, and adjectives such as “Luddite” were used to describe it. Others have argued that Wikipedia is a good “starting point” for research but people should follow through and cite the original sources. I personally think this is good policy. Actual citation of Wikipedia articles, in so far as it does occur, should follow robust citation conventions using stable versions (i.e., a link to the version of the article at the time the citation was made should be provided, rather than simply a link to the latest version of the article, which could be significantly different from the version at the time). Since citation policy is, in general, decided by fewer people, and since it involves work that generally takes more time (writing papers), I suspect that this is indeed achievable. (Wikipedia itself has various pages, such as this one, that describe how to research with Wikipedia).
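As a concrete illustration of linking to a stable version, here is a minimal Python sketch. It relies on the fact that MediaWiki serves a specific revision of a page when the URL carries an oldid parameter; the function name and the revision number used below are hypothetical.

```python
from urllib.parse import urlencode


def stable_link(title: str, revision_id: int, lang: str = "en") -> str:
    """Build a link to one fixed revision of a Wikipedia article.

    revision_id is the number that appears as 'oldid' in the address of the
    article's permanent link; the value used below is made up.
    """
    query = urlencode({"title": title.replace(" ", "_"), "oldid": revision_id})
    return f"https://{lang}.wikipedia.org/w/index.php?{query}"


# Hypothetical revision number, for illustration only.
print(stable_link("Normal subgroup", 123456))
```

A citation built this way keeps pointing at the text the author actually read, even after the live article has been edited.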

Hyperlinking from blogs could follow a similar policy. Tim Bray’s post on linking describes the dilemma of linking to Wikipedia versus linking to the original source, as well as his own way of handling the dilemma. Since the number of people who write blogs (well, at least blogs that get read) is considerably smaller than the number of people who use the Internet for reference, there is again a possibility that writings critical of Wikipedia can influence the behavior of bloggers. One concrete step in this direction would be for people linking to Wikipedia articles to do so only after reading the article, and to indicate whether the link is due to a specific point made in the article, or just serves as background reference. If the link is to a specific point in the article, linking to a stable version might be desirable.

The other way writings critical of Wikipedia could influence Wikipedia is in terms of the influence they exert on people wondering whether to devote time and effort to Wikipedia. In general, people put in effort on a volunteer project only if the benefits to them exceed the cost, and if writings critical of Wikipedia make people better aware of some of the costs and benefits, it could help them make more informed decisions. Unfortunately, it is unclear whether this impact will be positive or negative on the whole. The problem here is similar to a problem highlighted by research of Michael Kremer and made popular in an article by Steven Landsburg: the more we make careful people refrain from a potentially dangerous activity, the more control the careless people get over it (the argument was originally made in the context of sex and AIDS). In this case, the more that careful and conscientious editors are put off Wikipedia, the more it’ll happen that the careless, sloppy, or partisan editors will take the reins. That’s because the importance of Wikipedia as a reference is so great that there’ll always be people lining up to edit it.

That this possibility is not merely hypothetical follows from the fact that many companies and high-profile individuals actually expend considerable resources maintaining the quality of entries on themselves, while subject-matter experts in an area work hard to check the entries in the subject. For instance, in the Chronicle piece Can Wikipedia make the grade?, Brock Read says:

But as the encyclopedia’s popularity continues to grow, some
professors are calling on scholars to contribute articles to
Wikipedia, or at least to hone less-than-inspiring entries in the
site’s vast and growing collection. Those scholars’ take is simple: If
you can’t beat the Wikipedians, join ‘em.

This leads to the interesting possibility that writings critical of Wikipedia may well have a negative effect in the following sense: people who might well be the most careful and conscientious editors are also the ones most likely to get put off editing Wikipedia by the arguments, and other editors get more leeway. As a result, the quality deteriorates somewhat, but the deterioration in quality is so small relative to the overall ease of use of Wikipedia that people still continue to use it and link to it: they just get more biased articles, less accurate facts, and slightly more instances of vandalism. Of course, this bad outcome depends on the assumption that the people likely to be put off Wikipedia are the ones who may have become its best editors.

Fickle loyalties

Despite my contention that criticism of Wikipedia does little to alter how much people read it, I doubt that too many people are loyal to Wikipedia. People’s loyalty to Wikipedia usually boils down to this mental algorithm: “Go to Google, type the term, search. If a Wikipedia entry shows up, follow it, otherwise, follow whatever else looks relevant.” Estimates suggest that between 50% and 70% of Wikipedia’s traffic is driven by search engines. This suggests that if search engines start devaluing Wikipedia content, the default mental algorithm that many people have will have to be revised: either the search engine or Wikipedia will suffer.

More importantly, what drives people to Wikipedia is, on the whole, a certain kind of brand recognition — a comfort that since this is Wikipedia, and they’ve been here before, they’ll be able to get the information they need with ease. But brand recognition alone can survive only in the absence of competing brands. If people find a single, consistent source that comes up along with Wikipedia among the top few entries, they are likely to give that source a try, at least after they start recognizing it.

In conclusion, I believe that criticism of Wikipedia can help in limited ways: it can make people more careful when citing and linking, and it can be informative to people before they get started on the job of editing Wikipedia (though this, as I pointed out, can be a two-edged sword). But a serious decrease or diversion of usage (and consequently, of editing effort) from Wikipedia can happen only in the presence of a competing resource that offers at least similar levels of ubiquity, ease of use and quick reference, and probably visibility in search engines.

CORRECTION: As Jon Awbrey noted in the comments, Wikipedia Review was not started by Daniel Brandt. The contents of the blog post have been changed to reflect the correction.

February 2, 2009

On new modes of mathematical collaboration

(This blog post builds upon some of the observations I made in an earlier blog post on Google, Wikipedia and the blogosphere, but unlike that post, has a more substantive part dedicated to analysis. It also builds on the previous post, Can the Internet destroy the University?.)

I recently came across Michael Nielsen’s website. Michael Nielsen was a quantum computation researcher — he’s the co-author of Quantum computation and quantum information (ISBN 978-0521632355). Now, Nielsen is working on a book called The Future of Science, which discusses how online collaboration is changing the way scientists solve problems. Here’s Nielsen’s blog post describing the main themes of the book.

Journals — boon to bane?

Here is a quick simplification of Nielsen’s account. In the 17th century, inventors such as Newton and Galileo did not publish their discoveries immediately. Rather, they sent anagrams of these discoveries to friends, and continued to work on their discoveries in secret. Their main fear was that if they widely circulated their idea, other scientists would steal the idea and take full credit for it. By keeping the idea secret, they could develop it further and release it in a more ripe form. In the meantime, the anagram could be used to prove precedence in case somebody else also came up with the idea.

Nielsen argues that the introduction of journals, combined with public funding of science and the recognition of journal publications as a measure of academic achievement, led scientists to publish their work and thus divulge it to the world. However, today, journal publishing competes with an even more vigorous and instantaneous form of sharing: the kind of sharing done in blogs, wikis, and online forums. Nielsen argues that this kind of spontaneous sharing of rough drafts of ideas, of small details that may add up to something big, opens up new possibilities for collaboration.

In this respect, the use of online tools allows for a “scaling up” of the kind of intense, small-scale collaboration that formerly occurred only in face-to-face contact between trusted friends or close colleagues. However, Nielsen argues that academics, eager to get published in reputable journals, may be reluctant to use online forums to ask and answer questions of distant strangers. Two factors are at play here: first, the system of academic credit and tenure does little to reward online activity as opposed to publishing in journals. Second, scientists may fear that other scientists can get a whiff of their idea and beat them in the race to publish.

(Nielsen develops “scaling up” more in his blog post, Doing Science Online).

Nielsen says that this is inefficient. Economists do not like deadweight losses (Market wiki entry, Wikipedia entry) in markets: situations where one person has something to sell to another, and the other person is willing to pay the price, but the deal doesn’t occur. Nielsen says that such deadweight losses occur routinely in academic research. Somebody has a question, and somebody else has an answer. But due to the high search cost (Market wiki entry, English Wikipedia entry), i.e., the cost of finding the right person with the answer, the first person never gets the answer, or has to struggle a lot. This means a lot of time lost.

Online tools can offer a solution to the technical problem of information-seekers meeting information-providers. The problem, though, isn’t just one of technology. It is also a problem of trust. In the absence of enforceable contracts or a system where the people exchanging information can feel secure about not being “cheated” (in this case, by having their ideas stolen), people may hesitate to ask questions to the wider world. Nielsen’s suggestions include developing robust mechanisms to measure and reward online contribution.

Blogging for mathies?

Some prominent mathematical bloggers that I’ve come across: Terence Tao (Fields Medalist and co-prover of the Green-Tao theorem), Richard E. Borcherds (famous for his work on Moonshine), and Timothy Gowers. Tao’s blog is a mixed pot of lecture notes, updates on papers uploaded to the arXiv, and his thoughts on things like the Poincaré conjecture and the Navier-Stokes equations. In fact, in his post on doing science online, Nielsen uses the example of a blog post by Tao explaining the hardness of the Navier-Stokes equations. In Nielsen’s words:

The post is filled to the brim with clever perspective, insightful observations, ideas, and so on. It’s like having a chat with a top-notch mathematician, who has thought deeply about the Navier-Stokes problem, and who is willingly sharing their best thinking with you.

Following the post, there are 89 comments. Many of the comments are from well-known professional mathematicians, people like Greg Kuperberg, Nets Katz, and Gil Kalai. They bat the ideas in Tao’s post backwards and forwards, throwing in new insights and ideas of their own. It spawned posts on other mathematical blogs, where the conversation continued.

Tao and others, notably Gowers, also often throw around ideas about how to make mathematical research more collaborative. In fact, I discovered Michael Nielsen through a post by Timothy Gowers, Is massively collaborated mathematics possible?, which mentions Nielsen’s post on doing science online. (Nielsen later critiqued Gowers’ post.) Gowers considers alternatives such as a blog, a wiki, and an online forum, and concludes that an online forum best serves the purpose of working collaboratively on mid-range problems: problems that aren’t too easy and aren’t too hard.

My fundamental disagreements

A careful analysis of Nielsen’s thesis will take more time, but off-the-cuff, I have at least a few points of disagreement about the perspective from which Nielsen and Gowers are looking at the issue. Of course, my difference in perspective stems from my different (and altogether considerably more limited) experience compared to theirs.

I fully agree with Nielsen’s economic analysis with regard to research and collaboration: information-seekers and information-providers not being able to get in contact often leads to squandered opportunities. I’ve expressed similar sentiments myself in previous posts, though not as crisply as Nielsen.

My disagreement is with the emphasis on “community” and “activity”. Community and activity could be very important to researchers, but in my view they can obscure the deeper goal of growing knowledge. And it seems that in the absence of strong clusters, community and activity can result in a system that is almost as inefficient.

In the early days of the Internet, mailing lists were a big thing (they continue to be a big thing, but their relative significance in the Internet has probably declined). In those days, the Usenet mailing lists and bulletin board systems often used to be clogged with the same set of questions, asked repeatedly by different newbies. The old hands, who usually took care of answering the questions, got tired of this repetition of the same old questions. Thus was born the “Usenet FAQ”. With this FAQ, the mailing lists stopped getting clogged with the same old questions and people could devote attention to more challenging issues.

Forums (such as Mathlinks, which uses PHPBB) are a little more advanced than mailing lists in terms of the ability to browse by topic. However, they are still fundamentally a collection of questions and answers posted by random people, with no overall organizing framework that aids exploration and learning. In a situation where the alternative to a forum is no knowledge, a forum is a good place. In fact, a forum can be one input among many for building a systematic base of knowledge. But when a forum is built instead of a systematic body of knowledge, the result could be a lot of duplication and inefficiency and the absence of a bigger picture.

Systematic versus creative? And the irony of Wikipedia

Systematic to some people means “top-down”, and top-down carries negative connotations for many; or at any rate, non-positive connotation. For instance, the open source movement, which includes Linux and plenty of “free software”, prides itself on being largely a bottom-up movement, with uncoordinated people working of their own volition to contribute small pieces of code to a large project. Top-down direction could not have achieved this. In economic jargon, when each person is left to make his or her own choices, the outcome is invariably more efficient, because people have more “private information” about their interests and strengths. (Nielsen uses open source as an example for where science might go by being more open in many of his posts, for instance, this one on connecting scientists to scientists).

But when I say systematic, I don’t necessarily mean top-down. Rather, I mean that the system should be such that people know where their contributions can go. The idea is to minimize the loss that may happen because one person contributes something at one place, but the other person doesn’t look for it there. This is very important, particularly in a large project. A forum to solve mathematical questions has an advantage over offline communication: the content is available for all to see. But this advantage is truly meaningful only if everybody who is interested can locate the question easily.

Systematic organization does not always mean less of a sense of community and activity, but this is usually the case. When material is organized through internal and logical considerations, considerations of chronological sequence and community dynamics take a backseat. The ultimate irony is that Wikipedia, which is often touted as the pinnacle of Web 2.0 achievement, seems to prove exactly the opposite: the baldness, anti-contextuality, canonical naming, and lack of a “time” element in Wikipedia’s entries are arguably its greatest strengths.

Through choices of canonical naming (the name of an article is precisely its topic), extensive modularization (a large number of individual units, namely the separate articles), a neutral, impersonal, no-credit-to-author-on-the-article style, and extensive strong internal linking, Wikipedia has managed to become an easy reference for all. If I want to read the entry on a topic, I know exactly where to look on Wikipedia. If I want to edit it, I know exactly what entry to edit, and I’m guaranteed that all future people reading the Wikipedia entry looking for that information will benefit from my changes. In this respect, the Wikipedia process is extraordinarily efficient. (It is inefficient in many other ways, namely, the difficulty of quality control, measured by the massive amount of volunteer hours spent combating obvious and non-obvious spam, as well as the tremendous amount of time spent in inordinate battle over control and editing of particular entries).

The power of the Internet is its perennial and reliable availability (for people with reliable access to electricity, machines, and Internet connections). And Wikipedia, through the ease with which one can pinpoint and locate entries, and the efficiency with which it funnels the efforts both of readers and contributors to edit a specific entry, builds on that power. And I suspect that, for a lot of us, a lot of the time we’re using the Internet, we aren’t seeking exciting activity, a sense of community, or personal solidarity. We want something specific, quickly. Systematic organization and good design and architecture that gets us there fast is what we need.

What can online resources offer?

A blog creates a sense of activity, of time flowing, of comments ordered chronologically, of a “conversation”. This is valuable. At the same time, a systematic organized resource, that organizes material not based on a timeline of discovery but rather based on intrinsic characteristics of the underlying knowledge, is usually better for quick lookup and user-directed discovery (where the user is in charge of things).

It seems to me that the number of successful “activity-based online resources” will continue to remain small. There will be few quality blogs that attract high-quality comments, because the effort and investment that goes into writing a good blog entry is high. There may be many mid-ranging blogs offering random insights, but these will offer little of the daily adventure feeling from a high-traffic, high-comment blog.

On the other hand, the market for quick “pinpoint references” — the kind of resources that you can use to quickly look something up — seems huge. A pinpoint reference differs from a forum in this obvious way. In a forum you ask a question and wait for an answer, or you browse through previously asked questions. In a pinpoint reference, you decide you want to know about a topic, and go to the page, and BANG, the answer’s already there, along with a lot of stuff you might have thought of asking but never got around to, all neatly organized and explorable.

Fortunately or unfortunately, the notion of “community” and “activity” is more appealing in a naive, human sense than the notion of pinpoint references. “Chatting with a friend” has more charm to it than having electricity. But my experience with the way people actually work seems to suggest that people value self-centered, self-directed exploration quite a bit, and may be willing to sacrifice a sense of solidarity or “being with others in a conversation” for the sake of more such exploration. Pinpoint resources offer exactly that kind of a self-directed model to users.

My experiment in this direction: subject wikis

I started a group theory wiki in December 2006, and have since extended it to a general subject wikis website. The idea is to have a central source, the subject wikis reference guide, from where one can search for terms and get short general definitions, with links to more detailed entries in individual subject wikis. See, for instance, the entry on “normal”.

I’ve also recently started a blog for the subject wikis website, that will describe some of the strategies and approaches and choices involved in the subject wikis.

It’s not clear to me how this experiment will proceed. At the very least, my work on the group theory wiki is helping me with my research, while my work on the other wikis (which has been very little in comparison) has helped me consolidate the standard knowledge I have in these subjects along with other tidbits of knowledge or thoughts I’ve entertained. Usage statistics seem to indicate that many people are visiting and finding useful the entries on the group theory subject wiki, and there are a few visitors to each of the other subject wikis as well. What isn’t clear to me is whether this can scale to a robust reference where many people contribute and many people come to learn and explore.

January 3, 2008

Wikis – logic and magic

Before I had set out for the University of Chicago to pursue doctoral studies, one of my favourite intra-math hobbies was the development of my Group Theory Wiki, a place where I was aggregating and assorting ideas, facts and definitions in group theory, starting from the most basic. The development and organization was based on a property-theoretic paradigm that I had come up with long ago. As often happens with me, the wiki concept and its execution seemed just too good to be true, and the time I devoted to the group theory wiki was often not so much to learn the subject as to admire the magic of a wiki-style organization in aiding and abetting my sneaky thoughts in the subject.

The frustrating thing with magic is when I’m the only one to see it; it almost feels like I’m in my deluded world. My goal was to make the group theory wiki reach a stage where the magic could be seen by everybody. However, since I was almost single-handedly developing the wiki, and the vision of the wiki was too precarious to expose to large-scale public scrutiny, I was far away from that stage when leaving for Chicago. Of course, group theory is not a topic that unites the world’s masses, so there was in any case not much of a potential audience. But I wanted the tool to reach the extent that it serves well whatever audience it has.

When I left for the University of Chicago I realized that the wiki would have to be shelved for some time; large-scale structural and restructuring work on the wiki would not be possible with the course load at Chicago (I knew this, despite the fact that I significantly under-estimated the course load at Chicago). Thus, I decided to set aside the group theory wiki for good, and come back to it when in a position of greater strength.

Some time around May-June of last year, I had also started wikis on the same model as my group theory wiki, for subjects like differential geometry, topology, and commutative algebra. Most of these wikis had languished behind the group theory wiki, and I didn’t know whether, or when, I would pick them up again.

It was somewhere in the middle of October, when I was feeling particularly overwhelmed with my coursework in Chicago, that I decided on a somewhat novel addition to my approach for studying algebraic topology: I would contribute some articles to my topology wiki as I kept understanding the course material in algebraic topology, and I would keep improving the structure and organization to reflect my improved grasp of the relationships within algebraic topology. This was hard, because my understanding was very piecemeal, and it was intimidating to try writing a wiki page. The comforting thing was that nobody else was taking a look at these pages, so I could develop and move them around as I wished. Although writing on the wiki was only one of many tools that I used to help get to grips with algebraic topology (more significant ones being attending lectures, solving assignments, and discussing with fellow students) it was a tool that left the biggest imprint — the content I had put on the wiki continued to be just as accessible a month later, and I was quickly able to mould it and improve it over time. I even took a day off to review basic concepts from point-set topology, which led to further embellishments to the wiki. The Topology wiki now contains a reasonable amount of nontrivial matter, and it has reached a stage where it is self-organizing; where I am getting as much as, or more than, I give. It is, of course, still a long way from the point where it could be of use to all the people whose work involves or relates to topology.

My goal is that when somebody reads the page on normal subgroup or characteristic subgroup or Hausdorff space, something happens over and above just that person’s reading and understanding the definitions of the terms. A number of tidbits, that the person may have heard in class as random theorems or manipulations, suddenly start clicking. “Oh, that’s what’s happening!” is the exclamation that people should routinely make on reading the articles.

My hope and goal is that when a student struggling to solve an apparently unmotivated assignment problem tunes in to the relevant wiki, he/she not only immediately gets the solution (which is the primary requirement) through an effortless search, neatly presented, but also learns of all the secret things that were hidden behind the problem: the motivation and related ideas that the instructor had cleverly concealed (or tantalizingly revealed), the relation with other problems and ideas. For instance, a student who wants to check out the proof that an intersection of normal subgroups is normal or that a characteristic subgroup of a normal subgroup is normal should be rewarded with more than just a proof of the fact; the way that fact integrates with the rest of the subject should also be relevant.

As a person explores around the wiki, he or she should naturally develop a nose for what’s going on; every article should inspire questions like “okay, what about this variant?” Survey articles like varying Hausdorffness or ubiquity of normality can give the person a feel for the different perspectives and aspects from which a single notion can be scrutinized, as well as provide pathways to enter from the very basic definitions to a lot of advanced ideas in related subjects.

There is no great technological superiority being employed in the wikis at this stage; the main good feature is the ease with which linking, structuring and organization can be done. I personally feel that a lot of us spend a lot of time writing and reading books, solving homework problems, writing them up, writing exams, writing and looking for research papers. All these are valuable actions, but a lot of it, apart from the immediate value it gives, tends to get forgotten. Homework solutions written on paper find their way either to the trash can or to dusty drawers; homework solutions written up on machines have greater longevity, but not the same ease of access as a quick-to-search wiki page. The magic of wikis can lie in reducing recall and access time to the point where there is nothing to put the brake on a never-ending stream of ideas.

This may sound like an exaggeration of the role that wikis could play, or it may seem an infeasible model, which is why I want to prove, through action rather than words, the power of the wiki model. I will soon start working on developing further the commutative algebra wiki which has been languishing for some time. Efforts on the group theory and topology wikis will continue unabated, and I might soon pick up the differential geometry wiki as well.

It’s only a matter of time for the magic to unfold!

March 10, 2007

Wikis 2

It’s been over a month since I put up my last post on wikis, and at that time, I had just started on my group properties wiki. Much content has flowed into the wiki since then, and with the passage of time, I have become more and more convinced that wikis could be a powerful aid to learning and sharing knowledge of mathematics — perhaps even a tool for the active creation of mathematics.

When I first started out in CMI, I was full of a lot of curiosity regarding groups, and I would try to satisfy this curiosity in different ways. I read the Wikipedia articles on the subject, I would surf arbitrarily for terms related to group theory, locate some papers and try reading a few pages and gathering a few definitions from the papers. This was all really exciting and I got a number of ideas and organizational principles on group theory.

However, my ideas were a mesh of complicated stuff. I had this idea here, that idea there. I had some frameworks for my ideas, and I tried writing up the ideas here and there, but it was way too much effort — developing an idea in a full article required a whole lot of effort, and further, there was so much constant changing and updating that was needed that it tended to become a pain.

As a result, many of the ideas and organizational principles that I had remained undocumented. Even the ones that did get documented were usually hard to retrieve/read from the documentation because they were split across multiple files, some of their definitions got duplicated, some lemmas got re-referenced and kept changing in their wording, and so on. So it was a mess. I didn’t know of any way in which I could put the entire web of thoughts I had, somewhere down in writing.

Given the lack of a medium, I shelved aside my work on this front, and tried working on other fronts. In most cases, though, whether I was working with original ideas or trying to grasp existing ones, the structure of the ideas was highly interwoven and nonlinear and articles/write-ups didn’t seem to capture the understanding properly. Still, I used these to whatever extent possible so that at least whatever was covered in the lectures, got properly documented.

Some time in June 2006, I started on an ambitious project of putting up important group theory definitions on Wikipedia. I in fact did put up a whole lot of new articles; however, I soon realized that much of my effort was poorly directed and poorly organized, and moreover, that a lot of it may be undone because there were many other people who felt the articles and material should be organized differently.

It was towards the end of 2006 that I seriously started considering the possibility of starting my own wiki in group theory, and by early December 2006, I had gotten started.

Now for the experience with making the wiki so far.

My initial goal was to use the wiki as a Pensieve (one with easy retrieval) for all the various ideas and facts that were literally taking up too much space in my head. Thus, to begin with, I just kept introducing/writing random articles in nearly random formats. However, as I went along, a certain format/pattern started emerging (in fact I had been subconsciously following this pattern earlier). I also realized that this pattern could be significantly improved upon and thus I designed some generic layouts and formats for various kinds of articles. (I have documented some of these at Groupprops:Article.)

Initially, I would manually insert the categories for each article. But then I realized that a more efficient and nicer way would be to create a “template” for the article type (refer to Groupprops:Templates). The advantage of this is that apart from including the article in a category, it could also print a nice message on screen telling us something about the term (which causes it to be included in the relevant category).

I have also tried to keep a few of the articles always in tune with the latest style. For instance, I have tried to make sure that the articles on normal subgroup, characteristic subgroup, and some others, have the latest format so that they can be used as reference points.

During the month of December and part of January, I was focussing on putting in definitions. Probably that’s because definitions are the thing that fascinates me the most. In January, I started experimenting with articles describing facts and their proofs — I prepared a format for these and started churning out proofs of important statements.

Currently, I am trying to work on improving the depth of the wiki in various themes, for instance, the Classification of finite simple groups, the Extensible Automorphisms problem, combinatorial/geometric group theory, and linear representation theory. I am also working on writing more survey articles and expository articles (like the ones on conjugacy class-representation duality, varying on the subgroup property of normality, and varying on the subgroup property of simplicity). These help tie in a lot of definitions and proofs, thus leading the wiki to offer more value than just an organized set of definitions and proofs.

Somehow, though I have put up a lot of content on the wiki and am rapidly adding more, I am still not clear about where this wiki will finally be headed. For instance, I am not clear about the question: do I currently want other people to join in the collaborative effort, or would I prefer to make the wiki fairly robust before I get started on that? Also, do I want other people to come and read it — if so, how and where should I publicize the wiki?

I’ve already circulated the link among a few of my friends here in CMI and a few people outside, but so far I don’t think it has picked up.

(On the other hand, how much can a wiki in group theory pick up?)

Another question I want to answer is: how much effort am I willing to sustainedly put into the group theory wiki? As in, will I continue working on it once I join Graduate School? If the wiki does indeed grow bigger with more people participating and getting involved, am I willing to take on the additional responsibility of coordinating and maintaining it? What end will that serve?

In answer to that, I think I can use the wiki very effectively to document and clarify my ideas in group theory, if only to myself. Hence if I choose group theory as my dissertation subject, the wiki should be very useful in helping me formulate and refine my ideas.

I’ve also been thinking beyond the group theory wiki, towards a general culture of having a wiki per topic (or rather, a huge mass of wikis such that every subject is covered by at least one) — possibly each with different organizational principles or paradigms. For instance, others who don’t like the style or organization of my group theory wiki, but like the idea of a wiki, can start another wiki of their own. Multiple wikis means more competition and more pressure to produce quality material. It also means greater variety for the end-user: each user can choose the wiki whose style best suits his/her personal learning style.

Of course, wikis would not replace the traditional tools of learning, but they could supplement, and provide new inputs. I definitely find learning in front of a computer, with the freedom to click on whatever links I want, and the freedom to search for any term, far more exciting than reading from a textbook. And it often points me to interesting texts to read that I wouldn’t have heard of otherwise.

So for now, I guess, it’s: keep working at the wiki till I get an idea of where it’s headed.

January 9, 2007

Wikis

Filed under: Wikis — vipulnaik @ 2:20 pm

I’ve been interested in group theory for a long time (in fact, ever since twelfth standard, as I have documented here). Apart from learning the subject, I also explored organizational principles in group theory, particularly those centered around properties of groups and subgroups, and I made many observations regarding groups.

Some of these observations led me to the Extensible Automorphisms Problem as well as other questions in group theory. Other observations helped open the way for studying topics closely related to group theory such as representation theory, combinatorial group theory, and so on.

Often, many definitions and terms in group theory (especially those introduced in recent times) are hard to look up and understand because they are used only in obscure papers. I have strongly felt the need for a means for easy reference to definitions and proofs. I have also sought ways in which people can easily “look up” whether or not a certain fact is true without having to go through masses of literature, and also locate proof ideas that may be useful for a particular problem they are working on.

For some time, I have felt that with my understanding and perspective of group theory, I can work towards the creation of a knowledge repository in the subject that allows for quick lookup of facts and also encourages fruitful exploration.

Some time in mid-June and early July, during free time in between my Visiting Students’ Research Programme at TIFR, I started on an intensive programme of putting up articles on basic definitions in group theory on Wikipedia. Around that time, I decided that whenever I come across a new term or definition (that is reasonably standardized), I will put it up on Wikipedia. This way, Wikipedia serves as my canonical self-reference tool, while also providing a repository for others.

However, I realized that putting up material on Wikipedia has some limitations:

* I cannot organize the article content of definition articles the way I want.
* I have very little control over the global structure, something which is very crucial to effective navigation and exploration.
* I cannot put up original work and original ideas.

Thus, after some time, with the pressure of other work, I shelved aside my Wikipedia drive.

Some time in late October, with CMI’s shift to its new hostels imminent, the students of CMI decided to use a wiki for discussing the important shifting-related concerns. After some research, Shreevatsa started out the page here on editthis.info. Later, I decided to set up a wiki detailing the activities of CMI Spark. This differed from the shifting concerns as it was an “entire wiki” as opposed to just a wiki page.

Towards the end of November, while working on the Extensible Automorphisms Problem, I got very confused about the whole lot of terminology that I was both using from standard sources and introducing on my own. I first thought of starting a wiki on extensible automorphisms, and even made a few pages. I realized that the wiki required lots of definitions from group theory, so I decided to convert it to a group properties wiki, that I made here.

I first started work on this wiki at the end of November, just as I was leaving for home after the semester in CMI. After going home, I had to concentrate on wrapping up my applications, and so I worked only sporadically on the wiki. Towards the end of my vacations in December, I started putting huge spurts of effort into the wiki. After returning to CMI in January, I spent the weekend on the group properties wiki.

My initial attempt was to “pour out” what I knew of group properties and develop the format of the articles through experience and actual practice with writing the articles. I also drew on the ideas of property theory. With each new batch of articles I wrote, I discovered some general ways in which the article structure could be made more effective. I then tried to do a back-cleaning of the articles as per these general ways.

In the beginning of January, once I started feeling that the wiki had grown to a reasonable size and had reasonable width of coverage, I shifted focus to creating policy guidelines for the wiki. This process has been continuing till yesterday.

Working on the group properties wiki taught me that a wiki could be a very effective tool for creating, formulating and sharing ideas. So far, the “sharing” part has not been active.

Today, when I realized that I needed to brush up on commutative algebra in order to study certain areas of mathematics, I looked back at my commutative algebra texts. I realized that many term definitions that I had studied earlier failed to stay in my mind because I was unable to grasp the overall structure and linkages. Then, I thought about starting a wiki on commutative/non-commutative algebra and algebraic geometry. Unlike the group properties wiki, this would be on a topic where my knowledge and foundations were quite weak and where I did not have any “new” ideas. On the positive side, though, I had my experience with the group properties wiki to provide me a rapid start.

I started the wiki today here and have already made decent progress.

I have also been trying to develop a theory of mine that I dubbed “APS theory”, and I am using this wiki to document the theory.

How these wikis shape up and whether they get used by people other than me, remains to be seen.
