What Is Research?

May 28, 2010

Planetmath and Mathworld losing out to Wikipedia

Filed under: Web structure,Wikipedia,Wikis — vipulnaik @ 10:59 pm

As I’ve mentioned earlier (such as here), it has seemed to me for some time that both Planetmath and Mathworld are losing out as Internet-based mathematical references to Wikipedia. I don’t expect that there has been an absolute decline in the traffic to these websites: the growth of the Internet by leaps and bounds means that a website has to make a fairly active effort to lose users. But I do expect that more and more of the new users who are coming in are treating Wikipedia as their first source of information.

From the preliminary results of a recent SurveyMonkey survey: out of 29 respondents (most of them people pursuing or qualified in mathematics), 24 claimed to have used Wikipedia often, and the remaining 5 to have used it occasionally for mathematical reference information. For Planetmath, 22 people claimed to have used it occasionally, and 7 to have heard of it but not used it. For Mathworld, 5 people claimed to have used it often, 16 to have used it occasionally, 7 to have heard of it but not used it, and 1 to have never heard of it.
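The tallies above are easier to compare as shares of the 29 respondents. Here is a minimal sketch that recomputes them from the counts quoted in the previous paragraph; the dictionary layout, the `shares` helper, and the short answer labels are illustrative shorthand, not the survey's own wording or export format.

```python
# Counts quoted in the paragraph above (29 respondents in total).
# The answer labels are illustrative shorthand, not the survey's exact wording.
RESPONDENTS = 29

usage = {
    "Wikipedia": {"often": 24, "occasionally": 5},
    "Planetmath": {"occasionally": 22, "heard of, not used": 7},
    "Mathworld": {
        "often": 5,
        "occasionally": 16,
        "heard of, not used": 7,
        "never heard of": 1,
    },
}


def shares(counts, total=RESPONDENTS):
    """Return each answer's share of respondents as a percentage (one decimal)."""
    return {answer: round(100 * n / total, 1) for answer, n in counts.items()}


for site, counts in usage.items():
    # Sanity check: every respondent is counted exactly once per site.
    assert sum(counts.values()) == RESPONDENTS
    print(site, shares(counts))
```

With only 29 respondents, the percentages are a reading aid rather than a precise estimate.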

In response to a question on how Planetmath and Mathworld compare with Wikipedia, the nine free responses were:

  • Planetmath now loads too slowly, so I seldom use it. Even when I did use it often, I found that the articles tended either to be too incomplete, or too tailored to people who are already experts. Many Mathworld pages are still much better than many Wikipedia pages, but Wikipedia is more comprehensive.

  • planet math is slow!!! mathworld has more graphics and less maths.

  • Material is generally better presented on Mathworld, but the topics are more limited.

  • Not as much information as Wikipedia and not as well arranged

  • I usually find Wikipedia to be my first source and only go to Mathworld or Planetmath if wikipedia fails me. I guess that means Wikipedia is better.

  • Planetmath is dying, Mathworld is static.

  • I find wikipedia much more useful than Mathworld. Mathworld’s pages are very technical, which is not what I am looking for on the internet usually. Usually I am looking for someone’s nice conceptual understanding of a topic or definition (through nice examples), and Wikipedia usually has lots of these.

  • Planetmath is quite useful to find proofs. Mathworld is very specialised, but it has a few nice bits of information sometimes. They seem to both be quite stagnant compared to Wikipedia.

  • They can’t keep up. There was probably a time when PlanetMath was a better reference than Wikipedia, but it’s fading fast. I think their ownership model isn’t conducive to long term quality.

(See the responses to these and other questions here and take the survey here).

The general consensus does seem to confirm my suspicions. Why is Wikipedia gaining? Here are the broad classes of explanations:

  1. It’s just a self-reinforcing process. The more people hear about and link to Wikipedia, the more people are likely to read it, the more people are likely to edit it and improve it. But if that’s the case, why did Wikipedia ever get ahead of Mathworld and Planetmath? Two reasons: (i) its more radically open editing rules (ii) Wikipedia covers many areas other than mathematics, so people come to the site more in general. Also, since it covers many areas other than mathematics, it can better cover content straddling mathematics and other areas, such as biographies of mathematicians, and historical information that is relevant to mathematics. This creates a larger, strongly internally linked, repository of information.

  2. Planetmath’s owner-centric model (as mentioned in one of the responses) where each entry is owned by one person, does not create a conducive environment for the gradual growth and improvement of entries.

  3. The appearance of content is better on Wikipedia. Prettier symbols, faster loading, better internal links, better search. This is definitely an advantage over Planetmath, which has slow load times in the experience of many users (as indicated by the comments above) though perhaps not so much over Mathworld.

  4. Google weights Wikipedia higher (because of the larger size of the website and the fact that a lot of people link to Wikipedia). This is related to (1).

  5. The people in charge of Mathworld and Planetmath simply lost interest. Mathworld is largely run by Eric Weisstein, an employee at Wolfram, who seems to have recently been trying to integrate metadata about mathematical theorems and conjectures into Wolfram Alpha. Developing Mathworld continually to a point of excellence does not seem to have been a top priority for Weisstein or his employer Wolfram Research (which hosts Mathworld) over the last few years. The people running Planetmath also may have become less interested in continually innovating.

Given all this, is Wikipedia the best in terms of: (i) the current product or (ii) the process of arriving at the product? While I’m far from a Wikipedia evangelist, I think that the answer to (ii) is roughly yes if you’re thinking of broad appeal. Anything which beats Wikipedia will probably do so by being more narrowly focused, but it may then not be of much interest to people outside that domain. A host of many such different niche references may together beat out Wikipedia for people who care enough to learn about a multiplicity of references. For those who just want one reference website, Wikipedia will continue to be the place of choice in the near future (i.e., the next 3-4 years at least, in my opinion).

Currently, Wikipedia is an uneasy mix of precise technical information and motivational paragraphs. It makes little use of metadata to organize its information; on the other hand, it is easy to edit and join in. The mathematics entries cannot be radically changed in a way that would make them radically different in appearance from the articles on the rest of the site. This opens up many niche possibilities, some of which are being explored:

  1. Lab notebooks, where people store a bunch of thoughts about a topic, without attempts to organize them into something very coherent. Here, good metadata and tagging conventions could allow these random lab notebook-type jottings to cohere into an easily accessible reference. This would be the mathematical version of open notebook science, a practice that is slowly spreading in some of the experimental sciences. nLab (the n-category lab) is one example of a “lab notebook” in the mathematical context. This is great for motivation, and also for understanding the minds of mathematicians and the process of mathematical reasoning.

  2. Something that focuses on a particular aspect of mathematical activity. For instance, Tricki, called the Tricks Wiki, focuses on tricks. Other references may focus on formulas, others may focus on counterexamples, yet others (such as the AIMath wiki on localization techniques and equivariant cohomology) may focus simply on providing extensive bibliographies. Somewhat more developed examples include the Dispersive Wiki and complexity zoo (actually, a computer science topic, but similar in nature to a lot of mathematics). Some may focus on exotic tricks of relevance to a particular mathematical discipline. There is some cross-over with lab notebooks, as the tricks become more and more exotic and the writing becomes more and more spontaneous and less subject to organization into an article.

  3. Highly structured content rich in metadata that is intended to provide definitions, proofs and clarify analogies/relations. Examples include the Group Properties Wiki [DISCLOSURE: I started it and am the primary contributor] which concentrates on group theory. The flip side is that the high degree of organization uses subject-specific structures and hence must be concentrated on a particular narrow subject.

There are probably many other niches waiting to be filled. And there may also be close substitutes for reference sites that weren’t created as references. For instance, Math Overflow, though not a reference site, may play the role of a reference site once it accumulates a huge number of questions and answers and adopts better search and specific tagging capabilities. Similarly, thirty years from now, the contents of Terry Tao’s weblog may contain a bit on virtually every mathematical topic, in the same way as Marginal Revolution has a bit on almost all basic economic topics (I say “thirty years” because economics is in many ways a smaller subject than mathematics).

April 6, 2009

Tricki salutes Wikipedia

Filed under: Tricki,Wikipedia,Wikis — vipulnaik @ 12:41 am

Tim Gowers of Polymath fame announced in this blog post the release of a Prelive version of Tricki. Tricki stands for “tricks wiki”, and Gowers has been working on it for quite some time along with Olof Sisask and Alex Frolkin, as he mentioned in this earlier blog post. The eventual aim is to make the wiki open to general editing, though in the current “pre-live” stage, people can only view content and add comments, rather than edit the actual content.

As of now, it seems an interesting experiment, but the current scope appears too broad and vague. A more detailed review will take some time coming, but one thing already caught my eye: this page. All the editing, except page creation and a minor formatting change, seems to have been done by Gowers (view the revision history to confirm), so I’ll attribute the writing to Gowers.

Titled Why have a separate site rather than simply use Wikipedia?, the page tries to provide a “justification” for the Tricki. Some of the statements here depress me.

To begin with, the very premise of the heading seems mistaken. There are very good reasons why the kind of content on Tricki cannot and should not be on Wikipedia, and these reasons are Wikipedia’s own clearly stated policies, such as No original research, Notability, Reliable Sources, and verifiability. Or, just take a look at What Wikipedia Is Not, and it is clear that the kind of content that the tricks wiki currently contains and plans to expand into is not the kind of content allowed on Wikipedia.

Thus, Gowers’ statement:

In principle, it would be possible to write Tricki articles and put them on Wikipedia.

is just false! Yes, it would be possible technically, or “in practice”, but most of these articles would become candidates for deletion as per Wikipedia’s deletion policy. In other words, Gowers’ tone that Tricki could in principle be a part of Wikipedia but they’re making a different choice for their own sake is misleading: Tricki does not have the choice to become a part of Wikipedia, and going it alone is the only practical alternative.

March 17, 2009

Academic and journalistic support for Wikipedia

Filed under: Wikipedia — vipulnaik @ 3:24 pm

Seth Finkelstein kindly responds to my blog post Wikipedia criticism, and why it fails to matter. Seth agrees with my basic point — it is hard to influence people away from either reading or editing Wikipedia. However, he entertains the hope that criticism might affect what he calls the “public intellectual” perspective of Wikipedia.

This is probably more likely, but the chances of this are dismal too. My digging through references to Wikipedia in books and other writings suggests that even academic intellectuals, who should know a lot more, give a fairly formulaic description of Wikipedia.

Here are the typical elements of a description of the online encyclopedia:

  • History beginning: Jimbo Wales started out in 2000 with the idea of a free encyclopedia. They started a “top-down” “expert-led” project called Nupedia that produced few articles. Then, “somebody told Wales about wiki software” and they implemented it and the ordinary people started contributing. (The writers who research facts better usually highlight the co-founder controversy and the fact that the proposal for Wikipedia was originally made by Larry Sanger; many writers omit this.)
  • Surprise, surprise, it works: Here’s the paradox. An encyclopedia with nobody in charge, with nobody getting paid and with people supplying volunteer time, is as accurate as Britannica, the expert-written encyclopedia (quote a Nature study by Jim Giles). Surprise, surprise, surprise. Then, use this to prove the favorite theory the author is expounding (this could be “intellectual commons”, “creative commons”, “produsage”, “commons-based peer production”, or some variant of that). (The writers who do more research usually mention that the study was investigative journalism rather than scientific research, and also mention Britannica’s repudiation and Nature’s response to those repudiations.)
  • Wikipedia is not without its flaws (surprise, surprise, surprise). The most popular example here is the John Seigenthaler story. Some more serious researchers go so far as to mention other controversies such as the Essjay controversy.
  • But the fact that anybody can edit Wikipedia is its greatest strength, because what can be done can be undone. Quote this study by researchers at MIT and IBM to prove the point, or talk about some specific anecdotal example.

There are few books written about the phenomenon of Wikipedia as a whole. Most references to Wikipedia appear in books on topics such as Web 2.0, user-generated content, or the greatness of the Internet, and the above kind of treatment fits in well with the points the author is trying to make.

Among the authors who have praised Wikipedia but done a more in-depth analysis than the above, I can think of Clay Shirky (in his book Here Comes Everybody), Axel Bruns (in his book Blogs, Wikipedia, Second Life and Beyond: From Production to Produsage) and Chris Anderson (in his book The Long Tail). Though I think the presentations of each of these authors have ample deficiencies, I believe each of them has made some original contribution in analyzing Wikipedia. But the same cannot be said of a lot of people who give this simplistic presentation of Wikipedia, often stretching it out over several pages.

Why so much academic support for Wikipedia?

Consider the legions of academics who write gloriously about Wikipedia, following and adding to the outline I’ve discussed above: Clay Shirky, Axel Bruns, Chris Anderson, Jonathan Zittrain (The Future of the Internet: and how to stop it), Lawrence Lessig (Code v 2), Cory Doctorow (Content), Yochai Benkler (Wealth of Networks), Tapscott and Williams (Wikinomics), Tom Friedman (The World Is Flat) and many others. Contrast this with the much smaller number of authors who take a genuinely critical stance on Wikipedia: Andrew Keen (who describes his own book, The Cult of the Amateur, as a “polemic” as opposed to a serious and balanced presentation) and Nicholas Carr (The Big Switch). Other critics of Wikipedia, many of whose names I outlined in my earlier blog posts, typically content themselves with writing blog posts. It may also be noted that most of the critics aren’t academic intellectuals who work and teach at universities. There are some librarian critics, such as Karen Schneider, who wrote this piece among others critical of Wikipedia; there are also subject experts who write critically of Wikipedia’s treatment of their particular subjects. But this isn’t the same as academics who’re supposed to be experts on the Internet, collaboration, emerging trends, and other things that Wikipedia is arguably about, offering cogent negative analyses of Wikipedia.

So why is there such an overwhelmingly strong academic support for Wikipedia?

I suspect this comes down to the way academics write books. A book by an academic, whether written for a select academic audience or for a broader audience, typically has a “point” or a “theme”, and most of the themes of books that include Wikipedia are supportive of some of the ideas for which Wikipedia is a poster child. For instance, if Yochai Benkler wants to write a book arguing for the power of commons-based peer production (CBPP), or Axel Bruns wants to describe the power of produsage, the best way to use Wikipedia is as a publicly visible and easy-to-appreciate example of this. This creates a selection bias for the author to pick those aspects of Wikipedia that further the point and ignore the aspects that do not. To add the appearance of a fair and balanced treatment, things like the Seigenthaler episode can be thrown in.

On the other hand, criticism of Wikipedia doesn’t generally add up to any big point or theme that is exciting to write a book about. At least, not yet. Perhaps, ten years down the line, somebody may write a book on how corporations and organizations systemically exploit free labor to produce results, and once this kind of narrative starts gaining a foothold in academia, the standard Wikipedia tale will morph: “Wikipedia is subject to vandalism; however, its openness to editing is its greatest strength” might become “although openness to editing is a strength, Wikipedia is subject to vandalism, edit wars, and a lot of unproductive disputes”. But in order for such a book to be written, the theme has to be big enough, or thick enough, to fill an academic book.

Another related factor is “academic herding” — the tendency of academics and intellectuals to herd together. Better wrong together than right alone, as the saying goes. We’ve often been told how financial herding (where different investors, brokers, and fund managers prefer to do the same things their peers are doing, for fear of standing out) precipitates market crises. I suspect that academics herd too. Currently, the herding tendency is towards singing the virtues of an uneasy amalgam of open source, free culture, user-generated content, participation, bottom-up, and a lot of buzzwords. I say “uneasy” because in principle, many of these are independent and a supporter of one may very well choose not to be a supporter of the other. In practice, they come in a bunch and self-appointed progressives like to bundle disparate things such as “the fight against restrictive copyright”, “enabling ordinary users to create content”, “the fight for opening up source code”, and the success of that specific thing called Wikipedia.

These things aren’t always bundled. Seth Finkelstein and Jason Scott, among many others, while ardent critics of Wikipedia, have been supporters of many of the causes that usually come bundled with it, such as open source and using Creative Commons licenses. Nonetheless, I suspect that the bundling effects, along with herding effects, could be pretty strong.

Journalism on Wikipedia

Journalists such as Tom Friedman may be excused for giving Wikipedia a shallow treatment, as Friedman does in the four pages he devotes to it in his long book The World Is Flat. Nonetheless, some of the mistakes that Friedman makes are echoed sadly too often. One of these is to spend too much time interviewing the “person in charge” or the “people at the helm”. I’m guessing that journalists typically need to do this to get up to speed, but interviewing people at the helm can be tricky for something like Wikipedia where nobody really is in charge.

Sincere and hardworking journalists, including Pulitzer Prize winners such as Friedman and New Yorker writer Stacy Schiff, make painstaking efforts to interview people in charge of the encyclopedia. For this New Yorker piece, Stacy Schiff did an amazing amount of work interviewing people that the Wikimedia Foundation directed her to. Only, it later emerged that one of these people, whom Jimbo Wales had vouched for, had faked his identity (this person was Essjay, who claimed to have several academic degrees but later turned out to be a college drop-out). That said, Schiff’s piece was probably among the best in a mainstream publication that I’ve seen, along with this New York Times Magazine piece, and some pieces in the Chronicle.

Less accomplished journalists (as well as some academics) make the usual gaffe of interviewing Jimbo Wales and then saying something like “Wales has the following plans for Wikipedia …”, as if Wales is responsible for the success of Wikipedia the same way a corporate entrepreneur is responsible for the success of his or her own enterprise. The following lines are only a slight exaggeration: “The sky is the limit for Wales. Having created the world’s biggest encyclopedia for free, Wales is now working on a free dictionary, free news, and a free resource for books and source text in the public domain.”

Conclusion

The implicit support that Wikipedia enjoys from academics and journalists will last for some time, despite the excellent efforts by some journalists and academics to go beyond the surface. To get a new academic perspective on Wikipedia, what is needed is a coherent theme or theoretical framework in which a negative assessment of Wikipedia can fit, and such a theme needs to overcome the herding and bundling tendencies seen in academia. To get a new journalistic perspective that is reflected in more than just a handful of thoroughly researched articles, we need enough prominent academics and other people who hold such a perspective to be in positions where they are likely to get interviewed by journalists looking to write an article on Wikipedia. Neither prospect seems immediately forthcoming.

March 7, 2009

More on Wikipedia criticism

Filed under: Wikipedia — vipulnaik @ 1:54 am

My previous blog post on Wikipedia criticism generated quite a few comments. This was partly because it got covered in this forum post at The Wikipedia Review. There were several points that I made during the post that some of the commentators disagreed with, so I’ll try to elaborate the rationale behind those points a bit here, as well as what might be new insights.

Sausage: eating and making

The perspective from which I’m analysing Wikipedia is primarily an end-user perspective. The central question I explored last time was, “Does criticism of Wikipedia ultimately affect whether people read Wikipedia articles?” My rough conclusion was that there is unlikely to be any direct effect. This is particularly true of criticism that is aimed at Wikipedia’s process, because end-users care more about the end product, rather than the process.

One opposing viewpoint to this is that criticism of Wikipedia may make people less comfortable with using Wikipedia, because it might change the perception of whether the Wikipedia entry is reliable, accurate, or unbiased enough to be used. For instance, if potential web users are aware that Wikipedia entries can be edited by anybody, they may rely less on Wikipedia. I think this is plausible, but my personal experience suggests that if, even after all the possible unreliability is taken into the balance, Wikipedia is still the easiest to use, people will go to Wikipedia.

One argument that I mentioned, and that some commentators also mentioned, was that criticism of Wikipedia may have an indirect effect by discouraging people from contributing. After all, knowing the bad conditions in a sausage factory may not be that much of a disincentive for eating sausage, as long as the sausage is good. But it may discourage people from joining the sausage factory. If new contributors fail to arrive to replace the old ones who leave, then, the argument goes, Wikipedia entries will decay to the point where they get so visibly bad that even the end-users start noticing a quality decline. The argument doesn’t claim that end users care about process, but it does claim that contributors care about process.

In the last blog post, I pointed out one problem with this argument. Namely, if the most conscientious editors are the ones who are put off editing the most by criticism, the people who’re left may be the ones who are most likely to have agendas to peddle. This may result in a decline in quality — but not an obvious or visible one. That’s because the information-peddlers who are still left after some people get put off the sausage factory may also be the people who are most skilled at masking disinformation as information.

Here, I’ll try to elaborate on this, as well as give my understanding of how Wikipedia entries actually evolve.

Improve over time?

The naive belief among wiki-utopians is that Wikipedia entries keep getting better with time. The improvement may not be completely monotone, but it is largely so, with the bulk of the edits in the positive direction, punctuated by occasional vandalistic edits and good-faith edits that go against Wikipedia policy. For instance, consider this piece by Aaron Krowne, written way back in March 2005, as a critical response to this article by Robert McHenry. Another article in the Free Software Magazine gloats about the rapid growth of Wikipedia.

Wikinomics, a book by Tapscott and Williams, says that [Wikipedia] is built on the premise that collaboration among users will improve content over time, and later continues on this theme:

Unlike a traditional hierarchical company where people work for managers and money, self-motivated volunteers like Elf are the reason why order prevails over chaos in what might otherwise be an impossibly messy editorial process. Wales calls it a Darwinian evolutionary process, where content improves as it goes through iterations of changes and edits. Each Wikipedia article has been edited an average of twenty times, and for newer entries that number is higher.

In his book The Long Tail, Chris Anderson writes:

What makes Wikipedia really extraordinary is that it improves over time, organically healing itself as if its huge and growing army of tenders were an immune system, ever vigilant and quick to respond to anything that threatens the organism. And like a biological system, it evolves, selecting for traits that help it stay one step ahead of the predators and pathogens in the ecosystem.

Not everybody believes that Wikipedia articles keep increasing in quality. In this philosophical essay, Larry Sanger (a co-founder of Wikipedia) makes the following interesting hypothesis: the quality of a Wikipedia entry does a random walk about the best possible value that the most difficult of the editors watching it can allow. This is merely a hypothesis, born out of Sanger’s experience, and Sanger makes no attempts to provide quantitative or even anecdotal verification of the hypothesis. Others have pointed out that many of the Wikipedia articles that receive featured article status (some of which even make it to the Wikipedia front page) later revert to being middling articles. Consider, for instance, this excerpt from an article by Jason Scott on the general failure of Wikipedia:

It is not hard, browsing over historical edits to majorly contended Wikipedia articles, to see the slow erosion of facts and ideas according to people trying to implement their own idea of neutrality or meaning on top of it. Even if the person who originally wrote a section was well-informed, any huckleberry can wander along and scrawl crayon on it. This does not eventually lead to an improved final entry.
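Sanger's random-walk hypothesis, mentioned just before the Scott excerpt, can be made concrete with a toy simulation. This is only a sketch under loud assumptions: quality is a single number, each edit moves it up or down by a fixed step with equal probability, and the "most difficult editor" is modelled as a hard ceiling that quality cannot exceed. None of these modelling choices, nor the names `simulate_quality`, `ceiling`, and `step_size`, come from Sanger's essay.

```python
import random


def simulate_quality(steps, ceiling, step_size=1.0, start=0.0, seed=0):
    """Toy model: quality takes random +/- steps, clamped to [0, ceiling].

    `ceiling` stands in (hypothetically) for the best quality that the most
    difficult editor watching the article will allow.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same path
    quality = start
    history = [quality]
    for _ in range(steps):
        # Each edit is equally likely to improve or degrade the entry.
        quality += rng.choice((-step_size, step_size))
        # Quality can neither go negative nor pass the ceiling.
        quality = max(0.0, min(ceiling, quality))
        history.append(quality)
    return history


history = simulate_quality(steps=1000, ceiling=10.0)
print("highest quality reached:", max(history))
print("average quality:", sum(history) / len(history))
```

In this toy model more edits do not translate into steadily higher quality: the path wanders below the ceiling instead of converging on it, which is the contrast with the "improves over time" picture that the hypothesis is drawing.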

My view: precarious equilibrium

There is exactly one Wikipedia article on every topic. Given Wikipedia’s dominance, this is the canonical source of information on the subject for hundreds of thousands (perhaps millions) of people. This canonicity is part of what makes Wikipedia so appealing to end-users, but it also means that even minor disagreements among potential editors of the article can become pretty significant when it comes to controlling that scarce and extremely valuable resource: the content of the Wikipedia article. After all, it feels like it pays off to put up a fight if fighting a little more can affect what thousands of people will learn about the topic.

A Wikipedia article on a controversial topic typically settles into a precarious equilibrium between different factions or interest groups that want to take the article in different directions (or prevent it from going in different directions). Consider, for instance, articles such as evolution, intelligent design, abortion, and alternative medicine topics ranging from well-known topics such as homeopathy to relatively lesser known topics such as Emotional Freedom Techniques.

The Emotional Freedom Techniques article is an example of such an equilibrium. There are roughly two camps. On one side are the proponents/believers, or people who, for other reasons, feel that the article should contain more details about the subject. On the other side are the skeptics/disbelievers, or people who otherwise feel that putting too much information on an unproven therapeutic approach may in fact amount to an endorsement by Wikipedia. At some point long past, the article was an editing hotbed (relative to its current status). It was much longer, with a lot of discussion on the talk page; sample, for instance, this revision dated 9 January 2006.

For the next year, till around February 2007, the article remained in roughly the same state, with proponents adding in positive details and removing negative details and critics doing the opposite. On 30 January 2007, the article was nominated for deletion (here is an archived copy of the deletion discussion). A sequence of edits in the next three weeks gradually reduced the scope of the article to a considerably smaller one. The idea was that in order to “save” the article, it needed to be reduced in scope. The critics had managed to disturb the past equilibrium. By March 2007, a new equilibrium had been established, and modulo the addition and removal of a few references, this new equilibrium has been maintained for the past two years.

My experience suggests that for controversial topics, this is typical: there are two or more camps of editors, and depending on their relative strengths, the article enters a certain state around which it varies a bit but roughly remains the same. Once this equilibrium has been established, it is not easy to break. One way of breaking the equilibrium is using a “war of attrition”: keep making changes in your direction until the other person gets tired and walks away. Another one is to recruit other forces to help you, and a third approach is to threaten drastic measures, such as deletion.

Of course, the story of conflicting agendas plays out even in relatively non-controversial topics. Even editors who aren’t ideologically opposed to each other can find a lot of different things to quibble about, thus barring progress of a Wikipedia article. In the less controversial and more low-profile cases, it isn’t so much blood-curdling fights that create an impasse but simply a lack of common vision on how to take the article forward. Different editors come to Wikipedia with their own baggage and agendas, even simple agendas on how mathematics or physics should be written or what restaurants should be mentioned in the article on a local community. Typically, editors join feeling enthusiastic that they’ll be able to share their ideas and knowledge with the rest of the world, and also learn from how others are sharing. Once they realize that others holding opposing views are going to work in orthogonal or at times opposing directions, they either get put off, or they enter a wiki-war. In some cases, I think there is a clear situation of some people seeking to do constructive editing and others trying to obstruct. In most cases, it is a bunch of minor ideological mismatches that lead to people either getting put off Wikipedia or choosing to get aggressive to defend the articles.

Thus, there are roughly two kinds of articles: the controversial ones where there is a precarious equilibrium between different interest groups trying to pull it in their direction, and the relatively non-controversial ones where competing agendas and views on how things “should be” written lead, not so much to warring, as to a simple lack of activity. The former happens in cases where the stakes are more significant, and where people can feel good and hot about taking particular stands. The latter happens in more mundane things such as normal subgroup or T. Nagar where people simply couldn’t be bothered to fight.

The precarious equilibrium exists at levels higher than the level of the individual article. For instance, Wikipedia has a long history of a battle between inclusionists and deletionists. In the beginning, when the encyclopedia was small, deletionists hardly existed. As the encyclopedia became larger, deletionists started gaining the upper hand, as the need for keeping the encyclopedia free of garbage began to be appreciated. The deletionists had their heyday in 2005-2006, but inclusionists have started gaining ground again. In a recent Guardian column, Seth Finkelstein describes some of the battles and underlying agendas.

Stagnation is not the same as death

The precarious equilibrium for controversial topics and the relative stagnation for non-controversial topics may suggest that Wikipedia article quality could well go into decline, to the point where users start noticing. However, this is probably not going to happen, for many reasons. First, encyclopedia articles on non-controversial topics are often the kind of thing that does not get outdated anyway. For instance, the definition of a normal subgroup is unlikely to ever change. Similarly, basic definitions such as friction in physics and historical articles such as Aristotle aren’t likely to get outdated either.

For controversial topics, it may well happen that a precarious equilibrium may inhibit development, but then again, the whole problem is that nobody can define “development” of an article in clear terms.

But the bigger point, both for controversial and non-controversial topics, is that it is highly unlikely that Wikipedia article quality will actually decline significantly. So, a stagnation or stabilization in article quality can spell doom for Wikipedia only if some competing resource is trying to improve. Stagnation or stability equals death only in the presence of serious competition.

Okay, so can competition succeed?

The problem here is that the success of an encyclopedia effort requires a large amount of collation of people’s efforts, and this kind of collation doesn’t happen easily. What Wikipedia has managed to do is give enough people the impression that it can be a useful place to pool their efforts. I remain unconvinced that Wikipedia has actually managed to solve the many inherent problems of large-scale collaboration, but the very fact that it can convince a lot of people, even if for a short period of time for each person, to spend some time on Wikipedia, is impressive.

People often join Wikipedia sold on the idea of working together with others, sharing their ideas, and learning from others. But not too many people are really willing to learn or change their worldviews, and the one-article paradigm of Wikipedia really forces a lot of conflicts out into the open. Thus, after trying a bit to make their voice heard amidst the din, a number of people leave.

Of course, most people who leave are bound to think of themselves as in the “right” and stubborn other Wikipedians as in the wrong, which creates a temporary bonhomie between disgruntled ex-Wikipedians. Such temporary good feelings towards one’s fellow wronged might result in a new idealistic commitment to create something better than Wikipedia. But beneath that is still the fact that many of the disgruntled ex-Wikipedians have agendas that compete with and are incompatible with each other, and new efforts that seek to do better than Wikipedia haven’t yet found a way to overcome this problem. (Fundamentally, I don’t think there is a way to overcome these problems.) There are many examples of Wikipedia forks that have rapidly settled down into obscurity. Examples are Veropedia, run by ex-Wikipedian Danny Wool, and Cassiopedia, The True Encyclopedia (this already seems to have vanished and been replaced by some new wiki). Other encyclopedias that try to do Wikipedia right include, for instance, Conservapedia, intended as the Wikipedia replacement for conservative Christians.

An example of a could-be-better-than-Wikipedia encyclopedia effort is Citizendium, founded by Wikipedia co-founder Larry Sanger. Sanger, and many of the others who work on Citizendium, seem bent on avoiding the many edit wars and other conflict situations that arise in Wikipedia. So far, so good — Citizendium has survived and has been growing slowly for the last one and a half years. Yet, beyond the substantially greater civility and the substantially lower activity, there is little to distinguish Citizendium from Wikipedia in terms of the competing agendas of its users. The main difference right now, as far as I can see, is that Citizendium articles typically settle into an equilibrium of inactivity (which is similar to most Wikipedia articles on non-controversial topics) as opposed to a precarious equilibrium born of warring parties.

I personally do not think that there is room for another Wikipedia-like endeavor, at least in the near future. This does not mean that everything that seeks to do Wikipedia better will necessarily fail outright. It is probable that Citizendium will continue to grow over the next few years, and may at some stage become good enough as a general-purpose encyclopedia. Nonetheless, it is unlikely to become seriously competitive with Wikipedia in the near future.

The direction in which competition to Wikipedia could indeed be dangerous is that of a growing number of more specialized sites that help provide answers to people’s queries in somewhat more specialized topics. These specialized sites, of course, have their own conflict problems, but may be able to overcome these problems better simply because there is no single one of them. This allows people with competing agendas to work for competing specialist sites, rather than battle needlessly on the same turf. The best example of a somewhat specialized wiki-based site is Wikitravel, which is a great site for travel information. Since this is a more narrowly focused site, it has clearer policy that reduces conflict over the structuring of articles. There are many others at varying ends of the spectrum between general and special. For instance, there’s WikiHow, which also seems to be doing pretty well for itself: a wiki-based how-to manual. This sacrifices some of the canonicity of the Wikipedia entry by allowing different how-to articles to be written by different people. There are a lot of substantially more specialized and narrow efforts, ranging from the hastily conceived to the well-planned. My own efforts at a group theory wiki, followed by an effort to generalize this to subject wikis in general, are one small example.

Yes, but how can diversified competition succeed?

If, as I believe, a challenge to Wikipedia can be presented only through a large number of specialist sites that compete healthily with each other and with Wikipedia, we have a bit of a paradox. The paradox is that the very reason people go first to Wikipedia is so that they do not need to navigate through or remember a bunch of different sites — Wikipedia is useful as a one-word solution to the problem of finding information.

The paradox isn’t all that big once we remember that for each specialized topic, there’s likely to be only one, or a few, places to go to. And the other important point is that within each resource, locating the article or piece of information that’s needed is pretty fast. Imagine, for instance, that people surfing casually for medical information, instead of going to the Wikipedia entry, were guided towards a collection of competing medical information websites. At first sight, a random surfer may just pick one website, and look up information there. If the surfer found that information well-presented and useful, the surfer may continue to visit that specialized site for medical information of that sort. Another surfer with somewhat different tastes may not like that first pick and may try a different medical information site. Since the medical information sites need to compete for users, they strive to provide better information that answers users’ needs more effectively.

There are two differences with Wikipedia: first, different people use different competing sites. Second, each site restricts itself to something it can specialize in, so the choices a person makes with regard to medical information sites can be independent of the choices the person makes with regard to sports information sites or glamour/fashion sites.

The idea that diverse competition can succeed against Wikipedia is also possibly an affront to people who view the collaborative principles behind Wikipedia as morally superior to the cut-throat competition that characterizes much of the messy market. Somehow, competition seems to be inherently more destructive and wasteful than the “working together” that Wikipedia engenders. Be that as it may, I think that a more diverse range of knowledge offerings can actually help reduce wasted effort, as people spend less effort fighting each other and canceling each other’s efforts, and more effort building whatever things they believe in.

Further, there are ways to ensure that a spirit of competition coexists with a spirit of sharing of ideas and knowledge. Academic research and software development often follow extremely open sharing principles, and yet can be fiercely competitive. The key thing here is that since a lot of independent entities are separately pursuing visions and borrowing ideas from each other, there is little destructive warring for the control of a single scarce resource. A spirit of sharing and openness can be backed by open content licenses such as the Creative Commons licenses.

What about search engine and link dominance?

Back at the beginning of the 21st century, when Google was nascent and Wikipedia non-existent, there were a number of books highlighting possibly disturbing tendencies that might develop on the Internet. Among these was Republic.com by Cass R. Sunstein (also co-author of Nudge, and now a member of the Obama administration). Sunstein warned of the dangers of group polarization on the Internet, with extremely personalized surfing, linking only to similar sites. This, Sunstein argued, could potentially lead to two bad outcomes: the absence of a public space, where issues of general interest could be addressed, and the total non-exposure to opposing or different thoughts and ideas.

The problem today seems to be of a somewhat different nature: the presence of a few sites that dominate much of Internet surfing. Wikipedia is increasingly becoming a destination for information-seekers, both as a direct destination and as a destination via search engines and links. As I explained earlier, I believe that the canonicity of Wikipedia as an information source is what makes it so attractive to edit and control.

That is why some people have suggested that Google and other search engines that place Wikipedia highly are largely responsible for Wikipedia’s success. Some research has shown that about half of the visits to Wikipedia still come through search engines. This suggests a “solution” to the problem of Wikipedia dominance: demote Wikipedia in the search engines.

I don’t think such a solution will either work or make sense.

First, unlike Wikipedia, which faces no serious competition, search engines face tremendous competition. Google may be a market leader but it cannot afford to sit back and relax. Search engines also have a strong incentive to please their users. This means that if Google is placing Wikipedia high up in its entries, then it has a strong incentive to do so: that’s what its users want. Whether this placement is due to conscious tweaking by Google employees or simply an unforeseen consequence of Google’s PageRank algorithms is unclear, but if it were something that displeased users, Google would fix it.

I suspect that a lot of the traffic that comes to Wikipedia through search engines actually comes through algorithms of the sort: “Do a search and pick a Wikipedia entry if it shows up in the top five, otherwise pick whatever seems relevant.” Here, search engines are being used as a “Wikipedia+”: Wikipedia, plus the rest of the web if Wikipedia failed. If the search engine failed to turn up Wikipedia in the top five, and the user later found a Wikipedia entry on the topic that he or she felt should have been up there, the user may start bypassing the search engine and go directly to Wikipedia.
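The “Wikipedia+” rule described above can be written down as a short sketch. This is only an illustration of the mental algorithm, not a description of how any search engine or browser works: the function name `pick_result`, the `top_n` parameter, and the example URLs are hypothetical, and “whatever seems relevant” is crudely modeled as “the first result”.

```python
from urllib.parse import urlparse


def pick_result(result_urls, top_n=5):
    """Return the link a 'Wikipedia+' searcher would follow.

    result_urls is a ranked list of result URLs (best first).
    Assumption: 'whatever seems relevant' is modeled as the top result.
    """
    for url in result_urls[:top_n]:
        host = urlparse(url).hostname or ""
        # Accept wikipedia.org and any language subdomain such as en.wikipedia.org.
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            return url
    # No Wikipedia entry in the top results: fall back to the rest of the web.
    return result_urls[0] if result_urls else None


# Hypothetical search results for illustration only.
results = [
    "https://example.org/normal-subgroup",
    "https://en.wikipedia.org/wiki/Normal_subgroup",
    "https://example.com/group-theory-notes",
]
chosen = pick_result(results)
```

Under this rule the ranking position within the top five barely matters: the Wikipedia entry is chosen even when another page is ranked above it.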

Second, even if search engines stop favoring Wikipedia, the default-to-Wikipedia rule runs through many things other than search engines. Many smartphone applications and other Internet-connected devices such as Amazon’s Kindle give Wikipedia privileged status: for instance, the Kindle enables special access to Wikipedia for no extra charge. Some smartphone lookup services utilize Wikipedia articles as a knowledge base.

But even if all these big guns decided to disfavor Wikipedia, the link dominance of Wikipedia is too widespread: there are any number of blogs that provide links to Wikipedia articles on topics, many of them probably based not so much on the particular qualities of the Wikipedia entry as on the brand name.

So yes, I do believe that if all linkers, commentators, search engines, and smartphone applications suddenly revolted against Wikipedia, people, finding a lot fewer links to Wikipedia, may start forgetting Wikipedia, or at any rate, make it less of a default. I just don’t see any reason for such a collective epiphany to occur.

February 23, 2009

Wikipedia criticism, and why it fails to matter

Filed under: Wikipedia — vipulnaik @ 10:50 pm

Over the past few months, I’ve been collecting newspaper and magazine articles about the phenomenon of Wikipedia. (I’ve myself written two blog posts on Wikipedia here and here). Prominent among the Wikipedia critics is Seth Finkelstein, a consulting programmer who does technology journalism on the side and publishes columns in the Guardian. Seth’s criticism is largely related to the politics of getting people to work for free. The Register has published many news and analysis articles critical of Wikipedia, such as this, this, this, and many others. The Register points out the many flaws in Wikipedia’s editing system, and has been critical of what it terms the cult of Wikipedia.

A critic who takes a somewhat different and perhaps more holistic view is Jason Scott, famous for running TEXTFILES.COM. Jason Scott has written many critical pieces on Wikipedia, such as this and this. He’s given three famous speeches about Wikipedia: The Great failure of Wikipedia (transcript), Mythapedia, and Brickipedia. Scott, who gave Wikipedia a try for some time and has experience with the MediaWiki software, says that Wikipedia employs “child labor” and compares it to a casino. Scott also hits on a powerful point: that it is precisely the canonicity and first-go reference nature of Wikipedia combined with the speed at which edits become visible that forms the “crack” for people to edit the site (a point he explores in depth in his Mythapedia speech).

A somewhat more distanced critic of Wikipedia is Nicholas Carr. Carr occasionally talks about Wikipedia on his blog, and his entries on Wikipedia are rarely full of undiluted optimism and admiration. For instance, his blog post on the centripetal web talks about how the Web, instead of becoming decentralized, is becoming systematically more concentrated towards fewer sites — the prime example of such a site being Wikipedia. In a later blog post titled All hail the information triumvirate!, Carr talks about how the Web, Google, and Wikipedia have come to acquire a fairly dominant position in many people’s daily life and work.

And then there’s Wikipedia’s co-founder, Larry Sanger, who left the project in 2002 after spearheading it for a little over a year. In October 2007, Sanger started a new encyclopedia project called Citizendium, The People’s Compendium, that recently crossed 10,000 articles. Sanger, who did a doctorate in philosophy, has been watching and writing about Wikipedia, and he recently came out with this philosophical paper in the Episteme journal. The paper uses what appears to be epistemological reasoning, at least part of which boils down to the idea that since experts are needed to judge the accuracy of Wikipedia, Wikipedia hasn’t managed to get rid of expertise. (Brock Read at the Chronicle wrote a short piece mentioning Sanger’s paper).

Of course, this hardly completes the list of Wikipedia critics. There’s Wikipedia Watch, started by Daniel Brandt, a confirmed Wikipedia critic. There’s the Wikipedia Review. There’s Robert McHenry, former Britannica editor-in-chief, who has written pieces critical of Wikipedia such as this and this. And there’s the self-described anti-Web-2.0 polemicist Andrew Keen, author of The Cult of the Amateur. One of the Wikipedia-critical pieces that often gets quoted is Digital Maoism: The hazards of the new online collectivism by Jaron Lanier.

Does the criticism matter?

Does criticism of Wikipedia serve any purpose (constructive or destructive) other than being an excuse to fill journal columns and blog space (I might note that the critical articles I wrote about Wikipedia have driven the most traffic to my blog)? It is hard to say. I want to argue here that, at the least, it does not serve the obvious purpose of keeping potential readers away from Wikipedia.

My reason here is simple: the costs, in terms of time, money, and effort, of accessing and using Wikipedia are just so low that any kind of cost-benefit analysis is simply too much of a stretch to be done credibly. Secondly, the hidden costs of using Wikipedia are rarely borne by the user himself or herself, or are borne with what is an extremely low probability. Even if I were to believe the point made by Jason Scott and Nick Carr that the growing monopoly of Wikipedia in terms of information is not a good thing and our own laziness is what gives Wikipedia that power, such a belief is rarely enough to stop me from going and looking up the Wikipedia entry anyway.

Although I don’t have access to Wikipedia’s usage logs, approximate statistics are provided by Wikigeist and various other Internet usage measurement services. At the time of writing, Wikigeist claims that the Wikipedia main page was viewed more than 200,000 times in the past one hour, while the hundredth entry, one on Google Earth, was viewed 475 times. Various estimates of Wikipedia usage put the number of daily pageviews in the hundreds of millions, and some studies have indicated that for a given topic with both a Britannica entry and a Wikipedia entry, the Wikipedia entry is consulted 200 times more often. The stats.grok.se service reveals how many times a particular article was viewed; the Wikipedia article on Barack Obama, for instance, was viewed four million times in January 2009, while the Wikipedia article on “normal subgroup” (a mathematical term) was viewed 2962 times in January 2009 (for some contrast, the groupprops article on normal subgroup has been viewed fewer than 1000 times in the past year).

More telling than the sheer number of pageviews, though, is the increasing extent to which I find people not even bothering to remember information, knowing that they can “find it on Wikipedia.” Here are some anecdotal examples: in many recent discussions, a friend took out an iPhone to consult a Wikipedia entry to check a point; in a discussion where a friend told me about a certain kind of mollusk that eats its own brain, he told me that for reference I could Google it and follow the link to Wikipedia; even mathematical talks seem to have parts that say, “Wikipedia defines … as …”, despite the admittedly poor treatment of mathematics in Wikipedia. Again, I think this is largely because it is so easy and quick to use Wikipedia that its many obvious disadvantages pale in comparison to the speed and ease of use.

And yet the criticism may help

The criticism of Wikipedia does little to deter potential users from using it for quick reference. By and large, it does little to change Wikipedia’s policies either, in so far as what the critics are critical of is not something that some single entity at Wikipedia can change. However, such criticism can go some way in dampening the enthusiasm of people who edit Wikipedia, and in preventing people from citing Wikipedia.

For instance, Middlebury College forbade students from citing Wikipedia for history articles. This measure was severely criticized in the blogosphere, and adjectives such as “Luddite” were used to describe it. Others have argued that Wikipedia is a good “starting point” for research but people should follow through and cite the original sources. I personally think this is good policy. Actual citation of Wikipedia articles, in so far as it does occur, should follow robust citation conventions using stable versions (i.e., a link to the version of the article at the time the citation was made should be provided, rather than simply a link to the latest version of the article, which could be significantly different from the version at the time). Since citation policy is, in general, decided by fewer people, and since it involves work that generally takes more time (writing papers), I suspect that this is indeed achievable. (Wikipedia itself has various pages, such as this one, that describe how to research with Wikipedia).
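To make the idea of citing a stable version concrete, here is a minimal sketch of building such a link. MediaWiki serves a specific revision of a page when the `oldid` parameter is passed to `index.php`; the helper name `stable_link` and the revision number used below are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode


def stable_link(title, revision_id, site="https://en.wikipedia.org"):
    """Build a link to one fixed revision of a wiki article.

    title       -- article title, e.g. "Normal subgroup"
    revision_id -- the revision consulted when the citation was made
    """
    query = urlencode({
        "title": title.replace(" ", "_"),  # MediaWiki titles use underscores
        "oldid": revision_id,              # pins the link to that revision
    })
    return f"{site}/w/index.php?{query}"


# The revision number here is made up for the example.
link = stable_link("Normal subgroup", 123456789)
```

A citation built this way keeps pointing at the text the author actually read, whereas a plain link to the article shows whatever the latest version happens to be.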

Hyperlinking from blogs could follow a similar policy. Tim Bray’s post on linking describes the dilemma of linking to Wikipedia versus linking to the original source, as well as his own way of handling the dilemma. Again, since the number of people who write blogs (well, at least blogs that get read) is considerably fewer than the number of people who use the Internet for reference, there is again a possibility that writings critical of Wikipedia can influence the behavior of bloggers. One concrete step in this direction would be if people linking to Wikipedia articles do so only after reading the article, and indicate whether the link is due to a specific point made in the article, or just as background reference. If the link is to a specific point in the article, linking to a stable version might be desirable.

The other way writings critical of Wikipedia could influence Wikipedia is in terms of the influence they exert on people wondering whether to devote time and effort to Wikipedia. In general, people put in effort on a volunteer project only if the benefits to them exceed the cost, and if writings critical of Wikipedia make people better aware of some of the costs and benefits, it could help them make more informed decisions. Unfortunately, it is unclear whether this impact will be positive or negative on the whole. The problem here is similar to a problem highlighted by research of Michael Kremer and made popular in an article by Steven Landsburg: the more we make careful people refrain from a potentially dangerous activity, the more control the careless people get over it (the argument was originally made in the context of sex and AIDS). In this case, the more that careful and conscientious editors are put off Wikipedia, the more it’ll happen that the careless, sloppy, or partisan editors will take the reins. That’s because the importance of Wikipedia as a reference is so great that there’ll always be people lining up to edit it.

That this possibility is not merely hypothetical follows from the fact that many companies and high-profile individuals actually expend considerable resources maintaining the quality of entries on themselves, while subject-matter experts in an area work hard to check the entries in the subject. For instance, in the Chronicle piece Can Wikipedia make the grade?, Brock Read says:

But as the encyclopedia’s popularity continues to grow, some professors are calling on scholars to contribute articles to Wikipedia, or at least to hone less-than-inspiring entries in the site’s vast and growing collection. Those scholars’ take is simple: If you can’t beat the Wikipedians, join ’em.

This leads to the interesting possibility that writings critical of Wikipedia may well have a negative effect in the following sense: people who might well be the most careful and conscientious editors are also the ones most likely to get put off editing Wikipedia by the arguments, and other editors get more leeway. As a result, the quality deteriorates somewhat, but the deterioration in quality is so negligible relative to the overall ease of use of Wikipedia that people still continue to use it and link to it: they just get more biased articles, less accurate facts, and slightly more instances of vandalism. Of course, this bad outcome depends on the assumption that the people likely to be put off Wikipedia are the ones who may have become its best editors.

Fickle loyalties

Despite my contention that criticism of Wikipedia does little to alter how much people read it, I doubt that too many people are loyal to Wikipedia. People’s loyalty to Wikipedia usually boils down to this mental algorithm: “Go to Google, type the term, search. If a Wikipedia entry shows up, follow it, otherwise, follow whatever else looks relevant.” Estimates suggest that between 50% and 70% of Wikipedia’s traffic is driven by search engines. This suggests that if search engines start devaluing Wikipedia content, the default mental algorithm that many people have will have to be revised: either the search engine or Wikipedia will suffer.

More importantly, what drives people to Wikipedia is, on the whole, a certain kind of brand recognition — a comfort that since this is Wikipedia, and they’ve been here before, they’ll be able to get the information they need with ease. But brand recognition alone can survive only in the absence of competing brands. If people find a single, consistent source that comes up along with Wikipedia among the top few entries, they are likely to give that source a try, at least after they start recognizing it.

In conclusion, I believe that criticism of Wikipedia can help in limited ways: it can make people more careful when citing and linking, and it can be informative to people before they get started on the job of editing Wikipedia (though this, as I pointed out, can be a two-edged sword). But a serious decrease or diversion of usage (and consequently, of editing effort) from Wikipedia can happen only in the presence of a competing resource that offers at least similar levels of ubiquity, ease of use and quick reference, and probably visibility in search engines.

CORRECTION: As Jon Awbrey noted in the comments, Wikipedia Review was not started by Daniel Brandt. The contents of the blog post have been changed to reflect the correction.

November 16, 2008

Wikipedia — side-effects

In a recent blog post, Nicholas Carr talked about the “centripetal web” — the increasing concentration and dominance of a few sites that seem to suck in links, attention and traffic. Carr says something interesting:

Wikipedia provides a great example of the formative power of the web’s centripetal force. The popular online encyclopedia is less the “sum” of human knowledge (a ridiculous idea to begin with) than the black hole of human knowledge. At heart a vast exercise in cut-and-paste paraphrasing (it explicitly bans original thinking), Wikipedia first sucks in content from other sites, then it sucks in links, then it sucks in search results, then it sucks in readers. One of the untold stories of Wikipedia is the way it has siphoned traffic from small, specialist sites, even though those sites often have better information about the topics they cover. Wikipedia articles have become the default external link for many creators of web content, not because Wikipedia is the best source but because it’s the best-known source and, generally, it’s “good enough.” Wikipedia is the lazy man’s link, and we’re all lazy men, except for those of us who are lazy women.

This is an important and oft-overlooked point: when saying whether something is good or bad, we need to look not just at the benefit it provides, but also at the opportunity cost. In the case of Wikipedia, there is at least some opportunity cost: people seeking those answers may well have gone to the “specialist sites” instead of to Wikipedia.

Of course, it’s possible to argue that specialist sites of the required quality do not exist, but it can again be argued, in a counter-response, that specialist sites would have existed in greater number and greater quality if Wikipedia didn’t exist, or at any rate, if Wikipedia weren’t so much of a default. It might be argued, for instance, that of all the free labor donated to Wikipedia, at least a fraction of it could have gone into developing and improving existing “specialist sites”. As I described in another blog post, the very structure of Wikipedia creates strong disincentives for competition.

Wikipedia, Mathworld and Planetmath

In 2003, at a time when I was in high school and used a dial-up to connect to the Internet, I was delighted to find a wonderful resource called Mathworld. I devoured Mathworld for all the hundreds of triangle centers it contained information on, and I eagerly awaited the expansion of Mathworld in other areas where it didn’t have much content. I was on a dial-up connection, so I saved many of the pages for referencing offline.

Later, in 2004, I discovered Planetmath. It wasn’t as beautifully done as Mathworld (Planetmath relies on a large contributor pool with little editorial control, as opposed to Mathworld, which has a small central team headed by Eric Weisstein that vets every entry before publication). But, perhaps because of less vetting and fewer editing restrictions, Planetmath had entries on many of the topics where Mathworld lacked entries. I found myself using both these resources, and was appreciative of the strengths and weaknesses of both models.

A little later in the year, I discovered Wikipedia. At the time, Wikipedia was fresh and young — some of the policies such as notability and verifiability had not been formulated in their current form, and many of the issues Wikipedia currently faces were non-existent. Wikipedia’s model was even more skewed towards ease of editing. It didn’t have the production-quality looks of Mathworld or the friendly fontfaces of Planetmath, but the page structure and category structure were pretty nice. Yet another addition to my repository, I thought.

Today, Wikipedia stands as one of the most dominant websites (it is ranked 8 in the Alexa rankings, for instance). More important, Wikipedia enjoyed steady growth both in contributions and usage until 2007 (contribution dropped a little in 2008). Planetmath and Mathworld, which fit Nicholas Carr’s description of “specialist sites”, on the other hand, haven’t grown that visibly. They haven’t floundered either — they continue to be at least as good as they were four years ago, and they continue to attract similar amounts of traffic. But there’s this nagging feeling I get that Wikipedia really did steal the thunder — in the absence of Wikipedia, there would have been more contributions to these sites, and more usage of these sites.

The relation between Wikipedia and Planetmath is of particular note. In 2004, Wikipedia wasn’t great when it came to math articles — a lot of expansion needed to be done to make it competitive. Planetmath released all of its articles under the GNU Free Documentation License — the same license as Wikipedia. Basically, this meant that Wikipedia could copy Planetmath articles as long as the Wikipedia article acknowledged the Planetmath article as its source. Not surprisingly, many of the Planetmath articles on topics that Wikipedia didn’t have were copied. Of course, the Planetmath page was linked to, but we know where the subsequent action involved with “developing” the articles happened — Wikipedia.

Interestingly, Wikipedia acknowledged its debt to Planetmath: at some point in time, the donations page of Wikipedia suggested donating to Planetmath, a resource Wikipedia credited for helping it get started with its mathematics articles (I cannot locate this page now, but it is possibly still lying around somewhere). Planetmath, for its part, introduced unobtrusive Google ads in the top left column, an indicator that it is perhaps not receiving enough donations.

Now, most of the mathematics students I meet are aware of Mathworld and Planetmath and look these up when using the Internet; they haven’t given up these resources in favor of Wikipedia. But they, like me, started using the Internet at a time when Wikipedia was not in a position of dominance. Will new generations of Internet users be totally unaware of the existence of specialist sites for mathematics? Will there be no interest in developing and improving such sites, for fear that the existence of an all-encompassing behemoth “encyclopedia” renders such efforts irrelevant? It is hard to say.

(Note: I, for one, am exploring the possibility of new kinds of mathematics reference resources, using the same underlying software that powers Wikipedia (the MediaWiki software). For instance, I’ve started the Group properties wiki).

The link-juice to Wikipedia

As Nick Carr pointed out in his post:

Wikipedia articles have become the default external link for many creators of web content, not because Wikipedia is the best source but because it’s the best-known source and, generally, it’s “good enough.” Wikipedia is the lazy man’s link, and we’re all lazy men, except for those of us who are lazy women.

In other words, Wikipedia isn’t winning its link-juice through the merit of its entries; it is winning links through its prominence and dominance and through people’s laziness or inability to find alternative resources. Link-juice has two consequences. The direct consequence is that the more people link to something, the more often human surfers find it. The indirect consequence is that Google PageRank and other search engine ranking algorithms make intensive use of the link structure of the web, so a large number of incoming links increases the rank of a page. This is a self-reinforcing loop: the more people link to Wikipedia, the higher Wikipedia pages rank in searches, and the higher Wikipedia pages rank in searches, the more likely it is that people using web searches to find linkable resources will link to the Wikipedia article.

To add to this, external links from Wikipedia articles are marked with the rel="nofollow" attribute, so search engines do not count them when ranking the linked pages. This is ostensibly a move to avoid spam links, but it means that, as far as search engine ranking is concerned, Wikipedia soaks up link-juice without passing any of it on.
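To make the link-juice argument concrete, here is a minimal sketch of the textbook PageRank iteration on a toy web of five pages. Everything in it is an assumption made for illustration: the page names, the link graph, the damping factor of 0.85, and the choice to model an ignored link as simply a missing edge. It is not how any real search engine is implemented, only a way to see the direction of the effect.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank over {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump" term).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no counted outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three blogs, an encyclopedia, and a specialist site.
web_followed = {
    "blog_a": ["wikipedia"],
    "blog_b": ["wikipedia"],
    "blog_c": ["wikipedia", "specialist"],
    "wikipedia": ["specialist"],  # the encyclopedia's source link is counted
    "specialist": [],
}

# Same web, but the encyclopedia's outgoing link is ignored by the ranker.
web_nofollow = dict(web_followed, wikipedia=[])

ranks_followed = pagerank(web_followed)
ranks_nofollow = pagerank(web_nofollow)

for name in ("wikipedia", "specialist"):
    print(name, round(ranks_followed[name], 3), round(ranks_nofollow[name], 3))
```

In this toy graph, the specialist site ranks lower once the encyclopedia's outgoing link stops counting, even though the link is still there for human readers to click. The blogs that link only to the encyclopedia are what give it most of its rank in the first place.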

In addition, the way people link to Wikipedia is also interesting. Often, links to Wikipedia articles do not include, in the anchor text, any indication that the link goes to the Wikipedia article. Rather, the anchor text simply gives the article name. This sends the message to readers that the article on Wikipedia is the first place to look something up.

Even experienced and respected bloggers do this. For instance, Terence Tao, a former medalist at the International Mathematical Olympiad and a mathematician famous for having settled a conjecture regarding primes in arithmetic progressions, links copiously to Wikipedia in his blog posts. To be fair, he also links to articles on Planetmath and papers on the arXiv in cases where these resources offer better information than the Wikipedia article. Nonetheless, the copious linking suggests that not every link to a Wikipedia article is there because the Wikipedia article is genuinely the best resource on the web for that content.

What can we do about it?

Ignoring a strong centripetal influence, such as an all-encompassing knowledge source, does not make us immune to its pull. There is a strong temptation to use Wikipedia as a “first source” for information. To counter this pull, it is important to be both understanding of the causes behind it and critical of its inevitability.

The success of a quick reference resource like Wikipedia stems from many factors, but two noteworthy ones are the desire to learn and grow, and laziness. Our curiosity leads us to look for new information, and our laziness keeps us from exerting undue effort in that regard. Wikipedia capitalizes on both traits in its readers (quick and dirty access to lots of stuff immediately), its contributors (easy edit-this-page), and its linkers (satisfying reader curiosity by providing web links, but using Wikipedia instead of other sites thanks to laziness). Wikipedia is what I call a “pinpoint resource”: something that provides one-stop access, very quickly, to very specific queries over a large range of possibilities.

For something to compete with Wikipedia, it must cater to these fundamental attributes. It must be quick to use, provide quality information, and encourage exploration without making things too hard. It must be modular and easily pinpointable. This doesn’t necessarily mean that everything should be modular and easily pinpointable, since there are other niches that don’t compete with Wikipedia. But to compete for the “quick-and-dirty” vote, a site has to offer at least some of what Wikipedia offers.

Of course, one of the questions that arises naturally at this point is: isn’t Wikipedia “good enough” to satisfy passing curiosities? I agree that there is usually no harm in using Wikipedia when the alternative is ignoring one’s curiosity. But I emphatically disagree with the idea that we cannot do better, in dealing with people’s passing curiosities and their desire to learn new things and teach others, than to funnel them through Wikipedia. Passing curiosities can form the basis of enduring and useful investigations, and the kind of resource people turn to at first can determine how the initial curiosity develops. For this reason, if Wikipedia is siphoning off attention from specialist sites that do a better job, not just of providing the facts, but of fostering curiosity and inviting exploration, then there is a loss at some level.
