What Is Research?

March 7, 2009

More on Wikipedia criticism

Filed under: Wikipedia — vipulnaik @ 1:54 am

My previous blog post on Wikipedia criticism generated quite a few comments. This was partly because it got covered in this forum post at The Wikipedia Review. There were several points that I made during the post that some of the commentators disagreed with, so I’ll try to elaborate the rationale behind those points a bit here, as well as what might be new insights.

Sausage: eating and making

The perspective from which I’m analysing Wikipedia is primarily an end-user perspective. The central question I explored last time was, “Does criticism of Wikipedia ultimately affect whether people read Wikipedia articles?” My rough conclusion was that there is unlikely to be any direct effect. This is particularly true of criticism that is aimed at Wikipedia’s process, because end-users care more about the end product, rather than the process.

One opposing viewpoint to this is that criticism of Wikipedia may make people less comfortable with using Wikipedia, because it might change the perception of whether the Wikipedia entry is reliable, accurate, or unbiased enough to be used. For instance, if potential web users are aware that Wikipedia entries can be edited by anybody, they may rely less on Wikipedia. I think this is plausible, but my personal experience suggests that if, even after all the possible unreliability is taken into the balance, Wikipedia is still the easiest to use, people will go to Wikipedia.

One argument that I mentioned, and that some commentators also mentioned, was that criticism of Wikipedia may have an indirect effect by discouraging people from contributing. After all, knowing the bad conditions in a sausage factory may not be that much of a disincentive for eating sausage, as long as the sausage is good. But it may discourage people from joining the sausage factory. If new contributors fail to arrive to replace the old ones who leave, then, the argument goes, Wikipedia entries will decay to the point where they get so visibly bad that even the end-users start noticing a quality decline. The argument doesn’t claim that end users care about process, but it does claim that contributors care about process.

In the last blog post, I pointed out one problem with this argument. Namely, if the most conscientious editors are the ones who are put off editing the most by criticism, the people who’re left may be the ones who are most likely to have agendas to peddle. This may result in a decline in quality — but not an obvious or visible one. That’s because the information-peddlers who are still left after some people get put off the sausage factory may also be the people who are most skilled at masking disinformation as information.

Here, I’ll try to elaborate on this, as well as give my understanding of how Wikipedia entries actually evolve.

Improve over time?

The naive belief among wiki-utopians is that Wikipedia entries keep getting better with time. The improvement may not be completely monotone, but it is largely so, with the bulk of the edits in the positive direction, punctuated by occasional vandalistic edits and good-faith edits that go against Wikipedia policy. For instance, consider this piece by Aaron Krowne, written way back in March 2005, as a critical response to this article by Robert McHenry. Another article in the Free Software Magazine gloats about the rapid growth of Wikipedia.

Wikinomics, a book by Tapscott and Williams, says that [Wikipedia] is built on the premise that collaboration among users will improve content over time, and later continues on this theme:

Unlike a traditional hierarchical company where people work for managers and money, self-motivated volunteers like Elf are the reason why order prevails over chaos in what might otherwise be an impossibly messy editorial process. Wales calls it a Darwinian evolutionary process, where content improves as it goes through iterations of changes and edits. Each Wikipedia article has been edited an average of twenty times, and for newer entries that number is higher.

In his book The Long Tail, Chris Anderson writes:

What makes Wikipedia really extraordinary is that it improves over time, organically healing itself as if its huge and growing army of tenders were an immune system, ever vigilant and quick to respond to anything that threatens the organism. And like a biological system, it evolves, selecting for traits that help it stay one step ahead of the predators and pathogens in the ecosystem.

Not everybody believes that Wikipedia articles keep increasing in quality. In this philosophical essay, Larry Sanger (a co-founder of Wikipedia) makes the following interesting hypothesis: the quality of a Wikipedia entry does a random walk about the best possible value that the most difficult of the editors watching it can allow. This is merely a hypothesis, born out of Sanger’s experience, and Sanger makes no attempts to provide quantitative or even anecdotal verification of the hypothesis. Others have pointed out that many of the Wikipedia articles that receive featured article status (some of which even make it to the Wikipedia front page) later revert to being middling articles. Consider, for instance, this excerpt from an article by Jason Scott on the general failure of Wikipedia:

It is not hard, browsing over historical edits to majorly contended Wikipedia articles, to see the slow erosion of facts and ideas according to people trying to implement their own idea of neutrality or meaning on top of it. Even if the person who originally wrote a section was well-informed, any huckleberry can wander along and scrawl crayon on it. This does not eventually lead to an improved final entry.
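Sanger’s random-walk hypothesis, mentioned just before the excerpt above, can be made concrete with a toy simulation. This is only a sketch of one possible reading: I assume quality is a number between 0 and 1, that each edit nudges it up or down by a fixed step, and that the most difficult editor imposes a hard ceiling. The function name, the ceiling, the step size, and the number of edits are all hypothetical choices of mine, not anything Sanger specifies.

```python
import random


def simulate_quality(ceiling=0.6, start=0.3, step=0.05, edits=1000, seed=0):
    """Toy model: article quality as a random walk capped by a ceiling.

    Assumptions (mine, for illustration only): quality lies in [0, 1],
    every edit moves it up or down by `step` with equal probability,
    and it can never exceed `ceiling`, the best value that the most
    difficult watching editor will allow.
    """
    rng = random.Random(seed)
    quality = start
    history = [quality]
    for _ in range(edits):
        quality += rng.choice((-step, step))
        # Clamp to the allowed range: never below 0, never above the ceiling.
        quality = max(0.0, min(ceiling, quality))
        history.append(quality)
    return history


if __name__ == "__main__":
    history = simulate_quality()
    print("highest quality reached:", max(history))
    print("final quality:", history[-1])
```

In this toy model, more edits do not push quality past the ceiling; they only move it around beneath it, which is the contrast with the “improves over time” picture quoted earlier.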

My view: precarious equilibrium

There is exactly one Wikipedia article on every topic. Given Wikipedia’s dominance, this is the canonical source of information on the subject for hundreds of thousands (perhaps millions) of people. This canonicity is part of what makes Wikipedia so appealing to end-users, but it also means that even minor disagreements among potential editors of the article can become pretty significant when it comes to controlling that scarce and extremely valuable resource: the content of the Wikipedia article. After all, it feels like it pays off to put up a fight if fighting a little more can affect what thousands of people will learn about the topic.

A Wikipedia article on a controversial topic typically settles into a precarious equilibrium between different factions or interest groups that want to take the article in different directions (or prevent it from going in different directions). Consider, for instance, articles such as evolution, intelligent design, abortion, and alternative medicine topics ranging from well-known topics such as homeopathy to relatively lesser known topics such as Emotional Freedom Techniques.

The Emotional Freedom Techniques article is an example of such an equilibrium. There are roughly two camps. On one side are the proponents/believers, or people who for other reasons feel that the article should contain more details about the subject. On the other side are the skeptics/disbelievers, or people who otherwise feel that putting too much information on an unproven therapeutic approach may in fact amount to an endorsement by Wikipedia. At some point long past, the article was an editing hotbed (relative to its current status). It was much longer, with a lot of discussion on the talk page; see, for instance, this revision dated 9 January 2006.

For the next year, till around February 2007, the article remained in roughly the same state, with proponents adding in positive details and removing negative details and critics doing the opposite. On 30 January 2007, the article was nominated for deletion (here is an archived copy of the deletion discussion). A sequence of edits in the next three weeks gradually reduced the scope of the article to a considerably smaller one. The idea was that in order to “save” the article, it needed to be reduced in scope. The critics had managed to disturb the past equilibrium. By March 2007, a new equilibrium had been established, and modulo the addition and removal of a few references, this new equilibrium has been maintained for the past two years.

My experience suggests that for controversial topics, this is typical: there are two or more camps of editors, and depending on their relative strengths, the article enters a certain state around which it varies a bit but roughly remains the same. Once this equilibrium has been established, it is not easy to break. One way of breaking the equilibrium is using a “war of attrition”: keep making changes in your direction until the other person gets tired and walks away. Another one is to recruit other forces to help you, and a third approach is to threaten drastic measures, such as deletion.

Of course, the story of conflicting agendas plays out even in relatively non-controversial topics. Even editors who aren’t ideologically opposed to each other can find a lot of different things to quibble about, thus barring progress of a Wikipedia article. In the less controversial and more low-profile cases, it isn’t so much blood-curdling fights that create an impasse but simply a lack of common vision on how to take the article forward. Different editors come to Wikipedia with their own baggage and agendas, even simple agendas on how mathematics or physics should be written or what restaurants should be mentioned in the article on a local community. Typically, editors join feeling enthusiastic that they’ll be able to share their ideas and knowledge with the rest of the world, and also learn from how others are sharing. Once they realize that others holding opposing views are going to work in orthogonal or at times opposing directions, they either get put off, or they enter a wiki-war. In some cases, I think there is a clear situation of some people seeking to do constructive editing and others trying to obstruct. In most cases, it is a bunch of minor ideological mismatches that lead to people either getting put off Wikipedia or choosing to get aggressive to defend the articles.

Thus, there are roughly two kinds of articles: the controversial ones where there is a precarious equilibrium between different interest groups trying to pull it in their direction, and the relatively non-controversial ones where competing agendas and views on how things “should be” written lead, not so much to warring, as to a simple lack of activity. The former happens in cases where the stakes are more significant, and where people can feel good and hot about taking particular stands. The latter happens in more mundane things such as normal subgroup or T. Nagar where people simply couldn’t be bothered to fight.

The precarious equilibrium exists at levels higher than the level of the individual article. For instance, Wikipedia has a long history of a battle between inclusionists and deletionists. In the beginning, when the encyclopedia was small, deletionists hardly existed. As the encyclopedia became larger, deletionists started gaining the upper hand, as the need for keeping the encyclopedia free of garbage began to be appreciated. The deletionists had their heyday in 2005-2006, but inclusionists have started gaining ground again. In a recent Guardian column, Seth Finkelstein describes some of the battles and underlying agendas.

Stagnation is not the same as death

The precarious equilibrium for controversial topics and the relative stagnation for non-controversial topics may suggest that Wikipedia article quality could well go into decline, to the point where users start noticing. However, this is probably not going to happen, for many reasons. First, encyclopedia articles on non-controversial topics are often the kind of thing that does not get outdated anyway. For instance, the definition of a normal subgroup is unlikely to ever change. Similarly, basic definitions such as friction in physics and historical articles such as Aristotle aren’t likely to get outdated either.

For controversial topics, a precarious equilibrium may well inhibit development, but then again, the whole problem is that nobody can define “development” of an article in clear terms.

But the bigger point, both for controversial and non-controversial topics, is that it is highly unlikely that Wikipedia article quality will actually decline significantly. So, a stagnation or stabilization in article quality can spell doom for Wikipedia only if some competing resource is trying to improve. Stagnation or stability equals death only in the presence of serious competition.

Okay, so can competition succeed?

The problem here is that the success of an encyclopedia effort requires a large amount of collation of people’s efforts, and this kind of collation doesn’t happen easily. What Wikipedia has managed to do is give enough people the impression that it can be a useful place to pool their efforts. I remain unconvinced that Wikipedia has actually managed to solve the many inherent problems of large-scale collaboration, but the very fact that it can convince a lot of people, even if for a short period of time for each person, to spend some time on Wikipedia, is impressive.

People often join Wikipedia sold on the idea of working together with others, sharing their ideas, and learning from others. But not too many people are really willing to learn or change their worldviews, and the one-article paradigm of Wikipedia really forces a lot of conflicts out into the open. Thus, after trying a bit to make their voice heard amidst the din, a number of people leave.

Of course, most people who leave are bound to think of themselves as in the “right” and stubborn other Wikipedians as in the wrong, which creates a temporary bonhomie between disgruntled ex-Wikipedians. Such temporary good feelings towards one’s fellow wronged might result in a new idealistic commitment to create something better than Wikipedia. But beneath that is still the fact that many of the disgruntled ex-Wikipedians have agendas that compete with and are incompatible with each other, and new efforts that seek to do better than Wikipedia haven’t yet found a way to overcome this problem. (Fundamentally, I don’t think there is a way to overcome it.) There are many examples of Wikipedia forks that have rapidly settled down into obscurity. Examples are Veropedia, run by ex-Wikipedian Danny Wool, and Cassiopedia, The True Encyclopedia (this already seems to have vanished and been replaced by some new wiki). Other encyclopedias that try to do Wikipedia right include, for instance, Conservapedia, intended as a replacement for Wikipedia for conservative Christians.

An example of a could-be-better-than-Wikipedia encyclopedia effort is Citizendium, founded by Wikipedia co-founder Larry Sanger. Sanger, and many of the others who work on Citizendium, seem bent on avoiding the many edit wars and other conflict situations that arise in Wikipedia. So far, so good — the Citizendium has survived and has been growing slowly for the last one and a half years. Yet, beyond the substantially greater civility and the substantially lesser activity, there is little to distinguish Citizendium from Wikipedia in terms of the competing agendas of its users. The main difference right now, as far as I can see, is that the Citizendium articles typically settle into an equilibrium of inactivity (which is similar to most Wikipedia articles on non-controversial topics) as opposed to a precarious equilibrium born of warring parties.

I personally do not think that there is room for another Wikipedia-like endeavor, at least in the near future. This does not mean that everything that seeks to do Wikipedia better will necessarily fail outright. It is probable that Citizendium will continue to grow over the next few years, and may at some stage become good enough as a general-purpose encyclopedia. Nonetheless, it is unlikely to become seriously competitive with Wikipedia in the near future.

The direction in which competition to Wikipedia could indeed be dangerous is the direction of an increased number of more specialist sites that help provide answers to people’s queries in somewhat more specialized topics. These specialized sites, of course, have their own conflict problems, but may be able to overcome these problems better simply because there is no single one of them. This allows people with competing agendas to work for competing specialist sites, rather than battle needlessly on the same turf. The best example of a somewhat specialized wiki-based site is Wikitravel, which is a great site for travel information. Since this is a relatively more narrow-focused site, it has clearer policy that reduces conflict over the structuring of articles. There are many others at varying ends of the spectrum between general and special. For instance, there’s WikiHow, a wiki-based how-to manual, which also seems to be doing pretty well for itself. This sacrifices some of the canonicity of the Wikipedia entry by allowing different how-to articles to be written by different people. There are a lot of substantially more specialized and narrow efforts, ranging from the hastily conceived to the well-planned. My own efforts at a group theory wiki, followed by an effort to generalize this to subject wikis in general, are one small example.

Yes, but how can diversified competition succeed?

If, as I believe, a challenge to Wikipedia can be presented only through a large number of specialist sites that compete healthily with each other and with Wikipedia, we have a bit of a paradox. The paradox is that the very reason people go first to Wikipedia is so that they do not need to navigate through or remember a bunch of different sites — Wikipedia is useful as a one-word solution to the problem of finding information.

The paradox isn’t all that big once we remember that for each specialized topic, there’s likely to be only one, or a few, places to go to. And the other important point is that within each resource, locating the article or piece of information that’s needed is pretty fast. Imagine, for instance, that people surfing casually for medical information, instead of going to the Wikipedia entry, were guided towards a collection of competing medical information websites. At first sight, a random surfer may just pick one website, and look up information there. If the surfer found that information well-presented and useful, the surfer may continue to visit that specialized site for medical information of that sort. Another surfer with somewhat different tastes may not like that first pick and may try a different medical information site. Since the medical information sites need to compete for users, they strive to provide better information that answers users’ needs more effectively.

There are two differences with Wikipedia: first, different people use different competing sites. Second, each site restricts itself to something it can specialize in, so the choices a person makes with regard to medical information sites can be independent of the choices the person makes with regard to sports information sites or glamour/fashion sites.

The idea that diverse competition can succeed against Wikipedia is also possibly an affront to people who view the collaborative principles behind Wikipedia as morally superior to the cut-throat competition that characterizes much of the messy market. Somehow, competition seems to be inherently more destructive and wasteful than the “working together” that Wikipedia engenders. Be that as it may, I think that a more diverse range of knowledge offerings can actually help reduce wasted effort, as people spend less effort fighting each other and canceling each other’s efforts, and more effort building whatever things they believe in.

Further, there are ways to ensure that a spirit of competition coexists with a spirit of sharing of ideas and knowledge. Academic research and software development often follow extremely open sharing principles, and yet can be fiercely competitive. The key thing here is that since a lot of independent entities are separately pursuing visions and borrowing ideas from each other, there is little destructive warring for the control of a single scarce resource. A spirit of sharing and openness can be backed by open content licenses such as the Creative Commons licenses.

What about search engine and link dominance?

Back at the beginning of the 21st century, when Google was nascent and Wikipedia non-existent, there were a number of books highlighting possibly disturbing tendencies that might develop on the Internet. Among these was Republic.com by Cass R. Sunstein (also co-author of Nudge, and now a member of the Obama administration). Sunstein warned of the dangers of group polarization on the Internet, with extremely personalized surfing, linking only to similar sites. This, Sunstein argued, could potentially lead to two bad outcomes: the absence of a public space, where issues of general interest could be addressed, and the total non-exposure to opposing or different thoughts and ideas.

The problem today seems to be of a somewhat different nature: the presence of a few sites that dominate much of Internet surfing. Wikipedia is increasingly becoming a destination for information-seekers, both as a direct destination and as a destination via search engines and links. As I explained earlier, I believe that the canonicity of Wikipedia as an information source is what makes it so attractive to edit and control.

That is why some people have suggested that Google and other search engines that place Wikipedia highly are largely responsible for Wikipedia’s success. Some research has shown that about half of the visits to Wikipedia still come through search engines. This suggests a “solution” to the problem of Wikipedia dominance: demote Wikipedia in the search engines.

I don’t think such a solution will either work or make sense.

First, unlike Wikipedia, which faces no serious competition, search engines face tremendous competition. Google may be a market leader but it cannot afford to sit back and relax. Search engines also have a strong incentive to please their users. This means that if Google is placing Wikipedia high up in its entries, then it has a strong incentive to do so: that’s what its users want. Whether this placement is due to conscious tweaking by Google employees or simply an unforeseen consequence of Google’s PageRank algorithms is unclear, but if it were something that displeased users, Google would fix it.

I suspect that a lot of the traffic that comes to Wikipedia through search engines actually comes through algorithms of the sort: “Do a search and pick a Wikipedia entry if it shows up in the top five, otherwise pick whatever seems relevant.” Here, search engines are being used as a “Wikipedia+”: Wikipedia, plus the rest of the web if Wikipedia failed. If the search engine failed to turn up Wikipedia in the top five, and the user later found a Wikipedia entry on the topic that he or she felt should have been up there, the user may start bypassing the search engine and go directly to Wikipedia.
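The “Wikipedia+” rule of thumb above is simple enough to write down as code. What follows is a rough sketch, not a claim about how anybody actually browses: the function name pick_result is hypothetical, the result list is just a list of URLs in ranked order, and I stand in for “whatever seems relevant” with the top-ranked result, which is an assumption on my part.

```python
from urllib.parse import urlparse


def pick_result(results, top_n=5):
    """Return the URL a 'Wikipedia+' reader would click, or None.

    `results` is a list of URLs in ranked order. If a Wikipedia page
    appears among the first `top_n` results, pick it. Otherwise fall
    back to the first result (an assumed stand-in for "whatever seems
    relevant").
    """
    for url in results[:top_n]:
        host = urlparse(url).hostname or ""
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            return url
    return results[0] if results else None


if __name__ == "__main__":
    ranked = [
        "https://example.com/group-theory-notes",
        "https://en.wikipedia.org/wiki/Normal_subgroup",
        "https://example.org/algebra",
    ]
    print(pick_result(ranked))
```

Under this rule, the search engine’s ranking among non-Wikipedia pages only matters when Wikipedia is missing from the top five, which is why I think of the search engine here as a front end to Wikipedia rather than a competitor to it.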

Second, even if search engines stop favoring Wikipedia, the default-to-Wikipedia rule runs through many things other than search engines. Many smartphone applications and other Internet-connected devices such as Amazon’s Kindle give Wikipedia privileged status: for instance, the Kindle enables special access to Wikipedia for no extra charge. Some smartphone lookup services utilize Wikipedia articles as a knowledge base.

But even if all these big guns decided to disfavor Wikipedia, the link dominance of Wikipedia is too widespread: there are any number of blogs that provide links to Wikipedia articles on topics, many of them probably based not so much on the particular qualities of the Wikipedia entry as on the brand name.

So yes, I do believe that if all linkers, commentators, search engines, and smartphone applications suddenly revolted against Wikipedia, people, finding a lot fewer links to Wikipedia, may start forgetting Wikipedia, or at any rate, make it less of a default. I just don’t see any reason for such a collective epiphany to occur.

2 Comments »

  1. I’ll continue to suggest that your current analysis is still too Flatlandish to show the forest or the trees.

    In my view, this all goes back to rather ancient (“Pandemonium” type) models of collective intelligence that are doomed to fail, but these models are currently all the rage simply because they are easier to implement and don’t take a lot of thought at start-up time.

    I started a thread at The Wikipedia Review to examine this issue more carefully.

    Fallacies Of Dyadicism, Connectionism, Behaviorism

    It’s the sort of thing that might take some time and real thought.

    Jon Awbrey

    Comment by Jon Awbrey — March 16, 2009 @ 2:06 pm

  2. All Wikipedia articles are biased. It would be much better to have many articles written from different points of view than to have one article claiming it is written from a neutral point of view when it is not. There are many statements in Wikipedia articles with the notation (citation needed), yet many statements are immediately removed if the author does not include a citation. Wikipedia articles are filled with original research and opinions of the authors even though this is a violation of Wikipedia rules. Wikipedia has all kinds of arcane rules. If you know them you can use them to chase knowledgeable editors away. I use Wikipedia, but I know the information provided is often biased, incomplete, one-sided, and often nothing more than a flight of fancy.

    Comment by frank0truth — June 25, 2012 @ 6:35 am

