What Is Research?

February 23, 2009

Wikipedia criticism, and why it fails to matter

Filed under: Wikipedia — vipulnaik @ 10:50 pm

Over the past few months, I’ve been collecting newspaper and magazine articles about the phenomenon of Wikipedia. (I have also written two blog posts on Wikipedia myself, here and here.) Prominent among the Wikipedia critics is Seth Finkelstein, a consulting programmer who does technology journalism on the side and publishes columns in the Guardian. Seth’s criticism is largely related to the politics of getting people to work for free. The Register has published many news and analysis articles critical of Wikipedia, such as this, this, this, and many others. The Register points out the many flaws in Wikipedia’s editing system, and has been critical of what it terms the cult of Wikipedia.

A critic who takes a somewhat different and perhaps more holistic view is Jason Scott, famous for running TEXTFILES.COM. Jason Scott has written many critical pieces on Wikipedia, such as this and this. He’s given three famous speeches about Wikipedia: The Great Failure of Wikipedia (transcript), Mythapedia, and Brickipedia. Scott, who gave Wikipedia a try for some time and has experience with the MediaWiki software, says that Wikipedia employs “child labor” and compares it to a casino. Scott also hits on a powerful point: that it is precisely the canonicity and first-go reference nature of Wikipedia, combined with the speed at which edits become visible, that forms the “crack” for people to edit the site (a point he explores in depth in his Mythapedia speech).

A somewhat more distanced critic of Wikipedia is Nicholas Carr. Carr occasionally talks about Wikipedia on his blog, and his entries on Wikipedia are rarely full of undiluted optimism and admiration. For instance, his blog post on the centripetal web talks about how the Web, instead of becoming decentralized, is becoming systematically more concentrated towards fewer sites — the prime example of such a site being Wikipedia. In a later blog post titled All hail the information triumvirate!, Carr talks about how the Web, Google, and Wikipedia have come to acquire a fairly dominant position in many people’s daily life and work.

And then there’s Wikipedia’s co-founder, Larry Sanger, who left the project in 2002 after spearheading it for a little over a year. In October 2007, Sanger started a new encyclopedia project called Citizendium, The People’s Compendium, that recently crossed 10,000 articles. Sanger, who did a doctorate in philosophy, has been watching and writing about Wikipedia, and he recently came out with this philosophical paper in the Episteme journal. The paper uses what appears to be epistemological reasoning, at least part of which boils down to the idea that since experts are needed to judge the accuracy of Wikipedia, Wikipedia hasn’t managed to get rid of expertise. (Brock Read at the Chronicle wrote a short piece mentioning Sanger’s paper).

Of course, this hardly completes the list of Wikipedia critics. There’s Wikipedia Watch, started by Daniel Brandt, a confirmed Wikipedia critic. There’s the Wikipedia Review. There’s Robert McHenry, former Britannica editor-in-chief, who has written pieces critical of Wikipedia such as this and this. And there’s the self-described anti-Web-2.0 polemicist Andrew Keen, author of The Cult of the Amateur. One of the Wikipedia-critical pieces that often gets quoted is Digital Maoism: The hazards of the new online collectivism by Jaron Lanier.

Does the criticism matter?

Does criticism of Wikipedia serve any purpose (constructive or destructive) other than being an excuse to fill journal columns and blog space? (I might note that the critical articles I wrote about Wikipedia have driven the most traffic to my blog.) It is hard to say. I want to argue here that, at the very least, it does not serve the obvious purpose of keeping potential readers away from Wikipedia.

My reasons here are simple. First, the costs, in terms of time, money, and effort, of accessing and using Wikipedia are so low that any kind of cost-benefit analysis is too much of a stretch to be done credibly. Secondly, the hidden costs of using Wikipedia are rarely borne by the user himself or herself, or are borne only with extremely low probability. Even if I were to believe the point made by Jason Scott and Nick Carr that the growing monopoly of Wikipedia in terms of information is not a good thing and our own laziness is what gives Wikipedia that power, such a belief is rarely enough to stop me from going and looking up the Wikipedia entry anyway.

I don’t have access to Wikipedia’s usage logs, but approximate statistics are provided by Wikigeist and various other Internet usage measurement services. At the time of writing, Wikigeist claims that the Wikipedia main page was viewed more than 200,000 times in the past hour, while the hundredth entry, one on Google Earth, was viewed 475 times. Various estimates of Wikipedia usage put the number of daily pageviews in the hundreds of millions, and some studies have indicated that for a given topic with both a Britannica entry and a Wikipedia entry, the Wikipedia entry is consulted 200 times more often. The stats.grok.se service reveals how many times a particular article was viewed; the Wikipedia article on Barack Obama, for instance, was viewed four million times in January 2009, while the Wikipedia article on “normal subgroup” (a mathematical term) was viewed 2962 times in January 2009 (for some contrast, the groupprops article on normal subgroup has been viewed fewer than 1000 times in the past year).

More telling than the sheer number of pageviews, though, is the increasing extent to which I find people not even bothering to remember information, knowing that they can “find it on Wikipedia.” Here are some anecdotal examples: in many recent discussions, a friend took out an iPhone to consult a Wikipedia entry to check a point; in a discussion where a friend told me about a certain kind of mollusk that eats its own brain, he told me that for reference I could Google it and follow the link to Wikipedia; even mathematical talks seem to have parts that say, “Wikipedia defines … as …”, despite the admittedly poor treatment of mathematics in Wikipedia. Again, I think this is largely because it is so easy and quick to use Wikipedia that its many obvious disadvantages pale in comparison to the speed and ease of use.

And yet the criticism may help

The criticism of Wikipedia does little to deter potential users from using it for quick reference. By and large, it does little to change Wikipedia’s policies either, in so far as what the critics are critical of is not something that some single entity at Wikipedia can change. However, such criticism can go some way in dampening the enthusiasm of people who edit Wikipedia, and in preventing people from citing Wikipedia.

For instance, Middlebury College forbade students from citing Wikipedia for history articles. This measure was severely criticized in the blogosphere, and adjectives such as “Luddite” were used to describe it. Others have argued that Wikipedia is a good “starting point” for research but people should follow through and cite the original sources. I personally think this is good policy. Actual citation of Wikipedia articles, in so far as it does occur, should follow robust citation conventions using stable versions (i.e., a link to the version of the article at the time the citation was made should be provided, rather than simply a link to the latest version of the article, which could be significantly different from the version at the time). Since citation policy is, in general, decided by fewer people, and since it involves work that generally takes more time (writing papers), I suspect that this is indeed achievable. (Wikipedia itself has various pages, such as this one, that describe how to research with Wikipedia).
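To make the stable-version convention concrete: MediaWiki serves one fixed revision of a page when the URL carries an `oldid` parameter. Here is a minimal sketch in Python that builds such a link. The function name and the revision number in the usage note are made up for illustration; a real citation would use the revision ID shown in the article's history.

```python
from urllib.parse import urlencode


def stable_wikipedia_link(title: str, revision_id: int, lang: str = "en") -> str:
    """Build a link to one fixed revision of a Wikipedia article.

    `revision_id` is the number MediaWiki assigns to each saved edit.
    Nothing here checks that the revision exists; this only formats the URL.
    """
    # MediaWiki page titles use underscores in place of spaces.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": revision_id})
    return f"https://{lang}.wikipedia.org/w/index.php?{query}"


# Hypothetical revision ID, used only to show the shape of the link.
print(stable_wikipedia_link("Normal subgroup", 12345))
```

A citation built this way keeps pointing at the text the author actually read, whatever happens to the article afterwards.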

Hyperlinking from blogs could follow a similar policy. Tim Bray’s post on linking describes the dilemma of linking to Wikipedia versus linking to the original source, as well as his own way of handling the dilemma. Since the number of people who write blogs (well, at least blogs that get read) is considerably smaller than the number of people who use the Internet for reference, there is again a possibility that writings critical of Wikipedia can influence the behavior of bloggers. One concrete step in this direction would be for people linking to Wikipedia articles to do so only after reading the article, and to indicate whether the link is due to a specific point made in the article, or just a background reference. If the link is to a specific point in the article, linking to a stable version might be desirable.

The other way writings critical of Wikipedia could influence Wikipedia is in terms of the influence they exert on people wondering whether to devote time and effort to Wikipedia. In general, people put in effort on a volunteer project only if the benefits to them exceed the cost, and if writings critical of Wikipedia make people better aware of some of the costs and benefits, it could help them make more informed decisions. Unfortunately, it is unclear whether this impact will be positive or negative on the whole. The problem here is similar to a problem highlighted by research by Michael Kremer and made popular in an article by Steven Landsburg: the more we make careful people refrain from a potentially dangerous activity, the more control the careless people get over it (the argument was originally made in the context of sex and AIDS). In this case, the more that careful and conscientious editors are put off Wikipedia, the more the careless, sloppy, or partisan editors will take the reins. That’s because the importance of Wikipedia as a reference is so great that there’ll always be people lining up to edit it.

That this possibility is not merely hypothetical follows from the fact that many companies and high-profile individuals actually expend considerable resources maintaining the quality of entries on themselves, while subject-matter experts in an area work hard to check the entries in the subject. For instance, in the Chronicle piece Can Wikipedia make the grade?, Brock Read says:

But as the encyclopedia’s popularity continues to grow, some
professors are calling on scholars to contribute articles to
Wikipedia, or at least to hone less-than-inspiring entries in the
site’s vast and growing collection. Those scholars’ take is simple: If
you can’t beat the Wikipedians, join ’em.

This leads to the interesting possibility that writings critical of Wikipedia may well have a negative effect in the following sense: people who might well be the most careful and conscientious editors are also the ones most likely to get put off editing Wikipedia by the arguments, and other editors get more leeway. As a result, the quality deteriorates somewhat, but the deterioration in quality is so small relative to the overall ease of use of Wikipedia that people still continue to use it and link to it: they just get more biased articles, less accurate facts, and slightly more instances of vandalism. Of course, this bad outcome depends on the assumption that the people likely to be put off Wikipedia are the ones who may have become its best editors.

Fickle loyalties

Despite my contention that criticism of Wikipedia does little to alter how much people read it, I doubt that too many people are loyal to Wikipedia. People’s loyalty to Wikipedia usually boils down to this mental algorithm: “Go to Google, type the term, search. If a Wikipedia entry shows up, follow it, otherwise, follow whatever else looks relevant.” Estimates suggest that between 50% and 70% of Wikipedia’s traffic is driven by search engines. This suggests that if search engines start devaluing Wikipedia content, the default mental algorithm that many people have will have to be revised: either the search engine or Wikipedia will suffer.
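The mental algorithm above is simple enough to write down. The following Python sketch is only my illustration of it: the function name is invented, and “whatever else looks relevant” is modeled, as a simplifying assumption, by taking the top-ranked result.

```python
from urllib.parse import urlparse


def pick_result(ranked_urls):
    """Return the link a 'default' reader follows from a ranked result list.

    Assumption: if no Wikipedia entry is present, the reader takes the
    top-ranked result. Returns None for an empty list.
    """
    for url in ranked_urls:
        host = urlparse(url).hostname or ""
        # Match wikipedia.org and its language subdomains, e.g. en.wikipedia.org.
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            return url
    return ranked_urls[0] if ranked_urls else None


results = [
    "https://example.com/some-article",
    "https://en.wikipedia.org/wiki/Mollusca",
]
print(pick_result(results))
```

Note that the reader in this sketch never goes to Wikipedia directly. If the search engine stops returning the Wikipedia entry, the function falls through to some other site, which is the sense in which the loyalty is fickle.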

More importantly, what drives people to Wikipedia is, on the whole, a certain kind of brand recognition — a comfort that since this is Wikipedia, and they’ve been here before, they’ll be able to get the information they need with ease. But brand recognition alone can survive only in the absence of competing brands. If people find a single, consistent source that comes up along with Wikipedia among the top few entries, they are likely to give that source a try, at least after they start recognizing it.

In conclusion, I believe that criticism of Wikipedia can help in limited ways: it can make people more careful when citing and linking, and it can be informative to people before they get started on the job of editing Wikipedia (though this, as I pointed out, can be a two-edged sword). But a serious decrease or diversion of usage (and consequently, of editing effort) from Wikipedia can happen only in the presence of a competing resource that offers at least similar levels of ubiquity, ease of use and quick reference, and probably visibility in search engines.

CORRECTION: As Jon Awbrey noted in the comments, Wikipedia Review was not started by Daniel Brandt. The contents of the blog post have been changed to reflect the correction.

Doing it oneself versus spoonfeeding

In previous posts titled knowledge matters and intuition in research, I argued that building good intuition and skill for research requires a strong knowledge and experience base. In this post, I’m going to talk about a related theme, which is also one of my pet themes: my rant at the misconception that doing things on one’s own is important for success.

The belief that I’m attacking

It is believed in certain circles, particularly among academics, that doing things by oneself, working out details on one’s own, rather than looking them up or asking others, is a necessary step towards developing proper understanding and skills.

One guise that this belief takes is a slew of learning paradigms that go under names such as “experiential learning”, “inquiry-based learning”, “exploratory learning”, and the like. Of course, each of these learning paradigms is complex, and the paradigms differ from each other. Further, each paradigm is implemented in a variety of different ways. My limited experience with these paradigms indicates that there is a core belief common to the paradigms (I may be wrong here), which is that it is important for people to do things by themselves rather than have these things told to them by others. An extreme believer of this kind may consider with disdain the idea of simply following or reading what others have done, but a more moderate and mainstream stance might be that working things out for oneself, rather than following what others have done, is generally preferable, and following others is a kind of imperfect substitute that we nonetheless often need to accept because of constraints of time.

Another closely related theme is the fact that exploratory and inquiry-based methods focus more on skills and approaches than on knowledge. This might be related to the general view of knowledge as something inferior to, or less important than, skill, attitude, and approach. Which is why, in certain circles, the person who “is smart” and “thinks sharply” is considered superior to the person who merely “knows a lot”. This page, for instance, talks about how inquiry-based learning differs from the traditional knowledge-based approach to learning because it focuses more on “information-processing skills” and “problem-solving skills”. (Note: I discovered the page via a Google search a few months back, and am not certain about how mainstream its descriptions are.) (Also note: I’ve discussed more about this later in the post, where I point out other sides of this issue.)

Closely related to the theme of exploration and skills-more-than-knowledge is the theme of minimal guidance. In this view, guidance from others should be minimal, and students should discover things their own way. There are many who argue both for and against such positions. For instance, a paper (Kirschner, Sweller, and Clark) that I discovered via Wikipedia argues why minimally guided instruction does not work. Nonetheless, there seems to be a general treatment of exploration, self-discovery, and skills-over-knowledge as “feel-good” things.

Partial truth to the importance of exploration

As an in-the-wings researcher (I am currently pursuing a doctoral degree in mathematics) I definitely understand the importance of exploration. I have personally done a lot of exploration, much of it to fill minor knowledge gaps or raise interesting but not-too-deep questions. And some of my exploration has led to interesting and difficult questions. For instance, I came up with a notion of extensible automorphism for groups and made a conjecture that every extensible automorphism is inner. The original motivation behind the conjecture was a direction of exploration that turned out to have little to do with the partial resolution that I have achieved on the problem. (With ideas and guidance from many others including Isaacs, Ramanan, Alperin, and Glauberman, I’ve proved that for finite groups, any finite-extensible automorphism is class-preserving, and any extensible automorphism sends subgroups to conjugate subgroups). And I’ve also had ideas that have led to other questions (most of which were easy to solve, while some are still unsolved) and others that have led to structures that might just be of use.

In other words, I’m no stranger to exploration in a mathematical context. Nor is my exploratory attitude strictly restricted to group theory. I take a strongly exploratory attitude to many of the things I learn, including things that are probably of little research relevance to me. Nor am I unique in this respect. Most successful researchers and learners that I’ve had the opportunity to interact with are seasoned explorers. While different people have different exploration styles, there are few who resist the very idea of exploration. Frankly, there would be little research or innovation (whether academic or commercial) if people didn’t have an exploratory mindset.

So I’m all for encouraging exploration. So what am I really against? The idea that, in general, people are better off trying to figure things out for themselves rather than refer to existing solutions or existing approaches. Most of the exploration that I’ve talked about here isn’t exploration undertaken because of ignorance of existing methods — it is exploration that builds upon a fairly comprehensive knowledge and understanding of existing approaches. What I’m questioning is the wisdom of the idea that by forcing people to work out and explore solutions to basic problems while depriving them of existing resources that solve those problems, we can impart problem-solving and information-processing skills that would otherwise be hard to come by.

Another partial truth: when deprivation helps

Depriving people of key bits of knowledge can help in certain cases. These are situations where certain mental connections need to be formed, and these connections are best formed when the person works through the problem himself or herself, and makes the key connection. In these cases, simply being told the connection may not provide enough shock value, insight value, richness or depth for the connection to be made firmly.

The typical example is the insight puzzle. By insight puzzle, I mean a puzzle whose solution relies on a novel way of interpreting something that already exists. Here, simply telling the learner to “think out of the box” doesn’t help the learner solve the insight puzzle. However, if a situation where a similar insight is used is presented shortly before administering the puzzle, the learner has a high chance of solving the puzzle.

The research on insight puzzles reveals, however, that in order to maximize the chances of the learner getting it, the similar insight should be presented in a way that forces the learner to have the insight by himself/herself. In other words, the learner should be forced to “think through” the matter before seeing the problem. The classic example of this is a puzzle that involves a second use of the word “marry” — a clergyman or priest marrying a couple. One group of people were presented, before the puzzle, with a passage that involved a clergyman marrying couples. Very few people in this group got the solution. Another group of people were presented a similar passage, except that this passage changed the order of sentences so that the reader had to pause to confront the two meanings of “marry”. People in this second group scored better on the test because they had to reflect upon the problem.

There are a couple of points I’d like to note here. That depriving people of some key ingredients forces them to reflect and helps form better mental connections is true. But equally important is the fact that they are presented with enough of the other ingredients in a manner that the insight represents a small and feasible step. Secondly, such careful stimulation requires a lot of art, thought, and setup, and is a far cry from setting people “free to explore”.

When to think and when to look

Learners generally need to make a trade-off between “looking up” answers and “thinking about them”. How this trade-off is made depends on a number of factors, including the quality of insight that the looked-up answer provides, the quality of insight that learners derive from thinking about problems, the time at the learner’s disposal, the learner’s ultimate goals, and many others. In my experience, seasoned learners of a topic are best able to make these trade-offs themselves and determine when to look and when to struggle. Thus, even if deprivation is helpful, external deprivation (in the sense of not providing information about places where they can look up answers) does not usually make sense. There are two broad exceptions.

The first is for novice learners. Novice learners, when they see a new problem, rarely understand enough about their own level of knowledge to know how long they should try the problem, where to look for a solution (if anywhere), and what the relative advantages of either approach are. By “novice learner” I do not mean to suggest a general description of a person. Everybody is a novice learner in a topic they pick up for the first time. It is true that some people are better in general as learners in certain broad areas; for instance, I’d be a better learner of mathematical subjects than most people, including mathematical subjects I have never dealt with. However, beyond a slight headstart, everybody goes through the “novice learner” phase for a new field.

For novice learners, helpful hints on what things they should try themselves, how long they should try those things, and how to judge and build intuition, are important. As such, I think that these hints need to be made much better in quality than they typically are. The hint to a learner should help the learner get an idea about the difficulty level in trying the problem, the importance of “knowing” the solution at the end, the relative importance of reflecting upon and understanding the problem, and whether there are some insights that can only be obtained by working through the problem (or, conversely, whether there are some insights that can only be obtained by looking at the solution). Here, the role of the problem-provider (who may be an instructor, coach, or a passive agent such as a textbook, monograph, or video lecture series) is to provide input that helps the learner decide rather than to take the decision-making reins.

The second exception is for learners whose personality and circumstances require “external disciplining” and “external motivation”. The argument here is essentially a “time inconsistency” argument: the learner would ideally like to work through the problem himself or herself, but when it comes to actually doing the problem, the learner feels lazy, and may succumb to simply looking up the solution somewhere. (“Time inconsistency” is a technical term used in decision theory and behavioral economics.) Forcing learners to actually do the problems by themselves, and disciplining them by not providing them easy access to solutions, helps them meet their long-term goals and overcome their short-term laziness.

I’m not sure how powerful the time inconsistency argument is. Prima facie evidence of it seems huge, particularly in schools and colleges, where students often choose to take heavy courseloads and somehow wade through a huge pile of homework, and yet rarely do extra work voluntarily on a smaller scale (such as starred homework problems, or challenging exercises) even when the load on them is low. This fits the theory that, in the long haul, these students want to push themselves, but in the short run, they are lazy.

I think the biggest argument against the time inconsistency justification for depriving people of solutions is the fact that the clearest cases of success (again in my experience) are people who are not time inconsistent. The best explorers are people who explore regardless of whether they’re forced to do so, and who, when presented with a new topic, try to develop a sufficiently strong grasp so that they can make their own decisions of how to balance looking up with trying on their own.

Yet another argument is that laziness works against all kinds of work, including the work of reading and following existing solutions. In general, what laziness does is to make people avoid learning things if it takes too much effort. Students who decide not to solve a particular problem by themselves often also don’t “look up” the solution. Thus, on net, they never learn the solution. So even in cases where trying a problem by oneself is superior to looking it up, looking it up may still be superior to the third alternative: never learning the solution.

A more careful look at what can be done

It seems to me that providing people information that helps them decide which problems to work with and how long to try before looking up is good in practically all circumstances. It’s even better if people are provided tools that help them reflect and consolidate insights from existing problems, and if these insights are strengthened through cross-referencing from later problems. Since not every teaching resource does this, and since exploration at the cutting edge is by definition into unknown and poorly understood material, it is also important to teach learners the subject-specific skills that help them make these decisions better.

Of course, the specifics vary from subject to subject, and there is no good general-purpose learner for everything. But simply making learners and teachers aware of the importance of such skills may have a positive impact on how quickly the learners pick up such skills.

Another look at exploratory learning

In the beginning, I talked about what seems to be a core premise of exploratory learning: that learners do things best when they explore by themselves. Strictly speaking, this isn’t treated as a canonical rule by pioneers of exploratory learning. In fact, I suspect that the successful executions of exploratory learning succeed precisely because they identify the things where learners investing their time through exploration yields the most benefit.

For instance, the implementation of inquiry-based learning (IBL) in some undergraduate math classes at the University of Chicago results in a far from laissez-faire attitude towards students exploring things. The IBL courses seem, in fact, to be a lot more structured and rigid than non-IBL courses. Students are given a sheet of the theorems, axioms and definitions of the course, and they need to prove all the theorems. This does fit in partly with the “deprivation” idea: students have to prove the theorems by themselves, even though proofs already exist. On the other hand, it is far from letting students explore freely.

It seems to me that while IBL as implemented in this fashion may be very successful in getting people to understand and critique the nature and structure of mathematical proofs, it is unlikely to offer significant advantages in terms of the ability to do novel exploration. That’s because, as my experience suggests, creative and new exploration usually requires immersion in a huge amount of knowledge, and this particular implementation of IBL trades off a lot of knowledge for a more thorough understanding of less knowledge.

Spoonfeeding, ego, and confidence issues

Yet another argument for letting people solve problems by themselves is that it boosts their “confidence” in the subject, making them more emotionally inclined to learn. On the other hand, spoonfeeding and telling them solutions makes them feel like dumb creatures being force-fed.

In this view, telling solutions to people deprives them of the “pleasure” of working through problems by themselves, a permanent deprivation.

I think there may be some truth to this view, but it is very limited. First, the total number of problems to try is so huge that depriving people of the “pleasure” of figuring out a few for themselves has practically no effect on the number of problems they can try. Of course, part of the challenge is to make this huge stream of problems readily available to people who want to try them, without overwhelming them. Second, the “anti-spoonfeeding” argument elevates an issue of acquiring subject-matter skills to an issue of pleasing learners emotionally.

Most importantly, though, it goes against the grain of teaching people humility. Part of being a good learner is being a humble learner, and part of that involves being able to read and follow what others have done, and to realize that most of that is stuff one couldn’t have done oneself, or that would have taken a long time to do oneself. Such humility is accompanied by pride at the fact that one’s knowledge is built on the efforts of the many who came before. To use a quote attributed to Newton, “If I have seen further, it is by standing on the shoulders of giants.”

Of course, a learner cannot acquire such humility if he or she never attempts to solve a problem alone, but neither can a learner acquire it by only trying to solve problems alone and never asking others or using references to learn solutions. It’s good for learners to try a lot of simpler problems that they get, and thus boost confidence in their learning, but it is also important that, for hard problems, learners absorb the solutions of others and make them their own.

February 20, 2009

A quick review of the polymath project

Filed under: polymath — vipulnaik @ 12:15 am

In an earlier blog post on new modes of mathematical collaboration, I offered my critical views on Michael Nielsen’s ideas about making mathematics more collaborative using the Internet. Around that time, Timothy Gowers, a prominent mathematician, was inspired by Michael Nielsen’s post to muse in this blog post about whether massively collaborated mathematics is possible. The post was later critiqued by Michael Nielsen.

Since then, Gowers has decided to actually experiment with solving a problem using collaborative methods. The project is called the “polymath” project. “Polymath” means a person with extensive knowledge of a wide range of subjects. Gowers was arguably punning on the word, with the idea being that when many people do math together, it is like a “polymath”.

Gowers, who is more of a problem-solver than a theory-builder, naturally chose solving a problem as the testing ground for collaborative mathematics. Further, he chose a combinatorial problem (the density Hales-Jewett theorem) that had already been solved, albeit by methods that were not directly combinatorial, and defined his goal as trying to get to a combinatorial solution for the problem. Gowers wrote a background post about the problem and a post about the procedure, where he laid out rules for the project and incorporated feedback from Michael Nielsen and others. These rules stipulated, among other things, that those participating in the collaborative project must not try to think too much about the problem away from the computer, and must not do any technical calculations away from the computer. Rather, they should share their insights. The idea was to see whether sharing and pooling insights led to discovery faster than working on them alone. I may have misunderstood Gowers’ words, so I’ll quote them here:

If you are convinced that you could answer a question but that it would just need a couple of weeks to go away and try a few things out, then still resist the temptation to do that. Instead, explain briefly, but as precisely as you can, why you think it is feasible to answer the question and see if the collective approach gets to the answer more quickly. (The hope is that every big idea can be broken down into a sequence of small ideas. The job of any individual collaborator is to have these small ideas until the big idea becomes obvious — and therefore just a small addition to what has gone before.) Only go off on your own if there is a general consensus that that is what you should do.

In the next post, Gowers listed his ideas broken down into thirty-eight points. He also clarified the circumstances under which the project could be declared finished. In Gowers’ words:

It is not the case that the aim of the project is to find a combinatorial proof of the density Hales-Jewett theorem when k=3. I would love it if that was the result, but the actual aim is more modest: it is either to prove that a certain approach to that theorem (which I shall soon explain) works, or to give a very convincing argument that that approach cannot work. (I shall have a few remarks later about what such a convincing argument might conceivably look like.)

In the next post, Gowers explained the rationale for selecting this particular problem. He explained that, first, he wanted to select a serious problem, the kind whose solution would be considered important for researchers in the field. Second, he didn’t want to select a problem that was parallelizable in a natural sense — rather, he believes that the solution to every problem does parallelize at some stage, and how this parallelization is to occur can itself be determined.

By this time, Gowers’ blog was receiving hundreds of comments, mostly comments by Gowers himself, but also including comments from distinguished mathematicians such as Terence Tao. Tao has his own blog, and he published a post giving the background of the Hales-Jewett theorem and a later post with some of his own ideas about the problem.

A few days later, Gowers announced at the end of this post that there was a wiki on the enterprise of solving the density Hales-Jewett theorem. In the same post, Gowers also summarized all the proof strategies that had come up thanks to the comments. Since then, there have been no more blog posts about the problem.

A look at the wiki

It’s still too early to know the eventual shape that the Polymath1 wiki will take. One thing that is conspicuous by its absence is a copyright notice. This could create problems, particularly considering that this is a collaboratively edited website aimed at solving a problem.

There are some other things that I think need to be decided.

  1. Is the wiki intended only to provide leads or reference points to ideas elaborated elsewhere, or is it intended to provide the structure, substance and background material as well? If the former is the case, then the wiki can be designed in a problem-centric fashion. However, if the wiki is designed this way (i.e., only to provide leads), its generic comprehensibility is going to be poor. Moreover, the “cross-fertilization” of ideas with other problems is going to be minimal if the organization is centered completely around the density Hales-Jewett theorem. On the other hand, if the wiki provides a lot of background information, it would be better to organize it according to that background information. This would make it lose its problem-specific focus. I think there is a trade-off here.

  2. Style of pages: The pages currently have a very conversational style. This may be because, currently, the pages are adaptations of material put up in blog posts and blog comments. But this conversational style makes it hard to use the pages as a handy reference or lookup point.

  3. Classifying page types: There needs to be some sort of separation between definition pages, pages about known theorems, pages about conjectures, and pages describing speculation and thoughts. As of now, such a separation or classification is not available.

  4. Interfacing with other reference sources: If (and this goes back to the first point) it is decided that the wiki will not provide too much background information and will focus on a style suited to the problem focus, then some decisions will need to be made on how to link up to outside reference sources.

  5. Linking mechanisms between pages: A person who reads about one idea, definition, theorem, or conjecture, should have a way of knowing what else is most closely related to that. Robust linking mechanisms need to be decided for this.

To give an illustration of this, consider the current page on Line (permalink to current version). This page introduces definitions for three kinds of “lines” talked about in combinatorics — combinatorial lines, algebraic lines, and geometric lines. Some of the things I’d recommend for this page are:

  • Create separate pages for combinatorial line, algebraic line, geometric line.

  • In each page, create a “definition” section with a precise definition, and perhaps an “examples” section with examples, as well as links to the other two pages, explaining the differences.

  • For the page on combinatorial line, link to generalizations such as combinatorial subspace.

  • For the page on combinatorial line, provide a reverse link to pages that use this concept, or link to expository articles/blog entries that explain how and why the concept of combinatorial line is important.
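To make concrete what a precise “definition” section for combinatorial line might pin down, here is a minimal Python sketch. It assumes the usual definition: a combinatorial line in [k]^n comes from a template of length n over the symbols 1, ..., k plus a wildcard, with at least one wildcard, and its k points are obtained by replacing every wildcard with the same value. The function name combinatorial_lines and the "*" wildcard symbol are my own illustrative choices, not anything taken from the wiki.

```python
from itertools import product

WILDCARD = "*"  # illustrative placeholder symbol, not taken from the wiki


def combinatorial_lines(k, n):
    """Yield every combinatorial line in [k]^n as a tuple of k points."""
    symbols = list(range(1, k + 1)) + [WILDCARD]
    for template in product(symbols, repeat=n):
        if WILDCARD not in template:
            continue  # a template must contain at least one wildcard
        # Replace all wildcards by the same value, for each value in 1..k.
        yield tuple(
            tuple(value if s == WILDCARD else s for s in template)
            for value in range(1, k + 1)
        )


lines = list(combinatorial_lines(3, 2))
print(len(lines))  # 7, which matches 4**2 - 3**2
```

In [3]^2 the diagonal (1,1), (2,2), (3,3) shows up as a combinatorial line, while the anti-diagonal (1,3), (2,2), (3,1) does not, which is the kind of distinction between the different notions of “line” that separate pages could spell out.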

Here are some suggestions on the theorem pages.

  • Create a separate section in the theorem page giving a precise statement of the theorem.

  • For each theorem, have sections of the page devoted to listing/discussing stronger and weaker theorems, generalizations and special cases. For instance, the coloring Hales-Jewett theorem is “weaker” than the density Hales-Jewett theorem as well as the Graham-Rothschild theorem.

Another suggestion I’d have would be to use the power of tools such as Semantic MediaWiki to store the relationships between theorems in useful ways.
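As a rough sketch of what storing these relationships could look like, here is a small example in plain Python. It is not Semantic MediaWiki syntax; the dictionary IMPLIES and the helper names weaker_than and stronger_than are hypothetical, and the only relationships recorded are the ones mentioned above.

```python
# Each key is a theorem; the value lists theorems it directly implies
# (that is, the "weaker" theorems). Only relationships stated in the
# text above are recorded here.
IMPLIES = {
    "density Hales-Jewett theorem": ["coloring Hales-Jewett theorem"],
    "Graham-Rothschild theorem": ["coloring Hales-Jewett theorem"],
    "coloring Hales-Jewett theorem": [],
}


def weaker_than(theorem, implies=IMPLIES):
    """Return every theorem reachable by following 'implies' links."""
    seen = set()
    stack = list(implies.get(theorem, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(implies.get(current, []))
    return seen


def stronger_than(theorem, implies=IMPLIES):
    """Return every recorded theorem that implies the given one."""
    return {t for t in implies if theorem in weaker_than(t, implies)}


print(sorted(stronger_than("coloring Hales-Jewett theorem")))
```

Once the relationships are stored as data rather than as prose, a theorem page could generate its “stronger theorems” and “weaker theorems” sections automatically instead of relying on each editor to keep them in sync by hand.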

I’ll post more comments as things progress.

February 13, 2009

Knowledge matters

It is fashionable in certain circles to argue that, particularly for subjects such as mathematics that have a strong logical and deductive component, it is not how much you know that counts but how you think. According to this view, cramming huge amounts of knowledge is counter-productive. Instead, mastery is achieved by learning generic methods of reasoning to deal with a variety of situations.

There are a number of ways in which this view (though considered enlightened by some) is just plain wrong. At a very basic level, the view is useful as a counter to the (even more common) tendency to believe that in reasoning problems, it is sufficient to “memorize” basic cases. However, at a more advanced level, it can get in the way of developing the knowledge and skills needed to achieve mastery.

My first encounters with this belief

During high school, starting mainly in class 11, I started working intensively on preparing for the mathematics Olympiads. Through websites and indirect contacts (some friends, some friends of my parents) I collected a reasonable starting list of books to use. However, there was no systematic preparation route for me to take, and I largely had to find my own way through.

The approach I followed here was practice — lots and lots of problems. But the purpose here wasn’t just practice — it was also to learn the common facts and ideas that could be applied to new problems. Thus, a large part of my time also went to reviewing and reflecting upon problems I had already solved, trying to find common patterns, and seeing whether the same ideas could be expressed in greater generality. Rather than being too worried about performing in an actual examination situation, I tried to build a strong base of knowledge, in terms of facts as well as heuristics.

In addition, I spent a lot of time reading the theoretical parts of number theory, combinatorics, and geometry. The idea here was to develop the fact base as well as vocabulary so that I could identify and “label” phenomena that I saw in specific Olympiad problems.

(For those curious about the end result, I got selected to the International Mathematical Olympiad team from India in 2003 and 2004, and won Silver Medals both years.)

At no stage during my preparation did I feel that I had become “smarter” in the sense of having better methods of general reasoning or approaching problems in the abstract. Rather, my improvements were very narrow and domain-specific. After thinking, reading, and practicing a lot of geometry, I became proportionately faster at solving geometry problems, but improved very little with combinatorics.

Knowledge versus general skill

Recently, I had a chance to re-read Geoff Colvin’s interesting book Talent is overrated. This book explains how the idea of “native talent” is largely a myth, and how the secret to success is something that Colvin calls “deliberate practice”. Among the things that experts do differently, Colvin identifies looking ahead (for instance, fast typists usually look ahead in the document to know what they’ll have to type a little later), identifying subtle and indirect cues (here Colvin gives examples of expert tennis players using the body movements of the person serving to estimate the speed and direction of the ball), and, among other things, having a large base of knowledge and long-term memory that can be used to identify a situation.

Colvin describes how mathematicians and computer scientists had initially hoped for general-purpose problem solvers, which would know little about the rules of a particular problem, but would find solutions using the general rules of logic and inference. These attempts largely failed. For instance, Deep Blue, IBM’s chess-playing computer, was defeated by then world champion Garry Kasparov in a tournament, despite Deep Blue’s ability to evaluate a hundred million moves every second. What Deep Blue lacked, according to Colvin, was the kind of domain-specific knowledge of what works and where to start looking, that Kasparov had acquired through years of stored knowledge and memory about games that he had played and analyzed.

A large base of knowledge is also useful because it provides long-term memory that can be tapped to complement working memory in high-stress situations. For instance, a mathematician trying to prove a complicated mathematical theorem that involves huge expressions may be able to rely on other similar expressions that he/she has worked with before to “store” the complexity of this expression in a simpler form. Similarly, a chess player may be able to use past games as a way of storing a shorter mental description of the current game situation.

A similar idea is discussed in Gary Klein’s book Sources of Power, where he describes a Recognition-Primed Decision Model (RPD model) used by people in high-stress, high-stakes situations. Klein says that expert firefighters look at a situation, identify key characteristics, and immediately fit it into a template that tells them what is happening and how to act next. This template need not be precisely like a single specific past situation. Rather, it involves features from several past situations, mixed and matched according to the present situation. Klein also gives examples of NICU nurses, in charge of taking care of babies with serious illnesses. The more experienced and expert of these nurses draw on their vast store of knowledge to identify and put together several subtle cues to get a comprehensive picture.

Knowledge versus gestalt

In Group Genius: The Creative Power of Collaboration, Keith Sawyer talks about how people solve insight problems. Sawyer talks about gestalt psychologists, who believed that for “insight” problems — the kind that require a sudden leap of insight — people needed to get beyond the confines of pre-existing knowledge and think fresh, out of the box. The problem with this, Sawyer says, is that study after study showed that simply telling people to think out of the box, or to think differently, rarely yielded results. Rather, it was important to give people specific hints about how to think out of the box. Even those hints needed to be given in such a way that people would themselves make the leap of recognition, thus modifying their internal mental models.

I recently had the opportunity to read an article, Understanding and teaching the nature of mathematical thinking, by Alan Schoenfeld, published in Proceedings of the UCSMP International Conference on Mathematics Education, 1985 (pages 362-379). Schoenfeld talks about how a large knowledge base is crucial to being effective at solving problems. He refers to research by Simon (Problem Solving and Education, 1980) that shows that domain experts have a vocabulary of approximately 50,000 “chunks”, small word combinations that denote domain-specific concepts. Schoenfeld then goes on to talk about research by Brown and Burton (Diagnostic models for procedural bugs in basic mathematical skills, Cognitive Science 2, 1978) that shows that people who make mistakes with arithmetic (addition and subtraction) don’t just make mistakes because they don’t understand the correct rules well enough; they make mistakes because they “know” something wrong. Their algorithms are buggy in a consistent way. This is similar to the fact that people are unable to solve insight problems, not because they’re refusing to think “outside the box”, but because they do not know the correct algorithms for doing so.

Schoenfeld then goes on to describe the experiences of people such as himself in implementing George Polya’s problem-solving strategies. Polya enumerated several generic problem-solving strategies in his books How to solve it, Mathematical discovery, and Mathematics and plausible reasoning. Polya’s heuristics included: exploiting analogies, introducing and exploring auxiliary elements in a problem solution, arguing by contradiction, working forwards, decomposing and recombining, examining special cases, exploiting related problems, drawing figures, and working backward. But teaching these “strategies” in classrooms rarely resulted in an across-the-board improvement in students’ problem-solving abilities.

Schoenfeld argues that the reason why these strategies failed was that they were “underspecified”: just knowing that one should “introduce and explore auxiliary elements”, for instance, is of little help unless one knows how to come up with auxiliary elements in a particular situation. In Euclidean geometry, this may be by extending lines far enough that they meet, dropping perpendiculars, or other methods. In problems involving topology, this may involve constructing open covers that have certain properties. Understanding the general strategy helps a bit in the sense of putting one on the lookout for auxiliary elements, but it does not provide the skill necessary to locate the correct auxiliary element. Such skill can be acquired only through experience, through deliberate practice, through the creation of a large knowledge base.

In daily life

It is unfortunately true that much of the coursework in school and college is based on a learn-test-forget model: students learn something, it is tested, and then they forget it. A lack of sufficient introspection and a lack of frequent reuse of ideas learned in the past lead students to quickly forget what they learned. Thus, the knowledge base gets eroded almost as fast as it gets built.

It is important not just to build a knowledge base but to have time to reflect upon what has been built, and to strengthen what was built earlier by referencing it and building upon it. Also, students and researchers who want to become sharper thinkers in the long term need to understand the importance of remembering what they learn, putting it in a more effective framework, and making it easier to recall at times when it is useful. I see a lot of people who like to solve problems but then make no effort to consolidate their gains by remembering the solution or storing the key ideas in long-term memory in a way that can be tapped later. I believe that this is a waste of the effort that went into solving the problem.

(See also my post on intuition in research).

February 2, 2009

On new modes of mathematical collaboration

(This blog post builds upon some of the observations I made in an earlier blog post on Google, Wikipedia and the blogosphere, but unlike that post, has a more substantive part dedicated to analysis. It also builds on the previous post, Can the Internet destroy the University?.)

I recently came across Michael Nielsen’s website. Michael Nielsen was a quantum computation researcher — he’s the co-author of Quantum computation and quantum information (ISBN 978-0521632355). Now, Nielsen is working on a book called The Future of Science, which discusses how online collaboration is changing the way scientists solve problems. Here’s Nielsen’s blog post describing the main themes of the book.

Journals — boon to bane?

Here is a quick simplification of Nielsen’s account. In the 17th century, inventors such as Newton and Galileo did not publish their discoveries immediately. Rather, they sent anagrams of these discoveries to friends, and continued to work on their discoveries in secret. Their main fear was that if they widely circulated their idea, other scientists would steal the idea and take full credit for it. By keeping the idea secret, they could develop it further and release it in a more ripe form. In the meantime, the anagram could be used to prove precedence in case somebody else also came up with the idea.

Nielsen argues that the introduction of journals, combined with public funding of science and the recognition of journal publications as a measure of academic achievement, led scientists to publish their work and thus divulge it to the world. However, today, journal publishing competes with an even more vigorous and instantaneous form of sharing: the kind of sharing done in blogs, wikis, and online forums. Nielsen argues that this kind of spontaneous sharing of rough drafts of ideas, of small details that may add up to something big, opens up new possibilities for collaboration.

In this respect, the use of online tools allows for a “scaling up” of the kind of intense, small-scale collaboration that formerly occurred only in face-to-face contact between trusted friends or close colleagues. However, Nielsen argues that academics, eager to get published in reputable journals, may be reluctant to use online forums to ask and answer questions of distant strangers. Two factors are at play here: first, the system of academic credit and tenure does little to reward online activity as opposed to publishing in journals. Second, scientists may fear that other scientists can get a whiff of their idea and beat them in the race to publish.

(Nielsen develops “scaling up” more in his blog post, Doing Science Online).

Nielsen says that this is inefficient. Economists do not like deadweight losses (Market wiki entry, Wikipedia entry) in markets: situations where one person has something to sell to another, and the other person is willing to pay the price, but the deal doesn’t occur. Nielsen says that such deadweight losses occur routinely in academic research. Somebody has a question, and somebody else has an answer. But due to the high search cost (Market wiki entry, English Wikipedia entry), i.e., the cost of finding the right person with the answer, the first person never gets the answer, or has to struggle a lot. This means a lot of time lost.

Online tools can offer a solution to the technical problem of information-seekers meeting information-providers. The problem, though, isn’t just one of technology. It is also a problem of trust. In the absence of enforceable contracts or a system where the people exchanging information can feel secure about not being “cheated” (in this case, by having their ideas stolen), people may hesitate to ask questions to the wider world. Nielsen’s suggestions include developing robust mechanisms to measure and reward online contribution.

Blogging for mathies?

Some prominent mathematical bloggers that I’ve come across: Terence Tao (Fields Medalist and co-prover of the Green-Tao theorem), Richard E. Borcherds (famous for his work on Moonshine), and Timothy Gowers. Tao’s blog is a mixed pot of lecture notes, updates on papers uploaded to the arXiv, and his thoughts on things like the Poincaré conjecture and the Navier-Stokes equations. In fact, in his post on doing science online, Nielsen uses the example of a blog post by Tao explaining the hardness of the Navier-Stokes equations. In Nielsen’s words:

The post is filled to the brim with clever perspective, insightful observations, ideas, and so on. It’s like having a chat with a top-notch mathematician, who has thought deeply about the Navier-Stokes problem, and who is willingly sharing their best thinking with you.

Following the post, there are 89 comments. Many of the comments are from well-known professional mathematicians, people like Greg Kuperberg, Nets Katz, and Gil Kalai. They bat the ideas in Tao’s post backwards and forwards, throwing in new insights and ideas of their own. It spawned posts on other mathematical blogs, where the conversation continued.

Tao and others, notably Gowers, also often throw around ideas about how to make mathematical research more collaborative. In fact, I discovered Michael Nielsen through a post by Timothy Gowers, Is massively collaborated mathematics possible?, which mentions Nielsen’s post on doing science online. (Nielsen later critiqued Gowers’ post.) Gowers considers alternatives such as a blog, a wiki, and an online forum, and concludes that an online forum best serves the purpose of working collaboratively on mid-range problems: problems that aren’t too easy and aren’t too hard.

My fundamental disagreements

A careful analysis of Nielsen’s thesis will take more time, but off-the-cuff, I have at least a few points of disagreement about the perspective from which Nielsen and Gowers are looking at the issue. Of course, my difference in perspective stems from my different (and altogether considerably more limited) experience compared to theirs.

I fully agree with Nielsen’s economic analysis with regard to research and collaboration: information-seekers and information-providers not being able to get in contact often leads to squandered opportunities. I’ve expressed similar sentiments myself in previous posts, though not as crisply as Nielsen.

My disagreement is with the emphasis on “community” and “activity”. Community and activity could be very important to researchers, but in my view they can obscure the deeper goal of growing knowledge. And it seems that in the absence of strong clusters, community and activity can result in a system that is almost as inefficient.

In the early days of the Internet, mailing lists were a big thing (they continue to be a big thing, but their relative significance in the Internet has probably declined). In those days, the Usenet mailing lists and bulletin board systems often used to be clogged with the same set of questions, asked repeatedly by different newbies. The old hands, who usually took care of answering the questions, got tired of this repetition of the same old questions. Thus was born the “Usenet FAQ”. With this FAQ, the mailing lists stopped getting clogged with the same old questions and people could devote attention to more challenging issues.

Forums (such as Mathlinks, which uses PHPBB) are a little more advanced than mailing lists in terms of the ability to browse by topic. However, they are still fundamentally a collection of questions and answers posted by random people, with no overall organizing framework that aids exploration and learning. In a situation where the alternative to a forum is no knowledge at all, a forum is a good thing. In fact, a forum can be one input among many for building a systematic base of knowledge. But when a forum is built instead of a systematic body of knowledge, the result could be a lot of duplication and inefficiency and the absence of a bigger picture.

Systematic versus creative? And the irony of Wikipedia

Systematic to some people means “top-down”, and top-down carries negative connotations for many; or at any rate, non-positive connotations. For instance, the open source movement, which includes Linux and plenty of “free software”, prides itself on being largely a bottom-up movement, with uncoordinated people working of their own volition to contribute small pieces of code to a large project. Top-down direction could not have achieved this. In economic jargon, when each person is left to make his or her own choices, the outcome is generally more efficient, because people have more “private information” about their interests and strengths. (In many of his posts, Nielsen uses open source as an example of where science might go by being more open; see, for instance, this one on connecting scientists to scientists.)

But when I say systematic, I don’t necessarily mean top-down. Rather, I mean that the system should be such that people know where their contributions can go. The idea is to minimize the loss that may happen because one person contributes something at one place, but the other person doesn’t look for it there. This is very important, particularly in a large project. A forum to solve mathematical questions has an advantage over offline communication: the content is available for all to see. But this advantage is truly meaningful only if everybody who is interested can locate the question easily.

Systematic organization does not always mean less of a sense of community and activity, but this is usually the case. When material is organized through internal and logical considerations, considerations of chronological sequence and community dynamics take a backseat. The ultimate irony is that Wikipedia, which is often touted as the pinnacle of Web 2.0 achievement, seems to prove exactly the opposite: the baldness, anti-contextuality, canonical naming, and lack of a “time” element in Wikipedia’s entries are arguably its greatest strengths.

Through choices of canonical naming (the name of an article is precisely its topic), extensive modularization (a large number of individual units, namely the separate articles), a neutral, impersonal, no-credit-to-author-on-the-article style, and extensive strong internal linking, Wikipedia has managed to become an easy reference for all. If I want to read the entry on a topic, I know exactly where to look on Wikipedia. If I want to edit it, I know exactly what entry to edit, and I’m guaranteed that all future people reading the Wikipedia entry looking for that information will benefit from my changes. In this respect, the Wikipedia process is extraordinarily efficient. (It is inefficient in many other ways, namely, the difficulty of quality control, measured by the massive amount of volunteer hours spent combating obvious and non-obvious spam, as well as the tremendous amount of time spent in inordinate battle over control and editing of particular entries).

The power of the Internet is its perennial and reliable availability (for people with reliable access to electricity, machines, and Internet connections). And Wikipedia, through the ease with which one can pinpoint and locate entries, and the efficiency with which it funnels the efforts both of readers and contributors to edit a specific entry, builds on that power. And I suspect that, for a lot of us, a lot of the time we’re using the Internet, we aren’t seeking exciting activity, a sense of community, or personal solidarity. We want something specific, quickly. Systematic organization and good design and architecture that gets us there fast is what we need.

What can online resources offer?

A blog creates a sense of activity, of time flowing, of comments ordered chronologically, of a “conversation”. This is valuable. At the same time, a systematically organized resource that arranges material not based on a timeline of discovery but rather based on intrinsic characteristics of the underlying knowledge is usually better for quick lookup and user-directed discovery (where the user is in charge of things).

It seems to me that the number of successful “activity-based online resources” will continue to remain small. There will be few quality blogs that attract high-quality comments, because the effort and investment that goes into writing a good blog entry is high. There may be many mid-ranging blogs offering random insights, but these will offer little of the daily adventure feeling from a high-traffic, high-comment blog.

On the other hand, the market for quick “pinpoint references” (the kind of resources that you can use to quickly look something up) seems huge. A pinpoint reference differs from a forum in an obvious way. In a forum you ask a question and wait for an answer, or you browse through previously asked questions. In a pinpoint reference, you decide you want to know about a topic, and go to the page, and BANG, the answer’s already there, along with a lot of stuff you might have thought of asking but never got around to, all neatly organized and explorable.

Fortunately or unfortunately, the notion of “community” and “activity” is more appealing in a naive, human sense than the notion of pinpoint references. “Chatting with a friend” has more charm to it than having electricity. But my experience with the way people actually work seems to suggest that people value self-centered, self-directed exploration quite a bit, and may be willing to sacrifice a sense of solidarity or “being with others in a conversation” for the sake of more such exploration. Pinpoint resources offer exactly that kind of a self-directed model to users.

My experiment in this direction: subject wikis

I started a group theory wiki in December 2006, and have since extended it to a general subject wikis website. The idea is to have a central source, the subject wikis reference guide, from where one can search for terms and get short general definitions, with links to more detailed entries in individual subject wikis. See, for instance, the entry on “normal”.

I’ve also recently started a blog for the subject wikis website, that will describe some of the strategies and approaches and choices involved in the subject wikis.

It’s not clear to me how this experiment will proceed. At the very least, my work on the group theory wiki is helping me with my research, while my work on the other wikis (which has been very little in comparison) has helped me consolidate the standard knowledge I have in these subjects along with other tidbits of knowledge or thoughts I’ve entertained. Usage statistics seem to indicate that many people are visiting and finding useful the entries on the group theory subject wiki, and there are a few visitors to each of the other subject wikis as well. What isn’t clear to me is whether this can scale to a robust reference where many people contribute and many people come to learn and explore.
