What Is Research?

October 13, 2006

Reading research papers

From what I’ve gathered through talking to people on their way to a Ph.D., there’s quite a difference between the mentality required for coursework and the mentality required for research. For instance, in a course, the subject matter has been distilled and organized in a particular manner by the instructor. There is a clear path to follow: attend the lectures, read the lecture notes, read the textbooks and other references, solve problems, do the assignments, and sit the examinations. Even if not everybody follows this path, the fact that it exists is a source of reassurance: it is always there to fall back upon.

Doing research, which may involve solving open problems or extending existing results, is a different ballgame. At best, the student is given a problem and some material to chew upon and is then practically let loose on it. At worst, the student is told to find his or her own problem and work on it and keep using the advisor for course correction. Clearly, a completely different approach is required for this: an approach where the student figures out what information to collect, how to collect it, how to use it, whether to discard it, and so on.

An important distinguishing feature of research orientation, then, is broad reading with a narrow focus and a specific objective. Since this kind of focus cannot be provided in the routine college environment, students keen on developing the research orientation need to find other means of developing the skills. Summer schools and summer camps usually help in providing such focus. For instance, in the VSRP programme at TIFR, which I attended this summer, I was asked to read a paper on Lie Group Representations of Polynomial Rings. Here’s a link to the final presentation I gave on a part of the paper. I’ve chronicled my work on this paper in earlier posts on this same blog. Check out this post and subsequent posts.

Students can be encouraged to read papers even within coursework, by having student seminars as part of the course accreditation. Some of my courses at CMI this semester have student seminars. For instance, in the course on Representation Theory of Finite Groups, each student is supposed to give a seminar on a topic decided by the instructor; I have to give my seminar on Artin’s Theorem. In the Elementary Differential Geometry course (course details are available here), a list of seminar topics was given and each student had to select one. I chose the Whitney embedding theorem, and a write-up of what I presented is available here.

Apart from reading research papers for these courses, I have also been reading research papers to seek and collect knowledge in various areas of mathematics.

First, some differences between the textbook and the research paper:

  1. The textbook presentation, or the lecture note, is meant to be an introduction to the subject. It is intended to provide overall motivations, basic definitions, and a level of familiarity and comfort to people who are new to the subject. Steps are left out or missed only if they are easy for the reader to fill out or filling them is an instructive exercise for the reader.
    The research paper, on the other hand, is meant to be a concise introduction to a new discovery or a new idea or a new formulation, for people who are already familiar with the area. Definitions and background are provided only in order to set notations and conventions, explain the authors’ mindset and revive the memory of readers. No effort is made to be complete. Further, the authors tend to skip steps which: (a) have been proved elsewhere, (b) require routine checking that other experts can do, or (c) provide no insights and detract from the essence of the paper.
  2. A (well-written) research paper has a clear end in mind, which it tries to outline in the beginning. It then gradually builds up the arsenal and ammunition needed towards proving this end. At some point in the paper, the authors usually discuss how this new result sheds new light in the areas being explored.
    A textbook, on the other hand, may not have a clear, specific result that it intends to establish. Rather, it aims to develop a backdrop and a framework in the minds of students.

Based on my experiences (both positive and negative) in trying to grasp research papers, I have come up with the following strategy:

  1. Try to get an idea of what the paper is trying to prove. This can usually be gleaned from the abstract, from the introduction, or from the beginning of the second section (if the first section is for preliminaries).
    Look for something marked Theorem 1 or Main Theorem.
  2. Understand carefully the statements of previously known results in that area, and use that understanding to try to figure out the import of the new result obtained. Try to state the new result in as many different flavours as possible. Make all of them as appetizing as can be!
  3. Now, look at the statements of the lemmas and corollaries, and try to understand each statement. Attempt a broad trajectory that describes how the theorem is obtained, via the lemmas and corollaries. Do not look at the proofs yet, unless they help significantly in understanding the statements.
    While trying to understand the statements of the lemmas and corollaries, it may be necessary to familiarize oneself with the notation of the paper.
  4. After a short break, look at this trajectory, and try to figure out which steps in the deduction process are clear and obvious. Often it may happen that many steps in the deduction process are not too hard. Figuring out that one already understands a lot of the proof before having seen the actual proof is a great confidence-booster.
    For the parts where the proof seems clear, look at the actual proofs and see whether they match the proof in your mind.
  5. Now, it is time to focus on the non-obvious parts of the proof. Gently look at the proofs of each of these. Some of these may turn out to be clear once you read the proof. For others, however, the proof may involve some new idea. Zero in on the proofs that are hard to understand. Note the crucial leaps of thought. Don’t be in a hurry to digest these pieces.
  6. Come back after another break. Recall the proof skeleton, and the proofs of the easy part. Now, in easy sessions, master the hard parts. Take special care to master those parts that fill you with the maximum discomfort.

This approach steadily zooms in on the proof details by beginning at the main result, then proceeding to the proof skeleton, and then finally going to the nitty-gritties of the actual proof.

What kind of results does one obtain with this approach?

Some observations:

  • Steady documentation at each step is particularly useful. In this respect, I think one good way of documenting is to prepare a presentation on the paper. A nice tool for preparing presentations is the document class beamer in LaTeX.
    Here’s an example of the PDFized version of a file using beamer: An electric story of a drunkard. The original LaTeX file looks like this.
  • Often, reading a research paper is disconcerting because one realizes the many gaps in one’s knowledge on encountering statements that the authors claim are obvious but that are not obvious at all. This has happened to me quite often! But whenever I have followed this zoom-in strategy of first concentrating on the broad motivations, then strengthening the proof skeleton, and then going in for the actual proof details, I have found that the disconcerting parts only come towards the end, by which time I have already gained a lot of confidence in the paper.
  • The “zoom-in approach” works best if the reader is used to looking at things and ideas in terms of their motivations, and understands the broad motivations in the topic where the research paper was written. These motivations are meant to be developed in the regular coursework, through comments and remarks made by the instructor, through the structure of the course outline, through comments in the book, and through the choice of exercises and problems that the student solves.
    However, even students not used to looking at things motivationally can start doing so by applying the zoom-in approach to a given paper!

Now, a chronicle of some of the mistakes I have made when reading research papers:

  • Reading the first two pages and then quitting: True, this isn’t a really bad thing if the paper is well-written, because the author would have put the statement of the main theorems in the first two pages. However, simply knowing the statement of the theorem, without understanding the proof skeleton, may sometimes be useless.
    In some cases, the proofs may be hard. But in the past, I have often skipped the proofs simply because they seemed too tedious. However, now that I have started applying the “zoom-in” approach, I am able to absorb a little of the proof skeleton even if the steps of the actual proof remain unclear.
  • Getting disheartened because many statements in the beginning don’t seem to make sense: The introduction of the research paper usually contains both background preliminaries and a summary of important results shown in the paper. While reading the paper on Lie Group Representations of Polynomial Rings, I thought that the first few pages contained background preliminaries, and was disheartened at the fact that figuring out their meaning took me a lot of time. Only after crossing those initial pages did I discover that the content of the first few pages was not background preliminaries, but results proved in the paper.
    To avoid confusing background preliminaries (viz what is assumed) and the core content of the paper (viz what is established/proved) it is important to have a look at the whole paper. A strategy that I have followed since the experience with the Lie Group Representations paper is to create a mapping of the introductory section onto the rest of the paper. This way, it is clear to me which parts of the introduction have what purpose.
  • Not having any clear targets: A huge research paper can be daunting, but at the same time, it may be difficult to set intermediate targets. That’s what happened with the Lie Group Representations of Polynomial Rings paper. It took me a lot of time to get the hang of the structure of the paper.
    In retrospect, I feel that after mapping the paper, and getting a hang of its structure, I should have singled out the results that it was important for me to master, and then applied the “zoom-in” approach towards mastering them.

I’ll post more on this. Looking forward to comments in the meantime.


October 11, 2006

A new theory of mine

Filed under: Uncategorized — vipulnaik @ 2:56 pm

I have this new theory for sequences of objects of various kinds, and I’m trying to figure out what to do with the theory. I have prepared lots of write-ups on the theory, as well as fanned out my ideas in many directions. But as yet, I somehow haven’t been able to share my idea with others, or bring my write-ups into a cogent and consistent form.

In this blog post, I plan to give a basic outline of the theory, along with references to more detailed write-ups (which I will put on my homepage). Thus begins a rather loose introduction:

Consider the matrix groups GL_n(k) where k is a field. For any fixed n, this is the group of invertible matrices of order n. The question I wanted to ask was: what is the relationship between the matrix groups of different orders? There is a nice relationship by block concatenation. Given a matrix in GL_m(k) and a matrix in GL_n(k), we can obtain a matrix in GL_{m+n}(k) by putting the matrix of order m in the top left corner and the matrix of order n in the bottom right corner, and setting the remaining entries to zero.
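As a small illustration of the block concatenation just described, here is a minimal sketch in Python, with matrices represented as plain lists of rows. The function name block_concat is a label chosen for this example only, not something from the write-ups.

```python
def block_concat(a, b):
    """Put square matrix a (order m) in the top left corner and
    square matrix b (order n) in the bottom right corner, zeros elsewhere."""
    m, n = len(a), len(b)
    top = [list(row) + [0] * n for row in a]      # rows of a, padded on the right
    bottom = [[0] * m + list(row) for row in b]   # rows of b, padded on the left
    return top + bottom


a = [[1, 2], [3, 4]]   # invertible of order 2 (determinant -2)
b = [[5]]              # invertible of order 1 over the rationals
for row in block_concat(a, b):
    print(row)
# [1, 2, 0]
# [3, 4, 0]
# [0, 0, 5]
```

The result has order m + n = 3, and its determinant is the product of the two determinants, so it is again invertible.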

This is a homomorphism from GL_m(k) x GL_n(k) to GL_{m+n}(k). If we call this homomorphism Phi_{m,n}, then the mappings Phi satisfy some associativity rules, they are all injective, and there are also some interesting refinement conditions.
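The homomorphism and associativity claims for block concatenation of matrices can be spot-checked numerically. The sketch below is only a sanity check on a few integer matrices, not a proof. The helper names block_concat and matmul are made up for this example.

```python
def block_concat(a, b):
    """Block-diagonal matrix with a in the top left and b in the bottom right."""
    m, n = len(a), len(b)
    return [list(r) + [0] * n for r in a] + [[0] * m + list(r) for r in b]


def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


a1, a2 = [[1, 1], [0, 1]], [[2, 1], [1, 1]]   # two matrices of order 2
b1, b2 = [[3]], [[-1]]                        # two matrices of order 1
c = [[0, 1], [1, 0]]                          # a further matrix of order 2

# Homomorphism: concatenating products equals multiplying concatenations.
homomorphism_ok = (block_concat(matmul(a1, a2), matmul(b1, b2))
                   == matmul(block_concat(a1, b1), block_concat(a2, b2)))

# Associativity: the two ways of concatenating three blocks agree.
associativity_ok = (block_concat(block_concat(a1, b1), c)
                    == block_concat(a1, block_concat(b1, c)))

print(homomorphism_ok, associativity_ok)
```

Injectivity is visible directly from the construction: both blocks can be read back off the concatenated matrix.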

This led me to consider the abstract situation: a sequence of groups G_m with m varying over nonnegative integers, along with block concatenation maps Phi_{m,n}: G_m x G_n to G_{m+n}. I assumed conditions of associativity and refinability, and christened the resulting general structure an Addition to Product Sequence (APS). If all the block concatenation maps are injective, then it is termed an Injective Addition to Product Sequence (IAPS).
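For readers who prefer symbols, the data of an APS and the associativity condition can be written as follows. This is a reading of "associativity" modelled on the matrix case, so treat the exact form as an assumption. The refinability condition is not spelled out in this post and is left unstated here.

```latex
\Phi_{m,n} \colon G_m \times G_n \to G_{m+n},
\qquad
\Phi_{m+n,p}\bigl(\Phi_{m,n}(g,h),\,k\bigr)
  = \Phi_{m,n+p}\bigl(g,\,\Phi_{n,p}(h,k)\bigr)
\quad \text{for } g \in G_m,\ h \in G_n,\ k \in G_p .
```

In the matrix case both sides are the block-diagonal matrix with blocks g, h, k down the diagonal.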

From the above discussion, the general linear groups over a field (and more generally, over a commutative ring with identity) form an APS.

Question: what are the general properties of APSes? What are the examples of APSes?

A lot of what we do over individual groups can be done over IAPSes of groups. We can define the concept of a sub-IAPS, and a normal sub-IAPS. The quotient of an IAPS by a normal sub-IAPS is again an APS, but the quotient APS may not be injective. The quotient of an IAPS by a sub-IAPS is in general an APS of sets only (not of groups). The quotient is injective if and only if a certain condition called being saturated is satisfied for the sub-IAPS.

Some examples of IAPSes of groups within the matrix algebra setting:

  • The orthogonal groups form a sub-IAPS of the IAPS of general linear groups. That’s because the block concatenation of two orthogonal matrices is an orthogonal matrix. This sub-IAPS is saturated in the following sense: given an orthogonal matrix obtained as the block concatenation of two invertible matrices, both the invertible matrices are themselves orthogonal. The quotient space of the general linear IAPS by the orthogonal IAPS forms an IAPS of sets: this can be thought of as the IAPS of symmetric positive definite bilinear forms.
  • The symplectic groups form a sub-IAPS of the IAPS of general linear groups. That’s again because the block concatenation of two symplectic matrices is a symplectic matrix. This is again saturated, and the quotient space is the space of nondegenerate alternating forms.
  • The special linear groups form a sub-IAPS of the IAPS of general linear groups. In fact, this is a normal sub-IAPS. The quotient APS is a constant Abelian group with block concatenation simply being the multiplication map within the Abelian group. The sub-IAPS is not saturated, because there can be invertible matrices that are not unimodular, but whose block concatenation is unimodular.
  • Given a homomorphism of rings, there is an induced homomorphism of the corresponding general linear IAPSes. The kernel of this homomorphism is termed an IAPS of congruence subgroups. Here’s the typical example: take the ring of integers and the quotient map from it to the ring of integers modulo an integer m. The kernel of the induced homomorphism of general linear IAPSes is an IAPS, called the IAPS of congruence subgroups.
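The closure and saturation claims in the examples above can be illustrated on small matrices. The sketch below uses exact rational arithmetic and only checks particular instances. All helper names are chosen for this example.

```python
from fractions import Fraction


def block_concat(a, b):
    """Block-diagonal matrix with a in the top left and b in the bottom right."""
    m, n = len(a), len(b)
    return [list(r) + [0] * n for r in a] + [[0] * m + list(r) for r in b]


def is_orthogonal(x):
    """Check that transpose(x) * x is the identity matrix."""
    n = len(x)
    xt = [list(col) for col in zip(*x)]
    prod = [[sum(xt[i][k] * x[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
    return prod == [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def det(x):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if not x:
        return 1
    return sum((-1) ** j * x[0][j] * det([row[:j] + row[j + 1:] for row in x[1:]])
               for j in range(len(x)))


rotation = [[0, -1], [1, 0]]    # rotation by 90 degrees, orthogonal
reflection = [[-1]]             # orthogonal of order 1
print(is_orthogonal(block_concat(rotation, reflection)))   # True

# Saturation for the orthogonal case: a non-orthogonal block spoils the result.
print(is_orthogonal(block_concat([[2]], [[1]])))            # False

# Non-saturation for the special linear case: neither block has
# determinant 1, but their block concatenation does.
a, b = [[Fraction(2)]], [[Fraction(1, 2)]]
print(det(a), det(b), det(block_concat(a, b)))              # 2 1/2 1
```

The last line is exactly the phenomenon described for the special linear groups: the determinant of a block concatenation is the product of the determinants, so it can equal 1 without either factor being unimodular.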

Once I started looking for APSes, I didn’t cease finding them. Roughly, the raison d’être for IAPSes is as follows: take an object and take the sequence of its powers (direct powers or free powers, in some suitable sense). Then, the automorphism groups of these powers form an IAPS of groups. Guess how? Roughly, for the block concatenation map Phi_{m,n}, the automorphism of the mth power acts on the first m coordinates and the automorphism of the nth power acts on the last n coordinates.
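The simplest instance of "act on the first m coordinates and on the last n coordinates" is probably the case of permutations. In the sketch below, a permutation of {0, ..., n-1} is stored as a tuple p with p[i] the image of i. The names concat_perm and compose are assumptions of this example.

```python
from itertools import permutations


def concat_perm(p, q):
    """p acts on the first len(p) points, q acts on the last len(q) points."""
    m = len(p)
    return tuple(p) + tuple(m + j for j in q)


def compose(p, q):
    """The permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[j] for j in q)


p1, p2 = (1, 0), (1, 0)
q1, q2 = (2, 0, 1), (0, 2, 1)

print(concat_perm(p1, q1))   # (1, 0, 4, 2, 3)

# Homomorphism check: concatenating composites equals composing concatenations.
lhs = concat_perm(compose(p1, p2), compose(q1, q2))
rhs = compose(concat_perm(p1, q1), concat_perm(p2, q2))
print(lhs == rhs)            # True

# Injectivity check on all of S_2 x S_3: the 2 * 6 = 12 pairs give 12 distinct images.
images = {concat_perm(p, q)
          for p in permutations(range(2)) for q in permutations(range(3))}
print(len(images))           # 12
```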

Here are specific situations:

  • The general linear IAPS over a ring R assigns to each n the automorphism group of the free module R^n.
  • The permutation IAPS assigns to each n the symmetric group on n elements. Note that the permutation IAPS can be embedded inside the orthogonal IAPS over any ring.
  • The general affine IAPS over a ring R assigns to each n the affine group of order n over the ring, which is the semidirect product of R^n by GL_n(R) under the usual action.
  • The polynomial automorphism IAPS. Fix a base ring (or base field). Then, consider the IAPS whose nth member is the automorphism group of the polynomial ring in n variables over that base ring or field. These form an IAPS. And this IAPS clearly contains the general affine IAPS.
  • The function field automorphism IAPS. Fix a base field. Consider the IAPS whose nth member is the automorphism group of the pure transcendental extension of the base field of transcendence degree n. This IAPS contains the polynomial automorphism IAPS.
  • The free group automorphism IAPS. This is the IAPS that assigns to each n the automorphism group of the free group on n letters.
  • The tensor algebra automorphism IAPS over a base ring or base field. This assigns to each n the automorphism group of the free tensor algebra in n variables over the base ring (or base field).
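The remark in the list above that the permutation IAPS embeds in the orthogonal IAPS can be made concrete with permutation matrices, whose entries 0 and 1 make sense over any ring. The sketch below checks, for one pair of permutations, that the embedding lands in orthogonal matrices and is compatible with block concatenation on both sides. The helper names are assumptions of this example.

```python
def block_concat(a, b):
    """Block-diagonal matrix with a in the top left and b in the bottom right."""
    m, n = len(a), len(b)
    return [list(r) + [0] * n for r in a] + [[0] * m + list(r) for r in b]


def concat_perm(p, q):
    """p acts on the first len(p) points, q on the last len(q) points."""
    m = len(p)
    return tuple(p) + tuple(m + j for j in q)


def perm_matrix(p):
    """Matrix sending the i-th basis vector to the p[i]-th basis vector."""
    n = len(p)
    return [[1 if p[col] == row else 0 for col in range(n)] for row in range(n)]


def is_orthogonal(x):
    """Check that transpose(x) * x is the identity matrix."""
    n = len(x)
    xt = [list(col) for col in zip(*x)]
    prod = [[sum(xt[i][k] * x[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
    return prod == [[1 if i == j else 0 for j in range(n)] for i in range(n)]


p, q = (1, 0), (2, 0, 1)
print(is_orthogonal(perm_matrix(q)))                          # True
print(perm_matrix(concat_perm(p, q))
      == block_concat(perm_matrix(p), perm_matrix(q)))        # True
```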

There are other IAPSes that don’t quite fit into the above framework but arise naturally: for instance, the mapping class groups form an IAPS, and so do the braid groups. And then, various sub-IAPSes of these can be defined.

What I’m interested in getting out of IAPS theory is the following:

  • See under what circumstances we can come up with suitable notions of determinant, transpose, parabolic structure, unipotent structure and so on.
  • Analyze the conjugacy classes and see under what circumstances we can get a canonical form. For instance, the permutation IAPS has a canonical form for conjugacy classes through the cycle decomposition, while the general linear IAPS over a field has a canonical form for conjugacy classes through the rational canonical form.
  • Try to understand the generating sets and see whether we can get certain special generating sets that are present in members of small index.
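For the permutation case mentioned in the list above, the canonical form for conjugacy classes is the cycle type. The sketch below computes it, checks that it is unchanged under one conjugation, and shows how it behaves under block concatenation. Permutations are tuples with p[i] the image of i, and all function names are assumptions of this example.

```python
def cycle_type(p):
    """Cycle lengths of p, sorted in decreasing order."""
    seen = [False] * len(p)
    lengths = []
    for start in range(len(p)):
        if not seen[start]:
            length, i = 0, start
            while not seen[i]:       # walk around the cycle containing start
                seen[i] = True
                i = p[i]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def inverse(p):
    """Inverse permutation: inverse(p)[p[i]] == i."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugate(s, p):
    """The permutation s p s^{-1}, that is, i -> s[p[inverse(s)[i]]]."""
    s_inv = inverse(s)
    return tuple(s[p[s_inv[i]]] for i in range(len(p)))


def concat_perm(p, q):
    """p acts on the first len(p) points, q on the last len(q) points."""
    m = len(p)
    return tuple(p) + tuple(m + j for j in q)


p = (1, 2, 0, 4, 3)          # cycles (0 1 2)(3 4)
s = (4, 3, 2, 1, 0)
print(cycle_type(p))                                   # (3, 2)
print(cycle_type(conjugate(s, p)) == cycle_type(p))    # True

# Block concatenation merges the cycle types of the two pieces.
print(cycle_type(concat_perm((1, 0), (1, 2, 0))))      # (3, 2)
```

The last line suggests why block concatenation is relevant to canonical forms here: every permutation is conjugate to a block concatenation of single cycles.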

Another interesting observation I made is that just like we do representation theory in the general linear IAPS, we can do representation theory of a group in an arbitrary IAPS. Concepts such as direct sum decomposition of representations can be formulated in the IAPS language. Concepts such as irreducible representation and complete reducibility can be formulated in the language of IAPSes with an additional parabolic structure.
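One natural way to read "direct sum" in this language, offered here as an assumption modelled on the matrix case rather than as the definition from the detailed write-ups, is the following. Take a representation of a group H in an IAPS to mean a homomorphism from H to some member G_n. Then two such representations combine through the block concatenation map:

```latex
\rho_1 \colon H \to G_m, \quad \rho_2 \colon H \to G_n
\quad \Longrightarrow \quad
(\rho_1 \oplus \rho_2)(h) = \Phi_{m,n}\bigl(\rho_1(h),\, \rho_2(h)\bigr) \in G_{m+n}.
```

This is again a homomorphism because \Phi_{m,n} is one. In the general linear IAPS it recovers the usual block-diagonal direct sum of matrix representations.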

And reversing roles, we can try representing the members of an IAPS inside another IAPS. For instance, we can study the representations of SL_n(F_p) in the general linear IAPS over the complex numbers. Here, the IAPS theory of both these IAPSes comes into play.

By looking at what happens in the case of the representation theory of the permutation IAPS and of the linear IAPS over finite fields, I have tried to see what we can say in general about the representation theory of one IAPS inside the other. I have got some promising frameworks into which the permutation and linear case both fit.

A quick summary:

  • An APS is a sequence of groups indexed by the natural numbers along with block concatenation maps which are homomorphisms from the direct product of two members to the member whose index is the sum of their indices. The block concatenation maps are required to satisfy some conditions, most notably associativity.
  • When the block concatenation maps are injective, I call the APS an injective APS or IAPS.
  • Though I defined an APS of groups, one can also define APS of rings, APS of sets, APS of monoids etc.
  • There are notions of sub-IAPSes and quotient IAPSes. The quotient of an IAPS by a sub-IAPS is again an APS of groups if and only if the sub-IAPS is normal at every member. It is an IAPS if the sub-IAPS is saturated. These are terms I introduced myself.
  • The general linear IAPS has interesting sub-IAPSes: the special linear IAPS (normal but not saturated, the quotient is a constant Abelian group APS), the orthogonal IAPS (saturated but not normal), the symplectic IAPS.
  • IAPSes arise as automorphisms of power sequences: the automorphisms of free modules give the general linear IAPS, the automorphisms of free groups give another IAPS, the automorphisms of polynomial rings give the polynomial automorphism IAPS, the automorphisms of the function field give the function field IAPS. Other IAPSes: the braid group, the affine group, the mapping class group.
  • There is a notion of a parabolic structure on a general IAPS, and such a structure comes naturally for IAPSes that arise as automorphisms of power sequences.
  • I am keen on figuring out when additional structure such as determinant, transpose and parabolic structure can be imposed on the IAPS.
  • I am keen on studying representation theory in an arbitrary IAPS, or possibly in an IAPS with a parabolic structure.
  • I am keen on looking at canonical forms for conjugacy classes for arbitrary IAPSes.
  • I am keen on looking at representation theory of the individual members of arbitrary IAPSes, to tie in the similarities between the representation theory of the symmetric groups and of the linear groups over finite fields.

Please do post your comments on the following:

  • Is the general idea of IAPS clear?
  • Does the notion of IAPS seem a useful abstraction (at a conceptual level)?
  • Do IAPSes hold a promise of providing uniform tools for studying the very diverse range of IAPSes?
  • Do IAPSes hold a promise of providing a better language for discussing representation theory?
  • What are the aspects that seem to interest you and on which you would like clear exposition/elaboration?
  • Do you think I should put in the effort of presenting the theory formally or should I wait for something more from it? If so, what kind of thing should I wait for?
