A Second Look at the Cathedral and Bazaar

This paper provides an overview of the weaknesses of Eric Raymond's (ESR) paper The Cathedral and the Bazaar (CatB) as well as the more coherent demonstration of the fact that the bazaar metaphor is internally contradictive. It is also to a certain extent a reaction to the publication of Eric Raymond's new book The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary (Sebastopol, Calif.: O'Reilly & Associates, 1999). In this paper a more objective picture of the status competition in the OSS environment is provided.
Contents

Introduction
Some Vulnerabilities of The Cathedral and the Bazaar
Status Competition in Internet-Based Developer Communities
Conclusion

Introduction

"Facts are stubborn things; and whatever may be our wishes, our inclinations, or the dictates of our passions, they cannot alter the state of facts and evidence."
- John Quincy Adams

Open source is a very interesting and influential phenomenon. It is especially intriguing to me because I believe that it can play a positive role in developing countries. In order to ensure its long-term sustainability we need to see it "as is" and clearly identify possible pitfalls as well as open source's strong and weak points. Fundamentally, we need a reliable map of the open source environment.
The publication of Eric Raymond's (ESR) new book The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary (Sebastopol, Calif.: O'Reilly & Associates, 1999) makes a fresh and critical review of his most influential paper even more necessary. Besides The Cathedral and the Bazaar (CatB), several papers by ESR are included in this book. None of the papers are so well written, influential and important as CatB. It is no wonder that The Cathedral and the Bazaar is sometimes considered as a Manifesto of the Open Source Movement. This paper will try to analyze just CatB.
In my earlier paper I argued that the bazaar metaphor is internally contradictive. In this paper we would like to concentrate on the entire CatB paper and try to dissect the main ideas of CatB.
I earlier noted that, in CatB, open source is described as a revolutionary phenomenon. To me it is just another form of a scientific community. Similarly for me the development of Linux is not a new and revolutionary model, but just a logical continuation of the Free Software Foundation's (FSF) GNU project. This project has strong connections to MIT. I am convinced that this academic connection was crucial to the success of GNU just like the connection to the University of Helsinki immensely helped the Linux project in its early, most difficult stages.
This paper consists of two parts. In the first part I will analyze the key ideas of CatB. In the second part I will examine the distortions of status competition phenomena in CatB and will try to provide a more objective picture of the status competition in the OSS environment.

Some Vulnerabilities of The Cathedral and the Bazaar

"If liberty means anything at all, it means the right to tell people what they do not want to hear."
- George Orwell

In this part of the paper I would like to concentrate on the ideas expressed in CatB. These ideas later uncritically became a part of open source folklore; they are frequently reproduced in papers and interviews. Many open source authors base their arguments on an implicit assumption that these ideas are true. Some of the most important ideas in CatB include:

Brooks' Law does not apply to Internet-based distributed development;

"Given enough eyeballs, all bugs are shallow";

Linux belongs to the Bazaar development model;

OSS development model automatically yields the best results;

The Bazaar development model is a new and revolutionary model of software development.

I will try to demonstrate that all of these ideas are very vulnerable. Let's start with the remarks on Brooks' Law as it was one of the most important statements in CatB.

Brooks' Law does not apply to Internet-based distributed development.

"The most common lie is that with which one lies to oneself; lying to others is relatively an exception."
- Fredrich Nietzsche

One of the most indefensible ideas of CatB is that Brooks' Law is non-applicable in the Internet-based distributed development environment as exemplified by Linux. From CatB (Italics in quotes are mine; original italics, if any, are bold italics):
"In The Mythical Man-Month, Fred Brooks observed that programmer time is not fungible; adding developers to a late software project makes it later. He argued that the complexity and communication costs of a project rise with the square of the number of developers, while work done only rises linearly. This claim has since become known as "Brooks's Law" and is widely regarded as a truism. But if Brooks's Law were the whole picture, Linux would be impossible."

This belief that programmer time scales differently as soon as programmers are connected to the Internet and are working on open source projects is repeated elsewhere in a different form:

"Perhaps in the end the open-source culture will triumph not because cooperation is morally right or software "hoarding" is morally wrong (assuming you believe the latter, which neither Linus nor I do), but simply because the closed-source world cannot win an evolutionary arms race with open-source communities that can put orders of magnitude more skilled time into a problem."

First I would like to stress that the famous book The Mythical Man-Month has acquired the status of a classic in the software engineering field. The book is definitely, by several orders of magnitude, more important than CatB; this critique will not harm the book. Many other concepts, phrases and even chapter titles from that famous book have become part of software engineering terminology. I can mention "the second-system effect", "ten pounds in a five-pound sack", "plan to throw one away", "How does a project get to be a year late?...one day at a time". In the early 60s while working as a project manager of Operating System/360 (OS/360), Frederick Brooks observed the diminishing output of multiple developers and that the man-month concept is but a myth. It is as true in 1999 as it was in 1975 when the book was first published.

The real problem with the CatB statement is that due to the popularity of CatB this statement could discourage OSS community from reading and studying The Mythical Man-Month, one of the few computer science books that has remained current decades after its initial publication. Actually the term "Brooks' Law" is usually formulated as "Adding manpower to a late software project makes it later". The term "mythical man-month" (or "mythical man-month concept") is usually used to identify the concept of diminishing output of multiple developers even if all work on a given project from the very start. One of the best explanations of this concept was given by Ray Duncan in his Dr. Dobbs review of The Mythical Man-Month:

"What is a mythical man-month anyway? Consider a moderately complex software application from the early microcomputer era, such as the primordial version of Lotus 1-2-3, Ashton-Tate dBASE, or Wordstar. Assume that such a program might take one very smart, highly-motivated, expert programmer approximately a year to design, code, debug, and document. In other words, 12 man-months. Imagine that market pressures are such that we want to get the program finished in a month, rather than a year. What is the solution? You might say, "get 12 experienced coders, divide up the work, let them all flog away for one month, and the problem will be solved. It's still 12 man-months, right?

Alas, time cannot be warped so easily. Dr. Brooks observed that man-months are not - so to speak - factorable, associative, or commutative. 1 programmer * 12 months does not equal 12 programmers * 1 month. The performance of programming teams, in other words, does not "scale" in a linear fashion any more than the performance of multi-processor computer systems. He found, in fact, that when you throw additional programmers at a project that is late, you are only likely to make it more late. The way to get a project back on schedule is to remove promised-but-not-yet-completed features, rather than multiplying worker bees.

When you stop to think about it, this phenomenon is easy to understand. There is inescapable overhead to yoking up programmers in parallel. The members of the team must "waste time" attending meetings, drafting project plans, exchanging EMAIL, negotiating interfaces, enduring performance reviews, and so on. In any team of more than a few people, at least one member will be dedicated to "supervising" the others, while another member will be devoted to housekeeping functions such as managing builds, updating Gantt charts, and coordinating everyone's calendar. At Microsoft, there's at least one team member that just designs T-shirts for the rest of the team to wear. And as the team grows, there is a combinatorial explosion such that the percentage of effort devoted to communication and administration becomes larger and larger."

Most top-level software professionals are more like artists, in spite of the technical nature of their specialty. It is not a coincidence that another classic book on programming is entitled The Art of Computer Programming. Communication, personality and political problems definitely creep into any project, as any manager of a sizable programming project can attest. These problems certainly drag productivity down.
It's simply naive to assume that for the same team Internet connectivity can improve performance in comparison with, say, LAN connectivity or using the same mainframe. Moreover, if we are assume the same level of developers, geographically compact teams will always have an edge over distributed Internet-connected teams. Open source uses the Internet to connect a geographically distributed pool of talent. In turn, it potentially raises the quality of that pool in the absence of geographical barriers. Reducing the effects of distance does not eliminate other constraints under which such projects operate, but can dramatically increase the quality of the pool of developers. That's the only advantage that I see.

I believe that the illusion of the non-applicability of "mythical man-month postulate" and Brooks' law is limited only to projects for which a fully functional prototype already exists and most or all architectural problems are solved. This may have been the case for Linux, which is essentially an open source re-implementation of Unix. With some reservations, it is probably applicable for all systems for which both the specification (Posix in case of Linux) and reference implementation (say FreeBSD or Solaris) already exist and are available to all developers. As was pointed out in the Halloween-I document:

"The easiest way to get coordinated behavior from a large, semi-organized mob is to point them at a known target. Having the taillights provides concreteness to a fuzzy vision. In such situations, having a taillight to follow is a proxy for having strong central leadership. Of course, once this implicit organizing principle is no longer available (once a project has achieved "parity" with the state-of-the-art), the level of management necessary to push towards new frontiers becomes massive. This is possibly the single most interesting hurdle to face the Linux community now that they've achieved parity with the state of the art in UNIX in any respects."

"Given enough eyeballs, all bugs are shallow"

"Only two things are infinite, the universe and the human stupidity, and I'm not sure about the former."
- Albert Einstein

One of the most important ideas promoted by CatB was the motto-style phase attributed to Linus Torvalds - "Given enough eyeballs, all bugs are shallow":

"In the bazaar view, on the other hand, you assume that bugs are generally shallow phenomena - or, at least, that they turn shallow pretty quick when exposed to a thousand eager co-developers pounding on every single new release. Accordingly you release often in order to get more corrections, and as a beneficial side effect you have less to lose if an occasional botch gets out the door."

The debugging of a complex system is much more difficult undertaking that simply getting a huge number of "eager co-developers" to analyze lines of code. For most complex projects for every second or third bug located and fixed another one may be introduced. Here Linux distributions can serve as a pretty good illustration as most "production" versions in retrospect (taking into account the number of bugs and fixes applied to them before they became obsolete) probably can be reclassified as beta.

CatB assumes that several talented developers can successfully work on the same piece of code in parallel without any coordination other than e-mail; one of them eventually will fix the code quicker than in the commercial environment with specially trained testers. Certainly, if enough talented developers try to find the same bug simultaneously it probably will be found sooner or later. But there are several problems with this idea of parallel debugging. Several questions arise:

Is this the best way to utilize talented developers?
Why waste their skills on debugging? In some commercial environments, professional testers provide an important edge. Synergy between volunteer developers and commercial developers and testers could be beneficial. Talented developers usually dislike debugging code not their own unless this is absolutely necessary; they want to create their own code. Moreover if we assume that ten talented developers work constantly to isolate one given bug, we might call this plan a kind of "Mongol horde" approach. Improved project management could be more beneficial than throwing ten programmers after a single bug. Talented developers who understand a given system are extremely scarce and this sort of approach is extremely wasteful.

Are all bugs created equal?
No. There are at least three major types of bugs - code errors, logical errors and architectural problems. One large subclass are just plain vanilla coding bugs (for example if (i=1){...} in C) that are not that difficult to find. The second are errors in logic that are approximately one order harder to find and fix. The most complex bugs are consequences of architectural flaws (Unix security) or limitations of tools used (C and memory leaks and overflow bugs). Numerous architectural flaws of Linux are widely known and are being rectified over time. From the software engineering perspective they are similar to architectural problems of any prototype that was converted into a production system; initially Linux was expected to be a temporary system to be eventually replaced by Hurd. Therefore the possibility of completely debugging Linux is related to making Linux architecturally superior to alternatives. The problem of stagnation of early architecture solutions, wasting time and effort in debugging instead of solving underling architectural problems, is common to OSS projects. The peculiarities of the open source status game give a higher priority to those "heroic" efforts in debugging rather than emphasizing software architecture or planning.

Is it easy to force gifted programmers to search for the same bug?
It depends. Usually forcing a lot of talented developers to look at the same piece of code is like herding cats. It's nether easy, nor a rewarding task, unless the bug is really critical (technically or politically). Talented developers are first programmers not testers; they usually prefer making their own bugs to fixing bugs of others. Any situation where many talented developers actually work on the same segment of code is more of an exception than a rule. With the increasing complexity of a given project, this pooling of talent will occur very rarely and only for the most critical or politically important bugs. For any sufficiently complex project there will be never be sufficient "eyeballs" to locate and eradicate all bugs.

Does it make sense to fix bugs in ugly source code?
Usually not. If the source is ugly (and as Ken Thompson pointed out, some parts of Linux are) additional bugs could be easily introduced by fixing an existing one. Rewriting, not fixing, is a more viable option here. The open source model, with its over-reliance on debugging, could be at a disadvantage. In the commercial environment a talented manager could partially avoid this problem by exercising his or her judgment and power. In open source, modules incorporated at early stages of a given project could outlive their utility in short order. Thanks to inertia and programming overload on key developers, there may be no effort to rewrite these modules until serious problems occur that justify the effort. Sometimes only when a problem become visible will you be able to attract a decent developer to rewrite code - a very unrewarding task indeed, as any programmer can attest. In this case the publicity about the bug makes fixing it a worthwhile investment in the status game (often the situation with security bugs).

The seemingly infinite number of bugs (closely related to architectural flaws) preclude the positive influence of random bug fixing on the product as a whole. For Linux, I see no breakthrough in quality in comparison with Solaris or FreeBSD. In his interview in the IEEE Computer Society's magazine Computer Ken Thompson stated:

"Computer: In a sense, Linux is following in this tradition. Any thoughts on this phenomenon?

Thompson: I view Linux as something that's not Microsoft - a backlash against Microsoft, no more and no less. I don't think it will be very successful in the long run. I've looked at the source and there are pieces that are good and pieces that are not. A whole bunch of random people have contributed to this source, and the quality varies drastically.

My experience and some of my friends' experience is that Linux is quite unreliable. Microsoft is really unreliable but Linux is worse. In a non-PC environment, it just won't hold up. If you're using it on a single box, that's one thing. But if you want to use Linux in firewalls, gateways, embedded systems, and so on, it has a long way to go."

In his response to the ESR's letter on his interview, Ken Thompson noted that:
"I do believe that in a race, it is naive to think Linux has a hope of making a dent against Microsoft starting from way behind with a fraction of the resources and amateur labor. (I feel the same about Unix.)"

Even superficial analysis of the Bugtrack archive confirms that most developers prefer making their own bugs, not fixing bugs of others. For accidental contributions to the kernel, the situation can be even worse. In a very interesting recollection of his early Linux experience Lars Wirzenius wrote:

"For example, my one stab at kernel programming resulted in a bug that took three years to track down and fix, and even then it was done by someone hacking OS/2. I'm referring to the sprintf function inside the kernel."

This discussion suggests that limiting those who can directly contribute to the kernel to a few most qualified developers (the core team) is a good idea. The Bazaar model should be avoided in kernel development as much as possible.

Does Linux belongs to the Cathedral model or to the Bazaar model?

"The true faith compels us to believe there is one holy Catholic Apostolic Church and this we firmly believe and plainly confess. And outside of her there is no salvation or remission from sins."
- Boniface VII, Pope (1294-1303)

Many have pointed out that the level of decentralization in the Linux world is open to review. The black and white picture painted in CatB (monolithic, authoritarian Cathedral model vs. democratic, distributed Bazaar model) is too simplistic. These metaphors for high centralization (Cathedral) and no centralization (Bazaar) do not account for the size of a given project; its complexity, timeframe and time pressures; its access to resources and tools; and, whether we are talking about core functionally (like Linux kernel) or peripheral parts of the system. For large projects like operating systems it is especially important that the core of the system is developed in a highly centralized fashion with a small core team. Peripheral parts of the system can benefit from a more relaxed, more decentralized approach. I believe that CatB fails to distinguish between these two types of activities, as the following quote demonstrates:

"In retrospect, one precedent for the methods and success of Linux can be seen in the development of the GNU Emacs Lisp library and Lisp code archives. In contrast to the cathedral-building style of the Emacs C core and most other FSF tools, the evolution of the Lisp code pool was fluid and very user-driven. Ideas and prototype modes were often rewritten three or four times before reaching a stable final form. And loosely-coupled collaborations enabled by the Internet, á la Linux, were frequent."

One should not contrast these two activities, Emacs C core and Lisp code. They are different and should be examined with different models in mind. There are advantages in using mixed models other than the pure centralized (Cathedral) or completely decentralized (Bazaar) extremes. It's hardly surprising that in reality a mixed model dominates or that there's a place for highly centralized development in the Linux world. In the opinion of Linus Torvalds:
"Open source may sound democratic, but it isn't. At the LinuxWorld Expo on Wednesday, leaders of some of the best-known open source development efforts said they function as dictators.

The ultimate example is Linux itself. Creator Linus Torvalds has final say over all changes to the kernel of the popular open source clone of Unix. Because the Linux development community has grown so large, most software patches are reviewed by many different people before they reach him, Torvalds said.

If he rejects a patch, it can mean a lot of other people threw a lot of effort down the drain, he said. However, it enables him to keep Linux organized without spending all of his time on it, he added.

"My workload is lower because I don't have to see the crazy ideas," Torvalds said. "I see the end point of work done for a few months or even a year by other people.""

One can immediately see elements that are foreign to the Bazaar style in the current stage of the Linux kernel development as described by the principal author of the kernel. It looks more like a highly centralized (Cathedral) development model. For example you cannot communicate with Linus directly but need to supply patches to his trusted lieutenants. If the patch is rejected, there is no recourse, which sounds pretty un-democratic. Jordan Hubbard noted:

"Despite what some free-software advocates may erroneously claim from time to time, centralized development models like the FreeBSD Project's are hardly obsolete or ineffective in the world of free software. A careful examination of the success of reputedly anarchistic or "bazaar" development models often reveals some fairly significant degrees of centralization that are still very much a part of their development process."

Of course these arguments do not exclude the fact that some activities in Linux can be classified as belonging to the decentralized (Bazaar) model, especially development of drivers, utilities and small applications. The same is even more true in the DOS/Windows/Windows NT world. Certainly there's more shareware and freeware available for DOS/Windows compared to Linux. DOS was the first democratic software environment that utilized the power of Internet and long before Linux established major archives that "aptly symbolized" the Bazaar style by taking "submissions from anyone". Symtel and cdrom.com are just a few examples. Jonathan Eunice remarked:

"The problem is not that it reaches wrong conclusions, but rather that it assumes an open-is-good/commercial-is-bad worldview that we doubt most IT managers share. Beyond proselytizing, it also paints all of software development against just two exemplars. The Cathedral represents a monolithic, highly planned, top-down style of software development. The Bazaar, which he recommends, represents a chaotic, evolutionary, market-driven model. Rarely do such simple categorizations do justice to reality. Finally, the combination of Netscape's strategic shift and the anti-Microsoft tone of the open source community lead one to assume that Netscape exemplifies the Bazaar, leaving Microsoft as the regressive, dogma-driven Cathedral. But that too is oversimplified.

... Raymond's paper touches upon, but never highlights, the synthesis of his antipodes: "when you start community-building, what you need to be able to present is a plausible promise. Your program doesn't have to work particularly well. It can be crude, buggy, incomplete, and poorly documented. What it must not fail to do is convince potential co-developers that it can be evolved into something really neat in the foreseeable future.

What a perfect explanation of the Microsoft juggernaut! As Redmond understands, your product need not be strong at first shipment as long as one has a "plausible premise" of later product strength, of setting a standard, or of uniquely satisfying a pressing need - and as long as the early versions provide a foundation of revenue, user base, and an investment pool that is wisely reinvested to achieve continual improvement and eventual leadership. The units of measurements may differ between the open source and commercial communities, but the concept of economic foundation is identical.

That Microsoft - the antithesis of the Linux/freeware/open source community - can so effectively utilize the same Bazaar principles suggests two conclusions. First, that the Cathedral and the Bazaar provide too few data points to use in plotting the programming process. Second, that there are a set of surprisingly universal guidelines involving flexibility, modularity, and an aggressive feedback loop that all software developers should seriously consider adopting as their guiding principles."

Many Microsoft business strategies have close analogs in open source. As the authors of the Halloween-I document put it:

"Different reviewers of this paper have consistently pointed out that internally, we should view Microsoft as an idealized OSS community but, for various reasons do not."

Based on the accounts in James Wallace and Jim Erickson's book Hard Drive: Bill Gates and the Making of the Microsoft Empire (New York: Harper Business, 1992) Microsoft's early business practice seemed to follow Bazaar-style ideas:

Release early. Release often.
Initiative is the key. Turn users into testers or co-developers. Attracting developers is very important as it helps bring in users. Microsoft could be considered a pioneer in utilizing the following recommendation from CatB:

"Given a bit of encouragement, your users will diagnose problems, suggest fixes, and help improve the code far more quickly than you could unaided."

The Importance of Having Users.
Cultivate your user base. Avoid perfection like a plague. Consider lack of perfection not as a problem but as an opportunity for improvements in future versions. As Linus Torvalds noted:
"... the future of the operating system software will be driven by user needs instead of programmers' curiosity ... The technology is not driving the market anymore. What people want is driving it... ."

This is also the Microsoft style; Linus' quote could easily have been attributed to any Microsoft executive. Essentially, competitors attacking a given product on the technical front can be counterattacked on the user-base front. Ultimately, you can destroy competitors independently of the technical merits of a product just by winning the installed base war. Having a large installed base is as important - or even more important - than the technical merits of a given product. If necessary, market share can be won by giving software away, giving the source away or both.

Microsoft, and others, have understood and demonstrated on many occasions that market share is a convertible currency. For example, there is simply the "halo effect" from users simply saying to others "IE is the top Internet browser" or "Red Hat is a top Linux distributor". The "McDonald's approach" - a quick serving of standardized meals that well suit the limited needs of an average person - can be considered a vital part of Microsoft's (and certainly others) strategy. Microsoft simply discovered early in the software business that a mediocre product, available now with minimal hardware requirements and reasonable ease of use, has better chances of market acceptance than a truly innovative one.

To enter the business or software development one needs carefully select and attack existing weak spots.
Try to find an area where there is an untapped market, a weak competitor or where a large dissatisfied user community exists. Watch the competitor's achievements and difficulties and use them as opportunities.

Good programmers know what to write. Great ones know what to rewrite (and reuse).
Don't pay too much attention to innovative features and architectural superiority as long as you can provide a plausible promise. Recycle from existing products. Expect that some remote future versions (magic version 6.0 for Microsoft) will make a given product decent. Preempt the market with an average product in advance of the competition. Establish the product as the standard so you can enlist the help of others to polish and improve the basic architecture. The following quote from CatB could probably be used, without changes, in any Microsoft developer's seminar:

"And consider Linux. Suppose Linus Torvalds had been trying to pull off fundamental innovations in operating system design during the development; does it seem at all likely that the resulting kernel would be as stable and successful as what we have?"

These opinions are shared by other critics of CatB. For example, Jonathan Eunice wrote:

"... The reality is that Microsoft turns out highly-capable programs, which it rapidly refines. While they do have a certain all-singing, all-dancing abundance of features, they are also unusually modular and cross-leveraged. Redmond's development model emphasizes many of those very attributes Raymond identifies with the Bazaar: quick turnaround, modular construction techniques, a large and active user base, many entities striving to improve it in a thousand parallel dimensions, and a strong feedback loop."

There is also one more source of centralization in the OSS world that implicitly acts against the Bazaar model. Linux (and most other prominent OSS products) use the GNU license; it was probably the smartest decision that Linus Torvalds ever made. GNU-based commercial space gives a tremendous advantage to the first entrenched distributor/integrator, to the first entrenched brand, inhibiting forking.

The authors of the Halloween-I document pointed out this aspect of "license space" in a similar way:

"Like commercial software, the most viable single OSS project in many categories will, in the long run, kill competitive OSS projects and 'acquire' their IQ assets. For example, Linux is killing BSD Unix and has absorbed most of its core ideas (as well as ideas in the commercial UNIXes). This feature confers huge first mover advantages to a particular project.
... Possession of the lion's share of the market provides extremely powerful control over the market's evolution."

Software evolution, not revolution, is the theme. A large community of users inhibits revolutionary changes. From the perspective of those out in the trenches, Linux, gcc, Perl, Apache and other open source products are tools. And in the opinion of users, a standard tool is much better than a non-standard tool. It is very difficult to unseat the dominance of an application that implements a standard. Nevertheless, success creates conditions that may lead to eventual stagnation. Like for any Posix-compatible OS, the possibility of innovation in Linux is limited by adherence to Posix standards. I am inclined to see part of the value and attractiveness of Linux as a new giant standard SuperBIOS not owned by any company. In a sense the game is over with stability becoming very important for application development. Eventually, we will see a lot of polishing of basic subsystems with better scheduling, SMP, TCP/IP, file systems and so on, but no radical architectural or API-related changes. In essence for Linux innovation will be, by and large, limited to applications. As Jamie Zawinski put it:

"There exist counterexamples to this, but in general, great things are accomplished by small groups of people who are driven, who have unity of purpose. The more people involved, the slower and stupider their union is."

Linux definitely passed the small group phase some time ago. Linux may become an innovative OS if it moves into a new domain, such as handheld devices, perhaps the most dynamic part of the OS market. But this is difficult due to a large installed base in the PC environment and architectural solutions that fit this particular environment better than others.

Does OSS development model automatically yield the best results?

"People think just because it is open-source, the result is going to be automatically better. Not true. You have to lead it in the right directions to succeed. Open source is not the answer to world hunger."
- Linus Torvalds

On the quality of OSS software. I do not want to turn this paper into a discussion about family values. I just want to discuss a single fact that I am aware of. For several years I studied the phenomenon of the so called Orthodox file managers. From what I saw of development in the OSS environment (Midnight Commander and Northern Captain) and in the Windows environment ( FAR and Windows Commander) there were no significant differences in quality between the best OSS versions and the best commercial versions. In small projects even the number of developers is less important than it was suggested in CatB. While Midnight Commander was developed by a team, Northern Captain and two commercial (shareware to be more exact) versions were developed by single programmers. In fact, the author of FAR is mainly known as the author of a very influential archive manager, RAR, that competes in quality with gzip. For him FAR development was a part-time job. Therefore I would agree with the first part of the following CatB quote:
"Perhaps this should have been obvious (it's long been proverbial that "Necessity is the mother of invention") but too often software developers spend their days grinding away for pay at programs they neither need nor love. But not in the Linux world - which may explain why the average quality of software originated in the Linux community is so high."

To be fair, "the average quality of software" for the Windows community (low-cost commercial products, shareware and freeware) is also exceptionally high despite weaknesses of the underlying OS. Just look at Windows games. Certainly Windows archives are "a great babbling bazaar of differing agendas and approaches" just like Linux archive sites.

What is really new in the Linux development model?

"Paris is well worth a Mass."
- Henri IV

In a previous paper, I considered Linux as a logical continuation of the GNU project of the Free Software Foundation, a project with strong connections to MIT. I am convinced that this connection was crucial to the success of GNU just like the connection to the University of Helsinki immensely helped Linux in its early and most difficult stages. I saw the development of GCC and Emacs as grand repetitions of the development of Linux. While I do not agree with the arguments in CatB that turn Linux into a new development model, I now see one important innovation that already has started an important trend. For operating systems, the developmental and organizational complexity of the projects make the consequences of making code openly and freely available quite different from the consequences of opening source code in applications or other small or medium sized projects.
An important implicit assumption in CatB is that open source code is the best thing since sliced bread. It doesn't matter if we are talking about a one hundred line program or a one hundred thousand line program. CatB explicitly assumes that Fetchmail and Linux belong in the same class. In reality, they are quite different projects with very little in common. From the software engineering point of view, there are tremendous differences between a relatively simple project like Fetchmail and a large and complex project like the Linux kernel.

The actual problem is program comprehension. It's always good to have source code but the problem is that often it's not enough. One needs additional documentation to understand a given program. This need grow exponentially with the growth of the code base. Reverse engineering - that is, the examination of code in order to reconstruct design decisions taken by the original implementors - and program understanding are two important branches of software engineering that deal with the issues of inadequacy of source code. Just try to analyze a non-trivial compiler without language definition. Any large and complex program often has something similar to built-in language or a collection of subroutines that emulate some abstract computer. Therefore source code is just one important part of the complex infrastructure and the intellectual knowledge base that are used in large software projects.
Anyone who has worked in a large re-engineering project forever remembers the feeling of frustration when first facing mountains of poorly documented (although not always poorly written) code. The availability of source code does not help much if key developers are no longer available. If the program or system is written in a relatively low level language like C, Cobol or Fortran and is poorly documented, then all major design decisions are lost in details of coding and need to be re-engineered. In such situations the value of higher level documentation, like specification of interfaces and descriptions of architecture, can be greater than the source code itself.

This realization of the inadequacy of source code has led to attempts to merge code and higher level documentation. One of the best known attempts was by Donald Knuth; see his essays in the book Literate Programming (Stanford, Calif.: Center for the Study of Language and Information (CSLI), 1992). Perhaps the most famous suppressed book in the history of computer science was Lions' Commentary on Unix, that contains high level descriptions of the Unix source code as well as related algorithms. It was illegally pirated for more than twenty years after its initial publication in 1977.
Complexity and size effectively close source code for system programming projects like OSes compilers after, say, 100K lines of code without good higher level documentation or participation in the project from its early stages. This "binarization" of source code in large system programming projects may mean that there is little strategic importance to keep the source code of system programs closed after it reaches a certain level of maturity, with corresponding size and the level of complexity. Real competitors were always aware about the state of development in another camp; it was application developers who suffered from a lack of information.

This approach is probably most viable for - but not limited to - programming products like operating systems where application developers can benefit from easy access to vital information to simplify debugging. As such, it could be considered as a variation of the principle on "the importance of having users" discussed earlier. Applications are the key to the user kingdom; as Henri IV remarked "Paris is well worth a Mass".

Moreover the complexity of a mature programming product serves as a barrier of entry almost as effectively as NDA (please note that NDA does not prevent access to the source code, it just creates an additional barrier to this access). The speed of development of a given commercial product can make misappropriation of parts of the code not as attractive as one might assume, especially if competitors are at a mature stage and took a different path in development.

Essentially, to create a proprietary advantage you just need to create a proprietary infrastructure for understanding the code. In the simplest case, hiring several key developers, and by that owing a large part of the project noosphere, might be sufficient to create a proprietary advantage necessary for making commercial distribution viable.

Taking these factors into account means that I would disagree with CatB's view of Netscape as a testbed for OSS viability:

"Netscape is about to provide us with a large-scale, real-world test of the bazaar model in the commercial world. The open-source culture now faces a danger; if Netscape's execution doesn't work, the open-source concept may be so discredited that the commercial world won't touch it again for another decade."

Netscape is just the wrong test; I do not believe that the failure of Netscape will undermine open source movement. Netscape's level of complexity, introduced by opening a pre-existing project rather late in its development cycle, is probably the main reason for difficulties in attracting new developers. It may be some time before anyone will be able to decide if this experiment was successful or not. Given the significance of Netscape relative to IE, certainly special efforts have been made and will be made to avoid failure. But it is clear is that complexity of Netscape's code represents a formidable barrier of entry that is not easily overcome by even highly motivated and qualified developers.

The Netscape experience suggests an interesting hypothesis. The pool of key developers is usually formed early in the life of a complex project when the project and program are still understandable and the intellectual and conceptual barrier to entry is low. After the project reaches a certain level of maturity, it essentially closes itself due to the "binarization" of code. The complexity of code makes the cost of entry into the project in mature stages much higher than at the beginning. It would be interesting to check this hypothesis for major programs like Perl, Apache and Linux. Does the pool of developers for large projects remains more or less constant after the product reaches a certain level of maturity? If you are not onboard from the beginning, do you need to spend a lot of time and resources to catch up?

Certainly one striking example of the difficulty in dealing with large amounts of source code is the so-called Y2K problem. Ignoring all of the hype, the essence of the problem is that, despite the availability of source code, many companies have spent and are spending substantial amount of time and money trying to fix one trivial logical error, the way some programs represent years with just two digits. The most important lesson here is that for old and complex software systems even small problems are multiplied by the complexity and age of the underling system and thus may turn into huge problems. Understanding legacy code without architectural blueprints is probably one of the most difficult activities for programmers. Often it is more difficult that writing new code. Therefore in the absence of the original developers or substantial documentation, writing a completely new code instead of patching old code may be a viable and cost effective strategy.

In any case, concealing architectural information can be an effective strategy for controlling an open source project. We will examine this issue later.

Status Competition in Internet-Based Developer Communities

"It can be said that the first wisdom of sociology is this. Things are not what they seem."
- Peter Berger

Status competition, like most sociological phenomena, is very complex and should not be idealized. In CatB much of the discussion on status competition and performance in work groups is both idealized and primitive. A vast amount of literature in economics, evolutionary anthropology and especially in sociology exists on the subject, and seems to have been ignored by ESR. Given that there are no decent references on these issues in CatB, the reader might draw the wrong impression in several ways.
Similar to academic hierarchies, status competition involves group members evaluating themselves relative to their colleagues according to some shared value scale. It's naive to think that status competition always enhances group performance by pushing members of a given group to work harder. Sometimes status competition can negatively influence performance by inducing unproductive behavior.

Since open source is a social phenomenon, the status of any member is influenced both by contributions to one or more projects (contribution in the purely technical sense) and by non-productive, social activities of status enhancement. Political behavior including political maneuvering are usually concealed in programming groups involved in open source projects. Those who have political power usually deny it; those who want it pretend they don't. Those who are skilled in political maneuvering conceal their abilities.

Political maneuvering can be a very successful way to raise status, especially in large groups, but the success of such maneuvering can lower the overall morale of the group. Therefore, group performance may fluctuate and be unstable over time, especially if the group does not provide clear lines for status advancement. I believe that CatB fails to recognize several important dimensions of status competition. Among them I would like to outline:

Hierarchical structure and corresponding distribution of political power in the open source environment;

the possibility of unfair status hierarchies (favoritism);

poisoning of the peer review process and problems with egoless programming;

the danger of overload and burnout;

the fear of exclusion as a motivational factor;

the possibly of wrong status achievement lines;and,

the role of the press.

They will be discussed below.

Hierarchical structure and corresponding distribution of political power in the open source environment.

"All issues are political issues, and politics itself is a mass of lies, evasions, folly, hatred and schizophrenia."
- George Orwell

CatB tends to describe open source as supportive, harmonious, trusting, collaborative and cooperative. Such non-political positioning can lead us erroneously to believe that members of the open source movement always behave in ways consistent with the interest of the movement. Let's start our discussion of this concept from the following quote from CatB:

"While cheap Internet was a necessary condition for the Linux model to evolve, I think it was not by itself a sufficient condition. Another vital factor was the development of a leadership style and set of cooperative customs that could allow developers to attract co-developers and get maximum leverage out of the medium.

But what is this leadership style and what are these customs? They cannot be based on power relationships - and even if they could be, leadership by coercion would not produce the results we see.

... The "severe effort of many converging wills" is precisely what a project like Linux requires - and the "principle of command" is effectively impossible to apply among volunteers in the anarchist's paradise we call the Internet. To operate and compete effectively, hackers who want to lead collaborative projects have to learn how to recruit and energize effective communities of interest in the mode vaguely suggested by Kropotkin's "principle of understanding". They must learn to use Linus' Law."

Ignoring political behavior and hierarchical structures in the open source community means ignoring reality. In the OSS community as a whole, and in each project in particular, there are political systems with corresponding and sometimes fuzzy hierarchical structures. That fact can explain many of the irrational behaviors in the OSS movement. Why else would some distort or withhold information, restrict their output, overpublicize their successes, hide their failures, distort statistics or otherwise engage in activities that appear to be at odds with OSS goals?

Power in social relationships is usually defined as the ability to force the other persons to do something that they would not do otherwise. It is symbolized in status. Power is a function of dependence, enabling one to manipulate the behaviors of others. For example, Linus Torvalds has considerable power because he can accept or reject patches from other developers. Power is a complex phenomenon that has many dimensions. Among them I would like to mention persuasive power- control on allocation and manipulation by symbolic rewards valued by the group (the open source press has this kind of power and from this point of view is a political superstructure) and knowledge poweror access to unique information. If an individual controls unique information and that information is needed to make important decisions than that individual has knowledge-based power and can exercise it. When people get together in open source activities power links and coordinates will be established. It's only a matter of time when power will be exerted.

Sometimes the natural process of creation of hierarchical structures is considered by developers as a defensive reaction to noise and e-mail overload. Alan Cox described it in this way:

"Linux 8086 went on, the real developers have many of the other list members in their kill files so they can communicate via the list and there are simply too many half clued people milling around. It ceased to be a bazaar model and turns into a core team, which to a lot of people is a polite word for a clique. It is an inevitable defensive position in the circumstances."

We will define political behaviors as those activities that are not required as part of one's role in a given open source project or movement, but that influence, or attempt to influence, behavior of other members within the group. Some of these acts are trivial like flaming, bypassing chain of commands, forming coalitions, obstructing unfavorable decision of the leader or developing contacts outside the group to strengthen one's position within the group. Other are borderline like in-fighting, sabotage, whistle blowing and public protests that affect status of the project or movement as a whole. Most politics in open source communities is trivial although some individuals try to play "hardball". It's quite pragmatic; borderline political behaviors pose a very real risk of group sanctions, including loss of both status and membership in the project or group.

As more status in open source becomes more connected with reward allocation there will be more political maneuvering. This is a typical problem for leaders in any kind of project. CatB describes the leaders of OSS projects as democratic individuals. But, in practice, key developers tend to see their position as a license to make unilateral decisions. These leaders fought hard and often paid high personal costs to achieve their status. Sharing their power with others runs directly against their own aims and ambitions, although too tight of a grip can lead to undesirable consequences. The notion that the "principle of command" is abolished and of an "anarchist's paradise"in CatB are oversimplifications. The history of OSS projects provide plenty of convincing counterexamples:

"One of the people that had also been actively working on the task of building networking support was Fred van Kempen. After a period of some uncertainty following Ross's resignation from the lead developer position Fred offered his time and effort and accepted the role essentially unopposed. Fred had some ambitious plans for the direction that he wanted to take the Linux networking software and he set about progressing in those directions. Fred produced a series of networking code called the 'NET-2' kernel code (the 'NET' code being Ross's) which many people were able to use pretty much usefully. Fred formally put a number of innovations on the development agenda, such as the dynamic device interface, Amateur Radio AX.25 protocol support and a more modularly designed networking implementation. Fred's NET-2 code was used by a fairly large number of enthusiasts, the number increasing all the time as word spread that the software was working. The networking software at this time was still a large number of patches to the standard release of kernel code and was not included in the normal release. The NET-FAQ and subsequent NET-2-HOWTO's described the then fairly complex procedure to get it all working. Fred's focus was on developing innovations to the standard network implementations and this was taking time. The community of users was growing impatient for something that worked reliably and satisfied the 80% of users and, as with Ross, the pressure on Fred as lead developer rose.

Alan Cox proposed a solution to the problem designed to resolve the situation. He proposed that he would take Fred's NET-2 code and debug it, making it reliable and stable so that it would satisfy the impatient user base while relieving that pressure from Fred allowing him to continue his work. Alan set about doing this, with some good success and his first version of Linux networking code was called 'Net-2D(debugged)'. The code worked reliably in many typical configurations and the user base was happy. Alan clearly had ideas and skills of his own to contribute to the project and many discussions relating to the direction the NET-2 code was heading ensued. There developed two distinct schools within the Linux networking community, one that had the philosophy of `make it work first, then make it better' and the other of 'make it better first'. Linus ultimately arbitrated and offered his support to Alan's development efforts and included Alan's code in the standard kernel source distribution. This placed Fred in a difficult position. Any continued development would lack the large user base actively using and testing the code and this would mean progress would be slow and difficult. Fred continued to work for a short time and eventually stood down and Alan came to be the new leader of the Linux networking kernel development effort."

Actually all major OSS projects are hierarchically structured. This structure allows the head of a given project to dictate his will, which if necessary can be defended by political means - by direct exercise of power as in the example above. The claim that "They cannot be based on power relationships" has pretty superficial connections with reality.

For the same reason, knowledge sharing has its limits in OSS. We will discuss this later in more detail when I examine the concept of "egoless programming". Knowledge-based power is one of the most effective means to force others to perform as desired. Competence is the most legitimate source of political power and status in the OSS movement. No leader will ever distribute all the information he possess because ultimately it undermines his power. Actually, it is often physically difficult for a given leader to distribute all information given that any leader is usually overloaded. Open source is not immune from politically enforced limits on information sharing within a project. The mere availability of source code does not automatically translate into access to the most critical information.

Those with insufficient power are often seeking it, forming coalitions with others to enhance power and status. Coalitions are a natural phenomenon and cannot be avoided; there is a strength in numbers. The natural way to gain influence is to became a member of a coalition. Once a given coalition gains sufficient members it can challenge a single leader and promote any desirable changes. In this sense this sort of coalition becomes a collective dictator.

The possibility of unfair status hierarchies (favoritism).

"All animals are equal, but some animals are more equal than others."
- George Orwell

Favoritism is the conferral of any benefit, reward or privilege from someone in power to a member of the group (e.g. preferential job assignments, or, negatively, conscious ignorance of someone's contribution) that is based on personal preferences, not individual performance or technical merits. Typically favoritism has been described in traditional organizations (see, for example, [Prendergast & Topel, "Favoritism in Organizations," Journal of Political Economy, volume 104 (October 1996): pp. 958-978), distributed Internet projects are not immune to this danger. Favoritism, or even its perception, fosters a lack of respect, encourages distrust, destroys initiative and creates other problems which undermine the morale of a group.

If leaders compromise their authority and respect by allowing favoritism, real or perceived, problems will creep into a given project. If the leader of a project acts as a charismatic authority than that leader can delegate tasks based on personal preferences. To effectively control their followers a leader must be recognized as important; typically in OSS projects the early followers display a high level of loyalty to their leader. It's natural for an OSS project leader to trust, respect and depend on early users and early co-developers, the old guard. This relationship is the result of experience, common interests, goals or backgrounds or simply the longevity of their relationship. However, long-term sustainability of a project depends upon an environment where all participants are valued as individuals and treated with fairly and equally.

Favoritism can have a negative connotation. Consider the pain of participating in a project where you think the leader doesn't like you. You may feel trapped, but what can you do other than leave? Certainly, you'll need some sense of humor in this situation. The creation of an elite group and the exclusion of some developers over others can creates problems. Rejected developers, for example, can migrate to competing projects.

Poisoning of the peer review process and problems with egoless programming.

"The radical invents the views. When he has worn them out, the conservative adopts them."
- Mark Twain

People outside the academy usually assume that the academic peer review process is a simple, objective and effective mechanism. In reality simple, objective and effective peer reviews are only a statistical approximation, with wide variations possible in individual cases:

"The process of peer review has attracted its share of criticism from academics (and others) over the years. A number of commentators (e.g., Agger, 1990; Readings, 1994) argue that scholarly refereeing is inherently conservative. Those selected to be referees, at least for 'established' international periodicals, are generally 'recognized' scholars in their field who have already passed through the various publication hoops themselves. Original work which challenges orthodox views, while ostensibly encouraged, is - in practice - frequently impeded by academics who have a stake in keeping innovative critical scholarship out of respected journals. For if a contributor to a major journal rubs against the grain of conventional scholarly wisdom in a given discipline, it is likely his or her submitted manuscript will have to pass through the hands of one or more academics who are prime representatives of prevailing opinion.
... In the case of journals, much depends on the goodwill of editors. Anecdotal tales of being 'set up' by editors abound in academic corridors. Such experiences - where referees known to be especially 'vicious' in their criticisms, or to have strong prejudices against particular perspectives, are selected - can be devastating for beginning scholars setting out on the path to an academic career.

... Agger (1990) maintains that, given the shortage of journal space and the abundance of manuscripts in most fields of study, the balance of power at present rests very much in the hands of those who edit, review for, and produce the journals. There is, his analysis suggests, simply not enough room for everybody - at least not in 'respected', international journals. Agger claims that much of the writing produced by academics is either never published or ends up in local, unrefereed publications. As a result, it remains - as far as the international scholarly community is concerned - largely 'invisible'. With so many academics competing to publish in prestigious refereed journals, traditional canons of collegial respect and scholarly support can sometimes disappear."

Let's face it. Your work can be rejected out of overload, petty jealousy, incompetence or political motives. Any aspiring OSS participant needs to be aware of these possibilities. The rosy view of open source as an ideal community of constantly cooperating individuals is an illusion. At the same time the benefits of peer review outweigh a situation where 'anything goes'.

For similar reasons, the concept egoless programming does not really work in open source. It may be one thing to share source code, yet another to share underling ideas. The key to successful programming is not "source code", it's "understanding". The leaders of large and successful open source projects are often in a conflicting position due to political factors that inhibit a desire to share a higher level picture.

Egoless programming is usually defined (IEEE Std 610.12-1990) as:

"A software development technique based on the concept of team, rather than individual, responsibility for program development. Its purpose is to prevent individual programmers from identifying so closely with their work that objective evaluation is impaired."

In commercial environments it is a very difficult task to "prevent individual programmers from identifying so closely with their work". In OSS projects this over-identification with the project (ego-related motives) was recognized in CatB as one of the major forces that drives the movement. That means that the base for egoless programming is very shaky, if it exists at all. Over-identification with the project rules. Protecting one's status is a part of game whether we want it or not. Status itself is the result of possession of a higher level understanding of some part of the system (in the case of the leading developer - the critical part of the system). Gerard Weinberg noted the flaws of egoless programming when he wrote (IEEE Software, volume 16, number 1 (1999), pp. 118-120):
"The concept of "egoless programming," first described in 1971 in The Psychology of Computer Programming, is probably the most cited, most misunderstood, and most denied of all concepts expressed in the original edition. I've often wondered if I could have written this section with more persuasive power. Perhaps there would have been no controversy had I used the term "less-ego programming." Perhaps I needed more examples, or better examples. Perhaps I needed more experimental evidence.

... So, I've learned something in 25 years: If your reasons for not wanting your code reviewed are not based on logic, you'll never be convinced by logical arguments to change your ways ... ."

While giving about a hundred interviews and speeches Linus Torvalds never authored a single technical paper on the structure of the Linux kernel (if we do not count the attempt of justification of the monolithic structure of Linux kernel in The Linux Edge). We need to understand that leading developers are in conflicting positions as for sharing strategic, structural information. There are various possible rationalizations for this behavior. Ultimately, reverse engineering is an issue; an intelligent programmer with hard work and luck could uncover the architecture of a program or project given time and a few clues. The sheer ability to re-engineer crucial architectural information is a very fine test of programming abilities. As such, it is an effective test for inclusion into a given project's elite;this barrier effectively acts as the ultimate test of programming skill. This "selfish" element is actually recognized in CatB. Unfortunately the underlying conflict is ignored, that is the desire to preserve power and to ensure cooperation among co-developers:
"The Linux world behaves in many respects like a free market or an ecology, a collection of selfish agents attempting to maximize utility which in the process produces a self-correcting spontaneous order more elaborate and efficient than any amount of central planning could have achieved ...

... The "utility function" Linux hackers are maximizing is not classically economic, but is the intangible of their own ego satisfaction and reputation among other hackers ...

... Linus, by successfully positioning himself as the gatekeeper of a project in which the development is mostly done by others, and nurturing interest in the project until it became self-sustaining, has shown an acute grasp of Kropotkin's "principle of shared understanding"."

The same situation exists in science. While scientists are eager to share their discoveries, they usually carefully guard and often conceal the exact methods that lead to those discoveries. "Reveal the code, conceal the kitchen" is the principle that probably is as applicable to the OSS movement as "the principle of common understanding" proposed in CatB.

The danger of overload and burnout.

Volunteer OSS development is risky for yet another reason. Development on a serious application isn't just putting in a couple of hours each weekend. It's more like a full time job. Overload is possible if someone, in addition to their open source "hobby", is handling a demanding full-time job or studies. Some of the best code comes from those with taxing regular jobs in universities or corporations.

Paradoxically, success spells real danger for the well-being of a volunteer developer with a regular job. If your program is successful and user feedback is enthusiastic, you'll find yourself in a feedback loop that take more and more of your time. E-mail and phone calls pile up. As a professional computer science educator, I see a danger in romanticizing the OSS world, especially for college audiences. Burnout has been defined as a syndrome of physical and emotional exhaustion, involving the development of a negative self-concept, negative job attitudes, and loss of concern and feeling for clients. Examine a portion of the 1998 Linux timeline:
"March

Linus 3.0 is announced. The birth of Linus's second daughter causes great joy, and substantial disruption in kernel development as all work stops and many patches get lost. Some grumbling results as it becomes clear just how dependent the entire process is on Linus's continual presence.

October

Tensions explode on linux-kernel after Linus drops a few too many patches. Linus walks out in a huff and takes a vacation for a bit. Things return to normal, of course, but some people get talking. It becomes clear once again that the Linux kernel is getting to be too big for one person to keep on top of. Some ways of reducing the load on Linus are discussed, but nothing is really resolved."
Burnout is a major problem in the helping occupations, where people give a lot to others but fail to take care of themselves in the process. Not only programmers, but professionals in education, journalism, medicine, social work law enforcement are prone to burnout. It can begin when a developer has difficulty setting priorities, and despite user pressure for fixes and enhancements, understands that his volunteer participation in the project should have lower priority than job and family requirements. Some contributing factors include:

Long-term stress due to impossibly heavy workloads;
Conflict between idealistic expectations and reality;
E-mail overload;
Poor time management (for example, debugging binges);
Engaging in a rat race against commercial developers or commercial products;
Role conflicts with conflicting priorities from a regular job, family and volunteer efforts;
Architectural problems discovered late in the development cycle, with a failure to use the right tools or methods;
Perfectionism; that is, difficulties in adapting to changes in feedback volume and an inability to reject user requirements for enhancements and bug fixes;
Setting unrealistic deadlines with a parallel inability to meet multiple schedules; and,
Lack of support from other developers in the OSS project.

Effects from apathy to heart attacks have been reported in burnout studies. The percentage of OSS project leaders who have suffered at least once from depression induced by burnout is unknown. Most programmers eventually learn to deal with this kind of stress. Those who burn out don't. Often, the types of people who burn out are those who show the most promise at the beginning of their careers. They are perfectionists, idealists and workaholics. They start out enthusiastic about OSS, dedicated and committed to doing the best job they can. They typically are energetic, have positive attitudes and are high achievers. And they often try to bite off just too big piece to chew.

The management of a large project can significantly increase the danger of burnout for the leading developer. Usually the leading developer assumes all management overhead. Unless both user and developer bases are small just the quantity of e-mails can be overwhelming. For successful projects this means that user feedback interferes with coding and debugging and all three together interfere with normal routines at home or work. The problem of technostress due to information overload (combination of performance anxiety, information overload and role conflicts) among OSS players and burnout of the leaders is not even mentioned in CatB. Craig Brod defines technostress as:

"... a modern disease of adaptation caused by an inability to cope with the new computer technologies in a healthy manner. It manifests itself in two distinct and related ways: in the struggle to accept computer technology, and in the more specialized form of overidentification with computer technology."
Again I would like to point out that leaders of successful OSS projects get so much e-mail that just reading it may constitute a substantial workload. In Linus Torvalds' 1995 interview to IT4 he stated:
"Q: What do you consider a normal day (e.g. what takes your time, normally: school, Linux, spare time)?

A: Linux takes up my "working hours" - even just reading email takes a minimum of two hours a day, and to actually react to that email and do development fills up the rest of the day quite nicely indeed, thank you very much. However, I do take time off for hobbies etc, and I can essentially do Linux at my work at the university (they know I do Linux development, and they allow for that fact)."

There is also another aspect here that any viable OSS model should be able to predict. It is"death on positional treadmill" as one of the extreme cases of status completion. As in sports everybody wants to be a champion and at least at the beginning the OSS project leader needs to prove that he is better than his competitors. The problem is that sometimes this turns into a rat race against a stronger or commercial competitor and with each new release the leader is expected to achieve at least parity ASAP no matter at what cost. Loyal users ask for new features in a competing product and expect them without realizing that this is a volunteer development and as such it can have different priorities and a different pace of development. For example in the past a lot of users expressed dissatisfaction about the slow pace of Debian releases in comparison with Red Hat.

Due to his central position he leading developer also needs to assume most management duties; a large part of user feedback will be directed to him. That means that unless there is a hierarchical limitation on information flow (both in getting and sharing information) in comparison with his lieutenants, the leader of an OSS project will be severely hampered by an abundance of information. Eventually his technical expertise may suffer to the extent that it results in some discontent.

Let's summarize our findings. Starting an open source project may be fun and easy but success can be very difficult. Sustained development of a large open source product presents a real danger for volunteer developers and may turn an initially interesting project into exhausting maintenance of a complex program. Certainly an OSS developer might better consider himself an academician. After creating the basic elements of a program and leading it to a certain level of maturity, he may prefer to step down or to turn himself into a commercial developer and see if the world will adopt or reject it. There are obvious ego-related and ideological problems with such a decision, but splitting the product into commercial and non-commercial branches (like TCL and Sendmail), or creating a commercial company to benefit from volunteer efforts can be rational choices that should be seriously considered. For a complex and successful OSS product the volunteer stage should probably be a transitional one.

The fear of exclusion as a motivational factor.

"You can get much further with a kind word and a gun that you can with a kind word alone."
- Al Capone

Theoretically there is nothing to fear. There is no obligation to support any OSS product. Just stop developing/supporting the product if you do not wish to. The reality is more complex. The more successful a given product happens to be, the more you are tied to it; even being dissatisfied and frustrated you may prefer to continue development and support because the fear of exclusion lurks. To create a successful product, you need to sacrifice time and energy. The developer of a successful OSS product may invest years, making countless sacrifices just to find the time to develop. In some cases a member may have alienated family and friends. Not everyone may be able to appreciate his excessive zeal in pursuing the open source path. In such a case you can found yourself trapped. There is no safe exit from the current situation and in both cases you will lose. In case you decided to continue somecomfort and understanding can be found in places like Slashdot, Linuxtoday and other chat groups that emphasize the value of open source efforts. In case you stop you need to find a new value system and that can present some problems. CatB considered that

"But what is this leadership style and what are these customs? They cannot be based on power relationships - and even if they could be, leadership by coercion would not produce the results we see."

Not only the possibility of direct coercion is present in OSS projects, implicit coercion through the fear of exclusion is also a powerful tool. The true value of the informal group of co-developers and users often is only revealed after a developer has abandoned his project. And the loss can be painful as the social and professional bonds that the developer acquired may not survive his or her "defection".

The possibility of wrong status achievement lines.

"It is only shallow people who do not judge by appearances."
- Oscar Wilde

From my experience as a computer science teacher, I suspect that the majority of people who are participating in OSS, especially those who spend more than nominal amounts of time, are young adults (ages 16-25). The high school and college years are years of transition from adolescence to adulthood. To be considered an adult, a young person must make a set of long-term decisions, in most cases there is insufficient information to analyze the consequences of certain actions. For example, there is a need to choose a career, pick a mate and adopt a value system. Programming in general and OSS in particular provides a kind of escape from this sort of decision-making. It's not accidental that in the hacker community activities outside traditional the coding/compiling/debugging cycle are generally suspect, and regarded as lower status activities. "Code first, debug next, think later" seems to catch the essence of a hacker style. The related coding/debugging binges act in some circles as a way of distinguishing hackers from regular programmers. If negligence to food and clothing extends to architectural issues than time spend on debugging will dominate. Any bug not detected in the design phase will cost approximately ten times more time to detect at the coding phase and additional ten times more at the debugging phase. Wrong or primitive tools can contribute to overload too.

Some identify the ability to work 48 hours in a row as "a good thing" and a proof of membership in hackerdom. Unfortunately the binge-style, all-night coding/compiling/debugging sessions can be definitely harmful. As a result only the best people who transgress this practice do good work. Others sink because they do not follow established practices because they are not rewarded by the peer group at the same rate. Certainly, one of the most negative aspects of the hacker culture is its dismissive attitude toward software architecture and good software engineering practices. As in piano playing, only the most talented programmers can improvise on the keyboard; that is, work without explicit research, design and coordination phases, without studying the best books on the subject.

While partially escapist, hacking can become the main status achievement line. CatB does not really explain what is at stake that make status competition in open source development so intense and important. I suspect that it's not only altruism or ego-related motives - it can become an alternate career; as in business startups this is a risky one in which only a few enjoy eventual fame and high remuneration. Here the analogy with the academy holds; some can receive substantial financial benefits from attaining a high informal status position in the OSS community. As in academic circles, reputation is a convertible currency for the top members of the community. But that possibility can enslave individuals too.

The role of the press.

"I shall always confine myself strictly to the truth, except when it is attended with inconvenience; I shall witheringly rebuke all forms of crime and misconduct, except when committed by the party inhabiting my own vest ... ."
- Mark Twain's pledge upon becoming editor of the Buffalo Express

Linus Torvalds (like Microsoft to the Internet) was "Johnny-come-lately" to the Unix world. But his first and major success was the ability to overtake a very important stronghold - an existing development community (Minix community) and the corresponding channel of communication (Usenet group) and later to create his own. At that moment the Minix community - a very elite community of Unix enthusiasts - had approximately 40,000 members and included a substantial pool of talented developers (some who already successfully tried to rewrite parts of Minix to make it more Posix compatible). This redirection of the existing community toward a new goal was one of the most important events in Linux history, demonstrating the power of the Internet as a new press. None of other major OSes were developed this way. Of course the community was ripe for hijacking. There was some pre-existing discontent between the policy of the community leader ( Andy Tannenbaum) and aspirations of the top layer of the community. Core members of the Minix community formed the core of the Linux team. The hierarchy of the members was already largely established, which probably helped to avoid unnecessary conflicts.
Specialized Linux magazines emerged in 1994 when Robert Young (one of the founders and former CEO of Red Hat) and the ACC Corporation published the first two issues of the Linux Journal. Later Phil Hughes became an editor. In March 1994 SSC, the publishers of Unix Pocket References, took over publication of the Linux Journal from Robert Young and the ACC Corporation. It's unclear to what extent these connections helped Red Hat but it's quite clear that from the beginning, the specialized press was a powerful, highly political and certainly pro-Linux force.

This pro-Linux press immensely helped Linux, so the fighting between alternatives occur not only on technical grounds as CatB stipulates. Without this level of political support it is possible that FreeBSD (traditionally the OS of choice for ISPs) would have been more popular. Generally the reasons for Linux's popularity are more complex that outlined in CatB. I would just like to stress that the Linux press serves as important leverage against competitors like FreeBSD and its role is completely overlooked in CatB.

Among Linux press electronic news services like Slashdot and Linuxtoday are probably as important or even more important than traditional publications like Linux Journal, Linux Gazette and Linux Focus. Although Linuxtoday emerged as a leading news service for the community, recently commercialized Slashdot.org is a more interesting phenomenon. It is a unique blend of news service, technical discussion and advocacy forum that created a BBS-style community of "slashdotters". The number of participants is so large that the so-called "slashdot effect" is observed: when an interesting paper or application is reported on Slashdot there are often related server crashes from overload.

Slashdot.org powerfully illustrates both the idealism and the intolerance of the OSS movement. As the largest and most popular community Slashdot.org serves as a very important organizing point for Linux advocacy. The role of Slashdot founders in evangelizing Linux far exceed the role of traditional evangelists like ESR and is generally undervalued in the Linux community.

Slashdot also serves as a political instrument for suppressing opposition, but an investigation of this phenomena is beyond the scope of this paper. I would like just to briefly mention "Slashnoise". Slashnoise can be generally described as letters and other forms of commentary from readers expressing opinions about programs or papers that have never been read or used by the commentators. Such letters could be considered a special form of Lysenkoism, very close in spirit to "letters of workers and peasants" to the newspaper Pravdaduring the Lysenko years with the immortal and perfectly applicable phrase - "I did not read [the book], but I condemn it". Some researchers, such as Dr. Shaffer of Harvard (The Addiction Letter, (August 1995)), consider forums like Slashdot to be addictive. In that sense, Slashdot zealots are prisoners of their own "Slashdot cage":
"On-line service is not as reliable as cocaine or alcohol, but in the contemporary world,it is a fairly reliable way of shifting consciousness ... . Compulsive gamblers are also drawn to the tug of war between mastery and luck. When this attraction becomes an obsession, the computer junkie resembles the intemperate gambler ... . Unlike stamp collecting or reading, computers are a psycho-stimulant, and a certain segment of the population can develop addictive behavior in response to that stimulant."

I suspect that the frequency distribution of postings among Slashdot correspondents is very uneven and is probably close to the Pareto Distribution. It is also informally known as 80-20 distribution, or in other words approximately 20% of the user population ("Slashaddicts") are participating in almost 80% of the discussions. Confirming this frequency distribution would be difficult. Many post under a generic handle like "Anonymous Coward" and any individual could use multiple handles.
The Linux project also created an interesting form of Samizdat activity that is called the Linux documentation project. Over the course of several years the project was very successful in producing small documents like How-To manuals. Larger documents on the scale of books have not really appeared in any abundance. The scarcity of documentation on the internals of the Linux system is confusing. Why? I believe that the problem is really a matter of internal conflicts and vulnerability. According to CatB there are no problems over documentation:
"Many people (especially those who politically distrust free markets) would expect a culture of self-directed egoists to be fragmented, territorial, wasteful, secretive, and hostile. But this expectation is clearly falsified by (to give just one example) the stunning variety, quality and depth of Linux documentation. It is a hallowed given that programmers hate documenting; how is it, then, that Linux hackers generate so much of it? Evidently Linux's free market works better to produce virtuous, other-directed behavior than the massively-funded documentation shops of commercial software producers."

In reality Linux documentation project suffers from bureaucratization (for example, submissions are limited to SGML and plain text, HTML is not accepted) and poor management (earlier this year some contributors complained that it took more than a month to post a submitted version of a HowTo document; now steps have been taken to fix this situation).

I suspect that documentation authors feel the same sort of insecurity as leaders of successful OSS projects, but they have even less ways to limit abuse. The GPL-style license for documentation is a much less attractive license than GPL for code. If the large volume of code provides some measure of security through obscurity, documentation is not in quite the same position. It's not an accident that the best Linux books have been produced by commercial book publishers and some early participants in the Linux documentation project later published new books with commercial publishers.

Conclusion

Many of the problems with open source and CatB's interpretation of it are very complex. I sincerely hope that the limited review of these issues in this paper will stimulate a fruitful discussion and can be used by future researchers and open source developers.

The main problem with the Bazaar model is that it has no predictive power for OSS's strong and weak points. As such, it has a limited role in providing an important metaphor, a kind of mystical man-month measure of programmer productivity. I am strongly convinced that an academic model of OSS explains the phenomenon much better. The reasons for the popularity of Linux are much more complex than the "Bazaar development model" described in CatB.

At the same time, CatB is an important paper that stimulated a discussion of OSS as a social organism and the value of using Internet for software development. Although the value of CatB is somewhat limited by some of its liberties with facts and its moralizing tone, it very well describes important attributes that may be instrumental in ensuring the success of both the OSS and commercial software development process.

CatB encouraged a whole re-thinking of the best ways to develop software, especially the value of openness. The latter can be successfully used both by the open source community and traditional software producers to maximize leverage and increase the probability of success of software-related projects.

In conclusion, I would like to stress that, despite its weaknesses, CatB is very significant in opening discussions of OSS phenomenon. Later works on OSS have had the benefit of referring to a much wider range of materials, including CatB. Although some of the ideas expressed in CatB are wide of the mark and some important topics were simply omitted, it made an extremely important contribution by providing a framework of discussion of open source phenomena. In this role CatB should not be undervalued.

About the Author

Nikolai Bezroukov is a Senior Internet Security Analyst at BASF Corporation, Professor of Computer Science at Fairleigh Dickinson University (NJ) and Webmaster of www.softpanorama.org - Open Source Software University - a volunteer technical site for the United Nations SDNP program that helps with Internet connectivity and distributes Linux to developing countries. He authored one of the first classification system for computer viruses and an influential Russian language book on the subject -- Computer Virology in 1991 (see also the author current views on the subject). Right now is more interested in e-commerce security, Perl and so called Orthodox File Managers ( Midnight Commander, etc.).
E-mail: postmaster@softpanorama.org

Editorial history

Paper received 21 November 1999; revision received 22 November 1999; accepted for publication 22 November 1999; revision received 29 November 1999; revision received 6 December 1999; revision received 9 December 1999

Copyright © 1999, First Monday

A Second Look at the Cathedral and Bazaar by Nikolai Bezroukov
First Monday, volume 4, number 12 (December 1999),
URL: http://firstmonday.org/issues/issue4_12/bezroukov/index.html