First Monday

TOOL: The Open Opinion Layer by Hassan Masum

Shared opinions drive society: what we read, how we vote, and where we shop are all heavily influenced by the choices of others. However, the cost in time and money to systematically share opinions remains high, while the actual performance history of opinion generators is often not tracked.

This article explores the development of a distributed open opinion layer, which is given the generic name of TOOL. Similar to the evolution of network protocols as an underlying layer for many computational tasks, we suggest that TOOL has the potential to become a common substrate upon which many scientific, commercial, and social activities will be based.

Valuation decisions are ubiquitous in human interaction and thought itself. Incorporating information valuation into a computational layer will be as significant a step forward as our current communication and information retrieval layers.


Enabling Open Opinions: The Use and Context of TOOL
Algorithms and Architecture
Operational Behavior
Looking Forward: A Collaborative Decision Substrate
Conclusion: Decision Substrates and Distributed Democracy




Enabling Open Opinions: The Use and Context of TOOL

"Truth is one forever absolute, but opinion is truth filtered through the moods, the blood, the disposition of the spectator." - Wendell Phillips


When faced with information overload, how can we focus on the best and ignore the rest? Opinions are a crucial method for the distributed filtering and judging of information. For example, many users of find its book ratings, comments, and similar books features to be a convenient first pass for finding high-quality items.

However, opinion collections are largely in closed formats, making consolidation, search and analysis difficult. Creating and operating a searchable, authenticated opinion collection requires significant investment, which reduces the diversity and utility of readily-available opinions on public, interpersonal, and socio-environmental issues. And there is a lack of complementary analytical tools for the formation of beneficial idea-clusters, self-examination of our aggregate societal mind, and rewarding of generators of farsighted opinions.

Many potential benefits come with an extensible open opinion layer architecture. Such an open architecture could act as a common tool allowing users to pool their experience for each other's benefit, while retaining control over their own opinions. We'll call this generic architecture "TOOL: The Open Opinion Layer". In the remainder of the article, we give a high-level overview of the features, operational behavior, and potential benefits and pitfalls of TOOL; by abstracting these high-level features from specific underlying technologies, we gain clearer insight into what is possible and desirable.

Part of TOOL will be a suite of opinion combination and analysis tools. Collaborative filtering mechanisms use similarity and pattern-matching heuristics to share opinions among people of like interests; Sarwar et al. (2001) review current methods and suggest an item-based approach. Social network analysis is a closely related discipline looking at patterns of interpersonal communication. Both these areas identify clusters of people who are similar with respect to tastes, desires, or goals - an intrinsic human motivation that may well provide impetus for adoption by individual users. In order to learn from past mistakes, TOOL should also be extensible to include "tests against reality" for any domains which are not entirely matters of personal taste; analytical tracking of opinions against reality will identify good ideas and their generators. Collaborative filtering is a stepping stone, but pales in importance relative to "collaborative judging".
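
To make the item-based idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of Sarwar et al.: the ratings data, the function names, and the choice of plain cosine similarity are all hypothetical, and mean-centred similarity measures may behave better on real data.

```python
from math import sqrt

# Hypothetical ratings: rater -> {item: score on a 1-5 scale}.
RATINGS = {
    "ann": {"book_a": 5, "book_b": 4, "book_c": 1},
    "bob": {"book_a": 4, "book_b": 5},
    "cat": {"book_a": 1, "book_b": 2, "book_c": 5},
}

def item_vector(ratings, item):
    """Return {rater: score} for every rater who rated the item."""
    return {r: s[item] for r, s in ratings.items() if item in s}

def cosine_similarity(ratings, item_x, item_y):
    """Cosine similarity computed over raters who rated both items."""
    vx, vy = item_vector(ratings, item_x), item_vector(ratings, item_y)
    common = vx.keys() & vy.keys()
    if not common:
        return 0.0
    dot = sum(vx[r] * vy[r] for r in common)
    norm_x = sqrt(sum(vx[r] ** 2 for r in common))
    norm_y = sqrt(sum(vy[r] ** 2 for r in common))
    return dot / (norm_x * norm_y)

def predict(ratings, rater, target):
    """Similarity-weighted average of the rater's scores on other items."""
    weighted, total = 0.0, 0.0
    for item, score in ratings[rater].items():
        if item == target:
            continue
        sim = cosine_similarity(ratings, item, target)
        weighted += sim * score
        total += abs(sim)
    return weighted / total if total else None

estimate = predict(RATINGS, "bob", "book_c")
```

Because the prediction is built from similarities between items rather than between people, the item-to-item similarities could in principle be precomputed and shared by a server without exposing any single rater's full profile.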

An important benefit of TOOL would be the encouragement of cooperation and resultant enhancement of social capital. As any successful businessperson knows, dealing with reputable buyers and sellers is almost always desirable. Since any economic agent has an incentive to cheat if it can do so successfully, reputation-reporting methods which report such perfidy are crucial in providing a disincentive to counter the cheating impulse; see the Reputation Research Network. Evolutionary game theory provides a good testbed for demonstrating such theories. Gintis (2000) is an excellent problem-oriented overview of basic methods and models, while Nowak and Sigmund (1998) discuss a model in which reputations encourage fair behavior, even for those agents which never interact directly. Opening up access to knowledge generally promotes freedom and productivity - as is argued by Brin (1998), transparency and fairness tend to go hand in hand. Commerce and communities can only be successful where information exchange is common and trust is established (Jakobsson and Yung, 1998).

In the long run, TOOL could become a sort of "universal credit-allocation layer". Filtered opinions will guide us to the best investment and consumption opportunities, usually more efficiently and reliably than word-of-mouth (or even personal experience - think auto mechanics). Over time, some classes of opinions may assume part of the functionality of present-day money - most obviously in measuring worth, but also conceivably as a means of payment. Finally, the process of specifying and reconciling conflicting "ontologies of objects about which one could have an opinion" (and looking at the relative weights placed by others on their ontologies of common systems) could have a salutary effect on inter-group understanding - being able to compare how others classify and value objects of importance in the world should promote finding areas of commonality, isolating crucial differences, and generating acceptable compromises.


Why would busy people bother using TOOL? An initial incentive will be to help remember good and bad books, references, music, games, experiences, stores, products, and so on - a memory aid that would be shared with family, friends, and colleagues. As each person's network of opinions gets more detailed, the next useful step will be to get good recommendations and find other people with similar interests.

Beyond these basic uses, there will be a host of potential applications, some of which we outline below. Clearly not everyone will use TOOL for all these applications - social scientists will want to analyze the space of ideas, activists to find and shape clusters of opinions, researchers to use the weighted space of informed opinions to collaboratively seek truth, and investors to trade idea derivatives or hedge risk.

Finding Objects and Experiences

Extending the quality and number of experiences accessible to us reduces friction, drudgery, and pain. This applies to a wide range of experiences, from finding a niche artist who fits your tastes to bringing new research or investment opportunities to the attention of those who trust your judgement.

Generating and Authenticating Reputations

Reputations are crucial for the effective functioning of human society and commerce. They come in a variety of domains, from credit worthiness to word of mouth to accolades for achievement. Once basic material needs have been met, a good reputation in the right domain is often worth more than large amounts of money. The world of academic research provides an object lesson in the power of reputation as a driving force. The existence of trusted reputation mechanisms encourages better behavior in all areas covered by the reputation; extending the reach and ease-of-use of reputation mechanisms could increase incentives for ethical behavior.

Analysis and Decision-Making

Individual decisions are usually made with reference to the opinions of many experts and interested parties, whether through verbal communication, rankings, or market prices. In the world of judgement, no one is an island. Collective decisions commonly require a great deal of "information compression" between the parties involved - each of us has limited time and energy and can therefore only evaluate a relatively small amount of information from others, and hence summaries and statistical measures give us an approximate sense of what others think. In both the individual and collective cases, access to the opinions and analyses of others is crucial in making effective decisions.

The Active Noosphere

Once a fluid ocean of opinions exists, another significant portion of collective human thought will explicitly exist outside biological brains. The rise of extra-human information (as stored in databases and the Web) and algorithms (as instantiated in programs and real-time systems) has wrought profound changes; judgement networks will be just as pervasive a paradigm shift.

From a broader societal point of view, TOOL could provide a social substrate that would quantify and encourage the production of useful ideas, honest behavior, and healthy communities.

Many of the complex social problems we face involve both informational and motivational constraints; the generation of models which synthesize facts across many domains or do not yield economic benefit for long periods of time seems to be systematically underinvested in, and one causative factor is the difficulty of capitalizing on being part of such a model-generation effort. At an individual level, the venture-capital industry has created an efficient mechanism to reward entrepreneurs for the creation of good product and service ideas, but no such efficient reward mechanism exists for the creation of more intangible or dispersed ideas, especially when they possess the characteristics of public goods. Inefficient proxies do exist, like promoting your idea as the "Next Big Thing" and going out on the lecture circuit, but such proxies skew incentives and aren't necessarily the most efficient way of idea dispersal.

At a collective level, consider that one way of ranking the intelligence of organisms is with respect to their effectiveness in modeling the external world. Humans are the undisputed leaders in Earth's biosphere - but in order to proceed beyond the limited power of a single mind, world-models need to be shared and collaboratively built, through the creation of a vast public space of dynamic distilled judgements.

Rating Domains

It's easiest to envision applying TOOL in well-defined domains where an opinion is not path-dependent, i.e. where the value of the opinion is independent of historical context. For example, most personal e-mail wouldn't qualify, since the information conveyed depends too much on the user's history. These domains can be taxonomized by media type, interested population size, persistence, economic value, rating privacy required, legal sensitivity, information versus information source (e.g. particular news story versus news channel), and so on.

Reliable opinions are particularly useful for high-value goods where mistakes matter, especially those which are hard to evaluate (e.g. due to lack of reliable quality signalling mechanisms like price differentiation, as is the case with the narrow price range of recorded music). Along with the complementary split into experiential goods (which have a value highly dependent on the consumer) and objective goods (which usually possess a lower-variance value defined by a multiuser interaction or consensus), this suggests a four-quadrant division:


Subjective / Experiential
    Easy to Rate: "That's the Way I Like It" - You try it and like it (or not).
    Difficult to Rate: "Wisdom" - You need to live through the experience to understand its value.
Objective / Aggregate
    Easy to Rate: "Just the Facts" - Information that is easily transmissible and verified.
    Difficult to Rate: "Hidden Value" - Subtle knowledge and valuations of our common environment.


This division, while suggestive, is but one of many. Here is another, taking into account the differing characteristics of opinions with regard to privacy and persistence: Private opinions need more care with regard to security and redistribution than public ones; persistent opinions will usually involve substantially greater specification and evaluation effort than ephemeral ones. Note that the rating of people needs special care due to their possession of intentionality, privacy preferences, and legal rights.


Private Opinions:
    discussion group posts
    short social interactions (e.g. parties)
    polls & votes
    Web pages
    credit ratings
    Web sites
    peer-reviewed academic journals / conference articles (negative ratings)
    people's abilities / personality / skills
Public Opinions:
    magazine and newspaper articles
    radio, TV, news
    group effectiveness & discussion
    opinion leaders in times of flux
    real-time metrics (e.g. market mood)
    books, music, movies
    games (video, crossword, board)
    consumer goods / services (i.e. rated by Consumer Reports)
    paintings & other appraised items
    historical, political, and philosophical context



Algorithms and Architecture

"What I cannot build, I do not understand." - Richard Feynman

Architecture Overview

The generic architecture may be divided into three conceptual layers:


Layer: Issues Handled
Data - "How to Store It": Formats for storing raters and opinions, ontologies, and opinion domains.
Server - "How to Get It": Finding relevant opinions. Authentication, privacy (e.g. zero-knowledge opinion gathering); P2P protocols.
Analysis - "How to Use It": Deriving knowledge about products, raters, and combinations thereof. Combining opinions for decision-making.


From the user's point of view, we can specify some natural actions for the early implementations of TOOL. These are actions a user explicitly initiates. To be practically useful, such online actions must be fast and reliable; the main obstacles will be inter-server time and trust in the short term, and perhaps difficulty of semantic analysis in the long term. Once these actions are set up and operating robustly, more complex operations can be built up. Existing automated collaborative filtering and recommender systems offer many relevant mechanisms to adapt. Note that we use "Product" to mean any rated object, not just a product in the commercial sense.

Data Layer: Extensible Opinion Ontologies

Clearly, one of the main challenges will be coming up with an extensible ontology that describes both the items being rated and the actual rating. This will get progressively more difficult as the idea of a "rating" is generalized from a scalar value to include multiple rating dimensions, conditional statements, time-varying opinions, and so on. A similar generalization of a generic data structure took place when the idea of a "document" was successively generalized from plain text to fancy writing to hyperlinked documents to "active documents". As a practical first step, data could be stored using some XML-based rating schema (see RDF and its applicability to metadata, ontologies and ratings at W3C's Semantic Web project).

One possible partition of objects is into Products and Opinions. Producers make Products, Raters use Products and make Opinions, and Users use Opinions. Again, a Product can be any created object, not just a commercial product. To define appropriate Product and Rater record formats, one should look at bibliographic records for field-tested examples and practical advice. The Dublin Core is a metadata initiative which has looked in detail at resource specification syntax. Information retrieval, representation, and database technologies are also relevant.

Let's look at one possible example of what could be in Product and Record fields, to make the concepts concrete:
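
The kind of fields such records might hold can be pictured with a small Python sketch. Every class and field name below is a hypothetical illustration, not the article's schema; a real format would more likely be expressed in an XML or RDF vocabulary and borrow from bibliographic standards such as the Dublin Core.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass(frozen=True)
class Product:
    """Any rated object, not just a commercial product (hypothetical fields)."""
    product_id: str
    title: str
    creator: str
    domain: str  # e.g. "books", "music", "travel"

@dataclass(frozen=True)
class Rater:
    """Whoever issues Opinions; may be a real name or a nym (hypothetical fields)."""
    rater_id: str
    display_name: str

@dataclass
class Opinion:
    """One rater's scalar rating of one product, with its rating scale."""
    rater_id: str
    product_id: str
    score: float
    scale: tuple = (1.0, 5.0)  # (lowest, highest) allowed score
    comment: str = ""
    created: date = field(default_factory=date.today)

    def normalized(self):
        """Map the score onto [0, 1] so different scales can be compared."""
        low, high = self.scale
        return (self.score - low) / (high - low)

book = Product("p1", "An Example Book", "A. Author", "books")
reviewer = Rater("r1", "nym42")
opinion = Opinion(reviewer.rater_id, book.product_id, 4.0, comment="Worth reading.")
```

Storing the scale alongside the score is one simple way to keep scalar ratings comparable across servers; the richer ratings discussed above (multiple dimensions, conditional or time-varying opinions) would need a more elaborate record.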



Server Layer: Storing and Finding Opinions

Server Configurations

Fully-Centralized: All opinions on one server. While there would be database replication and synchronization issues if usage is large, these are relatively well understood. Large servers will use distributed back-end algorithmic processing (e.g. specialized servers for data-mining algorithms).

Community-Centralized: Server granularity is by user community. This could contain a mixture of any of the other types of Server setups. Coherent communities of interest or mutual usefulness have strong user bases, often motivated by non-monetary considerations; however, their biases may create trust and information reliability problems for inter-community linking. It's interesting to speculate on which domains would work well with most Opinion communication being intra-community (e.g. geographical and specialist communities), versus domains which would work best with more inter-community communication (e.g. consumer items, experiences for which creating a useful Opinion needs no special skill, and cases where broad coverage to capture rare events or statistical trends is important). Clearly effective implementation of the latter type of domain will need efficient distributed indexing, and perhaps distributed "statistics propagation" (i.e. sending out summary statistics on Opinions for different domains across the network at frequent intervals).

Domain-Centralized: Server granularity is by product domain (e.g. books, music, travel); this would allow easy networking of the preexisting opinion servers that exist without necessarily threatening their business model. As with existing opinion servers, a domain-centralized server could be used as an audience draw for advertising, portal, and other such revenue-generating purposes. Domain servers could be found through a centralized ontology server.

Product-Centralized: Server granularity is by particular product. While this would in theory work similarly to the domain-centralized case, in practice there would be a large moral hazard issue; most servers that focus on a particular product would probably be strongly for or against the product, with product sellers and satisfied owners being in the former category and product competitors and dissatisfied owners being in the latter category. Such moral hazard would call the integrity of the server into question (and if cryptographic protocols were robust enough to prevent tampering and allow open access, the incentive for setting up a product-centralized server might be minimal).

Rater-Centralized: Server granularity is by particular rater. This would be a small-scale, massively-distributed case, with each rater's Opinions stored on e.g. their machine or a Web page. An "opinion editor" could be used to fill out a generic rating schema, or a more domain-specific schema retrieved from an appropriate server or other rater. Servers storing information from many raters would store only links to Opinions at the rater's site (with appropriate caching possible). This configuration would allow operation by ordinary distributed users, with no expensive central server or bandwidth required.
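
The "statistics propagation" mentioned for the community-centralized case can be sketched as follows. This is a minimal illustration, assuming each server summarizes a domain by a count and a running total of scores; the class name and fields are hypothetical, and a real protocol would also need authentication and protection against double counting.

```python
from dataclasses import dataclass

@dataclass
class DomainSummary:
    """Mergeable summary of the scores one server holds for one domain."""
    count: int = 0
    total: float = 0.0

    def add(self, score):
        """Record one more Opinion score locally."""
        self.count += 1
        self.total += score

    def merge(self, other):
        """Combine two summaries without exchanging the underlying Opinions."""
        return DomainSummary(self.count + other.count, self.total + other.total)

    @property
    def mean(self):
        return self.total / self.count if self.count else None

# Two hypothetical community servers summarize their own Opinions on "books".
server_a, server_b = DomainSummary(), DomainSummary()
for score in (4, 5):
    server_a.add(score)
for score in (1, 2, 3):
    server_b.add(score)

network_view = server_a.merge(server_b)
```

Only the small summary travels across the network, so communities can exchange aggregate views at frequent intervals while individual Opinions stay on their home servers.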

Server Algorithms and Protocols

Finding relevant opinions in a large distributed ocean of opinions is similar to finding relevant Web pages in a large distributed ocean of Web pages. The latter topic has received intense research effort since the widespread adoption of the Web, building on previous work in information retrieval. Belew (2000) surveys key concepts from the point of view of "relevance-finding", which is a central issue in the creative use of information.

Unusual or unpopular opinions can be drowned out, either deliberately or through tyranny of the majority. As in evolutionary computation, a niche strategy may be appropriate to combat this; one could weight unusual opinions more highly, if one were so inclined. As opposed to drowning out opinions, one could actively encourage opinions on a topic via an Opinions Wanted function: users would specify an item and any special Opinion criteria or reward, to encourage third-party judging. Examples of relevant situations include getting an appraisal, or awarding a prize or grant for the best writing on a given topic.

Ontologies will be necessary to effectively navigate the space of opinions. One could specify a portion of the ontology and a desired level of detail, and search for a close match within sufficiently highly-ranked Opinions. The web of links implicitly generated by TOOL usage behavior and interlinking of Opinions will be an interesting derived data structure - one could analyze graph-theoretic properties of this structure, use spreading activation models to implicitly define domains of relevancy on the fly, and so on.

Opinions are important. However, just as in the real world, rankings should also where possible make use of objective (open, authenticated, fully-specified) tests and measurements; the final ranking would then be a composite of objective data points and subjective opinions. Nyms and reputation markets imply persistent anonymous reputations: see Stiegler (1999), and "True Names" in Vinge and Frenkel (2001). Stiegler postulates a world where markets are widely and explicitly used as decision-making tools; more sophisticated and tunable markets would be needed for analyzing information, with e.g. targeted transaction taxes, escrows, or minimum aggregate Opinion for certain functions.

Many people will want to store networks of friends, or an index to their Opinions. Friends could themselves have varying reliability levels in different domains, although these ratings of friends would presumably usually be private. Cascading through friends allows us to make use of friends of friends, and so on, in what has been termed a "Web of Trust" - a necessity since friend ratings will be sparse in the overall ontology of opinion space.
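
Cascading trust through friends of friends might be sketched as below. The multiplicative discounting rule, the hop limit, and the example graph are assumptions made for illustration; the article does not prescribe a particular propagation formula.

```python
# Hypothetical private trust ratings in one domain: rater -> {friend: trust in [0, 1]}.
TRUST = {
    "me": {"alice": 0.9, "bob": 0.5},
    "alice": {"carol": 0.8},
    "bob": {"carol": 0.9, "dave": 0.6},
}

def derived_trust(trust, source, max_hops=3):
    """Best product of trust values along any chain of at most max_hops links."""
    best = {source: 1.0}
    frontier = {source: 1.0}  # best trust over chains of exactly k links
    for _ in range(max_hops):
        next_frontier = {}
        for node, node_trust in frontier.items():
            for friend, weight in trust.get(node, {}).items():
                candidate = node_trust * weight
                if candidate > next_frontier.get(friend, 0.0):
                    next_frontier[friend] = candidate
        for friend, candidate in next_frontier.items():
            if candidate > best.get(friend, 0.0):
                best[friend] = candidate
        frontier = next_frontier
    best.pop(source)
    return best

reachable = derived_trust(TRUST, "me")
```

In this toy graph "carol" is reached through both "alice" and "bob", and the stronger chain wins; raters with no chain from the source simply receive no derived trust.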

Distributed Server Issues

Although a centralized server is easier to implement, distributed servers have the potential to be more robust and incrementally implementable. However, there are replication, trust, and synchronization issues. For example, duplicate servers for a domain will require synchronization and consolidation protocols, and rogue servers at the domain, product, or rater level could try many different attacks to skew Opinions.

Distributed database algorithms will be borrowed from algorithms for indexing large-scale databases and the Web. Data structures can be derived from the basic Opinion-level records, for searching and indexing across servers; as a simple example, each server could maintain separate indexes of all raters and products on that server, which could in turn be consolidated into larger indexes if most servers are reasonably stable with respect to their content.
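
The simple example of per-server indexes consolidated into a larger index could look like the following sketch. The server names and index layout are hypothetical, and the sketch ignores the replication and trust issues raised above.

```python
from collections import defaultdict

# Hypothetical per-server indexes of the raters and products each server holds.
SERVER_INDEXES = {
    "server_a": {"raters": {"ann", "bob"}, "products": {"book_a"}},
    "server_b": {"raters": {"bob"}, "products": {"book_a", "album_x"}},
}

def consolidate(indexes, kind):
    """Map each rater or product to the sorted list of servers that hold it."""
    merged = defaultdict(set)
    for server, index in indexes.items():
        for key in index[kind]:
            merged[key].add(server)
    return {key: sorted(servers) for key, servers in merged.items()}

product_index = consolidate(SERVER_INDEXES, "products")
rater_index = consolidate(SERVER_INDEXES, "raters")
```

A query for a product then starts from the consolidated index and contacts only the servers listed for it, which pays off mainly when server contents are stable enough that the larger index rarely goes stale.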

Ideally, TOOL would be implemented on a combination of small and large servers, with many users locally storing their opinions as well as mirrors of opinions of their friends and acquaintances. P2P protocols may be useful, as they must consider many of the same problems as TOOL, such as relatively seamless distribution over multiple servers, malignant nodes, and resource utility ranking. OpenCola makes a start in considering self-adaptive mechanisms such as automatic distributed caching of popular opinions.

Analysis Layer: Patterns of Thought and Experience

These are actions that infer patterns or statistical information from multiple Opinions. They could run autonomously on servers, or be explicitly invoked by users. Relevant fields to take algorithms from include pattern recognition (Theodoridis and Koutroumbas, 1999), statistical analysis, competitive intelligence, and market research.

As discussed in Watts (1999), many "Small-World" connection graphs with relatively sparse link density have the interesting property that the median path length between a random pair of nodes grows only logarithmically with the total number of nodes. One characteristic feature of such small-world graphs is the presence of inter-component linker nodes: these are nodes that have links to two or more components that would otherwise have no short connecting path. In the TOOL context, Raters or Products that act as such linker nodes would be of particular interest as pathways between genres or disciplines.
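
One rough way to look for such linker nodes is sketched below. It uses a stricter criterion than the one described above: a node counts as a linker only if removing it disconnects the graph entirely, rather than merely lengthening paths. The graph and function names are hypothetical, and the brute-force search is meant for small examples only.

```python
# Hypothetical undirected graph of Raters: node -> set of neighbours.
GRAPH = {
    "a": {"b", "c", "x"},
    "b": {"a", "c"},
    "c": {"a", "b", "x"},
    "x": {"a", "c", "d", "e"},
    "d": {"x", "e", "f"},
    "e": {"x", "d", "f"},
    "f": {"d", "e"},
}

def count_components(graph, removed=None):
    """Count connected components, treating `removed` as deleted."""
    seen, count = set(), 0
    for start in graph:
        if start == removed or start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour != removed and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return count

def linker_nodes(graph):
    """Nodes whose removal splits the graph into more components."""
    base = count_components(graph)
    return sorted(n for n in graph if count_components(graph, removed=n) > base)

linkers = linker_nodes(GRAPH)
```

Here "x" is the only node joining the cluster {a, b, c} to the cluster {d, e, f}; a softer variant could instead flag nodes whose removal sharply increases path lengths between clusters.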

Watching Users as they operate in a domain gives implicit Opinions about what they find worthwhile. The next step is looking at what Users correlatively use or Raters correlatively rate. At a more sophisticated level, one would look at the directional paths users take, e.g. in learning; shortcuts could be added from the start to finish of such paths, given data on final satisfaction level of a User with some known initial seeking state. Related algorithms used in recommender systems include component generation, correlation analysis, and clustering. For example, Cohen and Fan (1999) discuss using a Web spider to collect semantically related lists for input to CF algorithms; such a method could be adapted to autonomously gather semantically related Opinion lists for recommendation or clustering use by a Server or User.

Performance of a particular Opinion Source can be monitored over time, given either an explicit trusted source of "correct" Opinions after the fact or an implicit source of such Opinions from "tests against reality". This will be a key algorithm to find reliable sources of news and fruitful new ideas. If additional Opinion Source properties are available, then machine learning algorithms can find common properties of good Sources.
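
Monitoring Opinion Sources against later outcomes could be sketched as follows. The error measure (mean absolute error), the weighting formula, and the data are all assumptions chosen for simplicity; the article leaves the scoring rule open.

```python
# Hypothetical forecasts: source -> {question: predicted value in [0, 1]}.
PREDICTIONS = {
    "oracle": {"q1": 0.9, "q2": 0.2},
    "pundit": {"q1": 0.3, "q2": 0.8, "q3": 0.5},
}
# Outcomes revealed later by a trusted source or a "test against reality".
OUTCOMES = {"q1": 1.0, "q2": 0.0}

def track_record(predictions, outcomes):
    """Mean absolute error per source, over questions with a known outcome."""
    errors = {}
    for source, forecasts in predictions.items():
        diffs = [abs(value - outcomes[q])
                 for q, value in forecasts.items() if q in outcomes]
        if diffs:
            errors[source] = sum(diffs) / len(diffs)
    return errors

def reliability_weights(errors):
    """Turn errors into weights in (0, 1]; a lower error gives a higher weight."""
    return {source: 1.0 / (1.0 + error) for source, error in errors.items()}

errors = track_record(PREDICTIONS, OUTCOMES)
weights = reliability_weights(errors)
```

The resulting weights could feed back into how much attention each Source's future Opinions receive; questions without a known outcome yet (such as "q3" here) are simply left out of the score.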

Where precisely the boundaries of a "story" lie may be nebulous and changing, but we nevertheless conceptually organize events into stories and threads. The same will be true of Opinions, e.g. the collection of Opinions on a particular company's product quality, service levels, internal work environment, ethical practices and so on likely form a "story" about that company (which would be very valuable in both commercial and social terms). Communities of interest often arise spontaneously and naturally, self-identify, and then self-organize. However, many stories may be diffuse enough that it will be difficult to spot them initially; other stories are complex enough that it would be useful to have tools that automatically track related information on a particular topic. Research is ongoing in computational methods of topic tracking.


Operational Behavior

"The greatest enemy of truth is very often not the lie - deliberate, contrived, and dishonest - but the myth - persistent, pervasive, and unrealistic." - John F. Kennedy

Dealing with Privacy, Ignorance, and Malice


Users may not wish their preferences or rankings to be public knowledge; Huberman et al. (1999) show how cryptographic protocols can be adapted to allow peer-to-peer authenticated and anonymous opinion analysis. There should be a spectrum of anonymity available to users, where less anonymity will generally coincide with greater persistence and believability: public real name, public nym (i.e. authenticated alias), private (server-only) nym, per-session nym (e.g. polling). As indicated previously, it will clearly be essential to deal with user, data, and server authentication, using cryptographic and security protocols.

Yenta is a prototype system which implements cryptographically-secure matchmaking for various group activities, with motivation and implementation described in Foner (1999). As an implementation of a decentralized and secure opinion matching protocol, it's a good testbed to learn from. One practical problem with it seems to be the usual one of cold start and critical mass; TOOL may be able to bypass this by providing immediate value as a secure and convenient store for personal opinions (in the Rater-centralized server model).

Garfinkel (2000) discusses general privacy issues from our growing "data shadows", showing how improving surveillance and database technology could eventually make all public actions and behavior patterns of any individual an open book, at least to those willing to pay a price for the information. Thinking about TOOL in this larger context, it's clear that stored information about e.g. consumer behavior would give implicit information about a user's preferences in material domains, while credit-repayment behavior would be one "opinion about a person" which many might wish to consider. One could easily conceive of autonomous algorithms that scan through newsgroups for postings on particular topics, or that do analysis of video streams from public places. It is important to resolve just what kind of information is fair game to use in forming an opinion about a person, and in propagating that opinion to others.


The bestseller effect swamps out niche interests and opinions by the weight of the majority. A solution is to cluster raters by sufficiently-discriminating properties. Cluster analysis and marketing provide algorithms for doing this under various model and data assumptions. Note that in many domains users have highly nonlinear sophistication and tastes. As coherent subcultures can more easily find unusual products to invest in, producer incentives will change to encourage niche products and discourage least-common-denominator products.
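
As a small illustration of why clustering helps, the sketch below reports averages per rater cluster instead of one global average. It assumes the clusters have already been found by some cluster-analysis method; the cluster labels, data, and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical Opinions as (rater, product, score) and a precomputed clustering.
OPINIONS = [
    ("ann", "jazz_album", 5), ("bob", "jazz_album", 5),
    ("cat", "jazz_album", 1), ("dan", "jazz_album", 2), ("eve", "jazz_album", 1),
]
CLUSTER_OF = {"ann": "jazz_fans", "bob": "jazz_fans",
              "cat": "pop_fans", "dan": "pop_fans", "eve": "pop_fans"}

def cluster_means(opinions, cluster_of):
    """Average score per (cluster, product) rather than across all raters."""
    buckets = defaultdict(list)
    for rater, product, score in opinions:
        buckets[(cluster_of[rater], product)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}

by_cluster = cluster_means(OPINIONS, CLUSTER_OF)
overall = mean(score for _, _, score in OPINIONS)
```

The overall average hides the fact that one cluster rates the album very highly; a user who belongs to that cluster is better served by the cluster-level figure.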

Mass delusion and herd effects are well-documented, e.g. stock market bubbles; see Galbraith (1990) for a concise history. One potential solution is to track raters' opinions and compare them with future results, and pay more attention to raters with good track records, as we discuss at greater length later. In a similar vein, transparent quality testing and other compare-with-reality tests can help keep opinions grounded. Finally, experience may demonstrate the value of maintaining a minimum level of "opinion diversity" on important issues, with rationale similar to that for maintaining biodiversity. These solutions rely on people making the effort to take appropriate precautions, although having a higher skepticism level could be a default opt-out option.


An obvious attack is to set up pseudonyms or pay real names to give bogus bad Opinions. Humans seem to have a cognitive bias to believe bad news, hence the effectiveness of slander and scandal. Solutions: giving more weight to persistent and highly-ranked nyms, sharing blacklists from trusted sources, reducing ratings for later-disproved allegations. Reputation, monetary or identity requirements would discourage frivolous bad ratings and still allow whistleblowers, although defining "frivolous" can be problematic (e.g. if a product just rubs someone the wrong way).

Opinion manipulators could replace some opinions with random or targeted ones, or corrupt an entire server. Solution: cryptographically signed and verified opinions, server blacklists. Similarly, opinion hushers might wish to delete occurrences of certain opinions, and perform other similar attacks; solutions could include information distribution and anonymization protocols and counterattacks.
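
Tamper detection for stored opinions can be illustrated with Python's standard library. Note the substitution: the sketch uses a keyed hash (HMAC) with a shared secret as a stand-in, whereas an open deployment would need public-key signatures so that anyone can verify an opinion without being able to forge one. The record layout and function names are hypothetical.

```python
import hashlib
import hmac
import json

def sign_opinion(opinion, key):
    """Return a hex tag over a canonical JSON encoding of the opinion."""
    payload = json.dumps(opinion, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_opinion(opinion, tag, key):
    """Check the tag in constant time; False if the opinion was altered."""
    return hmac.compare_digest(sign_opinion(opinion, key), tag)

# Hypothetical shared secret and opinion record.
KEY = b"demo-shared-secret"
original = {"rater": "nym42", "product": "book_a", "score": 4}
tag = sign_opinion(original, KEY)

tampered = {**original, "score": 1}
```

A server that silently rewrites the score produces a record whose tag no longer verifies, which is the property that signed and verified opinions are meant to provide.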

In general, designers and implementors of TOOL must consider common attacks and defenses in opinion manipulation from advertising and marketing, politicians (spin doctors, dirty tricks), intelligence agencies (very dirty tricks), and other media masters. Examples: avoiding scandalous allegations that are difficult to disprove, guarding against manufactured consent.

Private Incentives, Public Goods, and Legal Issues

Ideally, the TOOL protocol would overcome search, trust, and information consolidation issues sufficiently to be implementable in a highly distributed manner, while also interoperating seamlessly with large centralized servers for appropriate domains. Non-profit and research organizations will be willing to invest in such large servers if they receive sufficient informational or altruistic benefits, in keeping with their particular goals. If many TOOL users wish to maintain control over their opinions or set up rating domains on the fly, and the incentive for large organizations to subsidize trustworthy TOOL servers is low enough, a robust peer-to-peer architecture will be a natural choice for the underlying protocol.

High-quality and niche businesses may find it advantageous to underwrite TOOL, since having easy access to observations on one's product should favor better product makers; this act of underwriting could itself increase the perceived reputation of a company. The economics literature has devoted substantial effort to this idea of "signalling", as reviewed in Riley (2001). If the integrity and efficacy of opinions hosted on TOOL becomes commonplace knowledge, one could therefore envisage a substantial fraction of advertising budgets being redirected to product testing and opinion-generation incentives.

Direct revenue could be generated through TOOL in many ways:

  1. Sales of customized services and organization-specific data formats and analysis heuristics.
  2. Advertising and referral fees (although neutrality and sponsor independence will be an issue).
  3. Open-source economic models as discussed in Shapiro and Varian (1998); for example, subsidize a TOOL server for a particular domain and sell services related to that domain.
  4. Subscription to large servers for extra service such as better searches (or access to excludable opinions) - but the basic client should remain free to encourage more users and hence a better aggregate opinion pool.
  5. Opinion marketing is already a lucrative business and will only become more so, but due to privacy issues most users will prefer to opt in to allow their data to be shared; users who do not trust servers to maintain their privacy will likely give up some functionality for cryptographically secure opinion sharing.

TOOL can be viewed as infrastructure: just as datacom is a "bit transfer layer", this is a higher-level "opinion transfer layer". Business models for other infrastructure companies may therefore be relevant. Once the worth of the concept has been proven, economic models for operating public goods could become appropriate, since societies may find it a worthwhile investment to have a ubiquitous and relatively incentive-neutral opinion layer. There is an interesting parallel to the world of open-source software; as discussed in Raymond (1999), open-source culture is largely run on reputation incentives. It seems plausible that open-opinion societies and open-source software also share an important benefit; the more eyeballs there are looking for bugs or subpar performance, the more such bugs will be found and fixed.

Once TOOL became sufficiently widespread and trusted, information sources, aggregators, and distributors such as authors and publishers would likely contribute reviews and synopses as a matter of course, as this information is a form of advertising. Opinions and reviews by other interested parties would be provided with a variety of motivations. This "opinion and reviews" layer could be an additional optional layer on top of regular access to information, through Web sites, libraries, and other interfaces. Providing an opinion layer that is non-proprietary and interconnectible would be tremendously valuable for end users, creating a great deal of mindshare for the layer creators. It could quickly gain market share over proprietary and locked-in alternatives, from network effects and the advantage of opinion portability across multiple domains.

The widespread implementation and use of TOOL raises many interesting legal issues: libel, IP infringement, discrimination based on negative Opinions, illegal action (e.g. physical threats or destruction) based on strong Opinions of influential Raters, fraud (based on false positive Opinions), and the legality of future click-wrap licenses which forbid propagating opinions on the product. We point the interested reader to Lessig (2001) for a cogent discussion which includes some of these issues.

Other possible negative uses of TOOL: cohesion in unethical groups could improve their efficiency; global tar-and-feathering (with getting a second chance through starting over in an obscure location difficult); self-reinforcement of stereotypes (and hence possible polarization of opinion nodes into a few disconnected components); targeting of people or groups who have opinions that are unpopular with or dangerous to other groups; potential increase of velocity of opinion shifts (with consequences analogous to hot money flows in international economies).

Models and Mathematics

Schafer et al. (1999) suggest a taxonomy for recommender systems, with the two dimensions being i) whether recommendations are implicit (done automatically by the system through user observation) or explicit (requiring user input to seek out recommendations), and ii) whether recommendations are ephemeral (independent of the history of a particular user) or persistent (building up a user profile over time). TOOL could be used in any of these four modes, with explicit and persistent recommendations obviously requiring more user effort and privacy precautions. Another relevant dimension is whether TOOL will be built, supported, and paid for primarily by opinion generators or opinion consumers; the former group of "influencers" would have some motivation to skew the overall opinion pool, while the latter group of "truth-seekers" would like an unbiased and open opinion pool with good search and analysis algorithms.

The "Small World" effect is well-known, and illustrated by many social and natural systems where path lengths between most pairs of nodes in relatively sparse graphs are surprisingly small. A well-known example is the "six degrees of separation" effect, where most people in a country are connected through a chain of acquaintances with an average path length of six. This effect is studied and quantified in Watts (1999) and subsequent papers. In the reputational context, there is fertile ground for research here: endorsement by authority figures or opinion leaders (which could be tied to gossip, social network, and opinion coalescence research); graph structure of coherent subcultures; number and quality of links between disparate domains (Watts points these out as being key statistics of many small-world graphs). There is related work being done in the structure of the Internet (which, as a relational network of ideas, implicitly teaches us about human idea and opinion space via its structure).
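
To make the small-world effect concrete, the following sketch compares the average path length of a sparse ring lattice with that of the same lattice after a handful of random shortcut edges are added. This is a simplified construction chosen for illustration (shortcuts are added rather than rewired, so the graph stays connected) and is not necessarily the one analyzed in Watts (1999); the graph size, neighbor count, and shortcut count are arbitrary assumptions.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbors on either side of a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj, count, rng):
    """Add `count` new random edges between nodes that are not yet neighbors."""
    n = len(adj)
    added = 0
    while added < count:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs, via BFS."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


ring = ring_lattice(200, 2)
ring_avg = average_path_length(ring)

small_world = ring_lattice(200, 2)
add_shortcuts(small_world, 20, random.Random(0))
small_world_avg = average_path_length(small_world)

print(ring_avg, small_world_avg)
```

Only 20 edges are added to a graph that already has 400, yet every shortcut joins two nodes that were previously at least two hops apart, so the average path length can only fall.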

The speed of consolidation of "sources we listen to" is a very relevant variable; since we are time and memory bounded creatures, we each necessarily have relatively few sources of Opinions that we can directly monitor. Switching costs, friend and referral networks, and human cognitive and leader-generating biases will probably lead to a rapid decrease in the number of popular information sources in any model that includes them as factors. One relevant metric is the accuracy of such reputation sources versus their popularity; other questions include how much consolidation happens, what the effect is of greater variance and differing product coverage, and so on. Ogus et al. (1999) give a simple model showing rapid concentration of the Web page market to relatively few sellers, assuming some brand loyalty and sharing of opinions among consumers.

A simple mathematical model of opinion source consolidation uses a non-stationary Markovian setup, with the probability of going to a particular source varying proportionally to that source's current popularity. Page et al. (1998) discuss the PageRank system for inferring global page quality from the observed link structure of a set of documents; opinion source consolidation might be viewed as a time-evolving extension of such models. Given that you are at a particular source, another vector stores the relative probabilities of going to each other opinion source; this vector represents commonalities or antipathies (i.e. link strengths) between opinion sources. Each source has a scalar of intrinsic quality, or for greater generality a vector with each dimension corresponding to a different type of utility; in the latter case, one could experiment with different utility vectors on the part of the listening agents. So summarizing and formalizing, with OS_1 to OS_N being the different opinion sources, one could define the state transition matrix as follows:

  1. Stay with OS_i with probability stayFn(myUtilFn(Quality(OS_i))), where Quality can be a scalar or vector function and myUtilFn tells how my satisfaction level changes with perceived quality levels.
  2. Otherwise, change from OS_i to OS_j with probability moveFn(popularity_j * linkStrength_ij). The popularity and linkStrength measures could as a simple case be endogenously derived, from the current proportion of agents listening to OS_j and the past history of inter-source movements respectively.
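
The two transition rules above can be sketched as a small agent simulation. Several choices here are assumptions made only for illustration: quality is a scalar, myUtilFn is the identity, stayFn clamps its argument to [0, 1], moveFn normalizes popularity times link strength over the other sources, and link strengths are held fixed rather than derived from past movements. Popularity is recomputed at every step from the current share of agents, which is what makes the chain non-stationary.

```python
import random


def transition_row(i, quality, popularity, link_strength,
                   stay_fn=lambda u: max(0.0, min(1.0, u)),
                   my_util_fn=lambda q: q):
    """Probabilities of an agent at source i being at each source next step."""
    n = len(quality)
    p_stay = stay_fn(my_util_fn(quality[i]))
    # Rule 2: weight of moving to j is popularity_j * linkStrength_ij (j != i).
    weights = [0.0 if j == i else popularity[j] * link_strength[i][j]
               for j in range(n)]
    total = sum(weights)
    row = [0.0] * n
    if total == 0.0:
        # No other source has any pull, so the agent stays put.
        row[i] = 1.0
        return row
    for j in range(n):
        row[j] = (1.0 - p_stay) * weights[j] / total
    row[i] = p_stay  # Rule 1: stay with probability stayFn(myUtilFn(quality)).
    return row


def simulate(quality, link_strength, n_agents, n_steps, seed=0):
    """Return the final share of agents at each source."""
    rng = random.Random(seed)
    n = len(quality)
    location = [rng.randrange(n) for _ in range(n_agents)]
    for _ in range(n_steps):
        counts = [0] * n
        for source in location:
            counts[source] += 1
        popularity = [c / n_agents for c in counts]
        rows = [transition_row(i, quality, popularity, link_strength)
                for i in range(n)]
        location = [rng.choices(range(n), weights=rows[s])[0] for s in location]
    counts = [0] * n
    for source in location:
        counts[source] += 1
    return [c / n_agents for c in counts]


quality = [0.5] * 5                      # equal intrinsic quality
links = [[1.0] * 5 for _ in range(5)]    # uniform link strengths
example_row = transition_row(0, quality, [0.2] * 5, links)
shares = simulate(quality, links, n_agents=200, n_steps=50)
print(example_row)
print(shares)
```

Varying the quality vector or the link strengths, and tracking a concentration measure such as the share of the most popular source over time, would be one way to explore how quickly consolidation sets in under different assumptions.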

Note that this whole model could be applied to two different scenarios: opinion sources as external from agents, or opinion sources as the agents themselves. More sophisticated probabilistic methods such as diffusion and percolation theory may prove useful, particularly in combination with microaxiomatic simulation methods as discussed in Thompson (2000). There has also been related research done into the consolidation of Web traffic to relatively few sources, which shows that natural assumptions lead to a highly asymmetric traffic distribution with the top few sites capturing most of the traffic. Users often find it easier to filter by staying with well-known reputable sites, i.e. filtering by source URL reputation.

Reputation Agencies, Trust, Attention Economics, and Testbeds

In Klein (1997), reputation agencies are explored with respect to their economic and social function. Merry points out the role of gossip and scandal in social control, especially in bounded social systems where interdependence and ostracism costs are higher. Newman discusses the role of Dun & Bradstreet in providing necessary creditworthiness information to potential trade partners. It's instructive to note that credit rating agencies, which fulfil a similar function, have well-publicized data inaccuracies that adversely affect many retail customers. Brearly tells of the genesis and growth of Underwriters' Laboratories, a well-known safety inspection and certification agency initially funded by the insurance industry; once their certification label became known and trusted, consumers were willing to pay a premium for their products, and high-quality producers were therefore willing to pay for certification. Finally, Klein discusses "knower organizations" in general, and divides them along two axes: whether the knower is remunerated primarily by trusters (buyers) or promisors (sellers), and whether the knower provides opinion generation, conveyance, or both.

Francis Fukuyama (1995) talks about high-trust and low-trust societies. His general argument is that high-trust societies develop flexible and distributed organizations relatively easily, which gives them both economic and social advantages. In low-trust societies, trust is limited to family and close acquaintances, with negative consequences for private enterprise and the public good. His later book (Fukuyama, 1999) searches for the origins of norms and social order, and taxonomizes social norms in a four-quadrant system divided by rational-arational and spontaneous-hierarchical axes. An important point made is that, while spontaneous (or bottom-up) social norms are empirically observed to develop more often than an individualistic utility-maximizing framework would suggest, there are many factors which affect the spontaneous development of cooperative social protocols: reputation monitoring and enforcement (especially for larger communities), clear group membership indicators, costs to enter and exit groups, repeated interaction, common culture, asymmetric power distribution, lack of transparency, and path dependence. A related work discussing the decline and revival of community in the North American context is Putnam (2000).

Attention economics analyzes the consequences of treating human attention as a scarce resource, as discussed qualitatively in Goldhaber (1997). Note that most work in the area to date is on "attention microeconomics", dealing with the choices faced by a single spender of attention; "attention macroeconomics" would deal with aggregate information flows in society as a whole, and is a very interesting discipline waiting to be born. A simple principle is that more information per unit time means less time per information category. We must of necessity cope by filtering strategies; two common ones include relying on the synopses and filters of others, and choosing the learning method with the least cognitive load. Such filtering strategies in turn set up selection pressure on information producers to match the biases of information consumers, e.g. by emphasizing style over substance and shouting louder and louder to be heard at all; an "arms race" may result where producers use increasing resources to gain the attention of consumers. TOOL may provide an alternative environment where producer resources are naturally rechanneled to achieving performance metrics, and consumers have more efficient idea filters and attractors.

Advertising and quantitative market research deal with measuring and manipulating opinions. As such, they contain valuable lessons for any domain in which opinions are an important currency. However, there may be a qualitative difference between typical marketing domains and TOOL, due to the distributed nature of the opinions being measured, the ability to selectively weight opinions based on past utility of the sources, and the existence in some domains of objective and persistent "tests against reality" which track performance of opinion sources over substantial periods of time.

Games are one area in which many rating statistics are routinely generated and used; Smith (1993) surveys mathematics and practicalities of game rating systems derived from pairwise comparisons. A comprehensive rating profile could be used for partner and team generation, as well as to suggest areas of strength and weakness. In fact, if we drop privacy concerns for a moment and imagine combining ratings on all aspects of personality, the resulting "continuously-updated report card" could be very useful as a self-development / compatibility testing / mental health / aptitude testing / career filtering tool; this suggests using such ratings for individualized education and counseling. In Masum et al. (2002) we propose a framework - the "Turing Ratio" - that could potentially use pairwise comparisons and other competitions to derive a metric for open-ended task performance; the idea is to extend the original Turing Test to a more graduated scale for near-human or superhuman performance levels, on well-defined yet open-ended tasks such as strategic games.
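
As an illustration of a rating derived from pairwise comparisons, the sketch below implements the familiar Elo update, in which each player's rating moves in proportion to the gap between the actual and the expected result. Elo is used here only as a well-known example; it is not claimed to be the particular formulation surveyed in Smith (1993), and the K-factor of 32 and starting rating of 1500 are conventional but arbitrary choices.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update(rating_a, rating_b, score_a, k=32):
    """Return both new ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, and 0.0 for a loss.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# Two equally rated players; A wins.
new_a, new_b = update(1500, 1500, 1.0)
print(new_a, new_b)
```

Because one player's gain equals the other's loss, the sum of ratings is conserved, and an upset against a much stronger opponent moves both ratings further than an expected result does.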

Financial markets would provide an excellent testbed and source of ideas for TOOL. Currently, many retail customers get opinions on financial products from friends, personal experience, and a few large news and opinion servers, while large investors have substantial research resources and are more peer-to-peer in their approach. Just as with any large-scale implementation of TOOL, many financial players have strong incentives to mislead and skew the opinion pool of other players; in response, safeguards of varying efficacy have been established at both formal (regulatory authorities, legal action) and informal ("common sense") levels. When large traders make swaps and other peer-to-peer deals, issues of trust and reliability are crucial underpinnings on which complex financial deals higher in the "financial protocol stack" are built. The agreements and authentication mechanisms of large players create a predictable environment in which small players can function. We may also find that a sufficiently trustworthy reputation system will herald a new form of currency; in the scientific and hacker worlds, a good reputation is very portable, often translatable to traditional money, and sometimes tradeable for difficult-to-purchase power or access. Perhaps TOOL will be a key step toward more direct investment in human capital.


Looking Forward: A Collaborative Decision Substrate

"None of us is as smart as all of us." - Japanese proverb

Combining Opinions with Reality

To go beyond the recommender system aspect of TOOL, one desirable next step is adding objective tests from the external environment. Tying TOOL to a persistent tracker of the effect of given opinions on the real world would allow analysis and selection of opinion sources, which over time should improve the quality of opinion sources. Note that the mere existence of such a mechanism may discourage many intentionally frivolous or malicious opinions, if the generators of such opinions realize that an opinion-tracker may hurt their reputation down the line. Examples of such mechanisms are predictive algorithms such as Hanson's Idea Futures, as described in Hanson (1999); measuring the desire of opinion-generators for currently-nonexistent choices; and autonomous suggestion or generation of tests (through data mining, adaptive, or economic techniques) to validate or discredit strongly-held opinions, and to exploit other suspected rating and social network patterns.
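
One simple way such a persistent tracker might score opinion sources is sketched below, under the assumption that opinions are stated as probabilities of events which later resolve as true (1) or false (0). The Brier score, a standard measure of forecast accuracy, is used here as one possible choice; the source names and forecast histories are invented for illustration.

```python
def brier_score(forecasts):
    """Mean squared gap between stated probabilities and 0/1 outcomes.

    `forecasts` is a list of (probability, outcome) pairs; lower is better.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


def rank_sources(history):
    """Order source names from most to least accurate by Brier score."""
    return sorted(history, key=lambda name: brier_score(history[name]))


# Hypothetical track records: (stated probability, what actually happened).
history = {
    "careful_rater": [(0.9, 1), (0.8, 1), (0.1, 0)],
    "coin_flipper": [(0.5, 1), (0.5, 1), (0.5, 0)],
}
ranking = rank_sources(history)
print(ranking)
```

A listener could then weight each source's future opinions by its past score, so that a long record of confident but wrong opinions steadily reduces a source's influence.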

Extending Idea Futures is a particularly attractive way to move TOOL from collaborative filtering to "collaborative judging". The Delphi Method is a process for structuring group communication to deal with complex problems; Adler and Ziglio (1996) contains theoretical ideas and case studies in social situations. Idea Futures is a similar economically-based model developed by Robin Hanson that sets up tradeable and well-specified propositions about future events. The hope is that liquid idea markets and knowledgeable investors will result in proposition prices reflecting accurate collective judgements.

The idea has been implemented in the Foresight Exchange and discussed in Kittlitz (1999); another well-known implementation with good empirical results for predicting election results is the Iowa Electronic Markets. One obstacle to extending the use of these markets is the feeling by regulatory agencies that financial products should have some demonstrated non-gambling use. Many OTC ("over the counter") derivatives are equivalent to betting on some future event; the difficulty of using them for Idea Futures is that they typically have large transaction and search costs. But perhaps over time many common ones will become commoditized; the CME is a good place to watch, and is now trading Weather Futures.

Pennock (1999) explores the related computational theory of combining different agents' beliefs; combining Idea Futures and Decision Markets with Bayesian Networks could provide a powerful collaborative decision-making framework. The efficiency and accuracy of the best-known Idea Futures implementations is investigated in Pennock et al. (2001). They find them to be reasonably reliable in practice, and suggest that such "Web games" form a useful pool of informed opinion, even in their current incarnation where no real money is at stake.
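
For readers unfamiliar with belief aggregation, the sketch below shows two textbook ways of combining several agents' probabilities for a single yes/no proposition: the linear opinion pool (a weighted average) and the logarithmic opinion pool (a normalized weighted geometric mean). These are standard pooling rules offered as background; they are not claimed to be the market-based mechanisms developed in Pennock (1999), and the weights and probabilities are made up.

```python
import math


def _check(probs, weights):
    """Require matching lengths, weights summing to 1, and probabilities in (0, 1)."""
    if len(probs) != len(weights):
        raise ValueError("one weight per probability is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if not all(0.0 < p < 1.0 for p in probs):
        raise ValueError("probabilities must lie strictly between 0 and 1")


def linear_pool(probs, weights):
    """Weighted arithmetic mean of the agents' probabilities."""
    _check(probs, weights)
    return sum(w * p for p, w in zip(probs, weights))


def log_pool(probs, weights):
    """Weighted geometric mean of the probabilities, renormalized over yes/no."""
    _check(probs, weights)
    yes = math.prod(p ** w for p, w in zip(probs, weights))
    no = math.prod((1.0 - p) ** w for p, w in zip(probs, weights))
    return yes / (yes + no)


# Two equally weighted agents: one undecided, one fairly confident.
beliefs = [0.5, 0.9]
weights = [0.5, 0.5]
print(linear_pool(beliefs, weights))
print(log_pool(beliefs, weights))
```

The two rules generally disagree: in this example the logarithmic pool gives a higher combined probability than the linear pool, because the undecided agent pulls a geometric mean less than it pulls an arithmetic one.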

There is a deep epistemological issue here; how are the objects about which opinions are being traded defined? In the theory of financial markets, artificial constructs known as Arrow-Debreu securities give a theoretical answer to this question. These are atomic securities which allow investors to buy and sell a stake in specific states of the world at some future time. Ideally, the presence of enough of these securities would allow an investor to construct a portfolio that hedges or speculates on any possible future market event. In practice, real markets fall far short of providing such completeness, due to factors such as high overhead and the difficulty of defining an unambiguous "basis set" of atomic securities. Each sector of human activity has a corresponding ranked set of products for which reputations or market-derived opinions would be in demand if they existed. The products with greatest potential demand would tend to be those for which the overhead of setting up a market would be feasible. Many non-economic sectors are currently underserved in the market for hedging against future states of the world.
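
A toy example may help show what completeness would buy. Assume, purely for illustration, a world with a few mutually exclusive states and one atomic security per state that pays 1 unit if that state occurs and nothing otherwise. Any desired pattern of payoffs can then be assembled by holding the right quantity of each atomic security, and its cost is the price-weighted sum of those quantities. The states and prices below are invented.

```python
def replication_cost(state_prices, payoffs):
    """Cost of a portfolio of atomic securities that delivers `payoffs`.

    state_prices[s] is the price today of a security paying 1 unit in state s;
    payoffs[s] is the amount wanted if state s occurs, which is also the
    number of state-s securities to hold.
    """
    if state_prices.keys() != payoffs.keys():
        raise ValueError("payoffs must be specified for exactly the priced states")
    return sum(state_prices[s] * payoffs[s] for s in state_prices)


# Hypothetical two-state world with made-up prices.
state_prices = {"rain": 0.3, "sun": 0.7}

# A claim that pays 100 only if it is sunny, and one that pays 100 regardless.
sunny_claim = {"rain": 0.0, "sun": 100.0}
sure_claim = {"rain": 100.0, "sun": 100.0}

sunny_cost = replication_cost(state_prices, sunny_claim)
sure_cost = replication_cost(state_prices, sure_claim)
print(sunny_cost, sure_cost)
```

The practical difficulty noted above shows up immediately in such a sketch: someone has to enumerate the states unambiguously and maintain a market for each one.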

The Acid Test

Every significant new technology has potential for good and evil, and TOOL is no exception; we already see many examples of the mixed benefits of opinions. Knowledge of consumer opinions helps target real market needs, but leads to intrusive marketing techniques and potential unethical treatment based on projected customer value. Medical records can be used to evaluate and mitigate future health risks and prevent fraud, or to deny insurance and even employment (and genetic information may be far more controversial). Profiling of individuals and groups can lead to proactive threat identification, or to overzealous action against potential threats and misuse by unethical agencies.

As a first step toward evaluating the ethical and practical benefit of TOOL, I suggest the following questions be asked by future observers and builders:

Base layer:

Analysis layer:


Conclusion: Decision Substrates and Distributed Democracy

We have discussed how TOOL could be used to reduce opinion-search costs, to observe and analyze opinion streams, and to reward opinion-generators objectively. The end result of these steps will be an "adaptive decision substrate", complementing and enhancing the collective observation and decision processes of markets and human consciousness. This substrate will become just as ubiquitous as our current-day monetary system, and just as essential for carrying out trade, directing human attention, and incentivizing society. See Chislenko (1997) for an insightful discussion of this prospect.

I believe that TOOL could eventually become a "credit allocation layer" for many human endeavors. One of the advantages of markets is their resource allocation and price discovery functions, implicitly performing many complex decisions in a largely distributed manner. The ability to directly generate and analyze robust opinions can be expected to amplify the computational power of our societal infrastructure. In many adaptive computational systems, a key function is the fitness or reward function, which provides feedback that directs the system toward better behavior. The ability to objectively reflect on our individual and aggregate behavior - and come to a consensus on what works and what doesn't - will profoundly increase our collective capability to solve the increasingly complex problems that we will be faced with in the future.

In AI research, developing good methods for credit allocation is well known to be a key hurdle to developing better algorithms. The difficulty - and it is one common to almost all areas of model-building from complex phenomena - lies in deciding which of many actions taken in an environment lead to good or bad performance in that environment. This is easy to tell in many everyday situations: "Hmm, why did she fail that exam? Maybe skipping all the classes, not studying, and spending the night before the exam partying had something to do with it." However, it becomes increasingly difficult as the environment gets more complex: "Why did his marriage turn out to be a failure? What should he have done differently?" The difficulties only compound when policies are being made that affect thousands or millions of people. Integrating opinion storage and dissemination with credit allocation methods from AI in the same substrate might highlight patterns of efficiency (or stupidity) that our unaided human minds cannot readily perceive. Being able to reliably perceive chains or trees of dozens of influences that lead to a particular positive outcome might in turn spawn the "IVC" (intellectual venture capital) industry, where one could invest small amounts in many highly risky early-stage ideas, or reward the generators of such ideas after the fact even if they had delayed or tenuous connections to the final outcome.

A related societal benefit would come from tracking the "happiness levels" of populations. Monitoring these levels could direct attention toward trouble spots, preferably before highly unhappy people explode in ways destructive to themselves or others. Similarly, a population's opinions on a variety of personal and social issues could be aggregated into a social analogue of GNP (maybe "GHP", for gross human product) that would more directly measure the wealth levels experienced by the population.

As we have seen, the TOOL framework describes a feasible, robust, distributed protocol with the potential to be useful in a broad variety of situations. The next step is to implement an extensible subset of this protocol and validate its utility. Positive user feedback and continued usage in real-world domains will demonstrate the power of sharing and analyzing our opinions and implicit valuations of the universe around us.


About the Author

Hassan Masum is a Ph.D. candidate in the Department of Computer Science at Carleton University in Ottawa, Canada. His research interests include evolutionary computation, heuristic optimization, policy and strategic analysis, and social algorithms.

Acknowledgments

Alexander Chislenko, Ben Houston, and Philip Painter contributed many ideas during the embryogenesis of this article. Thanks also to Hyder Masum, Franz Oppacher, and Peter Turney for valuable discussion and feedback.

References

Michael Adler and Erio Ziglio (editors), 1996. Gazing into the Oracle: The Delphi Method and Its Application to Social Policy and Public Health. London: Jessica Kingsley Publishers.

Richard K. Belew, 2000. Finding Out About: A Cognitive Perspective on Search Engine Technology and the WWW. New York: Cambridge University Press.

David Brin, 1998. The Transparent Society: Will Technology Force Us to Choose Between Privacy and Freedom? Reading, Mass.: Addison-Wesley.

Alexander Chislenko, 1997. "Automated Collaborative Filtering and Semiotic Transports," at, accessed 2 May 2002.

William W. Cohen and Wei Fan, 1999. "Web-Collaborative Filtering: Recommending Music by Crawling the Web," WWW9 (Ninth International World Wide Web Conference), at, accessed 2 May 2002.

Leonard Newton Foner, 1999. "Political Artifacts and Personal Privacy: The Yenta Multi-Agent Distributed Matchmaking System," PhD Thesis, MIT Media Laboratory, at, accessed 2 May 2002.

Francis Fukuyama, 1999. The Great Disruption: Human Nature and the Reconstitution of Social Order. New York: Free Press.

Francis Fukuyama, 1995. Trust: The Social Virtues and the Creation of Prosperity. New York: Free Press.

John Kenneth Galbraith, 1990. A Short History of Financial Euphoria. Knoxville, Tenn.: Whittle Direct Books.

Simson Garfinkel, 2000. Database Nation: The Death of Privacy in the 21st Century. Cambridge, Mass.: O'Reilly.

Herbert Gintis, 2000. Game Theory Evolving: A Problem-centered Introduction to Modeling Strategic Behavior. Princeton, N.J.: Princeton University Press.

Michael H. Goldhaber, 1997. "The Attention Economy and the Net," First Monday, volume 2, number 4 (April), at, accessed 2 May 2002.

Robin Hanson, 1999. "Decision Markets," IEEE Intelligent Systems, volume 14 number 3 (May/June), pp. 16-19.

Bernardo A. Huberman, Matt Franklin, and Tad Hogg, 1999. "Enhancing Privacy and Trust in Electronic Communities," Proceedings of the ACM Conference on Electronic Commerce (EC-99), pp. 78-86; abstract at

Markus Jakobsson and Moti Yung, 1998. "On Assurance Structures for WWW Commerce," In: Rafael Hirschfeld (editor). Financial Cryptography, Second International Conference, FC'98, Anguilla, British West Indies, 23-25 February, 1998, Proceedings. Also Lecture Notes in Computer Science, number 1465. Berlin: Springer-Verlag, pp. 141-157.

Christopher M. Kelty, 2001. "Free Software/Free Science," First Monday, volume 6, number 12 (December), at, accessed 2 May 2002.

Ken Kittlitz, 1999. "Experiences with the Foresight Exchange," Extropy: Journal of Transhumanist Solutions (June), at, accessed 2 May 2002.

Daniel B. Klein (editor), 1997. Reputation: Studies in the Voluntary Elicitation of Good Conduct. Ann Arbor: University of Michigan Press.

Lawrence Lessig, 2001. The Future of Ideas: The Fate of the Commons in a Connected World. New York: Random House.

Hassan Masum, Steffen Christensen, and Franz Oppacher, 2002. "The Turing Ratio: Metrics for Open-Ended Tasks," Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2002). San Francisco: Morgan Kaufmann, pp. 973-980.

M.A. Nowak and K. Sigmund, 1998. "The Dynamics of Indirect Reciprocity," Journal of Theoretical Biology, volume 194, number 4, pp. 561-574.

Ayla Ogus, Michael de la Maza and Deniz Yuret, 1999. "The Economics of Internet Companies," presented at Computing in Economics and Finance: Fifth International Conference of the Society for Computational Economics, at, accessed 2 May 2002.

Larry Page, Sergey Brin, Rajeev Motwani, and Terry Winograd, 1998. "The PageRank Citation Ranking: Bringing Order to the Web," Stanford University Technical Report, at, accessed 2 May 2002.

David M. Pennock, 1999. "Aggregating Probabilistic Beliefs: Market Mechanisms and Graphical Representations," PhD Thesis, University of Michigan, at, accessed 2 May 2002.

David M. Pennock, Steve Lawrence, Finn Årup Nielsen, and C. Lee Giles, 2001. "Extracting Collective Probabilistic Forecasts from Web Games," Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001) (August), at, accessed 2 May 2002.

Robert D. Putnam, 2000. Bowling Alone: The Collapse and Revival of American Community. New York: Simon and Schuster.

Eric Raymond, 1999. The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. Cambridge, Mass.: O'Reilly.

John G. Riley, 2001. "Silver Signals: Twenty-Five Years of Screening and Signaling," Journal of Economic Literature, volume 39, number 2 (June), pp. 432-478.

Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl, 2001. "Item-based Collaborative Filtering Recommendation Algorithms," WWW10 (Proceedings of the 10th International World Wide Web Conference) (May), at, accessed 2 May 2002.

J. Ben Schafer, Joseph Konstan, and John Riedl, 1999. "Recommender Systems in E-Commerce," EC-99: Proceedings of the ACM Conference on Electronic Commerce (November), pp. 158-166.

Carl Shapiro and Hal R. Varian, 1998. Information Rules: A Strategic Guide to the Network Economy. Boston: Harvard Business School Press.

Warren D. Smith, 1993. "Rating Systems for Gameplayers, and Learning," NEC Institute Technical Report, at, accessed 2 May 2002.

Marc Stiegler, 1999. Earthweb. Bronx, N.Y.: Baen Books.

Elaine Svenonius, 2000. The Intellectual Foundation of Information Organization. Cambridge, Mass.: MIT Press.

Sergios Theodoridis and Konstantinos Koutroumbas, 1999. Pattern Recognition. San Diego, Calif.: Academic Press.

James R. Thompson, 2000. Simulation: A Modeler's Approach. New York: Wiley.

Vernor Vinge and James Frenkel, 2001. True Names and the Opening of the Cyberspace Frontier. New York: TOR.

Duncan J. Watts, 1999. Small Worlds: The Dynamics of Networks Between Order and Randomness. Princeton, N.J.: Princeton University Press.

Editorial history

Paper received 13 May 2002; accepted 14 June 2002.


Copyright ©2002, First Monday

Copyright ©2002, Hassan Masum

TOOL: The Open Opinion Layer by Hassan Masum
First Monday, volume 7, number 7 (July 2002),