First Monday

Escher Staircases on the World Wide Web

Abstract
Escher Staircases on the World Wide Web by Ronald Rousseau and Mike Thelwall

It is shown that Escher Staircases, i.e. cycles of four nodes in a graph with reciprocal links, form a basic structural element on the World Wide Web.

Contents

Introduction
Methods
Results
Discussion and conclusion

 


 

++++++++++

Introduction

Finding and describing link structures is one of the main topics of webometrics, the subfield of informetrics, defined by Björneborn and Ingwersen (in press) as "The study of the quantitative aspects of the construction and use of information resources, structures and technologies on the Web drawing on bibliometric and informetric approaches." Links on the Web and citation relations between scientific articles can both be described as mathematical graphs. Yet, there is one major difference between the two. The World Wide Web is in principle a bidirectional graph, i.e. links between nodes can go from node n1 to node n2 and vice versa. An article citation graph, on the other hand, is in principle unidirectional. If article a1 cites article a2, then it almost never occurs that article a2 also cites article a1. Certainly, it seems impossible that article a1 cites article a2, a2 cites a3, a3 cites a4 and a4 cites a1. The first author has searched for this phenomenon for more than 10 years, but has never found it. (Of course, now that the cat is out of the bag, four friends can easily arrange this to happen, perhaps through exchanging preprints.)

This, however, might easily happen on the Web, or, e.g., for author citation graphs. Because of its resemblance to the famous Escher lithograph "Ascending and descending" (see Figure 1) we call this structure an Escher Staircase. More precisely, an Escher Staircase is defined as a set of four nodes in a graph (nj, j=1,..,4) with reciprocal links (not just unidirectional!) from n1 to n2, from n2 to n3, from n3 to n4 and finally, closing the staircase, from n4 to n1.


Figure 1: Maurits Cornelis Escher (1898–1972): "Ascending and descending" (lithograph, March 1960).
M.C. Escher’s "Ascending and Descending" © 2004 The M.C. Escher Company — Holland. All rights reserved. Used by permission. www.mcescher.com.

Note that the staircase in the picture of Maurits Cornelis Escher was actually an idea of Roger Penrose and his father Lionel in 1958.

So we began our search for an Escher staircase on the Web.

 

++++++++++

Methods

The link structure of New Zealand university Web sites was obtained by using a specialist information science Web crawler (Thelwall, 2001a; 2003). The sites were crawled in December 2003 and the results placed online ( http://cybermetrics.wlv.ac.uk/database/). It has been shown that a specialist crawler gives a more accurate and more controlled coverage than a commercial search engine (Thelwall, 2001b).

An existing program was used to convert the link structure to domain Alternative Document Model form (Thelwall and Harries, 2003). This identifies links using only the domain name of the source URL and target URL. The purpose of this was to identify Escher Staircases for interlinked domains rather than individual pages.

A new program was written to extract Escher Staircases from the resulting file. The program was based upon a standard breadth first search, using the following rules to identify legal paths:

(a) There must be a sequence of five domains;
(b) Each domain must link to the domain before it and after it;
(c) The first domain must be the same as the last domain, and all of the other domains must be different; and,
(d) (Optional) None of the domains can be the home domain of a university.

Escher Staircases were then identified in all of the university link files separately, using option (d). Following standard practice (e.g., Thelwall and Harries, 2003) we removed all links within each individual university when analysing the combined set. The Escher program produced two outputs: a list of all Escher Staircases and a Pajek matrix of all links in all Escher Staircases. Pajek (Batagelj and Mrvar, 2003) is a program for graphing and describing networks. Given the large number of Escher Staircases, the results are most usefully presented in graphical form.

 

++++++++++

Results

Figures 2 to 5 are network diagrams for the universities that had at least one Escher Staircase (excluding the home domain). Massey University is omitted because it had so many domains that its diagram was a jumbled mass of links. The structure can be seen in the smaller sites but is not evident in the larger ones. For the larger graphs the positioning is determined by the Kamada and Kawai (1989) algorithm, with slight manual adjustments for clarity. The smaller graphs (Figure 5 and Figure 7) are in a purely manual arrangement, again for clarity. There are two versions of the diagrams for international university interlinking: Figure 6, with home domains, and Figure 7 without. The node labels are the domain names, with any initial ‘www’ removed. The missing universities are Auckland University of Technology, Lincoln University and Victoria University of Wellington, none of which had any Escher Staircases, in addition to Massey University.


Figure 2: Auckland University domain–based Escher Staircases.Larger version of image.


Figure 3: University of Canterbury domain–based Escher Staircases.Larger version of image.


Figure 4: Otago University domain–based Escher Staircases.Larger version of image.


Figure 5: Waikato University domain–based Escher Staircases.Larger version of image.


Figure 6: New Zealand universities domain–based Escher Staircases with home domains.Larger version of image.


Figure 7: New Zealand universities domain–based Escher Staircases.Larger version of image.

 

++++++++++

Discussion and conclusion

The existence of Escher Staircases is partly determined by site size: a small site is less likely to have some than a large site. But it is also partly determined by the structure of the site: one with many links per page will tend to naturally grow Escher Staircases.

Inspection of the individual diagrams shows some trends. Figure 7 gives mainly theme–based staircases: for libraries, math and computer science. Interestingly, however, they are all interconnected. This is partly due to the pseudo–home domains of www2.auckland.ac.nz and www2.vuw.ac.nz, which function as cross–topic connectors (Björneborn, 2004). Figure 6, with the home domains, shows how many extra staircases the home domains generate. Figures 2 to 5 echo the themes of math, computer science and library–related sites. Other subjects are also apparent. Figure 5 is a disconnected network, with a student support staircase separate from the main group.

In network analysis dyads and triads, especially transitive triads (being transitive closures) are considered to be basic building blocks of networks (Wasserman and Faust, 1994). The Escher Staircase clearly is another structural element that is common to social network and informetric (here: webometric) studies (Otte and Rousseau, 2002).

Of course, there will be larger cycles than four–cycles. We challenge the reader to find the largest possible cycle with reciprocal links. End of article

 

About the authors

Ronald Rousseau is Associate Professor at the Katholieke Hogeschool Brugge–Oostende ( KHBO), Industrial Sciences and Technology, Belgium and Guest Professor at Antwerp University in Belgium.
E–mail: ronald.rousseau@khbo.be.

Mike Thelwall is a Reader in the Maths & Stats Section as well as Head of the Statistical Cybermetrics Research Group in the School of Computing and Information Technology, University of Wolverhampton.
E–mail: m.thelwall@wlv.ac.uk.

 

References

V. Batagelj and A. Mrvar, 2003. "Pajek — Analysis and visualization of large networks," In: M. Jünger and P. Mutzel (editors). Graph Drawing Software. Berlin: Springer–Verlag, pp. 77–103.

L. Björneborn, 2004. "Small–world link structures across an academic Web space: a library and information science approach," PhD Thesis. Royal School of Library and Information Science, Copenhagen, Denmark; at http://www.db.dk/lb/phd/phd-thesis.pdf, accessed 12 May 2004.

L. Björneborn and P. Ingwersen, in press. "Towards a basic framework for webometrics," Journal of the American Society for Information Science and Technology.

T. Kamada and S. Kawai, 1989. "An algorithm for drawing general undirected graph," Information Processing Letters, volume 31, number 1, pp. 7–15.

E. Otte and R. Rousseau, 2002. "Social network analysis: a powerful strategy, also for the information sciences," Journal of Information Science, volume 28, number 6, pp. 443–455.

M. Thelwall, 2003. "A free database of university Web links: Data collection issues," Cybermetrics, volume 6/7, issue 1, at http://www.cindoc.csic.es/cybermetrics/articles/v6i1p2.html, accessed 12 May 2004.

M. Thelwall, 2001a. "A Web crawler design for data mining," Journal of Information Science, volume 27, number 5, pp. 319–325.

M. Thelwall, 2001b. "Extracting macroscopic information from Web links," Journal of the American Society for Information Science and Technology, volume 52, number 13, pp. 1157–1168.

M. Thelwall and G. Harries, 2003. "The connection between the research of a university and counts of links to its Web pages: An investigation based upon a classification of the relationships of pages to the research of the host university," Journal of the American Society for Information Science and Technology, volume 54, issue 7, pp. 594–602.

S. Wasserman and K. Faust, 1994. Social network analysis. Cambridge: Cambridge University Press.


Editorial history

Paper received 31 March 2004; accepted 26 April 2004.


Contents Index

Copyright ©2004, First Monday

Copyright ©2004, Ronald Rousseau and Mike Thelwall

Escher Staircases on the World Wide Web by Ronald Rousseau and Mike Thelwall
First Monday, volume 9, number 6 (June 2004),
URL: http://firstmonday.org/issues/issue9_6/rousseau/index.html