First Monday

FM Reviews

Heting Chu.
Information Representation and Retrieval in the Digital Age.
Medford, N.J.: Information Today, 2003.
cloth, 250 p., ISBN 1-573-87172-9, US$44.50, US$35.60 (ASIST members).
Information Today: http://books.infotoday.com/

Heting Chu. Information Representation and Retrieval in the Digital Age

I rarely write book reviews, but as an educator and author I liked this book so much that I could not resist spreading the word about it in this open access digital journal, which has the widest reach for the target audience. When I taught the introductory course on Information Storage and Retrieval a few years ago, I looked long and hard for a book that covered the foundations, as well as the historical and current issues, at the appropriate depth and breadth for beginning master's students of library and information science. I could not find one. The books were either outdated or too theoretical, and they were also far too expensive.

Heting Chu, an associate professor at the Palmer School of Library and Information Science of Long Island University, has written a book with the perfect ingredients in perfect proportion. It is not only a gourmet mixed plate that certainly pleases the palate of students and educators, but one that is also healthy and affordable at the same time, which is no small feat.

The book consists of 12 chapters of about 20 pages each on average, discussing — after presenting an intellectual timeline and the major milestones in information representation and retrieval (IR&R) — indexing, abstracting and other metadata issues, controlled vocabulary and natural language searching, information retrieval models, approaches and techniques applied to textual and multimedia information, human–computer interaction, evaluation of content and software features, and artificial intelligence. One may wonder how so many issues can be covered in a 250-page book (I filled that many pages in my book dedicated to content evaluation of textual databases alone).

Well, the book proves that it can be done very well, providing the essential information in a concise style: exactly what is needed in the typical 12–15 week trimester or semester. Introductory classes on IR&R should function like bite–size buffets that let people taste the variety of a cuisine; they can then proceed next time to a full meal of whatever they liked most. I am borrowing this analogy from the practice of a fellow faculty member, donna Bair–Mundy, who has developed such a course and has taught it with great success for years. It helps students considerably because they get the big picture while still gaining a sense of the different components and issues within the global framework of IR&R. This approach gives them additional guidance in choosing courses dedicated to abstracting & indexing, advanced searching, or human–computer interaction, whichever may have caught their fancy in the introductory overview course.

Chu writes concisely and clearly, impressively distills the essence of long papers, and provides references to them and other relevant papers for further reading. The organization of the chapters, sections and subsections is exemplary and clearly reflects her mastery of the topics. Many of the chapters are also appropriate for introducing topics in specialized courses. I was particularly pleased with the strong emphasis on the importance of browsing as part of the information retrieval process.

There was one sub–topic that I really missed: the emergence and exploding use of the Digital Object Identifier (DOI) and of reference linking through CrossRef, which are key tools and services for efficient information representation and retrieval in the digital age, yet simple enough for an introductory course. A reference to HighWire Press, a digital facilitator that presents awesome examples of the empowering use of DOI and reference linking in many of the open access scholarly journals it hosts, could lead beginning students and practitioners to see the state of the art in IR&R.

Of course, I have disagreements with the author here and there, such as in the discussion of stop–word lists, which mentions that "engineering would be meaningless either as an indexing term or a query term in an engineering database" and implies that it is prevented from becoming an indexing or query term. If it were indeed a stop word, then how would one search, say, for the concept of reverse engineering, which occurs in nearly 1,500 records in the Compendex database? It would have been better to identify real examples to illustrate the use of unusual stop words, such as "you" and "love" in the CD–ROM version of OCLC’s Music Library. Not that love and you are meaningless words; rather, they occur in so many song titles that their index entries would have filled half the space. These are just minor quibbles in my praise for an excellent book.
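The stop-word problem is easy to demonstrate. The following is a minimal sketch of my own (a toy indexer, not anything from the book or from Compendex) showing how stopping "engineering" makes a phrase such as "reverse engineering" unsearchable:

```python
# Toy example: an over-broad stop-word list breaks phrase searching.
# If "engineering" is stopped, the query "reverse engineering" can never
# match, even in an engineering database.

STOP_WORDS = {"the", "a", "of", "engineering"}  # "engineering" stopped, as the book implies

def index_terms(text):
    """Tokenize and drop stop words, as a naive indexer would."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]

record = "Advances in reverse engineering of legacy software"
query = "reverse engineering"

indexed = index_terms(record)      # "engineering" never reaches the index
query_terms = index_terms(query)   # the query collapses to just ["reverse"]

print(indexed)       # ['advances', 'in', 'reverse', 'legacy', 'software']
print(query_terms)   # ['reverse']
```

A search for the two-word phrase can no longer be distinguished from a search for "reverse" alone, which is exactly the problem the review raises.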

The table of contents should be posted by the publisher (neither Amazon nor Barnes and Noble had it at the time I wrote this review), as it provides such a clear picture of the lay of the IR&R land, letting readers see the forest as well as the trees. Many potential buyers look first at the table of contents, and that of this book is exemplary. Many would also realize that the table of contents offers an excellent outline, as–is, for classroom presentations in a course. I highly recommend this book; it makes me pine for teaching that introductory course again. — Peter Jacso, Professor, Department of Information and Computer Sciences, University of Hawaii. End of Review

++++++++++

Ian Graham.
A Pattern Language for Web Usability.
London: Pearson, 2003.
paper, 304 p., ISBN 0-201-78888-8, UK£36.99.
Pearson: http://www.pearsoned.co.uk

Ian Graham. A Pattern Language for Web Usability.

Ian Graham’s book is about creating usable Web sites and, apart from an introduction and examples of its use, the core of the text describes 79 design patterns and guidelines. How useful are they?

Example number 27 describes the now well–known "no frames" rule. In it, the author talks of using layers and tables instead of frames, and of cutting and pasting (and automating pastes) if using dynamic pages. When using layers with Netscape 4 it is advisable to apply the well–known fix for resizing. One little shortcoming I found here is that Graham fails to mention templates, perhaps the most common technique for automating a common look–and–feel: users of the most popular Web authoring package, Macromedia Dreamweaver MX, are probably familiar with using such templates alongside CSS (cascading style sheets) for uniformity.
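To illustrate the templating idea in miniature (a hand-rolled sketch, not Dreamweaver's actual mechanism; the file names and page chrome are invented for the example), the shared look-and-feel lives in one template, and only the per-page content varies, so a site-wide change is made in a single place:

```python
# Minimal illustration of template-driven look-and-feel: every page is
# generated from one shared skeleton, so navigation and styling stay uniform.
from string import Template

PAGE = Template("""<html><head><title>$title</title>
<link rel="stylesheet" href="site.css"></head>
<body><div id="nav">Home | About | Contact</div>
$content
</body></html>""")

home = PAGE.substitute(title="Home", content="<p>Welcome.</p>")
about = PAGE.substitute(title="About", content="<p>About us.</p>")

# Both pages share the same stylesheet link and navigation bar.
print("site.css" in home and "site.css" in about)  # True
```

Dedicated authoring tools add change-propagation and locked regions on top of this, but the uniformity argument is the same.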

Out of curiosity, I examined the companion Web site at http://www.trireme.com in November 2003, which the author created for the company he works for. There are rather a lot of pictures on the home page, which seems to go against his rule number 42 ("minimise download time"). It also did not pass basic accessibility checks (there were some missing ‘alt’ attributes), as specified in his rule number 57. The navigation is of the "pop–up surprise" variety. But these are side issues: there are problems with any book, and they do not invalidate the central point it tries to make.
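The missing-‘alt’ check mentioned above is one of the simplest accessibility tests to automate. Here is a minimal sketch using Python's standard HTML parser (the sample markup is invented for the example; real checkers test far more than this one case):

```python
# Flag <img> elements that lack an 'alt' attribute, the basic
# accessibility failure noted in the review.
from html.parser import HTMLParser

class MissingAltChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.missing = 0  # count of <img> tags with no alt attribute

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.missing += 1

page = '<html><body><img src="logo.gif"><img src="photo.jpg" alt="Office"></body></html>'
checker = MissingAltChecker()
checker.feed(page)
print(checker.missing)  # 1 image is missing its alt text
```

Full accessibility validators also check contrast, form labels, frame titles and so on, but even this one-rule script catches the flaw the review describes.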

If I were redesigning the home page, I would move the pictures up from the bottom in order to eliminate the need for vertical scrolling (eliminating it is usually desirable on a home page). More use, too, could be made of style sheets to ensure a common look and feel.

In terms of writing style, the book comes across as a little idiosyncratic. There is a picture for each pattern ("a sensitizing image"), and there is humour, if a little cringe–inducing at times: the ubiquitous KISS (Keep It Simple, Stupid) rule number 38 is illustrated with, you’ve guessed it, a couple kissing. In the old days of the Internet this could have been illustrated by comparing the exemplary, uncluttered home page of Google with (say) Yahoo's; that simplicity was surely a major factor in Google's subsequent success.

The book could usefully be consulted as a kind of checklist, but it would not seem very practical to have to flick through numerous pages in order to assess whether a given style would irritate. I am not quite sure about the real value of this title because, despite the interesting pattern–language approach it takes, it does not seem to add much to the subject area.

To me this book is far more suitable for project managers, who need to get an idea of the main issues related to Web design and usability, than for a technical professional or an expert practitioner.

In short, it is a book I would be happy to keep in my library and one I would lend out to whomever wanted a quick overview, rather than an encyclopedic in–depth study of the subject. — Peter Cambridge. End of Review

++++++++++

Jason Whittaker.
The Cyberspace Handbook.
New York: Routledge, 2003.
paper, 336 p., ISBN 0-415-16836-8, US$25.95.
Routledge: http://www.routledge.com

Jason Whittaker. The Cyberspace Handbook.

Whittaker teaches journalism and new media at Falmouth College of Arts in Cornwall, U.K., and his background in computer journalism shines through. This book is very well written, and it is both authoritative and informative while being easy to read. It is split into four sections, and I’ll discuss each separately.

First the "Introduction and Contexts" walks briefly through what is meant by cyberspace, something of the history of the Internet and the Web and looks at the physical Web, with reference to Dodge and Kitchen’s Atlas of Cyberspace, and the big players on the Web — Microsoft, AOL–Time Warner and the Open Source movement. If this book has a fault it is some of the errors and omissions or areas of ambiguity, and this is where I hit the first. On page 44, Whittaker discusses Open Source, Linux and BSD and includes the line "(Open source) ..., has also been fundamental to the expansion of cyberspace, enabling systems such as TCP/IP and the Web."

While nobody can dispute the importance of Open Source in the expansion of cyberspace, this suggests that TCP/IP grew from the Open Source movement when, in reality, Linus Torvalds was three years old when Cerf and Kahn released their paper on TCP/IP in 1974, and fourteen when TCP/IP was built into the Berkeley version of UNIX. Further on in the book, TCP/IP is discussed in a little more detail and mention is made of Paul Baran, but poor old Cerf and Kahn are not mentioned anywhere.

I’m not nit–picking, nor is this a major component of the book, or even what the book sets out to cover; but if it is used as a starting point for discovering the history of cyberspace, this element might be misleading. This section goes on to look at the connection technologies currently available, with particular reference to the U.K., and deals well with the recent changes in the ISP landscape, as well as explaining how companies like Freeserve found themselves part of a French conglomerate and based in Madeira.

The next section is "Using cyberspace" and is as much as anyone might need to introduce them to getting online, dealing with the various connection options and the software "toolkit" needed before dealing briefly with a history of computing technologies including the Difference Engine. Again though I hit a stumbling block, small and minor but, crucially inaccurate. On page 83 Whittaker suggests that Kildall’s CP/M, was rewritten for the 8086 and 8088 processors and used by IBM as Q–DOS before being purchased by Microsoft and renamed MS–DOS. In fact Q–DOS was developed by Tim Paterson of Seattle Computer Products and Gates bought it and licensed it on to IBM because Kildall (or his wife) wouldn’t sign the IBM non–disclosure agreement and thus CP/M didn’t get to be the OS of choice on IBM’s PC. CP/M became DR–DOS. A small but significant point in the development of the PC we use today and also a critical issue in the growth and dominance of Microsoft. This section also includes a great overview of software technologies including vector and raster imaging, compression, modelling and rendering in 2D and 3D and animation. The chapter on Webcasting and digital broadcasting encompasses the differing technologies on both sides of the Atlantic (PAL and NTSC) as well as HDTV and discusses the differences in various Webcast codecs. The "coming of age" events in the story of Webcasting are dealt with here before the discussion moves on to gaming — again a great introduction to how digital games have grown from Spacewar on the PDP–1 at MIT in 1961 to online, immersive worlds such as Doom, Quake and Unreal.

The "Reading/writing cyberspace" section deals with using the Internet for research before moving into an excellent section about online journalism and the impact this has had on print journalism and the various influences and controlling interests in both delivery modes. Jayne Armstrong, co–programme leader of the BA(Hons) Journalism at Falmouth College of Arts has contributed a superb chapter on Internet forms and e–zines and then Whittaker’s section on writing for the Web covers style, the shape of an online publishing team and some of the legal issues surrounding writers and libel. This is sure–footed and informative. There follows a section about the practical aspects of creating Web pages and sites using HTML and Dreamweaver MX and, I’m afraid, some syntax errors have crept in. On page 217 "src" is described as a tag rather than as an attribute of a tag (this is repeated elsewhere) and on page 219 "src" is suggested as the attribute used in the <a> tag for creating hyperlinks rather than "href". The section about creating and linking to databases and using FTP to publish sites is excellent.

The final section is "Regulations, institutions and ethics" and is a tour de force. It brings together the current, post–9/11, state of cyberspace regulation with reference to U.S. and U.K. legislation as well as the work of the Campaign for Digital Rights and other pressure groups. E–commerce and online communities are dealt with as are the dotcom boom (and bust), spam and viral marketing. The book also includes a comprehensive glossary, a Web resources section and 13 pages of bibliography.

I really enjoyed this book and in many areas found it a great source of information that I would otherwise not have found. If the target audience is students of journalism, or those interested in learning a little about a great deal, this is the book for them. The errors and omissions don’t detract from a very useful book, but they might have been caught with some sympathetic proof–reading by a technologist. The writing style is crisp and works well in both the "how to" sections and the critical analysis sections, and it is that rare animal, an easy–to–read book about some complex technologies and ideas; it easily meets my "good book to read over a pint" test. I have already recommended it to non–technical friends, and I think it’s a great starting point for anyone who wants to cut through the hype and actually start living and working in cyberspace. — Nigel Gibson. End of Review



Copyright ©2004, First Monday