Cataloging the Web


Making the WWW More Accessible

1. Cataloging the Web
 
Eyler Coates
A practical system must take into account the nature of materials on the Web, the people who create them, the search engines that find them, and the needs of the people who research them. A system similar to that provided by library services will not work, because so many of the elements are different. For example, classification systems (such as Dewey, LC) are irrelevant because they are designed for grouping physical objects on shelves for browsing, access and retrieval.
 
Tricia Brauer
I'd have to take exception to the last two sentences, at least in part, since I browse the web much as I would a library shelf, and I'd certainly appreciate having like (subject-wise) items grouped together. In essence, isn't that what Yahoo does? This does not mean that I don't heartily agree that search and retrieval capabilities on the web need to be greatly improved and made more efficient, and many of the following suggestions have merit. But if an item were classified and that classification number were searchable, then I would be able to call up additional items of the same classification. I regularly use my library's OPAC to search by call number (being a cataloger, these come to mind), and I submit that I know any number of reference librarians who do the same thing, usually because the correct LCSH term doesn't come to mind and the cataloger forgot to make a subject cross-reference from the term that the reference librarian or Joe Q. Public is more likely to search on; e.g., Job interviews instead of Employment interviews.
 
Eyler Coates
There is no doubt that having a classification system similar to Dewey or LC would be a plus, and certainly not a minus; the real question may come down to, Is it a vital asset? Can it be done without? The general public is not much affected by a library's classification system, except that when they find a part of the library that has a book in which they are interested, they are likely to find other related books in the same location. No doubt, having classification numbers would be a help to librarians as they assist the public in retrieving materials of interest off the Web. But those numbers would be pretty much meaningless to the general public. Moreover, pulling up related materials only through a keyword or subject approach would, for the public, offer only the slightest disadvantage over having also a class number approach, if any. For them, subject headings are one way of grouping related items, and class numbers are another. It seems to come down to weighing one interest against another. Admittedly, the statement in question was made from the perspective of the whole system that is being proposed, and especially from the perspective of the Web user at home, surfing alone. In terms of a total system highly dependent on the Webmaster for the basic grunt work of cataloging his own site, assigning classification numbers would be out of the picture. But the possibility of having a professional cataloger provide classification numbers as an addition to the other elements, possibly at some later stage, might be more reasonable. But I suspect that in the non-library aspect of a Search Engine's world, this is one extra that would be quickly dropped for expediency's sake.
 
Steve Shadle
One of the specific points I would like feedback on is whether there are institutions out there that feel the need to assign classification solely for subject access (i.e., for resources that don't sit on a shelf). Do catalog users use classification as a subject retrieval mechanism?
 
David Miller
Yes, if we remember that catalog users include librarians. This is unfortunately, and too often, a source of inside jokes and rib-poking, as if there was something inappropriate about librarians making their own tools easier for themselves to use. There isn't.
Many's the time when, using our simple character-based INNOPAC system, I've been able to dig up a number of additional resources for students by redirecting a subject or keyword search into the classification number index. Actually, "indexes". We use both DDC and LCC for different parts of the collection, and the respective indexes contain any classification number found in the record for an item, not only those used for shelving. In addition, I let significantly different class numbers from the same system remain in a record, unless (and this is rare) they're utterly inappropriate. So, we have really divorced classification from its confinement to shelving.
Now, the INNOPAC command says something like "See Items Nearby On Shelf." This obviously isn't always true, given what I've outlined above -- but so far the misnomer hasn't upset anyone.
So -- assign classification numbers to intangible resources. Yes, as many as you like. The shelving issue is a red herring.
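
The idea of an index that holds every class number in a record, not only the shelving number, can be sketched as follows. This is a minimal illustration and not a description of how INNOPAC works: the record layout, the function names `build_class_index` and `related_titles`, and the sample class numbers are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical catalog records. Each record may carry several class numbers,
# not only the one used for shelving. The numbers are illustrative values only.
records = [
    {"title": "Employment interviews", "class_numbers": ["650.144", "658.31124"]},
    {"title": "Hiring and selection", "class_numbers": ["658.31124"]},
    {"title": "Job hunting", "class_numbers": ["650.14"]},
]


def build_class_index(records):
    """Map every class number found in any record to the titles that carry it."""
    index = defaultdict(list)
    for record in records:
        for number in record["class_numbers"]:
            index[number].append(record["title"])
    return index


def related_titles(index, records, title):
    """Return other titles sharing at least one class number with `title`."""
    numbers = next(r["class_numbers"] for r in records if r["title"] == title)
    found = []
    for number in numbers:
        for other in index[number]:
            if other != title and other not in found:
                found.append(other)
    return found


index = build_class_index(records)
print(related_titles(index, records, "Employment interviews"))
```

Because the second record shares one of the first record's class numbers, it turns up as a related item even though the two would not necessarily share a shelf location.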
 
Steve Shadle
My understanding is that the use of classification solely for grouping physical objects is a North American (or at least non-European) practice, that European libraries more frequently used classed catalogs, and that it is not an uncommon practice to assign multiple classifications to a work. Finding works on related subjects is important, and unless a subject authority system (whether a simple hierarchy like YAHOO or a more complex structure like LCSH) is in place, classification can be used to facilitate this type of access.
 
Hal Cain
Until the recent cessation, Australian National Bibliography in print form was a classified (DDC) list, with occasional secondary entries incorporated, with indexes. The substitute monthly record file is also available in classified order. It's a useful sequence for a general selection tool and for subject awareness short of SDI.
 
Eyler Coates
Unnecessary elements only add unnecessary complication. Web document research doesn't need complete bibliographic data. What's really needed is an efficient means of retrieval. If users want more complete information, they can click on the document itself, unlike in a library where they would need to go up an elevator to the fourth floor to look at the document. Therefore, Web Cataloging need only concern itself with retrieving a good list of mostly relevant documents that the user can then examine more closely.
 
Steve Shadle
I had this same thought, but I've had students who disagree with this point. When servers are down, when the Net is overloaded and one can't connect to a resource for whatever reason, the catalog can serve as a much quicker mechanism for identifying and citing resources. It seems that this generation of workstation users are impatient with even a 15-second wait...getting an instant summary and brief description from a catalog record may provide a better service to a large group of users.
 
Hal Cain
I have the same feeling myself about Internet delays -- we find quite problematic delays around lunchtime and after school, when (we assume) school and tertiary students are using it in numbers. Any reasonable way of streamlining one's selection is worthwhile.
 
Eyler Coates
Another factor is the level of expertise of the people that will necessarily be doing the cataloging. Already, Web resources are vast, constantly changing, and only promise to be more so in the future. It is necessary, therefore, that the cataloging be simple enough to be done by an ordinary Webmaster and not require the services of a trained professional.
 
Steve Shadle
In my humble opinion, the use of user-supplied data would help immensely in bringing organization to the vast majority of materials on the net. However, there are some basic concepts in bibliographic description and identification (e.g., name authority) that have the potential to be useful both in web browsers and online catalogs. Cutter's principles don't become irrelevant in a networked world.
 
Eyler Coates
The basic requirements of people who search the Web for materials are (1) Titles, (2) Names, (3) Subjects, and (4) Brief Abstracts accompanying the results of searches for the first three elements. Other elements, such as dates, publishers (Webmasters?), editions, etc., could be obtained by clicking on the document itself. It is interesting that computers give incredible forms of access to data, but to date, the best that Search Engines have done is to index every word that appears on a webpage. This, however, produces very unsatisfactory search results most of the time. If searches could be conducted via the first three elements above, with an abstract included with each finding, the results would be far more satisfactory for users.
What is needed, therefore, are four page attributes, which could be included in each document's head. Two of these already are usually found on every webpage.
 
Titles, Abstracts, Names
 
(1) TITLES. Already included in the header of every Webpage:

<title>--------------</title>

(2) DESCRIPTION (Abstract). This is a standard META tag, using the designation:

<meta name="description" content="---------">

(3) NAMES. This would require a new META tag, which could include up to five names, selected appropriately by the Webmaster, of authors, editors, corporations, subjects of biographies, etc., using the designation:

<meta name="names" content="------">
Distinguishing between authors, editors, etc., would be unnecessary, because the abstract should make clear the relationship of the various names to the page content. Also, the document is right there and can be clicked on if the user wants more precise information.
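
As a rough sketch of how a search engine might read these three head elements and search on them, the following uses Python's standard `html.parser` module. Several things here are assumptions rather than part of the proposal: the class name `HeadFieldParser`, the helper functions, the sample page, and above all the semicolon used to separate names, since the proposal does not specify a separator.

```python
from html.parser import HTMLParser


class HeadFieldParser(HTMLParser):
    """Collects the title, description, and proposed 'names' fields of a page."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "names": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "description":
                self.fields["description"] = content
            elif name == "names":
                # Assumption: names are separated by semicolons; keep at most five.
                parts = [n.strip() for n in content.split(";") if n.strip()]
                self.fields["names"] = parts[:5]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fields["title"] += data


def extract_fields(html):
    """Parse a page and return its title, description, and names."""
    parser = HeadFieldParser()
    parser.feed(html)
    parser.close()
    parser.fields["title"] = parser.fields["title"].strip()
    return parser.fields


def matches(fields, query):
    """Case-insensitive match against the title and names only, not the full text."""
    q = query.lower()
    return q in fields["title"].lower() or any(q in n.lower() for n in fields["names"])


# Hypothetical sample page using the three tags described above.
SAMPLE = """<html><head>
<title>Cataloging the Web</title>
<meta name="description" content="A discussion of simple cataloging for Web pages.">
<meta name="names" content="Coates, Eyler; Shadle, Steve">
</head><body></body></html>"""

fields = extract_fields(SAMPLE)
if matches(fields, "shadle"):
    # Show the title with its abstract, as a search result listing might.
    print(fields["title"], "-", fields["description"])
```

The search deliberately ignores the body text, which reflects the argument above that indexing every word on a page gives unsatisfactory results; the abstract is shown with each hit rather than searched.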
 
J. McRee Elrod
Perhaps this should be qualified as personal and corporate names, to clarify that both are included, and to exclude "names" like names of chemicals, plants, and such.
 
Eyler Coates
This is a very good point and should be included in any compilation of general instructions for Webmasters.
 

All rights reserved. Each contributor to these pages retains the copyright to their own statements, and all quotations therefrom must be attributed to the contributor, not to the editor or any other entity.
