Cataloging the Web

Making the WWW More Accessible

Recent Postings





Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 1997 20:40:53 +0100
From: Robert Cunnew robert@cunnew.demon.co.uk
Organization: N/A
Newsgroups: bit.listserv.autocat,schl.sig.lmnet

In article 3364D3F3.3194@worldnet.att.net, "Eyler Coates, Sr."
eyler.coates@worldnet.att.net writes
>
>> Given the undesirability of precoordination in subject indexing, I
>> wonder whether there is a need for (5) Forms, taken from a short list of
>> appropriate terms, eg information service, promotional material, images,
>> sounds, software (form not subject, ie downloadable), news, directory,
>> discussion forum ...
>
>I regrettably must confess that I am unfamiliar with your frame of
>reference here and am uncertain of your meaning.  I had suggested five
>subject headings as a maximum, though if the Keyword option were
>selected, it might seem that more than five would be required.  Are we
>talking about the same thing?

Sorry, I was suggesting that we need to categorise Web pages by the form
the information is in (eg "Images") as well as the subject of the
information (eg "Hale-Bopp").  Systems like LCSH mix the two functions
but if you're using postcoordinate indexing you really need to separate
them.
--
Robert Cunnew
Librarian, Chartered Insurance Institute, London
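
The separation of form from subject suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page identifiers, the sample terms, and the function names `build_index` and `search` are hypothetical, and the posting does not prescribe any particular implementation. Each page carries single (postcoordinate) terms in two separate facets, and the combination is left to a Boolean AND at search time.

```python
from collections import defaultdict

# Hypothetical sample records. Subject terms and form terms are kept in
# separate facets, and every term is a single concept rather than a
# precoordinated heading such as "Libraries - United States".
PAGES = {
    "page-a": {"subject": {"Hale-Bopp", "Comets"}, "form": {"Images"}},
    "page-b": {"subject": {"Hale-Bopp"}, "form": {"News"}},
    "page-c": {"subject": {"Libraries", "United States"}, "form": {"Directory"}},
}


def build_index(pages):
    """Map each (facet, term) pair to the set of pages that carry it."""
    index = defaultdict(set)
    for page_id, facets in pages.items():
        for facet, terms in facets.items():
            for term in terms:
                index[(facet, term.lower())].add(page_id)
    return index


def search(index, criteria):
    """Return the pages matching ALL (facet, term) pairs (Boolean AND)."""
    result = None
    for facet, term in criteria:
        hits = index.get((facet, term.lower()), set())
        result = hits if result is None else result & hits
    return sorted(result or [])


INDEX = build_index(PAGES)

# Images about Hale-Bopp, as opposed to news about it.
print(search(INDEX, [("subject", "Hale-Bopp"), ("form", "Images")]))
# Two simple subject terms combined at search time.
print(search(INDEX, [("subject", "Libraries"), ("subject", "United States")]))
```

Because "Images" is recorded as a form rather than a subject, a page of comet pictures and a page about the subject of images would not be confused with each other.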




Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 97 13:29:52 +0000
From: Sonja Scarseth scarseth@admin.aurora.edu
To: "Eyler Coates, Sr." eyler.coates@worldnet.att.net
CC: AUTOCAT@LISTSERV.ACSU.BUFFALO.EDU

On Tue, 29 Apr 1997, Eyler Coates, Sr. wrote:

> > >(4) SUBJECTS. This would require the existence of a standard list of
> > >subject headings, probably made available by the Search Engine (see
> > >below), from which Webmasters could select (with helps) up to five
> > >appropriate headings for their own page and put them in a tag:
> > >   <meta name="subjects" content="--------">
> > >Rather than use something as complicated as the Library of Congress
> > >Subject Headings, something less detailed such as the Sears Subject
> > >Headings, would probably be sufficient for the WWW.

I deleted most of this message, but Mr. Coates goes on to suggest that
Webmasters develop their own keyword lists and that humans add
appropriate cross references.  I would suggest that their efforts be
spent, rather, on working out ways of turning LCSH into keywords, and
on taking advantage of all the work that has already been done on
cross referencing.  Why reinvent the wheel?

x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
Sonja Scarseth     630/844-5443     The Word sets us free
scarseth@admin.aurora.edu
Aurora University Library, 347 S. Gladstone Ave., Aurora IL 60506
x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

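
As a rough illustration of the quoted SUBJECTS proposal, the following Python sketch builds the `<meta name="subjects" ...>` tag from headings chosen off a standard list, with the five-heading maximum enforced. The sample heading list, the function name `subjects_meta_tag`, and the use of "; " as a separator are all assumptions made for the example; the thread does not specify how multiple headings would be written inside the tag.

```python
import html

MAX_HEADINGS = 5  # the proposal allows "up to five" headings per page

# Hypothetical stand-in for the standard list a Search Engine might publish.
STANDARD_HEADINGS = {
    "Astronomy", "Cataloging", "Comets", "Insurance", "Internet", "Libraries",
}


def subjects_meta_tag(chosen, standard=STANDARD_HEADINGS, limit=MAX_HEADINGS):
    """Return a subjects meta tag, rejecting unlisted or excess headings."""
    if len(chosen) > limit:
        raise ValueError(f"at most {limit} headings are allowed")
    unknown = [heading for heading in chosen if heading not in standard]
    if unknown:
        raise ValueError(f"not on the standard list: {unknown}")
    # The "; " separator is an assumption, not part of the proposal.
    content = html.escape("; ".join(chosen), quote=True)
    return f'<meta name="subjects" content="{content}">'


print(subjects_meta_tag(["Comets", "Astronomy"]))
```

Rejecting unlisted headings is one possible policy; a scheme that lets Webmasters add new keywords would instead record the unknown terms for later review.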
Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 1997 10:48:51 -0700
From: "Eyler Coates, Sr." eyler.coates@worldnet.att.net
Organization: http://www.webspawner.com/users/EylerCoates/
To: triciab@tab.com
References: 2.2.32.19970429201624.006c8048@round-rock.tx.us

Thank you for your response; it contains some valuable points.  I have
not seen it posted on the Newsgroup, but then I have received others
that were not posted, and I have seen at least one response on the
Newsgroup to a posting without having seen the original posting!  So, I
thank you for sending it to me, as I would like very much to include
your comments in the redaction (see signature).

I am sorry if my original posting gave the misapprehension that I was
summarizing the article in Slate.  My posting was a response and
reaction to the Slate article.  Apparently, I should have made that
clearer, because I notice that others also thought my posting was a
summary of the Slate article.

triciab@tab.com wrote:
>
> I'd have to take exception with, at least in part, Slate's statement: " A
> system similar to that provided by library services will not work, because
> so many of the elements are different. For example, classification systems
> (such as Dewey, LC) are irrelevant because they are designed for grouping
> physical objects on
> shelves for browsing, access and retrieval." I browse the web much as
> I would a library shelf, and I'd certainly appreciate having like
> (subject-wise) items grouped together. In essence, isn't that what Yahoo does?
> This does not mean that I don't heartily agree with the fact that search and
> retrieving capabilities on the web need to be greatly improved and operate
> more efficiently and many of the following suggestions have merit.
> But, if an item were classified and that classification number were
> searchable then I would be able to call up additional items of the same
> classification. I regularly use my library's OPAC to search by call number
> (being a cataloger these come to mind), but I submit that I know any number
> of reference librarians that do the same thing (usually because the correct
> LCSH term doesn't come to mind and the cataloger forgot to make a subject
> cross reference that the reference librarian or Joe Q. Public is more likely
> to search on; e.g., Job interviews instead of Employment interviews).

There is no doubt that having a classification system similar to Dewey
or LC would be a plus, and certainly not a minus; the real question may
come down to, Is it a vital asset?  Can it be done without?  The general
public is not much affected by a library's classification system, except
that when they find a part of the library that has a book in which they
are interested, they are likely to find other related books in the same
location.

No doubt, having classification numbers would be a help to *librarians*
as they assist the public in retrieving materials of interest off the
Web.  But those numbers would be pretty much meaningless to the general
public.  Moreover, pulling up related materials only through a keyword
or subject approach would, for the public, offer only the slightest
disadvantage over having also a class number approach, if any.  For
them, subject headings are one way of grouping related items, and class
numbers are another.  It seems to come down to weighing one interest
against another.

Admittedly, the statement in question was made from the perspective of
the whole system that is being proposed, and especially from the
perspective of the Web user at home, surfing alone.  In terms of a total
system highly dependent on the Webmaster for the basic grunt work of
cataloging his own site, assigning classification numbers would be out
of the picture.  But the possibility of having a professional cataloger
provide classification numbers as an *addition* to the other elements,
possibly at some later stage, might be more reasonable.  But I suspect
that in the non-library aspect of a Search Engine's world, this is one
extra that would be quickly dropped for expediency's sake.

Eyler Coates
=============================================================
All of the postings to this thread are available in a redacted
form, without repetitions and irrelevant matter, at:
        Cataloging the Web
        Making the WWW More Accessible
        http://www.geocities.com/Athens/Forum/1683/cwindex.htm
==============================================================

Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 97 15:47:07 +0000
From: Steve Shadle shadle@u.washington.edu
To: "AUTOCAT: Library cataloging and authorities discussion group"
    AUTOCAT@LISTSERV.ACSU.BUFFALO.EDU,
    "Eyler Coates, Sr." eyler.coates@worldnet.att.net
CC: INTERCAT@oclc.org

On Tue, 29 Apr 1997, Eyler Coates, Sr. wrote:

> Unfortunately, the WWW is too big (and growing), too wild, and too
> mutable to be limited to the facilities appropriate for a former age. We
> are talking about an explosion, and we are just now at the beginning of
> it. We have materials being produced and made available so easily and so
> cheaply, no system that relies on the passage of such an enormous flow of
> materials through the hands of professionals working on them one at a
> time will meet the challenge of this new age. At best, such a system
> would always mean that only a small portion of the available materials
> would be processed, thus failing to provide technological advances in
> cataloging to correspond with the technological advances in data
> production.

But currently we provide access (through library and other
bibliographic information services) to only a portion of available
materials, and these are materials that have (at some level) been
evaluated as useful for our users or germane to our missions.
(Contrary to popular belief, the Library of Congress does *not* contain
every book ever published ;-)

I wholeheartedly agree that user-supplied metadata can bring order to
the Internet universe.  But if the Web is growing as exponentially as
presumed, then it's even *more* important that we lay the groundwork for
the ability to enable a person to:

* find a work by a given author (Where's *my* John Smith?  Who are
  Carr/Holt/Kellow/Plaidy/Tate?  Which one is Bill Clinton: William E.,
  William J. or William R.?)
* identify the intellectual work (vs. the manifestation) (Hamlet, the
  Apocrypha or Beethoven's Eroica by any other name)
* provide information about the bibliographic relations between works
  (editions, revisions)
* identify the genre/format of the work (web site vs. text document;
  review article vs. research report).

Our current catalogs attempt these functions with some degree of
success.  For those items which are significant to our users, shouldn't
we provide this same level of identification/control?  I'm not saying we
need to do it in the same way.  I think the Dublin Core comes closer to
providing the elements necessary for us to provide this level of
description than Eyler's suggested metadata structure (although his
elements can serve as a basis for basic description; the Dublin Core is
expansible after all).

I applaud Eyler's suggestions for metadata subject authority and think
the use of user-supplied data for the Web universe is better than we've
ever been able to do in print.  But I can't help but think that there's
more to it than just subjects and that we can still provide a selection
(not everything that's submitted to Yahoo gets in) and identification
role as we do with print resources.

--Steve

Apologies for quoting all of Eyler's message, but I wanted those on
InterCat to have the context of his comments.

Steve Shadle
Serials Cataloger
University of Washington Libraries
shadle@u.washington.edu

Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 1997 08:02:12 -0400
From: "Rebecca S. Guenther" rgue@loc.gov
Newsgroups: bit.listserv.autocat

>
> And can anyone out there tell me what the current status of the Dublin
> Core (and other metadata schemes) are in terms of development,
> establishment and actual acceptance in the community?
>

See the paper submitted to the USMARC Advisory Group at Midwinter 1997:

"Discussion Paper No. 99: Metadata, Dublin Core, and USMARC: a review
of current efforts"
gopher://marvel.loc.gov/00/.listarch/usmarc/dp99.doc

There was another workshop in Canberra, Australia in March 1997 that
worked further on refinement of the data elements and syntax for
embedding META tags in HTML.

^^ Rebecca S. Guenther
^^ Senior MARC Standards Specialist
^^ Network Development and MARC Standards Office
^^ Library of Congress
^^ Washington, DC 20540-4020
^^ (202) 707-5092 (voice)   (202) 707-0115 (FAX)
^^ rgue@loc.gov

Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 97 01:43:44 +0000
From: "Ruth Lewis" Ruth.Lewis@natlib.govt.nz
To: eyler.coates@worldnet.att.net
CC: AUTOCAT@listserv.acsu.buffalo.edu

On 29 Apr 97 at 20:12, Eyler Coates, Sr. wrote:

> Robert Cunnew wrote:
> > In article 336372F2.7388@worldnet.att.net, "Eyler Coates, Sr."
> > eyler.coates@worldnet.att.net writes
> >
> > >(4) SUBJECTS. This would require the existence of a standard list of
> > >subject headings, probably made available by the Search Engine (see
> > >below), from which Webmasters could select (with helps) up to five
> > >appropriate headings for their own page and put them in a tag:
> > >   <meta name="subjects" content="--------">
> > >Rather than use something as complicated as the Library of Congress
> > >Subject Headings, something less detailed such as the Sears Subject
> > >Headings, would probably be sufficient for the WWW.
> >
> > I'm not familiar with Sears, but isn't it - like LC - precoordinate?
> > Please don't let's suggest that the Web is cluttered up with nineteenth
> > century notions of subject access designed for catalogue cards. Simple
> > postcoordinate terms are what is required, eg term 1, Libraries, term 2,
> > United States, *not* "Libraries - United States" or whatever Sears has
> > to offer. Search engines may not be perfect but they *can* do Boolean,
> > even if it's often implicit rather than explicit.
>
> These are excellent points. I received an email response that suggested
> the possibility of keywords as an alternative to either Sears or LCSH.
> The matter of subject headings or keywords seems to be a crucial part of
> the system. If this general scheme were adopted, it seems that the
> "keyword" option might be the most desirable. It may be that a Search
> Engine could provide a standard list of Keywords created from those
> actually used by Webmasters, and that this list could serve as a
> reference list to maintain a level of uniformity. Webmasters could then
> create new Keywords if there were none on the list adequate for their
> needs. Thus, the Keyword List would be constantly brought up to date.
> Such a list would also be useful for user/researchers while browsing. In
> addition, a human being (cataloger) could monitor the list, creating
> appropriate "See Also" references as part of the Search Engine's
> offerings and perhaps even making redundant Keywords, created by
> Webmasters (who are necessarily amateur catalogers), all refer to the
> same items.

This is a controlled vocabulary, rather than keywords, is it not?  I
always understood keywords to be taken from the text/web page
unaltered.  Whether pre- or post-coordinated, I think some kind of
controlled vocabulary is the only way to get decent subject searching.

Ruth Lewis
Database Quality Librarian
Online Services, National Library of New Zealand
ruth.lewis@natlib.govt.nz
Telephone (64 4) 474 3037   Toll free 0800 736 561
Fax (64 4) 474 3042
These opinions are my own and are not necessarily National Library of
New Zealand policy.

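
The distinction drawn above between free keywords and a controlled vocabulary can be made concrete with a small Python sketch. It assumes a monitored keyword list in which a cataloger records "See" references (variant keyword to preferred term) and "See Also" references (preferred term to related terms). The table contents are hypothetical samples, apart from the "Job interviews" / "Employment interviews" pair, which is borrowed from elsewhere in this thread; the function names `preferred` and `expand` are likewise invented for the example.

```python
# Hypothetical "See" references: variant keyword -> preferred keyword.
SEE = {
    "job interviews": "employment interviews",
}

# Hypothetical "See Also" references: preferred keyword -> related keywords.
SEE_ALSO = {
    "employment interviews": ["applications for positions"],
}


def preferred(keyword):
    """Normalise case and spacing, then follow a "See" reference if any."""
    key = " ".join(keyword.lower().split())
    return SEE.get(key, key)


def expand(keyword):
    """Return the preferred term followed by its "See Also" terms."""
    term = preferred(keyword)
    return [term] + SEE_ALSO.get(term, [])


print(preferred("Job  Interviews"))
print(expand("job interviews"))
```

Once redundant keywords are redirected in this way, the list behaves as a controlled vocabulary: a search on either variant retrieves the same items, whichever term the Webmaster happened to supply.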
Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 97 20:16:24 +0000
From: triciab@tab.com
To: eyler.coates@worldnet.att.net

I'd have to take exception with, at least in part, Slate's statement:
" A system similar to that provided by library services will not work,
because so many of the elements are different. For example,
classification systems (such as Dewey, LC) are irrelevant because they
are designed for grouping physical objects on shelves for browsing,
access and retrieval."  I browse the web much as I would a library
shelf, and I'd certainly appreciate having like (subject-wise) items
grouped together.  In essence, isn't that what Yahoo does?

This does not mean that I don't heartily agree with the fact that search
and retrieving capabilities on the web need to be greatly improved and
operate more efficiently and many of the following suggestions have
merit.  But, if an item were classified and that classification number
were searchable then I would be able to call up additional items of the
same classification.  I regularly use my library's OPAC to search by
call number (being a cataloger these come to mind), but I submit that I
know any number of reference librarians that do the same thing (usually
because the correct LCSH term doesn't come to mind and the cataloger
forgot to make a subject cross reference that the reference librarian or
Joe Q. Public is more likely to search on; e.g., Job interviews instead
of Employment interviews).

Tricia Brauer
Round Rock Public Library
3216 E. Main St.
Round Rock, Tx 78664
(512) 218-7007
(512) 218-7061 FAX
triciab@round-rock.tx.us

Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 1997 07:51:24 +1100
From: Hal Cain hecain@ormond.unimelb.edu.au
Reply-To: Hal.Cain@ormond.unimelb.edu.au
Organization: Joint Theological Library, Parkville, Victoria, Australia.
Newsgroups: bit.listserv.autocat
References: 199704281550.BAA22733@gateway.ormond.unimelb.edu.au

Steve Shadle wrote:
>
> I agree with much of what is presented in the article summary, but I do
> have a couple comments that I would like to hear other people's thoughts
> on.
>
> > different. For example, classification systems (such as Dewey, LC) are
> > irrelevant because they are designed for grouping physical objects on
> > shelves for browsing, access and retrieval. Unnecessary elements only
>
> My understanding is that the use of classification *solely* for grouping
> physical objects is a North American (or at least non-European)
> practice and that European libraries more frequently used classed catalogs
> and that it is not an uncommon practice to assign multiple classifications
> to a work.

Until the recent cessation, _Australian National Bibliography_ in print
form was a classified (DDC) list, with occasional secondary entries
incorporated, with indexes.  The substitute monthly record file is also
available in classified order.  It's a useful sequence for a general
selection tool and for subject awareness short of SDI.

> > of retrieval. If users want more complete information, they can click on
> > the document itself, unlike in a library where they would need to go up
> > an elevator to the fourth floor to look at the document. Therefore, Web
> > Cataloging need only concern itself with retrieving a good list of mostly
> > relevant documents that the user can then examine more closely.
>
> I had this same thought, but I've had students who disagree with this
> point.

I have the same feeling myself about Internet delays -- we find quite
problematic delays around lunchtime and after school, when (we assume)
school and tertiary students are using it in numbers.  Any reasonable
way of streamlining one's selection is worthwhile.

> And can anyone out there tell me what the current status of the Dublin
> Core (and other metadata schemes) are in terms of development,
> establishment and actual acceptance in the community?
>

The latest I can put my hand (or mouse pointer) on is at:

http://www.oclc.org:5046/research/dublin_core/

but there must be other material too.

Hal Cain, Joint Theological Library, Parkville, Victoria, Australia
hecain@ormond.unimelb.edu.au

Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 97 13:33:50 +0000
From: robertson@aztec.lib.utk.edu (Michelle Martin Robertson)
To: "Eyler Coates, Sr." eyler.coates@worldnet.att.net

Greetings!

I find this discussion very interesting, and I have a couple of comments
to make.  I'll assume that someone has expanded upon the "forms"
question by now; my newsreader is always behind.

What concerns me most about the keyword approach is that, the way the
web is growing, simply having "subjects" and even "forms" as types of
keywords to identify web sites will become grossly insufficient.  If web
sites ever begin a trend towards higher degrees of specificity, I think
we'll end up with a similar problem to the current one.

Take this example: We find a web site that addresses "The Effect of Man
on the Environment."  The webmaster has assigned the subjects "man" and
"environment."  But how can we differentiate between this website and
the ones on "The Effects of the Environment on Man?"  It seems to me
that it would be best to have a variety of possible types of "subject"
entries.  An "effect of" subject-tag would be very useful in this
situation.  Of course, most sites wouldn't use it, but to some it would
be essential for proper identification.  The result for the first site
would be something like <subject = environment> and <effect of = man>.

A "form" example (off the top of my head): You come across a site
dedicated to Marie Antoinette's fictional appearances in literature.
How do you differentiate between this site and one that discusses her as
a historical character?  According to Library of Congress practice, her
name would be followed by "in literature."  But if you simply provide a
subject for her name and for "literature" on the web, that sounds like
it might be a site that presents literature written by her, which is
very misleading.

With the "form" heading/tag the last example would be solved.  Sites
that *contain* literature could have the tag <form = literature>.  Sites
that are *about* literature would have tag <subject = literature>.
Sites that focus on both would have both.  If there is literature by
Marie Antoinette at the site, there could be a tag <author = Marie
Antoinette>.  This would need to be distinguished from the tag for the
author of the web page, though... <grin>

I think that keywords are definitely the way to go on the web, for
simplicity's sake.  But I am sure that users would benefit from greater
specificity within the keyword framework.  If the subject tags are
well-documented, webmasters should have plenty of incentives to use
them.

- Michelle

On Mon, 28 Apr 1997 09:44:35 -0700, you wrote:
[snipped in various places]

>Robert Cunnew wrote:
>>
>> In article 336372F2.7388@worldnet.att.net, "Eyler Coates, Sr."
>> eyler.coates@worldnet.att.net writes
>>
>> >(4) SUBJECTS. This would require the existence of a standard list of
>> >subject headings, probably made available by the Search Engine (see
>> >below), from which Webmasters could select (with helps) up to five
>> >appropriate headings for their own page and put them in a tag:
>> >   <meta name="subjects" content="--------">
>>
>> Given the undesirability of precoordination in subject indexing, I
>> wonder whether there is a need for (5) Forms, taken from a short list of
>> appropriate terms, eg information service, promotional material, images,
>> sounds, software (form not subject, ie downloadable), news, directory,
>> discussion forum ...
>
>I regrettably must confess that I am unfamiliar with your frame of
>reference here and am uncertain of your meaning. I had suggested five
>subject headings as a maximum, though if the Keyword option were
>selected, it might seem that more than five would be required. Are we
>talking about the same thing?
>
>Eyler Coates
>=============================================================
>All of the postings to this thread are available in a redacted
>form, without repetitions and irrelevant matter, at:
>        Cataloging the Web
>        Making the WWW More Accessible
>        http://www.geocities.com/Athens/Forum/1683/cwindex.htm
>==============================================================

---------------------------------------------------------
Michelle Martin Robertson
robertson@aztec.lib.utk.edu
University of Tennessee, Knoxville Libraries
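
The "effect of" example in the posting above can be tried out with a short Python sketch. It is an illustration only: the posting writes its tags as <subject = environment> and <effect of = man>, whereas the sketch assumes they would be expressed as HTML meta tags, modelled on the `<meta name="subjects" ...>` tag quoted in the thread. The tag name `effect-of`, the class `MetaCollector`, and the function `describe` are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, content) pairs from the meta tags of a page."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.tags.append((name.lower(), content.lower()))


def describe(page_source):
    """Return the sorted list of typed entries found in the page source."""
    collector = MetaCollector()
    collector.feed(page_source)
    return sorted(collector.tags)


# "The Effect of Man on the Environment"
SITE_1 = ('<meta name="subject" content="environment">'
          '<meta name="effect-of" content="man">')
# "The Effects of the Environment on Man"
SITE_2 = ('<meta name="subject" content="man">'
          '<meta name="effect-of" content="environment">')

# As flat keywords the two sites look identical...
print({term for _, term in describe(SITE_1)} == {term for _, term in describe(SITE_2)})
# ...but the typed entries tell them apart.
print(describe(SITE_1) != describe(SITE_2))
```

The same mechanism would cover the form/subject/author distinction in the Marie Antoinette example, since each entry carries its role along with its term.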
