Cataloging the Web

Making the WWW More Accessible
Recent Postings



Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 1997 20:40:53 +0100
From: Robert Cunnew robert@cunnew.demon.co.uk
Organization: N/A
Newsgroups: bit.listserv.autocat,schl.sig.lmnet

In article 3364D3F3.3194@worldnet.att.net, "Eyler Coates, Sr."
eyler.coates@worldnet.att.net writes
>
>> Given the undesirability of precoordination in subject indexing, I
>> wonder whether there is a need for (5) Forms, taken from a short list of
>> appropriate terms, eg information service, promotional material, images,
>> sounds, software (form not subject, ie downloadable), news, directory,
>> discussion forum ...
>
>I regrettably must confess that I am unfamiliar with your frame of
>reference here and am uncertain of your meaning.  I had suggested five
>subject headings as a maximum, though if the Keyword option were
>selected, it might seem that more than five would be required.  Are we
>talking about the same thing?

Sorry, I was suggesting that we need to categorise Web pages by the form
the information is in (eg "Images") as well as the subject of the
information (eg "Hale-Bopp").  Systems like LCSH mix the two functions
but if you're using postcoordinate indexing you really need to separate
them.
--
Robert Cunnew
Librarian, Chartered Insurance Institute, London
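
To illustrate the separation described above, here is a minimal Python sketch of postcoordinate retrieval in which form terms and subject terms are kept in separate fields and combined only at search time. The page ids, the record layout, and the `search` function are hypothetical, invented for this illustration; only the example terms "Images" and "Hale-Bopp" come from the message.

```python
# Hypothetical records: each page carries its subject terms and its form
# terms in separate fields, so neither is baked into the other.
PAGES = {
    "page-a": {"subjects": {"Hale-Bopp"}, "forms": {"Images"}},
    "page-b": {"subjects": {"Hale-Bopp"}, "forms": {"News"}},
    "page-c": {"subjects": {"Insurance"}, "forms": {"Images"}},
}


def search(pages, subject=None, form=None):
    """Return the sorted ids of pages matching every term that was supplied."""
    hits = []
    for page_id, record in pages.items():
        if subject is not None and subject not in record["subjects"]:
            continue
        if form is not None and form not in record["forms"]:
            continue
        hits.append(page_id)
    return sorted(hits)


# Terms are combined at query time (an implicit Boolean AND).
print(search(PAGES, subject="Hale-Bopp", form="Images"))  # ['page-a']
print(search(PAGES, form="Images"))  # ['page-a', 'page-c']
```

Because the two facets are independent, a user can ask for images on any subject, or for anything on Hale-Bopp, without the indexer having anticipated that combination.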






Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 97 13:29:52 +0000
From: Sonja Scarseth scarseth@admin.aurora.edu
To: "Eyler Coates, Sr." eyler.coates@worldnet.att.net
CC: AUTOCAT@LISTSERV.ACSU.BUFFALO.EDU

On Tue, 29 Apr 1997, Eyler Coates, Sr. wrote:
> >
> > >(4) SUBJECTS.  This would require the existence of a standard list of
> > >subject headings, probably made available by the Search Engine (see
> > >below), from which Webmasters could select (with helps) up to five
> > >appropriate headings for their own page and put them in a tag:
> > >       <meta  name="subjects" content="--------">
> > >Rather than use something as complicated as the Library of Congress
> > >Subject Headings, something less detailed such as the Sears Subject
> > >Headings, would probably be sufficient for the WWW.

I deleted most of this message, but Mr. Coates goes on to suggest that
Webmasters develop their own keyword lists and that humans add appropriate
cross references.  I would suggest that their efforts be spent, rather, on
working out ways of turning LCSH into keywords, and taking advantage of all
the work that has already been done on cross referencing.  Why reinvent the
wheel?

x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
Sonja Scarseth                                    630/844-5443
The Word sets us free                   scarseth@admin.aurora.edu
Aurora University Library, 347 S. Gladstone Ave., Aurora IL 60506
x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
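
One possible reading of "turning LCSH into keywords" is sketched below in Python: a subdivided heading string is split into its component terms so that each can be searched on its own. This is an assumption about what such a conversion might look like, not a method proposed in the thread; the function name `heading_to_keywords` is hypothetical, and the sketch does nothing about carrying over the existing cross references.

```python
import re


def heading_to_keywords(heading):
    """Split one subdivided heading into its component terms."""
    # Accept the "--" subdivision separator and also a spaced hyphen
    # (" - "). A plain hyphen inside a word, as in "Hale-Bopp", is kept.
    parts = re.split(r"\s*--\s*|\s+-\s+", heading)
    return [part.strip() for part in parts if part.strip()]


print(heading_to_keywords("Libraries--United States"))
# ['Libraries', 'United States']
print(heading_to_keywords("Hale-Bopp"))
# ['Hale-Bopp']
```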






Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 1997 10:48:51 -0700
From: "Eyler Coates, Sr." eyler.coates@worldnet.att.net
Organization: http://www.webspawner.com/users/EylerCoates/
To: triciab@tab.com
References: 2.2.32.19970429201624.006c8048@round-rock.tx.us

Thank you for your response; it contains some valuable points.  I have
not seen it posted on the Newsgroup, but then I have received others that
were not posted, and I have seen at least one response on the Newsgroup to
a posting without having seen the original posting!  So, I thank you for
sending it to me, as I would like very much to include your comments in
the redaction (see signature).

I am sorry if my original posting gave the impression that I was
summarizing the article in Slate.  My posting was a response and reaction
to the Slate article.  Apparently, I should have made that clearer,
because I notice that others also thought my posting was a summary of the
Slate article.

triciab@tab.com wrote:
>
> I'd have to take exception with, at least in part, Slate's statement: "A
> system similar to that provided by library services will not work, because
> so many of the elements are different.  For example, classification systems
> (such as Dewey, LC) are irrelevant because they are designed for grouping
> physical objects on shelves for browsing, access and retrieval."  I browse
> the web much as I would a library shelf, and I'd certainly appreciate having
> like (subject-wise) items grouped together.  In essence, isn't that what
> Yahoo does?  This does not mean that I don't heartily agree that search and
> retrieval capabilities on the web need to be greatly improved and to operate
> more efficiently, and many of the following suggestions have merit.
> But if an item were classified and that classification number were
> searchable, then I would be able to call up additional items of the same
> classification.  I regularly use my library's OPAC to search by call number
> (being a cataloger, these come to mind), and I know any number of reference
> librarians who do the same thing (usually because the correct LCSH term
> doesn't come to mind and the cataloger forgot to make a subject cross
> reference from the term that the reference librarian or Joe Q. Public is
> more likely to search on; e.g., Job interviews instead of Employment
> interviews).

There is no doubt that having a classification system similar to Dewey or
LC would be a plus, and certainly not a minus; the real question may
come down to, Is it a vital asset?  Can it be done without?  The general
public is not much affected by a library's  classification system, except
that when they find a part of the library that has a book in which they
are interested, they are likely to find other related books in the same
location.  No doubt, having classification numbers would be a help to
*librarians* as they assist the public in retrieving materials of
interest off the Web.  But those numbers would be pretty much meaningless
to the general public.  Moreover, for the public, pulling up related
materials through a keyword or subject approach alone would be at most
slightly less useful than also having a class number approach.
For them, subject headings are one way of grouping related items,
and class numbers are another.  It seems to come down to weighing one
interest against another.  Admittedly, the statement in question was made
from the perspective of the whole system that is being proposed, and
especially from the perspective of the Web user at home, surfing alone.
In terms of a total system highly dependent on the Webmaster for the
basic grunt work of cataloging his own site, assigning classification
numbers would be out of the picture.  But the possibility of having a
professional cataloger provide classification numbers as an *addition* to
the other elements, possibly at some later stage, might be more
reasonable.  But I suspect that in the non-library aspect of a Search
Engine's world, this is one extra that would be quickly dropped for
expediency's sake.

Eyler Coates

=============================================================
All of the postings to this thread are available in a redacted
form, without repetitions and irrelevant matter, at:

                     Cataloging the Web
                Making the WWW More Accessible

   http://www.geocities.com/Athens/Forum/1683/cwindex.htm

==============================================================






Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 97 15:47:07 +0000
From: Steve Shadle shadle@u.washington.edu
To: "AUTOCAT: Library cataloging and authorities discussion group" AUTOCAT@LISTSERV.ACSU.BUFFALO.EDU,
     "Eyler Coates, Sr." eyler.coates@worldnet.att.net
CC: INTERCAT@oclc.org

On Tue, 29 Apr 1997, Eyler Coates, Sr. wrote:

> Unfortunately, the WWW is too big (and growing), too wild, and too
> mutable to be limited to the facilities appropriate for a former age.  We
> are talking about an explosion, and we are just now at the beginning of
> it.  We have materials being produced and made available so easily and so
> cheaply, no system that relies on the passage of such an enormous flow of
> materials through the hands of professionals working on them one at a
> time will meet the challenge of this new age.  At best, such a system
> would always mean that only a small portion of the available materials
> would be processed, thus failing to provide technological advances in
> cataloging to correspond with the technological advances in data
> production.

But currently we only provide access (through library and other
bibliographic information services) to a portion of available materials,
but these are materials that have (at some level) been evaluated as useful
for our users or germane to our missions.  (Contrary to popular belief,
the Library of Congress does *not* contain every book ever published  ;-)

I wholeheartedly agree that user-supplied metadata can bring order to the
Internet universe.  But if the Web is growing as exponentially as
presumed, then it's even *more* important that we lay the groundwork for
the ability to enable a person to:

* find a work by a given author (Where's *my* John Smith?  Who are
Carr/Holt/Kellow/Plaidy/Tate?  Which one is Bill Clinton: William E.,
William J. or William R.?)
* identify the intellectual work (vs. the manifestation) (Hamlet, the
Apocrypha or Beethoven's Eroica by any other name)
* provide information about the bibliographic relations between works
(editions, revisions)
* identify the genre/format of the work (web site vs. text document;
review article vs. research report).

Our current catalogs attempt these functions with some degree of success.
For those items which are significant to our users, shouldn't we provide
this same level of identification/control?  I'm not saying we need to do
it in the same way.  I think the Dublin Core comes closer to providing the
elements necessary for us to provide this level of description than
Eyler's suggested metadata structure (although his elements can serve as a
basis for basic description; the Dublin Core is expansible after all).

I applaud Eyler's suggestions for metadata subject authority and think the
use of user-supplied data for the Web universe is better than we've ever
been able to do in print.  But I can't help but think that there's more to
it than just subjects and that we can still provide a selection (not
everything that's submitted to Yahoo gets in) and identification role as
we do with print resources. --Steve

Apologies for quoting all of Eyler's message, but I wanted those on
InterCat to have the context of his comments.

Steve Shadle
Serials Cataloger
University of Washington Libraries
shadle@u.washington.edu






Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 1997 08:02:12 -0400
From: "Rebecca S. Guenther" rgue@loc.gov
Newsgroups: bit.listserv.autocat

>
> And can anyone out there tell me what the current status of the Dublin
> Core (and other metadata schemes) is in terms of development,
> establishment and actual acceptance in the community?
>
See the paper submitted to the USMARC Advisory Group at Midwinter 1997:
"Discussion Paper No. 99: Metadata, Dublin Core, and USMARC: a review of
current efforts"
gopher://marvel.loc.gov/00/.listarch/usmarc/dp99.doc

There was another workshop in Canberra, Australia in March 1997 that
worked further on refinement of the data elements and syntax for embedding
META tags in HTML.

^^  Rebecca S. Guenther
^^  Senior MARC Standards Specialist
^^  Network Development and MARC Standards Office
^^  Library of Congress
^^  Washington, DC 20540-4020
^^  (202) 707-5092 (voice)    (202) 707-0115 (FAX)
^^  rgue@loc.gov
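
As a rough illustration of how metadata embedded in HTML META tags might be read by an indexing program, the Python sketch below extracts terms from the `<meta name="subjects" content="...">` tag proposed earlier in this thread. It uses the standard library's `html.parser`. The semicolon separator inside `content`, the class name, and the sample page are all assumptions made for the example; the thread does not specify a delimiter, and this is not the Dublin Core syntax.

```python
from html.parser import HTMLParser


class SubjectMetaParser(HTMLParser):
    """Collect terms from <meta name="subjects" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.subjects = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "subjects":
            return
        content = attributes.get("content") or ""
        # Assumption: terms inside content are separated by semicolons.
        self.subjects.extend(
            term.strip() for term in content.split(";") if term.strip()
        )


def extract_subjects(html_text):
    """Return the subject terms declared in the page's META tags."""
    parser = SubjectMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.subjects


# Hypothetical page used only to exercise the parser.
SAMPLE_PAGE = """<html><head><title>Comet photos</title>
<meta name="subjects" content="Comets; Astronomy">
</head><body>...</body></html>"""

print(extract_subjects(SAMPLE_PAGE))  # ['Comets', 'Astronomy']
```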






Subject: Re: Cataloging the Web
Date: Wed, 30 Apr 97 01:43:44 +0000
From: "Ruth Lewis" Ruth.Lewis@natlib.govt.nz
To: eyler.coates@worldnet.att.net
CC: AUTOCAT@listserv.acsu.buffalo.edu

On 29 Apr 97 at 20:12, Eyler Coates, Sr. wrote:

> Robert Cunnew wrote:
> > In article 336372F2.7388@worldnet.att.net, "Eyler Coates, Sr."
> > eyler.coates@worldnet.att.net writes
> >
> > >(4) SUBJECTS.  This would require the existence of a standard list of
> > >subject headings, probably made available by the Search Engine (see
> > >below), from which Webmasters could select (with helps) up to five
> > >appropriate headings for their own page and put them in a tag:
> > >       <meta  name="subjects" content="--------">
> > >Rather than use something as complicated as the Library of Congress
> > >Subject Headings, something less detailed such as the Sears Subject
> > >Headings, would probably be sufficient for the WWW.
> >
> > I'm not familiar with Sears, but isn't it - like LC - precoordinate?
> > Please don't let's suggest that the Web is cluttered up with nineteenth
> > century notions of subject access designed for catalogue cards.  Simple
> > postcoordinate terms are what is required, eg term 1, Libraries, term 2,
> > United States, *not* "Libraries - United States" or whatever Sears has
> > to offer.  Search engines may not be perfect but they *can* do Boolean,
> > even if it's often implicit rather than explicit.
>
> These are excellent points.  I received an email response that suggested
> the possibility of keywords as an alternative to either Sears or LCSH.
> The matter of subject headings or keywords seems to be a crucial part of
> the system.  If this general scheme were adopted, it seems that the
> "keyword" option might be the most desirable.  It may be that a Search
> Engine could provide a standard list of Keywords created from those
> actually used by Webmasters, and that this list could serve as a
> reference list to maintain a level of uniformity.  Webmasters could then
> create new Keywords if there were none on the list adequate for their
> needs.  Thus, the Keyword List would be constantly brought up to date.
> Such a list would also be useful for user/researchers while browsing.  In
> addition, a human being (cataloger) could monitor the list, creating
> appropriate "See Also" references as part of the Search Engine's
> offerings and perhaps even making redundant Keywords, created by
> Webmasters (who are necessarily amateur catalogers), all refer to the
> same items.

This is a controlled vocabulary, rather than keywords, is it not?   I
always understood keywords to be taken from the text/web page
unaltered.  Whether pre- or post-coordinated, I think some kind of
controlled vocabulary is the only way to get decent subject
searching.

Ruth Lewis
Database Quality Librarian
Online Services, National Library of New Zealand
ruth.lewis@natlib.govt.nz
Telephone (64 4) 474 3037
Toll free 0800 736 561
Fax (64 4) 474 3042

These opinions are my own and are not necessarily National Library of New Zealand policy.
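
The difference between free keywords and a controlled vocabulary can be sketched in a few lines of Python: every term a Webmaster supplies is looked up in an authority list and mapped to a single preferred heading, and anything not on the list is flagged for a cataloger instead of being accepted as-is. The table contents and function name are hypothetical. The one cross reference shown (Job interviews to Employment interviews) is the example mentioned elsewhere in this thread.

```python
# Hypothetical authority data. Keys are lower-cased for matching.
PREFERRED_HEADINGS = {
    "employment interviews": "Employment interviews",
}
# Variant term -> preferred heading (a "see" reference).
SEE_REFERENCES = {
    "job interviews": "Employment interviews",
}


def normalize_term(term):
    """Map a supplied term to its preferred heading, or None if unknown."""
    key = " ".join(term.split()).lower()  # collapse spacing, ignore case
    if key in PREFERRED_HEADINGS:
        return PREFERRED_HEADINGS[key]
    # Unknown terms return None so a cataloger can review them.
    return SEE_REFERENCES.get(key)


print(normalize_term("Job  Interviews"))  # Employment interviews
print(normalize_term("Hale-Bopp"))  # None
```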





Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 97 20:16:24 +0000
From: triciab@tab.com
To: eyler.coates@worldnet.att.net

I'd have to take exception with, at least in part, Slate's statement: "A
system similar to that provided by library services will not work, because
so many of the elements are different.  For example, classification systems
(such as Dewey, LC) are irrelevant because they are designed for grouping
physical objects on shelves for browsing, access and retrieval."  I browse
the web much as I would a library shelf, and I'd certainly appreciate having
like (subject-wise) items grouped together.  In essence, isn't that what
Yahoo does?  This does not mean that I don't heartily agree that search and
retrieval capabilities on the web need to be greatly improved and to operate
more efficiently, and many of the following suggestions have merit.
But if an item were classified and that classification number were
searchable, then I would be able to call up additional items of the same
classification.  I regularly use my library's OPAC to search by call number
(being a cataloger, these come to mind), and I know any number of reference
librarians who do the same thing (usually because the correct LCSH term
doesn't come to mind and the cataloger forgot to make a subject cross
reference from the term that the reference librarian or Joe Q. Public is
more likely to search on; e.g., Job interviews instead of Employment
interviews).

Tricia Brauer
Round Rock Public Library
3216 E. Main St.
Round Rock, Tx 78664

(512) 218-7007
(512) 218-7061 FAX
triciab@round-rock.tx.us
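
The call-number search described above amounts to a prefix match on a classification field. A minimal Python sketch follows, assuming each catalogued item carries a class number stored as a string. The catalog entries and class numbers are placeholders invented for the example, not real catalog data.

```python
# Hypothetical catalog: (title, class number). The numbers are placeholders.
CATALOG = [
    ("Item A", "523.6"),
    ("Item B", "523.64"),
    ("Item C", "025.3"),
]


def find_by_class(catalog, class_prefix):
    """Return titles whose class number starts with the given prefix."""
    return [title for title, number in catalog if number.startswith(class_prefix)]


# Having found Item A at 523.6, call up everything classed alongside it.
print(find_by_class(CATALOG, "523.6"))  # ['Item A', 'Item B']
```

A shorter prefix widens the browse and a longer one narrows it, much as walking along a shelf does.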






Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 1997 07:51:24 +1100
From: Hal Cain hecain@ormond.unimelb.edu.au
Reply-To: Hal.Cain@ormond.unimelb.edu.au
Organization: Joint Theological Library, Parkville, Victoria, Australia.
Newsgroups: bit.listserv.autocat
References: 199704281550.BAA22733@gateway.ormond.unimelb.edu.au

Steve Shadle wrote:
>
> I agree with much of what is presented in the article summary, but I do
> have a couple comments that I would like to hear other people's thoughts
> on.
>
> > different.  For example, classification systems (such as Dewey, LC) are
> > irrelevant because they are designed for grouping physical objects on
> > shelves for browsing, access and retrieval.  Unnecessary elements only
>
> My understanding is that the use of classification *solely* for grouping
> physical objects is a North American (or at least non-European)
> practice and that European libraries more frequently used classed catalogs
> and that it is not an uncommon practice to assign multiple classifications
> to a work.

Until the recent cessation, _Australian National Bibliography_ in print
form was a classified (DDC) list, with occasional secondary entries
incorporated, with indexes.  The substitute monthly record file is also
available in classified order.  It's a useful sequence for a general
selection tool and for subject awareness short of SDI.

> > of retrieval.  If users want more complete information, they can click on
> > the document itself, unlike in a library where they would need to go up
> > an elevator to the fourth floor to look at the document.  Therefore, Web
> > Cataloging need only concern itself with retrieving a good list of mostly
> > relevant documents that the user can then examine more closely.
>
> I had this same thought, but I've had students who disagree with this
> point.

I have the same feeling myself about Internet delays -- we find quite
problematic delays around lunchtime and after school, when (we assume)
school and tertiary students are using it in numbers.  Any reasonable
way of streamlining one's selection is worthwhile.

> And can anyone out there tell me what the current status of the Dublin
> Core (and other metadata schemes) is in terms of development,
> establishment and actual acceptance in the community?
>
The latest I can put my hand (or mouse pointer) on is at:
http://www.oclc.org:5046/research/dublin_core/
but there must be other material too.

Hal Cain, Joint Theological Library, Parkville, Victoria, Australia
hecain@ormond.unimelb.edu.au






Subject: Re: Cataloging the Web
Date: Tue, 29 Apr 97 13:33:50 +0000
From: robertson@aztec.lib.utk.edu (Michelle Martin Robertson)
To: "Eyler Coates, Sr." eyler.coates@worldnet.att.net

Greetings!

I find this discussion very interesting, and I have a couple of
comments to make.  I'll assume that someone has expanded upon the
"forms" question by now; my newsreader is always behind.

What concerns me most about the keyword approach is that, the way the
web is growing, simply having "subjects" and even "forms" as types of
keywords to identify web sites will become grossly insufficient.  If
web sites ever begin a trend towards higher degrees of specificity, I
think we'll end up with a similar problem to the current one.  Take
this example:  We find a web site that addresses "The Effect of Man on
the Environment."  The webmaster has assigned the subjects "man" and
"environment."  But how can we differentiate between this website and
the ones on "The Effects of the Environment on Man?"  It seems to me
that it would be best to have a variety of possible types of "subject"
entries.  An "effect of" subject-tag would be very useful in this
situation.  Of course, most sites wouldn't use it, but to some it
would be essential for proper identification.   The result for the
first site would be something like <subject = environment> and 
<effect of = man>.

A "form" example (off the top of my head):  You come across a site
dedicated to Marie Antoinette's fictional appearances in literature.
How do you differentiate between this site and one that discusses her
as a historical character?   According to Library of Congress
practice, her name would be followed by "in literature."  But if you
simply provide a subject for her name and for "literature" on the web,
that sounds like it might be a site that presents literature written
by her, which is very misleading.

With the "form" heading/tag the last example would be solved.  Sites
that *contain* literature could have the tag <form = literature>.
Sites that are *about* literature would have tag <subject =
literature>.   Sites that focus on both would have both.  If there is
literature by Marie Antoinette at the site, there could be a tag
<author = Marie Antoinette>.  This would need to be distinguished from
the tag for the author of the web page, though... <grin>

I think that keywords are definitely the way to go on the web, for
simplicity's sake.  But I am sure that users would benefit from
greater specificity within the keyword framework.  If the subject tags
are well-documented, webmasters should have plenty of  incentives to
use them.

- Michelle
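
The "effect of" example above can be sketched in Python by storing each tag as a (role, term) pair. With bare keywords the two sites are indistinguishable; with roles they are not. The site ids, the data layout, and both function names are hypothetical; only the tag names "subject" and "effect of" and the terms come from the message.

```python
# Hypothetical tagged sites, following the <subject = ...> and
# <effect of = ...> tags suggested above.
SITES = {
    # "The Effect of Man on the Environment"
    "site-1": [("subject", "environment"), ("effect of", "man")],
    # "The Effects of the Environment on Man"
    "site-2": [("subject", "man"), ("effect of", "environment")],
}


def find_by_keywords(sites, terms):
    """Match on terms alone, ignoring the role each term plays."""
    return sorted(
        site_id
        for site_id, tags in sites.items()
        if all(term in {t for _, t in tags} for term in terms)
    )


def find_by_tags(sites, required):
    """Match only sites carrying every required (role, term) pair."""
    return sorted(
        site_id
        for site_id, tags in sites.items()
        if all(pair in tags for pair in required)
    )


print(find_by_keywords(SITES, ["man", "environment"]))  # ['site-1', 'site-2']
print(find_by_tags(SITES, [("effect of", "man")]))  # ['site-1']
```

The same structure would cover the form example: a ("form", "literature") pair and a ("subject", "literature") pair are different tags even though the term is identical.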

On Mon, 28 Apr 1997 09:44:35 -0700, you wrote:

[snipped in various places]
>Robert Cunnew wrote:
>>
>> In article 336372F2.7388@worldnet.att.net, "Eyler Coates, Sr."
>> eyler.coates@worldnet.att.net writes
>>
>> >(4) SUBJECTS.  This would require the existence of a standard list of
>> >subject headings, probably made available by the Search Engine (see
>> >below), from which Webmasters could select (with helps) up to five
>> >appropriate headings for their own page and put them in a tag:
>> >       <meta  name="subjects" content="--------">
>>
>> Given the undesirability of precoordination in subject indexing, I
>> wonder whether there is a need for (5) Forms, taken from a short list of
>> appropriate terms, eg information service, promotional material, images,
>> sounds, software (form not subject, ie downloadable), news, directory,
>> discussion forum ...

>I regrettably must confess that I am unfamiliar with your frame of
>reference here and am uncertain of your meaning.  I had suggested five
>subject headings as a maximum, though if the Keyword option were
>selected, it might seem that more than five would be required.  Are we
>talking about the same thing?

>Eyler Coates

>=============================================================
>All of the postings to this thread are available in a redacted
>form, without repetitions and irrelevant matter, at:

>                     Cataloging the Web
>                Making the WWW More Accessible

>   http://www.geocities.com/Athens/Forum/1683/cwindex.htm

>==============================================================

---------------------------------------------------------
Michelle Martin Robertson     robertson@aztec.lib.utk.edu
University of Tennessee, Knoxville Libraries