Making the WWW More Accessible
The Initial Posting
Subject: Cataloging the Web
Date: Sun, 27 Apr 1997 08:38:26 -0700
From: "Eyler Coates, Sr."
Organization: http://www.webspawner.com/users/EylerCoates/
Newsgroups: bit.listserv.autocat,schl.sig.lmnet

The current issue of Slate (http://www.slate.com) contains an article by Bill Barnes in his Webhead column, "Search Me," on the inadequacy of access to "the vast resources" of the Web as compared to any library. He examines the various search engines available, and details how unsatisfactory they are. He didn't really propose a solution to the problem, but his conclusion included the following:

"If the producers of every site cataloged it themselves, then Yahoo! wouldn't have a hard time keeping up with them. Of course, everyone would have to agree on standard ways to do this, and if everyone agreed, for-profit search sites like Yahoo! probably wouldn't be necessary."

A lot of people have discussed this problem, but no one that I have discovered has as yet proposed a simple, comprehensive, workable solution. I posted to Slate's "The Fray" an outline of the following possible overall structure that might meet the problem. It is presented in more detail here in hopes that other people might have some input, and we can collectively arrive at a mechanism for providing better access to the WWW.

CATALOGING THE WEB

A practical system must take into account the nature of materials on the Web, the people who create them, the search engines that find them, and the needs of the people who research them. A system similar to that provided by library services will not work, because so many of the elements are different. For example, classification systems (such as Dewey, LC) are irrelevant because they are designed for grouping physical objects on shelves for browsing, access and retrieval. Unnecessary elements only add unnecessary complication. Web document research doesn't need complete bibliographic data. What's really needed is an efficient means of retrieval.
If users want more complete information, they can click on the document itself, unlike in a library where they would need to go up an elevator to the fourth floor to look at the document. Therefore, Web Cataloging need only concern itself with retrieving a good list of mostly relevant documents that the user can then examine more closely.

Another factor is the level of expertise of the people who will necessarily be doing the cataloging. Already, Web resources are vast and constantly changing, and they promise only to become more so in the future. It is necessary, therefore, that the cataloging be simple enough to be done by an ordinary Webmaster and not require the services of a professional.

The basic requirements of people who search the Web for materials are (1) Titles, (2) Names, (3) Subjects, and (4) Brief Abstracts accompanying the results of searches for the first three elements. Other elements, such as dates, publishers (Webmasters?), editions, etc., could be obtained by clicking on the document itself.

It is interesting that computers give incredible forms of access to data, but to date, the best that Search Engines can do has been to index every word that appears on a webpage. This, however, produces very unsatisfactory search results most of the time. If searches could be conducted via the first three elements above, with an abstract included with each finding, the results would be far more satisfactory for users.

What is needed, therefore, is a set of four page attributes, which could be included in each document's head. Two of these are already found on most webpages.

(1) TITLES. Already included in the header of every Webpage: <TITLE>--------------</TITLE>

(2) DESCRIPTION (Abstract). This is a standard META tag, using the designation: <META NAME="description" CONTENT="--------------">

(3) NAMES.
This would require a new META tag, which could include up to five names, selected appropriately by the Webmaster, of authors, editors, corporations, subjects of biographies, etc., using the designation:

Distinguishing between authors, editors, etc., would be unnecessary, because the abstract should make clear the relationship of the various names to the page content. Also, the document is right there and can be clicked on if the user wants more precise information.

(4) SUBJECTS. This would require the existence of a standard list of subject headings, probably made available by the Search Engine (see below), from which Webmasters could select (with help) up to five appropriate headings for their own page and put them in a tag:

Rather than use something as complicated as the Library of Congress Subject Headings, something less detailed, such as the Sears Subject Headings, would probably be sufficient for the WWW. The only problem is that such a list is not presently available on the Web, although it probably would not be too difficult to establish a similar list of subject headings that Webmasters could use. If some public institution supplied such a list on the WWW, all search engines could use it, and everybody would benefit from the uniformity. Simplicity combined with lots of help would be necessary, because the headings would have to be applied by Webmasters without professional training.

Note that the Search Engine itself would contain the equivalent of "See" and "See Also" references. "See" references would automatically bring up the referenced materials in most cases, whereas "See Also" references should present options to the user. In some cases, a search could bring up a request for more specific attributes from the user. All of these would be elements built into the search engine's program.

The first requirement to initiate the system would be for some enterprising Search Engine to offer searches based on the META tags described above.
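As a rough illustration of how a search engine might read the four head attributes and search on them, here is a minimal sketch in Python. It is not a definitive design. The META names "names" and "subjects", the semicolon separator between entries, the sample page, and its URL are all assumptions made for this example; the posting does not fix any of them. Only the TITLE element and the description META tag are existing conventions.

```python
from html.parser import HTMLParser

# Hypothetical META names: "names" and "subjects" are placeholders
# chosen for this sketch, not designations taken from the posting.
LIST_FIELDS = ("names", "subjects")
MAX_ENTRIES = 5  # the proposal allows up to five names and five subjects


class HeadCatalogParser(HTMLParser):
    """Collect title, description, names and subjects from a page head."""

    def __init__(self):
        super().__init__()
        self.record = {"title": "", "description": "", "names": [], "subjects": []}
        self._in_title = False
        self._done = False  # set once </head> is seen; the body is ignored

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            content = (attr.get("content") or "").strip()
            if name == "description":
                self.record["description"] = content
            elif name in LIST_FIELDS:
                # Assumed separator: semicolons, since names contain commas.
                entries = [e.strip() for e in content.split(";") if e.strip()]
                self.record[name] = entries[:MAX_ENTRIES]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "head":
            self._done = True

    def handle_data(self, data):
        if self._in_title and not self._done:
            self.record["title"] += data


def catalog_page(html_text):
    """Return the catalog record for one page, reading only its head."""
    parser = HeadCatalogParser()
    parser.feed(html_text)
    parser.close()
    record = parser.record
    record["title"] = record["title"].strip()
    return record


def search(catalog, field, term):
    """Return (url, title, description) for pages whose field contains term."""
    term = term.lower()
    hits = []
    for url, record in catalog.items():
        value = record[field]
        values = value if isinstance(value, list) else [value]
        if any(term in v.lower() for v in values):
            hits.append((url, record["title"], record["description"]))
    return hits


# Made-up sample page and URL, for illustration only.
SAMPLE_URL = "http://example.invalid/bees.html"
SAMPLE_PAGE = """<html><head>
<title>A Guide to Backyard Beekeeping</title>
<meta name="description" content="An introduction to keeping a first hive.">
<meta name="names" content="Smith, Jane; Example Apiary Society">
<meta name="subjects" content="Bees; Beekeeping">
</head><body>Body text is never indexed in this sketch.</body></html>"""

catalog = {SAMPLE_URL: catalog_page(SAMPLE_PAGE)}
for url, title, abstract in search(catalog, "subjects", "beekeeping"):
    print(url, "|", title, "|", abstract)
```

Each hit carries the page's title and abstract, so the user sees a short description alongside every result, and nothing outside the head is stored.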
Such a META-tag search need not replace the present word searches, and it would permit the system to be introduced gradually.

Then what would be required is the cooperation of Webmasters in providing the necessary META tags. Would they do it? Of course they would do it! Right now, many of them try all kinds of tricks in order to get their page recognized, such as filling the page with certain key words. The one thing that Webmasters want after going to all the trouble of creating a Webpage is for everyone to have access to it. Surely, if they were required to put in the appropriate META tags to get listed properly, you can bet they would do it.

Search engines that rely on just the META tags would be easier to set up. Since such a search engine would be concerned only with the data in a page's heading, it would require less storage, which might even enable it to visit every page on the Web, and to do so more frequently.

Eyler Coates

--
============================================================
Thomas Jefferson on Politics & Government
http://pages.prodigy.com/jefferson_quotes
Eyler Robert Coates, Sr.          email@example.com
============================================================