Cataloging Guidelines

From Project Gutenberg, the first producer of free electronic books (ebooks).

Possible tasks

Useful links

Catalog record review notes

Background on catalog record review

When a new etext is posted to Project Gutenberg, a corresponding catalog record is automatically created. This is great because it means there is immediate basic access to newly posted texts. However, sometimes there are things about the automatically created records which need fixing up by a human cataloger for optimal access. These notes are intended to provide a suggested procedure for fixing up a new record.

Some useful websites

Catalog record review procedure

Check names of creators, etc., to see if they need merging or modification.

It may not be possible to catch all names that need merging (say if there's an obscure pseudonym for the author already in the PG catalog, one might not know to merge that with the form on the current record), but at least look at the author in the PG author browse view, or use the catalogers' author look-up page to search on the last name to see if there's an obvious merging candidate nearby.

Also be aware that a title can be automatically linked to the wrong author. The easiest way to check may be to look up the title at LC or WorldCat and compare their author entries for the book with ours. You can use your head, too: someone born in the 1860s didn't edit a work published in the 1850s!

For a newly-added author (with just one associated title, say), check in the LoC name authorities file to see if our form of the name is as complete as it could be (not just "Melville, H." as the author of Moby Dick, for instance). Optionally, you may add one or more aliases for significantly different forms of the author's name (see Charles Dickens for an example), and perhaps a link to a Wikipedia entry or other source on the author. Aliases should have "Heading" status if they will file far from the main entry alphabetically, but "No Heading" status if they will file right next to the main entry (to reduce clutter).

If our information for the author doesn't include any dates, check the LoC name authorities to see if there are some dates to add (but of course only if a name authority record you find with dates actually corresponds to our author). Note that our date fields don't accept non-numeric data, so date info like "fl. 1600-1615" can't be entered in the date fields. Suggested work-around: include that information in the name field instead. BC dates must be entered as negative numbers, and will display with "BC". If date information is unambiguous, you need only enter the birth and/or death date in the "earliest" fields. Only if there is ambiguity about one of the dates do you need to use both the "earliest" and "latest" fields.

Some authors have an uncertain date, like "Du Haillan, Bernard de Girard, seigneur, 1535?-1610" or "Xuan, Ding, 1832-1880?". To make a date display with a following question mark "?" when you don't have a range of dates to enter, enter that date only in the "latest" field. To do occasional checks for authors that would benefit from this treatment (they will look better in the public display), go into the Authors interface and search for "*?".
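
The date rules above can be sketched in code. This is only an illustration, not the catalog software's actual logic: the function name toDateFields and the earliest/latest property names are assumptions made for this example, and it handles one year at a time (a birth date or a death date).

```javascript
// Hypothetical helper: map one LoC-style date ("1610", "1535?", "43 BC") to
// the "earliest"/"latest" pair described above. Names are assumptions.
function toDateFields(text) {
  const match = /^(\d{1,4})(\s*B\.?C\.?)?(\?)?$/i.exec(text.trim());
  if (!match) {
    // Non-numeric info such as "fl. 1600-1615" cannot go in the date fields;
    // the suggested work-around is to put it in the name field instead.
    return null;
  }
  let year = parseInt(match[1], 10);
  if (match[2]) {
    year = -year; // BC dates are entered as negative numbers
  }
  if (match[3]) {
    // Uncertain date with no range: fill only "latest" so it displays with "?"
    return { earliest: null, latest: year };
  }
  // Unambiguous date: fill only "earliest"
  return { earliest: year, latest: null };
}
```

For "Du Haillan, Bernard de Girard, seigneur, 1535?-1610", calling toDateFields("1535?") would fill only the "latest" birth field, while toDateFields("1610") would fill only the "earliest" death field.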

Look at the creators/editors/translators/illustrators/etc. (if more than one) to see if some (possibly all but one) of them should be linked with "No Heading."

This reduces redundant title entries in the list of search results users get, but does not prevent access through searches on the "No Heading" person. One way to decide who gets a Heading: search the title at the Library of Congress or WorldCat and see who's in the 100 field in a record for the title. That person gets "Heading", everyone else gets "No Heading." Note that when the main author is also listed as an editor you will get an error message when you try to change their editor link from "Heading" to "No Heading" (it's not a problem if the editor is a different person). The only work-around I've found is to remove the editor link and recreate it, this time with "No Heading". Or if you feel it's redundant, just remove it.
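
As a rough sketch of the rule above (the person in the 100 field gets "Heading", everyone else gets "No Heading"), consider the following. The record shape, the function name assignHeadings, and the sample names are all made up for illustration and do not reflect the catalog's actual data model. The sketch also assumes that when the main author has a second link (for example as editor), only the first link keeps the heading.

```javascript
// Hypothetical sketch: link objects and property names are illustrative only.
function assignHeadings(links, mainEntryName) {
  let mainAssigned = false;
  return links.map((link) => {
    // Only the first link for the person found in the 100 field keeps a heading.
    const isMain = !mainAssigned && link.name === mainEntryName;
    if (isMain) {
      mainAssigned = true;
    }
    return { ...link, status: isMain ? "Heading" : "No Heading" };
  });
}

// Made-up example data: one author who is also listed as editor, plus an illustrator.
const exampleLinks = assignHeadings(
  [
    { name: "Example, Author", role: "Author" },
    { name: "Example, Author", role: "Editor" },
    { name: "Sample, Illustrator", role: "Illustrator" },
  ],
  "Example, Author"
);
```

Here only the first "Example, Author" link ends up with "Heading"; the editor and illustrator links get "No Heading".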

Skim record for anything odd

An example would be a malformed autogenerated note field (it will look obviously screwy). If you're not sure how to fix it, mention it on gutcat.

If there is no language code attached to the bib record, it may be because we don't yet have the language in the list of codes. Marcello advises:

In doubt consult:

 http://www.ethnologue.com

use the three letter code (ISO 639-3) and the main language name in the header to insert new languages into the database. (So if anybody complains about the name you simply refer them to ethnologue.)

Alternatively refer to this list of ISO 639-2 codes:

 http://www.loc.gov/standards/iso639-2/php/code_list.php

Check title for possible typos and correct number of non-filing characters

A warning flag for typos is getting no hits when you search the title in LC and WorldCat (but don't be surprised at no hits for science fiction short stories). If you get no hits, check in the text itself and see if the title in the header matches what's on the "title page". If there is an error in the title, correct the 245. Note that you will need to specify a language (if it is not yet specified) to save any changes to the title that you make. If you make any changes at the beginning of the title, make sure that the number of non-filing characters remains correct. The number of non-filing characters is the number of letters in any initial article plus one for the following space, so for "The " it's 4, for "An " it's 3, etc.

If a title (or author) typo is found in the header, report it to the errata team. A significant author typo would be an actual misspelling of part of the name, not just a different form of the name. So for L. Frank Baum, "L. Fran Baum" would be a typo but "Lyman Frank Baum" wouldn't. If you have reason to suspect a typo, compare the spelling in the header to the spelling in the etext proper. If it doesn't match, you have probably found a typo.

Even if you don't edit the 245, it's worth checking to make sure the number of non-filing characters is correct, especially for a title in a language other than English. If it is a language with which you are unfamiliar, you can find out how many non-filing characters are needed by searching the title in another library catalog and then choosing the MARC display. The second digit after the "245" field label is the number of non-filing characters in the title. If you suspect that the first word of the title is an article, try omitting that word when you do your search. Additionally, if the title begins with punctuation (e.g. " ' or ...), then edit the non-filing character field to take that into account, so that the title will be filed starting at the first non-punctuation and non-space character.
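
The counting rule for non-filing characters can be illustrated with a short sketch. The function name nonFilingCount and the article list are assumptions for this example; the list covers English only, so titles in other languages would need their own list (or a check against another library's MARC display, as described above). The sketch also assumes that leading punctuation and a following article are counted together.

```javascript
// Hypothetical helper: count the non-filing characters at the start of a title.
// The article list is an illustrative assumption covering English only.
const ENGLISH_ARTICLES = ["the", "an", "a"];

function nonFilingCount(title, articles = ENGLISH_ARTICLES) {
  let skip = 0;
  // Skip leading punctuation and spaces so that filing starts at the
  // first non-punctuation, non-space character.
  while (skip < title.length && /[\s"'.\u2018\u201C(\[]/.test(title[skip])) {
    skip++;
  }
  const rest = title.slice(skip).toLowerCase();
  for (const article of articles) {
    // An initial article counts only when a space follows it:
    // its letters plus one for the space.
    if (rest.startsWith(article + " ")) {
      return skip + article.length + 1;
    }
  }
  return skip;
}
```

With this sketch, "The Picture of Dorian Gray" gives 4, "An Essay" gives 3, a title with no article such as "Anthem" gives 0, and "...The End" gives 7 (three dots plus "The ").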

Optional extras

Add LoCC (Library of Congress Class)

Add appropriate subject heading(s)

Look up the title in the Library of Congress (and possibly WorldCat) to find subject headings to use. For more tips, see the subject cataloging notes.

Add 505 (contents note) if applicable

Include all info in a single contents note. Ignore "non-filing characters" field. See etext:10023 for an example. I usually only include a 505 if I've found one I can copy and paste from another catalog, but if you do that, make sure that the contents of our edition matches what you're pasting in. (OCLC docs)

Add 010 (LCCN) if applicable

If you find a record at the Library of Congress with a 260 which matches the publication info on the title page of our text, you might add an 010 field and paste in the LC Control No. from the LC record. See etext:30674 for an example. If the 010 includes a space (usually the case if it includes any letters), you need not remove it (the link will work either way). Records with an 010 beginning with "unk" are poor quality; I wouldn't bother adding an 010 in that case.

Add 240 (uniform title) if applicable

I interpret the "language" field for 240 as the primary language of the text in the 240 field itself. So "Picture of Dorian Gray. French" would get language "English." (OCLC docs)

Add 246 (alternate title) if applicable

This doesn't come up much except for texts in Chinese; see special cataloging procedures. (OCLC docs)

For more information...

about any of these fields, see the OCLC bibliographic formats and standards document

Other advice

Sometimes it is useful to see the upload message from the text-preparer. For this, you would need to subscribe to the whitewashers' email list. Ask on gutcat about how to subscribe. You can generally skim through the messages searching on the word "note" to see if there are any items that require changes in the bib record for the text. You may also find it useful to subscribe to the posted list, though if you are on the whitewashers' list it may not be necessary.

Useful automation bits

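Bookmarklet: rewrites every link on the current page whose address contains "etext" so that it points to the admin page that adds LoCC "BS" to that book.

javascript:(function () { var links = document.links; for (var n = 0; n < links.length; n++) { var id = links[n].href.match(/[0-9]+/); if (/etext/.test(links[n].href) && id) { links[n].href = "http://www.gutenberg.org/catalog/admin/mn_books_loccs?mode=add&step=update&fk_loccs=BS&fk_books=" + id[0]; } } })()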

See Also