Cataloging Guidelines
From Project Gutenberg, the first producer of free ebooks.
Useful links
- Library of Congress catalog
- Library of Congress authorities
- WorldCat meta-catalog
- Cataloger's Reference Shelf
- A Summary of Commonly Used MARC 21 Fields
- Internet Speculative Fiction Database
Catalog record review notes
Background on catalog record review
When a new etext is posted to Project Gutenberg, a corresponding catalog record is automatically created, so newly posted texts have immediate basic access. However, parts of the automatically created record sometimes need fixing by a human cataloger for optimal access. These notes provide a suggested procedure for fixing up a record.
There are different ways to decide which records to tackle. You might work on newly-posted records, or records grouped on a bookshelf that interests you, or records for works by a favorite author, or just pick an arbitrary ebook number and start counting up or down. You might want to see the Cataloging Progress page so you can find a good spot.
Catalog record review procedure
Check the author(s)
Make sure the title is linked to the right author(s). To check, look up the title at the Library of Congress or WorldCat and compare their author entries for the book with ours.
Sometimes duplicate author entries are accidentally created in the PG catalog. To see whether a pair of duplicate author entries needs to be merged, look at the author in the PG author browse view, or use the catalogers' author look-up page to search on the last name to see if there's an obvious merging candidate.
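As a rough illustration of the last-name check described above, the sketch below groups author entries by surname and reports groups with more than one entry. The function name `merge_candidates` and the assumption that names are stored as "Last, First, dates" strings are hypothetical; this is not the catalogers' look-up page itself, and a shared surname only suggests a pair worth inspecting by hand.

```python
from collections import defaultdict


def merge_candidates(author_names):
    """Group author entries that share a last name (hypothetical helper).

    Assumes each entry is a string in "Last, First, dates" form, so the
    text before the first comma is treated as the surname.
    """
    groups = defaultdict(list)
    for name in author_names:
        surname = name.split(",", 1)[0].strip().casefold()
        groups[surname].append(name)
    # Only surnames shared by two or more entries are worth a manual look.
    return [names for names in groups.values() if len(names) > 1]


entries = [
    "Melville, H.",
    "Melville, Herman, 1819-1891",
    "Dickens, Charles, 1812-1870",
]
print(merge_candidates(entries))
# [['Melville, H.', 'Melville, Herman, 1819-1891']]
```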
For a newly-created author (with just one associated title, say), we may have only minimal information in our author record. Check the Library of Congress name authorities file to see if our form of the name is as complete as it could be (not just "Melville, H." as the author of Moby Dick, for instance). Optionally, you may add one or more aliases for significantly different forms of the author's name (see Charles Dickens for an example), and perhaps a link to a Wikipedia entry or other source on the author. Aliases should have "Heading" status if they will file far from the main entry alphabetically, but "No Heading" status if they will file right next to the main entry; this reduces clutter, because "No Heading" aliases do not display.
- Birth and death dates
- If our author record doesn't include any dates, check the Library of Congress name authorities to see if the author's record there includes dates we can add (you may also include date information from another source).
- If date information is unambiguous, you need only enter the birth and/or death date in the "earliest" fields. Only if you have a range of dates do you need to use both the "earliest" and "latest" fields.
- Some authors have an uncertain date, like "Du Haillan, Bernard de Girard, seigneur, 1535?-1610" or "Xuan, Ding, 1832-1880?". If you have a single uncertain date rather than a range, enter it only in the "latest" field; it will then display with a trailing question mark "?".
- BC dates must be entered as negative numbers, and will display with "BC".
- The date fields don't accept non-numeric data, so date info like "fl. 1600-1615" can't be entered in the date fields. Suggested work-around: include that information in the name field instead.
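The date rules above can be summarized as a small mapping from a date as written in an authority record to the "earliest" and "latest" fields. This is only a sketch of the data-entry decision: the function name `date_fields` and the returned dictionary are hypothetical, and it handles a single birth or death date, not a range.

```python
import re


def date_fields(text):
    """Decide which field a single birth or death date goes in (sketch).

    Accepts forms like "1832", "1880?" or "428 BC". Anything else, such
    as "fl. 1600-1615", is rejected because the date fields are numeric.
    """
    m = re.fullmatch(r"(\d{1,4})(\?)?(\s*BC)?", text.strip())
    if m is None:
        raise ValueError(
            f"{text!r} cannot go in the date fields; "
            "consider putting it in the name field instead"
        )
    year = int(m.group(1))
    if m.group(3):
        year = -year  # BC dates are entered as negative numbers
    if m.group(2):
        # Uncertain date: "latest" only, so it displays with "?"
        return {"earliest": None, "latest": year}
    # Unambiguous date: "earliest" only
    return {"earliest": year, "latest": None}


print(date_fields("1832"))    # {'earliest': 1832, 'latest': None}
print(date_fields("1880?"))   # {'earliest': None, 'latest': 1880}
print(date_fields("428 BC"))  # {'earliest': -428, 'latest': None}
```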
Look at the list of authors, etc. (if more than one) to see if most should be linked with "No Heading."
This reduces redundant title entries in the list of search results users get, but does not prevent access through searches on the "No Heading" person. One way to decide who gets a Heading: search the title at the Library of Congress or WorldCat and see who's in the "Personal name" (100) field in a record for the title. That person gets "Heading"; everyone in "Related names" gets "No Heading." When the main author is also listed as an editor (or illustrator) you will get an error message when you try to change their editor link from "Heading" to "No Heading". The work-around is to remove the editor link and recreate it, this time with "No Heading". Or if you feel it's redundant, just remove it, optionally adding a Note (500 field) saying: Edited by the author.
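One way to picture the rule above: the person in the 100 field of the reference record gets "Heading" once, and every other link, including a second link for the same person as editor or illustrator, gets "No Heading". The function `assign_headings`, the (name, role) tuples, and the placeholder names are hypothetical illustrations, not part of the catalog software.

```python
def assign_headings(links, main_entry):
    """Label each (name, role) link as "Heading" or "No Heading" (sketch).

    main_entry is the name found in the 100 field of a Library of
    Congress or WorldCat record for the title.
    """
    result = []
    heading_given = False
    for name, role in links:
        if name == main_entry and not heading_given:
            result.append((name, role, "Heading"))
            heading_given = True
        else:
            # Related names, and repeat links for the main author
            result.append((name, role, "No Heading"))
    return result


links = [
    ("Doe, Jane", "Author"),
    ("Doe, Jane", "Editor"),
    ("Roe, Richard", "Illustrator"),
]
for row in assign_headings(links, "Doe, Jane"):
    print(row)
# ('Doe, Jane', 'Author', 'Heading')
# ('Doe, Jane', 'Editor', 'No Heading')
# ('Roe, Richard', 'Illustrator', 'No Heading')
```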
Check title for possible typos and correct number of non-filing characters
One warning sign of a typo is a title search in LC and WorldCat that returns no hits (though no hits is unsurprising for science fiction short stories). If you get no hits, check the text itself to see whether the title in the header matches what is on the "title page". If there is an error in the title, correct it. To save any change to the title you will need to specify a language, if one is not yet specified.
If you make any changes to the beginning of the title, make sure that the number of non-filing characters remains correct. The number of non-filing characters is the number of letters in any initial article plus one for the following space, so for "The " it's 4, for "An " it's 3, etc. Initial quotation marks or other punctuation, like " ' or ... should also be counted as non-filing characters. You can confirm how many non-filing characters are needed for a title in an unfamiliar language by searching the title in another library catalog and then choosing the MARC display. The second digit after the "245" field label is the number of non-filing characters in the title. If you suspect that the first word of the title is an article, try omitting that word when you do your search.
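The counting rule above can be checked with a short sketch. It assumes English articles only and a small set of leading punctuation characters; `non_filing_count` and `second_indicator` are hypothetical helper names. The layout of a MARC display line varies between catalogs, so the pattern used here (the tag "245" followed by two indicator digits) is an assumption as well.

```python
import re

ARTICLES = ("the", "an", "a")  # English only; other languages need their own list
LEADING_PUNCT = "\"'."         # quotation marks and the dots of "..."


def non_filing_count(title):
    """Count leading punctuation, any initial article, and its following space."""
    i = 0
    while i < len(title) and title[i] in LEADING_PUNCT:
        i += 1
    rest = title[i:].lower()
    for article in ARTICLES:
        if rest.startswith(article + " "):
            return i + len(article) + 1
    return i


def second_indicator(marc_line):
    """Read the non-filing count from a line like '245 14 |a The ...'."""
    m = re.match(r"\s*245\s+(\d)(\d)", marc_line)
    return int(m.group(2)) if m else None


print(non_filing_count("The Picture of Dorian Gray"))  # 4
print(non_filing_count("An Ideal Husband"))            # 3
print(non_filing_count("Moby Dick"))                   # 0
print(second_indicator("245 14 |a The picture of Dorian Gray"))  # 4
```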
Skim record for anything odd
An example would be a malformed autogenerated note field (it will look obviously weird). If you're not sure how to fix it, mention it on gutcat.
If there is no language code attached to the bib record, it may be because we don't yet have the language in the list of codes. Mention it on gutcat or see special cataloging procedures.
Optional steps
Add Library of Congress Class (LoCC)
- This will commonly be the letters at the beginning of the call number (if it is a Library of Congress-style call number beginning with a letter).
- Some of the really broad letter codes have been subdivided somewhat by number, so I would not use just "E" or just "F" but would use a narrower code, like E151 or F1001. Go with the nearest numeric code smaller than the call number in the record you found. "D" is still mostly lumped all together, EXCEPT for material on World War I, which should get D501. Splitting "D" up a bit more would probably be good.
- Be careful about PZ's from LC-- PZ1-4 were used in the past to group popular fiction in English (regardless of original language) outside the normal order. This is no longer common usage, and the Library of Congress has stopped using PZ1-4 (but has not updated existing call numbers to be in line with current practice), so treat low PZ call numbers from LC with suspicion, and if possible check in WorldCat for a better alternative.
- Canadian literature should most likely be PS instead of PR, though you will see PR in some catalogs.
- You can add more than one LoCC if the work usefully fits into more than one category; see etext:18845 for an example.
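The first two bullets above can be sketched as follows: take the leading letters of a Library of Congress-style call number and, for a subdivided class, pick the nearest numeric code at or below the call number. The `SUBDIVISIONS` table is hypothetical and deliberately incomplete, containing only the codes named in these notes (E151 and F1001); the real list of codes in the catalog should be used instead. "D" is left out because only World War I material gets D501, which is a subject decision and not a numeric one. The sketch also treats a call number equal to a code as belonging to that code, which is an assumption.

```python
import re
from bisect import bisect_right

# Hypothetical, incomplete table: class letters -> sorted subdivision starts.
SUBDIVISIONS = {"E": [151], "F": [1001]}


def locc_from_call_number(call_number):
    """Suggest a LoCC from an LC-style call number, or None if not LC-style."""
    m = re.match(r"([A-Z]{1,3})\s*(\d+)?", call_number.strip())
    if m is None:
        return None  # e.g. a Dewey number such as "813.4"
    letters, number = m.group(1), m.group(2)
    starts = SUBDIVISIONS.get(letters)
    if starts and number is not None:
        i = bisect_right(starts, int(number))
        if i:
            return f"{letters}{starts[i - 1]}"
    # No subdivision known (or the number is below the first code listed
    # here): fall back to the letters and check the real list by hand.
    return letters


print(locc_from_call_number("PR5819 .A1 1890"))  # PR
print(locc_from_call_number("E168 .B56"))        # E151
print(locc_from_call_number("F1008 .H2"))        # F1001
print(locc_from_call_number("813.4"))            # None
```

Low PZ numbers still need the manual check described above; the sketch does not flag them.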
Add appropriate subject heading(s)
Look up the title in the Library of Congress (and possibly WorldCat) to find subject headings to use. See the subject cataloging notes.
Add contents note (505 field) if applicable
Include all table of contents info in a single contents note. Ignore the "non-filing characters" field. See etext:10023 for an example. I usually only include a contents note if I've found one I can copy and paste from another catalog, but if you do that, make sure that the contents of our edition match what you're pasting in.
Add uniform title (240 field) if applicable
I mostly use this with translations, and I interpret the "language" for the uniform title to be the primary language of the text in the 240 field itself. So "Picture of Dorian Gray. French" would get language "English."
Add alternate title (246 field) if applicable
This doesn't come up much except for texts in Chinese; see special cataloging procedures.
Other advice
Sometimes it is useful to see the upload message from the text-preparer. For this, you would need to subscribe to the whitewashers' email list. Ask on gutcat about how to subscribe. You can generally skim through the messages searching on the word "note" to see if there are any items that require changes in the bib record for the text. You may also find it useful to subscribe to the posted list, though if you are on the whitewashers' list it may not be necessary.
See Also
- Cataloging Progress -- a place to post incomplete projects so they don't get lost and may be furthered by others. Also some project ideas for when you need inspiration.
- Subject cataloging notes -- About adding subjects to catalog records.
- Special cataloging procedures -- notes on less-common tasks.
- Programming Project Ideas -- a place to list suggestions to improve the software/website