
Understanding Indexes: Headings

Understanding Indexes: Headings. www.exlibrisgroup.com. Prepared by Marina Spivakov, 2002; updated by Jerry Specht, June 2003. Scope of the lecture, with points for discussion in each index: index structure (Oracle tables), specifying the index, index creation and update, performance issues.


Presentation Transcript


  1. Understanding Indexes: Headings www.exlibrisgroup.com Prepared by Marina Spivakov, 2002; Updated by Jerry Specht, June 2003

  2. Scope of the Lecture • Points for discussion in each index: • Index structure (Oracle tables) • Specifying the index • Index creation and update • Performance issues

  3. NOTE: This PowerPoint presentation discusses Headings in version 14.2 (and in general). It is supplemented by NAAUG.INDEX_CH.ppt, which follows and describes Headings features new in 15.2.

  4. Where to get your own copy: Both this PowerPoint presentation and the one that follows may be found on the US documentation server ( http://support.exlibris-usa.com/D ) in the NAAUG_Indexes_2003 directory.

  5. Headings Index

  6. Headings Index • Headings indexes hold whole phrases taken from the record, such as author, title, subject, or publisher.

  7. Database Tables • Heading index: • Z01 – phrase dictionary • Z02 – pointers to the documents

  8. Z01 record • Z01 record unique identifier, link to other records • Display text • Filing text (sub-fields stripped, punctuation stripped, leading zeros added to numeric fields, character conversion, etc.) • Authority link

  9. Z01–Z02 link (diagram): Z01 record, Z02 record, Bibliographic record. The Z02 record links the Z01 heading to the bibliographic record.

  10. How to Define the Headings Index? • Tables to remember: • tab00.lng defines system index codes & filing procedures • tab11 defines connections between the bibliographic record fields and the indexes • tab_filing defines filing procedures • tab_expand defines expand procedures which have to be activated when the index is created • tab_character_conversion_line defines character conversion routines • unicode_to_filing_nn is the character conversion table used for normalization of headings

  11. How to Define the Structure of the Headings Index – Interrelation of Tables (diagram): tab00.lng, tab11, tab_filing, tab_expand

  12. Z01: Display Text and Filing Text – Useful Details

  13. Z01 – Display Text and Filing Text • Display text: data for the display text is taken directly from the record. • Filing text: data undergoes filing and character conversion processing.

  14. Z01 – Display Text • If two records generate headings that have a common filing text but different display texts, the system will create two headings, not one. (Diagram: Bibliographic document 1 and Bibliographic document 2 each produce their own Z01.)
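The rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `HeadingIndex` class, its field names, and the `toy_filing` rule are hypothetical and do not reflect the actual Oracle layout of Z01 and Z02.

```python
from dataclasses import dataclass, field


@dataclass
class HeadingIndex:
    """Toy stand-in for Z01 (headings) and Z02 (pointers to documents)."""
    z01: dict = field(default_factory=dict)  # (filing_text, display_text) -> heading id
    z02: list = field(default_factory=list)  # (heading id, document number)

    def add(self, doc_number, display_text, filing_text):
        # A heading is reused only when BOTH texts match.
        key = (filing_text, display_text)
        if key not in self.z01:
            self.z01[key] = len(self.z01) + 1
        self.z02.append((self.z01[key], doc_number))
        return self.z01[key]


def toy_filing(text):
    # Hypothetical filing rule for this sketch: lower-case, drop punctuation.
    return "".join(c for c in text.lower() if c.isalnum() or c == " ").strip()


index = HeadingIndex()
for doc, heading in [(1, "Smith, John"), (2, "SMITH, JOHN"), (3, "Smith, John")]:
    index.add(doc, heading, toy_filing(heading))

print(len(index.z01), "headings,", len(index.z02), "pointers")
```

Documents 1 and 3 share one heading, while document 2 gets its own heading even though its filing text is identical.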

  15. Z01 – Display Text • In order to achieve normalization of headings in 14.2, the headings themselves must be changed to the same form. • The only exception is the suppression of end punctuation, specified in tab00.lng (tab00.eng in the example), col. 4: • 0 - no suppression • 1 or space - suppress punctuation at the end of each sub-field when creating a Z01 heading.

  16. Z01 – Display Text Normalization (Diagram: Bibliographic document 1, Bibliographic document 2, tab00.lng, two Z01 headings.)

  17. Z01 – Display Text Normalization (Diagram: Bibliographic document 1, Bibliographic document 2, tab00.lng, one Z01 heading.) NOTE: Version 15 allows more advanced normalization of headings.

  18. Normalization of Headings – Cataloger’s Assistant • Detect Similar Headings (p_manage_26) reports headings which differ in display text only, i.e. headings which are the same except for punctuation and case differences. • [Figure: example of output file]

  19. Normalization of Headings – Cataloger’s Assistant • Correct allows you to: • Discover inconsistencies • Change bibliographic documents without going to the Cataloguing module • Reindex the documents (creates Z07)

  20. Filing of Headings • Headings are filed (organized in the index, sorted) according to the filing text of the heading. • Data for the filing text field is processed in two ways: • Text goes through the appropriate filing routine. • Characters go through character conversion.

  21. Filing Routines • From version 14, the filing routines are made up of a group of individual procedures. • Filing routines are defined in tab_filing.

  22. tab_filing - Structure
      1  2 3                    4
      !!-!-!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!>
      01 # compress             ’
      01 # char_conv            FILING-KEY-01
      • Col.1: procedure identifier • Col.2: alpha of the text • Col.3: procedure name • Col.4: procedure parameters
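A fixed-width layout like this can be read by deriving column positions from the ruler line. The following is a minimal Python sketch, assuming the ruler shown on the slide marks the real column boundaries and that each "-" is a one-character gap. The function names and dictionary keys are hypothetical and are not part of ALEPH.

```python
RULER = "!!-!-!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!>"
COLUMNS = ("identifier", "alpha", "procedure", "parameters")


def column_slices(ruler):
    """Turn the ruler into (start, end) offsets, one pair per column."""
    slices, start = [], 0
    for part in ruler.split("-"):
        slices.append((start, start + len(part)))
        start += len(part) + 1  # skip the "-" separator
    return slices


def parse_line(line, ruler=RULER):
    """Split one table line into its four columns."""
    slices = column_slices(ruler)
    values = [line[a:b].strip() for a, b in slices[:-1]]
    values.append(line[slices[-1][0]:].strip())  # last column runs to end of line
    return dict(zip(COLUMNS, values))


def format_line(identifier, alpha, procedure, parameters, ruler=RULER):
    """Build a line padded to the ruler's column widths (used for the demo)."""
    widths = [b - a for a, b in column_slices(ruler)[:-1]]
    fixed = [v.ljust(w) for v, w in zip((identifier, alpha, procedure), widths)]
    return " ".join(fixed + [parameters])


sample = format_line("01", "#", "char_conv", "FILING-KEY-01")
print(parse_line(sample))
```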

  23. Examples of Filing Procedures • compress: Strips characters listed in col. 4 (e.g., ()[]:,). • delete_subfield: Changes subfield sign to blank (e.g., $$x). • to_blank: Changes characters listed in col. 4 to blanks.

  24. Examples of Filing Procedures • to_lower: Changes all characters to lower case. • to_carat: Changes subfield sign to two caret (^^) signs in order to achieve hierarchical sorting of headings. • suppress: Suppresses all text contained within <<…>>, as well as the signs themselves.
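Several of the procedures named so far are simple string transformations. The Python functions below are a rough sketch of the described behaviour only. They are not the ALEPH implementation, and the parameter handling is an assumption.

```python
import re


def compress(text, chars):
    """Strip every character listed in chars (col. 4 of tab_filing)."""
    return "".join(c for c in text if c not in chars)


def to_blank(text, chars):
    """Replace every character listed in chars with a blank."""
    return "".join(" " if c in chars else c for c in text)


def to_lower(text):
    """Change all characters to lower case."""
    return text.lower()


def suppress(text):
    """Drop text enclosed in <<...>>, including the signs themselves."""
    return re.sub(r"<<.*?>>", "", text)


print(compress("Title : (a subtitle)", "()[]:,"))
print(to_blank("rock-and-roll", "-"))
print(to_lower(suppress("<<The >>Hobbit")))
```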

  25. Examples of Filing Procedures • expand_num: For filing numbers numerically, adds leading zeroes to numbers to a fixed length of 7 (e.g. 17 -> 0000017). • mc_to_mac: Changes initial “mc” to “mac” (for interfiling McKay and MacKay). • non_filing: Suppresses initial text according to the non-filing indicator defined in tab11.
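The effect of expand_num and mc_to_mac on sort order can be shown with a short Python sketch. It assumes that mc_to_mac applies to the start of an already lower-cased heading and that numbers longer than seven digits are left alone; neither detail is stated in the slides.

```python
import re


def expand_num(text, width=7):
    """Pad every run of digits with leading zeroes up to `width` digits."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), text)


def mc_to_mac(text):
    """Change an initial 'mc' to 'mac' (expects lower-cased text)."""
    return "mac" + text[2:] if text.startswith("mc") else text


volumes = ["vol. 17", "vol. 2", "vol. 100"]
print(sorted(volumes))                  # plain text order
print(sorted(volumes, key=expand_num))  # numeric order

names = ["McKay", "Madison", "MacKay"]
print(sorted(names, key=str.lower))
print(sorted(names, key=lambda n: mc_to_mac(n.lower())))
```

With the filing keys applied, "vol. 2" sorts before "vol. 17", and McKay files next to MacKay instead of after Madison.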

  26. Examples of Filing Procedures • compress_blank: Strips blanks (e.g. ISBN). • numbers: Compresses a comma and a dot between numbers (e.g., 2,153 changes to 2153). • non_numeric: Deletes all non-numeric characters (e.g. for ISSN).
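These three procedures map naturally onto regular expressions. The sketch below is a Python approximation of the descriptions above, not the actual ALEPH code, and the sample identifiers are made up for illustration.

```python
import re


def compress_blank(text):
    """Strip all blanks."""
    return text.replace(" ", "")


def numbers(text):
    """Remove a comma or dot that sits directly between two digits."""
    return re.sub(r"(?<=\d)[.,](?=\d)", "", text)


def non_numeric(text):
    """Delete every character that is not a digit."""
    return re.sub(r"\D", "", text)


print(compress_blank("0 306 40615 2"))
print(numbers("2,153 copies, more or less"))
print(non_numeric("0028-0836"))
```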

  27. Examples of Filing Procedures • abbreviation: Compresses a dot between single characters (e.g., I. B. M. changes to I B M, I.B.M. changes to IBM). • build_filing_key_lc_call_no: Special procedure for correct sequencing of LC call numbers.
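One way to reproduce the two abbreviation examples is a single regular expression that removes a dot following a one-character word. This Python sketch is an assumption about the rule, inferred only from the examples on the slide.

```python
import re


def abbreviation(text):
    """Remove the dot that follows a single-character word."""
    return re.sub(r"\b(\w)\.", r"\1", text)


print(abbreviation("I. B. M."))
print(abbreviation("I.B.M."))
print(abbreviation("Smith, J. etc."))  # the dot after "etc" is kept
```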

  28. Examples of Filing Procedures • char_conv: Performs character conversion. • Characters can be: - filed as themselves - ignored - converted to spaces or to one or more different characters. • Examples: • ü (00FC) → ue (0075 0065) or u (0075) • & (0026) → and (0041 004E 0044)
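The three outcomes listed for char_conv (kept, ignored, converted) can be mimicked with Python's `str.translate`. The table below is a hypothetical stand-in for a unicode_to_filing_nn table. The first two entries use the code points quoted on the slide; the hyphen and apostrophe entries are invented for the example.

```python
# Code point -> replacement string. Characters not listed are filed as themselves.
TABLE = {
    0x00FC: "\u0075\u0065",        # ü -> ue
    0x0026: "\u0041\u004E\u0044",  # & -> AND (the code points given on the slide)
    0x002D: " ",                   # hyphen converted to a space (invented entry)
    0x0027: "",                    # apostrophe ignored (invented entry)
}


def char_conv(text, table=TABLE):
    """Apply a per-character conversion table to the text."""
    return text.translate(table)


print(char_conv("Müller & O'Neil-Smith"))
```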

  29. Examples of Filing Procedures – Character Conversion • tab_filing: 01 # char_conv FILING-KEY-01 • $alephe_unicode/tab_character_conversion_line: • FILING-KEY-01 ##### # line_utf2line_sb unicode_to_filing_01 • FILING-KEY-02 ##### # line_utf2line_sb unicode_to_filing_02 • FILING-KEY-03 ##### # line_utf2line_sb unicode_to_filing_03 • $alephe_unicode/

  30. Character Conversion Tables • unicode_to_filing_nn is the one actually used by the index creation process. • unicode_to_filing_nn_source is the raw material, the ‘human interface’ for character conversion definitions. All the editing has to be done in this table. • Process unicode_to_filing_nn_source using UTIL P/3 in order to create unicode_to_filing_nn. • UTIL P/3 performs an additional translation in order to remove null characters.

  31. IMPORTANT NOTE • The procedures must be listed in a logical order. • For example, the following setup is not logical: to_blank (changes characters specified in col. 4 to blank) is listed before numbers (compresses a comma and a dot between numbers). • ‘2,153’ has to be turned into ‘2153’ by numbers. • But here, it will first be changed to ‘2 153’ by to_blank.
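The ordering problem described on this slide is easy to reproduce. The Python sketch below redefines two toy procedures and runs them in both orders; the `run` helper and the function bodies are illustrative assumptions, not ALEPH code.

```python
import re


def to_blank(text, chars=","):
    """Replace every character listed in chars with a blank."""
    return "".join(" " if c in chars else c for c in text)


def numbers(text):
    """Remove a comma or dot that sits directly between two digits."""
    return re.sub(r"(?<=\d)[.,](?=\d)", "", text)


def run(text, procedures):
    """Apply the procedures in the order in which they are listed."""
    for procedure in procedures:
        text = procedure(text)
    return text


illogical = run("2,153", [to_blank, numbers])  # comma is blanked before numbers sees it
logical = run("2,153", [numbers, to_blank])
print(repr(illogical), repr(logical))
```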

  32. Filing of Headings – Putting it Together… (Diagram: tab00.lng → tab_filing → tab_character_conversion_line (FILING-KEY-01 ##### # line_utf2line_sb unicode_to_filing_01) → unicode_to_filing_01)

  33. Index Creation and Update • The headings index is: • Created by p_manage_02 • Enriched by ue_08 • Updated by ue_01 • Note: In the authority libraries the headings are created when the document is updated, before ue_01 indexes it.

  34. Maintenance of the Browse Index: • Alphabetize long headings • Resequencing • Delete unlinked headings

  35. What are Long Headings? • z01-filing-sequence = 69* characters • z01-display-text = 2000 characters • * “Effective” length = 34 characters with double-byte • p_manage_17 (Alphabetize Long Headings) sorts those headings whose display text is longer than 69 characters.
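Why long headings need a separate sorting pass can be illustrated with a fixed-length sort key. This Python sketch only demonstrates the general problem of a truncated key; how p_manage_17 actually resolves the order is not described in the slides, and the names below are hypothetical.

```python
FILING_KEY_LENGTH = 69  # length of z01-filing-sequence quoted above


def truncated_key(filing_text):
    """Keep only the part of the filing text that fits the fixed-length key."""
    return filing_text[:FILING_KEY_LENGTH]


prefix = "a" * FILING_KEY_LENGTH
headings = [prefix + " zebra", prefix + " apple"]

by_key = sorted(headings, key=truncated_key)  # keys tie, so input order is kept
by_full_text = sorted(headings)               # full text puts "apple" first

print(by_key == headings, by_full_text[0].endswith("apple"))
print(FILING_KEY_LENGTH // 2)  # two bytes per character leaves 34 whole characters
```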

  36. Alphabetize Long Headings • Before p_manage_17… • After p_manage_17…

  37. Alphabetize Long Headings – How does it work? • util-g-2 counters: • START: last-acc-number (last heading (Z01) indexed by p_manage_02 or ue_01) • FINISH: last-long-acc-number (last heading (Z01) processed by p_manage_17)

  38. When to run p_manage_17? • p_manage_17 must be run periodically (e.g. daily) in order to alphabetize long headings that were added since the last time this function was run.

  39. If the rules for filing text creation have been changed… • Run p_manage_16 (Alphabetize Headings - Setup) • p_manage_16 recreates the filing text

  40. Unlinked Headings • What are unlinked headings? • These are headings which do not have pointers to documents (Z01s without corresponding Z02s). • How are unlinked headings created? • When a heading is modified, the existing Z01 is NOT updated. Instead, the Z02 record linking the heading to the bib record is deleted and a NEW Z01 record with a new Z02 is created. Thus, “orphaned”, outdated Z01s can accumulate.

  41. Unlinked Headings • How to delete unlinked headings? • Run p_manage_15 (Delete Unlinked Headings) periodically. • NOTE: The job does not delete Z01 records which have an authority link. This is in order to keep the cross-references, which are not linked to the documents directly (do not have attached Z02 records).
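The life cycle of an unlinked heading can be traced with a small Python model. This is a sketch under stated assumptions: the dictionary layout, the `authority_link` flag, and all three function names are hypothetical, and the real p_manage_15 job works on Oracle tables rather than in-memory structures.

```python
def modify_heading(z01, z02, doc, old_id, new_text):
    """Mimic the update described above: drop the old Z02, add a new Z01 + Z02."""
    z02.remove((old_id, doc))
    new_id = max(z01) + 1
    z01[new_id] = {"text": new_text, "authority_link": False}
    z02.append((new_id, doc))
    return new_id


def unlinked_headings(z01, z02):
    """Heading ids that have no Z02 pointer to any document."""
    linked = {heading_id for heading_id, _doc in z02}
    return sorted(h for h in z01 if h not in linked)


def delete_unlinked(z01, z02):
    """Delete unlinked headings, but keep those that carry an authority link."""
    removed = [h for h in unlinked_headings(z01, z02) if not z01[h]["authority_link"]]
    for heading_id in removed:
        del z01[heading_id]
    return removed


z01 = {
    1: {"text": "Twain, Mark", "authority_link": False},
    2: {"text": "Clemens, Samuel", "authority_link": True},  # cross-reference, no Z02
}
z02 = [(1, 100)]

modify_heading(z01, z02, doc=100, old_id=1, new_text="Twain, Mark, 1835-1910")
orphans = unlinked_headings(z01, z02)
removed = delete_unlinked(z01, z02)
print(orphans, removed, sorted(z01))
```

Heading 1 becomes an orphan after the modification and is deleted, while the cross-reference (heading 2) survives because of its authority link.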

  42. Performance Issues

  43. Performance Issues • In order to display the browse list, the system must count the documents which are connected to a heading (Z02 records attached to Z01).

  44. Performance Issues • Pre-14.2 ALEPH: p_manage_10 updates Z01 (z01_number_of_doc) with the number of documents available for each heading. • 14.2 and higher: the system allows extensive use of base and denied records (per user profile) functionality. That is why the browse list can benefit from being pre-filtered. The system counts the number of records on the fly.

  45. Performance Issues • How to speed up the Z02 count when the headings are displayed? • Count limit: a heading with more records than the number defined in this counter will display with + rather than the number itself.
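The saving from a count limit comes from stopping the count early. The Python sketch below shows the idea; the function name is hypothetical, and rendering the capped value as the limit followed by "+" is an assumption, since the slide only says that a + is displayed.

```python
from itertools import islice


def display_count(pointers, limit):
    """Count Z02 pointers, but stop looking once the limit is exceeded."""
    seen = sum(1 for _ in islice(pointers, limit + 1))
    return f"{limit}+" if seen > limit else str(seen)


print(display_count(iter(range(5)), limit=10))
print(display_count(iter(range(1_000_000)), limit=10))  # reads only 11 items
```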

  46. Base Filtered Headings (Z0102)

  47. Z0102 • Pre-14.2 problem: The smaller a logical base is, the more work the system has to do in order to find 20 headings which are in the base to show in the Browse list display. • Solution: There is a new index, Z0102, which ‘divides’ Z01 into sections in accordance with the existing logical bases.

  48. Z0102 • [Figure: example of Z0102 record]

  49. Z0102 Structure • A Z0102 record is built for each Z01 in a logical base, giving the filing text and sequence. • The record does not include pointers to the doc records; this is still done by Z02. (Diagram: Z01, Z0102.)

  50. Z0102 • When a logical base is being browsed, the system uses the Z0102 table to “decide” whether to display the heading (Z01) without having to retrieve the documents attached to the heading, read them, and then “decide”.
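The difference between the two browse strategies can be sketched in Python. Both functions and their data layouts are hypothetical: the first scans Z01 and inspects the attached documents for every heading, while the second reads a per-base list that plays the role of Z0102.

```python
from bisect import bisect_left


def browse_by_scanning(z01, z02, docs_in_base, start_text, page_size=20):
    """Without Z0102: walk Z01 and check each heading's documents."""
    page = []
    for filing_text, heading_id in z01:  # z01 is sorted by filing text
        if filing_text < start_text:
            continue
        if any(doc in docs_in_base for doc in z02.get(heading_id, ())):
            page.append((filing_text, heading_id))
            if len(page) == page_size:
                break
    return page


def browse_with_z0102(z0102_by_base, base, start_text, page_size=20):
    """With Z0102: the per-base list already holds only the base's headings."""
    entries = z0102_by_base[base]  # sorted list of (filing_text, heading_id)
    pos = bisect_left(entries, (start_text,))
    return entries[pos:pos + page_size]


z01 = [("algebra", 1), ("contracts", 3), ("poetry", 5), ("torts", 7)]
z02 = {1: [10], 3: [20], 5: [30], 7: [20, 40]}
law_docs = {20, 40}
z0102_by_base = {"LAW": [("contracts", 3), ("torts", 7)]}

print(browse_by_scanning(z01, z02, law_docs, "d"))
print(browse_with_z0102(z0102_by_base, "LAW", "d"))
```

Both calls return the same page, but the second never touches Z02 or the documents.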
