FRBRization of European Catalogues: challenges and some solutions • Trond Aalberg, Norwegian University of Science and Technology (NTNU) • Workshop on FRBR in The European Library, 9 October 2008, National Library of Portugal, Lisbon, Portugal
Overview • FRBRization? • FRBR and new requirements for bibliographic information • Challenges, problems and possibilities • With some examples
FRBRization • Catchy term for "the FRBR model applied to existing bibliographic information" • Converting existing bibliographic information • Or just interpreting it at run time • Different levels of ambition: • Following the FRBR model or just FRBR-inspired • User interface only: presenting search results and allowing users to navigate along the axes of FRBR relationships • Data model that implements (part of) the FRBR model
Cross-catalogue FRBRization • FRBRization is even more relevant in a broader context: • Reuse of information across catalogues • As a framework for portals: integrated access to multiple catalogues or cross-domain integration • Novel, explorative user interfaces • In Europe • Diversity in language, format and cataloguing practice
What FRBR is really about • Emphasis on "content" and the documentation of intellectual/artistic endeavour • What are the works and expressions in this product? • Who are the actors and how do they relate to the expressions and works? • It's like drawing a map... • More consistently structured bibliographic information • That can be processed and interpreted, not only searched and displayed
Our focus • Conceptual models are ideal solutions • "This is where we want to go" objective • But how do we get there? • Existing bibliographic information is a valuable asset • One of the problems for future implementations of FRBR will be compatibility with already created information • Identification of entities and relationships • Experimenting with different rules, algorithms etc. • Gathering statistics and evaluating the results • Looking for solutions
Our experience so far... • Based on FRBRization of different collections • BIBSYS (Norwegian catalogue, BIBSYSMARC) • The Slovenian National Bibliography (UNIMARC) • BTJ (Swedish catalogue, MARC 21) • Different catalogues, different formats, different practices • Many catalogue-specific rules are needed • A certain level of FRBRization is easy to achieve • For "richer" FRBRization there are a number of common problems related to the poor structuring capabilities of the MARC formats
Persons and Corporate Bodies • Persons and Corporate Bodies are usually easy to identify • Specific fields for these entities • Duplicate entities are a frequent problem • Despite the use of authority control • Relator codes are needed to associate persons and corporate bodies with the correct kind of product entity • For records with multiple persons and multiple works/expressions it is often difficult to set up the correct relationships...
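The role of relator codes can be sketched in a few lines. The codes below (`aut`, `cmp`, `trl`, `nrt`, `pbl`) are taken from the MARC relator code list, but the mapping to FRBR entity levels and the function name `entity_for_relator` are illustrative assumptions, not rules given in the talk.

```python
# Hypothetical mapping: which FRBR entity a person or corporate body
# should be attached to, given the MARC relator code in subfield $4.
RELATOR_TO_ENTITY = {
    "aut": "work",           # author
    "cmp": "work",           # composer
    "trl": "expression",     # translator
    "nrt": "expression",     # narrator
    "pbl": "manifestation",  # publisher
}


def entity_for_relator(code):
    """Return the assumed FRBR entity level for a relator code.

    Returns None for unknown codes, so the caller has to decide what
    to do when the record gives no usable role information.
    """
    return RELATOR_TO_ENTITY.get(code.strip().lower())


print(entity_for_relator("trl"))
```

When the relator code is missing, as it often is, a lookup like this has nothing to work with, which is the difficulty the slide describes.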
Works and Expressions • Works can be identified by titles and associated creators (if applicable) • Major challenge is to find and select the title, identify multiple works, ... • Problems related to the identification of persons are "inherited" • Expressions can be identified by the work they are associated with and additional expression-level information • Typical problems • Lack of original title/uniform title when the title statement is inappropriate • Often inconsistent practice for work titles within and across catalogues
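Identifying a work by title and creator can be sketched as building a normalized comparison key. This is a minimal sketch under assumptions: the helper names `normalize` and `work_key` are hypothetical, and the normalization (dropping diacritics, case and punctuation) is one possible choice that may conflate titles a cataloguer would keep apart.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, drop diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^\w\s]", " ", stripped.lower())
    return " ".join(cleaned.split())


def work_key(title, creator=None):
    """Build a comparison key from the creator (if any) and the title."""
    parts = [normalize(creator)] if creator else []
    parts.append(normalize(title))
    return "|".join(parts)


# Two spellings of the same heading end up with the same key.
print(work_key("Det slutna rummet.", "Wahlöö, Per,"))
print(work_key("DET SLUTNA RUMMET", "Wahlöö, Per"))
```

A key like this only helps when a suitable title is present in the record. It does nothing for the typical problems listed above, such as a missing uniform title.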
Not always easy...
100 1  $a Sjöwall, Maj, $d 1935-
240 14 $a Den vedervärdige mannen från Säffle. $l Tyska
245 14 $a Das Ekel aus Säffle ; $b Verschlossen und verriegelt : zwei Romane / $c Maj Sjöwall, Per Wahlöö
260    $a Erftstadt : $b Area, $c 2006
300    $a 639 s.
500    $a Den vedervärdige mannen från Säffle / ... in der deutschen übersetzung von Eckerhard Schultz -- Det slutna rummet / ... in der deutschen übersetzung von Hans-Joachim Maass
700 12 $a Sjöwall, Maj, $d 1935-. $t Det slutna rummet. $l Tyska
700 12 $a Wahlöö, Per, $d 1926-1975. $t Det slutna rummet. $l Tyska
700 1  $a Schultz, Eckehard $4 trl
700 1  $a Maass, Hans-Joachim $4 trl
700 12 $a Wahlöö, Per, $d 1926-1975. $t Den vedervärdige mannen från Säffle. $l Tyska
740 4  $a Det slutna rummet
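The record above contains two works, each with two authors, spread over the 100/240 pair and several 700 author/title entries. A minimal sketch of how such fields might be gathered into works follows. The functions `parse_field` and `group_works` are hypothetical, the parser assumes that "$" never occurs inside a subfield value, and field 100 is assumed to precede field 240.

```python
import re

# Lines copied from the record above (textual MARC display).
RECORD = [
    "100 1 $a Sjöwall, Maj, $d 1935-",
    "240 14 $a Den vedervärdige mannen från Säffle. $l Tyska",
    "700 12 $a Sjöwall, Maj, $d 1935-. $t Det slutna rummet. $l Tyska",
    "700 12 $a Wahlöö, Per, $d 1926-1975. $t Det slutna rummet. $l Tyska",
    "700 1 $a Schultz, Eckehard $4 trl",
    "700 12 $a Wahlöö, Per, $d 1926-1975. $t Den vedervärdige mannen från Säffle. $l Tyska",
]


def parse_field(line):
    """Split one display line into (tag, {subfield code: value})."""
    tag, rest = line[:3], line[3:]
    pairs = re.findall(r"\$(\w)([^$]*)", rest)
    # Trailing ISBD punctuation is stripped so that values compare equal.
    return tag, {code: value.strip(" .,") for code, value in pairs}


def group_works(lines):
    """Collect creators per work title from 100/240 and 700 $a/$t fields."""
    works = {}
    main_creator = None
    for line in lines:
        tag, sf = parse_field(line)
        if tag == "100":
            main_creator = sf.get("a")
        elif tag == "240" and "a" in sf and main_creator:
            works.setdefault(sf["a"], set()).add(main_creator)
        elif tag == "700" and "t" in sf and "a" in sf:
            works.setdefault(sf["t"], set()).add(sf["a"])
    return works


for title, creators in group_works(RECORD).items():
    print(title, sorted(creators))
```

Even with the works grouped, the record does not say which translator (the `$4 trl` entries) belongs to which expression. That link appears only in the free-text 500 note.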
Manifestations • Each record describes a single manifestation • Manifestations can easily be identified by e.g. ISBN and/or title statement etc. • But different solutions are used for multi-volume publications • Record linking • Note fields • Linking fields
Major challenges for FRBRization • A number of techniques and a complex set of rules must be applied when interpreting records • Inspecting fields, subfields and even parsing the text in note fields • Interpreting relator codes • No single set of rules for all catalogues • Still struggling with the basic relationships... • Results must be evaluated and corrected • Equivalent entities have to be identified • Erroneously identified entities and relationships have to be removed
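The point that no single rule set fits all catalogues can be sketched as a per-format rule table. The field tags are the standard ones for MARC 21 (100, 240, 245) and UNIMARC (700, 500, 200). The table layout and the function `pick_work_title` are assumptions made for illustration, and BIBSYSMARC is left out rather than guessed at.

```python
# Hypothetical per-format rule table: which tag carries which piece of
# information needed to identify a work.
RULES = {
    "MARC21": {"creator": "100", "uniform_title": "240", "title": "245"},
    "UNIMARC": {"creator": "700", "uniform_title": "500", "title": "200"},
}


def pick_work_title(fields, fmt):
    """Prefer the uniform title; fall back to the title statement.

    `fields` maps a tag to its (already extracted) title string.
    """
    rules = RULES[fmt]
    return fields.get(rules["uniform_title"]) or fields.get(rules["title"])


marc21_fields = {
    "240": "Den vedervärdige mannen från Säffle",
    "245": "Das Ekel aus Säffle",
}
print(pick_work_title(marc21_fields, "MARC21"))
```

In practice the differences go beyond tag numbers, since cataloguing practice varies as well, so a real rule set would need catalogue-specific conditions on top of a table like this.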
What are the consequences? • The current (rather simple) interfaces are tolerant of errors and inconsistencies • The FRBR context adds new requirements to the data
The reason why
020    $a 0396070213 : $c $6.95
040    $a DLC $c DLC $d DLC
050 00 $a PZ3.C4637 $b Hh3 $a PR6005.H66
082 00 $a 823/.9/12
100 1  $a Christie, Agatha, $d 1890-1976.
245 10 $a Hercule Poirot's early cases / $c Agatha Christie.
260    $a New York : $b Dodd, Mead, $c [1974]
300    $a 250 p. ; $c 22 cm.
505 0  $a The affair at the victory ball.--The adventure of the Clapham cook. --The cornish mystery.--The adventure of Johnnie Waverly.--The double clue.--The king of clubs. --The Lemesurier inheritance.--The lost mine.--The Plymouth express.--The chocolate box. --The submarine plans.--The third-floor flat.--Double sin.--The market basing mystery. --Wasps' nest.--The veiled lady.--Problem at sea.--How does your garden grow?
650 0  $a Poirot, Hercule (Fictitious character) $x Fiction.
650 0  $a Private investigators $z England $x Fiction.
650 0  $a Detective and mystery stories, English.
984    $a gsl
991    $b c-GenColl $h PZ3.C4637 $i Hh3 $p 00022213155 $t Copy 1 $w BOOKS
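In this record the individual stories are listed only in the 505 contents note, so finding them means parsing free text. A minimal sketch, assuming that titles are separated by "--" and that a single hyphen never acts as a separator, is shown below. The function name `split_contents_note` is hypothetical.

```python
import re

# Shortened version of the 505 $a note from the record above.
NOTE = (
    "The affair at the victory ball.--The adventure of the Clapham cook. "
    "--The third-floor flat.--How does your garden grow?"
)


def split_contents_note(note):
    """Split a 505-style contents note into candidate work titles."""
    parts = re.split(r"\s*--\s*", note)
    # Drop the full stop that closes each entry; keep other punctuation.
    return [p.strip().rstrip(".").strip() for p in parts if p.strip()]


for title in split_contents_note(NOTE):
    print(title)
```

The resulting strings are only candidates. Nothing in the note says who wrote each story, or whether a title matches a work already known from another record.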
What quality can we achieve? • A large number of records have a "simple" FRBR structure • Single creator, published once... • The quality achieved for the more complex records is more questionable • But this is where FRBR is most needed • Errors and problems that users would never notice become very visible when FRBRizing
Concluding remarks • Is MARC sufficient for FRBR? • More structured information about expressions and works is possible even in MARC • Extensive use of relator codes is needed • Field linking (in MARC 21) could solve many of the problems caused by multiplicity • Can we automatically improve existing records? • By implementing more intelligent entity discovery solutions • By using information from other records/catalogues when interpreting a given record