This document provides a comprehensive overview of the SRU (Search and Retrieve via URL) and Lucene integration developed by Ralph LeVan. It details the functionalities of a simple web service that supports RESTful and SOAP requests, returning XML records for search and retrieval. Key features include a standard query grammar, self-configuring clients, and persistent result sets. The integration leverages Lucene's indexing capabilities, providing detailed descriptions of metadata, database information, and example configurations. Resources for further implementation and examples are also included.
SRU and Lucene Ralph LeVan Research Scientist levan@oclc.org
SRU Overview • A Simple Web Service • Supports REST-ful and SOAP requests • Responses are always XML records • Supports Search and Retrieve • Uses a Standard Query Grammar • Supports Self-Configuring Clients • A Gateway to Local Databases
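As a rough illustration of the REST-ful side, an SRU search is an ordinary HTTP GET whose query string carries the operation, version and CQL query. The sketch below only builds such a URL. The parameter names (operation, version, query, maximumRecords) come from the SRU 1.1 specification; the base URL in main() is an assumption about where a local install might live, not something stated in these slides.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class SruUrlSketch {
    /** Builds an SRU searchRetrieve URL; every parameter value is URL-encoded. */
    static String searchUrl(String baseUrl, String cql, int maximumRecords) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("operation", "searchRetrieve");
        params.put("version", "1.1");
        params.put("query", cql);
        params.put("maximumRecords", Integer.toString(maximumRecords));

        StringJoiner queryString = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            queryString.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + "?" + queryString;
    }

    public static void main(String[] args) {
        // Hypothetical base URL: host, port and path are assumptions for illustration.
        String base = "http://localhost:8080/SRWLucene/search/LuceneDemoDB";
        System.out.println(searchUrl(base, "contents = lucene", 10));
    }
}
```

The server answers such a request with an XML searchRetrieveResponse, which a browser can render through the configured stylesheet.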
SRU Features • Explain Records • CQL Query Grammar • Persistent Result Sets • XML Database Records Returned • Index Browses • Stylesheets
Explain Records • serverInfo • databaseInfo • metaInfo • indexInfo • schemaInfo • configInfo
serverInfo • Generated Automatically • host • port • database
databaseInfo
• From SRWDatabase.props
  • databaseInfo.title
  • databaseInfo.description
  • databaseInfo.contact
• Provided Automatically
  • implementation
metaInfo • From SRWDatabase.props • metaInfo.dateModified • metaInfo.aggregatedFrom • metaInfo.dateAggregated
indexInfo
• Generated Automatically
  • “local” index set and Lucene index names
• From SRWDatabase.props
  • qualifier.<indexSet>.<indexName>=<LuceneIndexName>
  • Used only if you want to map other index names to your Lucene indexes (e.g. qualifier.dc.identifier=id)
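The qualifier mapping can be pictured as a simple lookup. The following is a minimal sketch, not the SRW implementation: it reads qualifier.* entries with java.util.Properties and resolves a CQL index name to a Lucene field name. The fallback rule (strip the index-set prefix and use the rest as the Lucene field) is an assumption made for illustration, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class IndexMappingSketch {
    /** Parses .props text held in a string (stands in for reading SRWDatabase.props). */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the Lucene field name to search for a CQL index such as "dc.identifier". */
    static String luceneFieldFor(Properties props, String cqlIndex) {
        String mapped = props.getProperty("qualifier." + cqlIndex);
        if (mapped != null) {
            return mapped.trim();
        }
        // Assumed fallback: drop the index-set prefix and use the bare name as the field.
        int dot = cqlIndex.indexOf('.');
        return dot < 0 ? cqlIndex : cqlIndex.substring(dot + 1);
    }

    public static void main(String[] args) {
        Properties props = parse("qualifier.dc.identifier=id\n");
        System.out.println(luceneFieldFor(props, "dc.identifier"));  // prints "id"
        System.out.println(luceneFieldFor(props, "local.contents")); // prints "contents"
    }
}
```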
schemaInfo
• Generated Automatically
  • LuceneDocument
• From SRWDatabase.props
  • xmlSchemas=<list of names>
  • <schemaName>.identifier=
  • <schemaName>.location=
  • <schemaName>.namespace=
  • <schemaName>.title=
  • [<schemaName>.transformer=]
  • [<schemaName>.resolver=]
schemaInfo Example
xmlSchemas=LuceneDocument, DC
LuceneDocument.identifier=info:srw/schema/1/LuceneDocument
LuceneDocument.location=http://www.oclc.org/standards/Lucene/schema/LuceneDocument.xsd
LuceneDocument.namespace=http://www.oclc.org/LuceneDocument
LuceneDocument.title=Lucene Demo Database records in their internal format
schemaInfo Example (cont.)
DC.identifier=info:srw/schema/1/dc-v1.1
DC.location=http://www.loc.gov/zing/srw/dc-schema.xsd
DC.title=DC: Dublin Core Elements
DC.transformer=LuceneToDC.xsl
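To make the shape of these entries concrete, here is a small sketch of how a program could read them: split xmlSchemas into names, then collect whichever per-schema keys are present. It uses only java.util.Properties. The class and method names are hypothetical and say nothing about how SRW itself parses the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

class SchemaInfoSketch {
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Splits the comma-separated xmlSchemas value into trimmed schema names. */
    static List<String> schemaNames(Properties props) {
        List<String> names = new ArrayList<>();
        for (String name : props.getProperty("xmlSchemas", "").split(",")) {
            if (!name.trim().isEmpty()) {
                names.add(name.trim());
            }
        }
        return names;
    }

    /** Collects the <schemaName>.* entries that are actually set; optional ones may be absent. */
    static Map<String, String> describe(Properties props, String schemaName) {
        String[] suffixes = {"identifier", "location", "namespace", "title", "transformer", "resolver"};
        Map<String, String> info = new LinkedHashMap<>();
        for (String suffix : suffixes) {
            String value = props.getProperty(schemaName + "." + suffix);
            if (value != null) {
                info.put(suffix, value.trim());
            }
        }
        return info;
    }

    public static void main(String[] args) {
        Properties props = parse(
            "xmlSchemas=LuceneDocument, DC\n"
            + "DC.identifier=info:srw/schema/1/dc-v1.1\n"
            + "DC.transformer=LuceneToDC.xsl\n");
        for (String name : schemaNames(props)) {
            System.out.println(name + " -> " + describe(props, name));
        }
    }
}
```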
configInfo
• Generated Automatically
  • maximumRecords (20)
  • numberOfRecords (10)
  • resultSetTTL (300)
• From SRWDatabase.props
  • configInfo.maximumRecords
  • configInfo.numberOfRecords
  • configInfo.resultSetTTL
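One way to read this slide is "use the built-in value unless SRWDatabase.props overrides it". The sketch below shows that lookup pattern, assuming the numbers in parentheses (20, 10, 300) are the defaults applied when no configInfo.* entry is present. The helper name is hypothetical.

```java
import java.util.Properties;

class ConfigInfoSketch {
    /** Reads configInfo.<name> as an int, falling back to the given default when unset. */
    static int intSetting(Properties props, String name, int defaultValue) {
        String raw = props.getProperty("configInfo." + name);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("configInfo.resultSetTTL", "600"); // override one value only

        // Defaults below are taken from the slide; treating them as fallbacks is an assumption.
        System.out.println(intSetting(props, "maximumRecords", 20));  // prints 20
        System.out.println(intSetting(props, "numberOfRecords", 10)); // prints 10
        System.out.println(intSetting(props, "resultSetTTL", 300));   // prints 600
    }
}
```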
CQL Query Grammar
• Builtin: BasicLuceneQueryTranslator
• CqlQueryTranslator
  • Query makeQuery(CQLNode cn);
  • Term getTerm();
• From SRWDatabase.props
  • SRWLuceneDatabase.CqlToLuceneQueryTranslator=<ClassName>
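A real translator receives a parsed CQLNode and returns a Lucene Query object; neither type is reproduced here. As a toy stand-in only, the sketch below turns a single CQL search clause (index plus term) into Lucene query-parser syntax as a plain string, to show the kind of mapping a translator performs. All names are hypothetical, and boolean operators, relations other than equality and Lucene special characters are deliberately ignored.

```java
import java.util.Map;

class CqlClauseSketch {
    /**
     * Toy translation of one clause, e.g. dc.identifier = 42, into "field:term".
     * indexToField plays the role of the qualifier.* mappings; unmapped indexes are used as-is.
     */
    static String toLuceneSyntax(String index, String term, Map<String, String> indexToField) {
        String field = indexToField.getOrDefault(index, index);
        // Escape backslashes and quotes so the term is safe inside a quoted phrase.
        String escaped = term.replace("\\", "\\\\").replace("\"", "\\\"");
        // Multi-word terms are emitted as a quoted phrase.
        boolean phrase = term.chars().anyMatch(Character::isWhitespace);
        return field + ":" + (phrase ? "\"" + escaped + "\"" : escaped);
    }

    public static void main(String[] args) {
        Map<String, String> mapping = Map.of("dc.identifier", "id");
        System.out.println(toLuceneSyntax("dc.identifier", "42", mapping));       // prints id:42
        System.out.println(toLuceneSyntax("contents", "query parser", mapping)); // prints contents:"query parser"
    }
}
```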
Persistent Result Sets • Builtin: LuceneQueryResult
XML Database Records
• Builtin: BasicLuceneRecordResolver
• RecordResolver
  • void init(Properties props);
  • Record resolve(Document doc, String IdFieldName, ExtraDataType extraDataType)
• From SRWDatabase.props
  • <schemaName>.resolver=<ClassName>
  • SRWLuceneDatabase.idFieldName=<FieldName>
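Conceptually, a resolver takes the stored fields of a matching document and hands back an XML record. The sketch below illustrates that idea with stand-ins only: a Map replaces the Lucene Document, a String replaces the Record type, and the element layout (record/field) is invented for the example rather than taken from the LuceneDocument schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RecordXmlSketch {
    /** Escapes the characters that are unsafe in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Serialises field name/value pairs as a flat XML record (hypothetical layout). */
    static String toXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<record>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue()))
               .append("</field>");
        }
        return xml.append("</record>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "42");
        fields.put("contents", "a < b & c");
        System.out.println(toXml(fields));
    }
}
```

A transformer such as LuceneToDC.xsl could then convert a record of this kind into another schema.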
Index Browses • Builtin: SRWLuceneDatabase.getTerms()
Stylesheets • From SRWDatabase.props • explainStyleSheet=/SRW/explainResponse.xsl • scanStyleSheet=/SRW/scanResponse.xsl • searchStyleSheet=/SRW/searchRetrieveResponse.xsl
Making the Magic Happen • Drop the SRWLucene.war into your <tomcat>/webapps directory • Restart Tomcat • Edit <tomcat>/webapps/SRWLucene/WEB-INF/classes/SRWServer.props • Restart Tomcat
Sample SRWServer.props
db.LuceneDemoDB.class=ORG.oclc.os.SRW.Lucene.SRWLuceneDatabase
db.LuceneDemoDB.home=f:/lucene-2.0.0
db.LuceneDemoDB.configuration=SRWDatabase.props
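The db.<name>.* key pattern means the file itself names the databases it defines. As a quick sanity check before restarting Tomcat, something like the sketch below could list them. It relies only on java.util.Properties and the key pattern visible in the sample; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class ServerPropsSketch {
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns every <name> for which a db.<name>.class entry exists. */
    static Set<String> databaseNames(Properties props) {
        String prefix = "db.";
        String suffix = ".class";
        Set<String> names = new TreeSet<>();
        for (String key : props.stringPropertyNames()) {
            boolean longEnough = key.length() > prefix.length() + suffix.length();
            if (longEnough && key.startsWith(prefix) && key.endsWith(suffix)) {
                names.add(key.substring(prefix.length(), key.length() - suffix.length()));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Properties props = parse(
            "db.LuceneDemoDB.class=ORG.oclc.os.SRW.Lucene.SRWLuceneDatabase\n"
            + "db.LuceneDemoDB.home=f:/lucene-2.0.0\n"
            + "db.LuceneDemoDB.configuration=SRWDatabase.props\n");
        for (String name : databaseNames(props)) {
            System.out.println(name + " home=" + props.getProperty("db." + name + ".home"));
        }
    }
}
```

Note that java.util.Properties treats a backslash as an escape character, so a Windows path is safest written with forward slashes, as in the sample.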
Sample SRWDatabase.props
databaseInfo.title=Lucene Demo Database
databaseInfo.description=An index of the source code for Lucene
databaseInfo.contact=Ralph LeVan (levan@oclc.org)
qualifier.cql.serverChoice=contents
explainStyleSheet=/SRWLucene/explainResponse.xsl
scanStyleSheet=/SRWLucene/scanResponse.xsl
searchStyleSheet=/SRWLucene/searchRetrieveResponse.xsl
Resources • http://www.oclc.org/research/software/srw • http://staff.oclc.org/~levan/SRWLuceneSource.jar • http://staff.oclc.org/~levan/SRWLucene.war • http://staff.oclc.org/~levan/Implementing%20an%20SRWLuceneDatabase.doc • http://staff.oclc.org/~levan/SRU%20and%20Lucene.ppt • http://alcme.oclc.org/srw/SRUServerTester.html