Greenstone

clint

Presentation Transcript


  1. Greenstone: Building your own collection

  2. Overview • Installation • Usage • Building a collection

  3. What is Greenstone? A suite of software that serves digital library collections and builds new ones. It provides a new way of organizing information and publishing it on the Internet or on CD-ROM.

  4. Ways to find information
     • Searching
         • e.g., search for particular words in the text
         • “Full-text search”
         • Indexes built from different parts of the document
     • Browsing
         • e.g., browse documents by title
         • Involves lists, classification

  5. Metadata
     • Metadata are descriptive data associated with each document.
     • For example:
         <Metadata name="PictureN">boon.jpg</Metadata>
         <Metadata name="Height">137feet</Metadata>
         <Metadata name="Date">1852</Metadata>
         <Metadata name="State">Maine</Metadata>
         <Metadata name="Title">Boon Island Light</Metadata>
     • Can be used as a searchable index.
     • Used to generate the browsing structures (lists or hierarchical structures) through “classifiers”.
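As a rough illustration of how name/value metadata like this can be read by a program, here is a minimal Python sketch. The `<Description>` wrapper element and the `read_metadata` helper are assumptions added so the fragment parses as one XML document; they are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Metadata lines copied from the slide. The <Description> wrapper is an
# assumption, added only so the fragment is a single well-formed document.
FRAGMENT = """
<Description>
  <Metadata name="PictureN">boon.jpg</Metadata>
  <Metadata name="Height">137feet</Metadata>
  <Metadata name="Date">1852</Metadata>
  <Metadata name="State">Maine</Metadata>
  <Metadata name="Title">Boon Island Light</Metadata>
</Description>
"""

def read_metadata(xml_text):
    """Return a {name: value} dict with one entry per <Metadata> element."""
    root = ET.fromstring(xml_text)
    return {m.get("name"): (m.text or "").strip() for m in root.iter("Metadata")}

metadata = read_metadata(FRAGMENT)
print(metadata["Title"], "-", metadata["State"], metadata["Date"])
```

A dictionary like this is enough to drive either a searchable index (look up documents by value) or a browsing list (sort documents by one field).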

  6. Greenstone Document Format
     • XML format
         • Source documents are converted into a standard XML format by “plugins.”
         • Plugins can process plain text, HTML, Word, and PDF documents, and email messages.
     • Multimedia documents
         • Either linked to the textual document or accompanied by textual descriptions.
     • Multilanguage documents
         • Unicode is used to represent the character sets consistently.

  7. Why use Greenstone? • No CGI programming needed • Built-in server • GUI is provided • Easy to use • Large collections can be built in a short time

  8. Installation
     • Download from www.greenstone.org.
     • Platform: Windows or Unix.
     • Local library or web library?
         • The local library has a built-in web server.
         • The web library uses an external web server:
             • Configure the external web server.
             • Point it to the URL of Greenstone's library executable, such as http://localhost/gsdl/cgi-bin/library.exe
     • “Enter Library” or “Restricted Version”?
         • “Restricted Version” is used only when
             • networking software has been installed incorrectly, or
             • Windows keeps attempting to dial up your internet service provider.
         • “Restricted Version” must use a Netscape web browser.

  9. Using Greenstone
     • Searching and browsing
         • Punctuation is ignored in search terms
         • Query types: “all” and “some”
     • Icon meanings
     • Setting the preferences
         • Case sensitivity, stemming, Boolean queries
         • Change language
         • Change presentation
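The difference between the “all” and “some” query types, and the effect of ignoring punctuation, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Greenstone's search engine: the `tokenize` and `matches` helpers and the sample document are hypothetical, and stemming and result ranking are not modeled.

```python
import string

def tokenize(text):
    """Lowercase the text and treat every punctuation character as a space."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return text.lower().translate(table).split()

def matches(document, query, mode="all"):
    """mode='all' needs every query term present; mode='some' needs at least one."""
    doc_terms = set(tokenize(document))
    query_terms = tokenize(query)
    if not query_terms:
        return False
    test = all if mode == "all" else any
    return test(term in doc_terms for term in query_terms)

doc = "Boon Island Light (Maine), 1852."      # made-up document text
print(matches(doc, "Maine, lighthouse!", mode="all"))    # False
print(matches(doc, "Maine, lighthouse!", mode="some"))   # True
```

The document contains "maine" but not "lighthouse", so only the “some” query matches.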

  10. BUILDING A COLLECTION

  11. Using “the Collector”
     • Easy to use.
     • Builds new collections modeled on an existing collection, with new content.
     • It is not feasible to use the Collector alone to create collections with completely new structures.
     • Building from the command line is preferable.

  12. Step-by-step instructions
      1. Change to the Greenstone directory:
          > cd "C:\Program Files\gsdl"
      2. Run setup.bat, which is needed in each new DOS session:
          > setup.bat
      3. Make a collection:
          > perl -S mkcol.pl -creator me@cs.tamu.edu Lhouses
         Lhouses is the collection name. You now have a new collection directory called Lhouses.

  13. 4. Populate the collection
         Copy documents into the Lhouses collection's import directory. This can be done by copy and paste in Windows Explorer or, on the command line, by typing
          > cd "%GSDLHOME%\collect\Lhouses"
          > xcopy /s document_path\* import
         If you have stored all the documents in C:\My Document\LHCollection, then document_path is C:\My Document\LHCollection.
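The populate step is a recursive copy into the collection's import directory. A minimal Python sketch of the same idea follows. The `populate_import` helper and every path in the demonstration are hypothetical, and it runs in a temporary directory rather than a real Greenstone installation. Unlike `xcopy /s`, it also copies empty subdirectories.

```python
import shutil
import tempfile
from pathlib import Path

def populate_import(document_path, collection_dir):
    """Recursively copy everything under document_path into <collection>/import."""
    import_dir = Path(collection_dir) / "import"
    # dirs_exist_ok (Python 3.8+) merges into an import directory that already exists.
    shutil.copytree(document_path, import_dir, dirs_exist_ok=True)
    return import_dir

# Demonstration in a throwaway directory; all names below are made up.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "LHCollection"
    (source / "maine").mkdir(parents=True)
    (source / "maine" / "boon.html").write_text("<html>Boon Island Light</html>")

    collection = Path(tmp) / "collect" / "Lhouses"
    (collection / "import").mkdir(parents=True)

    target = populate_import(source, collection)
    copied = sorted(p.relative_to(target).as_posix()
                    for p in target.rglob("*") if p.is_file())
    print(copied)   # ['maine/boon.html']
```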

  14. 5. Import the collection
          > perl -S import.pl Lhouses
      6. Edit the collect.cfg file
         This is the configuration file for the collection, found in the collection's etc directory.
         • Give the collection a name through collectionmeta collectionname.
         • Add a description of your collection through collectionmeta collectionextra "barabara…".
         • Add a collection icon through collectionmeta iconcollection "_httpprefix_/collect/Lhouses/images/icon.gif", if the image is in the collection's images directory.
         See collect.cfg.
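To make the three collectionmeta entries concrete, the sketch below renders them as lines of text that could be pasted into collect.cfg. The `collectionmeta_lines` helper, the display name, and the description are hypothetical; only the icon path comes from the slide. The sketch refuses values containing double quotes rather than guess at the file's quoting rules.

```python
def collectionmeta_lines(name, description, icon_url):
    """Render the name, description, and icon entries as collect.cfg lines."""
    def quoted(value):
        # Quoting rules for embedded quotes are not covered here, so refuse them.
        if '"' in value:
            raise ValueError("values containing double quotes are not handled in this sketch")
        return f'"{value}"'

    return [
        "collectionmeta collectionname " + quoted(name),
        "collectionmeta collectionextra " + quoted(description),
        "collectionmeta iconcollection " + quoted(icon_url),
    ]

lines = collectionmeta_lines(
    "Lighthouses",                                   # hypothetical display name
    "A collection of lighthouse documents.",         # hypothetical description
    "_httpprefix_/collect/Lhouses/images/icon.gif",  # icon path from the slide
)
print("\n".join(lines))
```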

  15. 7. Build the collection
          > perl -S buildcol.pl Lhouses
      8. Make the collection available over the web
         Either select the contents of the building directory and drag them into the index directory, or remove the index directory (and all its contents) by typing
          rd /s index         # on Windows NT/2000
          deltree /Y index    # on Windows 95/98

  16. and then rename the building directory to index with
          ren building index
      Finally, recreate an empty building directory:
          mkdir building
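The remove, rename, and recreate sequence above can be wrapped in one small function. This is a minimal Python sketch; the `publish_build` name is hypothetical, and the demonstration runs against a temporary directory with made-up file names rather than a real collection.

```python
import shutil
import tempfile
from pathlib import Path

def publish_build(collection_dir):
    """Replace index with the fresh build, then recreate an empty building directory."""
    collection = Path(collection_dir)
    index, building = collection / "index", collection / "building"
    if not building.is_dir():
        raise FileNotFoundError(f"no building directory in {collection}")
    if index.exists():
        shutil.rmtree(index)      # rd /s index  (or deltree /Y index)
    building.rename(index)        # ren building index
    building.mkdir()              # mkdir building

# Demonstration with made-up files in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    col = Path(tmp) / "Lhouses"
    (col / "index").mkdir(parents=True)
    (col / "index" / "old.idx").write_text("stale")
    (col / "building").mkdir()
    (col / "building" / "new.idx").write_text("fresh")

    publish_build(col)
    index_files = sorted(p.name for p in (col / "index").iterdir())
    building_files = list((col / "building").iterdir())
    print(index_files, building_files)   # ['new.idx'] []
```

Checking for the building directory first means a failed build cannot wipe out the index that is currently being served.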

  17. Unix commands
          cd ~/gsdl                  # assuming default Greenstone in home directory
          source setup.bash          # if you're running the BASH shell
          source setup.csh           # if you're running the C shell
          mkcol.pl -creator me@cs.tamu.edu Lhouses
          cd $GSDLHOME/collect/Lhouses
          cp -r document_path/* import/
          import.pl Lhouses
          buildcol.pl Lhouses
          rm -r index/*
          mv building/* index

  18. Import and build processes
     • The import process converts documents of various formats into the Greenstone Archive Format. import.pl needs to know which plugins to use. Plugins parse the imported documents and extract metadata from them. See example.
     • The build process compresses the text, builds full-text indexes according to collect.cfg, and precalculates how the collection will appear.
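Because building depends on a successful import, the two steps are naturally scripted in order. The Python sketch below only assembles and dispatches the two commands shown in the earlier slides. The helper names are hypothetical, and the demonstration substitutes a recording function for `subprocess.run`, so nothing is executed. A real run would need a shell in which Greenstone's setup script has already been sourced.

```python
import subprocess

def rebuild_commands(collection):
    """Return the import and build commands, in order, as argument lists."""
    return [
        ["perl", "-S", "import.pl", collection],    # convert sources to the archive format
        ["perl", "-S", "buildcol.pl", collection],  # compress text and build the indexes
    ]

def rebuild(collection, run=subprocess.run):
    """Run import, then build; check=True stops the sequence at the first failure."""
    for argv in rebuild_commands(collection):
        run(argv, check=True)

# Dry run: record the commands instead of executing them.
issued = []
rebuild("Lhouses", run=lambda argv, check: issued.append(argv))
for argv in issued:
    print(" ".join(argv))
```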

  19. Assigning metadata from a file and building search indexes
     • Assign metadata from a single file, metadata.xml.
     • Make sure the plugin RecPlug is included in collect.cfg and its use_metadata_files option is set.
     • Add search indexes (see collect.cfg).
     • Move metadata.xml to the import directory.
     • Import the collection again and rebuild:
          > perl -S import.pl Lhouses
          > perl -S buildcol.pl Lhouses
          > rd /s index (or deltree /Y index)
          > ren building index
          > mkdir building
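A metadata.xml file can be written by hand or generated. The Python sketch below generates one. The element names `DirectoryMetadata`, `FileSet`, `FileName`, and `Description` are recalled from Greenstone's metadata.xml layout rather than taken from these slides, so check them against the Developer's Guide before relying on them. The `directory_metadata` helper and the file name pattern are hypothetical.

```python
import xml.etree.ElementTree as ET

def directory_metadata(filename_pattern, fields):
    """Build a metadata.xml document assigning `fields` to files matching the pattern."""
    root = ET.Element("DirectoryMetadata")
    fileset = ET.SubElement(root, "FileSet")
    ET.SubElement(fileset, "FileName").text = filename_pattern
    description = ET.SubElement(fileset, "Description")
    for name, value in fields.items():
        item = ET.SubElement(description, "Metadata", name=name)
        item.text = value
    return ET.tostring(root, encoding="unicode")

xml_text = directory_metadata(
    r"boon\.html",   # hypothetical file name pattern
    {"Title": "Boon Island Light", "State": "Maine", "Date": "1852"},
)
print(xml_text)
```

The returned string would be saved as metadata.xml in the import directory before re-importing.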

  20. Creating browsing indexes through classifiers
     • VList, HList, DateList
     • Classifiers take a metadata argument, by which the documents are classified and sorted. See collect.cfg.
     • A hierarchy classifier needs a classification file, which defines the metadata hierarchy. Each entry has three parts: identifier, position in hierarchy, and name of the classification.
     • For example, subheight.txt and substat.txt.
     • Put the classification files into the etc directory, then rebuild (> perl -S buildcol.pl Lhouses, then rd /s index or deltree /Y index, then ren building index, and finally mkdir building).
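The three-part entries of a classification file are easy to picture with a small parser. The Python sketch below assumes one entry per line, with whitespace-separated fields and a double-quoted name. The sample entries are invented for illustration and are not the contents of subheight.txt or substat.txt.

```python
import shlex

# Invented sample: identifier, position in hierarchy, quoted display name.
SAMPLE = '''
NewEngland 1 "New England"
Maine 1.1 "Maine"
Massachusetts 1.2 "Massachusetts"
MidAtlantic 2 "Mid-Atlantic"
'''

def parse_classification(text):
    """Parse entries and return them sorted by their position in the hierarchy."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        identifier, position, name = shlex.split(line)  # shlex keeps quoted names whole
        entries.append({
            "id": identifier,
            "position": tuple(int(part) for part in position.split(".")),
            "name": name,
        })
    return sorted(entries, key=lambda entry: entry["position"])

entries = parse_classification(SAMPLE)
for entry in entries:
    indent = "  " * (len(entry["position"]) - 1)
    print(indent + entry["name"])
```

Storing the position as a tuple of integers makes 1.10 sort after 1.2, which a plain string comparison would get wrong.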

  21. Formatting Output • Format the document • Format the lists produced by classifiers and searches • Add format strings to collect.cfg • Then rebuild.
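Format strings refer to metadata by name in square brackets, such as [Title]. The Python sketch below shows only that substitution idea. The `render` helper and the sample format string are hypothetical, and real Greenstone format strings support more than plain substitution.

```python
import re

def render(format_string, metadata):
    """Replace each [Name] placeholder with the matching metadata value (empty if missing)."""
    return re.sub(r"\[(\w+)\]", lambda m: metadata.get(m.group(1), ""), format_string)

row = render(
    "<td>[Title]</td><td>[State], [Date]</td>",   # hypothetical format string
    {"Title": "Boon Island Light", "State": "Maine", "Date": "1852"},
)
print(row)   # <td>Boon Island Light</td><td>Maine, 1852</td>
```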

  22. Another way of assigning metadata • Assign metadata from a file called index.txt, using the plugin IndexPlug (see collect.cfg). • Put index.txt in the import directory and modify collect.cfg, then re-import and rebuild the collection.

  23. References: • Greenstone Installation Guide • Greenstone Users’ Guide • Greenstone Developers’ Guide • Documentation from the “Light Houses” group of CPSC 670, Fall 2001
