First Indico Workshop

First Indico Workshop. Conversion Server. 27-29 May 2013, CERN. Thomas Baron. Outline: Service description, Architecture, Conversion alternatives, Future directions. Service description. Goal: provide a PDF version of all textual documents uploaded to Indico. Long-term preservation

Presentation Transcript


  1. First Indico Workshop: Conversion Server. 27-29 May 2013, CERN. Thomas Baron

  2. Outline • Service description • Architecture • Conversion alternatives • Future directions

  3. Service description

  4. Goal: provide a PDF version of all textual documents uploaded to Indico • Long-term preservation • Multi-platform reading • Converted formats: .ppt, .pptx, .doc, .docx, .sxi, .odp

  5. Interface: on user request only

  6. Service parameters • Asynchronous • About 30 seconds per conversion on average • At CERN: an average of 165 conversions per day

  7. Architecture

  8. General overview (diagram): Indico sends the files to convert to the conversion server through an HTTP API; the conversion server returns the PDFs

  9. Integration into Indico: currently all entangled with Indico's core • Configuration in indico.conf • conversion server URL: FileConverter['conversion_server'] • callback URL: FileConverter['response_url'] • Conversion handled by the Makac.export.fileConverter class • convert function: sends the file • storeConvertedFile function: gets the converted file back • (Diagram: on the Indico side, Makac.export.fileConverter reads indico.conf; convert sends the files to convert to the conversion server over the HTTP API, and storeConvertedFile receives the PDFs)
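
The two configuration keys on slide 9 could be used roughly as follows. This is a minimal sketch, not Indico's actual code: only the key names `conversion_server` and `response_url` come from the slide, while the example URLs, the `build_conversion_request` helper and the `filename`/`callback` query fields are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values; only the two key names appear on the slide.
FileConverter = {
    "conversion_server": "http://conversion.example.org/getSegFile.py",
    "response_url": "http://indico.example.org/getConvertedFile.py",
}


def build_conversion_request(filename, payload, config=FileConverter):
    """Prepare (without sending) the HTTP request that submits one file.

    The query field names are illustrative, not the real protocol.
    """
    query = urlencode({"filename": filename, "callback": config["response_url"]})
    url = config["conversion_server"] + "?" + query
    return Request(
        url,
        data=payload,  # raw bytes of the document to convert
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )
```

Passing the callback URL along with the file is what lets the service stay asynchronous: the sender does not wait for the PDF, and the conversion server posts it back later.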

  10. Conversion server side: a dedicated server running non-Indico code and software • Web server: IIS (previously Apache) • Listener script: getSegFile.py; Python; receives the file, saves it locally, creates the conversion task (a text file) • Conversion daemon: Engine.py; Python script run as a scheduled task; parses the conversion task files and, for each of them, launches the conversion, waits for its completion and sends the file back to the callback URL (to Indico) • Conversion software: performs the conversion • (Diagram: Indico's Makac.export.fileConverter sends the files to convert over the HTTP API to getSegFile.py on the web server; Engine.py, the task master, reads Configuration.py and drives the conversion software; the PDFs go back to storeConvertedFile)
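
The listener/daemon split on slide 10 can be sketched as below. This is an illustration under stated assumptions, not the code of getSegFile.py or Engine.py: the JSON task format, the `.task` suffix and the function names `create_task` and `process_tasks` are all hypothetical, and the conversion and send-back steps are passed in as callables.

```python
import json
from pathlib import Path


def create_task(task_dir, source_path, callback_url):
    """Listener side: record one conversion job as a small text file."""
    task_dir = Path(task_dir)
    task_dir.mkdir(parents=True, exist_ok=True)
    task_file = task_dir / (Path(source_path).name + ".task")
    task_file.write_text(
        json.dumps({"source": str(source_path), "callback": callback_url})
    )
    return task_file


def process_tasks(task_dir, convert, send_back):
    """Daemon side: handle every pending task file once, in file-name order."""
    done = []
    for task_file in sorted(Path(task_dir).glob("*.task")):
        task = json.loads(task_file.read_text())
        pdf_path = convert(task["source"])     # returns once the conversion is done
        send_back(task["callback"], pdf_path)  # deliver the PDF to the callback URL
        task_file.unlink()                     # the task is complete, remove it
        done.append(pdf_path)
    return done
```

Using plain files as the queue keeps the web-facing listener and the conversion daemon independent: the listener only writes files, and the daemon can run on a schedule and pick up whatever is pending.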

  11. Conversion alternatives

  12. Fully home-made: was the case at CERN until 2009 • Using direct OLE automation of Microsoft Office applications • Python scripts • Example
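
The example itself is not part of the transcript. As an illustration of what direct OLE automation from Python can look like, here is a sketch that assumes Windows, an installed Microsoft Office and the pywin32 package; it is not the script CERN used. The helper names `office_target` and `convert_to_pdf` are hypothetical, and only Office-native formats are covered.

```python
from pathlib import Path

# Office enumeration values: wdFormatPDF (Word) and ppSaveAsPDF (PowerPoint).
WD_FORMAT_PDF = 17
PP_SAVE_AS_PDF = 32

# File extension -> (COM ProgID, PDF format constant).
OFFICE_TARGETS = {
    ".doc": ("Word.Application", WD_FORMAT_PDF),
    ".docx": ("Word.Application", WD_FORMAT_PDF),
    ".ppt": ("PowerPoint.Application", PP_SAVE_AS_PDF),
    ".pptx": ("PowerPoint.Application", PP_SAVE_AS_PDF),
}


def office_target(path):
    """Pick the Office application and PDF constant for a file."""
    return OFFICE_TARGETS[Path(path).suffix.lower()]


def convert_to_pdf(source, output):
    """Drive Office through COM; paths should be absolute."""
    import win32com.client  # pywin32, imported lazily: Windows only

    prog_id, pdf_format = office_target(source)
    app = win32com.client.Dispatch(prog_id)
    try:
        if prog_id == "Word.Application":
            doc = app.Documents.Open(str(source))
            doc.SaveAs(str(output), FileFormat=pdf_format)
        else:
            doc = app.Presentations.Open(str(source), WithWindow=False)
            doc.SaveAs(str(output), pdf_format)
        doc.Close()
    finally:
        app.Quit()  # always release the Office process
```

Driving the Office applications this way ties every conversion to a running desktop program, which is one reason error handling tends to be the weak point of a home-made setup.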

  13. Using commercial products. Example at CERN: Neevia Document Converter Pro, driven through OLE automation • Pros: • More reliable • Better error management • Regular updates • More extensible • More formats • Hot folders • Can be used for other services • Features: • Supports 300 file types • COM, hot folder and email interfaces • Watermark, stamping, etc. • Converts to PDF, PostScript, TIFF (including Class F), BMP, PNG, PCX, JPEG

  14. Neevia Document Converter Pro: simplified automation code in Python • import win32com.client • NDocConverter = win32com.client.Dispatch("docConverter.docConverterClass") • NDocConverter.DocumentOutputFormat = "PDF" • NDocConverter.DocumentOutputFolder = output_dir • NDocConverter.JobOption = "printer" • rv = NDocConverter.SubmitFile(file_path, "") • rv = NDocConverter.CheckStatus(file_path, "")

  15. Future directions

  16. What should be coming: unfortunately, the feature is not directly usable by external instances • Rewrite the conversion server-side code • Very old! • Improve the conversion server monitoring at CERN • Not planned yet • Replace the current implementation with a plugin on the Indico side • v1.5 (2015?)

  17. Thomas Baron Thomas.baron@cern.ch
