Learn efficient alternate media workflow strategies for PDF files, including text extraction, combining pages, and optimizing for accessibility.
Alternate Media Workflow Strategies for PDF Gaeir Dietrich, Director, High Tech Center Training Unit
PDF • Great starting point • Contains all text and graphics • Easy to generate Word files once you learn how • Reduces retyping • Excellent format for creating large print
What is a PDF? • Portable document format (PDF) • Reads the same on any computer • Looks like the book • Contains all the text • Easy for publishers
Types of PDF Documents • Text-based PDF • Searchable • Graphical PDF • Picture of text (i.e., a graphic) • Use text-selection (I-beam) tool to tell the difference • Text can be selected; graphics cannot
PDFs and Publishers • Fairly easy for publishers • Usually even small publishers can create a PDF • Most accurate format • Looks like the book • Includes page numbers and all text • Will be complete • BUT watch out for teacher’s editions
Requesting through ATPC • Use the ATPC request form • www.atpc.net • If additional processing is required, send syllabus! • Please note: Look for the “conversation” feature on the ATPC interface
Security Issues • PDF files can be locked • Some files can be read with TTS, but the text cannot be extracted • Some files cannot be read • Sometimes OmniPage and/or FineReader can OCR locked files • If you receive a locked PDF, go back to the publisher
Working with PDF Files • Native utilities from Adobe • Adobe Reader • Acrobat Pro • Optical character recognition (OCR) • Free extraction tool: Balabolka
Different Acrobats • Adobe Reader • Free • Open, view, and read (including TTS) • www.adobe.com/products/reader/ • Adobe Acrobat Professional • Discounted price: www.uscollegebuy.com • Crop pages, delete/combine pages, renumber pages, extract text • Required for alternate format producers
Access with Acrobat Reader • Access text-based PDFs within Reader • Reads aloud • But does not highlight or track • Enlarges text • Nice reflow feature • Changes text/background colors • Text highlighting, sticky notes, and comments
Adobe Reader Reality Check • As good as Kurzweil? • NO • As good as PDF Aloud? • NO • Appropriate as only assistive technology? • NO • Nice as a free, widely available option? • YES
Production Features in Reader • Really designed for reading, not reformatting • Export PDF • Subscription service (about $20/year) • Upload PDF file, service auto-converts to Word, download
Process with Acrobat Pro • Cropping • Enlargement for printing • Tiling • Extracting/deleting pages • Combining/inserting pages • Text extraction • Works best with text-based PDF • Does have built-in OCR capability
Customize Quick Tools • Click on the “gear” • View > Show/hide > Toolbar Items > Quick Tools
Please Note • To enable single-key shortcuts • Open Preferences dialog box Ctrl + K • Under General > select Use Single-Key Accelerators To Access Tools (first checkbox under Basic Tools)
Cropping • Tools > Pages > Crop • Shortcut: C • (Please note: This shortcut brings up the mouse-driven cropping tool; you must double-click to open the dialog box!)
Enlarging • Choose paper size/printer • File > Print > Size…to Fit • Shortcut: Ctrl + P (tab through) • Tip: Crop document before enlarging
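Why crop before enlarging? A rough sketch of the arithmetic, assuming the print dialog scales the page uniformly until it fits the paper. The function name and the sizes (in inches) are hypothetical, and printer margins are ignored:

```python
def fit_scale(page_w, page_h, paper_w, paper_h):
    """Largest uniform scale at which the page still fits on the paper."""
    return min(paper_w / page_w, paper_h / page_h)

# Hypothetical example: a letter-size page printed on 11 x 17 paper.
full_page = fit_scale(8.5, 11.0, 11.0, 17.0)

# The same page after cropping the margins down to a 6.5 x 9 text block.
cropped = fit_scale(6.5, 9.0, 11.0, 17.0)

print(f"Uncropped scale: {full_page:.2f}x")
print(f"Cropped scale:   {cropped:.2f}x")
```

The cropped page gets the larger scale factor because the white margins are no longer taking up paper, so the text itself prints larger.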
Tiling • Choose paper size/printer • File > Print > Poster > Tile Scale and Overlap • Shortcut: Ctrl + P (tab through) • Tip: Crop document before tiling
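To estimate how many sheets a tiled print will need, a minimal sketch follows. It assumes each sheet after the first adds one paper width (or height) minus the overlap. Acrobat's own layout may differ, for example because of unprintable margins, and all names and sizes here are hypothetical:

```python
import math

def sheets_along(content, paper, overlap):
    """Sheets needed to cover one dimension of the scaled page."""
    if overlap >= paper:
        raise ValueError("overlap must be smaller than the paper dimension")
    if content <= paper:
        return 1
    # The first sheet covers `paper`; each later sheet adds `paper - overlap`.
    return 1 + math.ceil((content - paper) / (paper - overlap))

def tile_grid(page_w, page_h, scale, paper_w, paper_h, overlap):
    """Return (sheets across, sheets down) for a tiled print."""
    across = sheets_along(page_w * scale, paper_w, overlap)
    down = sheets_along(page_h * scale, paper_h, overlap)
    return across, down

# Hypothetical example: letter-size page at 200% on letter paper, 0.5 in overlap.
grid = tile_grid(8.5, 11.0, 2.0, 8.5, 11.0, 0.5)
print(grid)
```

Cropping first shrinks the page dimensions that go into this calculation, which can reduce the number of sheets at the same text size.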
Extracting Pages • Tools > Pages > Extract • Delete Shortcut: Ctrl + Shift + D • Extract Pages Shortcut: Alt V + T + P (opens Pages pane; F6 focuses in pane and can arrow down)
Tips for Extracting Chapters • Crop on complete file before extracting • Work on a copy!!!!! • Extract from end toward front! • Use table of contents to help • Place focus on first page of chapter to extract (beginning with last)
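The "end toward front" order can be planned from the table of contents before opening Acrobat. This is only an illustrative sketch: the chapter names and page positions are made up, and the numbers are page positions in the PDF file, which may not match the printed page numbers:

```python
def chapter_ranges(starts, total_pages):
    """Return (chapter, first_page, last_page) tuples, last chapter first."""
    ordered = sorted(starts.items(), key=lambda item: item[1])
    ranges = []
    for i, (name, first) in enumerate(ordered):
        # A chapter ends one page before the next begins, or at the end of the file.
        last = ordered[i + 1][1] - 1 if i + 1 < len(ordered) else total_pages
        ranges.append((name, first, last))
    # Work from the back so earlier page positions stay valid
    # when extracted pages are also deleted from the working copy.
    return list(reversed(ranges))

# Hypothetical table of contents for a 90-page file.
plan = chapter_ranges({"Ch 1": 1, "Ch 2": 25, "Ch 3": 60}, total_pages=90)
for name, first, last in plan:
    print(f"{name}: pages {first}-{last}")
```

Because the last chapter is removed first, the page positions of the chapters still waiting to be extracted do not shift.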
Combining • File > Pages > Insert • OR • Create > Combine files
Auto Extracting Text • File > Save As > MS Word • Retains styles and paragraphs • File > Save As > More options… • Text (Accessible) • Loses styles, places hard returns at end of line • Text (Plain) • Loses styles, keeps paragraphs • Shortcut: Alt F + A
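If a Text (Accessible) export leaves a hard return at the end of every line, the lines can be rejoined afterward. The sketch below is one possible cleanup, not an Acrobat feature. It assumes that paragraphs in the exported file are separated by blank lines, which should be checked against the actual export; the function name is hypothetical:

```python
def unwrap(text):
    """Join hard-wrapped lines into paragraphs (blank line = paragraph break)."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)

sample = "PDF files are a great\nstarting point.\n\nThey reduce\nretyping.\n"
print(unwrap(sample))
```

This does not repair words hyphenated across line breaks; those still need a manual pass or a search and replace.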
Better Text Extraction • OCR programs analyze text and structure • Acrobat Pro has built-in OCR, but other programs provide more control • Can control which text to include
More Control over Text • For graphical PDFs • Or • To maintain more control over extracting text from text-based PDFs • Use an OCR program!
Processing Graphical PDFs • Must use OCR program • Use virtual printer with Kurzweil • Creates KESI files • Will not work with locked files • Use OmniPage or FineReader • Sometimes work with locked files • OmniPage handles tighter security than FineReader does • Nothing works on some locked PDFs
Want to Stay in PDF? • Sometimes students do want a text-based PDF • Can OCR in Acrobat Pro • Tools > Recognize Text
OCR Programs • ABBYY FineReader Pro • Easier to learn • Somewhat better with structure • About $75 • Nuance OmniPage • A bit more accessible • A bit better with STEM materials • About $100
Kurzweil Users: Please Note! • If students are using Kurzweil, then use Kurzweil for the OCR • Do not OCR and then load into Kurzweil unless you do not care about the page structure • Use KESI virtual printer • Print from Acrobat or Adobe Reader • Creates KESI files • Will not work with locked files
OCR Programs • Treat all graphics files the same • PDFs, TIFFs, JPEGs • Load image file • Create templates • Zone (analyze structure) • Run OCR
OCR Process Details • Crop before loading into OCR engine • Turn on multiple languages as needed • If doing math, turn on Greek • Only turn on the languages you need • Edit in the OCR program • Some OCR programs have font matching features • Save to Word
Once in Word • Learn to use “show hidden” • Ctrl + Shift + 8 • Beware of the optional hyphen • Search and replace to delete • Search for ^- replace with nothing • Run spell check • Use styles to structure files for braille program
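The same optional-hyphen cleanup can be done outside Word once the file is saved as plain text. This sketch assumes the optional hyphens show up in the text as the Unicode soft hyphen (U+00AD), which is worth confirming on a sample file first; the function name is hypothetical:

```python
SOFT_HYPHEN = "\u00ad"  # assumption: how optional hyphens appear in the text

def strip_soft_hyphens(text):
    """Remove soft hyphens while leaving ordinary hyphens in place."""
    return text.replace(SOFT_HYPHEN, "")

sample = "acces\u00adsi\u00adbil\u00adity of text-based PDFs"
cleaned = strip_soft_hyphens(sample)
print(cleaned)
```

Ordinary hyphens, as in "text-based", are a different character and are left alone.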
Summary • Source files vs. end-user files • Source files = for you to create alt media from • End-user files = alt media formats • PDF • Consider PDFs as source files (files to process) that sometimes double as end-user files (for certain students with limited access issues)