
OmniPage 16 - HAW Hamburg


Contents

1. Installation

The installation procedure starts by opening the setup file on the CD. Just follow the steps shown there. Normally the installation window appears automatically after the CD is inserted.

Starting OmniPage 16

When OmniPage 16 starts, a window appears where you can choose between three view options for the menu arrangement (figure 1). Choose the Classic View option. The OmniPage Desktop will then appear (figure 2).

Figure 1: Menu View Option Window. The window offers the following views:

Classic View: For those familiar with OmniPage. This view has a similar look and feel to previous versions of OmniPage.
Flexible View: For advanced OmniPage customers. This view is a new alternate layout of the OmniPage function panels, stacked in a tabbed view to give each panel more space.
QuickConvert View: For new OmniPage customers. This view is designed for quick and easy document conversion without having to learn a lot. The most important conversion options are clearly visible on one screen.

All function panels and toolbars can be moved, reconfigured and customized in each view to your particular needs, including use of dual screens. You can save your own custom layouts. You can switch between views under the Windows menu.

The OmniPage Desktop

The OmniPage Desktop is shown in figure 2.

Figure 2: OmniPage Desktop
2. On top of the OmniPage Desktop there is the main menu bar (figure 3).

Figure 3: Main Menu Bar (Standard View)

It has the following options:

File: Here you can open and save OPDs (OmniPage 16 Documents).
Edit: Here you find functions like Copy, Paste or Find & Replace.
View: The view options do not need to be changed.
Format: Here you find all options concerning the character format when editing OCR-scanned text.
Tools: Here you find possibilities for optimizing the work with OmniPage 16. In this memorandum the Workflow Assistant will be discussed.
Process: Here you can start workflows and find options to edit your OPD.
Window: Choose the menus which are displayed or change the main menu view.
Help

The symbols at the bottom of figure 2 give quick access to frequently used commands.

Below the main menu bar there is the workflow display (number 1 in figure 2), which shows the process chain of the current workflow. On the left side you find the thumbnail view of the pages which are loaded and ready for an OCR process (2 in figure 2). Next to the thumbnail view there is the page image, where you can see the page to be processed next (3). There you can split the page into different areas (pictures, text, tables etc.) and even define areas which shall not be scanned by OCR. However this action is not r
3. Hochschule für Angewandte Wissenschaften Hamburg
Hamburg University of Applied Sciences

Memo: Aero_M_Omnipage16_2009-01-07.doc
Date: 2009-01-07
From: Daniel Schiktanz, HAW Hamburg, daniel.schiktanz@haw-hamburg.de
To: Kolja Seeckt, HAW Hamburg, kolja.seeckt@haw-hamburg.de

OmniPage 16

In Aero a huge number of documents is available for research purposes. Searching these documents via full text search is essential for effective work. Unfortunately a lot of the documents are photo scanned and not ready for full text search. That is why it is necessary to find a way of converting these photo-scanned documents into editable formats which can be searched by full text search programs. This memorandum describes an effective way of converting these documents with the help of OmniPage 16.

Introduction

OmniPage 16 is a popular OCR (Optical Character Recognition) program developed by Nuance Communications, Inc. The full version costs 119. Its purpose is converting photo-scanned documents into editable formats like DOC or PDF. A professional version is available as well (OmniPage 16 Pro) and costs about 350.

The following instructions relate to the English user interface and only present a way of converting photo-scanned documents into editable PDF files as economically as possible. Further functions of OmniPage 16 will not be discussed. For more information read the user's manual of the program.
4. Figure 6: Recognize Images Window for Workflow Definition. The "Retain features" group at the bottom of the window lists: Look for headers and footers, Retain text and background color, Use PDF Fonts, Look for hyperlinks, Retain inverted text.

The Recognize Images window concerns not only images but also the characters of the pages to be processed. On top (blue box in figure 6) you can define the layout description. This setting should be left on Automatic. On the right you can choose whether the OCR process shall be optimized for speed or accuracy. For Aero purposes Speed would be the right choice.

In the red box in figure 6 choose the languages contained in the pages to be processed. If necessary you can activate the use of a professional dictionary (only legal or medical ones).

In the green box in figure 6 there are all options concerning characters. In the Font Matching menu you can choose the fonts used in your processed text. When you want to have Greek formula symbols, for example, you need to activate the symbol font within the menu. Moreover it is possible to define characters which shall not be used in the processed text or which shall be used additionally, e.g. the German characters ä, ö or ß.

The other options at the bottom of figure 6 may remain unchanged.

After clicking Next the Correct Recognition Results window appears. As stated before, the correction shall be skipped. However this step has to be defined
5. somehow in order to create a valid workflow. Before the workflow is completely finished you can delete this step again, which will be shown later. Just ignore the settings for the Correct Recognition Results window and click on Next. The next window is the Save window, which is shown in figure 7.

Figure 7: Save Window for Workflow Definition

Select Save as Text on top of the window. Below you find the Output file options (blue box in figure 7). When your photo-scanned documents consist of several photo-scanned pages compiled in one respective file, choose Create a new file for each image file in the File options menu. Under Naming options choose Use input file names. In the File type menu there is a huge amount of possible formats the proce
6. Figure 4: Create New Workflow Window

Click on Tools in the main menu bar and choose Workflow Assistant. The window shown in figure 4 will appear. Select Fresh Start for defining a new workflow. Enter the workflow name and then click on Next. Now the Load Files window will appear (figure 5).

Figure 5: Load Files Window for Workflow Definition

Here you have to choose the files to be processed. There are two possibilities: Either you choose the files to be processed every time you start the workflow (activate the field Select files for loading each time this workflow is started, located in the blue box in figure 5), or all files will be
7. your workflow will then be shown in the workflow display. When you click on a step, the Workflow Assistant will automatically appear and show all options of the selected step, which can be edited if necessary.

Clicking on the start symbol starts the workflow. Once it is running, a Pause symbol will be displayed instead of the symbol which was used to start the workflow. By clicking on it you can pause the workflow at any time. Just save the current OPD and reopen it when you want to continue the conversion with the workflow at a later time.

Other OCR Software in Comparison with OmniPage 16

There are two main competitors to OmniPage 16: Readiris Pro 11 by I.R.I.S. and FineReader 9 by Abbyy. After studying reviews of these three programs it quickly became clear that Readiris Pro 11 is not an option because of its poor price-performance ratio (PC Welt 2007). So the decision had to be made between OmniPage 16 and FineReader 9.

According to CHIP 2008, OmniPage 16 works much faster than FineReader 9. c't 2007 stated that there are no big differences between the two programs. However, OmniPage 16 is better at converting images with low resolution and a lot of graphics and is also cheaper than FineReader 9, which has a better auto correction and produces better results for images of high resolution. PCMag 2008 concludes that FineReader is "far easier to use than OmniPage". "Corporate users who work with highly complex documents an
8. ages. Because of the conversion the file is now ready for a full text search, and text can be copied from the searchable image. The file size is the same as for a PDF document with image substitutes.

Once the output file options are set, it is possible to enable the prompting option (brown box in figure 7). When it is activated, you have to define the saving options for each processed document after the OCR process. This option is not recommended when a fully automatic conversion process is wanted.

In the red box in figure 7 you can specify the output location of the processed files. Since it was selected to use the input file names for the processed files, the output folder must be different from the input folder.

Now all options for the saving process are set. You can add and define another saving process by clicking on Next. This could be useful when you want to save your processed documents in different formats or different locations.

Finally, delete step 3 of the workflow, Correct Recognition Results (green box in figure 7), in order to make the workflow fully automatic. Now you can click on Finish. The workflow has been successfully created and saved. It can be edited at any time with the Workflow Assistant.

Automatic Document Conversion with the Defined Workflow

Take a look at the workflow display (number 1 in figure 2). At the very left there is a drop-down menu with all available workflows. Select the one you just created. All steps of
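The rule that the output folder must differ from the input folder when input file names are reused can be checked before a run. The following is a minimal Python sketch of that check, not part of OmniPage itself: the function name `output_path` and the example paths are hypothetical, and the memo does not say what OmniPage does if the two folders coincide, so raising an error here is an assumption.

```python
from pathlib import Path


def output_path(input_file, output_dir, suffix=".pdf"):
    """Return the path a converted file would get when the input name is reused.

    Hypothetical helper: it only mirrors the memo's rule that the output
    folder must differ from the input folder. It does not call OmniPage.
    """
    input_file = Path(input_file)
    output_dir = Path(output_dir)
    # Reusing the input name inside the same folder would collide with the source file.
    if input_file.parent.resolve() == output_dir.resolve():
        raise ValueError("output folder must differ from the input folder")
    return output_dir / (input_file.stem + suffix)
```

For example, `output_path("/scans/report.pdf", "/converted")` yields `/converted/report.pdf`, while passing `/scans` as the output folder raises `ValueError`.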
9. d those who need automatic handling will find that OmniPage provides features that FineReader doesn't." FineReader is well suited for work with lots of proofreading and manual interaction, whereas OmniPage is better suited to automatic processes.

Taking into account the purpose of an OCR program for Aero, it appears that OmniPage 16 is the best choice.

References

CHIP 2008: Chip Online.de: Test: Abbyy FineReader 9.0. CHIP Xonio Online GmbH, München, 2008. URL http www chip de artikel Abbyy FineReader 9 0 OCR Programm_30524314 html (2009-01-04)

c't 2007: c't Nr. 25/2007, Magazin für Computertechnik: Test: FineReader 9 vs. OmniPage 16. Hannover, 2007. URL http news idealo de news 1 1366 ct test finereader omnipage html (2009-01-04)

OmniPage 16 User's Manual, German Edition. Nuance Communications, Inc., Burlington, 2007.

PCMag 2008: PCMag.com, The Independent Guide To Technology: OmniPage Professional 16 Review. Ziff Davis Publishing Holdings Inc., New York, San Francisco, 2008. URL http www pcmag com article2 0 28 17 2305590 00 asp (2009-01-04)

PC Welt 2007: PC-Welt, das Portal für Computer & Technik, Digital Lifestyle, Business-IT: Test: Readiris Pro 11. IDG Magazine Media GmbH, München, 2007. URL http www pcwelt de start software_os office tests 92689 readiris_pro_11 (2009-01-04)
10. eally necessary, since OmniPage differentiates between normal text, tables and pictures quite well and does it automatically. In field 4 there is the Text Editor, where the processed text is displayed. The Text Editor works like a normal text program and allows you to correct mistakes caused, for example, by a bad resolution of the scanned document or by words which are unknown to OmniPage. At the bottom you find the Document Manager (5), which shows the current status of the OCR process, the number of processed or unknown words and other data for each processed page.

Using the Workflow Function for Document Conversion

In this memorandum the focus is on converting photo-scanned documents into editable formats. This can be achieved in an economical way with the OmniPage workflows. The workflow used for the conversion is defined by three steps:

1. Load the document to be processed.
2. Perform the OCR process.
3. Save the document in an editable format.

Normally there is an additional step after step 2, where the processed text has to be checked for mistakes caused by words unknown to OmniPage or by a bad resolution of the photo-scanned document. However, this step can be skipped, since the purpose of this memorandum is not producing flawless documents with OmniPage but making photo-scanned documents available for full text search in an economical way.

Defining the Workflow
11. loaded automatically when the workflow is started. The latter option is active when the field Select files for loading each time this workflow is started is deactivated.

When choosing the first option, it is possible to define the folder where the files to be processed are located. When you have to select these files after starting the workflow, the file browser will open this folder automatically. Of course it is possible to switch to other folders.

When choosing the second option, you have to define all files which shall be processed (click on Browse, located in the red box in figure 5). Here it is also possible to define a folder, in which case all files in this folder will be processed.

All other options (Preprocessing, PDF) do not need to be changed. After all options are set, click on Next. Figure 6 shows the window which appears next.
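When a whole folder is given to the Load Files step, it can help to preview which files sit in that folder before starting a long run. The sketch below is a hedged Python illustration that runs outside OmniPage: the function name `files_to_process` and the list of extensions are assumptions (the memo does not list accepted input formats), and it only looks at the top level of the folder because the memo does not say whether subfolders are included.

```python
from pathlib import Path

# Assumption: illustrative extensions only; the memo does not state which
# input formats the Load Files step accepts.
INPUT_SUFFIXES = {".pdf", ".tif", ".tiff", ".jpg", ".png"}


def files_to_process(folder, suffixes=INPUT_SUFFIXES):
    """Return the matching files directly inside `folder`, sorted by path."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        # Skip subfolders and compare extensions case-insensitively.
        if p.is_file() and p.suffix.lower() in suffixes
    )
```

Comparing this list with the files in the output folder after a run is a simple way to spot documents that were not converted.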
12. ssed document can be saved in. Below, some important PDF options are explained.

PDF Formatted (True Page)

This format is a compromise between file size and correctness of the document. The text is saved completely in normal characters alongside the images, so that the overall layout is comparable to that found in typical PDF documents.

PDF Formatted (Plain Text)

This format needs the least disk space (about one fifth of the True Page PDF). The whole text is saved in normal characters alongside the images. All characters have the same format and there are no blank lines.

Keep in mind that mistakes caused by unknown words stay unchanged in the processed document for the two formats just mentioned.

PDF With Image Substitutes

This format resembles the True Page PDF. However, unknown words are saved as images in the PDF document which look like the original photo-scanned word, so that the user can check the original word. Although saved as an image in the PDF document, the unknown words will be considered during a full text search, but in their processed version, which might be error-prone. A PDF document with image substitutes is a very safe way of converting, since mistakes of the conversion can be found. Yet the processed files need about three times as much disk space as the files processed to a True Page formatted PDF.

PDF Searchable Image

Here the PDF does not contain any processed characters but only the original photo scanned im
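The approximate size relations quoted for the PDF variants can be turned into a rough disk-space estimate. The Python sketch below uses only the memo's own figures, with True Page as the reference: Plain Text about one fifth, Image Substitutes about three times, and Searchable Image the same as Image Substitutes (as the memo states elsewhere). The names `RELATIVE_SIZE` and `estimate_size_mb` are hypothetical, and real sizes will vary with the scanned material.

```python
# Relative file sizes with True Page = 1.0, taken from the memo's approximate figures.
RELATIVE_SIZE = {
    "true_page": 1.0,
    "plain_text": 0.2,         # "about one fifth of the True Page PDF"
    "image_substitutes": 3.0,  # "about three times as much disk space"
    "searchable_image": 3.0,   # stated to equal the image-substitutes size
}


def estimate_size_mb(true_page_mb, fmt):
    """Estimate the output size in MB for `fmt`, given the True Page size in MB."""
    return true_page_mb * RELATIVE_SIZE[fmt]
```

With these figures, an archive that takes 10 MB as True Page PDFs would need roughly 2 MB as Plain Text and roughly 30 MB with image substitutes or as searchable images.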
