Capturing Data Using OCR

When the text to capture is embedded in an image, the only way to capture it is OCR (Optical Character Recognition). WinTask OCR takes a screenshot of the zone to capture and converts it to plain text. The OCR Capture Wizard can be called during Recording mode by clicking the OCR icon in the toolbar and then the CaptureAreaOCR Wizard icon. Alternatively, in the Language window (the right pane of the Editor window), double-click the CaptureAreaOCR$ function in the list of functions (it is in the Capture functions group); press F4 if the Language pane is not displayed.

The following steps illustrate how to capture share prices embedded in an image. You can watch the video or follow the steps below.

  1. Start WinTask. If the Your First Script Wizard dialog box is displayed, click the Close button. The WinTask Editor window should now be active.

  2. From the WinTask toolbar, click the Rec button to start recording your actions.

  3. The Start Recording Mode dialog box appears, asking What do you want to start before recording? Select the Internet Explorer radio button and click the OK button.

  4. In the following dialog box, Launching Internet Explorer, type www.wintask.com/demos/ocr.html into the Web address text field and click the OK button.

  5. On the web page titled OCR Page, suppose that you want to capture the share price of GE.

  6. Click the OCR icon on the floating WinTask toolbar. A set of OCR icons is displayed; click the CaptureAreaOCR icon.

  7. The CaptureAreaOCR$ screen is displayed. Under Select the OCR engine, two engines are available: MODI (Microsoft Office Document Imaging), an OCR engine included with Microsoft Office 2003 and 2007, and WinTask. MODI is more accurate than the WinTask OCR engine (to install MODI even if you don't have Office, follow point 2 of this article: http://support.microsoft.com/kb/982760). For this first capture, select WinTask.

  8. Click the Capture button. The mouse cursor changes to a crosshair. Draw a rectangle around the share name, GE, and click the left mouse button to capture the data within the rectangle.

  9. The Image Preview field shows the image zone that you captured. The Text seen by OCR engine field displays the recognized text. Click the Insert and Resume button so that you can capture the GE value in the next step.

  10. Under Select the OCR engine, now select MODI if you have it (if not, stay with WinTask), and click the Capture button. With the mouse, draw a rectangle around the GE value and click the left mouse button to capture the data within the rectangle.

  11. The Image Preview field shows the image zone that you captured. The Text seen by OCR engine field displays the recognized text. Click the Insert and Resume button.

  12. Close Internet Explorer.

  13. Stop Recording Mode by clicking the Stop button on the floating WinTask toolbar.

  14. The WinTask Editor window is now restored and the script statements generated during Recording Mode are inserted into the current script document window.

  15. Add a msgbox(var$) line after each CaptureAreaOCR$ line to display the captured data (see the full script below).

  16. Execute the script by clicking the corresponding toolbar icon. You are prompted for a name; call the script ocrcapture, for example. The first captured data is displayed; click OK and the second one is then displayed.

The steps above were used to generate the following script statements. Comments have been added to explain each script statement.
' Start Internet Explorer and load the web page
StartBrowser("IE", "www.wintask.com/demos/ocr.html",3)
 
' Specify which OCR engine to use - the 2 parameter means WinTask engine
ret = UseOCREngine(2)
 
' Capture the text within the rectangle defined by 297,303,14,28 coordinates
var$ = CaptureAreaOCR$("IEXPLORE.EXE|Internet Explorer_Server|OCR Page - Windows Internet Explorer|1",1,297,303,14,28)

' Line added as requested at step 15
msgbox(var$)
 
' Specify which OCR engine to use - the 1 parameter means MODI engine
ret = UseOCREngine(1)
 
' Capture the text within the rectangle defined by 430,302,18,39 coordinates
var$ = CaptureAreaOCR$("IEXPLORE.EXE|Internet Explorer_Server|OCR Page - Windows Internet Explorer|1",1,430,302,18,39)

' Line added as requested at step 15
msgbox(var$)
 
' Close Internet Explorer
CloseWindow("IEXPLORE.EXE|IEFrame|OCR Page - Windows Internet Explorer",1)
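
The recorded script reuses var$, so the share name is overwritten by the share price. As a minimal sketch, the variant below keeps the two values in separate variables and shows them in a single message box. It uses only the functions, window names and coordinates from the script above. The variable names sharename$ and shareprice$ are illustrative, and joining strings with + is assumed here rather than taken from this page.
' Sketch only: same calls as the recorded script, with one variable per captured value
StartBrowser("IE", "www.wintask.com/demos/ocr.html",3)

' Select the WinTask engine (2) and capture the share name
ret = UseOCREngine(2)
sharename$ = CaptureAreaOCR$("IEXPLORE.EXE|Internet Explorer_Server|OCR Page - Windows Internet Explorer|1",1,297,303,14,28)

' Select the MODI engine (1) and capture the share price
ret = UseOCREngine(1)
shareprice$ = CaptureAreaOCR$("IEXPLORE.EXE|Internet Explorer_Server|OCR Page - Windows Internet Explorer|1",1,430,302,18,39)

' Display both captured values together (string concatenation with + is an assumption)
msgbox(sharename$ + " : " + shareprice$)

' Close Internet Explorer
CloseWindow("IEXPLORE.EXE|IEFrame|OCR Page - Windows Internet Explorer",1)

If MODI is not installed, the second UseOCREngine call would presumably need the value 2 as well, mirroring the choice made at step 10.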
 

See also

Capturing Data in a Windows Application
Capturing Data in a Web Page