Automation Action – Convert Image To Text Using OCR | ThinkAutomation

Automation Action: Convert Image To Text Using OCR

Convert an image or PDF file or attachment to text using optical character recognition (OCR). Can also extract images from PDF files and convert these to text.

The Convert Image To Text Using OCR automation action converts image files or image attachments to text using optical character recognition (OCR) and assigns the extracted text to a variable. The text can then be used further in your automation workflow. This action can also extract images from PDF files and then convert these images to text. This action uses a local OCR engine.

Select an Image To Convert – this can be any local file or a %variable% replacement. You can specify multiple files if required, separated by commas (any file paths that contain commas must be enclosed in quotes).
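The quoting rule for multiple files can be illustrated with a short Python sketch. How ThinkAutomation parses the list internally is not documented here, so the function name `split_file_list` and the use of Python's `csv` module are assumptions made purely for illustration.

```python
import csv


def split_file_list(value: str) -> list[str]:
    """Split a comma-separated file list, keeping quoted paths intact.

    Hypothetical helper: it only mirrors the rule described above,
    where a path containing a comma is wrapped in double quotes.
    """
    # skipinitialspace lets a quoted path follow ", " rather than ","
    reader = csv.reader([value], skipinitialspace=True)
    row = next(reader, [])
    return [item.strip() for item in row if item.strip()]


if __name__ == "__main__":
    for path in split_file_list(r'C:\Scans\page1.png, "C:\Scans\invoice, final.pdf"'):
        print(path)
```

The second path keeps its embedded comma because it is quoted; without the quotes it would be treated as two separate files.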

Enable the Include Incoming Attachments option to convert attached images matching the Matching Mask. Enter *.* to convert all supported attachments (png, bmp, gif, tiff, jpeg, pdf).
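As a rough illustration of how a Matching Mask such as *.* or *.pdf selects attachments, the following Python sketch filters a list of attachment names. The helper name `matching_attachments` is hypothetical, and the extension set is copied from the list above; whether the action also accepts variants such as .jpg or .tif is not stated here.

```python
from fnmatch import fnmatch

# Extensions taken from the list in the text above (assumption: exact match only).
SUPPORTED = {"png", "bmp", "gif", "tiff", "jpeg", "pdf"}


def matching_attachments(names, mask="*.*"):
    """Return attachment names that match the mask and have a supported extension."""
    selected = []
    for name in names:
        extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        # Compare in lower case so the match does not depend on the platform.
        if extension in SUPPORTED and fnmatch(name.lower(), mask.lower()):
            selected.append(name)
    return selected


if __name__ == "__main__":
    print(matching_attachments(["scan.PNG", "notes.docx", "bill.pdf"], "*.*"))
```

A narrower mask such as *.pdf would limit the conversion to PDF attachments only.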

The Language defaults to ‘eng’ (English). You can specify a different three-letter language code. Additional language packages can be downloaded from https://github.com/tesseract-ocr/tessdata and should be copied to the Tesseract tessdata folder.
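Before selecting a language code, it can help to confirm that the matching package is present. The sketch below assumes the usual tessdata naming, where each language is a file called `<code>.traineddata`; the helper name `missing_languages` and the example folder path are hypothetical.

```python
from pathlib import Path


def missing_languages(tessdata_dir, languages):
    """Return the language codes that have no <code>.traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    return [
        code
        for code in languages
        if not (folder / f"{code}.traineddata").is_file()
    ]


if __name__ == "__main__":
    # Hypothetical install location; adjust to the tessdata folder on your machine.
    print(missing_languages(r"C:\Program Files\Tesseract-OCR\tessdata", ["eng", "deu"]))
```

Any code returned by the function would need its package downloaded and copied into the tessdata folder before it can be used.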

The Output Type can be text, XML or CSV. If the Preserve Layout option is enabled then space padding is preserved.

Select the Page Segmentation Mode (PSM). This controls how Tesseract analyzes the layout of the text in the image: how it segments and organizes blocks of text before attempting character recognition.

Choosing the Right PSM

  • Use PSM 6 for a simple block of printed text.
  • Use PSM 1 or 3 for full pages with mixed layout.
  • Use PSM 7 or 8 for single lines or words, such as in scanned forms or boxes.
  • Use PSM 11 or 12 for images with scattered text fragments.
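The guidance above can be sketched as a small lookup that turns a layout description into Tesseract command-line arguments. This is an illustration only: the layout labels and the function name `tesseract_args` are hypothetical, and how ThinkAutomation calls Tesseract internally is not documented here. The `-l` and `--psm` options are those of the standard Tesseract command line.

```python
# Layout labels are made up for this example; the PSM numbers follow the list above.
PSM_BY_LAYOUT = {
    "block": 6,    # simple block of printed text
    "page": 3,     # full page with mixed layout
    "line": 7,     # single line of text
    "word": 8,     # single word
    "sparse": 11,  # scattered text fragments
}


def tesseract_args(image_path, layout="page", language="eng"):
    """Build an argument list for the tesseract command line, writing text to stdout."""
    psm = PSM_BY_LAYOUT[layout]
    return ["tesseract", image_path, "stdout", "-l", language, "--psm", str(psm)]


if __name__ == "__main__":
    print(" ".join(tesseract_args("form_field.png", layout="line")))
```

The returned list could be passed to `subprocess.run` on a machine where Tesseract is installed; trying two or three modes on a sample image is often the quickest way to find the best fit.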

If Auto Rotate is enabled, Tesseract is called first to detect how the text is oriented (PSM 0), and the image is rotated if required. If all of your images are already correctly aligned, do not enable this option, since it requires additional processing time.
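To show what the orientation detection step produces, the following sketch reads the rotation angle from the report that the Tesseract command line prints in PSM 0. The sample report is illustrative rather than taken from a real run, the function name `rotation_from_osd` is hypothetical, and how ThinkAutomation itself consumes the result is an assumption.

```python
import re


def rotation_from_osd(osd_text: str) -> int:
    """Return the clockwise rotation (in degrees) suggested by an OSD report, or 0."""
    match = re.search(r"^Rotate:\s*(\d+)", osd_text, flags=re.MULTILINE)
    return int(match.group(1)) % 360 if match else 0


# Illustrative report in the layout printed by "tesseract image.png stdout --psm 0".
SAMPLE_OSD = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 2.30
Script: Latin
Script confidence: 1.11
"""

if __name__ == "__main__":
    print(rotation_from_osd(SAMPLE_OSD))
```

A result of 0 means no rotation is needed, which is why the extra detection pass is wasted effort on images that are already upright.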

If multiple images are converted within the same action then the extracted text from each image will be appended to the returned text.
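The appending behaviour can be pictured with a minimal sketch. The separator used by the action is not stated, so the newline below is an assumption, as is the helper name `combine_texts` and the choice to skip images that produced no text.

```python
def combine_texts(texts, separator="\n"):
    """Join the text extracted from each image into one string, in order.

    Assumption: blank results are skipped and a newline separates images.
    """
    return separator.join(text.strip() for text in texts if text.strip())


if __name__ == "__main__":
    print(combine_texts(["Invoice 1001\n", "", "Total due: 250.00"]))
```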

From the Assign To list, select the variable to receive the plain text.

To test the text extraction, select or enter an image file and click the Test button. The results will be displayed.

You can also use the Ask AI action with the ‘Ask AI To Respond To A Prompt With An Image’ operation to perform OCR on images.

This action uses the open source Tesseract OCR library. Tesseract is not installed by default with the ThinkAutomation setup. If Tesseract is not installed the Install Tesseract button will be visible.