How to use ABBYY FineReader. How it works: FineReader. Internal software error

One of the most popular programs for scanning and processing files of various types is FineReader. Developed by the Russian company ABBYY, it not only recognizes text but also processes documents (translation, format conversion, and so on). Many users manage to install the program but cannot figure out how to use ABBYY FineReader. This article answers many of the common questions.

The program lets you scan and recognize text, and more

To understand what kind of program ABBYY FineReader 12 is, you need to look at its features in detail. The first and simplest function is scanning a document. There are two scanning options: with and without recognition. With a plain scan of a printed sheet, you get an image of the sheet saved in the folder you specified on your computer.

ATTENTION. Lay the sheet straight on the scanning part of the printer, along the guide marks shown on it. Do not let the original wrinkle: this can lead to a poor-quality scan.

Decide for yourself what you need FineReader for, since the utility has extensive functionality. For example, you can choose the color mode of the resulting image, and all photos can be converted to black and white. Recognition is faster in black and white, and the processing quality increases.

If you are interested in ABBYY FineReader's OCR function, press the dedicated button before scanning. There are several ways to get the result. By default, the recognized part of the sheet is displayed on the screen, where you can copy it or edit it manually.

If you select other functions, you can get the file directly as a Word document or Excel spreadsheet. Selecting functions is very simple: the menu is intuitive and easy to customize, because all the buttons you need are right in front of you.

IMPORTANT. Before recognizing text in ABBYY FineReader, select the exact recognition language. Although the utility works fully automatically, a low-quality original can prevent it from determining which language the source is in. This greatly reduces the quality of the final result.

Multiple operating modes

To fully understand how to use ABBYY FineReader 12, try its two recognition modes, "Thorough" and "Fast". Fast mode is suitable for high-quality images, while Thorough mode is suitable for low-quality files. Thorough mode takes 3-5 times longer to process files.

The illustration shows the result of the program - text recognition from an image

What other features are there?

OCR is not the only useful feature of ABBYY FineReader. For convenience, a document can also be converted into the format the user needs (pdf, doc, xls, etc.).

Change text

To change the text in FineReader, open the "Service" - "Check" tab. A window opens in which you can edit the font, change characters, colors, and so on. To edit an image, open the "Image Editor". It closely resembles the simple drawing tool Paint and lets you make minor edits.

ATTENTION. If you still cannot figure out how to use ABBYY FineReader productively, read the Help section, which is available in the application window, in the About tab.

Now you know what FineReader is for and can use it properly at home or in the office. The application's functionality is extensive: try it, and you will see how indispensable it is for processing documents and files in office work.

So, FineReader is installed on our computer. We turn on the scanner and digitize a multi-page document. Let's call it "Agreement" for the sake of the example.

We put the first page of the document on the scanner glass and close the lid. We launch FineReader and click the "Scan" button, or press "Ctrl + K". The "ABBYY FineReader Scanning" window opens. When digitizing an ordinary text page set in an 11-12 point font, we leave the default settings in the window and press the "Preview" button.

The scanner runs, and after a few seconds we see our page in the preview area. Here we can adjust the scan size if needed. Then we press the "Scan" button.

FineReader starts the OCR process, and within a minute the page image opens in the program window. The right side of the window is now divided into three sections. In the "Image" section on the left we can edit the image. You can read more about image editing in the lesson "Scanning a book". In the "Text" section on the right you can make changes to the text right away, editing the content of the page even before saving it. This is very convenient when you need, for example, to quickly change dates, details, or surnames in a document.

The icon of the recognized page appears in the left part of the "Pages" window:

If nothing needs editing, we replace the first page on the scanner glass with the second and repeat the procedure. Since the scan size was already adjusted in "Preview" mode of the "ABBYY FineReader Scanning" window for the first page, we now click the "Scan" button right away. The settings made for the first page are kept, and the following pages are scanned without a preview. In this way we scan all the pages of our document.

When we are done, we click the icons one by one to open the pages and check that they are in the correct order.

After that, in the left part of the "Pages" window, we select all the icons with "Edit - Select All" or the keyboard shortcut "Ctrl + A". Then, in the drop-down list next to the "Save" button, we select the command "Save as PDF document":


Now we click the button itself and save the document as "Agreement.pdf" in the "Agreement" folder:


As a result, we get a multi-page text document in pdf format - an electronic version of our document with the code name "Agreement".

So, with FineReader we digitize text documents.

Switching the scanning mode to "color" in the "ABBYY FineReader Scanning" window makes it just as easy to digitize color pictures and photographs.

And by choosing, for example, the command "Save as Microsoft Word 2007 document" in the menu, we turn our project into a single multi-page editable Word document.

In general, the program is easy to learn and intuitive, with pop-up tips everywhere.

ABBYY FineReader is a program for recognizing text in images. The images usually come from a scanner or MFP. You can scan directly from the application window and then automatically convert the image into text. In addition, FineReader can convert images received from a scanner into PDF and FB2 formats, which is useful when creating e-books and documentation for subsequent printing.

How to fix the problem: ABBYY FineReader does not see the scanner.

For ABBYY FineReader 14 (latest version) to work correctly, your computer must meet the following requirements:

  • processor with a frequency of 1 GHz and support for the SSE2 instruction set;
  • OS Windows 10, 8.1, 8, 7;
  • RAM from 1 GB, recommended - 4 GB;
  • TWAIN or WIA compliant image capture device;
  • Internet access for activation.

If your hardware does not meet these requirements, the program may not work correctly. But even if all the conditions are met, ABBYY FineReader often reports various scanning errors, such as:

  • unable to open TWAIN source;
  • the parameter is set incorrectly;
  • internal software error;
  • source initialization error.

In the vast majority of cases, the problem is related to the application itself and its settings. But sometimes errors occur after a system update or after connecting new equipment. Let's go through the most common recommendations for what to do if ABBYY FineReader does not see the scanner and displays error messages.

Error correction

There are some general tips for fixing something that doesn't work correctly:

  1. Update your hardware drivers to the latest versions from the manufacturer's official website.
  2. Check the rights of the current user in the system, increase the access level if necessary.
  3. Sometimes installing an older version of the application helps, especially if you are using older hardware.
  4. Check whether the system itself sees the scanner. If it does not appear in Device Manager or is shown with a yellow exclamation mark, then the problem is in the hardware, not the program. Refer to the manufacturer's instructions or technical support.
  5. The official ABBYY website has good technical support: https://www.abbyy.com/ru-ru/support. You can ask a question, describing your specific problem in detail, and get a professional first-hand solution free of charge.

Eliminating the error "The parameter is specified incorrectly"

In the latest version of ABBYY FineReader, it may also be called "Source Initialization Error". Initialization is the process of connecting and recognizing the hardware by the system.

If FineReader does not see the scanner when opening the scan dialog box and displays such errors, the following actions should help:

  1. Restart FineReader.
  2. Go to the "Tools" menu, select "OCR Editor".
  3. Click on "Tools", then "Options".
  4. Open the "Basic" section.
  5. Go to "Select a device for imaging", then "Select a device".
  6. Open the drop-down list of available drivers. Try scanning with each driver in the list in turn. If one of them works, keep using it.

ATTENTION. It is also possible that scanning fails with every available driver. In that case, click Use Scanner Interface.

If that doesn't work, you will need the TWAIN_32 Twacker utility. It can be downloaded from the official ABBYY website at ftp://ftp.abbyy.com/TechSupport/twack_32.zip.

Then follow the instructions:

  1. Exit Fine Reader.
  2. Unpack the twack_32.zip archive to any folder.
  3. Double click on Twack_32.exe.
  4. After starting the program, go to the "File" menu, then "Acquire".
  5. Click "Scan" in the dialog that opens.
  6. If the document was scanned successfully, open the File menu and click Select Source.
  7. The driver through which the utility successfully completed the scan will be displayed in blue.
  8. Select the same driver in FineReader.

If scanning still fails when you start ABBYY FineReader, then the problem is with the program. Send a request to ABBYY technical support. If Twacker could not execute the "Scan" command, then the device itself or its driver is probably not working correctly. Contact the technical support of the scanner manufacturer.

Internal software error

It happens that when starting a scan, the application reports "Internal program error, code 142". It is usually associated with the removal or damage of the system files of the program. To fix and prevent reoccurrence, do the following:


Sometimes FineReader may not see the scanner because of access restrictions. Run the program as an administrator or raise the rights of the current user.

This solves the problem of connecting FineReader to the scanner. Sometimes the cause is a driver conflict or hardware incompatibility, and sometimes a scan fails because of internal software errors. If you have encountered similar problems in FineReader, leave tips and solutions in the comments.

Although advances in artificial intelligence (AI) over the past 50 years have not brought smart machines one iota closer to human cognitive capabilities, it would be unfair to deny progress in this direction altogether. The most obvious and striking example is chess (not to mention simpler games). The computer cannot yet imitate our thinking, but it is quite capable of compensating for this gap with a large amount of specialized memory and brute-force speed. Vladimir Kramnik described the play of the Deep Fritz program that defeated him in 2006 as "inhuman", in the sense that it often contradicted the established (human) rules of strategy and tactics.

And just over a year ago, Watson, another brainchild of IBM (the company that at one time laid the foundation for the computers' triumphant chess victories with the famous Deep Blue), made a new breakthrough, beating two champions of the popular American quiz show Jeopardy by a large margin. It is significant, however, that although Watson voiced its answers itself, the questions were still transmitted to it in text form. This suggests that the successes in many areas of AI application (speech and image recognition, machine translation) are quite modest, although this does not prevent us from using them in practice today. Perhaps the greatest successes are shown by optical character recognition (OCR) systems, with which almost all PC users are probably familiar in one way or another. Moreover, Russian developments occupy a worthy place in the world in this area - I mean ABBYY FineReader.

A bit of history

The current version of ABBYY FineReader is number 11, that is, the application has come a long way, and even the history of this process is of some interest. Without pretending to be an exhaustive chronicle, I will give only the main milestones over the last decade, during which I more or less followed FineReader:

Year | Version | Key features
2003 | 7.0 | Increase in recognition accuracy of up to 25%. This was most noticeable in tables, especially complex ones with colored cells, hidden dividers, etc.
2005 | 8.0 | Further optimization of recognition algorithms, primarily aimed at working not with document scans but with digital photographs. For this, additional functions for preparing originals appeared (elimination of distortions, alignment of lines, etc.).
2007 | 9.0 | The emergence of ADRT technology, which takes into account the logical structure of the entire processed (multi-page) document and is able to highlight repeating elements (headers and footers), connect "flowing" objects (tables), etc.
2009 | 10.0 | Further improvement of ADRT and recognition algorithms, increasing the accuracy of processing low-resolution originals by up to 30%.
2011 | 11.0 | The main attention is paid to the speed of the program. "Second coming" of black and white mode, which on good-quality originals gives an additional acceleration of up to 30%.

Naturally, over the same period FineReader also expanded its support for document formats, improved its built-in tools and interface, improved the reconstruction of the structure of originals, and so on. Still, each "breakthrough" is followed by a certain period of "calm", which is needed to refine the new algorithms. They are the main value of any OCR program, and therefore detailed information about them rarely reaches users. However, ABBYY has kindly agreed to lift the veil of secrecy, and today we have the opportunity to look into the inner sanctum of FineReader.

Basic principles

So, since OCR belongs to the field of AI, it is quite logical that developers try to imitate the activity of our brain at least to some extent. Of course, the structure of our visual system is incredibly complex, but the basic "large-block" principles of its functioning have been studied well enough. Three of them are usually singled out:

  1. Integrity: an object is considered as a set of its parts and (for visual images) the spatial relationships between them. In turn, the parts are interpreted only as parts of the entire object. This principle helps to build and refine hypotheses, quickly rejecting the unlikely ones.
  2. Purposefulness: since any interpretation of data pursues a specific goal, recognition is a process of putting forward hypotheses about an object and purposefully testing them. A system operating in accordance with this principle will not only use computational power more economically, but will also be less likely to make mistakes.
  3. Adaptability: the system saves the information accumulated in the course of its work and reuses it, that is, it learns by itself. This principle makes it possible to create and accumulate new knowledge and to avoid solving the same problems repeatedly.

FineReader is the only OCR system in the world that operates in accordance with the principles described above at all stages of document processing. The corresponding technology is called IPA, after the first letters of the English terms (Integrity, Purposefulness, Adaptability). For example, according to the principle of integrity, a fragment of an image is interpreted as a character only if it contains all the structural parts of such an object and they stand in certain relationships to each other. This makes it possible to replace the enumeration of a large number of patterns (in search of a more or less suitable one) with a purposeful test of a reasonable number of hypotheses, relying on previously accumulated information about the possible character styles in the document being recognized.

However, the principles of IPA apply not only to the fragments corresponding to (presumably) individual characters, but also to the entire original page image. Most OCR systems are based on recognizing the hierarchical structure of the document: the page is divided into basic structural elements, such as tables, images and blocks of text, which in turn are divided into other characteristic objects (cells, paragraphs, and so on), down to individual characters.

Such an analysis can be carried out in two main ways: from top to bottom, that is, from composite elements to individual characters, or, conversely, from bottom to top. Usually one of them is used, but ABBYY has developed a special algorithm, MDA (multilevel document analysis), that combines both. In short, it works like this: the structure of the page is analyzed top-down, and the reconstruction of the electronic document at the end of recognition proceeds bottom-up, while an additional feedback mechanism operates at all levels. As a result, the probability of gross errors caused by incorrect recognition of high-level objects is sharply reduced.

ADRT

Historically, OCR systems have evolved from recognizing individual characters. This task is still the most important and most difficult one; the most complex algorithms are associated with it. However, it soon became clear that higher-level information (for example, about the document language and the correct spelling of recognized words) could help in solving it - this is how context and dictionary checks appeared. Then the desire to preserve formatting and recreate the physical structure (that is, the relative position of various objects) of the document led to the need for detailed analysis of the entire page. It is clear that this also noticeably affects the overall quality of recognition, since it helps to correctly process multi-column layout, tables and other techniques of "non-linear" text layout.

Most modern OCRs operate precisely on these three levels - characters, words, pages - practicing, as already mentioned, top-down or bottom-up approaches. However, ABBYY, in accordance with the principles of IPA, introduced another level into FineReader - the entire multi-page document. First of all, this was necessary for the correct reproduction of the logical structure, which is becoming more and more complicated in modern documents. But there are additional bonuses: increased accuracy and faster processing of repeating objects, more correct identification (and hence recognition) of objects "flowing" from page to page.

This is exactly why ADRT (Adaptive Document Recognition Technology) was developed, a technology for analyzing and synthesizing a document at the logical level. Ultimately, it helps make the result of FineReader's work as similar as possible to the original. To do this, the image of the entire document is analyzed, and the recognized words are combined into groups (clusters) depending on their style, surroundings and location on the page. Thus the program, as it were, sees the "logic" of the document's markup and can then unify the formatting of the result.

Thanks to ADRT, FineReader, starting from version 9.0, has learned to detect, recognize and reproduce the following structural parts and elements of document formatting:

  • main text;
  • headers and footers;
  • page numbers;
  • headings of the same level;
  • table of contents;
  • text inserts;
  • figure captions;
  • tables;
  • footnotes;
  • signature / seal areas;
  • fonts and styles.
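
How document-level analysis can pick out repeating elements such as headers and footers can be pictured with the minimal Python sketch below. It is an illustration only, not ABBYY's algorithm: the page representation (pairs of position and text), the `normalize` and `repeating_elements` helpers and the 80% threshold are all assumptions made for this example.

```python
import re
from collections import Counter


def normalize(text):
    """Lower-case the text and replace digit runs, so 'Page 1' and 'Page 2' match."""
    return re.sub(r"\d+", "#", text.strip().lower())


def repeating_elements(pages, min_share=0.8):
    """Return (position, normalized text) pairs found on at least min_share of the pages.

    pages is a list of pages; each page is a list of (position, text) pairs.
    Both the representation and the threshold are assumptions of this sketch.
    """
    counts = Counter()
    for page in pages:
        # A set makes each element count at most once per page.
        counts.update({(pos, normalize(text)) for pos, text in page})
    needed = min_share * len(pages)
    return {key for key, n in counts.items() if n >= needed}


pages = [
    [("top", "Agreement No. 17"), ("body", "The parties agree"), ("bottom", "Page 1")],
    [("top", "Agreement No. 17"), ("body", "Payment terms"), ("bottom", "Page 2")],
    [("top", "Agreement No. 17"), ("body", "Signatures"), ("bottom", "Page 3")],
]
found = repeating_elements(pages)
print(sorted(found))
```

Only the header and the page-number footer survive the filter, because the body text differs from page to page.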

Recognition process

According to the MDA algorithm, the actual recognition proceeds from top to bottom, starting at the page level. Clearly, the more wrong decisions are made in the early stages of this process, the more errors there will be in the later ones. That is why recognition accuracy depends so much on the quality of the originals, and why the algorithms for preprocessing them can be essential. Thus, as color documents grew in popularity, FineReader introduced adaptive binarization (AB). If you scan a document with watermarks, or with text on a textured or colored background, directly in black-and-white mode, "garbage" will invariably appear in the image, and it will then be quite difficult to separate it from the "useful" image (since the original information about it is already lost). That is why FineReader prefers to work with color or grayscale images, converting them to black and white itself (this process is called binarization). But that is not all. Since the colors of the text and background can differ within a page and even within individual lines, AB picks out words with more or less the same characteristics and selects, for each group, the binarization parameters that are optimal in terms of recognition quality. This is precisely where the adaptability of the algorithm lies, which makes it an example of the use of feedback in MDA. Clearly, the effectiveness of AB depends strongly on the design of the source documents; on the ABBYY test base, this algorithm increased recognition accuracy by 14.5%.
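
The difference between one global threshold and thresholds chosen per region can be shown with a small Python sketch. It is a simplified illustration, not FineReader's AB algorithm: the fixed-width regions stand in for the groups of words with similar characteristics, and the midpoint rule in `threshold` is an assumption chosen for brevity.

```python
def threshold(pixels):
    """Midpoint between the darkest and brightest gray value (an assumed rule)."""
    flat = [v for row in pixels for v in row]
    return (min(flat) + max(flat)) / 2


def binarize(pixels, t):
    """1 = ink (darker than the threshold), 0 = background."""
    return [[1 if v < t else 0 for v in row] for row in pixels]


def binarize_by_region(pixels, region_width):
    """Binarize each vertical strip of the image with its own threshold."""
    out = [[] for _ in pixels]
    for x0 in range(0, len(pixels[0]), region_width):
        region = [row[x0:x0 + region_width] for row in pixels]
        for out_row, bin_row in zip(out, binarize(region, threshold(region))):
            out_row.extend(bin_row)
    return out


# Left half: gray text (120) on a light background (200).
# Right half: dark text (20) on a darker, colored background (100).
page = [
    [200, 120, 200, 100, 20, 100],
    [200, 120, 200, 100, 20, 100],
]
global_result = binarize(page, threshold(page))
local_result = binarize_by_region(page, 3)
print(global_result)
print(local_result)
```

With the single global threshold the gray text on the left disappears and the whole right half turns into "garbage", while the per-region thresholds keep the text stroke in both halves.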

But the most interesting part, of course, begins when the recognition process descends to the lowest levels. The so-called linear division procedure breaks lines into words and words into individual letters. Then, in accordance with the IPA principles, it forms a set of hypotheses (that is, possible options for what kind of character this is, which characters the word splits into, and so on) and, having assigned each a probability estimate, passes them to the input of the character recognition mechanism. The latter consists of a number of so-called classifiers, each of which also forms a series of hypotheses ranked by their assumed likelihood. The most important characteristic of any classifier is the average position of the correct hypothesis in this ranking. Clearly, the closer it is to the top, the less work remains for subsequent algorithms such as the dictionary check. For classifiers that already work well enough, however, the characteristics most often assessed are the recognition accuracy over the first three hypotheses or over the first one only, that is, roughly speaking, the ability to guess the correct answer in three attempts or in one. ABBYY uses the following types of classifiers in its systems: raster, feature, feature differential, contour, structural and structural differential, which are grouped at two logical levels.
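
The three characteristics mentioned here (average position of the correct hypothesis, accuracy over the first hypothesis, accuracy over the first three) are easy to compute once a classifier returns ranked lists. The Python sketch below uses invented sample data; the function names and the data are assumptions for illustration, not ABBYY's evaluation code.

```python
def rank_of_correct(hypotheses, truth):
    """1-based position of the correct answer in a ranked list, or None if absent."""
    return hypotheses.index(truth) + 1 if truth in hypotheses else None


def evaluate(samples):
    """samples is a list of (ranked hypotheses, correct answer) pairs."""
    ranks = [rank_of_correct(hyps, truth) for hyps, truth in samples]
    found = [r for r in ranks if r is not None]
    n = len(samples)
    return {
        "mean_rank": sum(found) / len(found) if found else None,
        "top1": sum(r == 1 for r in found) / n,
        "top3": sum(r <= 3 for r in found) / n,
    }


# Invented ranked outputs of some classifier, with the true character on the right.
samples = [
    (["m", "rn", "n"], "m"),
    (["rn", "m", "n"], "m"),
    (["c", "e", "o", "G"], "o"),
    (["l", "1", "I", "i"], "i"),
]
stats = evaluate(samples)
print(stats)
```

For this toy data the correct answer sits on average at position 2.5, is ranked first in a quarter of the cases, and is within the first three in three quarters of them.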

The raster classifier (RC) works by pixel-by-pixel comparison of the character image with reference templates. The templates are formed by averaging images from the training sample and are reduced to a certain standard form; accordingly, the size, stroke thickness, and slant of the image being recognized are also normalized beforehand. This classifier is simple to implement, fast, and resistant to image defects, but its accuracy is relatively low, which is why it is used at the first stage, to quickly generate a list of hypotheses.
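
As a rough illustration of pixel-by-pixel comparison, here is a minimal Python sketch. It assumes tiny 3x3 binary templates and an input that is already normalized to the same size; real templates are averaged from a training sample, and the `TEMPLATES` data and function names here are invented for the example.

```python
# Hypothetical 3x3 reference templates ('1' = ink, '0' = background).
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def pixel_match(image, template):
    """Share of pixels that coincide between the image and a template."""
    total = sum(len(row) for row in template)
    same = sum(a == b
               for img_row, tpl_row in zip(image, template)
               for a, b in zip(img_row, tpl_row))
    return same / total


def raster_classify(image):
    """Return all hypotheses ranked from the best to the worst match."""
    scores = {ch: pixel_match(image, tpl) for ch, tpl in TEMPLATES.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A 'T' with one defective pixel in the bottom-right corner.
noisy_t = ["111", "010", "011"]
ranking = raster_classify(noisy_t)
print(ranking)
```

Despite the defect, "T" stays at the top of the list with 8 of 9 pixels matching, which illustrates both the tolerance to noise and the fact that the output is a ranked list of hypotheses rather than a single answer.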

The feature classifier (FC), as its name implies, relies on the presence in the image of features of a particular character. If there are N such features in total, then each hypothesis can be represented by a point in N-dimensional space; accordingly, the accuracy of the hypothesis is estimated by the distance from it to the point corresponding to the reference (which is also accumulated from the training sample). Clearly, the types and number of features largely determine the quality of recognition, so there are usually a lot of them. This classifier is also relatively fast and simple, but not very resistant to image defects. In addition, the feature classifier works not with the original image but with a certain model, an abstraction, so it ignores some of the information: for example, the mere presence of some important elements says nothing about their mutual arrangement. For this reason, the feature classifier is used not instead of the raster classifier but together with it.
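
The idea of ranking hypotheses by distance in feature space can be sketched in Python as follows. The three features (closed loops, vertical strokes, stroke endpoints) and the reference points are hypothetical values chosen for the example, and plain Euclidean distance is an assumption; the text does not say which features or metric ABBYY uses.

```python
import math

# Hypothetical reference points: (closed loops, vertical strokes, stroke endpoints).
REFERENCES = {
    "o": (1.0, 0.0, 0.0),
    "l": (0.0, 1.0, 2.0),
    "b": (1.0, 1.0, 1.0),
}


def feature_classify(features):
    """Rank characters by Euclidean distance from the measured feature vector."""
    distances = {ch: math.dist(features, ref) for ch, ref in REFERENCES.items()}
    return sorted(distances.items(), key=lambda item: item[1])


# Features measured on a slightly distorted image of the letter 'b'.
measured = (0.9, 1.0, 1.3)
ranking = feature_classify(measured)
print(ranking)
```

The nearest reference point gives the first hypothesis ("b" here), and the remaining ones follow in order of increasing distance.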

The contour classifier (CC) is a special case of the feature classifier and differs in that it analyzes the contours of the presumed character extracted from the original image. In general, its accuracy is lower than that of a full-fledged feature classifier.

The feature differential classifier (FDC) is also similar to the feature classifier, but is used solely to distinguish similar objects, such as "m" and "rn". Accordingly, it analyzes only those areas where the differences lie, and its input includes not only the original images but also the hypotheses formed at the early stages of recognition. Its principle of operation, however, is somewhat different from that of the feature classifier. During training, two "clouds" (groups of points) of possible values are formed in N-dimensional space, one for each of the two options; then a hyperplane is constructed that separates the "clouds" from each other and is approximately equidistant from them. The recognition result depends on which half-space the point corresponding to the original image falls into.
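
A minimal Python sketch of such a two-class decision is given below. As a simplifying assumption, the separating hyperplane is taken as the perpendicular bisector of the segment joining the centroids of the two "clouds", which is equidistant from the centroids; the text does not say how ABBYY actually constructs it. The two features and all sample values are invented.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def train_hyperplane(cloud_a, cloud_b):
    """Return (normal, midpoint) of the bisecting hyperplane between two centroids."""
    ca, cb = centroid(cloud_a), centroid(cloud_b)
    normal = tuple(a - b for a, b in zip(ca, cb))
    midpoint = tuple((a + b) / 2 for a, b in zip(ca, cb))
    return normal, midpoint


def side(point, normal, midpoint):
    """Positive on the side of cloud_a, negative on the side of cloud_b."""
    return sum(n * (p - m) for n, p, m in zip(normal, point, midpoint))


# Invented 2-D features: (gap between the stems, height of the joining arc).
cloud_m = [(0.10, 0.90), (0.20, 0.80), (0.15, 0.85)]
cloud_rn = [(0.80, 0.20), (0.90, 0.30), (0.70, 0.25)]
normal, midpoint = train_hyperplane(cloud_m, cloud_rn)


def decide(point):
    return "m" if side(point, normal, midpoint) > 0 else "rn"


print(decide((0.20, 0.70)), decide((0.75, 0.30)))
```

The classifier only answers the narrow question "m or rn?" for a pair of hypotheses it receives; it does not generate hypotheses of its own.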

By itself, the feature differential classifier does not put forward hypotheses but only refines the existing ones (whose list, in the general case, is sorted by the bubble method). Its effectiveness is therefore not assessed directly but is indirectly equated with the characteristics of the entire first level of OCR recognition. Clearly, though, it depends on the correctness of the selected features and on the representativeness of the sample of references, and providing these is a rather laborious task.

The structural differential classifier (SDC) was originally used to process handwritten texts. Its task is to distinguish between such similar objects as "C" and "G". The SDC relies on features characteristic of each pair of characters, so its training is even more complicated than that of the feature differential classifier, and it works more slowly than all the previous classifiers.

The structural classifier (SC) is a source of pride for ABBYY. It was originally developed to recognize so-called hand-printed text, that is, text a person writes in "block" letters, but was later applied to printed text as well. It is used at the final stages of recognition and comes into play quite rarely, namely only when at least two hypotheses with sufficiently high probabilities reach it.

The quality characteristics of all classifiers are summarized in the following table. However, they only allow the algorithms to be compared with each other, since they are not absolute but were obtained on a specific test sample. One might get the impression that at the last stages of recognition the struggle is literally for fractions of a percent, but in fact each classifier contributes significantly to recognition accuracy: for example, the structural classifier reduces the number of errors by a noticeable 20%.

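Classifier | Raster | Feature | Contour | Feature differential * | Structural differential ** | Structural **
Accuracy for the first three options, % | 99.29 | 99.81 | 99.30 | 99.87 | 99.88 | -
Accuracy for the first option, % | 97.57 | 99.13 | 95.10 | 99.26 | 99.69 | 99.73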

* assessment of the entire first level of ABBYY OCR algorithm
** estimate for the whole algorithm after adding the corresponding classifier

It is curious, however, that despite its rather high accuracy, the recognition algorithm itself does not make the final decision. In accordance with the MDA principle, hypotheses are put forward at every logical level, and their number can grow exponentially. Testing all hypotheses one after another is therefore unlikely to be efficient, so ABBYY OCR systems structure the hypotheses, that is, assign them to certain models. There are a couple of dozen of these; here are just a few of their types: dictionary word, non-dictionary word, Arabic numerals, Roman numerals, URL, regular expression. Each type can include many specific models (for example, a word in one of the known languages, in Latin script, in Cyrillic, etc.).
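
Assigning a recognized string to models can be pictured with the Python sketch below. It covers only five of the model types named above, and everything in it (the tiny `DICTIONARY`, the regular expressions, the function names) is an assumption made for illustration rather than a description of ABBYY's models.

```python
import re

# Hypothetical mini-dictionary; a real system would use full language dictionaries.
DICTIONARY = {"turn", "page", "mix"}

ROMAN = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")


def is_arabic(s):
    return re.fullmatch(r"[0-9]+", s) is not None


def is_roman(s):
    return s != "" and ROMAN.fullmatch(s) is not None


def is_url(s):
    return re.fullmatch(r"(https?://|www\.)\S+", s) is not None


def is_dictionary_word(s):
    return s.lower() in DICTIONARY


def is_non_dictionary_word(s):
    return s.isalpha() and not is_dictionary_word(s)


MODELS = [
    ("arabic_numerals", is_arabic),
    ("roman_numerals", is_roman),
    ("url", is_url),
    ("dictionary_word", is_dictionary_word),
    ("non_dictionary_word", is_non_dictionary_word),
]


def matching_models(s):
    """Names of all models the string is compatible with."""
    return [name for name, test in MODELS if test(s)]


for text in ["2011", "MIX", "www.abbyy.com", "tum"]:
    print(text, matching_models(text))
```

Note that "MIX" fits two models at once (a Roman numeral and a dictionary word), which is exactly the kind of ambiguity that later checks have to resolve.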

All final actions are performed on hypotheses on the basis of models. For example, the context check determines the language of the document and immediately reduces the likelihood of models that use the wrong alphabets, while the dictionary check compensates for uncertain recognition of some characters: for example, the word "turn" is present in the English dictionary, unlike "tum" (in any case, it is not among the common words). Although the dictionary has higher priority than any classifier, it is not necessarily the last resort and in the general case does not stop further checks: firstly, as mentioned above, there is a model for a non-dictionary word, and secondly, the special organization of the dictionaries makes it possible to guess with high probability whether an unknown word can belong to a particular language. Nevertheless, the dictionary check (and the completeness of the dictionaries) has a significant impact on the recognition result, and in ABBYY's own tests it practically halves the number of errors.
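
The "turn" versus "tum" example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the classifier probabilities are invented, the dictionary is a three-word stub, and multiplying the score by a fixed `DICTIONARY_BONUS` is just one simple way to give the dictionary priority without discarding non-dictionary words.

```python
# Hypothetical stub dictionary and bonus factor (assumptions of this sketch).
ENGLISH_WORDS = {"turn", "page", "left"}
DICTIONARY_BONUS = 2.0


def rerank(hypotheses):
    """Re-rank (word, classifier probability) pairs, favoring dictionary words."""
    scored = [
        (word, prob * (DICTIONARY_BONUS if word.lower() in ENGLISH_WORDS else 1.0))
        for word, prob in hypotheses
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# The classifiers slightly prefer "tum": the pair "rn" looks like "m".
hypotheses = [("tum", 0.55), ("turn", 0.45)]
ranking = rerank(hypotheses)
print(ranking)
```

After the dictionary check "turn" moves to the first place, while "tum" stays in the list as a possible non-dictionary word instead of being thrown away.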

Not only OCR

Printed documents are far from the only ones worth digitizing and processing automatically. Quite often you have to work with forms, that is, documents with predefined, fixed fields that are filled in by hand but relatively neatly (with so-called hand-printed characters); various questionnaires are an example. The technology for processing them has a separate name, ICR (intelligent character recognition), and differs quite significantly from OCR. Since the task here is not to recreate the entire document but to extract specific data from it, it breaks down into two main subtasks: finding the required fields and recognizing their content.

This is a rather specific area, and ABBYY offers a completely separate software product for it, ABBYY FlexiCapture. It is designed for building automated and semi-automated systems, is customized for specific types of documents by means of special templates, can intelligently find various fields on pages and verify the data in them, and so on. At its core, however, are character recognition algorithms similar to those used in FineReader, and the general scheme is very similar.

There is still an important difference, however: the structural classifier is a mandatory participant in the process, which is due to the specifics of hand-printed characters. In addition, ICR involves a large number of specific additional checks: for example, whether a character has been struck through, or whether the recognized characters actually form a date.
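
A check of the last kind, whether the recognized characters really form a date, can be sketched in Python with the standard `datetime` module. The `forms_date` helper and the day.month.year field format are assumptions for the example; they say nothing about how FlexiCapture implements its checks.

```python
from datetime import datetime


def forms_date(text, fmt="%d.%m.%Y"):
    """True if the recognized string is a valid calendar date in the given format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


# Recognized contents of a hypothetical "date" field on a form.
for candidate in ["29.02.2012", "31.02.2012", "3l.01.2012"]:
    print(candidate, forms_date(candidate))
```

The first string passes (2012 was a leap year), the second is rejected as an impossible date, and the third is rejected because the letter "l" was recognized in place of the digit "1", which signals that the field needs another look.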

This time I will tell you how to turn paper documents into electronic PDF files, and how to transfer a paper document to a computer so that its text can be edited. So, let's begin.
I have a paper document in my hands.

SCAN to PDF

Task: transfer this document to a computer (convert it to electronic form) in such a way that it cannot be changed later (roughly speaking, take a photo of the document). Then this electronic document must be sent to an email address, and the client asks for it in pdf format.

By stages:
1) pass the document through the scanner
2) save the resulting scan in pdf format on my computer
3) send the resulting file by email
In my work, I use one of two programs for this task:
Foxit Phantom or ABBYY FineReader. For clarity, I attach screenshots:
In Foxit Phantom, with the scanner turned on, select FILE-CREATE PDF-FROM SCANNER in the main menu ...
The document is scanned and you are prompted to save the file. Choose a location, enter a file name and save.

ABBYY FineReader has huge buttons on the toolbar. One of them is called SCAN to PDF. We use it.

If you need to scan a multi-page document, then, by stages:
1) Press button number 1, SCAN

We get the scanned page

We scan the next page in the same way (press button number 1, SCAN, again).
2) Save to PDF



As a result, we get a finished multi-page document in the form of a PDF file.

This file can now be sent by email.

TEXT RECOGNITION

Task: convert a paper document into editable electronic form (on a computer)

By stages:
1) Scan (SCAN button 1)

2) Recognition (RECOGNIZE ALL button 2)

Recognition is the process of converting a photograph (picture) into text (letters, numbers, signs). If you have photographed a page of text, then after recognition 99% of the text on the paper becomes electronic text, which can then be edited on the computer as you like.

3) Saving to a text editor (button 4 Save)
I advise you to choose TRANSFER ALL PAGES TO-MICROSOFT WORD

We get

I would like to point out some important nuances of the RECOGNITION procedure.
Immediately after recognition, I advise you to look at the result, especially at the blocks created by FineReader.

These are areas marked with rectangular frames of different colors. A red frame means the block was recognized as a PICTURE; a black frame means TEXT. There are other block types as well. To see or change the type of a block, click it with the RIGHT mouse button and select CHANGE BLOCK TYPE.

A little trick: you can select an arbitrary area and mark it as a block of any type. For example, use the left mouse button to select the part of the text that is poorly recognized (press, hold and drag; the frame changes size) and mark it as a picture.

As a result, the Word document will contain a text block and a picture block. The picture block keeps its appearance completely unchanged. I use this method to preserve stamps, non-standard fonts, pictures and photographs.

PS: Knowing how to work with PDF and how to scan and recognize documents very often helps out in office work. Knowledge saves you time!
