WO2020143325A1

WO2020143325A1 - Electronic document generation method and device

Info

Publication number: WO2020143325A1
Application number: PCT/CN2019/118554
Authority: WO
Inventors: 黄泽浩; 宋欢儿
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-01-08
Filing date: 2019-11-14
Publication date: 2020-07-16
Also published as: CN109871521A

Abstract

An electronic document generation method and device. The method comprises: obtaining an entity image of a target entity, determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type (S101); adjusting a preset character recognition algorithm on the basis of the entity type, performing processing on the entity image by means of the adjusted character recognition algorithm, and outputting character information about the entity image (S102); obtaining a center coordinate of a recognized character according to a character area image, and determining a document item to which the recognized character belongs by means of the center coordinate and an effective area of each document item (S103); and importing the recognized character into a document item to which the document template belongs, and generating an electronic document about the target entity (S104). The item needing to be imported is determined according to the position of the character, so that the case of abnormal importing is reduced, semantic analysis is not needed, and the generation efficiency of the electronic document is improved.

Description

Method and equipment for generating electronic document

This application declares that it enjoys the priority of the Chinese patent application with the application number 201910017061.1 and the name "an electronic document generation method and equipment" filed on January 08, 2019. The entire content of the Chinese patent application is incorporated by reference in In this application.

Technical field

The present application belongs to the technical field of image processing, and in particular relates to a method and device for generating electronic documents.

Background technique

With the continuous advancement of the electronic process, electronic documents have been widely used in various applications due to the advantages of convenient storage and timely delivery. How to effectively convert physical files into electronic documents directly affects the efficiency of document management. The existing electronic document generation technology generally recognizes the electronic template corresponding to the entity file manually by the administrator, and manually fills the content contained in the entity file into each item of the electronic template. When there are a large number of entity files and When the amount of text is large, it takes more time to convert the electronic document, thereby reducing the efficiency of electronic document generation.

technical problem

In view of this, the embodiments of the present application provide an electronic document generation method and device to solve the existing electronic document generation technology, which requires an administrator to manually identify the electronic template corresponding to the entity file and convert the entity file The content contained in the manual is filled into each item of the electronic template, and the problem of low document generation efficiency.

Technical solution

A first aspect of embodiments of the present application provides an electronic document generation method, including:

Acquiring an entity image of the target entity, and determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type; the document template includes multiple document items;

Adjusting a preset character recognition algorithm based on the entity type, processing the entity image through the adjusted character recognition algorithm, and outputting character information about the entity image; the character information includes the recognized character and the recognized character Image of the character area of the character;

Acquiring the center coordinates of the recognized character according to the character area image, and determining the document item to which the recognized character belongs through the center coordinate and the effective area of each of the document items;

The recognized characters are imported into the document item to which the document template belongs to generate an electronic document about the target entity.

Beneficial effect

In the embodiment of the present application, by acquiring the entity image of the target entity to be converted, and then determining the entity type of the target entity according to the entity image, obtaining the document template matching the entity type; adjusting the character recognition algorithm according to the entity type, extracting the entity image contains Character information, and determine the corresponding document item according to the center coordinates of each recognized character in the character information, and then sequentially import into each associated document item of the document template to generate an electronic document about the target entity to achieve the purpose of automatic generation of electronic documents . Compared with the existing electronic document generation technology, the embodiments of the present application can determine the corresponding document item according to the position of the character, and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm and does not require the user to manually select , Reducing the situation of importing anomalies, without semantic analysis, and improving the efficiency of generation.

BRIEF DESCRIPTION

1 is an implementation flowchart of an electronic document generation method provided in the first embodiment of the present application;

2 is a specific implementation flowchart of an electronic document generation method S102 provided in a second embodiment of the present application;

3 is a specific implementation flowchart of an electronic document generation method S103 provided in a third embodiment of the present application;

4 is a specific implementation flowchart of an electronic document generation method provided in a fourth embodiment of the present application;

5 is a specific implementation flowchart of an electronic document generation method S101 provided in a fifth embodiment of the present application;

6 is a structural block diagram of an electronic document generation device provided by an embodiment of the present application;

7 is a schematic diagram of a terminal device according to another embodiment of the present application.

Embodiments of the invention

In the embodiment of the present application, the execution subject of the process is a terminal device. The terminal device includes but is not limited to: a server, a computer, a smart phone, a tablet computer, and other devices capable of performing electronic document generation operations. FIG. 1 shows an implementation flowchart of the electronic document generation method provided by the first embodiment of the present application, which is described in detail as follows:

In S101, an entity image of a target entity is acquired, and an entity type of the target entity is determined according to the entity image, and a document template matching the entity type is acquired; the document template includes multiple document items.

In this embodiment, the terminal device can receive the entity map about the target entity sent by the user terminal. In this case, the user can collect the entity image about the target entity through the shooting unit built in the user terminal, and install it through the user terminal. Some clients upload the captured physical image to the terminal device, and after receiving the image upload instruction from the client, the terminal device performs the relevant operations of S101. Optionally, in order to ensure the legitimacy of the uploaded image, the terminal device will obtain the program number of the client, and determine whether the client is a program file downloaded through a legal distribution channel through the program number, if the client number is recognized as an illegal parameter , It refuses to receive the physical image, and returns abnormal image information, thereby ensuring the legitimacy of the authentication operation and preventing unauthorized users from performing image recognition, which leads to excessive load and reduces the recognition efficiency and accuracy; otherwise, if the program number is recognized If it is a legal number, the operation of S101 is performed. In addition to receiving entity images sent by other devices, you can also obtain entity images about target entities through image acquisition units such as the camera module or scanning module built in the terminal device. In this case, the terminal device can accept user-initiated image acquisition instructions Or when it is detected that the target entity to be recognized is placed in the image collection area, the image collection module is started to obtain the image information of the current time in the collection area, and the relevant operations of S101 are performed.

In this embodiment, the terminal device can preprocess the received entity image, thereby improving the accuracy of identifying the entity type. The specific pre-processing method may be: the terminal device acquires the ambient light intensity at the time of acquiring the image, determines the highlight adjustment factor and the shadow adjustment factor based on the current ambient light intensity, and applies the two adjustment factors to the highlight area and the shadow area of the physical image Make adjustments; identify the boundary contour of the target entity in the entity image, crop the entity image based on the boundary contour, so as to filter out invalid background areas; grayscale the cropped and adjusted entity image, which can improve recognition Accuracy.

In this embodiment, the terminal device may determine the entity type corresponding to the target entity according to the entity image. The specific way to identify the entity type may be: the terminal device recognizes the image size of the entity image, and compares the image size with a preset file type size list to determine the file type corresponding to the image size, so as to obtain The entity type of the target entity. In addition to determining the entity type according to the image size, the terminal device can also identify the file title of the target entity, extract the file keywords included in the file title, and determine the entity type of the target entity.

In this embodiment, the terminal device pre-configures corresponding document templates for different entity types. Since the document items included in different entity types will be different, in order to improve the accuracy of automatic import, the terminal device will pre-register different entities The document item corresponding to the type is encapsulated to generate a document template about the entity type, thereby improving the efficiency of subsequent character import. The document item is specifically used to represent different types of information contained in the target entity. One type of information can correspond to a document item. For example, a target entity is used to record the user name, and the user name can correspond to a document item; the target The entity also records the user's address, and the user's address can also correspond to another document item, so as to facilitate the classification and import of different types of information in the entity type, and improve the accuracy of the import.

In S102, a preset character recognition algorithm is adjusted based on the entity type, the entity image is processed by the adjusted character recognition algorithm, and character information about the entity image is output; the character information includes recognized characters And the character area image of the recognized character.

In this embodiment, after determining the entity type of the target entity, the terminal device may acquire the recognition algorithm parameters corresponding to the entity type, and adjust the preset character recognition algorithm through the recognition algorithm parameters to enable the character recognition The algorithm matches the entity type, thereby improving the accuracy of the character recognition algorithm. Among them, if the character recognition algorithm is a pooled neural network, the character information contained in the entity image is extracted through multiple pooling and fully connected layers, then the parameter of the recognition algorithm can be a pooled convolution kernel, and the corresponding is obtained based on the entity type Pooled convolution kernel, in particular, the entity type contains multiple character information of different font sizes and font types, which can correspond to multiple pooled convolution kernels, which can improve the efficiency of character recognition. If the character recognition algorithm is a window character recognition algorithm, and the sliding window is used to determine whether the area covered by the window contains characters, the corresponding sliding window can be obtained by the entity type.

Optionally, in this embodiment, the character recognition model may be an OCR algorithm based on Tessract technology, then the terminal device may obtain a font sample library associated with it according to the entity type, so as to quickly match each character contained in the entity image Identify,. Compared with the character recognition method based on neural network, the process efficiency of OCR recognition is higher and the hardware requirements for terminal devices are lower. On the other hand, the character sample library built by Tessract can also recognize characters of different font types. Further improve the efficiency of identification.

In this embodiment, after recognizing a character, the terminal device will locate the character area where the character is located, and obtain the coordinates of the center point of the character area as the character coordinates of the recognized character, and then based on the character coordinates and Recognize the correspondence between characters.

In S103, the center coordinates of the recognized character are obtained according to the character area image, and the document item to which the recognized character belongs is determined by the center coordinate and the effective area of each document item.

In this embodiment, after determining a character in the physical image, the terminal device may acquire the character area image where the character is located, and determine the character area image by using four corner point coordinates. And determine the center coordinates of the recognized characters based on the coordinates of the four corner points. Since the character information contained in different document items is fixed in the effective area to which the document item belongs, the center coordinates can be used as the characteristic coordinates of the recognized character, by calculating the distance between the center coordinates and the effective area of the document item Value, from which to determine whether the recognized character belongs to the document item based on the distance value. Optionally, if the distance value is less than the preset association distance, it is determined that the recognized character belongs to the document item; otherwise, if the distance value is greater than or equal to the preset association threshold, it is determined that the recognized character does not belong to the For document items, calculate the distance between the recognized character and other document items.

In this embodiment, since the entity document is printed based on the document template, the document generated by the salesperson or customer fills in the corresponding information, so the entity document has a corresponding document template, and each document in the document template The distance between the item and its associated information is small, so the distance between each document item and the recognized character can be calculated, and the document item corresponding to each recognized character can be identified, so that each recognized Characters are automatically imported for the purpose of document templates.

Optionally, in this embodiment, the terminal device may select a point closest to the effective area associated with the document item on the character area image, and import the preset Euclidean distance calculation model according to the coordinates of the two points To determine the Euclidean distance between two coordinate points. Preferably, the terminal device can perform a variant on the Euclidean distance calculation model to increase the weight of the longitudinal coordinate and decrease the weight of the horizontal coordinate. The specific Euclidean distance variant formula as follows:

Among them, α and β are preset coefficients. Since they belong to the same document item, they should be in the same horizontal area, so the weight of the ordinate to the distance value should be greater. Conversely, if the information content of a certain document item is large, the first character and the trailing character and the reference coordinate of the document item The horizontal offset between them is large, but they still belong to the same document item. Based on this, the weight of the corresponding horizontal coordinate should be small, so that the accuracy of recognition can be improved.

In S104, the recognized characters are imported into the document item to which the document template belongs, and an electronic document about the target entity is generated.

In this embodiment, after determining the document item corresponding to each recognized character, the terminal device can import each recognized character into the corresponding document item, thereby generating an electronic document about the target entity, and automatically generating an electronic document purpose.

It can be seen from the above that the method for generating an electronic document provided by an embodiment of the present application obtains the entity image of the target entity to be converted, and then determines the entity type of the target entity according to the entity image to obtain a document template matching the entity type ; Adjust the character recognition algorithm according to the entity type, extract the character information contained in the entity image, and determine the corresponding document item according to the center coordinates of each recognized character in the character information, and then sequentially import into each associated document item of the document template to generate The electronic document of the target entity realizes the purpose of automatically generating the electronic document. Compared with the existing electronic document generation technology, the embodiment of the present application can determine the corresponding document item according to the position of the character, and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm, without requiring the user to manually select , Reducing the situation of importing anomalies, without semantic analysis, and improving the efficiency of generation.

FIG. 2 shows a specific implementation flowchart of an electronic document generation method S102 provided by the second embodiment of the present application. Referring to FIG. 2, relative to the embodiment described in FIG. 1, an electronic document generation method S102 provided in this embodiment includes: S1021 to S1025, and details are as follows:

Further, the adjusting a preset character recognition algorithm based on the entity type, and outputting character information about the entity image through the adjusted character recognition algorithm includes:

In S1021, the entity image is imported into a five-layer pooling network for pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image.

In this embodiment, in order to determine the character characteristics of the entity image, the terminal device may perform dimensionality reduction processing on the entity image through a preset five-layer pooling network, because the dimensionality reduction operation may make the image characteristics contained in the entity image more Obviously, for example, by pooling and dimension reduction, the information of the character size, character font type, and the location of the character in the solid image can be determined, and after the pooling and dimension reduction of the solid image, the data that the terminal device needs to process The amount will be greatly reduced, which can improve the efficiency of recognition.

In this embodiment, the terminal device adjusts the size of the physical image to a preset standard size, thereby performing dimensionality reduction operations on the adjusted physical image through reference pooling convolution verification; or, the terminal device recognizes the image of the physical image Size, thereby adjusting the pooled convolution kernels at various levels in the five-layer pooling network based on the image size. Through the above adjustment operation, the consistency of the output pooling feature matrix can be guaranteed.

Optionally, in this embodiment, the terminal device may first grayscale the solid image, thereby reducing the number of layers contained in the solid image and the outline of the prominent characters, because the solid image is a color image Next, it contains the image data of the three layers, and it is necessary to perform pooling and dimensionality reduction on the three images at the same time. The calculation of pooling and dimensionality reduction will be larger, so by graying the physical image, not only can it be reduced The number of layers of the entity image, thereby reducing the calculation amount of pooling and dimensionality reduction, can also increase the difference between the character boundary and the background image, and improve the efficiency of character information extraction.

In S1022, a sliding window matching the entity type is obtained, and sliding selection is performed on the pooled feature matrix based on the sliding window to obtain multiple window feature sequences.

In this embodiment, the terminal device will determine the sliding window associated with the entity type according to the entity type of the target entity. The character size and font type included in different entity types will be different, so the terminal device will provide each entity type Configure a sliding window that matches the character information, and then perform sliding selection on the pooling feature matrix based on the sliding window, and the framed data is used as a window feature sequence, so that during the sliding selection process, multiple Window feature sequence.

Optionally, for different entity types, the size of the matching sliding window and the parameters contained in the window are also different.

In S1023, all the window feature sequences are imported into a preset recurrent neural network to generate a character recognition window on the solid image.

In this embodiment, after traversing and acquiring all the window feature sequences of the pooled feature matrix, the terminal device may import each window feature sequence into a preset recurrent neural network, so as to determine the character recognition matched by the entity image The window is the target window archor. Among them, the cyclic neural network includes a cyclic layer and a fully connected layer, the terminal device is set with a number of cycles, and all window trait sequences are extracted through the cyclic layer to form a cyclic feature sequence, and based on the cyclic feature sequence, it is finally poured into the full The connection layer can output the character recognition window corresponding to the physical image.

In S1024, calculate a convolution value of the character recognition window between the area image covered by the physical image and the character recognition window, and identify the area covered by the character recognition window based on the convolution value Whether the image is a character area image.

In this embodiment, the terminal device will perform a sliding frame selection on the solid image through the character recognition window, and calculate the convolution value between the area image covered by the character recognition lock on the solid image and the character recognition window in each sliding process Based on the value of the convolution value to determine whether the coverage area is a character area image. Since the character recognition window is generated based on the character feature sequence of the entity image, that is, the area image matching the character recognition window, it can be determined that the area image contains character information, so the character recognition window and the area can be calculated by calculating The convolution value between the images determines whether the area image covered this time is a character area image.

In this embodiment, the terminal device is provided with a matching range. If the convolution value is within the matching range, the area image covered this time is recognized as a character area image; otherwise, if the convolution value is outside the matching range, Then it is recognized that the area image covered this time is not a character area image.

In S1025, characters included in the character area image are recognized, and the character information is generated.

In this embodiment, characters contained in the character area image are determined by a character recognition algorithm such as OCR, and character information is generated according to the recognized characters and the location information where the character area image is located.

In the embodiment of the present application, by performing dimensionality reduction processing on the entity image and generating a character recognition window based on the window feature matrix after the dimensionality reduction processing, the accuracy of character recognition can be improved.

FIG. 3 shows a specific implementation flowchart of an electronic document generation method S103 provided in the third embodiment of the present application. Referring to FIG. 3, relative to the embodiment described in FIG. 1, an electronic document generation method S103 provided in this embodiment includes: S1031 to S1033, and specific details are as follows:

Further, the acquiring the center coordinates of the recognized character based on the character area image, and determining the document item to which the recognized character belongs through the center coordinate and the effective area of each of the document items, include:

In S1031, the corner coordinates of the character area image are obtained, and the center coordinates are calculated according to the corner coordinates and the image size of the solid image.

In this embodiment, the character area image can be defined by multiple corner coordinates. Therefore, the terminal device can arbitrarily select two diagonal corner coordinates or four corner coordinates from the character area image, so as to pass the corner coordinates Calculate the geometric center of the character area image. For example, if the two angular coordinates are (x ₁ , y ₁ ) and (x ₂ , y ₂ ), then the geometric center of the character area image is:

In this embodiment, the terminal device also acquires the image size of the physical image, so as to calculate the center coordinates of the character area image according to the geometric center and the image size. Assuming that the length and width of the physical image are divided into L and H, the geometric center of the character area is:

Due to the problem of shooting angle and resolution, the position of the character area image may be affected, and determining the center coordinate of the character area by the image size and geometric center of the physical image can reduce the influence of the center coordinate on the size of the physical image. Improve the accuracy of document item identification.

In S1032, the distance between the center coordinate and each coordinate point on the outline of the effective area is calculated, and the distance with the smallest value is selected as the characteristic distance between the character area image and the document item.

In this embodiment, after determining the center coordinates of the character area image, the terminal device may calculate the distance value between the center coordinate and each coordinate point on the outline line of the document item, and select the smallest distance value as the character area image The feature distance value from the document item is the closest to the document item due to the closest distance to the document item's outline boundary; otherwise, the farther away from the document item's outline boundary, the document item The less relevant the items are. Because, in order to determine the correlation between the two, the terminal device needs to determine the minimum value between the center coordinate and the effective area of the document item, that is, the above-mentioned feature distance.

In S1033, the document item with the smallest feature distance is selected as the document item in the character area image.

In this embodiment, the terminal device calculates the feature distance between the character feature area and each document item, and selects the document item with the smallest feature area as the document item of the character area image. Preferably, the terminal device uses the document item covered by the character feature area image as the associated document item of the recognized character, and calculates only the feature distance of the associated document item. Since the character area image of the recognized character may fall into the effective area of multiple document items, it is only necessary to distinguish which document item belongs to, and the character area image of the recognized character does not fall into the document item, then It must be irrelevant to the recognized character, so there is no need to calculate the feature distance, thereby reducing a large number of invalid calculations.

In the embodiment of the present application, by weighting the image coordinates of the center coordinates of the character area image, the influence of the image size on the calculation of the center coordinates can be reduced, and the accuracy of recognition can be improved.

FIG. 4 shows a specific implementation flowchart of an electronic document generation method provided by the fourth embodiment of the present application. Referring to FIG. 4, relative to the embodiments described in FIGS. 1 to 3, an electronic document generation method provided in this embodiment adjusts the preset character recognition algorithm based on the entity type through the adjusted Before the character recognition algorithm processes the entity image and outputs character information about the entity image, it further includes: S401-S403, which are described in detail as follows:

In S401, the average pixel value of the solid image is calculated according to the pixel value of each pixel in the solid image.

In this embodiment, in order to extract the character area image, the terminal device counts the pixel values of each pixel in the physical image to determine the reference color of the physical image. Since the background area image occupies more area than the character area image, the reference color corresponding to the physical image should be similar to the color of the background area image. Based on this, the terminal device calculates the average pixel value of the physical image, so that the background pixel can be easily identified.

In S402, if any difference between the pixel point and the average pixel value in the physical image is less than a preset background threshold, the pixel point is identified as a background pixel point.

In this embodiment, the terminal device calculates the difference between the pixel value of each pixel and the average pixel value, so that it can be determined whether the pixel is similar to the reference color of the physical image, and if it is similar, the pixel can be determined The point is a background pixel, so all pixels can be classified, and pixels with a difference less than a preset background threshold are identified as background pixels; pixels with a difference greater than or equal to the background threshold are identified as character pixels.

In S403, the area covered by the background pixels is identified as a background area image, and the background area image is removed from the physical image to obtain the character area image.

In this embodiment, the terminal device recognizes the continuous area composed of background pixels as a background area image, and the terminal device deletes all the solid images in the background area images to obtain a character area image containing only character pixels .

In the embodiment of the present application, the average pixel value of the entity image is determined, so that the character area image is recognized according to the average pixel value, thereby improving the recognition efficiency and accuracy of the character area.

FIG. 5 shows a specific implementation flowchart of an electronic document generation method S101 provided by the fifth embodiment of the present application. Referring to FIG. 5, relative to the embodiments described in FIGS. 1-3, an electronic document generation method S101 provided in this embodiment includes: S1011 to S1012, and details are as follows:

In S1011, an identifier of a preset area in the physical image is obtained.

In this embodiment, an identifier is configured on at least one predetermined area of the target entity, the identifier may be a character string or a two-dimensional code identifier, etc. The terminal device may extract the identifier of the preset area from the entity area Symbol, and perform symbol recognition on the identifier.

In S1012, the entity type of the target entity is determined based on the identifier.

In this embodiment, the terminal device compares the identifier with the list of entity type identifiers to determine the entity type that the identifier matches, so that the entity type of the target entity can be determined.

In the embodiment of the present application, by identifying the identifier at the preset position, the entity type is determined, and the accuracy and efficiency of the entity type identification are improved.

It should be understood that the size of the sequence numbers of the steps in the above embodiments does not mean the order of execution, and the execution order of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.

FIG. 6 shows a structural block diagram of an electronic document generation device provided by an embodiment of the present application. Each unit included in the electronic document generation device is used to execute each step in the embodiment corresponding to FIG. 1. For details, please refer to the related description in the embodiment corresponding to FIG. 1 and FIG. 1. For ease of explanation, only parts related to this embodiment are shown.

Referring to FIG. 6, the electronic document generating device includes:

The entity image acquisition unit 61 is used to acquire the entity image of the target entity, determine the entity type of the target entity according to the entity image, and obtain a document template matching the entity type; the document template includes multiple document items ;

The character information output unit 62 is used to adjust a preset character recognition algorithm based on the entity type, process the entity image through the adjusted character recognition algorithm, and output character information about the entity image; the character information Including images of the recognized characters and the character area of the recognized characters;

The document item determination unit 63 is configured to acquire the center coordinates of the recognized character based on the character area image, and determine the belonging of the recognized character through the center coordinates and the effective area of each document item Document project

The character information importing unit 64 is configured to import the recognized characters into the document item to which the document template belongs, and generate an electronic document about the target entity.

Optionally, the character information output unit 62 includes:

A pooling feature matrix generating unit, configured to import the entity image into a five-layer pooling network to perform pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image;

A window feature sequence output unit, configured to obtain a sliding window that matches the entity type, and perform sliding selection on the pooled feature matrix based on the sliding window to obtain multiple window feature sequences;

A character recognition window generating unit, used to import all the window feature sequences into a preset recurrent neural network to generate a character recognition window on the solid image;

The character area recognition unit is used to calculate the convolution value of the character recognition window between the area image covered by the physical image and the character recognition window, and recognize the character recognition window based on the convolution value. Whether the area image covered is a character area image;

The character recognition unit is used for recognizing characters contained in the character area image and generating the character information.

Optionally, the document item determination unit 63 includes:

A center coordinate calculation unit, configured to acquire the corner coordinates of the character area image, and calculate the center coordinates according to the corner coordinates and the image size of the physical image;

A feature distance output unit for calculating the distance between the center coordinate and each coordinate point on the contour line of the effective area, and selecting the smallest distance as the feature distance between the character area image and the document item ;

The feature distance comparison unit is used to select the document item with the smallest feature distance as the document item in the character area image.

Optionally, the electronic document generating device further includes:

An average pixel value calculation unit, configured to calculate the average pixel value of the physical image according to the pixel value of each pixel in the physical image;

A background pixel recognition unit, used to identify the pixel as a background pixel if the difference between any of the pixels in the physical image and the average pixel value is less than a preset background threshold;

The character area image extraction unit is configured to recognize the area covered by the background pixels as a background area image, and remove the background area image from the physical image to obtain the character area image.

Optionally, the entity image acquisition unit 61 includes:

An identifier acquiring unit, configured to acquire an identifier of a preset area in the physical image;

The entity type determining unit is configured to determine the entity type of the target entity based on the identifier.

Therefore, the electronic document generation device provided by the embodiment of the present application can also determine the corresponding document item according to the position of the character, and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm, without requiring the user to manually select, reducing In case of abnormal import, no semantic analysis is needed, which improves the efficiency of generation.

7 is a schematic diagram of a terminal device according to another embodiment of the present application. As shown in FIG. 7, the terminal device 7 of this embodiment includes: a processor 70, a memory 71, and computer-readable instructions 72 stored in the memory 71 and executable on the processor 70, such as an electronic document Generate the program. When the processor 70 executes the computer-readable instructions 72, the steps in the above embodiments of the method for generating each electronic document are implemented, for example, S101 to S104 shown in FIG. 1. Alternatively, when the processor 70 executes the computer-readable instructions 72, the functions of the units in the foregoing device embodiments are realized, for example, the functions of the modules 61 to 64 shown in FIG. 6.

Exemplarily, the computer-readable instructions 72 may be divided into one or more units, and the one or more units are stored in the memory 71 and executed by the processor 70 to complete the application . The one or more units may be a series of computer-readable instruction instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 72 in the terminal device 7. For example, the computer-readable instructions 72 may be divided into a solid image acquisition unit, a character information output unit, a document item determination unit, and a character information import unit, and the specific functions of each unit are as described above.

The terminal device 7 may be a computing device such as a desktop computer, a notebook, a palmtop computer and a cloud server. The terminal device may include, but is not limited to, a processor 70 and a memory 71. Those skilled in the art may understand that FIG. 7 is only an example of the terminal device 7 and does not constitute a limitation on the terminal device 7, and may include more or fewer components than the illustration, or a combination of certain components or different components. For example, the terminal device may further include an input and output device, a network access device, a bus, and the like.

The so-called processor 70 may be a central processing unit (Central Processing Unit, CPU), or other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Ready-made programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or a memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk equipped on the terminal device 7, a smart memory card (Smart, Media, Card, SMC), and a secure digital (SD) Cards, flash cards, etc. Further, the memory 71 may include both an internal storage unit of the terminal device 7 and an external storage device. The memory 71 is used to store the computer-readable instructions and other programs and data required by the terminal device. The memory 71 can also be used to temporarily store data that has been or will be output.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or software function unit.

Those of ordinary skill in the art may understand that all or part of the process in the method of the foregoing embodiments may be completed by instructing relevant hardware through computer-readable instructions, which may be stored in a computer non-volatile In a readable storage medium, when the computer-readable instructions are executed, they may include the processes of the foregoing method embodiments. Wherein, any reference to the memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them; although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still implement the foregoing The technical solutions described in the examples are modified, or some of the technical features are equivalently replaced; and these modifications or replacements do not deviate from the spirit and scope of the technical solutions of the embodiments of the present application. Within the scope of protection of this application.

Claims

An electronic document generation method is characterized by including:

Acquiring an entity image of the target entity, and determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type; the document template includes multiple document items;

Adjusting a preset character recognition algorithm based on the entity type, processing the entity image through the adjusted character recognition algorithm, and outputting character information about the entity image; the character information includes the recognized character and the recognized character Image of the character area of the character;

Acquiring the center coordinates of the recognized character according to the character area image, and determining the document item to which the recognized character belongs through the center coordinate and the effective area of each of the document items;

Import the recognized characters into the document item to which the document template belongs, and generate an electronic document about the target entity.
The generating method according to claim 1, wherein the adjusting a preset character recognition algorithm based on the entity type, and outputting character information about the entity image through the adjusted character recognition algorithm includes:

Importing the entity image into a five-layer pooling network for pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image;

Acquiring a sliding window matching the entity type, and slidingly selecting on the pooled feature matrix based on the sliding window to obtain multiple window feature sequences;

Import all the window feature sequences into a preset recurrent neural network to generate a character recognition window about the entity image;

Calculating a convolution value of the character recognition window between the area image covered by the physical image and the character recognition window, and identifying whether the area image covered by the character recognition window is a character based on the convolution value Area image

Recognize the characters contained in the character area image to generate the character information.
The generating method according to claim 1, wherein the center coordinates of the recognized character are obtained from the character area image, and the center coordinates and the effective area of each of the document items are used to determine The document items to which the recognized characters belong include:

Acquiring the corner coordinates of the character area image, and calculating the center coordinates according to the corner coordinates and the image size of the physical image;

Calculating the distance between the center coordinate and each coordinate point on the contour line of the effective area, and selecting the smallest distance as the characteristic distance between the character area image and the document item;

The document item with the smallest feature distance is selected as the document item in the character area image.
The generation method according to any one of claims 1 to 3, wherein in the preset character recognition algorithm adjusted based on the entity type, an entity image is processed by the adjusted character recognition algorithm, Before outputting character information about the physical image, it also includes:

Calculate the average pixel value of the physical image according to the pixel value of each pixel in the physical image;

If the difference between any of the pixel points and the average pixel value in the physical image is less than a preset background threshold, the pixel point is identified as a background pixel point;

The area covered by the background pixels is identified as a background area image, and the background area image is removed from the physical image to obtain the character area image.
The generating method according to any one of claims 1 to 3, wherein the acquiring the entity image of the target entity and determining the entity type of the target entity according to the entity image includes:

Acquiring the identifier of the preset area in the physical image;

The entity type of the target entity is determined based on the identifier.
An electronic document generating device, characterized in that it includes:

An entity image acquiring unit, configured to acquire an entity image of a target entity, determine the entity type of the target entity according to the entity image, and obtain a document template matching the entity type; the document template includes multiple document items;

The character information output unit is used to adjust a preset character recognition algorithm based on the entity type, process the entity image through the adjusted character recognition algorithm, and output character information about the entity image; the character information includes An image of the recognized character and the character area of the recognized character;

A document item determining unit, configured to acquire the center coordinates of the recognized character based on the character area image, and determine the document to which the recognized character belongs based on the center coordinates and the effective area of each document item project;

The character information importing unit is configured to import the recognized characters into the document item to which the document template belongs, and generate an electronic document about the target entity.
The generating device according to claim 6, wherein the character information output unit includes:

A pooling feature matrix generating unit, configured to import the entity image into a five-layer pooling network to perform pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image;

A window feature sequence output unit, configured to obtain a sliding window that matches the entity type, and perform sliding selection on the pooled feature matrix based on the sliding window to obtain multiple window feature sequences;

A character recognition window generating unit, used to import all the window feature sequences into a preset recurrent neural network to generate a character recognition window on the solid image;

The character area recognition unit is used to calculate the convolution value of the character recognition window between the area image covered by the entity image and the character recognition window, and recognize the character recognition window based on the convolution value. Whether the area image covered is a character area image;

The character recognition unit is used for recognizing characters contained in the character area image and generating the character information.
The generating device according to claim 6, wherein the document item determination unit includes:

A center coordinate calculation unit, configured to acquire the corner coordinates of the character area image, and calculate the center coordinates according to the corner coordinates and the image size of the physical image;

A feature distance output unit for calculating the distance between the center coordinate and each coordinate point on the contour line of the effective area, and selecting the smallest distance as the feature distance between the character area image and the document item ;

The feature distance comparison unit is used to select the document item with the smallest feature distance as the document item in the character area image.
The generating device according to any one of claims 6-8, wherein the electronic document generating device further comprises:

An average pixel value calculation unit, used to calculate the average pixel value of the solid image according to the pixel value of each pixel in the solid image;

A background pixel recognition unit, used to identify the pixel as a background pixel if the difference between any of the pixels in the physical image and the average pixel value is less than a preset background threshold;

The character area image extraction unit is configured to recognize the area covered by the background pixels as a background area image, and remove the background area image from the physical image to obtain the character area image.
The generation device according to any one of claims 6-8, wherein the entity image acquisition unit includes:

An identifier acquiring unit, configured to acquire an identifier of a preset area in the physical image;

The entity type determining unit is configured to determine the entity type of the target entity based on the identifier.
A terminal device, characterized in that the terminal device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, and the processor implements the following steps when executed:

Acquiring an entity image of the target entity, and determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type; the document template includes multiple document items;

Adjusting a preset character recognition algorithm based on the entity type, processing the entity image through the adjusted character recognition algorithm, and outputting character information about the entity image; the character information includes the recognized character and the recognized character Image of the character area of the character;

Acquiring the center coordinates of the recognized character according to the character area image, and determining the document item to which the recognized character belongs through the center coordinate and the effective area of each of the document items;

The recognized characters are imported into the document item to which the document template belongs to generate an electronic document about the target entity.
The terminal device according to claim 11, wherein the adjusting a preset character recognition algorithm based on the entity type, and outputting character information about the entity image through the adjusted character recognition algorithm includes:

Importing the entity image into a five-layer pooling network for pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image;

Acquiring a sliding window matching the entity type, and slidingly selecting on the pooled feature matrix based on the sliding window to obtain multiple window feature sequences;

Import all the window feature sequences into a preset recurrent neural network to generate a character recognition window about the entity image;

Calculating a convolution value of the character recognition window between the area image covered by the physical image and the character recognition window, and identifying whether the area image covered by the character recognition window is a character based on the convolution value Area image

Recognize the characters contained in the character area image to generate the character information.
The terminal device according to claim 11, wherein the center coordinates of the recognized character are acquired from the character area image, and the center coordinates and the effective area of each of the document items are used to determine The document items to which the recognized characters belong include:

Acquiring the corner coordinates of the character area image, and calculating the center coordinates according to the corner coordinates and the image size of the physical image;

Calculating the distance between the center coordinate and each coordinate point on the contour line of the effective area, and selecting the smallest distance as the characteristic distance between the character area image and the document item;

The document item with the smallest feature distance is selected as the document item in the character area image.
The terminal device according to any one of claims 11 to 13, wherein, in the adjusting a preset character recognition algorithm based on the entity type, the entity image is processed by the adjusted character recognition algorithm, Before outputting character information about the physical image, the processor performs the following steps when executing the computer-readable instructions:

Calculate the average pixel value of the physical image according to the pixel value of each pixel in the physical image;

If the difference between any of the pixel points and the average pixel value in the physical image is less than a preset background threshold, the pixel point is identified as a background pixel point;

The area covered by the background pixels is identified as a background area image, and the background area image is removed from the physical image to obtain the character area image.
The terminal device according to any one of claims 11-13, wherein the acquiring an entity image of a target entity and determining the entity type of the target entity according to the entity image includes:

Acquiring the identifier of the preset area in the physical image;

The entity type of the target entity is determined based on the identifier.
A computer non-volatile readable storage medium, the computer non-volatile readable storage medium stores computer readable instructions, characterized in that, when the computer readable instructions are executed by a processor, the following steps are implemented:

Acquiring an entity image of a target entity, and determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type; the document template includes multiple document items;

Adjusting a preset character recognition algorithm based on the entity type, processing the entity image through the adjusted character recognition algorithm, and outputting character information about the entity image; the character information includes the recognized character and the recognized character Image of the character area of the character;

Acquiring the center coordinates of the recognized character according to the character area image, and determining the document item to which the recognized character belongs through the center coordinate and the effective area of each of the document items;

Import the recognized characters into the document item to which the document template belongs, and generate an electronic document about the target entity.
The computer non-volatile storage medium according to claim 16, wherein the preset character recognition algorithm is adjusted based on the entity type, and the adjusted character recognition algorithm outputs information about the entity Image character information, including:

Import the entity image into a five-layer pooling network to perform pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image;

Acquiring a sliding window that matches the entity type, and slidingly selecting on the pooling feature matrix based on the sliding window to obtain multiple window feature sequences;

Import all the window feature sequences into a preset recurrent neural network to generate a character recognition window about the entity image;

Calculating a convolution value of the character recognition window between the area image covered by the physical image and the character recognition window, and identifying whether the area image covered by the character recognition window is a character based on the convolution value Area image

Recognize the characters contained in the character area image to generate the character information.
The computer non-volatile storage medium according to claim 16, wherein the center coordinates of the recognized character are obtained from the character area image, and the center coordinates and each of the documents are obtained through the center coordinates The effective area of the item, which determines the document item to which the recognized character belongs, includes:

Acquiring the corner coordinates of the character area image, and calculating the center coordinates according to the corner coordinates and the image size of the physical image;

Calculating the distance between the center coordinate and each coordinate point on the contour line of the effective area, and selecting the smallest distance as the characteristic distance between the character area image and the document item;

The document item with the smallest feature distance is selected as the document item in the character area image.
The non-volatile computer-readable storage medium according to any one of claims 16 to 18, characterized in that, after the preset character recognition algorithm is adjusted based on the entity type, the adjusted character recognition Before the algorithm processes the solid image and outputs character information about the solid image, the computer-readable instructions are executed by the processor to implement the following steps:

Calculate the average pixel value of the physical image according to the pixel value of each pixel in the physical image;

If the difference between any of the pixel points and the average pixel value in the physical image is less than a preset background threshold, the pixel point is identified as a background pixel point;

The area covered by the background pixels is identified as a background area image, and the background area image is removed from the physical image to obtain the character area image.
The computer non-volatile storage medium according to any one of claims 16 to 18, wherein the acquiring an entity image of the target entity and determining the entity type of the target entity based on the entity image, include:

Acquiring the identifier of the preset area in the physical image;

The entity type of the target entity is determined based on the identifier.