WO2020143325A1 - Electronic document generation method and device - Google Patents

Electronic document generation method and device

Info

Publication number
WO2020143325A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
character
entity
document
area
Prior art date
Application number
PCT/CN2019/118554
Other languages
French (fr)
Chinese (zh)
Inventor
黄泽浩
宋欢儿
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2020143325A1 publication Critical patent/WO2020143325A1/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 — Arrangements for image or video recognition or understanding
    • G06V 10/20 — Image preprocessing
    • G06V 10/26 — Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion

Definitions

  • the present application belongs to the technical field of image processing, and in particular relates to a method and device for generating electronic documents.
  • existing electronic document generation technology generally requires an administrator to manually identify the electronic template corresponding to an entity file and to manually fill the content of the entity file into each item of that template. When the entity files are numerous and contain a large amount of text, the conversion takes considerable time, which reduces the efficiency of electronic document generation.
  • the embodiments of the present application provide an electronic document generation method and device to solve the problem of low document generation efficiency in existing technology, in which an administrator must manually identify the electronic template corresponding to an entity file and manually fill the content of the entity file into each item of the electronic template.
  • a first aspect of embodiments of the present application provides an electronic document generation method, including:
  • the document template includes multiple document items
  • the character information includes the recognized character and a character area image of the recognized character;
  • the recognized characters are imported into the document items to which they belong in the document template, to generate an electronic document about the target entity.
  • the embodiments of the present application can determine the corresponding document item according to the position of a character and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm, requires no manual selection by the user, reduces abnormal imports, needs no semantic analysis, and improves generation efficiency.
  • FIG. 2 is a specific implementation flowchart of S102 of an electronic document generation method provided in the second embodiment of the present application.
  • FIG. 3 is a specific implementation flowchart of S103 of an electronic document generation method provided in the third embodiment of the present application.
  • FIG. 4 is a specific implementation flowchart of an electronic document generation method provided in the fourth embodiment of the present application.
  • FIG. 5 is a specific implementation flowchart of S101 of an electronic document generation method provided in the fifth embodiment of the present application.
  • FIG. 6 is a structural block diagram of an electronic document generation device provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a terminal device according to another embodiment of the present application.
  • the execution subject of the process is a terminal device.
  • the terminal device includes but is not limited to: a server, a computer, a smart phone, a tablet computer, and other devices capable of performing electronic document generation operations.
  • FIG. 1 shows an implementation flowchart of the electronic document generation method provided by the first embodiment of the present application, which is described in detail as follows:
  • an entity image of a target entity is acquired, and an entity type of the target entity is determined according to the entity image, and a document template matching the entity type is acquired; the document template includes multiple document items.
  • the terminal device can receive the entity image of the target entity sent by a user terminal.
  • the user can capture an entity image of the target entity through a camera unit built into the user terminal and upload the captured entity image to the terminal device through a client installed on the user terminal; after receiving the image upload instruction from the client, the terminal device performs the relevant operations of S101.
  • to ensure the legitimacy of the uploaded image, the terminal device obtains the program number of the client and uses it to determine whether the client is a program file downloaded through a legal distribution channel. If the program number is recognized as an illegal parameter, the terminal device refuses to receive the entity image and returns image abnormality information, which ensures the legitimacy of the authentication operation and prevents unauthorized users from performing image recognition, which would otherwise increase the load and reduce recognition efficiency and accuracy; conversely, if the program number is recognized as a legal number, the operation of S101 is performed.
  • alternatively, the terminal device may, upon a user-initiated image acquisition instruction or upon detecting that the target entity to be recognized has been placed in the image acquisition area, start the image acquisition module, obtain the image of the acquisition area at the current moment, and perform the relevant operations of S101.
  • the terminal device can preprocess the received entity image, thereby improving the accuracy of identifying the entity type.
  • the preprocessing may specifically be as follows: the terminal device obtains the ambient light intensity at the time the image was acquired, determines a highlight adjustment factor and a shadow adjustment factor based on that ambient light intensity, and adjusts the highlight and shadow regions of the entity image with the two factors; it identifies the boundary contour of the target entity in the entity image and crops the entity image based on that contour, filtering out the invalid background region; it then converts the cropped and adjusted entity image to grayscale, which can improve recognition accuracy.
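A minimal sketch of this kind of preprocessing is shown below, assuming OpenCV and NumPy. The specific adjustment factors, the contour-selection rule, and the order of grayscaling versus cropping are illustrative assumptions rather than values taken from the disclosure.

```python
import cv2
import numpy as np

def preprocess_entity_image(img_bgr, ambient_light):
    # ambient_light assumed normalized to [0, 1]; the gains below are illustrative.
    shadow_gain = 1.0 + max(0.0, 0.5 - ambient_light)          # lift shadows in dim light
    highlight_gain = 1.0 - max(0.0, ambient_light - 0.5) * 0.3  # tame highlights in bright light

    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    adjusted = gray.astype(np.float32)
    adjusted[gray < 128] *= shadow_gain      # shadow region
    adjusted[gray >= 128] *= highlight_gain  # highlight region
    adjusted = np.clip(adjusted, 0, 255).astype(np.uint8)

    # Crop to the largest external contour as a stand-in for the entity boundary,
    # filtering out the invalid background region.
    _, binary = cv2.threshold(adjusted, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
        adjusted = adjusted[y:y + h, x:x + w]
    return adjusted
```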
  • the terminal device may determine the entity type corresponding to the target entity according to the entity image.
  • the entity type may be identified as follows: the terminal device determines the image size of the entity image and compares it with a preset file-type size list to determine the file type corresponding to that size, thereby obtaining the entity type of the target entity.
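The size-based lookup can be pictured as a simple table match with a tolerance; both the table contents and the tolerance below are assumptions for illustration only.

```python
# Assumed preset file-type size list; the entity type names and pixel sizes are illustrative.
SIZE_TABLE = {
    "insurance_form_a": (2100, 2970),
    "claim_receipt":    (1000, 1400),
}

def entity_type_from_size(width, height, tolerance=0.1):
    # Return the first entity type whose reference size matches within the tolerance.
    for entity_type, (w_ref, h_ref) in SIZE_TABLE.items():
        if abs(width - w_ref) / w_ref < tolerance and abs(height - h_ref) / h_ref < tolerance:
            return entity_type
    return None
```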
  • the terminal device can also identify the file title of the target entity, extract the file keywords included in the file title, and determine the entity type of the target entity.
  • the terminal device pre-configures corresponding document templates for different entity types. Since different entity types contain different document items, in order to improve the accuracy of automatic import the terminal device encapsulates the document items corresponding to each entity type in advance to generate a document template for that entity type, thereby improving the efficiency of subsequent character import.
  • the document item is specifically used to represent different types of information contained in the target entity. One type of information can correspond to a document item.
  • for example, if a target entity records a user name, the user name can correspond to one document item; if the target entity also records the user's address, the address can correspond to another document item. This facilitates classified import of the different types of information in the entity type and improves import accuracy.
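One way to picture the pre-configured template is as a mapping from document items to their effective areas on the page. The structure below is a guess at what such an encapsulation could look like; the item names and coordinates are illustrative, not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class DocumentItem:
    name: str
    effective_area: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) on the template

# Hypothetical templates: one per entity type, one item per information type.
DOCUMENT_TEMPLATES: Dict[str, Dict[str, DocumentItem]] = {
    "claim_receipt": {
        "user_name":    DocumentItem("user_name",    (120, 200, 620, 260)),
        "user_address": DocumentItem("user_address", (120, 300, 980, 420)),
    },
}
```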
  • a preset character recognition algorithm is adjusted based on the entity type, the entity image is processed by the adjusted character recognition algorithm, and character information about the entity image is output; the character information includes the recognized characters and the character area images of the recognized characters.
  • after determining the entity type of the target entity, the terminal device may acquire the recognition algorithm parameters corresponding to that entity type and use them to adjust the preset character recognition algorithm so that the algorithm matches the entity type, thereby improving the accuracy of the character recognition algorithm.
  • if the character recognition algorithm is a pooling neural network that extracts the character information contained in the entity image through multiple pooling layers and fully connected layers, the recognition algorithm parameter may be the pooled convolution kernel, and the corresponding pooled convolution kernel is obtained based on the entity type. In particular, if the entity type contains character information of several different font sizes and font types, it can correspond to multiple pooled convolution kernels, which improves the efficiency of character recognition.
  • if the character recognition algorithm is a window-based character recognition algorithm that uses a sliding window to determine whether the area covered by the window contains characters, the corresponding sliding window can be obtained from the entity type.
  • optionally, the character recognition model may be an OCR algorithm based on Tessract technology; the terminal device may then obtain the font sample library associated with the entity type, so that each character contained in the entity image can be matched and recognized quickly. Compared with neural-network-based character recognition, OCR recognition is more efficient and places lower hardware requirements on the terminal device; in addition, the character sample library built with Tessract can also recognize characters of different font types, further improving recognition efficiency.
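A sketch of such an OCR branch using the pytesseract wrapper is shown below. Choosing a Tesseract language/traineddata name per entity type stands in for the "font sample library" described above; that mapping, like the entity type names, is an assumption.

```python
import pytesseract
from PIL import Image

# Assumed mapping from entity type to a Tesseract language / traineddata selection.
LANG_BY_ENTITY_TYPE = {"claim_receipt": "chi_sim", "insurance_form_a": "chi_sim+eng"}

def recognize_characters(image_path, entity_type):
    lang = LANG_BY_ENTITY_TYPE.get(entity_type, "eng")
    data = pytesseract.image_to_data(Image.open(image_path), lang=lang,
                                     output_type=pytesseract.Output.DICT)
    results = []
    for text, x, y, w, h in zip(data["text"], data["left"], data["top"],
                                data["width"], data["height"]):
        if text.strip():
            # Each entry pairs a recognized string with its character-area bounding box.
            results.append((text, (x, y, x + w, y + h)))
    return results
```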
  • after recognizing a character, the terminal device locates the character area where the character is located, takes the coordinates of the center point of that character area as the character coordinates of the recognized character, and then establishes the correspondence between the character coordinates and the recognized character.
  • the center coordinates of the recognized character are obtained according to the character area image, and the document item to which the recognized character belongs is determined by the center coordinate and the effective area of each document item.
  • after determining a character in the entity image, the terminal device may acquire the character area image where the character is located, define the character area image by its four corner coordinates, and determine the center coordinates of the recognized character from those four corner coordinates. Since the character information contained in a document item is fixed within the effective area to which that item belongs, the center coordinates can serve as the characteristic coordinates of the recognized character; by calculating the distance value between the center coordinates and the effective area of a document item, it can be determined from that distance value whether the recognized character belongs to the document item.
  • if the distance value is less than a preset association distance, it is determined that the recognized character belongs to the document item; otherwise, if the distance value is greater than or equal to the preset association threshold, it is determined that the recognized character does not belong to that document item, and the distance values between the recognized character and other document items are calculated.
  • since an entity document is printed from a document template and then filled in by hand with the corresponding information by a salesperson or customer, every entity document has a corresponding document template, and in that template the distance between each document item and its associated information is small. Therefore, by calculating the distance between each document item and each recognized character, the document item corresponding to each recognized character can be identified, thereby achieving the purpose of automatically importing each recognized character into the document template.
  • optionally, the terminal device may select the point on the character area image that is closest to the effective area associated with the document item, and import the coordinates of the two points into a preset Euclidean distance calculation model to determine the Euclidean distance between the two coordinate points. Preferably, the terminal device can apply a variant of the Euclidean distance calculation model that increases the weight of the vertical coordinate and decreases the weight of the horizontal coordinate.
  • the specific Euclidean distance variant formula appears only as an image (PCTCN2019118554-appb-000001) in the published text, with α and β as preset coefficients. Because characters belonging to the same document item should lie in the same horizontal band, the ordinate should carry the larger weight in the distance value; conversely, if a document item contains a large amount of information, the horizontal offset between its first character, its last character, and the item's reference coordinate can be large even though they all belong to the same item, so the weight of the horizontal coordinate should be smaller. This improves recognition accuracy.
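Since only an image of the formula survives, the following is a plausible reconstruction consistent with the surrounding description (a Euclidean distance with separate per-axis weights, the vertical weight being the larger); the exact form in the original filing may differ.

```latex
% Reconstructed weighted Euclidean distance between the character-area point (x_1, y_1)
% and the document item's reference point (x_2, y_2); \alpha and \beta are the preset
% coefficients, with \beta (vertical) assumed larger than \alpha (horizontal).
d = \sqrt{\alpha\,(x_1 - x_2)^2 + \beta\,(y_1 - y_2)^2}, \qquad \beta > \alpha
```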
  • the recognized characters are imported into the document items to which they belong in the document template, and an electronic document about the target entity is generated.
  • the terminal device can import each recognized character into its corresponding document item, thereby generating an electronic document about the target entity and achieving the purpose of automatic electronic document generation.
  • the electronic document generation method obtains the entity image of the target entity to be converted, determines the entity type of the target entity from the entity image, and obtains a document template matching that entity type; it adjusts the character recognition algorithm according to the entity type, extracts the character information contained in the entity image, determines the corresponding document item from the center coordinates of each recognized character in the character information, and then imports the characters into the associated document items of the document template in turn to generate the electronic document of the target entity, thereby achieving the purpose of automatically generating electronic documents.
  • compared with existing electronic document generation technology, the embodiment of the present application can determine the corresponding document item according to the position of a character and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm, requires no manual selection by the user, reduces abnormal imports, needs no semantic analysis, and improves generation efficiency.
  • FIG. 2 shows a specific implementation flowchart of an electronic document generation method S102 provided by the second embodiment of the present application.
  • an electronic document generation method S102 provided in this embodiment includes: S1021 to S1025, and details are as follows:
  • the adjusting a preset character recognition algorithm based on the entity type, and outputting character information about the entity image through the adjusted character recognition algorithm includes:
  • the entity image is imported into a five-layer pooling network for pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image.
  • the terminal device may perform dimensionality reduction on the entity image through a preset five-layer pooling network, because the dimensionality reduction operation makes the image features contained in the entity image more prominent; for example, pooling and dimensionality reduction can reveal the character size, the character font type, and the location of characters in the entity image. After the entity image has been pooled and reduced in dimensionality, the amount of data the terminal device needs to process is greatly reduced, which improves recognition efficiency.
  • the terminal device either adjusts the size of the entity image to a preset standard size and then performs the dimensionality reduction on the adjusted image with reference pooled convolution kernels, or it recognizes the image size of the entity image and adjusts the pooled convolution kernels at each level of the five-layer pooling network based on that image size.
  • the terminal device may first convert the entity image to grayscale, which reduces the number of layers contained in the entity image and highlights the character contours. If the entity image is a color image it contains three channels of image data, and pooling and dimensionality reduction would have to be performed on all three simultaneously, making the computation larger. Converting the entity image to grayscale therefore not only reduces the number of layers of the entity image, and hence the computation of pooling and dimensionality reduction, but also increases the contrast between character boundaries and the background image, improving the efficiency of character information extraction.
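A toy illustration of pooling-based dimensionality reduction on a grayscale page is given below, assuming simple non-overlapping max pooling repeated five times; the per-entity-type kernel sizes are not specified in the text, so the defaults here are assumptions.

```python
import numpy as np

def max_pool2d(x, k):
    # Crop so both dimensions divide evenly, then take block-wise maxima.
    h, w = (x.shape[0] // k) * k, (x.shape[1] // k) * k
    return x[:h, :w].reshape(h // k, k, w // k, k).max(axis=(1, 3))

def five_layer_pooling(gray_image, kernels=(2, 2, 2, 2, 2)):
    # `kernels` stands in for the per-entity-type pooled convolution kernels.
    feat = gray_image.astype(np.float32)
    for k in kernels:
        feat = max_pool2d(feat, k)
    return feat  # pooled feature matrix

pooled = five_layer_pooling(np.random.rand(1024, 768))
print(pooled.shape)  # (32, 24) with the default 2x2 kernels
```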
  • a sliding window matching the entity type is obtained, and sliding selection is performed on the pooled feature matrix based on the sliding window to obtain multiple window feature sequences.
  • the terminal device determines the sliding window associated with the entity type of the target entity. Since the character size and font type contained in different entity types differ, the terminal device configures for each entity type a sliding window matching its character information; both the size of the matching sliding window and the parameters contained in the window may differ between entity types. Sliding selection is then performed on the pooling feature matrix with this sliding window, and each framed block of data is taken as one window feature sequence, so that multiple window feature sequences are obtained during the sliding selection process.
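A sketch of sliding selection over the pooled feature matrix is shown below; the per-entity-type window sizes and the stride are assumptions chosen only to illustrate how each framed block becomes one window feature sequence.

```python
import numpy as np

# Assumed per-entity-type sliding-window sizes (rows, cols) on the pooled matrix.
WINDOW_BY_ENTITY_TYPE = {"claim_receipt": (3, 8), "insurance_form_a": (4, 10)}

def window_feature_sequences(pooled, entity_type, stride=1):
    wh, ww = WINDOW_BY_ENTITY_TYPE.get(entity_type, (3, 8))
    sequences = []
    for i in range(0, pooled.shape[0] - wh + 1, stride):
        for j in range(0, pooled.shape[1] - ww + 1, stride):
            # Each framed block is flattened into one window feature sequence.
            sequences.append(pooled[i:i + wh, j:j + ww].ravel())
    return np.stack(sequences) if sequences else np.empty((0, wh * ww))
```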
  • the terminal device may import each window feature sequence into a preset recurrent neural network in order to determine the character recognition window matched to the entity image, i.e. the target window (anchor).
  • the recurrent neural network includes a recurrent layer and a fully connected layer, and the terminal device is configured with a number of recurrent iterations; all window feature sequences are processed through the recurrent layer to form a recurrent feature sequence, which is finally fed into the fully connected layer to output the character recognition window corresponding to the entity image.
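As a rough sketch only: a small recurrent network that consumes the window feature sequences and emits the parameters of a character recognition window. The text only states that a recurrent layer feeds a fully connected layer, so the architecture and the (x, y, width, height) output parameterization below are assumptions.

```python
import torch
import torch.nn as nn

class AnchorPredictor(nn.Module):
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)  # recurrent layer
        self.fc = nn.Linear(hidden, 4)                          # fully connected layer
        # Assumed output: (x, y, width, height) of the character recognition window.

    def forward(self, window_sequences):
        # window_sequences: (batch, num_windows, feat_dim)
        _, h_n = self.rnn(window_sequences)
        return self.fc(h_n[-1])

model = AnchorPredictor(feat_dim=24)
anchor = model(torch.randn(1, 50, 24))  # one image, 50 window feature sequences
```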
  • in S1024, a convolution value between the character recognition window and the area image it covers on the entity image is calculated, and based on the convolution value it is identified whether the area image covered by the character recognition window is a character area image.
  • the terminal device slides the character recognition window over the entity image and, at each sliding step, calculates the convolution value between the character recognition window and the area image it currently covers on the entity image, determining from that convolution value whether the covered area is a character area image. Since the character recognition window is generated from the character feature sequences of the entity image, an area image that matches the character recognition window can be assumed to contain character information, so whether the currently covered area image is a character area image can be determined by calculating the convolution value between the character recognition window and that area image.
  • the terminal device is provided with a matching range: if the convolution value falls within the matching range, the currently covered area image is recognized as a character area image; otherwise, if the convolution value falls outside the matching range, the currently covered area image is recognized as not being a character area image.
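One way to read this step: compute a correlation ("convolution value") between the recognition window and each covered patch and accept the patch as a character area when the value falls within a preset matching range. The normalization and the range below are assumptions, not values from the disclosure.

```python
import numpy as np

def is_character_area(patch, window_kernel, match_range=(0.6, 1.0)):
    # Normalized correlation between the covered patch and the recognition window,
    # used here as the "convolution value"; both arrays must share the same shape.
    p = (patch - patch.mean()) / (patch.std() + 1e-6)
    k = (window_kernel - window_kernel.mean()) / (window_kernel.std() + 1e-6)
    conv_value = float((p * k).mean())
    return match_range[0] <= conv_value <= match_range[1]
```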
  • characters included in the character area image are recognized, and the character information is generated.
  • the characters contained in the character area image are determined by a character recognition algorithm such as OCR, and the character information is generated from the recognized characters and the location information of the character area image; in this way the accuracy of character recognition can be improved.
  • FIG. 3 shows a specific implementation flowchart of an electronic document generation method S103 provided in the third embodiment of the present application.
  • an electronic document generation method S103 provided in this embodiment includes: S1031 to S1033, and specific details are as follows:
  • the acquiring the center coordinates of the recognized character based on the character area image, and determining the document item to which the recognized character belongs through the center coordinate and the effective area of each of the document items include:
  • the character area image can be defined by multiple corner coordinates, so the terminal device can select any two diagonal corner coordinates, or all four corner coordinates, of the character area image and calculate the geometric center of the character area image from them. For example, if the two diagonal corner coordinates are (x1, y1) and (x2, y2), the geometric center of the character area image is ((x1 + x2)/2, (y1 + y2)/2).
  • the terminal device also acquires the image size of the entity image and calculates the center coordinates of the character area image from the geometric center and the image size. Because the shooting angle and resolution can affect the position of the character area image, determining the center coordinates of the character area from the image size of the entity image together with the geometric center reduces the influence of the entity image size on the center coordinates and improves the accuracy of document item identification.
  • the distance between the center coordinate and each coordinate point on the outline of the effective area is calculated, and the distance with the smallest value is selected as the characteristic distance between the character area image and the document item.
  • the terminal device may calculate the distance value between the center coordinate and each coordinate point on the contour line of a document item's effective area, and select the smallest of these values as the feature distance between the character area image and that document item. The closer the center coordinate is to the contour boundary of a document item, the more relevant it is to that item; conversely, the farther it is from the contour boundary, the less relevant it is. Therefore, to determine the correlation between the two, the terminal device needs the minimum distance between the center coordinate and the effective area of the document item, which is the feature distance described above.
  • the document item with the smallest feature distance is selected as the document item to which the character area image belongs.
  • the terminal device calculates the feature distance between the character area and each document item, and selects the document item with the smallest feature distance as the document item of the character area image.
  • optionally, the terminal device takes only the document items whose effective areas are covered by the character area image as the candidate document items of the recognized character, and calculates feature distances only for those candidates. Since the character area image of a recognized character may fall into the effective areas of several document items, it is only necessary to distinguish which of those it belongs to; a document item into which the character area image does not fall at all must be irrelevant to the recognized character, so its feature distance need not be calculated, which avoids a large number of invalid calculations.
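A sketch of this assignment step is shown below: for each recognized character, compute the smallest distance from its center coordinate to the contour of each candidate document item's effective area and keep the item with the smallest feature distance. Rectangular effective areas are assumed for simplicity, and a point lying inside an area is treated as distance 0, which already marks a strong match.

```python
import math

def point_to_rect_distance(cx, cy, rect):
    # Distance from a point to an axis-aligned rectangle (0 if the point lies inside).
    x_min, y_min, x_max, y_max = rect
    dx = max(x_min - cx, 0, cx - x_max)
    dy = max(y_min - cy, 0, cy - y_max)
    return math.hypot(dx, dy)

def assign_document_item(center, candidate_items):
    # candidate_items: {item_name: effective_area_rect}, e.g. only the items whose
    # effective area the character area image overlaps, so irrelevant items are skipped.
    cx, cy = center
    return min(candidate_items,
               key=lambda name: point_to_rect_distance(cx, cy, candidate_items[name]))
```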
  • the influence of the image size on the calculation of the center coordinates can be reduced, and the accuracy of recognition can be improved.
  • FIG. 4 shows a specific implementation flowchart of an electronic document generation method provided by the fourth embodiment of the present application.
  • the electronic document generation method provided in this embodiment further includes, before adjusting the preset character recognition algorithm based on the entity type, processing the entity image with the adjusted character recognition algorithm, and outputting character information about the entity image, steps S401 to S403, which are described in detail as follows:
  • the average pixel value of the entity image is calculated from the pixel values of the pixels in the entity image.
  • in order to extract the character area image, the terminal device gathers statistics on the pixel values of the pixels in the entity image to determine the reference color of the entity image. Since the background area image occupies a larger area than the character area image, the reference color of the entity image should be close to the color of the background area image. On this basis, the terminal device calculates the average pixel value of the entity image so that background pixels can be identified easily.
  • the terminal device calculates the difference between the pixel value of each pixel and the average pixel value to determine whether the pixel is similar to the reference color of the entity image; if it is similar, the pixel can be determined to be a background pixel. All pixels can thus be classified: pixels whose difference is less than a preset background threshold are identified as background pixels, and pixels whose difference is greater than or equal to the background threshold are identified as character pixels.
  • the area covered by the background pixels is identified as a background area image, and the background area image is removed from the entity image to obtain the character area image.
  • the terminal device recognizes each continuous area composed of background pixels as a background area image and deletes all background area images from the entity image, obtaining a character area image that contains only character pixels.
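A compact sketch of S401 to S403 (average pixel value, background/character split, background removal), assuming a grayscale entity image and an illustrative threshold:

```python
import numpy as np

def strip_background(gray, background_threshold=40):
    # S401: average pixel value of the entity image.
    mean_value = gray.mean()
    # S402: pixels close to the average are background, the rest are character pixels.
    is_character = np.abs(gray.astype(np.float32) - mean_value) >= background_threshold
    # S403: blank out the background area, keeping only character pixels.
    return np.where(is_character, gray, 255).astype(np.uint8)
```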
  • the average pixel value of the entity image is determined, so that the character area image is recognized according to the average pixel value, thereby improving the recognition efficiency and accuracy of the character area.
  • FIG. 5 shows a specific implementation flowchart of an electronic document generation method S101 provided by the fifth embodiment of the present application.
  • an electronic document generation method S101 provided in this embodiment includes: S1011 to S1012, and details are as follows:
  • an identifier is arranged in at least one predetermined area of the target entity; the identifier may be a character string, a two-dimensional code, or the like. The terminal device may extract the identifier from the preset area of the entity image and perform symbol recognition on it.
  • the entity type of the target entity is determined based on the identifier.
  • the terminal device compares the identifier with the list of entity type identifiers to determine the entity type that the identifier matches, so that the entity type of the target entity can be determined.
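A sketch of this identifier branch, assuming the identifier is a QR code read with OpenCV's QRCodeDetector and matched against a preset identifier-to-type list; the list contents and the identifier strings are illustrative assumptions.

```python
import cv2

# Assumed entity-type identifier list.
IDENTIFIER_TO_TYPE = {"PA-CLAIM-01": "claim_receipt", "PA-FORM-07": "insurance_form_a"}

def entity_type_from_identifier(entity_image_bgr, preset_area):
    x, y, w, h = preset_area                       # predetermined area carrying the identifier
    region = entity_image_bgr[y:y + h, x:x + w]
    text, _, _ = cv2.QRCodeDetector().detectAndDecode(region)
    return IDENTIFIER_TO_TYPE.get(text)            # None if the identifier is unknown
```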
  • in this way the entity type is determined from the identifier, which improves the accuracy and efficiency of entity type identification.
  • FIG. 6 shows a structural block diagram of an electronic document generation device provided by an embodiment of the present application.
  • Each unit included in the electronic document generation device is used to execute each step in the embodiment corresponding to FIG. 1.
  • only parts related to this embodiment are shown.
  • the electronic document generating device includes:
  • the entity image acquisition unit 61 is used to acquire the entity image of the target entity, determine the entity type of the target entity according to the entity image, and obtain a document template matching the entity type; the document template includes multiple document items;
  • the character information output unit 62 is used to adjust a preset character recognition algorithm based on the entity type, process the entity image through the adjusted character recognition algorithm, and output character information about the entity image; the character information includes the recognized characters and the character area images of the recognized characters;
  • the document item determination unit 63 is configured to acquire the center coordinates of the recognized character based on the character area image, and to determine, through the center coordinates and the effective area of each document item, the document item to which the recognized character belongs;
  • the character information importing unit 64 is configured to import the recognized characters into the document items to which they belong in the document template, and to generate an electronic document about the target entity.
  • the character information output unit 62 includes:
  • a pooling feature matrix generating unit configured to import the entity image into a five-layer pooling network to perform pooling and dimensionality reduction operations to obtain a pooling feature matrix of the entity image;
  • a window feature sequence output unit configured to obtain a sliding window that matches the entity type, and perform sliding selection on the pooled feature matrix based on the sliding window to obtain multiple window feature sequences;
  • a character recognition window generating unit, used to import all the window feature sequences into a preset recurrent neural network to generate a character recognition window for the entity image;
  • the character area recognition unit is used to calculate the convolution value between the character recognition window and the area image it covers on the entity image, and to recognize, based on the convolution value, whether the area image covered by the character recognition window is a character area image;
  • the character recognition unit is used for recognizing characters contained in the character area image and generating the character information.
  • the document item determination unit 63 includes:
  • a center coordinate calculation unit, configured to acquire the corner coordinates of the character area image and calculate the center coordinates according to the corner coordinates and the image size of the entity image;
  • a feature distance output unit, for calculating the distance between the center coordinate and each coordinate point on the contour line of the effective area, and selecting the smallest distance as the feature distance between the character area image and the document item;
  • the feature distance comparison unit is used to select the document item with the smallest feature distance as the document item in the character area image.
  • the electronic document generating device further includes:
  • an average pixel value calculation unit, configured to calculate the average pixel value of the entity image according to the pixel value of each pixel in the entity image;
  • a background pixel recognition unit, used to identify a pixel as a background pixel if the difference between that pixel's value and the average pixel value is less than a preset background threshold;
  • the character area image extraction unit is configured to recognize the area covered by the background pixels as a background area image, and to remove the background area image from the entity image to obtain the character area image.
  • the entity image acquisition unit 61 includes:
  • an identifier acquiring unit, configured to acquire an identifier of a preset area in the entity image;
  • the entity type determining unit is configured to determine the entity type of the target entity based on the identifier.
  • the electronic document generation device provided by the embodiment of the present application can likewise determine the corresponding document item according to the position of a character and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm, requires no manual selection by the user, reduces abnormal imports, needs no semantic analysis, and improves generation efficiency.
  • the terminal device 7 of this embodiment includes: a processor 70, a memory 71, and computer-readable instructions 72 stored in the memory 71 and executable on the processor 70, such as an electronic document generation program.
  • when the processor 70 executes the computer-readable instructions 72, the steps in the above embodiments of the electronic document generation method are implemented, for example S101 to S104 shown in FIG. 1; alternatively, when the processor 70 executes the computer-readable instructions 72, the functions of the units in the foregoing device embodiments are realized, for example the functions of modules 61 to 64 shown in FIG. 6.
  • the computer-readable instructions 72 may be divided into one or more units, and the one or more units are stored in the memory 71 and executed by the processor 70 to complete the application .
  • the one or more units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 72 in the terminal device 7.
  • the computer-readable instructions 72 may be divided into an entity image acquisition unit, a character information output unit, a document item determination unit, and a character information import unit, the specific functions of each unit being as described above.
  • the terminal device 7 may be a computing device such as a desktop computer, a notebook, a palmtop computer and a cloud server.
  • the terminal device may include, but is not limited to, a processor 70 and a memory 71.
  • FIG. 7 is only an example of the terminal device 7 and does not constitute a limitation on the terminal device 7, and may include more or fewer components than the illustration, or a combination of certain components or different components.
  • the terminal device may further include an input and output device, a network access device, a bus, and the like.
  • the so-called processor 70 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or a memory of the terminal device 7.
  • the memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 7.
  • the memory 71 may include both an internal storage unit of the terminal device 7 and an external storage device.
  • the memory 71 is used to store the computer-readable instructions and other programs and data required by the terminal device.
  • the memory 71 can also be used to temporarily store data that has been or will be output.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or software function unit.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM (random access memory) is available in many forms, such as dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), direct Rambus RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

An electronic document generation method and device. The method comprises: obtaining an entity image of a target entity, determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type (S101); adjusting a preset character recognition algorithm on the basis of the entity type, processing the entity image by means of the adjusted character recognition algorithm, and outputting character information about the entity image (S102); obtaining a center coordinate of a recognized character according to a character area image, and determining a document item to which the recognized character belongs by means of the center coordinate and an effective area of each document item (S103); and importing the recognized character into the document item to which it belongs in the document template, and generating an electronic document about the target entity (S104). The item into which each character needs to be imported is determined according to the position of the character, so that abnormal importing is reduced, semantic analysis is not needed, and the generation efficiency of electronic documents is improved.

Description

Method and Device for Generating an Electronic Document
This application claims priority to the Chinese patent application No. 201910017061.1, entitled "Method and Device for Generating an Electronic Document", filed on January 8, 2019, the entire content of which is incorporated herein by reference.
Technical Field
The present application belongs to the technical field of image processing, and in particular relates to a method and device for generating electronic documents.
Background
With the continuous advance of digitization, electronic documents have been widely used in many applications because they are easy to store and quick to transmit, so how effectively entity (physical) files can be converted into electronic documents directly affects the efficiency of document management. Existing electronic document generation technology generally requires an administrator to manually identify the electronic template corresponding to an entity file and to manually fill the content of the entity file into each item of that template. When the entity files are numerous and contain a large amount of text, the conversion takes considerable time, which reduces the efficiency of electronic document generation.
Technical Problem
In view of this, the embodiments of the present application provide an electronic document generation method and device to solve the problem of low document generation efficiency in existing electronic document generation technology, in which an administrator must manually identify the electronic template corresponding to an entity file and manually fill the content of the entity file into each item of the electronic template.
Technical Solution
A first aspect of the embodiments of the present application provides an electronic document generation method, including:
acquiring an entity image of a target entity, determining an entity type of the target entity according to the entity image, and acquiring a document template matching the entity type, the document template including multiple document items;
adjusting a preset character recognition algorithm based on the entity type, processing the entity image with the adjusted character recognition algorithm, and outputting character information about the entity image, the character information including recognized characters and the character area images of the recognized characters;
acquiring the center coordinates of a recognized character according to its character area image, and determining the document item to which the recognized character belongs through the center coordinates and the effective area of each document item;
importing the recognized characters into the document items to which they belong in the document template, to generate an electronic document about the target entity.
Beneficial Effects
The embodiments of the present application acquire the entity image of the target entity to be converted, determine the entity type of the target entity from the entity image, and obtain a document template matching that entity type; they adjust the character recognition algorithm according to the entity type, extract the character information contained in the entity image, determine the corresponding document item from the center coordinates of each recognized character in the character information, and then import the characters into the associated document items of the document template in turn, generating an electronic document about the target entity and achieving automatic generation of electronic documents. Compared with existing electronic document generation technology, the embodiments of the present application can determine the corresponding document item according to the position of a character and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm, requires no manual selection by the user, reduces abnormal imports, needs no semantic analysis, and improves generation efficiency.
Brief Description of the Drawings
FIG. 1 is an implementation flowchart of an electronic document generation method provided in the first embodiment of the present application;
FIG. 2 is a specific implementation flowchart of S102 of an electronic document generation method provided in the second embodiment of the present application;
FIG. 3 is a specific implementation flowchart of S103 of an electronic document generation method provided in the third embodiment of the present application;
FIG. 4 is a specific implementation flowchart of an electronic document generation method provided in the fourth embodiment of the present application;
FIG. 5 is a specific implementation flowchart of S101 of an electronic document generation method provided in the fifth embodiment of the present application;
FIG. 6 is a structural block diagram of an electronic document generation device provided in an embodiment of the present application;
FIG. 7 is a schematic diagram of a terminal device provided in another embodiment of the present application.
Embodiments of the Invention
In the embodiments of the present application, the process is executed by a terminal device. The terminal device includes, but is not limited to, a server, a computer, a smart phone, a tablet computer, and other devices capable of performing electronic document generation operations. FIG. 1 shows an implementation flowchart of the electronic document generation method provided in the first embodiment of the present application, described in detail as follows:
In S101, an entity image of a target entity is acquired, an entity type of the target entity is determined according to the entity image, and a document template matching the entity type is acquired; the document template includes multiple document items.
In this embodiment, the terminal device can receive the entity image of the target entity sent by a user terminal. In this case, the user can capture an entity image of the target entity through a camera unit built into the user terminal and upload it to the terminal device through a client installed on the user terminal; after receiving the image upload instruction from the client, the terminal device performs the relevant operations of S101. Optionally, in order to ensure the legitimacy of the uploaded image, the terminal device obtains the program number of the client and uses it to determine whether the client is a program file downloaded through a legal distribution channel. If the program number is recognized as an illegal parameter, the terminal device refuses to receive the entity image and returns image abnormality information, which ensures the legitimacy of the authentication operation and prevents unauthorized users from performing image recognition, which would otherwise increase the load and reduce recognition efficiency and accuracy; conversely, if the program number is recognized as a legal number, the operation of S101 is performed. In addition to receiving entity images sent by other devices, the terminal device may also acquire the entity image of the target entity through a built-in image acquisition unit such as a camera module or a scanning module; in this case, when the terminal device receives a user-initiated image acquisition instruction, or detects that the target entity to be recognized has been placed in the image acquisition area, it starts the image acquisition module, obtains the image of the acquisition area at the current moment, and performs the relevant operations of S101.
In this embodiment, the terminal device can preprocess the received entity image, thereby improving the accuracy of identifying the entity type. The preprocessing may specifically be as follows: the terminal device obtains the ambient light intensity at the time the image was acquired, determines a highlight adjustment factor and a shadow adjustment factor based on that ambient light intensity, and adjusts the highlight and shadow regions of the entity image with the two factors; it identifies the boundary contour of the target entity in the entity image and crops the entity image based on that contour, filtering out the invalid background region; it then converts the cropped and adjusted entity image to grayscale, which can improve recognition accuracy.
In this embodiment, the terminal device can determine the entity type of the target entity from the entity image. The entity type may be identified as follows: the terminal device determines the image size of the entity image and compares it with a preset file-type size list to determine the file type corresponding to that size, thereby obtaining the entity type of the target entity. In addition to determining the entity type from the image size, the terminal device can also identify the file title of the target entity, extract the file keywords contained in the title, and determine the entity type of the target entity.
In this embodiment, the terminal device pre-configures corresponding document templates for different entity types. Since different entity types contain different document items, in order to improve the accuracy of automatic import the terminal device encapsulates the document items corresponding to each entity type in advance to generate a document template for that entity type, thereby improving the efficiency of subsequent character import. A document item represents one type of information contained in the target entity, and one type of information can correspond to one document item. For example, if a target entity records a user name, the user name can correspond to one document item; if the target entity also records the user's address, the address can correspond to another document item. This facilitates classified import of the different types of information in the entity type and improves import accuracy.
In S102, a preset character recognition algorithm is adjusted based on the entity type, the entity image is processed with the adjusted character recognition algorithm, and character information about the entity image is output; the character information includes the recognized characters and the character area images of the recognized characters.
In this embodiment, after determining the entity type of the target entity, the terminal device may acquire the recognition algorithm parameters corresponding to that entity type and use them to adjust the preset character recognition algorithm so that the algorithm matches the entity type, thereby improving the accuracy of the character recognition algorithm. If the character recognition algorithm is a pooling neural network that extracts the character information contained in the entity image through multiple pooling layers and fully connected layers, the recognition algorithm parameter may be the pooled convolution kernel, and the corresponding pooled convolution kernel is obtained based on the entity type; in particular, if the entity type contains character information of several different font sizes and font types, it can correspond to multiple pooled convolution kernels, which improves the efficiency of character recognition. If the character recognition algorithm is a window-based character recognition algorithm that uses a sliding window to determine whether the area covered by the window contains characters, the corresponding sliding window can be obtained from the entity type.
Optionally, in this embodiment, the character recognition model may be an OCR algorithm based on Tessract technology; the terminal device may then obtain the font sample library associated with the entity type, so that each character contained in the entity image can be matched and recognized quickly. Compared with neural-network-based character recognition, OCR recognition is more efficient and places lower hardware requirements on the terminal device; in addition, the character sample library built with Tessract can also recognize characters of different font types, further improving recognition efficiency.
In this embodiment, after recognizing a character, the terminal device locates the character area where the character is located, takes the coordinates of the center point of that character area as the character coordinates of the recognized character, and then establishes the correspondence between the character coordinates and the recognized character.
在S103中,根据所述字符区域图像获取所述已识别字符的中心坐标,并通过所述中心坐标以及各个所述文档项目的有效区域,确定所述已识别字符所属的所述文档项目。In S103, the center coordinates of the recognized character are obtained according to the character area image, and the document item to which the recognized character belongs is determined by the center coordinate and the effective area of each document item.
In this embodiment, after a character has been determined in the entity image, the terminal device may acquire the character area image in which the character is located, delimit that character area image by its four corner coordinates, and determine the center coordinates of the recognized character from the four corner coordinates. Because the character information belonging to a given document item is fixed within the effective area of that document item, the center coordinates can serve as the feature coordinates of the recognized character: the distance between the center coordinates and the effective area of a document item is calculated, and whether the recognized character belongs to that document item is decided from this distance. Optionally, if the distance is less than a preset association threshold, the recognized character is judged to belong to the document item; otherwise, if the distance is greater than or equal to the association threshold, the recognized character is judged not to belong to it, and the distances between the recognized character and the other document items are calculated.
In this embodiment, an entity document is printed from a document template and then filled in by hand by a salesperson or a customer, so every entity document has a corresponding document template, and in that template the distance between each document item and its associated information is small. By calculating the distance between each document item and each recognized character, the document item corresponding to each recognized character can therefore be identified, so that each recognized character is imported into the document template automatically.
Optionally, in this embodiment, the terminal device may select, on the character area image, the point closest to the effective area associated with the document item, and import the coordinates of these two points into a preset Euclidean distance calculation model to determine the Euclidean distance between the two coordinate points. Preferably, the terminal device may use a variant of the Euclidean distance model that increases the weight of the vertical coordinate and decreases the weight of the horizontal coordinate. The specific variant formula is as follows:
Dist = √( α·(x₁ − x₂)² + β·(y₁ − y₂)² )

where (x₁, y₁) is the point selected on the character area image and (x₂, y₂) is the nearest point of the effective area associated with the document item.
Here, α and β are preset coefficients. Characters belonging to the same document item should lie within the same horizontal band, so the vertical coordinate should carry a larger weight in the distance value. Conversely, when a document item contains a large amount of information, its first and last characters are horizontally far from the item's reference coordinates while still belonging to the same item, so the horizontal coordinate should carry a smaller weight. Weighting the coordinates in this way improves the accuracy of the assignment.
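A minimal sketch of this weighted distance and the resulting assignment is given below; the coefficient values ALPHA and BETA, the association threshold, and the assumption that coordinates are normalized to [0, 1] are illustrative choices, not values taken from the embodiment.

```python
# Sketch: weighted Euclidean distance with a heavier vertical weight, used to
# assign a recognized character to the nearest document item.
import math

ALPHA, BETA = 0.3, 1.0   # horizontal / vertical weights (assumed values, BETA > ALPHA)

def weighted_distance(char_point, item_point):
    (x1, y1), (x2, y2) = char_point, item_point
    return math.sqrt(ALPHA * (x1 - x2) ** 2 + BETA * (y1 - y2) ** 2)

def assign_item(char_point, item_points, threshold=0.15):
    # item_points: {item_name: nearest point of the item's effective area}
    best = min(item_points, key=lambda k: weighted_distance(char_point, item_points[k]))
    return best if weighted_distance(char_point, item_points[best]) < threshold else None
```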
In S104, the recognized characters are imported into the document items to which they belong within the document template, and an electronic document about the target entity is generated.
In this embodiment, after determining the document item corresponding to each recognized character, the terminal device can import each recognized character into its corresponding document item, thereby generating an electronic document about the target entity and achieving automatic electronic document generation.
As can be seen from the above, the electronic document generation method provided by this embodiment acquires the entity image of the target entity to be converted, determines the entity type of the target entity from the entity image, and obtains the document template matching that entity type; it adjusts the character recognition algorithm according to the entity type, extracts the character information contained in the entity image, determines the corresponding document item from the center coordinates of each recognized character, and then imports the characters into the associated document items of the document template, generating an electronic document about the target entity and achieving automatic electronic document generation. Compared with existing electronic document generation techniques, the corresponding document item is determined from the position of each character and the character recognition algorithm is adjusted according to the entity type, which improves the accuracy of character recognition, requires no manual selection by the user, reduces import errors, needs no semantic analysis, and improves generation efficiency.
FIG. 2 shows a specific implementation flowchart of S102 of the electronic document generation method provided by the second embodiment of the present application. Referring to FIG. 2, relative to the embodiment described in FIG. 1, S102 of the method provided in this embodiment includes S1021 to S1025, detailed as follows:
Further, adjusting the preset character recognition algorithm based on the entity type and outputting character information about the entity image through the adjusted character recognition algorithm includes:
In S1021, the entity image is imported into a five-layer pooling network for pooling and dimensionality reduction, obtaining the pooled feature matrix of the entity image.
In this embodiment, in order to determine the character characteristics of the entity image, the terminal device may reduce the dimensionality of the entity image through a preset five-layer pooling network. Dimensionality reduction makes the image features contained in the entity image more prominent; for example, pooling can be used to determine the character size, the character font type, and the position of the area in which the characters are located. After the entity image has been pooled, the amount of data the terminal device has to process is greatly reduced, which improves recognition efficiency.
In this embodiment, the terminal device either resizes the entity image to a preset standard size and then reduces its dimensionality with a reference pooling convolution kernel, or it identifies the image size of the entity image and adjusts the pooling convolution kernels of each layer of the five-layer pooling network according to that size. Either adjustment guarantees the consistency of the output pooled feature matrix.
Optionally, in this embodiment, the terminal device may first convert the entity image to grayscale, which reduces the number of layers the image contains and emphasizes the outlines of the characters. If the entity image is a color image, it contains image data in three channels that would all have to be pooled, making the pooling computation larger. Converting the entity image to grayscale therefore not only reduces the number of layers and thus the pooling workload, but also increases the contrast between character boundaries and the background image, improving the efficiency of character information extraction.
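The following sketch illustrates this preprocessing under stated assumptions: the image is converted to grayscale and passed through five successive 2×2 pooling steps. The kernel size and the choice of max-pooling are assumptions; the embodiment only specifies a five-layer pooling network.

```python
# Sketch: grayscale conversion followed by five pooling (dimension-reduction) layers.
import numpy as np

def to_gray(rgb):                       # collapse the three colour layers into one
    return rgb.mean(axis=2)

def pool2x2(a):
    h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
    a = a[:h, :w]                       # crop so the 2x2 blocks tile exactly
    return a.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def pooled_feature_matrix(entity_image_rgb, layers=5):
    feat = to_gray(np.asarray(entity_image_rgb, dtype=np.float32))
    for _ in range(layers):             # five pooling layers
        feat = pool2x2(feat)
    return feat
```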
In S1022, a sliding window matching the entity type is obtained, and sliding selection is performed on the pooled feature matrix based on the sliding window, obtaining a plurality of window feature sequences.
In this embodiment, the terminal device determines the sliding window associated with the entity type of the target entity. Because character sizes and font types differ between entity types, the terminal device configures, for each entity type, a sliding window matching its character information. Sliding selection is then performed on the pooled feature matrix with this window, and the data framed at each position forms one window feature sequence, so that the sliding selection process generates a plurality of window feature sequences.
Optionally, for different entity types, the size of the matching sliding window and the parameters contained in the window also differ.
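A rough sketch of the sliding selection step follows; the per-entity-type window sizes, the ENTITY_WINDOW table, and the stride are illustrative assumptions.

```python
# Sketch: slide an entity-type-specific window over the pooled feature matrix
# and collect one flattened feature sequence per window position.
import numpy as np

ENTITY_WINDOW = {"insurance_form": (4, 16), "invoice": (3, 12)}   # assumed (height, width)

def window_feature_sequences(pooled, entity_type, stride=2):
    wh, ww = ENTITY_WINDOW.get(entity_type, (4, 16))
    sequences = []
    for top in range(0, pooled.shape[0] - wh + 1, stride):
        for left in range(0, pooled.shape[1] - ww + 1, stride):
            patch = pooled[top:top + wh, left:left + ww]
            sequences.append(patch.flatten())      # one window feature sequence
    return np.stack(sequences)
```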
In S1023, all the window feature sequences are imported into a preset recurrent neural network to generate a character recognition window for the entity image.
In this embodiment, after traversing the pooled feature matrix and acquiring all of its window feature sequences, the terminal device may import each window feature sequence into a preset recurrent neural network to determine the character recognition window matching the entity image, that is, the anchor window. The recurrent neural network contains a recurrent layer and a fully connected layer, and the terminal device is configured with a number of iterations; features are extracted from all the window feature sequences through the recurrent layer to form a recurrent feature sequence, which is finally fed into the fully connected layer to output the character recognition window corresponding to the entity image.
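The following PyTorch sketch shows one plausible shape for such a network, with a recurrent layer followed by a fully connected layer that outputs the anchor window parameters (here assumed to be its height and width). The layer sizes and the output parameterization are assumptions, not the patented architecture.

```python
# Sketch: recurrent layer + fully connected layer producing anchor window parameters.
import torch
import torch.nn as nn

class AnchorWindowNet(nn.Module):
    def __init__(self, feat_dim, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)   # recurrent layer
        self.fc = nn.Linear(hidden, 2)                            # outputs (height, width)

    def forward(self, window_sequences):
        # window_sequences: (1, num_windows, feat_dim) — the windows form one sequence
        _, (h_n, _) = self.rnn(window_sequences)
        return self.fc(h_n[-1])                                   # anchor window size
```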
In S1024, the convolution value between the area image covered by the character recognition window on the entity image and the character recognition window is calculated, and whether the area image covered by the character recognition window is a character area image is identified based on the convolution value.
In this embodiment, the terminal device slides the character recognition window over the entity image and, at each position, calculates the convolution value between the character recognition window and the area image it covers on the entity image, then judges from this convolution value whether the covered area is a character area image. Since the character recognition window is generated from the character feature sequences of the entity image, an area image that matches the character recognition window can be determined to contain character information; therefore, the convolution value between the character recognition window and the area image can be used to judge whether the currently covered area image is a character area image.
In this embodiment, the terminal device is configured with a matching range. If the convolution value falls within the matching range, the currently covered area image is identified as a character area image; otherwise, if the convolution value falls outside the matching range, the currently covered area image is identified as not being a character area image.
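The sketch below illustrates this step under the assumption that the "convolution value" is a normalized correlation between the window kernel and the covered region; the matching range bounds and the stride are assumed values.

```python
# Sketch: slide the character recognition window over the image and keep regions
# whose correlation with the window kernel falls inside the matching range.
import numpy as np

def is_character_region(region, kernel, match_range=(0.4, 1.0)):
    # interpret the "convolution value" as a normalised correlation (assumption)
    value = float((region * kernel).sum()) / (np.linalg.norm(region) * np.linalg.norm(kernel) + 1e-8)
    return match_range[0] <= value <= match_range[1]

def find_character_regions(image, kernel, stride=4):
    kh, kw = kernel.shape
    hits = []
    for top in range(0, image.shape[0] - kh + 1, stride):
        for left in range(0, image.shape[1] - kw + 1, stride):
            region = image[top:top + kh, left:left + kw]
            if is_character_region(region, kernel):
                hits.append((top, left, top + kh, left + kw))   # character area box
    return hits
```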
In S1025, the characters contained in the character area images are recognized, and the character information is generated.
In this embodiment, the characters contained in each character area image are determined by a character recognition algorithm such as OCR, and the character information is generated from the recognized characters and the position information of the character area image.
In this embodiment of the present application, the entity image is reduced in dimensionality and the character recognition window is generated from the resulting window feature matrix, which improves the accuracy of character recognition.
FIG. 3 shows a specific implementation flowchart of S103 of the electronic document generation method provided by the third embodiment of the present application. Referring to FIG. 3, relative to the embodiment described in FIG. 1, S103 of the method provided in this embodiment includes S1031 to S1033, detailed as follows:
Further, obtaining the center coordinates of the recognized character from the character area image and determining, through the center coordinates and the effective area of each document item, the document item to which the recognized character belongs includes:
In S1031, the corner coordinates of the character area image are obtained, and the center coordinates are calculated from the corner coordinates and the image size of the entity image.
In this embodiment, the character area image can be delimited by several corner coordinates, so the terminal device may select any two diagonal corner coordinates, or all four corner coordinates, from the character area image and calculate the geometric center of the character area image from them. For example, if the two diagonal corner coordinates are (x₁, y₁) and (x₂, y₂), the geometric center of the character area image is:

( (x₁ + x₂)/2 , (y₁ + y₂)/2 )
In this embodiment, the terminal device also acquires the image size of the entity image and calculates the center coordinates of the character area image from the geometric center and the image size. Assuming the length and width of the entity image are L and H respectively, the center coordinates of the character area are:

( (x₁ + x₂)/(2L) , (y₁ + y₂)/(2H) )

The shooting angle and the resolution may affect the position of the character area image; determining the center coordinates of the character area from the image size and the geometric center of the entity image reduces the influence of the entity image size on the center coordinates and improves the accuracy of document item identification.
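A minimal sketch of this calculation, assuming two diagonal corner points of the character area and normalization by the image length and height:

```python
# Sketch: size-normalised centre coordinate of a character area from two diagonal corners.
def center_coordinate(corner_a, corner_b, image_length, image_height):
    (x1, y1), (x2, y2) = corner_a, corner_b
    # geometric centre of the character area, then weighted by the image size L and H
    return (x1 + x2) / (2.0 * image_length), (y1 + y2) / (2.0 * image_height)
```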
In S1032, the distance between the center coordinates and each coordinate point on the contour of the effective area is calculated, and the smallest of these distances is selected as the feature distance between the character area image and the document item.
In this embodiment, after determining the center coordinates of the character area image, the terminal device may calculate the distance between the center coordinates and each coordinate point on the contour of a document item and select the smallest distance as the feature distance between the character area image and that document item. The smaller the distance to the contour boundary of a document item, the greater the correlation with that document item; conversely, the farther from the contour boundary, the smaller the correlation. Therefore, to determine the correlation between the two, the terminal device needs to determine the minimum distance between the center coordinates and the effective area of the document item, namely the feature distance described above.
In S1033, the document item with the smallest feature distance is selected as the document item to which the character area image belongs.
In this embodiment, the terminal device calculates the feature distance between the character area and each document item and selects the document item with the smallest feature distance as the document item of the character area image. Preferably, the terminal device treats the document items whose effective areas the character area image falls into as the associated document items of the recognized character and calculates the feature distance only for these associated document items. The character area image of a recognized character may fall into the effective areas of several document items, and only then is it necessary to decide which item it belongs to; a document item whose effective area the character area image does not fall into is necessarily unrelated to the recognized character, so its feature distance need not be calculated, which avoids a large number of useless calculations.
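The following sketch shows the feature-distance computation and the final selection, assuming each candidate document item is represented by the list of coordinate points on the contour of its effective area (a data layout chosen only for illustration):

```python
# Sketch: feature distance = minimum distance from the character centre to the
# contour points of an item's effective area; the item with the smallest feature
# distance wins.
import math

def feature_distance(center, contour_points):
    return min(math.dist(center, p) for p in contour_points)

def owning_item(center, candidate_items):
    # candidate_items: {item_name: list of (x, y) contour points of its effective area}
    return min(candidate_items,
               key=lambda name: feature_distance(center, candidate_items[name]))
```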
In this embodiment of the present application, weighting the center coordinates of the character area image by the image size reduces the influence of the image size on the center coordinate calculation and improves recognition accuracy.
FIG. 4 shows a specific implementation flowchart of the electronic document generation method provided by the fourth embodiment of the present application. Referring to FIG. 4, relative to the embodiments described in FIG. 1 to FIG. 3, the method provided in this embodiment further includes S401 to S403 before the preset character recognition algorithm is adjusted based on the entity type, the entity image is processed by the adjusted character recognition algorithm, and character information about the entity image is output, detailed as follows:
In S401, the average pixel value of the entity image is calculated from the pixel values of the pixels in the entity image.
In this embodiment, in order to extract the character area images, the terminal device gathers statistics on the pixel values of the pixels in the entity image to determine the reference color of the entity image. Because the background area occupies a much larger area than the character areas, the reference color of the entity image should be close to the color of the background area. On this basis, the terminal device calculates the average pixel value of the entity image, which makes the background pixels easy to identify.
In S402, if the difference between any pixel in the entity image and the average pixel value is less than a preset background threshold, the pixel is identified as a background pixel.
In this embodiment, the terminal device calculates the difference between the pixel value of each pixel and the average pixel value to determine whether that pixel is close to the reference color of the entity image; if it is, the pixel can be determined to be a background pixel. All pixels can thus be classified: pixels whose difference is less than the preset background threshold are identified as background pixels, and pixels whose difference is greater than or equal to the background threshold are identified as character pixels.
In S403, the areas covered by the background pixels are identified as background area images, and the background area images are removed from the entity image, obtaining the character area images.
In this embodiment, the terminal device identifies each continuous area formed by background pixels as a background area image; after the terminal device deletes all background area images from the entity image, character area images containing only character pixels are obtained.
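A minimal sketch of this background-removal step on a grayscale image follows; the background threshold value and the choice to blank background pixels to white are assumptions made only for illustration.

```python
# Sketch: classify pixels close to the average pixel value as background and
# suppress them, leaving the character regions.
import numpy as np

def remove_background(gray_image, background_threshold=30):
    pixels = gray_image.astype(np.float32)
    avg = pixels.mean()                                      # average pixel value
    background = np.abs(pixels - avg) < background_threshold
    chars_only = gray_image.copy()
    chars_only[background] = 255                             # blank out background pixels
    return chars_only
```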
In this embodiment of the present application, the average pixel value of the entity image is determined and the character area images are identified from it, which improves the efficiency and accuracy of character area identification.
FIG. 5 shows a specific implementation flowchart of S101 of the electronic document generation method provided by the fifth embodiment of the present application. Referring to FIG. 5, relative to the embodiments described in FIG. 1 to FIG. 3, S101 of the method provided in this embodiment includes S1011 to S1012, detailed as follows:
In S1011, an identifier in a preset area of the entity image is obtained.
In this embodiment, an identifier is arranged in at least one predetermined area of the target entity; the identifier may be a character string, a two-dimensional code, or the like. The terminal device may extract the identifier from the preset area of the entity image and perform symbol recognition on it.
In S1012, the entity type of the target entity is determined based on the identifier.
In this embodiment, the terminal device compares the identifier with a list of entity type identifiers to determine the entity type matching the identifier, and thereby determines the entity type of the target entity.
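A minimal sketch of this lookup, assuming the identifier has already been read from the preset area (for example decoded from a two-dimensional code) as a string; the identifier list and the entity-type names are purely illustrative.

```python
# Sketch: map a recognized identifier string to an entity type via an identifier list.
ENTITY_TYPE_IDS = {            # hypothetical identifier -> entity type list
    "PA-001": "insurance_form",
    "PA-002": "claim_form",
}

def entity_type_from_identifier(identifier):
    return ENTITY_TYPE_IDS.get(identifier.strip(), "unknown")
```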
In this embodiment of the present application, the entity type is determined by recognizing the identifier at the preset position, which improves the accuracy and efficiency of entity type identification.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and does not constitute any limitation on the implementation of the embodiments of the present application.
FIG. 6 shows a structural block diagram of an electronic document generation device provided by an embodiment of the present application. Each unit included in the device is used to execute the steps of the embodiment corresponding to FIG. 1. For details, refer to FIG. 1 and the related description in the embodiment corresponding to FIG. 1. For ease of explanation, only the parts relevant to this embodiment are shown.
Referring to FIG. 6, the electronic document generation device includes:
an entity image acquisition unit 61, configured to acquire an entity image of a target entity, determine the entity type of the target entity from the entity image, and obtain a document template matching the entity type, the document template containing a plurality of document items;

a character information output unit 62, configured to adjust a preset character recognition algorithm based on the entity type, process the entity image with the adjusted character recognition algorithm, and output character information about the entity image, the character information including recognized characters and the character area images of the recognized characters;

a document item determination unit 63, configured to obtain the center coordinates of a recognized character from its character area image and determine, through the center coordinates and the effective area of each document item, the document item to which the recognized character belongs; and

a character information import unit 64, configured to import the recognized characters into the document items to which they belong within the document template and generate an electronic document about the target entity.
Optionally, the character information output unit 62 includes:

a pooled feature matrix generation unit, configured to import the entity image into a five-layer pooling network for pooling and dimensionality reduction to obtain the pooled feature matrix of the entity image;

a window feature sequence output unit, configured to obtain a sliding window matching the entity type and perform sliding selection on the pooled feature matrix based on the sliding window to obtain a plurality of window feature sequences;

a character recognition window generation unit, configured to import all the window feature sequences into a preset recurrent neural network to generate a character recognition window for the entity image;

a character area identification unit, configured to calculate the convolution value between the area image covered by the character recognition window on the entity image and the character recognition window, and identify, based on the convolution value, whether the area image covered by the character recognition window is a character area image; and

a character recognition unit, configured to recognize the characters contained in the character area images and generate the character information.
Optionally, the document item determination unit 63 includes:

a center coordinate calculation unit, configured to obtain the corner coordinates of the character area image and calculate the center coordinates from the corner coordinates and the image size of the entity image;

a feature distance output unit, configured to calculate the distance between the center coordinates and each coordinate point on the contour of the effective area and select the smallest of these distances as the feature distance between the character area image and the document item; and

a feature distance comparison unit, configured to select the document item with the smallest feature distance as the document item to which the character area image belongs.
Optionally, the electronic document generation device further includes:

an average pixel value calculation unit, configured to calculate the average pixel value of the entity image from the pixel values of the pixels in the entity image;

a background pixel identification unit, configured to identify a pixel as a background pixel if the difference between that pixel and the average pixel value is less than a preset background threshold; and

a character area image extraction unit, configured to identify the areas covered by the background pixels as background area images and remove the background area images from the entity image to obtain the character area images.
Optionally, the entity image acquisition unit 61 includes:

an identifier acquisition unit, configured to obtain an identifier in a preset area of the entity image; and

an entity type determination unit, configured to determine the entity type of the target entity based on the identifier.
Therefore, the electronic document generation device provided by this embodiment of the present application can likewise determine the corresponding document item from the position of each character and adjust the character recognition algorithm according to the entity type, which improves the accuracy of the character recognition algorithm, requires no manual selection by the user, reduces import errors, needs no semantic analysis, and improves generation efficiency.
FIG. 7 is a schematic diagram of a terminal device provided by another embodiment of the present application. As shown in FIG. 7, the terminal device 7 of this embodiment includes a processor 70, a memory 71, and computer-readable instructions 72, such as an electronic document generation program, stored in the memory 71 and executable on the processor 70. When the processor 70 executes the computer-readable instructions 72, the steps in the above embodiments of the electronic document generation method are implemented, for example S101 to S104 shown in FIG. 1. Alternatively, when the processor 70 executes the computer-readable instructions 72, the functions of the units in the above device embodiments are implemented, for example the functions of the modules 61 to 64 shown in FIG. 6.
Exemplarily, the computer-readable instructions 72 may be divided into one or more units, which are stored in the memory 71 and executed by the processor 70 to complete the present application. The one or more units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 72 in the terminal device 7. For example, the computer-readable instructions 72 may be divided into an entity image acquisition unit, a character information output unit, a document item determination unit, and a character information import unit, the specific functions of each unit being as described above.
The terminal device 7 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 70 and the memory 71. Those skilled in the art will understand that FIG. 7 is only an example of the terminal device 7 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, combine certain components, or use different components. For example, the terminal device may further include input and output devices, network access devices, buses, and so on.
The processor 70 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 7. Further, the memory 71 may include both an internal storage unit and an external storage device of the terminal device 7. The memory 71 is used to store the computer-readable instructions and the other programs and data required by the terminal device. The memory 71 may also be used to temporarily store data that has been output or is about to be output.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through computer-readable instructions, which may be stored in a non-volatile computer-readable storage medium; when executed, the computer-readable instructions may include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the scope of protection of the present application.

Claims (20)

  1. An electronic document generation method, characterized by comprising:

    acquiring an entity image of a target entity, determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type, the document template containing a plurality of document items;

    adjusting a preset character recognition algorithm based on the entity type, processing the entity image with the adjusted character recognition algorithm, and outputting character information about the entity image, the character information including recognized characters and the character area images of the recognized characters;

    obtaining the center coordinates of a recognized character according to its character area image, and determining, through the center coordinates and the effective area of each of the document items, the document item to which the recognized character belongs; and

    importing the recognized characters into the document items to which they belong within the document template, and generating an electronic document about the target entity.
  2. The generation method according to claim 1, characterized in that adjusting the preset character recognition algorithm based on the entity type and outputting character information about the entity image through the adjusted character recognition algorithm comprises:

    importing the entity image into a five-layer pooling network for pooling and dimensionality reduction to obtain a pooled feature matrix of the entity image;

    obtaining a sliding window matching the entity type, and performing sliding selection on the pooled feature matrix based on the sliding window to obtain a plurality of window feature sequences;

    importing all the window feature sequences into a preset recurrent neural network to generate a character recognition window for the entity image;

    calculating a convolution value between the area image covered by the character recognition window on the entity image and the character recognition window, and identifying, based on the convolution value, whether the area image covered by the character recognition window is a character area image; and

    recognizing the characters contained in the character area image to generate the character information.
  3. The generation method according to claim 1, characterized in that obtaining the center coordinates of the recognized character according to the character area image and determining, through the center coordinates and the effective area of each of the document items, the document item to which the recognized character belongs comprises:

    obtaining the corner coordinates of the character area image, and calculating the center coordinates according to the corner coordinates and the image size of the entity image;

    calculating the distance between the center coordinates and each coordinate point on the contour of the effective area, and selecting the smallest of these distances as the feature distance between the character area image and the document item; and

    selecting the document item with the smallest feature distance as the document item to which the character area image belongs.
  4. The generation method according to any one of claims 1 to 3, characterized in that, before adjusting the preset character recognition algorithm based on the entity type, processing the entity image with the adjusted character recognition algorithm, and outputting character information about the entity image, the method further comprises:

    calculating the average pixel value of the entity image according to the pixel values of the pixels in the entity image;

    if the difference between any pixel in the entity image and the average pixel value is less than a preset background threshold, identifying the pixel as a background pixel; and

    identifying the area covered by the background pixels as a background area image, and removing the background area image from the entity image to obtain the character area image.
  5. The generation method according to any one of claims 1 to 3, characterized in that acquiring the entity image of the target entity and determining the entity type of the target entity according to the entity image comprises:

    obtaining an identifier in a preset area of the entity image; and

    determining the entity type of the target entity based on the identifier.
  6. An electronic document generation device, characterized by comprising:

    an entity image acquisition unit, configured to acquire an entity image of a target entity, determine an entity type of the target entity according to the entity image, and obtain a document template matching the entity type, the document template containing a plurality of document items;

    a character information output unit, configured to adjust a preset character recognition algorithm based on the entity type, process the entity image with the adjusted character recognition algorithm, and output character information about the entity image, the character information including recognized characters and the character area images of the recognized characters;

    a document item determination unit, configured to obtain the center coordinates of a recognized character according to its character area image and determine, through the center coordinates and the effective area of each of the document items, the document item to which the recognized character belongs; and

    a character information import unit, configured to import the recognized characters into the document items to which they belong within the document template and generate an electronic document about the target entity.
  7. The generation device according to claim 6, characterized in that the character information output unit comprises:

    a pooled feature matrix generation unit, configured to import the entity image into a five-layer pooling network for pooling and dimensionality reduction to obtain a pooled feature matrix of the entity image;

    a window feature sequence output unit, configured to obtain a sliding window matching the entity type and perform sliding selection on the pooled feature matrix based on the sliding window to obtain a plurality of window feature sequences;

    a character recognition window generation unit, configured to import all the window feature sequences into a preset recurrent neural network to generate a character recognition window for the entity image;

    a character area identification unit, configured to calculate a convolution value between the area image covered by the character recognition window on the entity image and the character recognition window, and identify, based on the convolution value, whether the area image covered by the character recognition window is a character area image; and

    a character recognition unit, configured to recognize the characters contained in the character area image and generate the character information.
  8. The generation device according to claim 6, characterized in that the document item determination unit comprises:

    a center coordinate calculation unit, configured to obtain the corner coordinates of the character area image and calculate the center coordinates according to the corner coordinates and the image size of the entity image;

    a feature distance output unit, configured to calculate the distance between the center coordinates and each coordinate point on the contour of the effective area and select the smallest of these distances as the feature distance between the character area image and the document item; and

    a feature distance comparison unit, configured to select the document item with the smallest feature distance as the document item to which the character area image belongs.
  9. The generation device according to any one of claims 6 to 8, characterized in that the electronic document generation device further comprises:

    an average pixel value calculation unit, configured to calculate the average pixel value of the entity image according to the pixel values of the pixels in the entity image;

    a background pixel identification unit, configured to identify a pixel as a background pixel if the difference between any pixel in the entity image and the average pixel value is less than a preset background threshold; and

    a character area image extraction unit, configured to identify the area covered by the background pixels as a background area image and remove the background area image from the entity image to obtain the character area image.
  10. The generation device according to any one of claims 6 to 8, characterized in that the entity image acquisition unit comprises:

    an identifier acquisition unit, configured to obtain an identifier in a preset area of the entity image; and

    an entity type determination unit, configured to determine the entity type of the target entity based on the identifier.
  11. A terminal device, characterized in that the terminal device comprises a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, and the processor, when executing the instructions, implements the following steps:

    acquiring an entity image of a target entity, determining an entity type of the target entity according to the entity image, and obtaining a document template matching the entity type, the document template containing a plurality of document items;

    adjusting a preset character recognition algorithm based on the entity type, processing the entity image with the adjusted character recognition algorithm, and outputting character information about the entity image, the character information including recognized characters and the character area images of the recognized characters;

    obtaining the center coordinates of a recognized character according to its character area image, and determining, through the center coordinates and the effective area of each of the document items, the document item to which the recognized character belongs; and

    importing the recognized characters into the document items to which they belong within the document template, and generating an electronic document about the target entity.
  12. The terminal device according to claim 11, characterized in that adjusting the preset character recognition algorithm based on the entity type and outputting character information about the entity image through the adjusted character recognition algorithm comprises:

    importing the entity image into a five-layer pooling network for pooling and dimensionality reduction to obtain a pooled feature matrix of the entity image;

    obtaining a sliding window matching the entity type, and performing sliding selection on the pooled feature matrix based on the sliding window to obtain a plurality of window feature sequences;

    importing all the window feature sequences into a preset recurrent neural network to generate a character recognition window for the entity image;

    calculating a convolution value between the area image covered by the character recognition window on the entity image and the character recognition window, and identifying, based on the convolution value, whether the area image covered by the character recognition window is a character area image; and

    recognizing the characters contained in the character area image to generate the character information.
  13. The terminal device according to claim 11, characterized in that obtaining the center coordinates of the recognized character according to the character area image and determining, through the center coordinates and the effective area of each of the document items, the document item to which the recognized character belongs comprises:

    obtaining the corner coordinates of the character area image, and calculating the center coordinates according to the corner coordinates and the image size of the entity image;

    calculating the distance between the center coordinates and each coordinate point on the contour of the effective area, and selecting the smallest of these distances as the feature distance between the character area image and the document item; and

    selecting the document item with the smallest feature distance as the document item to which the character area image belongs.
  14. The terminal device according to any one of claims 11 to 13, characterized in that, before adjusting the preset character recognition algorithm based on the entity type, processing the entity image with the adjusted character recognition algorithm, and outputting character information about the entity image, the processor, when executing the computer-readable instructions, performs the following steps:

    calculating the average pixel value of the entity image according to the pixel values of the pixels in the entity image;

    if the difference between any pixel in the entity image and the average pixel value is less than a preset background threshold, identifying the pixel as a background pixel; and

    identifying the area covered by the background pixels as a background area image, and removing the background area image from the entity image to obtain the character area image.
  15. The terminal device according to any one of claims 11 to 13, wherein the acquiring the entity image of the target entity and determining the entity type of the target entity according to the entity image comprises:
    acquiring an identifier in a preset area of the entity image; and
    determining the entity type of the target entity based on the identifier.
  16. A computer non-volatile readable storage medium storing computer-readable instructions, wherein when the computer-readable instructions are executed by a processor, the following steps are implemented:
    acquiring an entity image of a target entity, determining an entity type of the target entity according to the entity image, and acquiring a document template matching the entity type, wherein the document template comprises a plurality of document items;
    adjusting a preset character recognition algorithm based on the entity type, processing the entity image through the adjusted character recognition algorithm, and outputting character information about the entity image, wherein the character information comprises a recognized character and a character area image of the recognized character;
    acquiring center coordinates of the recognized character according to the character area image, and determining, through the center coordinates and the effective area of each of the document items, the document item to which the recognized character belongs; and
    importing the recognized character into the document item to which it belongs in the document template to generate an electronic document about the target entity.
  17. The computer non-volatile readable storage medium according to claim 16, wherein the adjusting the preset character recognition algorithm based on the entity type and outputting the character information about the entity image through the adjusted character recognition algorithm comprises:
    importing the entity image into a five-layer pooling network for a pooling dimensionality-reduction operation to obtain a pooled feature matrix of the entity image;
    acquiring a sliding window matching the entity type, and performing sliding selection on the pooled feature matrix based on the sliding window to obtain a plurality of window feature sequences;
    importing all the window feature sequences into a preset recurrent neural network to generate a character recognition window for the entity image;
    calculating a convolution value between the area image covered by the character recognition window in the entity image and the character recognition window, and identifying, based on the convolution value, whether the area image covered by the character recognition window is a character area image; and
    recognizing the characters contained in the character area image to generate the character information.
  18. The computer non-volatile readable storage medium according to claim 16, wherein the acquiring the center coordinates of the recognized character according to the character area image, and determining, through the center coordinates and the effective area of each of the document items, the document item to which the recognized character belongs comprises:
    acquiring corner coordinates of the character area image, and calculating the center coordinates according to the corner coordinates and the image size of the entity image;
    calculating the distances between the center coordinates and the coordinate points on the contour line of the effective area, and selecting the smallest of these distances as the feature distance between the character area image and the document item; and
    selecting the document item with the smallest feature distance as the document item to which the character area image belongs.
  19. The computer non-volatile readable storage medium according to any one of claims 16 to 18, wherein before the adjusting the preset character recognition algorithm based on the entity type, processing the entity image through the adjusted character recognition algorithm, and outputting the character information about the entity image, the computer-readable instructions, when executed by the processor, further implement the following steps:
    calculating an average pixel value of the entity image according to the pixel values of the pixels in the entity image;
    if the difference between any pixel in the entity image and the average pixel value is less than a preset background threshold, identifying that pixel as a background pixel; and
    identifying the area covered by the background pixels as a background area image, and removing the background area image from the entity image to obtain the character area image.
  20. The computer non-volatile readable storage medium according to any one of claims 16 to 18, wherein the acquiring the entity image of the target entity and determining the entity type of the target entity according to the entity image comprises:
    acquiring an identifier in a preset area of the entity image; and
    determining the entity type of the target entity based on the identifier.
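
The following sketch illustrates, in rough form, the character-detection flow recited in claims 12 and 17: repeated pooling produces a pooled feature matrix, a sliding window whose shape depends on the entity type is slid over that matrix, and a per-window score decides whether the window covers a character area. The five rounds of 2x2 max pooling, the 3x6 window, the toy template, and the 0.5 threshold are all assumptions made for illustration, and the preset recurrent neural network of the claims is replaced here by a simple normalised-correlation test, so this is a minimal stand-in rather than the claimed implementation.

```python
import numpy as np

def pool_features(image, layers=5):
    """Apply `layers` rounds of 2x2 max pooling (stand-in for the five-layer pooling network)."""
    feat = image.astype(np.float32)
    for _ in range(layers):
        h, w = (feat.shape[0] // 2) * 2, (feat.shape[1] // 2) * 2
        feat = feat[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
    return feat

def sliding_windows(feat, win_h, win_w, stride=1):
    """Yield (row, col, patch) for every window position on the pooled feature matrix."""
    for r in range(0, feat.shape[0] - win_h + 1, stride):
        for c in range(0, feat.shape[1] - win_w + 1, stride):
            yield r, c, feat[r:r + win_h, c:c + win_w]

def window_score(patch, template):
    """Normalised correlation between a window and a character template (stand-in for the convolution value)."""
    a, b = patch - patch.mean(), template - template.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

# Example run with invented sizes: a 320x480 grayscale scan, a 3x6 window shape
# assumed to match the entity type, and an arbitrary acceptance threshold of 0.5.
entity_image = np.random.rand(320, 480)
char_template = np.eye(3, 6)                      # toy diagonal stroke pattern
pooled = pool_features(entity_image)              # pooled feature matrix
character_windows = [(r, c) for r, c, patch in sliding_windows(pooled, 3, 6)
                     if window_score(patch, char_template) > 0.5]
```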
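
A minimal sketch of the assignment step in claims 13 and 18: the corner coordinates of a character area image give its center point, the distance from that center to each document item's effective-area contour line is taken as the feature distance, and the document item with the smallest feature distance is selected. The item names and rectangular contours are invented, and the center is taken as the plain mean of the corners (the claims also involve the image size), so this illustrates the idea only.

```python
import numpy as np

def center_from_corners(corners):
    """corners: (4, 2) array of (x, y) corner coordinates of the character area image."""
    return corners.mean(axis=0)

def feature_distance(center, contour):
    """Smallest distance from the center to any sampled point on an item's contour line."""
    return float(np.min(np.linalg.norm(contour - center, axis=1)))

def assign_document_item(corners, item_contours):
    """Pick the document item whose effective-area contour is closest to the character center."""
    center = center_from_corners(corners)
    return min(item_contours, key=lambda name: feature_distance(center, item_contours[name]))

# Illustrative call with invented document items sampled only at their rectangle corners.
corners = np.array([[120, 40], [180, 40], [180, 60], [120, 60]], dtype=float)
item_contours = {
    "name_field": np.array([[100, 30], [200, 30], [200, 70], [100, 70]], dtype=float),
    "id_number":  np.array([[100, 90], [200, 90], [200, 130], [100, 130]], dtype=float),
}
print(assign_document_item(corners, item_contours))  # -> name_field
```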
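
A sketch of the background-removal preprocessing in claims 14 and 19, assuming a grayscale entity image and an arbitrary threshold of 25: pixels whose difference from the image's average pixel value falls below the preset background threshold are treated as background and blanked out, and what remains is the character area image.

```python
import numpy as np

def remove_background(entity_image, background_threshold=25.0):
    """Mask out pixels close to the average pixel value, keeping the character area image."""
    gray = entity_image.astype(np.float32)
    mean_value = gray.mean()                                   # average pixel value
    background = np.abs(gray - mean_value) < background_threshold
    character_area = gray.copy()
    character_area[background] = 255.0                         # blank out background pixels
    return character_area

# e.g. character_area_image = remove_background(scanned_page) on an 8-bit grayscale scan
```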
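
A brief sketch of the type-detection step in claims 15 and 20, with an invented identifier-to-type mapping; reading the identifier text out of the preset area of the entity image (for example by OCR or barcode decoding) is assumed to have been done already.

```python
# Invented identifier codes and entity types; the claims only require that an
# identifier read from a preset area of the entity image determines the entity type.
IDENTIFIER_TO_ENTITY_TYPE = {
    "ID-CARD": "identity_card",   # assumed code printed on identity documents
    "INV": "vat_invoice",         # assumed code printed on invoices
}

def entity_type_from_identifier(identifier_text):
    """Map the identifier found in the preset area to an entity type."""
    return IDENTIFIER_TO_ENTITY_TYPE.get(identifier_text.strip().upper(), "unknown")

print(entity_type_from_identifier("id-card"))  # -> identity_card
```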
PCT/CN2019/118554 2019-01-08 2019-11-14 Electronic document generation method and device WO2020143325A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910017061.1A CN109871521A (en) 2019-01-08 2019-01-08 A kind of generation method and equipment of electronic document
CN201910017061.1 2019-01-08

Publications (1)

Publication Number Publication Date
WO2020143325A1 (en)

Family

ID=66917551

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118554 WO2020143325A1 (en) 2019-01-08 2019-11-14 Electronic document generation method and device

Country Status (2)

Country Link
CN (1) CN109871521A (en)
WO (1) WO2020143325A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871521A (en) * 2019-01-08 2019-06-11 平安科技(深圳)有限公司 A kind of generation method and equipment of electronic document
CN110764721A (en) * 2019-09-19 2020-02-07 北京三快在线科技有限公司 Template generation method and device, electronic equipment and computer readable medium
CN111144210B (en) * 2019-11-26 2023-07-18 泰康保险集团股份有限公司 Image structuring processing method and device, storage medium and electronic equipment
CN111444907B (en) * 2020-03-24 2023-05-16 上海东普信息科技有限公司 Method, device, equipment and storage medium for character recognition
CN112926590B (en) * 2021-03-18 2023-12-01 上海晨兴希姆通电子科技有限公司 Segmentation recognition method and system for characters on cable
CN115761781B (en) * 2023-01-06 2023-06-20 江苏狄诺尼信息技术有限责任公司 Note image data recognition system for engineering electronic files

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5144683A (en) * 1989-04-28 1992-09-01 Hitachi, Ltd. Character recognition equipment
GB0622863D0 (en) * 2006-11-16 2006-12-27 Ibm Automated generation of form definitions from hard-copy forms
CN102831416A (en) * 2012-08-15 2012-12-19 广州广电运通金融电子股份有限公司 Character identification method and relevant device
CN108121984B (en) * 2016-11-30 2021-09-21 杭州海康威视数字技术股份有限公司 Character recognition method and device
JP6938228B2 (en) * 2017-05-31 2021-09-22 株式会社日立製作所 Calculator, document identification method, and system
CN108765118B (en) * 2018-05-18 2022-03-15 大账房网络科技股份有限公司 Method and system for generating voucher by mixed scanning of bills

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105260733A (en) * 2015-09-11 2016-01-20 北京百度网讯科技有限公司 Method and device for processing image information
US20180060704A1 (en) * 2016-08-30 2018-03-01 Baidu Online Network Technology (Beijing) Co., Ltd. Method And Apparatus For Image Character Recognition Model Generation, And Vertically-Oriented Character Image Recognition
CN108121966A (en) * 2017-12-21 2018-06-05 欧浦智网股份有限公司 A kind of list method for automatically inputting, electronic equipment and storage medium based on OCR technique
CN109710907A (en) * 2018-12-20 2019-05-03 平安科技(深圳)有限公司 A kind of generation method and equipment of electronic document
CN109871521A (en) * 2019-01-08 2019-06-11 平安科技(深圳)有限公司 A kind of generation method and equipment of electronic document

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113130023A (en) * 2021-04-22 2021-07-16 嘉兴易迪希计算机技术有限公司 Image-text recognition and entry method and system in EDC system
CN113435331A (en) * 2021-06-28 2021-09-24 平安科技(深圳)有限公司 Image character recognition method, system, electronic equipment and storage medium
CN113435331B (en) * 2021-06-28 2023-06-09 平安科技(深圳)有限公司 Image character recognition method, system, electronic equipment and storage medium
CN115052132A (en) * 2022-07-05 2022-09-13 国网江苏省电力有限公司南通市通州区供电分公司 Fishing electric shock prevention early warning method and system based on artificial intelligence

Also Published As

Publication number Publication date
CN109871521A (en) 2019-06-11

Similar Documents

Publication Publication Date Title
WO2020143325A1 (en) Electronic document generation method and device
US9754164B2 (en) Systems and methods for classifying objects in digital images captured using mobile devices
US10504202B2 (en) Method and device for identifying whether standard picture contains watermark
WO2020098250A1 (en) Character recognition method, server, and computer readable storage medium
WO2019192121A1 (en) Dual-channel neural network model training and human face comparison method, and terminal and medium
JP5506785B2 (en) Fingerprint representation using gradient histogram
US8718365B1 (en) Text recognition for textually sparse images
CN111191568A (en) Method, device, equipment and medium for identifying copied image
CN109766778A (en) Invoice information input method, device, equipment and storage medium based on OCR technique
WO2022156178A1 (en) Image target comparison method and apparatus, computer device and readable storage medium
CN110738236A (en) Image matching method and device, computer equipment and storage medium
CN114283156B (en) Method and device for removing document image color and handwriting
CN110738222B (en) Image matching method and device, computer equipment and storage medium
CN110866457A (en) Electronic insurance policy obtaining method and device, computer equipment and storage medium
WO2021218183A1 (en) Certificate edge detection method and apparatus, and device and medium
CN111680181A (en) Abnormal object identification method and terminal equipment
RU2633182C1 (en) Determination of text line orientation
CN110889341A (en) Form image recognition method and device based on AI (Artificial Intelligence), computer equipment and storage medium
CN112101296A (en) Face registration method, face verification method, device and system
CN112163110A (en) Image classification method and device, electronic equipment and computer-readable storage medium
CN113159037B (en) Picture correction method, device, computer equipment and storage medium
CN113610090B (en) Seal image identification and classification method, device, computer equipment and storage medium
CN114399495A (en) Image definition calculation method, device, equipment and storage medium
CN112733565A (en) Two-dimensional code coarse positioning method, equipment and storage medium
CN111539406B (en) Certificate copy information identification method, server and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19909421

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 31.08.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19909421

Country of ref document: EP

Kind code of ref document: A1