WO2020098078A1 - Method, apparatus, device, and readable storage medium for generating OCR training samples - Google Patents

Method, apparatus, device, and readable storage medium for generating OCR training samples

Info

Publication number
WO2020098078A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
text line
sample
picture
byte size
Application number
PCT/CN2018/123225
Inventor
高梁梁
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2020098078A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/10: Image acquisition

Definitions

  • The present application relates mainly to the technical field of image processing, and in particular to a method, apparatus, device, and readable storage medium for generating OCR training samples.
  • OCR (Optical Character Recognition) is a technology for recognizing characters on paper documents. Before recognition can be performed, training must be carried out on samples, usually sample pictures of various kinds that contain text.
  • In existing approaches, the text part of a sample picture is first segmented to form a text line picture, and a corresponding label is set for the text line picture and stored with it. When a sample picture contains multiple text parts, multiple text line pictures and corresponding labels are formed. Storing multiple text line pictures takes up more space, and segmenting each text part takes more time, which reduces the efficiency of generating OCR training samples.
  • The main purpose of this application is to provide a method, apparatus, device, and readable storage medium for generating OCR training samples, aiming to solve the problem in the prior art that segmenting the text parts of a sample picture and storing them with added labels makes the generation of OCR training samples inefficient and storage-intensive.
  • the present application provides a method for generating OCR training samples.
  • the method for generating OCR training samples includes the following steps:
  • The present application also provides an apparatus for generating OCR training samples, the apparatus including:
  • a recognition module, configured to receive a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identify the coordinate information of the selection box corresponding to the frame selection operation;
  • an establishment module, configured to receive the label information entered based on the frame selection operation, and establish a correspondence between the coordinate information and the label information to form text line information;
  • a generation module, configured to obtain file header information, and add the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
  • The present application also provides a device for generating OCR training samples, the device including: a memory, a processor, a communication bus, and an OCR training sample generation program stored on the memory;
  • the communication bus is used to implement connection and communication between the processor and the memory;
  • the processor is used to execute the OCR training sample generation program to implement the following steps:
  • The present application also provides a readable storage medium that stores one or more programs, the one or more programs being executable by one or more processors to:
  • In the OCR training sample generation method of this embodiment, when a frame selection operation on a text line in a sample picture is received, the coordinate information of the selection box corresponding to the operation is identified; when label information entered for the frame selection operation is received, a correspondence between the coordinate information and the label information is established to form text line information; the acquired file header information, the picture information corresponding to the sample picture, and each piece of text line information are then added to a preset file for storage, which generates the OCR training sample.
  • The OCR training sample in this solution consists of file header information, picture information, and text line information. The file header information locates the picture information and the text line information within the OCR training sample; the coordinate information in the text line information locates the text lines in the picture information; and the correspondence between coordinate information and label information determines the label information for each text line in the picture information. OCR training can then be performed on the text lines and their label information. Because the sample picture is never segmented, no segmented pictures need to be stored, which saves storage space; the time spent on segmentation is also saved, which improves the efficiency of generating OCR training samples.
  • FIG. 1 is a schematic flowchart of a first embodiment of a method for generating OCR training samples of the present application
  • FIG. 2 is a schematic diagram of functional modules of a first embodiment of an apparatus for generating OCR training samples of the present application
  • FIG. 3 is a schematic diagram of a device structure of a hardware operating environment involved in the method of an embodiment of the present application
  • FIG. 4 is a schematic diagram of a frame selection corresponding to a frame selection operation on a sample picture in the OCR training sample generation method of the present application.
  • This application provides a method for generating OCR training samples.
  • FIG. 1 is a schematic flowchart of a first embodiment of a method for generating OCR training samples according to this application.
  • the method for generating OCR training samples includes:
  • Step S10: receive a sample picture, and upon receiving a frame selection operation on a text line in the sample picture, identify the coordinate information of the selection box corresponding to the frame selection operation;
  • The method for generating OCR training samples of the present application is applied to a server and is suitable for generating, on the server, training samples for OCR training. OCR first scans text material into an image file and then analyzes and processes the image file to obtain text and layout information; in other words, OCR operates on image files. The samples used for OCR training are therefore pictures, referred to here as sample pictures.
  • The sample picture can be selected in advance by the developer and uploaded to the storage unit of the server, or selected and uploaded to the server in real time, depending on actual needs. When it is uploaded in advance, the developer sends a calling instruction, and the stored sample picture is called and displayed on the terminal interface; when it is uploaded in real time, the received sample picture is output and displayed on the terminal interface directly. The terminal can be a laptop or desktop computer that communicates with the server.
  • The developer can perform a frame selection operation on a text line in the displayed sample picture. In essence, the frame selection operation encloses each piece of text in the sample picture in a selection box; that is, each frame selection operation corresponds to a selection box, such as selection boxes W1 and W2 in FIG. 4.
  • When the frame selection operation is received, the coordinate information of the corresponding selection box is identified with respect to the preset coordinate system in which the sample picture is located. Specifically, the step of identifying the coordinate information of the selection box corresponding to the frame selection operation includes:
  • Step S11: read a first boundary point and a second boundary point of the selection box corresponding to the frame selection operation, where the first boundary point and the second boundary point lie on different boundaries of the selection box;
  • The selection box used to select a text line may be a rectangle, a square, or a polygon; a rectangle is preferred because it simplifies data processing. Whatever its shape, adjacent boundaries of the selection box meet at a point, and that point is regarded as a boundary point. After the frame selection operation is received and the corresponding selection box is formed, two boundary points are read from the selection box as the first boundary point and the second boundary point, which lie on different boundaries of the selection box, such as the first boundary point p1 and the second boundary point p2 in FIG. 4. They can be the upper-left and lower-right boundary points of the selection box, or its upper-right and lower-left boundary points, so that the two boundary points together characterize the selection box.
  • Step S12: map the first boundary point and the second boundary point to the preset coordinate system, determine the coordinate values of the first boundary point and the second boundary point respectively, and set the coordinate values as the coordinate information.
  • A coordinate system for expressing coordinate values is set in advance; it is preferably two-dimensional, and the sample picture displayed on the terminal interface is located in it.
  • The first and second boundary points that were read are mapped into the preset coordinate system, and the coordinate values of the points they map to are the coordinate values of the first and second boundary points; as shown in FIG. 4, points p1 and p2 have the coordinate values (x1, y1) and (x2, y2) in the preset coordinate system.
  • The coordinate values of the first and second boundary points serve as the coordinate information of the selection box corresponding to the frame selection operation, so that in subsequent OCR training the selection box can be reconstructed from the coordinate information and the content inside it identified as a text line used for training.
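  • Steps S11 and S12 can be illustrated with a minimal Python sketch. The names `Box` and `coordinate_info` are hypothetical, and sorting the two corners into a fixed order is an assumption of this sketch (the application only requires that the two boundary points lie on different boundaries):

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Coordinate information of one selection box (hypothetical layout)."""
    x1: int
    y1: int
    x2: int
    y2: int


def coordinate_info(p1: Tuple[int, int], p2: Tuple[int, int]) -> Box:
    """Turn two diagonal boundary points into coordinate information.

    Either diagonal is accepted (upper-left/lower-right or
    upper-right/lower-left); the result is always stored as
    (left, top, right, bottom) in the picture's coordinate system.
    """
    (xa, ya), (xb, yb) = p1, p2
    return Box(min(xa, xb), min(ya, yb), max(xa, xb), max(ya, yb))


# Upper-right and lower-left corners of the same rectangle.
print(coordinate_info((200, 20), (10, 40)))
```

  • Storing the corners in a fixed order means later steps do not need to know which diagonal the developer dragged.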
  • Step S20: receive the label information entered based on the frame selection operation, and establish a correspondence between the coordinate information and the label information to form text line information;
  • The developer selects a text line in the sample picture with a selection box and enters the corresponding label information for the selected text line. The label information is the text of the text line and characterizes its content. OCR is trained on the correspondence between the text line and this text so that it can recognize the text expressed in the text line.
  • When the label information is received, a correspondence between the coordinate information and the label information is established to form the text line information. The coordinate information represents the text line in the sample picture, and the label information is the text corresponding to that text line; the text line information formed from this correspondence is therefore, in essence, the correspondence between a text line and its text. The label information can be found from the coordinate information of a text line, and OCR training can then be carried out on the text line and the label information.
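  • As an illustration of step S20, the following Python sketch pairs coordinate information with label information in one record. The class name `TextLineInfo`, the field names, and the choice of JSON as the stored text form are all assumptions of this sketch; the application does not specify a serialization:

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass
class TextLineInfo:
    """One correspondence between coordinate information and label information."""
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) of the selection box
    label: str                      # text entered by the developer


def to_record(info: TextLineInfo) -> str:
    """Serialize the correspondence as a line of text for storage."""
    return json.dumps(asdict(info), ensure_ascii=False)


record = to_record(TextLineInfo(box=(10, 20, 200, 40), label="hello"))
print(record)
```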
  • Step S30: obtain the file header information, and add the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
  • OCR training is performed on the text line information and the sample picture, so both need to be stored for subsequent OCR training.
  • The sample picture is converted into a data stream in base64, an encoding that uses printable characters to represent binary data, so that Chinese characters or pictures can be transmitted smoothly over a network.
  • The data stream converted from the sample picture serves as the picture information corresponding to the sample picture, and the picture information and the text line information are stored as text information. A preset file for storing text information is prepared in advance, and the picture information and the text line information can be added to the preset file for storage.
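  • The conversion of a sample picture into base64 picture information can be sketched with Python's standard `base64` module. The function names are hypothetical, and the four bytes used in the example are only the first bytes of a PNG signature, not a complete picture:

```python
import base64


def picture_to_info(picture: bytes) -> str:
    """Encode the raw bytes of a sample picture as base64 text."""
    return base64.b64encode(picture).decode("ascii")


def info_to_picture(info: str) -> bytes:
    """Recover the raw picture bytes from the stored picture information."""
    return base64.b64decode(info)


info = picture_to_info(b"\x89PNG")
print(info)
print(info_to_picture(info) == b"\x89PNG")
```

  • Because the encoded form is plain text, it can be stored in the same preset file as the text line information.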
  • File header information can also be set. It can be set by the developer or generated by detection, and it consists of text information such as the picture byte size of the sample picture, the text line byte size of the text line information, and the number of pieces of text line information.
  • When it is set by the developer, the developer enters the detected picture byte size, text line byte size, and information quantity; the file header information entered by the developer is obtained and added to the preset file in sequence, together with the picture information and the text line information, for storage, which generates the OCR training sample.
  • In subsequent training, the text line information part and the sample picture part are determined from the file header information, and the text line information and the sample picture are then combined to perform OCR training.
  • The preset file in this embodiment may be a file format, developed by the developer, that depends on special software, so that the resulting OCR training sample can only be viewed and used by users who have permission to use that software. This gives the stored OCR training samples greater confidentiality.
  • When the file header information is generated by detection, it must be generated before it is obtained in this embodiment. Specifically, before the step of obtaining the file header information, the method includes:
  • Step S40: detect the picture byte size of the sample picture and the text line byte size of each piece of text line information, and count the information quantity of the text line information;
  • A detection tool for the byte size occupied by various types of information is developed in advance. The detection tool is called on the sample picture to detect the byte size it occupies, and the detection result is the picture byte size of the sample picture. Meanwhile, each text line in the sample picture is frame-selected, forming multiple selection boxes, and a correspondence is created between the coordinate information identified for each selection box and the label information entered for the corresponding frame selection operation, forming the text line information; the detection tool is likewise called on the text line information, and the detection result is the text line byte size of each piece of text line information.
  • The number of correspondences created between coordinate information and label information, that is, the number of pieces of text line information, is counted and used as the information quantity; the file header information is then generated from the picture byte size, the text line byte size, and the information quantity.
  • Step S50: generate the file header information from the picture byte size, the text line byte size, and the information quantity, and determine the file header byte size of the file header information;
  • The text information representing the picture byte size, the text line byte size, and the information quantity forms the file header information. The file header information includes field names and field values, which characterize the type of each item of text information and its value. The field name is the name of the item, namely "picture byte size", "text line byte size", or "information quantity"; the field value is the corresponding numerical value. For example, in "picture byte size: 50k", "picture byte size" is the field name and "50k" is the field value.
  • The file header information itself also occupies a certain byte size, which represents the size of the text information it contains. Accordingly, the detection tool is called to detect the byte size occupied by the file header information, and the detection result is the file header byte size of the file header information.
  • Step S60: add the file header byte size to the file header information to update the file header information.
  • The detected file header byte size is added to the file header information to update it, and the file header information, the picture information, and the text line information are added to the preset file for storage to generate the OCR training sample. After the OCR training sample is generated, the file header information in the preset file is first located by the file header byte size; the picture information in the preset file is then located by the picture byte size in the file header information, and the text line information by the text line byte size in the file header information; OCR training is then performed with the picture information and the text line information.
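  • Steps S40 to S60 and the final storage step can be sketched in Python as follows. Everything about the concrete layout is an assumption of this sketch: the header is an ASCII line with hypothetical keys `f1` (file header byte size), `f2` (picture byte size), `f3` (text line byte sizes), and `n` (information quantity); `f1` has a fixed width so that it can be read before the rest of the header; and, as a simplification, `f2` records the byte size of the base64 picture information rather than of the original picture:

```python
import base64
import json
from typing import Dict, List

HEADER_SIZE_DIGITS = 8  # hypothetical fixed width of the f1 value


def build_sample(picture: bytes, text_lines: List[Dict]) -> bytes:
    """Assemble header + picture information + text line information."""
    # Picture information: the sample picture as base64 text.
    picture_info = base64.b64encode(picture)
    # Text line information: one JSON record per coordinate/label pair.
    line_infos = [
        json.dumps(line, ensure_ascii=False).encode("utf-8")
        for line in text_lines
    ]

    # Steps S40/S50: detected byte sizes and the count become header fields.
    line_sizes = ",".join(str(len(blob)) for blob in line_infos)
    body = f"f2={len(picture_info)};f3={line_sizes};n={len(line_infos)}\n"

    # Step S60: the header's own byte size is added at the front.
    header_size = len("f1=") + HEADER_SIZE_DIGITS + len(";") + len(body)
    header = f"f1={header_size:0{HEADER_SIZE_DIGITS}d};{body}".encode("ascii")

    return header + picture_info + b"".join(line_infos)


sample = build_sample(b"\x89PNG", [{"box": [10, 20, 200, 40], "label": "hello"}])
print(sample.decode("utf-8"))
```

  • The fixed-width `f1` field avoids a circular dependency: the header can state its own size without that statement changing the size.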
  • After the step of generating the OCR training sample, the method further includes:
  • Step S70: upon receiving an OCR training instruction, call the OCR training sample, and read the file header information, each piece of text line information, and the picture information in the OCR training sample according to the byte size information in the file header information of the OCR training sample;
  • Once generated, the training sample can be used for OCR training, which is triggered by a training instruction. When the training instruction is received, the OCR training sample is called; the file header information in the OCR training sample is read first, and then the byte size information in the file header information; the file header information, each piece of text line information, and the picture information in the OCR training sample are then read according to the byte size information.
  • The byte size information in the file header information covers the file header byte size, the picture byte size, and the text line byte size, and all three are required when reading the file header information, each piece of text line information, and the picture information. Specifically, the step of reading the file header information, each piece of text line information, and the picture information in the OCR training sample according to the byte size information in the file header information includes:
  • Step S71: read the byte size information in the file header information of the OCR training sample, and determine the file header byte size, the text line byte size, and the picture byte size from the byte size information;
  • After the file header information, each piece of text line information, and the picture information are added to the preset file, an OCR training sample is formed; that is, the OCR training sample contains three parts of text information, corresponding to the file header information, each piece of text line information, and the picture information. The three parts do not overlap, and each has its own byte size, so each part can be read according to the byte size information in the file header information to obtain the file header information, each piece of text line information, and the picture information.
  • The file header information is placed at the very front of the OCR training sample so that it is read first, and the byte size information is placed at the front of the file header information; the file header information and its byte size information are thus read first, and the byte size information then determines the file header information, each piece of text line information, and the picture information.
  • Because the byte size information includes the file header byte size, the picture byte size, and the text line byte size, a different byte identifier is set in advance for each byte size; after the byte size information in the file header information is read, the file header byte size, picture byte size, and text line byte size are determined from the byte identifiers. For example, the byte identifiers f1, f2, and f3 are set in advance to denote the file header byte size, the picture byte size, and the text line byte size respectively. After the byte size information is read, the byte identifier carried by each item of byte size data is read: if the identifier is f1, the data is the file header byte size; if it is f2, the data is the picture byte size; if it is f3, the data is the text line byte size. In this way the file header byte size, picture byte size, and text line byte size are determined from the byte size information that was read.
  • Step S72: read, from the OCR training sample, the first sample information, second sample information, and third sample information corresponding to the file header byte size, the text line byte size, and the picture byte size respectively, and set the first sample information, the second sample information, and the third sample information as the file header information, each piece of text line information, and the picture information.
  • After the file header byte size, text line byte size, and picture byte size are determined, the first sample information corresponding to the file header byte size, the second sample information corresponding to the text line byte size, and the third sample information corresponding to the picture byte size can be read from the OCR training sample; these are respectively the file header information, each piece of text line information, and the picture information.
  • During reading, it is first determined whether the byte size occupied by the information read so far is consistent with the file header byte size; if so, this reading process is stopped and the information read is taken as the first sample information. The next reading process then starts, and it is determined whether the byte size occupied by the information read is consistent with the text line byte size; if so, the reading process is stopped and the information read is taken as the second sample information. A new reading process then starts, and it is determined whether the byte size occupied by the information read is consistent with the picture byte size; if so, the reading process is stopped and the information read is taken as the third sample information. If, in any of these reading processes, the byte size occupied by the information read is not yet consistent with the file header byte size, text line byte size, or picture byte size, reading continues until it is consistent with the file header byte size, text line byte size, or picture byte size.
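  • Steps S71 and S72 can be sketched in Python as follows. The layout is an assumption of this sketch and is not specified by the application: an ASCII header line whose first field `f1` is a fixed-width file header byte size, followed by `f2` (byte size of the base64 picture information, a simplification), `f3` (comma-separated text line byte sizes), and `n` (information quantity); the parts are assumed to be stored as header, picture information, then text line records in JSON:

```python
import base64
import json
from typing import Dict, List, Tuple

HEADER_SIZE_DIGITS = 8  # hypothetical fixed width of the f1 value


def read_sample(blob: bytes) -> Tuple[Dict[str, str], bytes, List[dict]]:
    """Split a stored sample into header fields, picture bytes, and text lines."""
    # Step S71: the byte size information sits at the front of the header.
    start = len("f1=")
    header_size = int(blob[start:start + HEADER_SIZE_DIGITS])
    header = blob[:header_size].decode("ascii").strip()
    # Each field is "<byte identifier>=<value>"; the identifier tells
    # which byte size the value stands for.
    fields = dict(item.split("=", 1) for item in header.split(";"))
    picture_size = int(fields["f2"])
    line_sizes = [int(size) for size in fields["f3"].split(",") if size]

    # Step S72: read each part by consuming exactly its byte size.
    pos = header_size
    picture_info = blob[pos:pos + picture_size]
    pos += picture_size
    text_lines = []
    for size in line_sizes:
        text_lines.append(json.loads(blob[pos:pos + size].decode("utf-8")))
        pos += size

    return fields, base64.b64decode(picture_info), text_lines
```

  • Slicing by the recorded sizes plays the role of "reading until the byte size read is consistent with the recorded byte size".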
  • The OCR training sample stores the picture information converted from the sample picture, whereas the byte size information in the file header information refers to the picture byte size of the sample picture; the picture byte size cannot directly characterize the picture.
  • Step S80: determine the sample picture from the picture information, and determine the target text line corresponding to each item of coordinate information in the sample picture according to the coordinate information in each piece of text line information;
  • The corresponding sample picture can be determined by converting the picture information. Each piece of text line information is a correspondence between coordinate information and label information, and the coordinate information gives the coordinates of a text line in the sample picture; the text lines in the sample picture can therefore be determined from the coordinate information in the text line information. Each text line corresponds to one item of coordinate information, and the text line corresponding to an item of coordinate information is regarded as a target text line.
  • The target text line corresponding to an item of coordinate information can be determined from its coordinate values. Specifically, the step of determining the target text line corresponding to each item of coordinate information in the sample picture according to the coordinate information in each piece of text line information includes:
  • Step S81: randomly select one piece of text line information from the text line information as the target text line information, and read the target coordinate values of the coordinate information contained in the target text line information;
  • Because the text line information comprises multiple correspondences between coordinate information and label information, any one piece of text line information can be selected and used as the target text line information.
  • The coordinate information and label information in the target text line information are the target coordinate information and the target label information. The coordinate values in the target coordinate information are read and used as the target coordinate values; they are the values, in the preset coordinate system, of the first and second boundary points of the selection box corresponding to one frame selection operation.
  • Step S82: map the target coordinate values to the preset coordinate system in which the sample picture is located to form a value box in the sample picture, and set the text line inside the value box as the target text line.
  • The target coordinate values correspond to two points in the preset coordinate system. These two points are the first and second boundary points of the selection box from the frame selection operation on a text line in the sample picture, so the target coordinate values identify a particular text line in the sample picture. The target coordinate values are mapped to the preset coordinate system in which the sample picture is located, where they correspond to two coordinate points, and the two points form a value box in the sample picture.
  • For example, if the target coordinate values are (x3, y3) and (x4, y4), the value box formed in the sample picture is a rectangle whose side lengths are the absolute distances obtained from x4 minus x3 and from y4 minus y3.
  • The value box corresponds to one text line in the sample picture, and that text line is used as the target text line; in this way the target text line corresponding to the coordinate information is determined from the coordinate information in the target text line information. Once every piece of text line information has been selected as the target text line information, the target text lines corresponding to all items of coordinate information have been determined from the sample picture.
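  • The value box of step S82 can be sketched in Python without any imaging library by treating the sample picture as a list of pixel rows. The function names `value_box` and `crop` and the list-of-rows picture representation are assumptions of this sketch:

```python
from typing import List, Sequence, Tuple


def value_box(p3: Tuple[int, int], p4: Tuple[int, int]) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of the box spanned by two points."""
    (x3, y3), (x4, y4) = p3, p4
    # Side lengths are the absolute distances |x4 - x3| and |y4 - y3|.
    return min(x3, x4), min(y3, y4), abs(x4 - x3), abs(y4 - y3)


def crop(pixels: Sequence[Sequence[int]], box: Tuple[int, int, int, int]) -> List[List[int]]:
    """Cut the target text line out of a picture given as rows of pixels."""
    left, top, width, height = box
    return [list(row[left:left + width]) for row in pixels[top:top + height]]


# A 4-row by 5-column toy picture whose pixel value encodes (row, column).
picture = [[row * 10 + col for col in range(5)] for row in range(4)]
print(crop(picture, value_box((1, 1), (4, 3))))
```

  • The crop happens only at training time, so no segmented text line pictures need to be stored in the sample itself.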
  • Step S90: according to the correspondence between the coordinate information and the label information, establish a mapping relationship between each target text line and the label information in the correspondence, and perform OCR training according to each mapping relationship.
  • The label information corresponding to a target text line can be determined from the coordinate information; that is, according to the coordinate information, a mapping relationship can be established between the target text line and the label information in the correspondence. For example, if coordinate information w corresponds to label information P, and the target text line corresponding to coordinate information w in the sample picture is determined to be Q, then a mapping relationship between target text line Q and label information P can be established through coordinate information w. OCR training is then carried out on the target text line and label information in each mapping relationship.
  • The OCR training process recognizes the text that exists in picture form in the target text line as the label information, which exists as text itself.
  • Because the sample picture includes multiple text lines, multiple mapping relationships are established. To avoid training repeatedly on the target text line and label information of the same mapping relationship during OCR training, a mechanism that distinguishes mapping relationships by identifiers is provided. Specifically, the step of performing OCR training according to each mapping relationship includes:
  • Step S91 Transmit the mapping relationship to a preset model, perform OCR training, and assign an identifier to the mapping relationship trained by OCR;
  • Further, a preset model for OCR training is set in advance; it may be, for example, a supervised learning method such as an SVM (support vector machine). The target text line and label information in each mapping relationship are transferred to the preset model for OCR training, and each mapping relationship that has been transferred to the preset model for training is assigned an identifier indicating that it has undergone OCR training.
  • Step S92: count the identifiers and determine whether the number of identifiers is consistent with the information quantity (the number of pieces of text line information); if it is, training on the OCR training sample is complete;
  • During OCR training, the assigned identifiers are counted; the count represents how many of the mapping relationships have already undergone OCR training. The number of identifiers is compared with the information quantity of the text line information to determine whether the two are consistent. If they are, the number of trained mapping relationships equals the information quantity, which is the number of correspondences between coordinate information and label information. This consistency indicates that every text line marked out by a frame selection operation in the sample picture has undergone OCR training, that is, every mapping relationship has been transferred to the preset model for OCR training, and training on the OCR training sample is therefore complete.
  • Step S93: if the number of identifiers is inconsistent with the information quantity, read from the mapping relationships a target mapping relationship to which no identifier has been assigned, take the target mapping relationship as a new mapping relationship, and execute the step of transferring the mapping relationship to the preset model.
  • Furthermore, if the number of identifiers is found to be inconsistent with the information quantity, some mapping relationship has not yet undergone OCR training, and that mapping relationship carries no identifier. A mapping relationship without an identifier is therefore read from the mapping relationships as the target mapping relationship, taken as a new mapping relationship, and transferred to the preset model for training. This repeats until the number of identifiers is consistent with the information quantity, at which point every mapping relationship has been transferred to the preset model for OCR training and training on the OCR training sample is complete.
  • In addition, referring to FIG. 2, the present application provides an apparatus for generating OCR training samples. In a first embodiment of the apparatus, the apparatus for generating OCR training samples includes:
  • the identification module 10, configured to receive a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identify the coordinate information of the selection box corresponding to the frame selection operation;
  • the establishment module 20, configured to receive label information entered on the basis of the frame selection operation, and establish a correspondence between the coordinate information and the label information to form text line information;
  • the generation module 30, configured to obtain file header information, and add the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
  • In the apparatus for generating OCR training samples of this embodiment, when a frame selection operation on a text line in the sample picture is received, the identification module 10 identifies the coordinate information of the selection box corresponding to the frame selection operation; when label information entered on the basis of the frame selection operation is received, the establishment module 20 establishes a correspondence between the coordinate information and the label information to form text line information; the generation module 30 then adds the obtained file header information, the picture information corresponding to the sample picture, and each piece of text line information to the preset file for storage, generating the OCR training sample.
  • The OCR training sample in this solution consists of file header information, picture information, and text line information. The file header information delimits the picture information and the text line information within the sample; the coordinate information in the text line information identifies the text lines in the picture information; and the correspondence between coordinate information and label information identifies the label information for each of those text lines. OCR training can then be performed from the text lines and their label information. Because the sample picture does not need to be segmented, no segmented pictures have to be stored, which saves storage space; the time spent on segmentation is also saved, which improves the efficiency of generating OCR training samples.
  • The virtual function modules of the above apparatus for generating OCR training samples are stored in the memory 1005 of the OCR training sample generation device shown in FIG. 3; when the processor 1001 executes the OCR training sample generation program, the functions of the modules in the embodiment shown in FIG. 2 are implemented.
  • Referring to FIG. 3, FIG. 3 is a schematic structural diagram of a device in the hardware operating environment involved in the method of the embodiments of the present application.
  • The device for generating OCR training samples in the embodiments of the present application may be a PC (personal computer), or a terminal device such as a smartphone, tablet computer, e-book reader, or portable computer.
  • As shown in FIG. 3, the OCR training sample generation device may include a processor 1001, such as a CPU (central processing unit), a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication between the processor 1001 and the memory 1005.
  • The memory 1005 may be high-speed RAM (random access memory) or non-volatile memory, such as disk storage.
  • Optionally, the memory 1005 may also be a storage device independent of the processor 1001.
  • Optionally, the OCR training sample generation device may further include a user interface, a network interface, a camera, an RF (radio frequency) circuit, sensors, an audio circuit, a WiFi (Wireless Fidelity) module, and so on.
  • The user interface may include a display and an input unit such as a keyboard; optionally, it may also include a standard wired interface and a wireless interface.
  • The network interface may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
  • Those skilled in the art will understand that the structure of the OCR training sample generation device shown in FIG. 3 does not limit the device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • As shown in FIG. 3, the memory 1005, as a readable storage medium, may include an operating system, a network communication module, and an OCR training sample generation program.
  • The operating system is a program that manages and controls the hardware and software resources of the OCR training sample generation device and supports the running of the OCR training sample generation program and other software and/or programs.
  • The network communication module implements communication among the components inside the memory 1005, and with other hardware and software in the OCR training sample generation device.
  • In the OCR training sample generation device shown in FIG. 3, the processor 1001 executes the OCR training sample generation program stored in the memory 1005 to implement the steps of the embodiments of the method for generating OCR training samples described above.
  • The present application provides a readable storage medium that stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the embodiments of the method for generating OCR training samples described above.
  • From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation.
  • Based on this understanding, the part of the technical solution of the present application that is essential, or that contributes to the prior art, can be embodied as a software product. The computer software product is stored in a readable storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, network device, or the like) to perform the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

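A method, apparatus, device, and readable storage medium for generating OCR training samples. The method comprises: receiving a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identifying the coordinate information of the selection box corresponding to the frame selection operation (S10); receiving label information entered on the basis of the frame selection operation, and establishing a correspondence between the coordinate information and the label information to form text line information (S20); and obtaining file header information, and adding the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample (S30). The method uses image processing to generate the file header information, picture information, and text line information of the OCR training sample. Because the sample picture does not need to be segmented, no segmented pictures have to be stored, which saves storage space; the time spent on segmentation is also saved, which improves the efficiency of generating OCR training samples.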

Description

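Method, apparatus, device, and readable storage medium for generating OCR training samples
This application claims priority to Chinese patent application No. 201811342303.6, entitled "Method, apparatus, device, and readable storage medium for generating OCR training samples", filed with the Chinese Patent Office on November 12, 2018, the entire contents of which are incorporated into this application by reference.
Technical Field
The present application mainly relates to the technical field of picture processing and, in particular, to a method, apparatus, device, and readable storage medium for generating OCR training samples.
Background
OCR (Optical Character Recognition) is a technology for recognizing characters on paper documents. Before recognition, training on samples is required, usually with a variety of sample pictures that contain text.
At present, when sample pictures are used for OCR training, the text portions of a sample picture are first segmented into text line pictures, and a corresponding label is set and stored for each text line picture. When the sample picture contains several text portions, several text line pictures and corresponding labels are formed. Storing many text line pictures takes up considerable space, and segmenting each text portion takes considerable time, which reduces the efficiency of generating OCR training samples.
Summary
The main purpose of the present application is to provide a method, apparatus, device, and readable storage medium for generating OCR training samples, aiming to solve the problem in the prior art that segmenting the text portions of a sample picture one by one and storing them with labels makes the generation of OCR training samples inefficient and takes up a large amount of storage space.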
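To achieve the above purpose, the present application provides a method for generating OCR training samples, which includes the following steps:
receiving a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identifying the coordinate information of the selection box corresponding to the frame selection operation;
receiving label information entered on the basis of the frame selection operation, and establishing a correspondence between the coordinate information and the label information to form text line information;
obtaining file header information, and adding the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
In addition, to achieve the above purpose, the present application further proposes an apparatus for generating OCR training samples, which includes:
an identification module, configured to receive a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identify the coordinate information of the selection box corresponding to the frame selection operation;
an establishment module, configured to receive label information entered on the basis of the frame selection operation, and establish a correspondence between the coordinate information and the label information to form text line information;
a generation module, configured to obtain file header information, and add the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
In addition, to achieve the above purpose, the present application further proposes a device for generating OCR training samples, which includes: a memory, a processor, a communication bus, and an OCR training sample generation program stored on the memory;
the communication bus is used to implement connection and communication between the processor and the memory;
the processor is used to execute the OCR training sample generation program to implement the following steps:
receiving a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identifying the coordinate information of the selection box corresponding to the frame selection operation;
receiving label information entered on the basis of the frame selection operation, and establishing a correspondence between the coordinate information and the label information to form text line information;
obtaining file header information, and adding the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
In addition, to achieve the above purpose, the present application further provides a readable storage medium that stores one or more programs, which can be executed by one or more processors to perform:
receiving a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identifying the coordinate information of the selection box corresponding to the frame selection operation;
receiving label information entered on the basis of the frame selection operation, and establishing a correspondence between the coordinate information and the label information to form text line information;
obtaining file header information, and adding the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
In the method for generating OCR training samples of this embodiment, when a frame selection operation on a text line in the sample picture is received, the coordinate information of the corresponding selection box is identified; when label information entered on the basis of the frame selection operation is received, a correspondence is established between the coordinate information and the label information to form text line information; the obtained file header information, the picture information corresponding to the sample picture, and each piece of text line information are then added to a preset file for storage, generating the OCR training sample. The OCR training sample in this solution consists of file header information, picture information, and text line information. The file header information delimits the picture information and the text line information within the sample; the coordinate information in the text line information identifies the text lines in the picture information; and the correspondence between coordinate information and label information identifies the label information for each of those text lines. OCR training can then be performed from the text lines and their label information. Because the sample picture does not need to be segmented, no segmented pictures have to be stored, which saves storage space; the time spent on segmentation is also saved, which improves the efficiency of generating OCR training samples.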
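Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a first embodiment of the method for generating OCR training samples of the present application;
FIG. 2 is a schematic diagram of the functional modules of a first embodiment of the apparatus for generating OCR training samples of the present application;
FIG. 3 is a schematic structural diagram of a device in the hardware operating environment involved in the method of the embodiments of the present application;
FIG. 4 is a schematic diagram of the selection boxes corresponding to frame selection operations on a sample picture in the method for generating OCR training samples of the present application.
The realization of the purpose, the functional features, and the advantages of the present application will be further described with reference to the embodiments and the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.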
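The present application provides a method for generating OCR training samples.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of the method for generating OCR training samples of the present application. In this embodiment, the method for generating OCR training samples includes:
Step S10: receive a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identify the coordinate information of the selection box corresponding to the frame selection operation;
The method for generating OCR training samples of the present application is applied to a server and is suitable for generating, on the server, training samples for OCR training. OCR first scans the text material and then analyzes and processes the resulting picture file to obtain the text and layout information; that is, OCR recognizes picture files. The samples used for OCR training are therefore pictures, referred to as sample pictures. A sample picture may be selected by a developer in advance and uploaded to a storage unit of the server, or selected and uploaded to the server in real time, depending on actual requirements. When uploaded in advance, the developer sends a call instruction and the stored sample picture is retrieved and displayed on the terminal interface; when uploaded in real time, the received sample picture is output directly to the terminal interface for display. The terminal may be a laptop, a desktop computer, or the like that is communicatively connected to the server.
After the sample picture is displayed, the developer can perform frame selection operations on the text lines in it. In essence, a frame selection operation encloses text of the sample picture in a selection box; that is, each frame selection operation has a corresponding selection box, such as the selection boxes W1 and W2 in FIG. 4. When the frame selection operation is received, the coordinate information of its selection box is identified with reference to a preset coordinate system in which the sample picture is located. Specifically, the step of identifying the coordinate information of the selection box corresponding to the frame selection operation includes: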
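Step S11: read a first boundary point and a second boundary point of the selection box corresponding to the frame selection operation, where the first boundary point and the second boundary point correspond to different boundaries of the selection box;
Understandably, the selection box used to select a text line may be a rectangle, a square, or a polygon; for ease of data processing, a rectangle is preferred. Whatever its form, wherever two boundaries of the selection box intersect they form a point, which is taken as a boundary point. After the frame selection operation is received and the corresponding selection box is formed, two boundary points are read from the selection box as the first boundary point and the second boundary point, and they correspond to different boundaries of the selection box, such as the first boundary point p1 and the second boundary point p2 in FIG. 4. That is, the boundaries forming the first boundary point and those forming the second boundary point have no boundary in common: the two points may be the upper-left and lower-right corners of the selection box, or its upper-right and lower-left corners, so that together they characterize the selection box.
Step S12: map the first boundary point and the second boundary point onto a preset coordinate system, determine the coordinate values of the first boundary point and the second boundary point respectively, and set the coordinate values as the coordinate information.
Further, a preset coordinate system that expresses coordinate values is set in advance, preferably a two-dimensional coordinate system, and the sample picture displayed on the terminal interface lies within it. The first and second boundary points that were read are mapped into the preset coordinate system, and the coordinate values of the corresponding points in that system are the coordinate values of the first and second boundary points; for example, the values (x1, y1) and (x2, y2) of points p1 and p2 in FIG. 4. These coordinate values are taken as the coordinate information of the selection box corresponding to the frame selection operation, so that during subsequent OCR training the selection box can be reconstructed from the coordinate information and its contents identified as the text line used for training.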
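As a minimal sketch of steps S11 and S12, the coordinate information of a selection box can be derived from two opposite corner points. The names `BoxCoords` and `box_coords`, the integer pixel coordinates, and the corner normalisation are assumptions made for illustration; the source only requires that the two boundary points lie on different boundaries of the box.

```python
from typing import NamedTuple, Tuple

Point = Tuple[int, int]  # (x, y) in the preset two-dimensional coordinate system


class BoxCoords(NamedTuple):
    """Coordinate information of one selection box: two opposite corners."""
    x1: int
    y1: int
    x2: int
    y2: int


def box_coords(p1: Point, p2: Point) -> BoxCoords:
    """Build coordinate information from two boundary points (S11, S12).

    Assumption: the corners are normalised so that (x1, y1) holds the smaller
    values and (x2, y2) the larger ones, whichever diagonal was dragged.
    """
    (ax, ay), (bx, by) = p1, p2
    # Points that share an x or y value lie on a common boundary, which the
    # method rules out ("different boundaries").
    if ax == bx or ay == by:
        raise ValueError("boundary points must be opposite corners")
    return BoxCoords(min(ax, bx), min(ay, by), max(ax, bx), max(ay, by))
```

With this normalisation, dragging from upper-left to lower-right or from upper-right to lower-left yields the same coordinate information for the same box.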
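Step S20: receive label information entered on the basis of the frame selection operation, and establish a correspondence between the coordinate information and the label information to form text line information;
Furthermore, while selecting a text line in the sample picture with a selection box, the developer enters the corresponding label information for that text line. The label information is the text of the text line and represents its content, so that OCR training can be performed on the correspondence between the text line and its text, enabling OCR to recognize the text expressed by the text line. To support this, after the label information entered for the frame selection operation is received, a correspondence is established between the coordinate information and the label information to form text line information. Because the coordinate information identifies a text line in the sample picture, and the label information is the text of that text line, the text line information is in essence a correspondence between the text line and its text: from the coordinate information of a text line the corresponding label information can be looked up, and OCR training can be performed on the text line and its label information.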
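The correspondence formed in step S20 can be pictured as a small record type plus a lookup table. This is only an illustrative sketch: `TextLineInfo`, `label_index`, the tuple layout of the coordinates, and the sample labels are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Coords = Tuple[int, int, int, int]  # assumed layout: (x1, y1, x2, y2)


@dataclass(frozen=True)
class TextLineInfo:
    """One piece of text line information: a box paired with its label."""
    coords: Coords
    label: str


def label_index(lines: List[TextLineInfo]) -> Dict[Coords, str]:
    """Map coordinate information to label information for later lookup."""
    return {line.coords: line.label for line in lines}


# Two frame selections with made-up labels.
lines = [
    TextLineInfo((10, 20, 110, 50), "Invoice No. 001"),
    TextLineInfo((10, 60, 200, 90), "Total: 42.00"),
]
index = label_index(lines)
```

Looking up `index[(10, 60, 200, 90)]` returns the label entered for that selection box, which is the lookup the paragraph above describes.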
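Step S30: obtain file header information, and add the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
Understandably, OCR training relies on the text line information and the sample picture, both of which must be stored for subsequent OCR training. The sample picture is converted into a data stream in base64, which represents binary data as printable characters so that Chinese characters or pictures can be transmitted smoothly over a network. The data stream converted from the sample picture is taken as the picture information corresponding to the sample picture, and the picture information and the text line information are stored as text. A preset file for storing text is set up in advance, and the picture information and the text line information are added to it for storage. To mark the boundary between picture information and text line information in the preset file, file header information can be provided. The file header information may be set by the developer or generated by detection; it contains text describing the picture byte size of the sample picture, the text line byte size of the text line information, and the information quantity (the number of pieces of text line information). When it is set by the developer, the developer enters the detected picture byte size, text line byte size, and information quantity; the file header information entered by the developer is obtained and added to the preset file in sequence with the picture information and the text line information, generating the OCR training sample. The file header information in the stored OCR training sample is later used to locate the text line information and the sample picture, which are then combined for OCR training. It should be noted that the preset file in this embodiment may be a file, developed by the developer, that can only be opened with dedicated software, so that the resulting OCR training samples can be viewed and used only by users authorized to use that software, which keeps the stored OCR training samples confidential.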
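The conversion of the sample picture into base64 picture information can be sketched with Python's standard `base64` module. The function names and the six example bytes are placeholders chosen for illustration, not part of the source.

```python
import base64


def picture_to_info(picture_bytes: bytes) -> str:
    """Convert the raw sample picture into printable base64 picture information."""
    return base64.b64encode(picture_bytes).decode("ascii")


def info_to_picture(picture_info: str) -> bytes:
    """Recover the raw sample picture from its picture information."""
    return base64.b64decode(picture_info)


# Six arbitrary bytes stand in for a real picture file.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
info = picture_to_info(raw)
```

Because the encoded form uses only printable ASCII characters, it can be stored in the same text file as the header and the text line information.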
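When the file header information is generated by detection, it must be generated before it is obtained. Specifically, before the step of obtaining the file header information, the method includes:
Step S40: detect the picture byte size of the sample picture and the text line byte size of each piece of text line information, and count the information quantity of the text line information;
Further, the developer develops in advance a detection tool that measures the number of bytes occupied by each kind of information. When a sample picture is received, the tool is called to measure the bytes it occupies, and the result is the picture byte size of the sample picture. Likewise, once every text line in the sample picture has been frame-selected, forming multiple selection boxes, and correspondences have been created between the coordinate information identified for each selection box and the label information entered for each frame selection operation, forming multiple pieces of text line information, the tool is called to measure the bytes occupied by the pieces of text line information, and the result is the text line byte size. In addition, the number of correspondences created between coordinate information and label information, that is, the number of pieces of text line information, is counted and taken as the information quantity. The file header information is then generated from the picture byte size, the text line byte size, and the information quantity.
Step S50: generate the file header information from the picture byte size, the text line byte size, and the information quantity, and measure the file header byte size of the file header information;
Furthermore, the text describing the picture byte size, the text line byte size, and the information quantity forms the file header information, which includes field names and field values indicating the type of each item and its value. The field names are the names of the items, namely the picture byte size, the text line byte size, and the information quantity; the field values are their numeric values. For example, in "picture byte size: 50k", "picture byte size" is the field name and "50k" is the field value. Once formed, the file header information itself occupies a certain number of bytes, namely the size of the text it contains; accordingly, the detection tool is called to measure the bytes occupied by the file header information, and the result is the file header byte size.
Step S60: add the file header byte size to the file header information to update the file header information.
Further, the measured file header byte size is added to the file header information to update it. After the file header information, the picture information, and the text line information have been added to the preset file to generate the OCR training sample, the file header information in the preset file is first located from the file header byte size; the picture information is then located from the picture byte size in the file header information, and the text line information from the text line byte size; OCR training is then performed using the picture information and the text line information.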
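Steps S40 to S60 can be sketched as follows. The source does not fix a concrete file layout, so everything here is an assumption made for illustration: the field names, the fixed-width zero-padded values (chosen so that writing the header size back in S60 does not change the header length), JSON for the text line information, and the order header, text line information, picture information.

```python
import base64
import json
from typing import Dict, List

FIELD_WIDTH = 10  # assumed: zero-padded decimal values keep the header length fixed
FIELD_NAMES = ("header_size", "picture_size", "text_lines_size", "count")


def render_header(header_size: int, picture_size: int, text_size: int, count: int) -> bytes:
    """Render the header as 'name:value' lines, with the byte size fields first."""
    values = (header_size, picture_size, text_size, count)
    lines = [f"{name}:{value:0{FIELD_WIDTH}d}\n" for name, value in zip(FIELD_NAMES, values)]
    return "".join(lines).encode("ascii")


def build_sample(picture: bytes, text_lines: List[Dict]) -> bytes:
    """Assemble header + text line information + picture information."""
    # S40: measure the picture and the text line information, count the entries.
    text_blob = json.dumps(text_lines, ensure_ascii=False).encode("utf-8")
    picture_size, text_size, count = len(picture), len(text_blob), len(text_lines)
    # S50: draft the header and measure its own byte size.
    draft = render_header(0, picture_size, text_size, count)
    # S60: write the measured size back; fixed-width fields keep the length unchanged.
    header = render_header(len(draft), picture_size, text_size, count)
    # The picture is stored as base64 text (the "picture information").
    return header + text_blob + base64.b64encode(picture)
```

The fixed-width fields are one way around the circularity of a header that records its own size; a length prefix or a two-pass write would serve equally well.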
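In the method for generating OCR training samples of this embodiment, when a frame selection operation on a text line in the sample picture is received, the coordinate information of the corresponding selection box is identified; when label information entered on the basis of the frame selection operation is received, a correspondence is established between the coordinate information and the label information to form text line information; the obtained file header information, the picture information corresponding to the sample picture, and each piece of text line information are then added to a preset file for storage, generating the OCR training sample. The OCR training sample in this solution consists of file header information, picture information, and text line information. The file header information delimits the picture information and the text line information within the sample; the coordinate information in the text line information identifies the text lines in the picture information; and the correspondence between coordinate information and label information identifies the label information for each of those text lines. OCR training can then be performed from the text lines and their label information. Because the sample picture does not need to be segmented, no segmented pictures have to be stored, which saves storage space; the time spent on segmentation is also saved, which improves the efficiency of generating OCR training samples.
Further, in another embodiment of the method for generating OCR training samples of the present application, after the step of generating the OCR training sample, the method includes:
Step S70: when an OCR training instruction is received, call the OCR training sample and, according to the byte size information of the file header information in the OCR training sample, read the file header information, each piece of text line information, and the picture information in the OCR training sample;
Understandably, once the training sample has been generated, it can be used for OCR training. Specifically, OCR training is triggered by a training instruction. When an OCR training instruction is received, the OCR training sample is called; the file header information in it is read first, followed by the byte size information in the file header information, and the file header information, each piece of text line information, and the picture information are then read according to that byte size information. Because the byte size information comprises the file header byte size, the picture byte size, and the text line byte size, all three sizes are needed when reading the file header information, the text line information, and the picture information. Specifically, the step of reading the file header information, each piece of text line information, and the picture information in the OCR training sample according to the byte size information of the file header information includes: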
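Step S71: read the byte size information of the file header information in the OCR training sample, and determine the file header byte size, the text line byte size, and the picture byte size from the byte size information;
Further, the file header information, the pieces of text line information, and the picture information are stored in the preset file one category after another to form the OCR training sample; that is, the OCR training sample contains three parts of text, corresponding to the file header information, the text line information, and the picture information. The three parts do not overlap and have different byte sizes, so each part can be read according to the byte size information in the file header information to obtain the file header information, the text line information, and the picture information. To locate the boundaries between them quickly, the file header information is placed at the front of the OCR training sample so that it is read first, and the byte size information is placed at the front of the file header information, so that the file header information and its byte size information are read first and the byte size information then delimits the file header information, the text line information, and the picture information. Alternatively, field identifiers can be used to read the byte size information first: different identifiers are set for the file header information, the text line information, and the picture information, and the file header information is located by its identifier; likewise, different sub-identifiers are set for the items within the file header information, and the byte size information is located by its sub-identifier.
Because the byte size information comprises the file header byte size, the picture byte size, and the text line byte size, a different byte identifier is set in advance for each size. After the byte size information has been read from the file header information, the file header byte size, the picture byte size, and the text line byte size are determined from the byte identifiers. For example, byte identifiers f1, f2, and f3 are preset to denote the file header byte size, the picture byte size, and the text line byte size respectively. After the byte size information is read, the byte identifier carried by each byte size value in it is read: a value carrying f1 is the file header byte size, a value carrying f2 is the picture byte size, and a value carrying f3 is the text line byte size. In this way the three sizes are determined from the byte size information that was read.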
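Step S72: read from the OCR training sample first sample information, second sample information, and third sample information corresponding respectively to the file header byte size, the text line byte size, and the picture byte size, and set the first sample information, the second sample information, and the third sample information as the file header information, the pieces of text line information, and the picture information.
Further, after the file header byte size, the text line byte size, and the picture byte size have been read, the first sample information corresponding to the file header byte size, the second sample information corresponding to the text line byte size, and the third sample information corresponding to the picture byte size can be read from the OCR training sample; these are the file header information, the text line information, and the picture information. During reading, it is first determined whether the number of bytes read so far equals the file header byte size; if so, this read is stopped and the information read is taken as the first sample information. The next read is then started, and it is determined whether the number of bytes read equals the text line byte size; if so, this read is stopped and the information read is taken as the second sample information. A further read is then started, and it is determined whether the number of bytes read equals the picture byte size; if so, this read is stopped and the information read is taken as the third sample information. If, in any of these reads, the number of bytes read does not yet equal the file header byte size, the text line byte size, or the picture byte size, as applicable, reading continues until it does.
It should be noted that what the OCR training sample stores is the picture information converted from the sample picture, whereas the byte size information in the file header information records the picture byte size of the sample picture itself, which does not directly give the number of bytes occupied by the picture information. Because the picture information is converted from the sample picture, the picture byte size can be converted into a picture information byte size, that is, the number of bytes occupied by the picture information. Determining whether the number of bytes read equals the picture byte size is therefore, in essence, determining whether it equals the picture information byte size; if it does, the picture information has been read completely.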
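The reading procedure of steps S71 and S72, including the conversion from picture byte size to picture information byte size, might look like the sketch below. The layout is an assumption: fixed-width `name:value` header lines with `header_size` first, JSON text line information, then base64 picture information. For padded base64, n raw bytes occupy 4·⌈n/3⌉ bytes of text, which is the conversion the paragraph above refers to.

```python
import base64
import json
from typing import Dict, List, Tuple

FIELD_WIDTH = 10            # assumed fixed width of every header value
FIRST_KEY = "header_size:"  # assumed to be the first header field


def base64_size(raw_size: int) -> int:
    """Bytes occupied by padded base64 text that encodes raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def read_header(blob: bytes) -> Dict[str, int]:
    """S71: read the header size first, then every 'name:value' field."""
    start = len(FIRST_KEY)
    header_size = int(blob[start:start + FIELD_WIDTH].decode("ascii"))
    fields: Dict[str, int] = {}
    for line in blob[:header_size].decode("ascii").splitlines():
        name, value = line.split(":")
        fields[name] = int(value)
    return fields


def read_sample(blob: bytes) -> Tuple[Dict[str, int], List[dict], bytes]:
    """S72: slice the three parts by byte size and decode them."""
    header = read_header(blob)
    text_start = header["header_size"]
    text_end = text_start + header["text_lines_size"]
    # The header stores the raw picture size; convert it to the stored size.
    picture_end = text_end + base64_size(header["picture_size"])
    text_lines = json.loads(blob[text_start:text_end].decode("utf-8"))
    picture = base64.b64decode(blob[text_end:picture_end])
    return header, text_lines, picture
```

Slicing by offsets replaces the "read until the byte count matches" loop of the description; both stop at the same boundaries.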
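Step S80: determine the sample picture from the picture information, and determine, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information;
Further, after the file header information, the text line information, and the picture information have been read from the OCR training sample, the sample picture can be recovered by converting the picture information. Because each piece of text line information is a correspondence between coordinate information and label information, and the coordinate information gives the coordinates of a text line in the sample picture, the text lines in the sample picture can be determined from the coordinate information; each text line corresponds to a piece of coordinate information, and the text line corresponding to each piece of coordinate information is taken as a target text line. Because the coordinate information is in essence the coordinate values of the first and second boundary points in the preset coordinate system, the target text line can be determined from those coordinate values. Specifically, the step of determining, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information includes:
Step S81: arbitrarily select one piece of text line information from the pieces of text line information as the target text line information, and read the target coordinate values of the coordinate information in the target text line information;
Because the text line information comprises multiple correspondences between coordinate information and label information, any one piece of text line information can be selected as the target text line information. Its coordinate information and label information are, correspondingly, the target coordinate information and the target label information. The coordinate values in the target coordinate information are read and taken as the target coordinate values; they are the values, in the preset coordinate system, of the first and second boundary points of the selection box of one frame selection operation.
Step S82: map the target coordinate values into the preset coordinate system in which the sample picture is located, form a value box in the sample picture, and set the text line of the sample picture that lies within the value box as the target text line.
Further, the target coordinate values correspond to two points in the preset coordinate system, namely the first and second boundary points of the selection box used to frame-select a text line of the sample picture, so the target coordinate values identify one text line in the sample picture. The target coordinate values are mapped into the preset coordinate system in which the sample picture is located, giving two coordinate points, and these two points form a value box in the sample picture. For example, if the two coordinate points are A1 with coordinates (x3, y3) and A2 with coordinates (x4, y4), the value box formed by A1 and A2 in the sample picture is the rectangle whose side lengths are the absolute distances |x4 - x3| and |y4 - y3|. The value box corresponds to a text line in the sample picture, and that text line is taken as the target text line; in this way, the target text line corresponding to the coordinate information is determined from the coordinate information in the target text line information. Once every piece of text line information has been selected as target text line information, the target text line corresponding to each piece of coordinate information has been determined from the sample picture.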
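The value box of step S82 can be illustrated with a picture held as a plain grid of pixel values. This is a sketch under assumptions: `value_box` and `crop` are hypothetical helper names, the origin is taken as the top-left pixel, and the grid of integers stands in for a decoded sample picture.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def value_box(a1: Point, a2: Point) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of the box spanned by A1 and A2."""
    (x3, y3), (x4, y4) = a1, a2
    # Side lengths are the absolute distances |x4 - x3| and |y4 - y3|.
    return min(x3, x4), min(y3, y4), abs(x4 - x3), abs(y4 - y3)


def crop(pixels: Sequence[Sequence[int]], a1: Point, a2: Point) -> List[List[int]]:
    """Extract the target text line region; rows are indexed by y, columns by x."""
    left, top, width, height = value_box(a1, a2)
    return [list(row[left:left + width]) for row in pixels[top:top + height]]


# A 4-row by 6-column grid whose values encode (row, column) as 10 * row + column.
image = [[10 * r + c for c in range(6)] for r in range(4)]
target = crop(image, (1, 1), (4, 3))
```

The region is cut out in memory at training time, so no segmented text line pictures need to be stored, which is the saving the method claims.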
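Step S90: according to the correspondence between the coordinate information and the label information, establish a mapping relationship between each target text line and the label information in that correspondence, and perform OCR training according to each mapping relationship.
Furthermore, because the coordinate information and the label information in the text line information correspond to each other, once the target text line corresponding to each piece of coordinate information has been determined in the sample picture, the label information corresponding to each target text line can be determined from that correspondence. That is, via the coordinate information, a mapping relationship can be established between the target text line and the label information in the correspondence. For example, if coordinate information w corresponds to label information P, and the target text line in the sample picture corresponding to w is determined to be Q, then a mapping relationship between target text line Q and label information P can be established via w. OCR training is then carried out according to the target text line and the label information in the mapping relationship. The target text line is text that exists in the sample picture in picture form, whereas the label information is the text itself; OCR training is the process of learning to recognize the text that exists in picture form in the target text line as the label information. Because the sample picture includes multiple text lines, multiple mapping relationships are established. To avoid training repeatedly on the target text line and label information of the same mapping relationship, a mechanism that distinguishes mapping relationships by identifiers is provided. Specifically, the step of performing OCR training according to each mapping relationship includes:
Step S91: transfer the mapping relationship to a preset model, perform OCR training, and assign an identifier to the mapping relationship that has undergone OCR training;
Further, a preset model for OCR training is set in advance; it may be, for example, a supervised learning method such as an SVM (support vector machine). The target text line and label information in each mapping relationship are transferred to the preset model for OCR training, and each mapping relationship that has been transferred to the preset model for training is assigned an identifier indicating that it has undergone OCR training.
Step S92: count the identifiers and determine whether the number of identifiers is consistent with the information quantity; if it is, training on the OCR training sample is complete;
During OCR training, the assigned identifiers are counted; the count represents how many of the mapping relationships have already undergone OCR training. The number of identifiers is compared with the information quantity of the text line information to determine whether the two are consistent. If they are, the number of trained mapping relationships equals the information quantity, which is the number of correspondences between coordinate information and label information. This consistency indicates that every text line marked out by a frame selection operation in the sample picture has undergone OCR training, that is, every mapping relationship has been transferred to the preset model for OCR training, and training on the OCR training sample is therefore complete.
Step S93: if the number of identifiers is inconsistent with the information quantity, read from the mapping relationships a target mapping relationship to which no identifier has been assigned, take the target mapping relationship as a new mapping relationship, and execute the step of transferring the mapping relationship to the preset model.
Furthermore, if the number of identifiers is found to be inconsistent with the information quantity, some mapping relationship has not yet undergone OCR training, and that mapping relationship carries no identifier. A mapping relationship without an identifier is therefore read from the mapping relationships as the target mapping relationship, taken as a new mapping relationship, and transferred to the preset model for training. This repeats until the number of identifiers is consistent with the information quantity, at which point every mapping relationship has been transferred to the preset model for OCR training and training on the OCR training sample is complete.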
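The identifier mechanism of steps S91 to S93 amounts to a loop that trains on each mapping relationship exactly once and stops when the identifier count matches the information quantity. In this hedged sketch, `train_all` is a hypothetical name, the identifiers are simply list indices, and `train_one` is a stand-in for whatever preset model is used; the source names SVM only as one example.

```python
from typing import Callable, List, Set, Tuple

Mapping = Tuple[str, str]  # (target text line, label); placeholder types


def train_all(
    mappings: List[Mapping],
    info_count: int,
    train_one: Callable[[Mapping], None],
) -> Set[int]:
    """Train on every mapping once, tracking identifiers (S91 to S93)."""
    trained: Set[int] = set()  # identifiers of mappings already trained
    # S92: stop once the identifier count equals the information quantity.
    while len(trained) != info_count:
        # S93: find the mappings that carry no identifier yet.
        pending = [i for i in range(len(mappings)) if i not in trained]
        if not pending:
            raise ValueError("information quantity does not match the mappings")
        index = pending[0]
        train_one(mappings[index])  # S91: hand the mapping to the preset model
        trained.add(index)          # S91: assign its identifier
    return trained
```

If the header's information quantity exceeds the number of mappings actually present, the sketch raises an error rather than looping forever; the source does not describe that case.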
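In addition, referring to FIG. 2, the present application provides an apparatus for generating OCR training samples. In a first embodiment of the apparatus, the apparatus for generating OCR training samples includes:
the identification module 10, configured to receive a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identify the coordinate information of the selection box corresponding to the frame selection operation;
the establishment module 20, configured to receive label information entered on the basis of the frame selection operation, and establish a correspondence between the coordinate information and the label information to form text line information;
the generation module 30, configured to obtain file header information, and add the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
In the apparatus for generating OCR training samples of this embodiment, when a frame selection operation on a text line in the sample picture is received, the identification module 10 identifies the coordinate information of the selection box corresponding to the frame selection operation; when label information entered on the basis of the frame selection operation is received, the establishment module 20 establishes a correspondence between the coordinate information and the label information to form text line information; the generation module 30 then adds the obtained file header information, the picture information corresponding to the sample picture, and each piece of text line information to the preset file for storage, generating the OCR training sample. The OCR training sample in this solution consists of file header information, picture information, and text line information. The file header information delimits the picture information and the text line information within the sample; the coordinate information in the text line information identifies the text lines in the picture information; and the correspondence between coordinate information and label information identifies the label information for each of those text lines. OCR training can then be performed from the text lines and their label information. Because the sample picture does not need to be segmented, no segmented pictures have to be stored, which saves storage space; the time spent on segmentation is also saved, which improves the efficiency of generating OCR training samples.
The virtual function modules of the above apparatus for generating OCR training samples are stored in the memory 1005 of the OCR training sample generation device shown in FIG. 3; when the processor 1001 executes the OCR training sample generation program, the functions of the modules in the embodiment shown in FIG. 2 are implemented.
It should be noted that those of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.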
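Referring to FIG. 3, FIG. 3 is a schematic structural diagram of a device in the hardware operating environment involved in the method of the embodiments of the present application.
The device for generating OCR training samples in the embodiments of the present application may be a PC (personal computer), or a terminal device such as a smartphone, tablet computer, e-book reader, or portable computer.
As shown in FIG. 3, the OCR training sample generation device may include a processor 1001, such as a CPU (central processing unit), a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be high-speed RAM (random access memory) or non-volatile memory, such as disk storage. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.
Optionally, the OCR training sample generation device may further include a user interface, a network interface, a camera, an RF (radio frequency) circuit, sensors, an audio circuit, a WiFi (Wireless Fidelity) module, and so on. The user interface may include a display and an input unit such as a keyboard; optionally, it may also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
Those skilled in the art will understand that the structure of the OCR training sample generation device shown in FIG. 3 does not limit the device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 3, the memory 1005, as a readable storage medium, may include an operating system, a network communication module, and an OCR training sample generation program. The operating system is a program that manages and controls the hardware and software resources of the OCR training sample generation device and supports the running of the OCR training sample generation program and other software and/or programs. The network communication module implements communication among the components inside the memory 1005, and with other hardware and software in the OCR training sample generation device.
In the OCR training sample generation device shown in FIG. 3, the processor 1001 executes the OCR training sample generation program stored in the memory 1005 to implement the steps of the embodiments of the method for generating OCR training samples described above.
The present application provides a readable storage medium that stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the embodiments of the method for generating OCR training samples described above.
It should also be noted that, in this document, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the part of the technical solution of the present application that is essential, or that contributes to the prior art, can be embodied as a software product. The computer software product is stored in a readable storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, network device, or the like) to perform the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope; any equivalent structural transformation made using the contents of the specification and drawings of the present application under its inventive concept, or any direct or indirect application in other related technical fields, falls within the scope of patent protection of the present application.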

Claims (20)

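  1. A method for generating OCR training samples, characterized in that the method for generating OCR training samples includes the following steps:
    receiving a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identifying the coordinate information of the selection box corresponding to the frame selection operation;
    receiving label information entered on the basis of the frame selection operation, and establishing a correspondence between the coordinate information and the label information to form text line information;
    obtaining file header information, and adding the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
  2. The method for generating OCR training samples according to claim 1, characterized in that, after the step of generating the OCR training sample, the method includes:
    when an OCR training instruction is received, calling the OCR training sample and, according to the byte size information of the file header information in the OCR training sample, reading the file header information, each piece of text line information, and the picture information in the OCR training sample;
    determining the sample picture from the picture information, and determining, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information;
    according to the correspondence between the coordinate information and the label information, establishing a mapping relationship between each target text line and the label information in the correspondence, and performing OCR training according to each mapping relationship.
  3. The method for generating OCR training samples according to claim 2, characterized in that the step of reading the file header information, each piece of text line information, and the picture information in the OCR training sample according to the byte size information of the file header information in the OCR training sample includes:
    reading the byte size information of the file header information in the OCR training sample, and determining the file header byte size, the text line byte size, and the picture byte size from the byte size information;
    reading from the OCR training sample first sample information, second sample information, and third sample information corresponding respectively to the file header byte size, the text line byte size, and the picture byte size, and setting the first sample information, the second sample information, and the third sample information as the file header information, each piece of text line information, and the picture information.
  4. The method for generating OCR training samples according to claim 2, characterized in that the step of identifying the coordinate information of the selection box corresponding to the frame selection operation includes:
    reading a first boundary point and a second boundary point of the selection box corresponding to the frame selection operation, wherein the first boundary point and the second boundary point correspond to different boundaries of the selection box;
    mapping the first boundary point and the second boundary point onto a preset coordinate system, determining the coordinate values of the first boundary point and the second boundary point respectively, and setting the coordinate values as the coordinate information.
  5. The method for generating OCR training samples according to claim 4, characterized in that the step of determining, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information includes:
    arbitrarily selecting one piece of text line information from the pieces of text line information as target text line information, and reading the target coordinate values of the coordinate information in the target text line information;
    mapping the target coordinate values into the preset coordinate system in which the sample picture is located, forming a value box in the sample picture, and setting the text line of the sample picture that lies within the value box as the target text line.
  6. The method for generating OCR training samples according to claim 2, characterized in that, before the step of obtaining file header information, the method includes:
    detecting the picture byte size of the sample picture and the text line byte size of each piece of text line information, and counting the information quantity of the text line information;
    generating the file header information from the picture byte size, the text line byte size, and the information quantity, and measuring the file header byte size of the file header information;
    adding the file header byte size to the file header information to update the file header information.
  7. The method for generating OCR training samples according to claim 6, characterized in that the step of performing OCR training according to each mapping relationship includes:
    transferring the mapping relationship to a preset model, performing OCR training, and assigning an identifier to the mapping relationship that has undergone OCR training;
    counting the identifiers and determining whether the number of identifiers is consistent with the information quantity, and if so, completing training on the OCR training sample;
    if the number of identifiers is inconsistent with the information quantity, reading from the mapping relationships a target mapping relationship to which no identifier has been assigned, taking the target mapping relationship as a new mapping relationship, and executing the step of transferring the mapping relationship to the preset model.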
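  8. An apparatus for generating OCR training samples, characterized in that the apparatus for generating OCR training samples includes:
    an identification module, configured to receive a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identify the coordinate information of the selection box corresponding to the frame selection operation;
    an establishment module, configured to receive label information entered on the basis of the frame selection operation, and establish a correspondence between the coordinate information and the label information to form text line information;
    a generation module, configured to obtain file header information, and add the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
  9. A device for generating OCR training samples, characterized in that the device for generating OCR training samples includes: a memory, a processor, a communication bus, and an OCR training sample generation program stored on the memory;
    the communication bus is used to implement connection and communication between the processor and the memory;
    the processor is used to execute the OCR training sample generation program to implement the following steps:
    receiving a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identifying the coordinate information of the selection box corresponding to the frame selection operation;
    receiving label information entered on the basis of the frame selection operation, and establishing a correspondence between the coordinate information and the label information to form text line information;
    obtaining file header information, and adding the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.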
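  10. The device for generating OCR training samples according to claim 9, characterized in that, after the step of generating the OCR training sample, the processor is used to execute the OCR training sample generation program to implement the following steps:
    when an OCR training instruction is received, calling the OCR training sample and, according to the byte size information of the file header information in the OCR training sample, reading the file header information, each piece of text line information, and the picture information in the OCR training sample;
    determining the sample picture from the picture information, and determining, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information;
    according to the correspondence between the coordinate information and the label information, establishing a mapping relationship between each target text line and the label information in the correspondence, and performing OCR training according to each mapping relationship.
  11. The device for generating OCR training samples according to claim 10, characterized in that the step of reading the file header information, each piece of text line information, and the picture information in the OCR training sample according to the byte size information of the file header information in the OCR training sample includes:
    reading the byte size information of the file header information in the OCR training sample, and determining the file header byte size, the text line byte size, and the picture byte size from the byte size information;
    reading from the OCR training sample first sample information, second sample information, and third sample information corresponding respectively to the file header byte size, the text line byte size, and the picture byte size, and setting the first sample information, the second sample information, and the third sample information as the file header information, each piece of text line information, and the picture information.
  12. The device for generating OCR training samples according to claim 10, characterized in that the step of identifying the coordinate information of the selection box corresponding to the frame selection operation includes:
    reading a first boundary point and a second boundary point of the selection box corresponding to the frame selection operation, wherein the first boundary point and the second boundary point correspond to different boundaries of the selection box;
    mapping the first boundary point and the second boundary point onto a preset coordinate system, determining the coordinate values of the first boundary point and the second boundary point respectively, and setting the coordinate values as the coordinate information.
  13. The device for generating OCR training samples according to claim 12, characterized in that the step of determining, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information includes:
    arbitrarily selecting one piece of text line information from the pieces of text line information as target text line information, and reading the target coordinate values of the coordinate information in the target text line information;
    mapping the target coordinate values into the preset coordinate system in which the sample picture is located, forming a value box in the sample picture, and setting the text line of the sample picture that lies within the value box as the target text line.
  14. The device for generating OCR training samples according to claim 10, characterized in that, before the step of obtaining file header information, the processor is used to execute the OCR training sample generation program to implement the following steps:
    detecting the picture byte size of the sample picture and the text line byte size of each piece of text line information, and counting the information quantity of the text line information;
    generating the file header information from the picture byte size, the text line byte size, and the information quantity, and measuring the file header byte size of the file header information;
    adding the file header byte size to the file header information to update the file header information.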
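  15. A readable storage medium, characterized in that an OCR training sample generation program is stored on the readable storage medium, and the OCR training sample generation program, when executed by a processor, implements the following steps:
    receiving a sample picture and, upon receiving a frame selection operation on a text line in the sample picture, identifying the coordinate information of the selection box corresponding to the frame selection operation;
    receiving label information entered on the basis of the frame selection operation, and establishing a correspondence between the coordinate information and the label information to form text line information;
    obtaining file header information, and adding the file header information, the picture information corresponding to the sample picture, and each piece of text line information to a preset file for storage to generate an OCR training sample.
  16. The readable storage medium according to claim 15, characterized in that, after the step of generating the OCR training sample, the OCR training sample generation program, when executed by the processor, implements the following steps:
    when an OCR training instruction is received, calling the OCR training sample and, according to the byte size information of the file header information in the OCR training sample, reading the file header information, each piece of text line information, and the picture information in the OCR training sample;
    determining the sample picture from the picture information, and determining, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information;
    according to the correspondence between the coordinate information and the label information, establishing a mapping relationship between each target text line and the label information in the correspondence, and performing OCR training according to each mapping relationship.
  17. The readable storage medium according to claim 16, characterized in that the step of reading the file header information, each piece of text line information, and the picture information in the OCR training sample according to the byte size information of the file header information in the OCR training sample includes:
    reading the byte size information of the file header information in the OCR training sample, and determining the file header byte size, the text line byte size, and the picture byte size from the byte size information;
    reading from the OCR training sample first sample information, second sample information, and third sample information corresponding respectively to the file header byte size, the text line byte size, and the picture byte size, and setting the first sample information, the second sample information, and the third sample information as the file header information, each piece of text line information, and the picture information.
  18. The readable storage medium according to claim 16, characterized in that the step of identifying the coordinate information of the selection box corresponding to the frame selection operation includes:
    reading a first boundary point and a second boundary point of the selection box corresponding to the frame selection operation, wherein the first boundary point and the second boundary point correspond to different boundaries of the selection box;
    mapping the first boundary point and the second boundary point onto a preset coordinate system, determining the coordinate values of the first boundary point and the second boundary point respectively, and setting the coordinate values as the coordinate information.
  19. The readable storage medium according to claim 18, characterized in that the step of determining, from the coordinate information in each piece of text line information, the target text line in the sample picture corresponding to each piece of coordinate information includes:
    arbitrarily selecting one piece of text line information from the pieces of text line information as target text line information, and reading the target coordinate values of the coordinate information in the target text line information;
    mapping the target coordinate values into the preset coordinate system in which the sample picture is located, forming a value box in the sample picture, and setting the text line of the sample picture that lies within the value box as the target text line.
  20. The readable storage medium according to claim 16, characterized in that, before the step of obtaining file header information, the OCR training sample generation program, when executed by the processor, implements the following steps:
    detecting the picture byte size of the sample picture and the text line byte size of each piece of text line information, and counting the information quantity of the text line information;
    generating the file header information from the picture byte size, the text line byte size, and the information quantity, and measuring the file header byte size of the file header information;
    adding the file header byte size to the file header information to update the file header information.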
PCT/CN2018/123225 2018-11-12 2018-12-24 Ocr训练样本的生成方法、装置、设备及可读存储介质 WO2020098078A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811342303.6A CN109711396A (zh) 2018-11-12 2018-11-12 Ocr训练样本的生成方法、装置、设备及可读存储介质
CN201811342303.6 2018-11-12

Publications (1)

Publication Number Publication Date
WO2020098078A1 true WO2020098078A1 (zh) 2020-05-22

Family

ID=66254901

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123225 WO2020098078A1 (zh) 2018-11-12 2018-12-24 Ocr训练样本的生成方法、装置、设备及可读存储介质

Country Status (2)

Country Link
CN (1) CN109711396A (zh)
WO (1) WO2020098078A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199337A (zh) * 2020-10-20 2021-01-08 支付宝(杭州)信息技术有限公司 一种证明文件处理方法、装置及设备

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699906B (zh) * 2019-10-22 2023-09-22 杭州海康威视数字技术股份有限公司 获取训练数据的方法、装置及存储介质
CN111325106B (zh) * 2020-01-22 2023-11-03 京东科技控股股份有限公司 生成训练数据的方法及装置
CN111523541A (zh) * 2020-04-21 2020-08-11 上海云从汇临人工智能科技有限公司 一种基于ocr的数据生成方法、系统、设备及介质
WO2021212658A1 (zh) * 2020-04-24 2021-10-28 平安国际智慧城市科技股份有限公司 Ocr图像样本生成、印刷体验证方法、装置、设备及介质
CN115495001B (zh) * 2022-11-21 2023-04-07 成都智元汇信息技术股份有限公司 框选定位生成广告位信息展示组件方法和系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020154815A1 (en) * 2001-03-30 2002-10-24 Hiroyuki Mizutani Character recognition device and a method therefore
CN1482572A (zh) * 2003-06-27 2004-03-17 杭州信雅达系统工程股份有限公司 票据图象处理装置
CN104899060A (zh) * 2015-05-20 2015-09-09 天脉聚源(北京)教育科技有限公司 一种图片加载处理方法和装置
CN105005793A (zh) * 2015-07-15 2015-10-28 广州敦和信息技术有限公司 一种发票字条自动识别录入的方法及装置
CN107861931A (zh) * 2017-11-02 2018-03-30 金蝶软件(中国)有限公司 模板文件处理方法、装置、计算机设备和存储介质
CN108734089A (zh) * 2018-04-02 2018-11-02 腾讯科技(深圳)有限公司 识别图片文件中表格内容的方法、装置、设备及存储介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106611174A (zh) * 2016-12-29 2017-05-03 成都数联铭品科技有限公司 一种非常见字体的ocr识别方法
CN107273883B (zh) * 2017-05-03 2020-04-21 天方创新(北京)信息技术有限公司 决策树模型训练方法、确定ocr结果中数据属性方法及装置

Also Published As

Publication number Publication date
CN109711396A (zh) 2019-05-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18940071

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 20.08.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18940071

Country of ref document: EP

Kind code of ref document: A1