CN109766879B - Character detection model generation method, character detection device, character detection equipment and medium - Google Patents

Character detection model generation method, character detection device, character detection equipment and medium Download PDF

Info

Publication number
CN109766879B
Authority
CN
China
Prior art keywords
character
image
picture
identified
positioning information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910027515.3A
Other languages
Chinese (zh)
Other versions
CN109766879A (en)
Inventor
Lu Yongchen (卢永晨)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN201910027515.3A priority Critical patent/CN109766879B/en
Publication of CN109766879A publication Critical patent/CN109766879A/en
Application granted granted Critical
Publication of CN109766879B publication Critical patent/CN109766879B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The embodiments of the disclosure disclose a method, a device, equipment and a medium for generating a character detection model and for detecting characters. The method for generating the character detection model comprises the following steps: constructing at least one text picture to be recognized from at least one character image and a blank background picture; acquiring positioning information of each character image in the at least one text picture to be recognized; taking each text picture to be recognized, together with the positioning information of each character image in it, as a group of character detection training sample data; and training a standard detection model with at least one group of character detection training sample data to generate the character detection model. With this technical scheme, character detection training samples can be provided rapidly and in large quantity for training the character detection model, which solves the problems of low efficiency and high cost that arise when character detection training samples (especially for small-language characters) are generated through manual labeling.

Description

Character detection model generation method, character detection device, character detection equipment and medium
Technical Field
The embodiment of the disclosure relates to data technology, in particular to a method, a device, equipment and a medium for generating a character detection model and detecting characters.
Background
OCR (Optical Character Recognition) refers to the process in which an electronic device (e.g., a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting dark and light patterns, and then translates those shapes into computer text using a character recognition method.
In the OCR recognition process, a character detection model is first used to detect the position information of each character in the picture to be recognized, and the character image corresponding to each character is then obtained. Training the character detection model requires a large amount of character detection training sample data, where each group of character detection training sample data comprises at least a picture to be recognized and the position information of each character image in that picture. Character detection training sample data, especially data for small-language characters (such as Hindi), are usually generated by manually labeling each character in the picture to be recognized, but manual labeling is inefficient and labor costs are high.
Disclosure of Invention
The embodiment of the disclosure provides a method, a device, equipment and a medium for generating a character detection model, so as to realize automatic labeling of each character in a picture, replace manual character labeling work, improve the character labeling efficiency, and further quickly generate a large amount of character detection training sample data for training the character detection model.
In a first aspect, an embodiment of the present disclosure provides a method for generating a character detection model, including:
constructing at least one text picture to be recognized according to at least one character image and a blank background picture;
acquiring positioning information of each character image in the at least one text picture to be identified;
taking the text picture to be recognized, together with the positioning information of each character image in it, as a group of character detection training sample data;
and training the standard detection model by adopting at least one group of character detection training sample data to generate a character detection model.
Further, the constructing at least one text picture to be recognized according to the at least one character image and the blank background picture includes:
stitching the at least one character image into at least one character line image;
and constructing at least one text picture to be recognized according to the at least one character line image and the blank background picture.
Further, the constructing at least one text picture to be recognized according to the at least one character line image and the blank background picture includes:
and adding the at least one character line image to the blank background picture according to preset positioning information to construct at least one character picture to be recognized.
Further, the positioning information includes position information and rotation angle information.
Further, before the step of using the text image to be recognized and the positioning information of each character image in the text image to be recognized as a set of character detection training sample data, the method further includes:
and adding noise to the text picture to be identified.
Further, the standard detection model is an original machine learning model;
training the standard detection model by adopting at least one group of character detection training sample data to generate a character detection model, wherein the method comprises the following steps:
and training the original machine learning model by adopting at least one group of character detection training sample data and a standard character detection training sample set to generate a character detection model.
Further, the characters include Hindi characters.
In a second aspect, an embodiment of the present disclosure further provides a method for detecting a character, including:
acquiring a text picture to be identified;
inputting the text picture to be recognized into a character detection model generated by the character detection model generation method according to any embodiment of the disclosure;
and acquiring positioning information of each character image in the character image to be recognized, which is output by the character detection model.
Further, the positioning information includes position information and rotation angle information, and the characters include Hindi characters.
In a third aspect, an embodiment of the present disclosure further provides a device for generating a character detection model, where the device includes:
the character picture construction module to be identified is used for constructing at least one character picture to be identified according to at least one character image and the blank background picture;
the positioning information acquisition module is used for acquiring positioning information of each character image in the at least one text picture to be identified;
the training sample data generation module is used for correspondingly taking the to-be-identified text picture and the positioning information of each character image in the to-be-identified text picture as a group of character detection training sample data;
and the model training module is used for training the standard detection model by adopting at least one group of character detection training sample data to generate a character detection model.
Further, the text-to-be-identified picture construction module includes: a character line image construction unit and a character picture construction unit to be identified, wherein,
the character line image construction unit is used for splicing at least one character image into at least one character line image;
The character picture construction unit to be identified is used for constructing at least one character picture to be identified according to the at least one character line image and the blank background picture.
Further, the to-be-identified text image construction unit is specifically configured to add the at least one character line image to the blank background image according to preset positioning information, so as to construct at least one to-be-identified text image.
Further, the positioning information includes position information and rotation angle information.
Further, the generating device of the character detection model further includes: the image processing module is used for adding noise to the to-be-identified text image before the to-be-identified text image and the positioning information of each character image in the to-be-identified text image are correspondingly used as a group of character detection training sample data.
Further, the standard detection model is an original machine learning model;
the model training module is specifically configured to train the original machine learning model by using at least one set of character detection training sample data and a standard character detection training sample set, so as to generate a character detection model.
Further, the characters include Hindi characters.
In a fourth aspect, an embodiment of the present disclosure further provides a character detection apparatus, including: a to-be-recognized text picture acquisition module, used for acquiring a text picture to be recognized;
the detection module is used for inputting the text picture to be identified into the character detection model generated by the generation device of the character detection model according to any embodiment of the disclosure;
the detection result acquisition module is used for acquiring the positioning information of each character image in the character image to be identified, which is output by the character detection model.
Further, the positioning information includes position information and rotation angle information, and the characters include Hindi characters.
In a fifth aspect, embodiments of the present disclosure further provide an electronic device, including:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of generating a character detection model as described in any embodiment of the present disclosure.
In a sixth aspect, the embodiments of the present disclosure further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for generating a character detection model according to any embodiment of the present disclosure.
In a seventh aspect, embodiments of the present disclosure further provide an electronic device, including:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the character detection method as described in any embodiment of the present disclosure.
In an eighth aspect, the embodiments of the present disclosure further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the character detection method according to any of the embodiments of the present disclosure.
According to the embodiments of the disclosure, at least one text picture to be recognized is constructed from at least one character image and a blank background picture, and the text picture to be recognized, together with the positioning information of each character image in it, is used as a group of character detection training sample data. This technical scheme quickly generates a large number of character detection training samples, replaces the approach of generating character detection training samples through manual labeling, improves the generation efficiency of character detection training samples, and therefore allows training samples to be supplied quickly and in large quantity for training character detection models. The scheme is particularly significant for character detection models that detect small-language characters, and solves the problems of low efficiency and high cost that arise when character detection training samples for small-language characters are generated through manual labeling.
Drawings
FIG. 1 is a flowchart of a method for generating a character detection model according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a character detection method according to a second embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a generating device of a character detection model according to a third embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a character detecting device according to a fourth embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the present disclosure and not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present disclosure are shown in the drawings.
Example 1
Fig. 1 is a flowchart of a method for generating a character detection model according to a first embodiment of the present disclosure. The method may be performed by a device for generating a character detection model, which may be implemented in software and/or hardware and configured in an electronic device, typically a computer. As shown in fig. 1, the method specifically includes the following steps:
S110, constructing at least one text picture to be recognized according to the at least one character image and the blank background picture.
Typically, the character image is an image of a single character of a small language, for example a Hindi character image. A large amount of character detection training sample data for the small language is constructed to train the character detection model, so that the trained character detection model can successfully detect the positioning information of each small-language character in the text picture to be recognized; a character recognition model can then be used to recognize each small-language character image in the text picture to be recognized.
Specifically, a large number of corpora for the small language can be obtained from the network through web crawler technology. For each character in a corpus, the corresponding character image is obtained from a font library according to the character's Unicode code point, and all character images are then added to a blank background picture to form a text picture to be recognized. When a character image is added to the blank background picture, the positioning information of the character image in the blank background picture needs to be determined first.
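As an illustration of this step only, the following Python sketch renders single-character images from a font library given the characters of a crawled corpus. The Pillow library, the font file name, the 10×10 canvas and the font size are assumptions made for the example and are not prescribed by this embodiment.

```python
from PIL import Image, ImageDraw, ImageFont

def render_character_image(char, font_path, size=(10, 10), font_size=9):
    """Render one character of the corpus onto a blank (white) canvas."""
    image = Image.new("L", size, color=255)            # blank grayscale canvas
    draw = ImageDraw.Draw(image)
    font = ImageFont.truetype(font_path, font_size)    # font covering the target language
    draw.text((0, 0), char, fill=0, font=font)         # draw the glyph in black
    return image

# Each non-space character of a crawled corpus becomes one character image.
corpus = "..."  # text obtained, e.g., through the web crawler (placeholder)
character_images = [render_character_image(c, "NotoSansDevanagari-Regular.ttf")
                    for c in corpus if not c.isspace()]
```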
Specifically, the positioning information includes position information and rotation angle information.
Specifically, the position information may be the coordinates of boundary points or of the center point of the character image in the blank background picture. Taking a rectangular bounding box of known size as an example (the character images for one language generally have identical sizes), the coordinates of the upper-left and lower-right boundary points (or of the upper-left and upper-right boundary points) of the character image may be used as its position information, or the coordinates of its center point may be used instead. The rotation angle information may be the angle between the character image's horizontal symmetry axis and the horizontal direction, or the angle between its vertical symmetry axis and the vertical direction. For example, a clockwise rotation of the horizontal symmetry axis relative to the horizontal direction may be recorded as a positive (acute) angle and a counterclockwise rotation as a negative angle, and the same convention may be applied to the vertical axis.
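For illustration, the positioning information described above can be represented by a small data structure like the following Python sketch; the field names and the sign convention are assumptions, not requirements of this embodiment.

```python
from dataclasses import dataclass

@dataclass
class Positioning:
    """Positioning information of one character (or character line) image."""
    x_min: int      # upper-left boundary point, pixel coordinates
    y_min: int
    x_max: int      # lower-right boundary point, pixel coordinates
    y_max: int
    angle: float    # rotation angle in degrees; e.g., clockwise positive, counterclockwise negative
```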
As a specific implementation manner of this embodiment, at least one text picture to be identified may be constructed according to at least one character image and a blank background picture, which specifically includes: stitching the at least one character image into at least one character line image; and constructing at least one text picture to be recognized according to the at least one character line image and the blank background picture.
The following explanation uses Chinese, since illustrating with a small language here would be inconvenient. For example, the corpus "I am Chinese" (我是中国人) yields character images for each of its five characters, each image being 10×10 pixels. The five character images may first be stitched into a character line image of 10×50 pixels; the rotation angle at which this line image will be added to the blank background picture is then determined, together with its position information in a 480×480-pixel blank background picture; finally, the character line image is added to the blank background picture according to the determined rotation angle and position information, forming a text picture to be recognized.
In the example above, the character images are stitched seamlessly into a character line image; alternatively, they may be stitched with preset intervals, in which case the pixel sizes of the intervals between adjacent character images may be the same or different (the interval sizes then need to be recorded individually).
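A minimal Python sketch of the stitching step, covering both the seamless case and the preset-interval case, is given below; Pillow and grayscale images are assumed, and the per-character offsets are recorded so that positioning information can be recovered later.

```python
from PIL import Image

def stitch_character_line(char_images, gap=0):
    """Stitch character images left-to-right into one character line image.

    gap is the pixel interval between adjacent characters (0 = seamless);
    the returned offsets give each character's x position inside the line.
    """
    height = max(img.height for img in char_images)
    width = sum(img.width for img in char_images) + gap * (len(char_images) - 1)
    line = Image.new("L", (width, height), color=255)
    offsets, x = [], 0
    for img in char_images:
        line.paste(img, (x, 0))
        offsets.append(x)
        x += img.width + gap
    return line, offsets
```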
Typically, at least one text picture to be recognized can be constructed according to at least one character line image and a blank background picture, specifically: and adding at least one character line image to the blank background image according to preset positioning information to construct at least one character image to be recognized.
The preset positioning information may be position information and rotation angle information determined according to a random algorithm.
Specifically, the position of the upper-left vertex of the character line image's bounding box and the rotation angle may be determined. For example, if the position is pixel coordinate (5, 20) and the rotation angle is 5°, the upper-left vertex of the character line image is aligned with the point at pixel coordinate (5, 20), and the character line image is added to the blank background picture after being rotated counterclockwise by 5°.
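The placement described above can be sketched as follows; positive angles are counterclockwise (Pillow's rotation convention, matching the 5° counterclockwise example), and the white fill and 480×480 background size are taken from the example rather than fixed by the embodiment.

```python
from PIL import Image

def place_line_on_background(line_image, background, top_left=(5, 20), angle=5.0):
    """Rotate a character line image and paste it onto the blank background picture."""
    rotated = line_image.rotate(angle, expand=True, fillcolor=255)  # counterclockwise for positive angles
    background.paste(rotated, top_left)                             # align upper-left vertex with top_left
    return background

background = Image.new("L", (480, 480), color=255)  # blank 480x480 background picture
# text_picture = place_line_on_background(line, background)
```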
Either a single character line image or at least two character line images may be added to one blank background picture to form a text picture to be recognized.
If at least two character line images are added to the same blank background picture, the pixel size, position information and rotation angle information of each character line image need to be considered together so that different character line images do not overlap.
Specifically, the blank background picture may be divided into at least two line areas, and the position information and the rotation angle information of the character line image added to the corresponding line area may be randomly determined based on each line area, respectively, so that the character line image completely falls into the corresponding line area. The number of character line images added to the blank background picture is less than or equal to the number of divided line areas.
Specifically, the position information of the first character line image in the blank background picture and the rotation angle information of the first character line image can be determined according to a random algorithm, and then the first character line image is added into the blank background picture; and then determining the position information of the second character line image in the residual area of the blank background picture and the rotation angle information of the second character line image according to a random algorithm, further adding the second character line image into the blank background picture (without overlapping the first character line image), adding the third character line image without overlapping the first character line image and the second character line image, and so on.
The above is merely an exemplary description, and the present embodiment is not particularly limited thereto, as long as at least two character line images can be added to a blank background picture without overlapping.
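A possible implementation of the line-area strategy described above is sketched below: the blank background picture is divided into horizontal bands and each character line is positioned randomly inside its own band, so that lines cannot overlap. The band layout, the small angle range and the use of Python's random module are illustrative assumptions.

```python
import random

def random_line_positionings(background_size, line_sizes):
    """Randomly position each character line image inside its own horizontal band."""
    bg_w, bg_h = background_size
    band_h = bg_h // len(line_sizes)               # one band per character line image
    positionings = []
    for i, (line_w, line_h) in enumerate(line_sizes):
        x = random.randint(0, max(0, bg_w - line_w))
        y = random.randint(i * band_h, max(i * band_h, (i + 1) * band_h - line_h))
        angle = random.uniform(-5.0, 5.0)          # keep the rotation small so the line stays in its band
        positionings.append(((x, y), angle))
    return positionings
```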
S120, acquiring positioning information of each character image in at least one text picture to be identified.
After the character picture to be recognized is constructed, the position information and the rotation angle information of each character image in the character picture to be recognized are obtained.
Continuing the previous example: the character line image is 10×50 pixels and contains five character images of 10×10 pixels each; its positioning information in the text picture to be recognized is the pixel coordinate (5, 20) of its upper-left vertex and a rotation angle of 5°. The rotation angle of each character image in the line is therefore 5°, the same as that of the line image; the pixel coordinate of the upper-left vertex of the first character image is (5, 20); and since the size of every character image is known and the characters are joined without intervals, the position information of each character image can be determined in turn. The determined position information of a character image may be the pixel coordinates of its boundary points or the pixel coordinates of its center point.
If the character images in the character line image are separated by intervals rather than joined seamlessly, the position information of each character image can still be determined in turn by also taking the interval pixel sizes into account.
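Continuing the worked example, the per-character positioning can be derived from the line's positioning roughly as in the sketch below. The geometry (offsetting each character along the rotated horizontal axis, with the image y-axis pointing down and positive angles counterclockwise) is an assumption about how the rotation is defined, not a formula given by this embodiment.

```python
import math

def character_positions_in_line(line_top_left, angle_deg, char_width, num_chars, gap=0):
    """Upper-left corner and rotation angle of each character image in a character line."""
    x0, y0 = line_top_left
    theta = math.radians(angle_deg)
    positions = []
    for i in range(num_chars):
        offset = i * (char_width + gap)                       # distance along the line's own horizontal axis
        cx = x0 + offset * math.cos(theta)
        cy = y0 - offset * math.sin(theta)                    # counterclockwise rotation, y-axis points down
        positions.append(((round(cx), round(cy)), angle_deg))
    return positions

# Example from the text: line at (5, 20), rotated 5 degrees, five 10x10 characters, no interval.
print(character_positions_in_line((5, 20), 5.0, 10, 5))
```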
If the text picture to be recognized is generated by adding each character image to the blank background picture in sequence, each character image has its own position information and rotation angle, which are used directly as the positioning information of the corresponding character image.
S130, taking the text picture to be recognized, together with the positioning information of each character image in it, as a group of character detection training sample data.
After the positioning information of each character image in the text picture to be recognized has been determined, the positioning information is associated with the text picture to be recognized, and together they can be used as a group of character detection training sample data.
Many sections of corpus for the small language can be obtained through the web crawler, a large number of text pictures to be recognized can be constructed from them, and a large amount of character detection training sample data for the small language can thus be generated. Even for a single section of corpus obtained by the web crawler, the character images corresponding to the corpus can be added to different blank background pictures according to different preset positioning information to generate different text pictures to be recognized, which likewise yields a large amount of character detection training sample data for the small language.
As a specific implementation manner of this embodiment, before the text image to be identified and the positioning information of each character image in the text image to be identified are correspondingly used as a set of character detection training sample data, noise may also be added to the text image to be identified.
In practical applications, the text pictures to be recognized that the character detection model processes often contain noise, such as Gaussian noise or salt-and-pepper noise. Therefore, to simulate text pictures to be recognized from real scenes, image noise such as Gaussian noise or salt-and-pepper noise can be added after the text picture to be recognized has been constructed.
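For illustration, the noise step could look like the following NumPy sketch; the noise levels shown are assumed values, not values specified by this embodiment.

```python
import numpy as np

def add_noise(image_array, gaussian_sigma=8.0, salt_pepper_ratio=0.01):
    """Add Gaussian noise and salt-and-pepper noise to a uint8 grayscale image array."""
    noisy = image_array.astype(np.float32)
    noisy += np.random.normal(0.0, gaussian_sigma, noisy.shape)   # Gaussian noise
    mask = np.random.random(noisy.shape)
    noisy[mask < salt_pepper_ratio / 2] = 0                       # pepper (black) pixels
    noisy[mask > 1 - salt_pepper_ratio / 2] = 255                 # salt (white) pixels
    return np.clip(noisy, 0, 255).astype(np.uint8)
```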
S140, training the standard detection model with at least one group of character detection training sample data to generate a character detection model.
After a large amount of character detection training sample data has been generated, the standard detection model is trained with it, so that the standard detection model learns from the sample data and becomes a character detection model capable of detecting each small-language character image in the text picture to be recognized.
Optionally, the standard detection model is an original machine learning model.
Correspondingly, training the standard detection model with at least one group of character detection training sample data to generate the character detection model specifically comprises: training the original machine learning model with at least one group of character detection training sample data and a standard character detection training sample set to generate the character detection model.
The original machine learning model refers to an untrained machine learning model. The character detection model is the trained machine learning model; it is used to detect the positioning information of each character image in the text picture to be recognized, taking the text picture to be recognized as input and outputting the positioning information of each character image in it. The character detection training samples in the standard character detection training sample set are existing training samples used to train a character detection model to detect the positioning information of character images of widely used ("large") languages (e.g., Chinese or English) in the picture to be recognized; that is, this set does not include the character detection training samples generated in S110-S130. By training the original machine learning model with a large amount of generated character detection training sample data together with the standard character detection training sample set, the resulting character detection model can detect the positioning information of both large-language and small-language character images in the text picture to be recognized.
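As a simple sketch of how the two sources of samples might be combined before training (the sample format and the merge-and-shuffle step are assumptions; the embodiment does not fix a concrete training procedure):

```python
import random

def build_training_set(generated_samples, standard_samples):
    """Merge the automatically generated samples (S110-S130) with the standard
    character detection training sample set, then shuffle before training the
    original machine learning model. Each sample is assumed to be a
    (text_picture, positionings) pair."""
    combined = list(generated_samples) + list(standard_samples)
    random.shuffle(combined)
    return combined
```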
According to this technical scheme, a large number of corpora for the small language are obtained through web crawler technology, character images of the corresponding characters are obtained according to the content of the corpora, and the character images are then added to blank background pictures according to preset positioning information, so that a large number of text pictures to be recognized are constructed automatically. Each text picture to be recognized, together with the positioning information of its character images, is used as character detection training sample data for training the character detection model; the method provided by this embodiment can therefore quickly generate a large amount of character detection training sample data, and after training, the character detection model can effectively detect the positioning information of the corresponding small-language character images in the text picture to be recognized.
Compared with the existing approach, in which each character in a text picture to be recognized for the small language is manually labeled with a bounding box to determine the positioning information of each character and then generate character detection training sample data for training the character detection model, the technical scheme of this embodiment solves the problems of low efficiency and high cost of manual labeling, automates the generation of character detection training sample data, and improves its generation efficiency.
Example two
Fig. 2 is a flowchart of a character detection method according to a second embodiment of the present disclosure. The method may be applied to detecting the positioning information of each character image in a text picture to be recognized, and may be performed by a character detection device, which may be implemented in software and/or hardware and configured in an electronic device, typically a computer.
As shown in fig. 2, the method specifically includes the following steps:
s210, acquiring a text picture to be identified.
The text picture to be recognized in this embodiment refers to a text picture containing small-language characters, for example Hindi characters, that a user needs to recognize in practical applications.
S220, inputting the text picture to be recognized into a character detection model generated by the method for generating the character detection model according to any embodiment of the disclosure.
In the practical application process of character recognition of the character picture to be recognized, firstly, the positioning information of each character in the character picture to be recognized needs to be detected by using a character detection model, and then each character in the character picture to be recognized is recognized based on the positioning information of each character by using a character recognition model.
The character detection model is as described in the foregoing embodiment; here it specifically refers to a character detection model capable of detecting the positioning information of each Hindi character in a text picture to be recognized.
The acquired text picture to be recognized, which contains Hindi characters, is input into a character detection model generated by the method of the first embodiment of the disclosure, in order to acquire the positioning information of each Hindi character in the text picture to be recognized.
S230, acquiring positioning information of each character image in the character picture to be recognized, which is output by the character detection model.
Wherein the positioning information includes position information and rotation angle information.
After the character detection model detects the character picture to be identified, the positioning information of each Hindi character image in the character picture to be identified is output, so that the character recognition model carries out character recognition on each Hindi character image in the character picture to be identified.
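A minimal sketch of this detection flow is given below; the model object and its call interface (returning a list of positionings, each a bounding box plus rotation angle) are hypothetical placeholders, since the embodiment does not prescribe a concrete API.

```python
from PIL import Image

def detect_characters(character_detection_model, picture_path):
    """Run the character detection model on a text picture to be recognized."""
    picture = Image.open(picture_path).convert("L")      # load the text picture to be recognized
    positionings = character_detection_model(picture)    # positioning info of each character image
    return positionings

# for box, angle in detect_characters(model, "hindi_sample.png"):
#     ...  # pass each located character image to a character recognition model
```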
The character detection model provided by this embodiment is used to detect the positioning information of each character in the text picture to be recognized. Because this character detection model is trained on a large number of automatically constructed training samples, which can target a small language such as Hindi, the character detection method provided by this embodiment can effectively detect the positioning information of the corresponding small-language character images in the text picture to be recognized.
Example III
Fig. 3 is a schematic structural diagram of a device for generating a character detection model according to an embodiment of the present disclosure. The embodiment is applicable to generating a character detection model used to detect the positioning information of characters in a text picture to be recognized. The device may be implemented in software and/or hardware and may be configured in an electronic device. As shown in fig. 3, the device may include: a to-be-recognized text picture construction module 310, a positioning information acquisition module 320, a training sample data generation module 330 and a model training module 340, wherein:
A to-be-recognized text image construction module 310, configured to construct at least one to-be-recognized text image according to the at least one character image and the blank background image;
the positioning information obtaining module 320 is configured to obtain positioning information of each character image in the at least one text image to be identified;
the training sample data generating module 330 is configured to correspond the text image to be identified and the positioning information of each character image in the text image to be identified as a set of character detection training sample data;
the model training module 340 is configured to train the standard detection model by using at least one set of character detection training sample data, and generate a character detection model.
Further, the text-to-be-identified picture construction module 310 includes: a character line image construction unit and a character picture construction unit to be identified, wherein,
the character line image construction unit is used for splicing at least one character image into at least one character line image;
the character picture construction unit to be identified is used for constructing at least one character picture to be identified according to the at least one character line image and the blank background picture.
Further, the to-be-identified text image construction unit is specifically configured to add the at least one character line image to the blank background image according to preset positioning information, so as to construct at least one to-be-identified text image.
Further, the positioning information includes position information and rotation angle information.
Further, the device for generating a character detection model further includes: the image processing module is used for adding noise to the to-be-identified text image before the to-be-identified text image and the positioning information of each character image in the to-be-identified text image are correspondingly used as a group of character detection training sample data.
Further, the standard detection model is an original machine learning model;
The model training module 340 is specifically configured to train the original machine learning model to generate a character detection model by using at least one set of character detection training sample data and a standard character detection training sample set.
Specifically, the characters include hindi characters.
According to the embodiment of the disclosure, a large number of corpora for the small language are obtained through web crawler technology, character images of the corresponding characters are obtained according to the content of the corpora, and the character images are then added to blank background pictures according to preset positioning information, so that a large number of text pictures to be recognized are constructed automatically. Each text picture to be recognized, together with the positioning information of its character images, is used as character detection training sample data for training the character detection model; the apparatus provided by this embodiment can therefore quickly generate a large amount of character detection training sample data, and after training, the character detection model can effectively detect the positioning information of the corresponding small-language character images in the text picture to be recognized.
Compared with the existing approach, in which each character in a text picture to be recognized for the small language is manually labeled with a bounding box to determine the positioning information of each character and then generate character detection training sample data for training the character detection model, the technical scheme of this embodiment solves the problems of low efficiency and high cost of manual labeling, automates the generation of character detection training sample data, and improves its generation efficiency.
The apparatus for generating a character detection model according to the embodiment of the present disclosure belongs to the same inventive concept as the method for generating a character detection model according to the first embodiment, and technical details not described in detail in the embodiment of the present disclosure can be found in the first embodiment, and the embodiments of the present disclosure have the same beneficial effects as the first embodiment.
Example IV
Fig. 4 is a schematic structural diagram of a character detection device according to an embodiment of the present disclosure, where the embodiment is applicable to detecting positioning information of each character image in a text image to be recognized. The apparatus may be implemented in software and/or hardware, and the apparatus may be configured in an electronic device. As shown in fig. 4, the apparatus may include: a text and picture acquisition module 410 to be identified, a detection module 420 and a detection result acquisition module 430, wherein,
a text-to-be-identified picture obtaining module 410, configured to obtain a text-to-be-identified picture;
the detection module 420 is configured to input the text image to be identified to a character detection model generated by the generating device of the character detection model according to any embodiment of the present disclosure;
the detection result obtaining module 430 is configured to obtain positioning information of each character image in the text image to be identified, which is output by the character detection model.
Further, the positioning information includes position information and rotation angle information, and the characters include a hindi character.
The character detection model provided by this embodiment is used to detect the positioning information of each character in the text picture to be recognized. Because this character detection model is trained on a large number of automatically constructed training samples, which can target a small language such as Hindi, the character detection device provided by this embodiment can effectively detect the positioning information of the corresponding small-language character images in the text picture to be recognized.
The character detection device provided in the embodiment of the present disclosure belongs to the same inventive concept as the character detection method provided in the second embodiment, and technical details not described in detail in the embodiment of the present disclosure can be seen in the second embodiment, and the embodiment of the present disclosure has the same beneficial effects as the second embodiment.
Example five
The disclosed embodiments provide an electronic device, referring next to fig. 5, which illustrates a schematic structural diagram of an electronic device (e.g., a terminal device or server) 500 suitable for use in implementing the disclosed embodiments. Electronic devices in embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal Digital Assistants (PDAs), tablet computers (PADs), portable Multimedia Players (PMPs), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, as well as stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 5 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 5, the electronic device 500 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 501, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 507 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 508 including, for example, magnetic tape, hard disk, etc.; and communication means 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 shows an electronic device 500 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or from the storage means 508, or from the ROM 502. When the computer program is executed by the processing apparatus 501, the above-described functions defined in the generation method of the character detection model or the character detection method of the embodiment of the present disclosure are performed.
Example six
The disclosed embodiments also provide a computer readable storage medium, which may be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, radio Frequency (RF), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: construct at least one text picture to be recognized according to at least one character image and a blank background picture; acquire the positioning information of each character image in the at least one text picture to be recognized; take the text picture to be recognized, together with the positioning information of each character image in it, as a group of character detection training sample data; and train the standard detection model with at least one group of character detection training sample data to generate a character detection model.
Or the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a text picture to be recognized; input the text picture to be recognized into a character detection model generated by the character detection model generation method according to any embodiment of the disclosure; and acquire the positioning information of each character image in the text picture to be recognized output by the character detection model.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented in software or hardware. In some cases the name of a module does not constitute a limitation of the module itself; for example, the "module for constructing a text picture to be recognized" may also be described as "a module for constructing at least one text picture to be recognized".
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the specific combinations of features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example technical solutions in which the above features are replaced with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims (14)

1. A method for generating a character detection model, comprising:
splicing at least one character image into at least one character line image, wherein the character image is an image of a single character corresponding to a small language;
constructing at least one text picture to be recognized according to the at least one character line image and a blank background picture, which comprises: adding the at least one character line image to the blank background picture according to preset positioning information to construct the at least one text picture to be recognized, wherein the positioning information comprises position information and rotation angle information;
acquiring positioning information of each character image in the at least one text picture to be identified;
taking the text picture to be recognized, together with the positioning information of each character image in it, as a group of character detection training sample data;
training an original machine learning model with at least one group of character detection training sample data and a standard character detection training sample set to generate a character detection model, wherein the character detection model is used for detecting the positioning information of each character image in a text picture to be recognized, and the standard character detection training sample set is a set of existing training samples used to train the character detection model to successfully detect the positioning information of large-language character images in the picture to be recognized.
2. The method according to claim 1, further comprising, before said associating the positioning information of each character image in the text image to be recognized and the text image to be recognized as a set of character detection training sample data:
And adding noise to the text picture to be identified.
3. The method of claim 1, wherein the small language characters comprise Hindi characters.
4. A character detection method, comprising:
acquiring a text picture to be identified;
inputting the text picture to be recognized into a character detection model generated by the method of any one of claims 1-3;
and acquiring positioning information of each character image in the character image to be recognized, which is output by the character detection model.
5. The method of claim 4, wherein the positioning information comprises position information and rotation angle information, and the small language characters comprise Hindi characters.
6. A character detection model generation apparatus, comprising:
a to-be-recognized text picture construction module, used for constructing at least one text picture to be recognized according to at least one character image and a blank background picture, wherein the character image is an image of a single character corresponding to a small language;
the positioning information acquisition module is used for acquiring positioning information of each character image in the at least one text picture to be identified;
the training sample data generation module is used for correspondingly taking the to-be-identified text picture and the positioning information of each character image in the to-be-identified text picture as a group of character detection training sample data;
The model training module is used for training the original machine learning model by adopting at least one group of character detection training sample data and a standard character detection training sample set to generate a character detection model, wherein the character detection model is used for detecting positioning information of each character image in a character picture to be recognized, and the standard character detection training sample set is an existing training sample used for training the character detection model to successfully detect positioning information of large language character images in the picture to be recognized;
the to-be-recognized text picture construction module comprises: a character line image construction unit and a character picture construction unit to be identified, wherein,
the character line image construction unit is used for splicing at least one character image into at least one character line image;
the character picture construction unit to be identified is used for constructing at least one character picture to be identified according to the at least one character line image and the blank background picture, and is specifically used for adding the at least one character line image to the blank background picture according to preset positioning information to construct at least one character picture to be identified;
wherein the positioning information includes position information and rotation angle information.
7. The apparatus as recited in claim 6, further comprising: the image processing module is used for adding noise to the to-be-identified text image before the to-be-identified text image and the positioning information of each character image in the to-be-identified text image are correspondingly used as a group of character detection training sample data.
8. The apparatus of claim 6, wherein the small language characters comprise Hindi characters.
9. A character detection apparatus, comprising:
the to-be-identified text picture acquisition module is used for acquiring a text picture to be identified;
the detection module is used for inputting the text picture to be identified into a character detection model generated by the apparatus according to any one of claims 6-8;
the detection result acquisition module is used for acquiring the positioning information, output by the character detection model, of each character image in the text picture to be identified.
10. The apparatus of claim 9, wherein the positioning information includes position information and rotation angle information, and the small language characters include Hindi characters.
11. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of generating a character detection model as recited in any one of claims 1-3.
12. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the character detection method according to any one of claims 4-5.
13. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of generating a character detection model according to any one of claims 1-3.
14. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the character detection method according to any one of claims 4-5.
CN201910027515.3A 2019-01-11 2019-01-11 Character detection model generation method, character detection device, character detection equipment and medium Active CN109766879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910027515.3A CN109766879B (en) 2019-01-11 2019-01-11 Character detection model generation method, character detection device, character detection equipment and medium

Publications (2)

Publication Number Publication Date
CN109766879A (en) 2019-05-17
CN109766879B (en) 2023-06-30

Family

ID=66453918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910027515.3A Active CN109766879B (en) 2019-01-11 2019-01-11 Character detection model generation method, character detection device, character detection equipment and medium

Country Status (1)

Country Link
CN (1) CN109766879B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276352A (en) * 2019-06-28 2019-09-24 拉扎斯网络科技(上海)有限公司 Index identification method, device, electronic equipment and computer readable storage medium
CN110490232B (en) * 2019-07-18 2021-08-13 北京捷通华声科技股份有限公司 Method, device, equipment and medium for training character row direction prediction model
CN110503105A (en) * 2019-09-02 2019-11-26 苏州美能华智能科技有限公司 Character identifying method, training data acquisition methods, device and medium
CN110895696A (en) * 2019-11-05 2020-03-20 泰康保险集团股份有限公司 Image information extraction method and device
CN111353491B (en) * 2020-03-12 2024-04-26 中国建设银行股份有限公司 Text direction determining method, device, equipment and storage medium
CN111951154B (en) * 2020-08-14 2023-11-21 中国工商银行股份有限公司 Picture generation method and device containing background and medium
CN113743438A (en) * 2020-08-20 2021-12-03 北京沃东天骏信息技术有限公司 Method, device and system for generating data set for text detection
CN112183020A (en) * 2020-10-26 2021-01-05 阳光保险集团股份有限公司 Multi-font sample synthesis method and device, electronic equipment and storage medium
CN112686243A (en) * 2020-12-29 2021-04-20 平安普惠企业管理有限公司 Method and device for intelligently identifying picture characters, computer equipment and storage medium
CN113034415B (en) * 2021-03-23 2021-09-14 哈尔滨市科佳通用机电股份有限公司 Method for amplifying small parts of railway locomotive image
CN113420766B (en) * 2021-07-05 2022-09-16 北京理工大学 Low-resource language OCR method fusing language information
CN114419613A (en) * 2022-01-17 2022-04-29 北京百度网讯科技有限公司 Image sample generation method, text recognition method, device, equipment and medium
CN115830599B (en) * 2023-02-08 2023-04-21 成都数联云算科技有限公司 Industrial character recognition method, model training method, device, equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101388078A (en) * 2008-09-27 2009-03-18 腾讯科技(深圳)有限公司 Text identification method and device based on verification
CN102346847A (en) * 2011-09-26 2012-02-08 青岛海信网络科技股份有限公司 License plate character recognizing method of support vector machine
CN106407976A (en) * 2016-08-30 2017-02-15 百度在线网络技术(北京)有限公司 Image character identification model generation and vertical column character image identification method and device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104966097B (en) * 2015-06-12 2019-01-18 成都数联铭品科技有限公司 A kind of complex script recognition methods based on deep learning
CN105468732A (en) * 2015-11-23 2016-04-06 中国科学院信息工程研究所 Image keyword inspecting method and device
CN107122785B (en) * 2016-02-25 2022-09-27 中兴通讯股份有限公司 Text recognition model establishing method and device
CN107784316A (en) * 2016-08-26 2018-03-09 阿里巴巴集团控股有限公司 A kind of image-recognizing method, device, system and computing device
US10108883B2 (en) * 2016-10-28 2018-10-23 Intuit Inc. Image quality assessment and improvement for performing optical character recognition
CN106778730B (en) * 2016-12-29 2020-07-07 深圳爱拼信息科技有限公司 Self-adaptive method and system for rapidly generating OCR training samples
CN108288064B (en) * 2017-01-09 2022-06-07 北京京东尚科信息技术有限公司 Method and device for generating pictures
CN108304814B (en) * 2018-02-08 2020-07-14 海南云江科技有限公司 Method for constructing character type detection model and computing equipment
CN108764036A (en) * 2018-04-24 2018-11-06 西安电子科技大学 A kind of handwritten form Tibetan language word fourth recognition methods
CN109086772A (en) * 2018-08-16 2018-12-25 成都市映潮科技股份有限公司 A kind of recognition methods and system distorting adhesion character picture validation code

Also Published As

Publication number Publication date
CN109766879A (en) 2019-05-17

Similar Documents

Publication Publication Date Title
CN109766879B (en) Character detection model generation method, character detection device, character detection equipment and medium
CN109753968B (en) Method, device, equipment and medium for generating character recognition model
CN111368562B (en) Method and device for translating characters in picture, electronic equipment and storage medium
CN109472852B (en) Point cloud image display method and device, equipment and storage medium
US20220248102A1 (en) Subtitle border-crossing processing method and apparatus, and electronic device
CN112396032B (en) Writing detection method and device, storage medium and electronic equipment
CN111738316B (en) Zero sample learning image classification method and device and electronic equipment
CN111507123B (en) Method and device for placing reading materials, reading equipment, electronic equipment and medium
CN109871465B (en) Time axis calculation method and device, electronic equipment and storage medium
CN111462548A (en) Paragraph point reading method, device, equipment and readable medium
CN110619597A (en) Semitransparent watermark removing method and device, electronic equipment and storage medium
CN111459443A (en) Character point-reading method, device, equipment and readable medium
CN111460086A (en) Point reading marking method, device, equipment and readable medium
CN113936271A (en) Text recognition method and device, readable medium and electronic equipment
CN114155545A (en) Form identification method and device, readable medium and electronic equipment
CN111382577B (en) Document translation method, device, electronic equipment and storage medium
CN114332324A (en) Image processing method, apparatus, device and medium
CN111054072A (en) Role model trailing method, device, equipment and storage medium
CN111459347A (en) Intelligent point reading method, device, equipment and readable medium
CN111797591B (en) Layout recovery method and device and electronic equipment
CN111461227B (en) Sample generation method, device, electronic equipment and computer readable medium
US11748969B2 (en) Image processing method and apparatus
CN111325117B (en) Training method and device for target object recognition model and electronic equipment
CN112991147B (en) Image processing method, device, electronic equipment and computer readable storage medium
CN112346630B (en) State determination method, device, equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant