CN108288064B - Method and device for generating pictures - Google Patents


Info

Publication number
CN108288064B
CN108288064B (application CN201710012538.8A)
Authority
CN
China
Prior art keywords
gray level
characters
background
picture
probability distribution
Prior art date
Legal status
Active
Application number
CN201710012538.8A
Other languages
Chinese (zh)
Other versions
CN108288064A (en)
Inventor
陈标龙
王永亮
王青泽
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710012538.8A
Publication of CN108288064A
Application granted
Publication of CN108288064B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Abstract

The application discloses a method and a device for generating pictures. One embodiment of the method comprises: randomly generating a sample picture comprising characters and a background; selecting a gray level probability distribution of characters from a pre-generated gray level probability distribution set of characters, and selecting a gray level probability distribution of a background from a pre-generated gray level probability distribution set of backgrounds; and adjusting the gray level of the characters of the sample picture according to the selected gray level probability distribution of the characters, and adjusting the gray level of the background of the sample picture according to the selected gray level probability distribution of the background, to obtain the adjusted sample picture. This embodiment generates sample pictures whose character and background gray level probability distributions are close to those of real pictures, so that models trained on them do not overfit the generated data.

Description

Method and device for generating pictures
Technical Field
The present application relates to the field of computer technology, specifically to the field of image processing technology, and more specifically to a method and an apparatus for generating a picture.
Background
In recent years, deep learning techniques have developed rapidly. In the field of Optical Character Recognition (OCR), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) techniques can significantly improve the accuracy and coverage of character recognition. Deep learning can clearly surpass conventional algorithms (such as hand-crafted features plus a Support Vector Machine (SVM)) in recognition performance, but it also requires far more sample data. It can be said that having enough high-quality sample data is an important prerequisite for a deep learning model to work well.
There are mainly two existing methods for acquiring text annotation data: 1. Manual annotation: a person selects the character region of a picture and enters the character information corresponding to that region. 2. Programmatic generation: a program adds characters to a picture; because the characters are generated by the program, their positions and content are known. To simulate the blurring of characters in real pictures, programs generally also add random noise and blurring to the generated pictures.
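By way of illustration only (not part of the disclosure), the programmatic generation described above can be sketched as follows; NumPy is assumed, and the dark rectangle standing in for rendered characters, the noise level, and the box blur are all illustrative choices:

```python
import numpy as np

def generate_sample(height=32, width=128, seed=0):
    """Sketch of programmatic sample generation: place a stand-in 'glyph'
    on a uniform background, then add random noise and a simple blur."""
    rng = np.random.default_rng(seed)
    img = np.full((height, width), 200, dtype=np.float64)  # light background
    # Stand-in for rendered characters: a dark rectangle at a known position.
    y, x = rng.integers(4, 12), rng.integers(4, 60)
    img[y:y + 16, x:x + 40] = 50
    label = (y, x, 16, 40)                       # known position of the "text"
    img += rng.normal(0, 8, img.shape)           # random noise
    # 3x3 box blur via shifted sums (a simple blurring stand-in).
    padded = np.pad(img, 1, mode="edge")
    img = sum(padded[dy:dy + height, dx:dx + width]
              for dy in range(3) for dx in range(3)) / 9.0
    return np.clip(img, 0, 255).astype(np.uint8), label

sample, label = generate_sample()
```

Because the program placed the glyph, the annotation (`label`) comes for free, which is exactly the appeal of this method.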
Manually annotated data has the best quality, but manual annotation is expensive and the period for acquiring the data is long. Programmatically generated picture data is plentiful — in theory it can be generated without limit — and quick to obtain. Its disadvantage is that it often differs substantially from real pictures, so a model trained on program-generated data overfits that data set: it recognizes program-generated pictures well but performs poorly on the actual pictures to be recognized.
Disclosure of Invention
It is an object of the present application to propose an improved method and apparatus for generating pictures to solve the technical problems mentioned in the background section above.
In a first aspect, the present application provides a method for generating a picture, the method comprising: randomly generating a sample picture comprising characters and a background; selecting the gray level probability distribution of characters from a pre-generated gray level probability distribution set of the characters, and selecting the gray level probability distribution of a background from a pre-generated gray level probability distribution set of the background; and adjusting the gray level of the characters of the sample picture according to the gray level probability distribution of the selected characters, and adjusting the gray level of the background of the sample picture according to the gray level probability distribution of the selected background to obtain the adjusted sample picture.
In some embodiments, the method further comprises the step of pre-generating a set of probability distributions for the text and a set of probability distributions for the background: acquiring a template picture set, wherein each template picture in the template picture set comprises characters and a background; acquiring the gray level of characters and the gray level of a background of each template picture in a template picture set; determining the mean and variance of the gray scale of characters of each template picture in the template picture set and the mean and variance of the gray scale of the background; determining the gray level probability distribution of the characters of each template picture according to the mean value and the variance of the gray level of the characters of each template picture in the template picture set to obtain a gray level probability distribution set of the characters; and determining the gray level probability distribution of the background of each template picture according to the mean value and the variance of the gray level of the background of each template picture in the template picture set to obtain a background gray level probability distribution set.
In some embodiments, obtaining a set of template pictures comprises: acquiring a picture to be identified; converting the picture to be identified into a gray-scale image; detecting characters of each line in the gray level image; and for each line of characters, intercepting a part containing the line of characters from the gray-scale image as a template picture to obtain a template picture set.
In some embodiments, acquiring the gray level of the text and the gray level of the background of each template picture in the template picture set comprises: carrying out binarization processing on each template picture in the template picture set; dividing each template picture after binarization processing into characters and a background according to the gray level; and respectively acquiring the gray levels of the characters and the background of the template picture before binarization processing aiming at each template picture in the template picture set.
In some embodiments, the grayscale probability distribution of the text and the grayscale probability distribution of the background are gaussian distributions.
In some embodiments, the method further comprises: and training a deep learning neural network for recognizing characters by using the adjusted sample pictures.
In a second aspect, the present application provides an apparatus for generating a picture, the apparatus comprising: the generating unit is used for randomly generating a sample picture comprising characters and a background; the selecting unit is used for selecting the gray level probability distribution of the characters from a pre-generated gray level probability distribution set of the characters and selecting the gray level probability distribution of the background from a pre-generated gray level probability distribution set of the background; and the adjusting unit is used for adjusting the gray level of the characters of the sample picture according to the gray level probability distribution of the selected characters, and adjusting the gray level of the background of the sample picture according to the gray level probability distribution of the selected background to obtain the adjusted sample picture.
In some embodiments, the apparatus further comprises: the template acquisition unit is used for acquiring a template picture set, wherein each template picture in the template picture set comprises characters and a background; the gray level acquisition unit is used for acquiring the gray level of characters and the gray level of a background of each template picture in the template picture set; the mean variance determining unit is used for determining the mean and variance of the gray level of each template picture in the template picture set and the mean and variance of the gray level of the background; the character gray level distribution determining unit is used for determining the gray level probability distribution of the characters of the template picture according to the mean value and the variance of the gray level of the characters of each template picture in the template picture set to obtain a gray level probability distribution set of the characters; and the background gray level distribution determining unit is used for determining the gray level probability distribution of the background of each template picture according to the mean value and the variance of the gray level of the background of each template picture in the template picture set so as to obtain a background gray level probability distribution set.
In some embodiments, the template acquisition unit is further configured to: acquiring a picture to be identified; converting the picture to be identified into a gray-scale image; detecting characters of each line in the gray scale image; and for each line of characters, intercepting a part containing the line of characters from the gray-scale image as a template picture to obtain a template picture set.
In some embodiments, the grayscale acquisition unit is further to: carrying out binarization processing on each template picture in the template picture set; dividing each template picture after binarization processing into characters and a background according to the gray level; and respectively acquiring the gray levels of the characters and the background of the template picture before binarization processing aiming at each template picture in the template picture set.
In some embodiments, the grayscale probability distribution of the text and the grayscale probability distribution of the background are gaussian distributions.
In some embodiments, the apparatus further comprises: and the training unit is used for training the deep learning neural network for recognizing characters by using the adjusted sample pictures.
According to the method and the device for generating pictures provided by the application, the gray level of the characters in a randomly generated picture is adjusted according to a pre-generated gray level probability distribution of characters, and the gray level of the background is adjusted according to a pre-generated gray level probability distribution of backgrounds, so that the sample picture is closer to the pictures to be recognized. This prevents overfitting and effectively improves the quality of the sample picture.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for generating a picture according to the present application;
FIGS. 3a, 3b, 3c and 3d are schematic diagrams of application scenarios of the method for generating pictures according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for generating a picture according to the present application;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for generating pictures according to the present application;
fig. 6 is a schematic structural diagram of a computer system suitable for implementing a server or a terminal device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating pictures or the apparatus for generating pictures of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various document applications, such as PDF readers, WORD, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting text reading, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background picture server that provides support for text displayed on the terminal devices 101, 102, 103. The background picture server can generate a sample picture for training a deep learning neural network for recognizing characters by receiving a picture to be recognized sent by the terminal, and feed back a processing result (such as the generated sample picture) to the terminal device. The background picture server can also receive a picture to be recognized sent by the terminal, recognize characters in the picture to be recognized by using a deep learning neural network trained by the sample picture obtained by the method, and feed back a recognition result to the terminal equipment.
It should be noted that the method for generating pictures provided in the embodiments of the present application is generally performed by the server 105, and accordingly, the apparatus for generating pictures is generally disposed in the server 105. The method for generating the picture provided by the embodiment of the application can also be executed by the terminal equipment 101, 102, 103, and accordingly, the device for generating the picture is generally arranged on the terminal equipment 101, 102, 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. In practical application, the sample picture can be directly generated by the server without using a terminal device, or characters in the picture to be recognized can be directly recognized by the server. The sample picture can also be generated directly by the terminal device without using a server, or the characters in the picture to be recognized can be recognized directly by the terminal device.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating a picture in accordance with the present application is shown. The method for generating the picture comprises the following steps:
step 201, randomly generating a sample picture including characters and a background.
In this embodiment, an electronic device (for example, the server or the terminal device shown in fig. 1) on which the method for generating a picture operates may randomly generate a sample picture, where the sample picture includes text and a background. The background refers to a non-text area in the sample picture. Characters can be added to the picture by using a program, and the positions of the characters and character information can be known because the characters are generated by the program.
Step 202, selecting the gray level probability distribution of the characters from the pre-generated gray level probability distribution set of the characters, and selecting the gray level probability distribution of the background from the pre-generated gray level probability distribution set of the background.
In this embodiment, the gray level probability distribution set of the characters and the gray level probability distribution set of the background may be obtained in advance from pictures containing characters. A gray level probability distribution of characters is selected from the character set, and a gray level probability distribution of the background is selected from the background set; the distribution may be, for example, a uniform distribution or a Gaussian distribution, and both selections may be random. Alternatively, a preset character gray level probability distribution may be fixed while a background gray level probability distribution is randomly selected from the background set multiple times. This yields many combinations of character and background gray level probability distributions, each combination corresponding to a sample picture. A deep learning neural network for recognizing characters is trained using the generated sample pictures; the training effect of sample pictures generated from different combinations is then tested, and the combination with the best training effect is recorded as the preferred combination for later sample generation.
In some optional implementations of the present embodiment, the grayscale probability distribution of the text and the grayscale probability distribution of the background may be gaussian distributions. The gradation probability distribution of the character and the gradation probability distribution of the background may be the same or different.
Step 203, adjusting the gray level of the characters of the sample picture according to the gray level probability distribution of the selected characters, and adjusting the gray level of the background of the sample picture according to the gray level probability distribution of the selected background to obtain an adjusted sample picture.
In this embodiment, to simulate the blurring of characters in real pictures, random noise and blurring may be added to the randomly generated sample picture. The sample picture is noised and blurred using the selected gray level probability distributions of the characters and of the background, so that the character and background gray level probability distributions of the adjusted sample picture are closer to those of the actual pictures to be recognized.
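Steps 202 and 203 can be sketched as follows (an illustrative NumPy sketch, not the patented implementation; the distributions are assumed to be Gaussian and represented as (mean, std) pairs, and the character region is assumed to be given as a boolean mask):

```python
import numpy as np

def adjust_sample(text_mask, text_dists, bg_dists, seed=0):
    """Pick one gray level distribution for the characters and one for the
    background from the pre-generated sets, then resample pixel gray
    levels from them.
    text_mask: boolean array, True where the program drew characters.
    *_dists:   lists of (mean, std) pairs (the distribution sets)."""
    rng = np.random.default_rng(seed)
    t_mean, t_std = text_dists[rng.integers(len(text_dists))]
    b_mean, b_std = bg_dists[rng.integers(len(bg_dists))]
    out = np.empty(text_mask.shape, dtype=np.float64)
    out[text_mask] = rng.normal(t_mean, t_std, text_mask.sum())
    out[~text_mask] = rng.normal(b_mean, b_std, (~text_mask).sum())
    return np.clip(out, 0, 255).astype(np.uint8)

mask = np.zeros((32, 128), dtype=bool)
mask[8:24, 10:50] = True                  # where the "characters" are
adjusted = adjust_sample(mask, [(40.0, 10.0)], [(200.0, 12.0)])
```

Resampling from distributions measured on real pictures is one simple way to realize the adjustment; the disclosure leaves the exact adjustment operation open.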
In some optional implementations of the embodiment, the method further includes training a deep learning neural network for recognizing characters using the adjusted sample pictures. The more sample pictures are generated, the more sample data is available for training, and the better the deep learning neural network for recognizing characters can be trained, significantly improving the accuracy and coverage of character recognition. The deep learning neural network may be a convolutional neural network or a recurrent neural network.
FIGS. 3a-3d are schematic diagrams of application scenarios of the method for generating pictures according to the present embodiment. In practical applications, the server has previously obtained a character gray level probability distribution from the black characters shown in fig. 3a and a background gray level probability distribution from its gray background, and likewise a character gray level probability distribution from the gray-white characters shown in fig. 3b and a background gray level probability distribution from its gray-black background. The two character gray level probability distributions constitute the gray level probability distribution set of the characters, and the two background gray level probability distributions constitute the gray level probability distribution set of the background. The user may initiate a request for generating sample pictures to the server through the terminal. After receiving the request, the server randomly generates a sample picture, randomly selects a character gray level probability distribution from the character set to adjust the character gray level of the generated sample picture, and randomly selects a background gray level probability distribution from the background set to adjust its background gray level, obtaining the sample picture shown in fig. 3c. The server then randomly generates another sample picture and adjusts its character and background gray levels in the same way, obtaining the sample picture shown in fig. 3d. The server may send these two sample pictures to the user as sample data for training a deep learning neural network for recognizing characters.
In the method provided by this embodiment, the character gray level and the background gray level of the sample picture generated by the server are tied to the character and background gray levels of real pictures to be recognized, yielding sample pictures closer to the pictures to be recognized.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for generating a picture is shown. The flow 400 of the method for generating a picture comprises the following steps:
step 401, randomly generating a sample picture including characters and a background.
Step 401 is substantially the same as step 201, and therefore is not described again.
Step 402, acquiring a template picture set.
In this embodiment, each template picture in the template picture set includes a text and a background, where the background is a non-text region in the template picture.
In some optional implementations of this embodiment, acquiring the template picture set includes: acquiring a picture to be identified; converting the picture to be identified into a gray-scale image; detecting characters of each line in the gray level image; and for each line of characters, intercepting a part containing the line of characters from the gray-scale image as a template picture to obtain a template picture set. For example, a picture to be recognized has three lines of characters, the picture can be converted into a gray-scale image, and then an edge detection algorithm is used to detect the edge of each line of characters. And then, three pictures are intercepted from the gray-scale image according to lines, namely three template pictures, and the three template pictures form a template picture set.
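The three-line example above can be sketched with a simple horizontal-projection heuristic (an illustrative assumption — the disclosure only requires that the characters of each line be detected, e.g. by edge detection):

```python
import numpy as np

def crop_text_lines(gray, ink_threshold=128, min_rows=2):
    """Sketch of template extraction: find rows containing dark 'ink'
    pixels, group consecutive rows into text lines, and crop each line
    from the grayscale image as one template picture."""
    has_ink = (gray < ink_threshold).any(axis=1)   # rows with text pixels
    templates, start = [], None
    for i, ink in enumerate(list(has_ink) + [False]):   # sentinel row
        if ink and start is None:
            start = i
        elif not ink and start is not None:
            if i - start >= min_rows:
                templates.append(gray[start:i])
            start = None
    return templates

# Toy grayscale image with three separated dark bands ("lines of text").
img = np.full((30, 60), 220, dtype=np.uint8)
for top in (2, 12, 22):
    img[top:top + 5, 5:55] = 30
lines = crop_text_lines(img)
```

Each cropped array plays the role of one template picture; together they form the template picture set of this step.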
In an image, an edge is where the local intensity of the image changes most sharply; edges occur mainly between objects, between an object and the background, and between regions (including regions of different colors). An edge marks the end of one feature region and the beginning of another: the features or attributes within each region bounded by an edge are consistent, while those of different regions differ. Edge detection exploits the differences between an object and the background in certain image features, such as gray level, color, or texture; in effect, it detects the positions where image features change. Edges come in many types, of which three are common: the step edge, whose gray level jumps from low to high; the roof edge, whose gray level rises gradually from low to high and then gradually falls; and the line edge, whose gray level changes like a pulse.
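A minimal gradient-based edge detector in the spirit of this description might look as follows (an illustrative NumPy sketch using Sobel-style kernels; the threshold value is an arbitrary choice):

```python
import numpy as np

def sobel_edges(gray, threshold=100.0):
    """Minimal Sobel edge detector: mark pixels whose gradient magnitude
    exceeds a threshold, i.e. where the gray level changes sharply."""
    g = gray.astype(np.float64)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    # Neighbor differences weighted as in the 3x3 Sobel kernels.
    gx[1:-1, 1:-1] = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]
                      - g[:-2, :-2] - 2 * g[1:-1, :-2] - g[2:, :-2])
    gy[1:-1, 1:-1] = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]
                      - g[:-2, :-2] - 2 * g[:-2, 1:-1] - g[:-2, 2:])
    return np.hypot(gx, gy) > threshold

# A step edge: the gray level jumps from 50 to 200 at column 10.
step = np.full((20, 20), 50, dtype=np.uint8)
step[:, 10:] = 200
edges = sobel_edges(step)
```

On this toy input the detector fires along the columns where the step occurs and nowhere in the flat regions, matching the "step edge" type described above.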
Step 403, acquiring the gray level of the text and the gray level of the background of each template picture in the template picture set.
In this embodiment, the template picture set may include a plurality of template pictures. If the gray levels of the characters and the background in the template pictures are unknown, the gray levels can be obtained through an image processing method.
In some optional implementations of this embodiment, acquiring the gray level of the characters and the gray level of the background of each template picture in the template picture set comprises: carrying out binarization processing on each template picture in the template picture set; dividing each binarized template picture into characters and background according to gray level; and, for each template picture in the template picture set, acquiring the gray levels of the characters and of the background of the template picture before binarization processing. For example, a threshold divides the gray data of an image into two parts: the group of pixels above the threshold and the group below it. The pixel values of the group above the threshold are set to white (or black), and those of the group below the threshold are set to black (or white). Using an edge detection algorithm, the two groups can be identified as belonging to the characters and the background respectively, and the gray levels of the character and background pixels before binarization are then acquired.
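The threshold-based split described in this example can be sketched with Otsu's method, one common way of choosing the threshold automatically (an illustrative assumption — the disclosure does not fix a particular binarization algorithm; dark characters on a light background are also assumed):

```python
import numpy as np

def otsu_split(gray):
    """Sketch of the binarization step: an Otsu threshold splits the
    template into two pixel groups; the darker group is taken to be the
    characters (assumes dark text on a light background)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    omega = np.cumsum(p)                   # probability of group <= t
    mu = np.cumsum(p * np.arange(256))     # cumulative mean
    mu_t = mu[-1]
    # Between-class variance for every candidate threshold t.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    t = int(np.argmax(np.nan_to_num(sigma_b)))
    text_pixels = gray[gray <= t]          # darker group: characters
    bg_pixels = gray[gray > t]             # lighter group: background
    return t, text_pixels, bg_pixels

img = np.full((20, 40), 210, dtype=np.uint8)
img[5:15, 5:35] = 40                       # dark "text" block
t, text_px, bg_px = otsu_split(img)
```

The original (pre-binarization) gray levels of the two groups are exactly what the next steps use to estimate each template's mean and variance.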
Step 404, determine the mean and variance of the gray levels of the text and the background of each template picture in the template picture set.
In this embodiment, according to the gray scale of the text and the gray scale of the background of each template picture in the template picture set obtained in step 403, the mean and the variance of the gray scale of the text and the mean and the variance of the gray scale of the background of each template picture in the template picture set can be calculated.
Step 405, determining the gray level probability distribution of the characters of the template picture according to the mean value and the variance of the gray level of the characters of each template picture in the template picture set, so as to obtain a gray level probability distribution set of the characters.
In this embodiment, the gray level probability distribution of the text in the template picture may be determined according to the mean and the variance of the gray levels of the text in each template picture in the template picture set calculated in step 404, so as to obtain a gray level probability distribution set of the text.
And 406, determining the gray level probability distribution of the background of each template picture according to the mean value and the variance of the gray level of the background of each template picture in the template picture set to obtain a background gray level probability distribution set.
In this embodiment, the gray level probability distribution of the background of each template picture in the template picture set may be determined according to the mean and the variance of the gray levels of the background of the template picture calculated in step 404, so as to obtain the gray level probability distribution set of the background.
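Steps 404 to 406 can be sketched together as follows: given per-template arrays of character and background gray levels (assumed already separated as in step 403), compute the mean and standard deviation of each and treat every (mean, std) pair as the parameters of one Gaussian gray level probability distribution. `build_distribution_sets` is a hypothetical name; the patent does not prescribe a particular representation of the distribution sets.

```python
import numpy as np

def build_distribution_sets(templates):
    """templates: list of (text_gray, bg_gray) 1-D arrays, one pair per
    template picture. Returns two lists of (mean, std) tuples, i.e. the
    parameters of the per-template Gaussian gray level distributions."""
    text_dists, bg_dists = [], []
    for text_gray, bg_gray in templates:
        text_dists.append((float(np.mean(text_gray)), float(np.std(text_gray))))
        bg_dists.append((float(np.mean(bg_gray)), float(np.std(bg_gray))))
    return text_dists, bg_dists
```

Storing only the two moments is enough here because the distributions are declared Gaussian in the claims; a Gaussian is fully determined by its mean and variance.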
Step 407, selecting a gray level probability distribution of the character from the pre-generated gray level probability distribution set of the character, and selecting a gray level probability distribution of the background from the pre-generated gray level probability distribution set of the background.
Step 407 is substantially the same as step 202, and therefore, will not be described again.
Step 408, adjust the gray level of the characters of the sample picture according to the gray level probability distribution of the selected characters, and adjust the gray level of the background of the sample picture according to the gray level probability distribution of the selected background, to obtain the adjusted sample picture.
Step 408 is substantially the same as step 203 and thus will not be described again.
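The adjustment of step 408 can be sketched as below: draw character and background gray levels from the selected Gaussian distributions, then apply noise addition and fuzzification as claim 1 describes. The Gaussian pixel noise and the 3x3 box blur are illustrative choices, since the patent does not name specific noise or blur operators, and `adjust_sample` is a hypothetical helper.

```python
import numpy as np

def adjust_sample(binary_mask, text_dist, bg_dist, noise_std=3.0, rng=None):
    """binary_mask: True where the synthetic characters lie. text_dist and
    bg_dist are (mean, std) pairs of the selected Gaussian gray level
    distributions. Returns the adjusted uint8 sample picture."""
    rng = np.random.default_rng(rng)
    h, w = binary_mask.shape
    img = np.empty((h, w), dtype=float)
    t_mu, t_sigma = text_dist
    b_mu, b_sigma = bg_dist
    # Draw per-pixel gray levels from the chosen distributions.
    img[binary_mask] = rng.normal(t_mu, t_sigma, binary_mask.sum())
    img[~binary_mask] = rng.normal(b_mu, b_sigma, (~binary_mask).sum())
    img += rng.normal(0.0, noise_std, img.shape)  # noise addition
    # Fuzzification: 3x3 box blur via edge padding and averaging.
    padded = np.pad(img, 1, mode="edge")
    blurred = sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return np.clip(blurred, 0, 255).astype(np.uint8)
```

The blur mimics the softened stroke edges of real camera or scan captures, which is what makes such synthetic samples useful for training.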
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for generating a picture in this embodiment highlights the step of generating the gray level probability distribution set of the characters and the gray level probability distribution set of the background in advance. The scheme described in this embodiment can therefore introduce a wider variety of gray level probability distributions for characters and for backgrounds, so that the distribution sets are generated more comprehensively and sample pictures are generated more effectively.
With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present application provides an embodiment of an apparatus for generating a picture, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 5, the apparatus 500 for generating a picture of the present embodiment includes: a generating unit 501, a selecting unit 502 and an adjusting unit 503. The generating unit 501 is configured to randomly generate a sample picture including text and a background; the selecting unit 502 is configured to select a grayscale probability distribution of a character from a pre-generated grayscale probability distribution set of the character, and select a grayscale probability distribution of a background from a pre-generated grayscale probability distribution set of the background; the adjusting unit 503 is configured to adjust the gray scale of the text in the sample picture according to the gray scale probability distribution of the selected text, and adjust the gray scale of the background in the sample picture according to the gray scale probability distribution of the selected background, so as to obtain an adjusted sample picture.
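A minimal sketch of apparatus 500 with its three units as methods is given below. Only the unit structure comes from the text; the rendering of characters is faked with a random mask, and all helper names and defaults are hypothetical.

```python
import random
import numpy as np

class PictureGenerator:
    """Sketch of apparatus 500: generating unit 501, selecting unit 502,
    and adjusting unit 503, expressed as three methods."""

    def __init__(self, text_dists, bg_dists, seed=None):
        self.text_dists = text_dists  # pre-generated (mean, std) pairs for characters
        self.bg_dists = bg_dists      # pre-generated (mean, std) pairs for background
        self.rng = random.Random(seed)
        self.np_rng = np.random.default_rng(seed)

    def generate(self, shape=(32, 128)):
        # Generating unit 501: a random mask stands in for rendered characters.
        return self.np_rng.random(shape) < 0.2

    def select(self):
        # Selecting unit 502: pick one distribution from each pre-generated set.
        return self.rng.choice(self.text_dists), self.rng.choice(self.bg_dists)

    def adjust(self, mask, text_dist, bg_dist):
        # Adjusting unit 503: draw gray levels from the chosen Gaussians.
        img = np.empty(mask.shape)
        img[mask] = self.np_rng.normal(*text_dist, mask.sum())
        img[~mask] = self.np_rng.normal(*bg_dist, (~mask).sum())
        return np.clip(img, 0, 255).astype(np.uint8)
```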
In this embodiment, the specific processing of the generating unit 501, the selecting unit 502 and the adjusting unit 503 of the apparatus 500 for generating a picture may refer to step 201, step 202 and step 203 in the corresponding embodiment of fig. 2.
In some optional implementations of this embodiment, the apparatus 500 further includes: a template acquisition unit, used for acquiring a template picture set, wherein each template picture in the template picture set comprises characters and a background; a gray level acquisition unit, used for acquiring the gray level of the characters and the gray level of the background of each template picture in the template picture set; a mean variance determining unit, used for determining the mean and variance of the gray level of the characters of each template picture in the template picture set and the mean and variance of the gray level of the background; a character gray level distribution determining unit, used for determining the gray level probability distribution of the characters of each template picture according to the mean and variance of the gray level of the characters of each template picture in the template picture set, to obtain a gray level probability distribution set of the characters; and a background gray level distribution determining unit, used for determining the gray level probability distribution of the background of each template picture according to the mean and variance of the gray level of the background of each template picture in the template picture set, to obtain a gray level probability distribution set of the background.
In some optional implementations of this embodiment, the template obtaining unit is further configured to: acquiring a picture to be identified; converting the picture to be identified into a gray-scale image; detecting characters of each line in the gray level image; and for each line of characters, intercepting a part containing the line of characters from the gray-scale image as a template picture to obtain a template picture set.
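The line-cropping part of the template obtaining unit can be sketched as follows. The patent does not specify its line-detection algorithm; horizontal projection (rows containing "ink" pixels form a line) is a common stand-in, and `crop_text_lines` with its `ink_threshold` and `min_height` parameters is purely illustrative.

```python
import numpy as np

def crop_text_lines(gray, ink_threshold=128, min_height=2):
    """Detect character lines in a grayscale image by horizontal projection:
    each maximal run of rows containing pixels darker than `ink_threshold`
    is cropped out as one template picture."""
    ink_rows = (gray < ink_threshold).any(axis=1)
    templates, start = [], None
    for y, has_ink in enumerate(ink_rows):
        if has_ink and start is None:
            start = y                      # a new line begins
        elif not has_ink and start is not None:
            if y - start >= min_height:    # ignore specks thinner than a line
                templates.append(gray[start:y, :])
            start = None
    if start is not None and len(ink_rows) - start >= min_height:
        templates.append(gray[start:, :])  # line touching the bottom edge
    return templates
```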
In some optional implementations of this embodiment, the grayscale acquiring unit is further configured to: carrying out binarization processing on each template picture in the template picture set; dividing each template picture after binarization processing into characters and a background according to the gray level; and respectively acquiring the gray levels of the characters and the background of the template picture before binarization processing aiming at each template picture in the template picture set.
In some optional implementations of this embodiment, the grayscale probability distribution of the text and the grayscale probability distribution of the background are Gaussian distributions.
In some optional implementations of this embodiment, the apparatus 500 further includes: and the training unit is used for training the deep learning neural network for recognizing characters by using the adjusted sample pictures.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use in implementing a server or terminal device of an embodiment of the present application is shown.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as needed.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor includes a generating unit, a selecting unit, and an adjusting unit. The names of these units do not in some cases constitute a limitation of the unit itself; for example, the generating unit may also be described as a "unit that randomly generates a sample picture including text and background".
As another aspect, the present application also provides a non-volatile computer storage medium, which may be the non-volatile computer storage medium included in the apparatus in the above-described embodiments; or it may be a non-volatile computer storage medium that exists separately and is not incorporated into the terminal. The non-volatile computer storage medium stores one or more programs that, when executed by a device, cause the device to: randomly generating a sample picture comprising characters and a background; selecting the gray level probability distribution of characters from a pre-generated gray level probability distribution set of the characters, and selecting the gray level probability distribution of a background from a pre-generated gray level probability distribution set of the background; and adjusting the gray level of the characters of the sample picture according to the gray level probability distribution of the selected characters, and adjusting the gray level of the background of the sample picture according to the gray level probability distribution of the selected background to obtain the adjusted sample picture.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the invention referred to in the present application is not limited to embodiments with the specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents that does not depart from the inventive concept. For example, the above features may be replaced with (but are not limited to) features with similar functions disclosed in the present application.

Claims (12)

1. A method for generating a picture, the method comprising:
randomly generating a sample picture comprising characters and a background;
randomly selecting the gray level probability distribution of characters from a pre-generated gray level probability distribution set of the characters, and randomly selecting the gray level probability distribution of a background from a pre-generated gray level probability distribution set of the background, wherein the gray level probability distribution of the characters and the gray level probability distribution of the background are Gaussian distributions;
and carrying out noise addition and fuzzification processing according to the gray level probability distribution of the selected characters to adjust the gray level of the characters of the sample picture, and carrying out noise addition and fuzzification processing according to the gray level probability distribution of the selected background to adjust the gray level of the background of the sample picture to obtain the adjusted sample picture.
2. The method of claim 1, further comprising the step of pre-generating a set of probability distributions for the gray scale of the text and a set of probability distributions for the gray scale of the background:
acquiring a template picture set, wherein each template picture in the template picture set comprises characters and a background;
acquiring the gray level of characters and the gray level of a background of each template picture in the template picture set;
determining the mean and variance of the gray scale of the characters of each template picture in the template picture set and the mean and variance of the gray scale of the background;
determining the gray level probability distribution of the characters of each template picture according to the mean value and the variance of the gray level of the characters of each template picture in the template picture set to obtain a gray level probability distribution set of the characters;
and determining the gray level probability distribution of the background of each template picture according to the mean value and the variance of the gray level of the background of each template picture in the template picture set to obtain a gray level probability distribution set of the background.
3. The method of claim 2, wherein obtaining the set of template pictures comprises:
acquiring a picture to be identified;
converting the picture to be identified into a gray scale image;
detecting characters of each line in the gray-scale image;
and for each line of characters, intercepting a part containing the line of characters from the gray-scale image as a template picture to obtain a template picture set.
4. The method of claim 2, wherein the obtaining the gray level of the text and the gray level of the background of each template picture in the template picture set comprises:
carrying out binarization processing on each template picture in the template picture set;
dividing each template picture after binarization processing into characters and a background according to the gray level;
and respectively acquiring the gray levels of the characters and the background of the template picture before binarization processing aiming at each template picture in the template picture set.
5. The method according to any one of claims 1-4, further comprising:
and training a deep learning neural network for recognizing characters by using the adjusted sample pictures.
6. An apparatus for generating a picture, the apparatus comprising:
the generating unit is used for randomly generating a sample picture comprising characters and a background;
the selecting unit is used for randomly selecting the gray level probability distribution of characters from a pre-generated gray level probability distribution set of the characters and randomly selecting the gray level probability distribution of a background from a pre-generated gray level probability distribution set of the background, wherein the gray level probability distribution of the characters and the gray level probability distribution of the background are Gaussian distributions;
and the adjusting unit is used for performing noise addition and fuzzification processing according to the gray level probability distribution of the selected characters so as to adjust the gray level of the characters of the sample picture, and performing noise addition and fuzzification processing according to the gray level probability distribution of the selected background so as to adjust the gray level of the background of the sample picture, so that the adjusted sample picture is obtained.
7. The apparatus of claim 6, further comprising:
the template acquisition unit is used for acquiring a template picture set, wherein each template picture in the template picture set comprises characters and a background;
the gray level acquisition unit is used for acquiring the gray level of characters and the gray level of a background of each template picture in the template picture set;
the mean variance determining unit is used for determining the mean and variance of the gray scale of the characters of each template picture in the template picture set and the mean and variance of the gray scale of the background;
the character gray level distribution determining unit is used for determining the gray level probability distribution of the characters of each template picture according to the mean value and the variance of the gray level of the characters of the template picture in the template picture set to obtain a gray level probability distribution set of the characters;
and the background gray level distribution determining unit is used for determining the gray level probability distribution of the background of each template picture according to the mean value and the variance of the gray level of the background of the template picture in the template picture set so as to obtain a gray level probability distribution set of the background.
8. The apparatus of claim 7, wherein the template obtaining unit is further configured to:
acquiring a picture to be identified;
converting the picture to be identified into a gray scale image;
detecting characters of each line in the gray-scale image;
and for each line of characters, intercepting a part containing the line of characters from the gray-scale image as a template picture to obtain a template picture set.
9. The apparatus of claim 7, wherein the grayscale acquisition unit is further configured to:
carrying out binarization processing on each template picture in the template picture set;
dividing each template picture after binarization processing into characters and a background according to the gray level;
and respectively acquiring the gray levels of the characters and the background of the template picture before binarization processing aiming at each template picture in the template picture set.
10. The apparatus according to any one of claims 6-9, further comprising:
and the training unit is used for training a deep learning neural network for recognizing characters by using the adjusted sample picture.
11. An apparatus, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201710012538.8A 2017-01-09 2017-01-09 Method and device for generating pictures Active CN108288064B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710012538.8A CN108288064B (en) 2017-01-09 2017-01-09 Method and device for generating pictures

Publications (2)

Publication Number Publication Date
CN108288064A CN108288064A (en) 2018-07-17
CN108288064B true CN108288064B (en) 2022-06-07

Family

ID=62819197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710012538.8A Active CN108288064B (en) 2017-01-09 2017-01-09 Method and device for generating pictures

Country Status (1)

Country Link
CN (1) CN108288064B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant