CN111178362A - Text image processing method, device, equipment and storage medium - Google Patents

Text image processing method, device, equipment and storage medium

Info

Publication number
CN111178362A
Authority
CN
China
Prior art keywords
text image
target pixel
pixel band
text
band
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911306248.XA
Other languages
Chinese (zh)
Other versions
CN111178362B (en)
Inventor
何胜
喻宁
冯晶凌
柳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN201911306248.XA priority Critical patent/CN111178362B/en
Publication of CN111178362A publication Critical patent/CN111178362A/en
Application granted granted Critical
Publication of CN111178362B publication Critical patent/CN111178362B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/273 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion - removing elements interfering with the pattern to be recognised
    • G06V10/24 - Aligning, centring, orientation detection or correction of the image
    • G06V10/243 - Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a text image processing method, which comprises the following steps: when a text image processing request is received, acquiring a text image corresponding to the text image processing request, and horizontally correcting the text image; when the horizontal correction of the text image is finished, acquiring a target pixel band in the text image; processing the target pixel band through a preset Gaussian mixture model, and judging whether a cross area intersected with text strokes in the text image exists in the target pixel band; and if the target pixel band has a cross region, taking the target pixel band except the cross region as an interference line, and deleting the interference line in the text image. The invention also discloses a text image processing device, equipment and a storage medium. The method improves the accuracy of deleting the interference lines in the text image and effectively avoids the deletion of the writing strokes.

Description

Text image processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of image processing, and in particular, to a text image processing method, apparatus, device, and storage medium.
Background
With the rapid development of computer recognition technology, optical character detection and recognition is applied in an increasingly wide range of fields. Optical character detection removes noise from text images, but text strokes may be broken or deleted in the process of removing that noise.
Current approaches to removing text image noise fall into two categories. The first category is based on color features: the interference lines are segmented from the text image according to the color difference between the interference lines and the text characters, so that the interference lines are removed. Such methods work well when the interference lines differ markedly from the text in color, but they cannot remove interference lines whose color is similar or even identical to that of the text characters. The second category is based on width features: when the widths of the interference lines and of the character strokes differ, the interference lines can be removed by a suitable erosion-dilation operation while the text strokes are preserved. However, when the width of the interference lines matches the stroke width, the erosion-dilation operation may delete a large number of character strokes along with the interference lines.
Disclosure of Invention
The invention mainly aims to provide a text image processing method, a text image processing device, text image processing equipment and a storage medium, so as to solve the technical problem that interference lines in a text image cannot be accurately removed by current text image denoising.
In order to achieve the above object, the present invention provides a text image processing method, including the following steps:
When a text image processing request is received, acquiring a text image corresponding to the text image processing request, and horizontally correcting the text image;
when the horizontal correction of the text image is finished, acquiring a target pixel band in the text image;
processing the target pixel band through a preset Gaussian mixture model, and judging whether a cross area intersected with text strokes in the text image exists in the target pixel band;
and if the target pixel band has a cross region, taking the target pixel band except the cross region as an interference line, and deleting the interference line in the text image.
In one embodiment, the step of horizontally rectifying the text image includes:
carrying out binarization processing on the text image to obtain a binarization area;
extracting the text binarization areas from the binarization area through a preset character classification model, and aggregating the text binarization areas to generate a binarization image;
and projecting the binary image to obtain an inclination angle of the text in the text image, and adjusting the text in the text image according to the inclination angle to finish the horizontal correction of the text image.
In an embodiment, the step of obtaining a target pixel band in the text image when the horizontal rectification of the text image is completed includes:
when the horizontal correction of the text image is finished, dividing the text image into pixel bands according to a preset direction, wherein the preset direction comprises a horizontal direction and a vertical direction;
determining the gray pixel value proportion and/or the black pixel value proportion in the pixel band, and acquiring a target pixel band of which the gray pixel value proportion and/or the black pixel value proportion exceed a preset threshold.
In an embodiment, before the step of processing the target pixel band by a preset gaussian mixture model and determining whether there is an intersection region intersecting with a text stroke in the text image in the target pixel band, the method includes:
when a Gaussian mixture model building instruction is received, obtaining a predefined initial Gaussian model, and adjusting coefficients of the initial Gaussian model according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises an algorithm:
P(X) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(X \mid \mu_k, \lambda)
where P(X) represents the probability of the pixel value over the two classes, K represents the number of classes of the Gaussian mixture model (K = 2), π_k denotes the mixing coefficients, whose sum is 1, λ denotes a constant of the Gaussian mixture model, μ denotes a class center, and X denotes a pixel value.
In an embodiment, the step of processing the target pixel band through a preset gaussian mixture model and determining whether there is an intersection region intersecting with a text stroke in the text image in the target pixel band includes:
analyzing the target pixel band according to the preset quantized values of three RGB channels, and dividing the target pixel band into a black pixel band and a gray pixel band;
classifying the gray pixel bands through a preset Gaussian mixture model, and judging whether a black pixel set exists in the gray pixel bands or not;
if a black pixel set exists in the gray pixel band, a cross region exists in the target pixel band;
if no black pixel set exists in the gray pixel band, no intersection region exists in the target pixel band.
In an embodiment, the step of processing the target pixel band through a preset gaussian mixture model and determining whether there is an intersection region intersecting with a text stroke in the text image in the target pixel band includes:
processing the target pixel band through a preset Gaussian mixture model, and comparing the target pixel band with character strokes in the text image;
if the distance difference between the target pixel band and the character strokes is smaller than a preset difference, judging that a cross area exists in the target pixel band;
and if the distance difference between the target pixel band and the character strokes is larger than or equal to a preset difference, judging that no cross area exists in the target pixel band.
In an embodiment, after the step of processing the target pixel band by a preset gaussian mixture model and determining whether there is an intersection region intersecting with a text stroke in the text image in the target pixel band, the method includes:
if the target pixel zone does not have a cross region, acquiring the pixel value of each pixel point in the target pixel zone;
and when the pixel values of all the pixel points in the target pixel band are the same, taking the target pixel band as an interference line, and deleting the interference line in the text image.
Further, to achieve the above object, the present invention also provides a text image processing apparatus comprising:
the request receiving module is used for acquiring a text image corresponding to a text image processing request and horizontally correcting the text image when the text image processing request is received;
the information acquisition module is used for acquiring a target pixel band in the text image when the horizontal correction of the text image is finished;
the processing and judging module is used for processing the target pixel band through a preset Gaussian mixture model and judging whether a cross area intersected with the text strokes in the text image exists in the target pixel band;
and the interference deleting module is used for taking the target pixel band except the cross area as an interference line and deleting the interference line in the text image if the cross area exists in the target pixel band.
In addition, in order to achieve the above object, the present invention also provides a text image processing apparatus;
the text image processing apparatus includes: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein:
the computer program realizes the steps of the text image processing method as described above when executed by the processor.
In addition, to achieve the above object, the present invention also provides a computer storage medium;
the computer storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the text image processing method as described above.
According to the text image processing method, the text image processing device, the text image processing equipment and the storage medium, in the embodiment of the invention, the writing auxiliary line in the text image is obtained as the target pixel band by analyzing the text image; processing a target pixel band through a preset Gaussian mixture model, and judging whether a cross area formed by intersecting with character strokes exists in the target pixel band; and if the target pixel band has the cross area, taking the target pixel band except the cross area as an interference line, and deleting the interference line in the text image. The accurate deletion of the interference line is realized, and the erroneous deletion of the character strokes is effectively avoided.
Drawings
FIG. 1 is a schematic diagram of an apparatus in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a text image processing method according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating a text image processing method according to a fourth embodiment of the present invention;
fig. 4 is a functional block diagram of a text image processing apparatus according to an embodiment of the invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic structural diagram of a terminal (also called a text image processing apparatus, where the text image processing apparatus may be formed by a separate text image processing device, or may be formed by combining other devices with a text image processing device) in a hardware operating environment according to an embodiment of the present invention.
The terminal of the embodiment of the invention can be a fixed terminal or a mobile terminal, such as an intelligent air conditioner with a networking function, an intelligent lamp, an intelligent power supply, an intelligent sound box, an automatic driving automobile, a Personal Computer (PC), a smart phone, a tablet computer, an electronic book reader, a portable computer and the like.
As shown in fig. 1, the terminal may include: a processor 1001, e.g., a Central Processing Unit (CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002, where the communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WiFi (Wireless Fidelity) interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the terminal may further include a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, and a WiFi module. Besides the display screen, the input unit may also include a touch panel; besides WiFi, the wireless network interface may also be Bluetooth, a probe interface, etc. The sensors include, for example, light sensors, motion sensors, and other sensors. In particular, the light sensor may include an ambient light sensor and a proximity sensor; of course, the mobile terminal may also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described herein again.
Those skilled in the art will appreciate that the terminal structure shown in fig. 1 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a computer program. The computer software product is stored in a storage medium (also called a computer storage medium, computer medium, readable storage medium, computer-readable storage medium, or direct storage medium, etc.; the storage medium may be a non-volatile readable storage medium, such as a RAM, a magnetic disk, or an optical disk) and includes several instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the method according to the embodiments of the present invention.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to call up a computer program stored in the memory 1005 and perform steps in the text image processing method provided by the following embodiment of the present invention.
Referring to fig. 2, in a first embodiment of the text image processing method of the present invention, the text image processing method includes:
step S10, when receiving a text image processing request, acquiring a text image corresponding to the text image processing request, and performing horizontal rectification on the text image.
The text image processing method in the embodiment is applied to a terminal, the terminal receives a text image processing request, and the triggering mode of the text image processing request is not specifically limited, that is, the text image processing request can be actively triggered by a user, for example, the user clicks a "text image processing" key in a display interface of the terminal to trigger the text image processing request; in addition, the text image processing request can also be automatically triggered by the terminal, for example, the terminal performs image scanning, the terminal determines that the scanned image contains text information, and the terminal automatically triggers the text image processing request.
When the terminal receives a text image processing request, the terminal obtains the text image corresponding to the request. The text image is an image containing handwritten characters and writing auxiliary lines, and may be obtained by the user scanning a handwritten document or by the user photographing a handwritten document. The format of the text image is not specifically limited in this embodiment; that is, the text image may be in bmp, jpg, png, tif, gif, or another format.
In this embodiment, the purpose of text image processing is to remove the interference lines in the text image. An interference line in the embodiments of the present invention refers to a writing auxiliary line introduced during printing, for example a horizontal auxiliary line, a vertical auxiliary line, a field-lattice auxiliary line, or a grid auxiliary line for writing. Before processing the text image, the terminal first performs horizontal correction on the text image and then recognizes the corrected text image. Horizontal correction of the text image here means that the terminal translates, tilts, and rotates the characters in the text image so that the text in the corrected image lies in the horizontal direction.
And step S20, when the horizontal correction of the text image is completed, acquiring a target pixel band in the text image.
When the terminal finishes horizontal correction of the text image, the terminal divides the text image into different pixel bands, for example, the terminal takes pixel points with the same abscissa and different ordinates as one pixel band, or the terminal takes pixel points with the same ordinate and different abscissas as one pixel band.
The terminal divides pixel values into two ranges in advance according to their magnitude. If a pixel value falls in the first pixel range, the terminal determines that the corresponding pixel point belongs to a blank area; if the pixel value falls in the second pixel range, the terminal determines that the corresponding pixel point belongs to a non-blank area.
The terminal acquires the pixel value of each pixel point in a pixel band and counts the proportion of the non-blank area in each pixel band. If the non-blank proportion of a pixel band exceeds a preset threshold, the terminal determines that the pixel band corresponds to a writing auxiliary line; if it does not exceed the preset threshold, the terminal determines that the pixel band belongs to normal writing. The terminal takes the pixel bands whose non-blank proportion exceeds the preset threshold as target pixel bands, where the preset threshold can be set flexibly according to the specific situation, for example 80%. The terminal then analyzes the acquired target pixel bands to decide how the writing auxiliary line can be removed, as sketched below and detailed from step S30 onward.
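A minimal sketch of this band selection in Python, assuming a grayscale image in which darker values are ink; the 200 intensity cut-off and the 80% ratio are example values standing in for the blank/non-blank ranges and the preset threshold described above, not fixed requirements of the method.

    import numpy as np

    def find_target_rows(gray, non_blank_thresh=200, ratio_thresh=0.8):
        """Return row indices of horizontal pixel bands whose non-blank ratio exceeds the threshold."""
        non_blank = gray < non_blank_thresh            # pixels darker than the assumed blank range
        row_ratio = non_blank.mean(axis=1)             # non-blank occupation ratio of each row
        return np.where(row_ratio > ratio_thresh)[0]   # candidate writing-auxiliary-line rows

Vertical auxiliary lines would be handled the same way with axis=0.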
and step S30, processing the target pixel band through a preset Gaussian mixture model, and judging whether a cross region intersected with the text strokes in the text image exists in the target pixel band.
The terminal processes a target pixel band through the preset Gaussian mixture model, and can effectively judge whether a cross region intersected with text strokes in the text image exists in the target pixel band.
That is, two implementation manners of processing a target pixel band through a preset gaussian mixture model and determining that a cross region exists in the target pixel band are provided in this embodiment, specifically:
the implementation mode is as follows:
Step a1: analyzing the target pixel band according to the preset quantized values of the three RGB channels, and dividing the target pixel band into a black pixel band and a gray pixel band;
Step a2: classifying the gray pixel band through the preset Gaussian mixture model, and judging whether a black pixel set exists in the gray pixel band;
Step a3: if a black pixel set exists in the gray pixel band, a cross region exists in the target pixel band;
Step a4: if no black pixel set exists in the gray pixel band, no cross region exists in the target pixel band.
That is, the terminal classifies the target pixel band through the preset Gaussian mixture model. The terminal first divides the target pixel band into black pixel bands and gray pixel bands (for the computer, this color division is based on the quantized values of the three RGB channels). The terminal then uses the Gaussian mixture model to divide each gray pixel band into two classes: one is a gray pixel set, which is the original color of the target pixel band; the other is a black pixel set, which consists of the pixels where the target pixel band intersects the characters. According to this classification of the gray pixel bands in the target pixel band, the terminal determines whether the target pixel band intersects the text strokes in the text image to form a cross region. That is, if a gray pixel band contains both a gray pixel set and a black pixel set, a cross region exists in the target pixel band; if the gray pixel band contains only the gray pixel set and no black pixel set, no cross region exists in the target pixel band.
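A hedged sketch of implementation mode one, using scikit-learn's GaussianMixture as a stand-in for the patent's preset model; the min_separation parameter is an assumption introduced here to decide whether the two fitted class centers are far enough apart to indicate a black (stroke-crossing) set rather than two shades of the band's own gray.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def has_crossing(band_intensities, min_separation=60.0):
        """band_intensities: 1-D grayscale values (0 = black) sampled from one gray pixel band."""
        x = np.asarray(band_intensities, dtype=float).reshape(-1, 1)
        gmm = GaussianMixture(n_components=2, random_state=0).fit(x)   # two classes: K = 2
        mu = np.sort(gmm.means_.ravel())
        # Well-separated centers: a darker (black pixel) set plus the band's original gray set.
        return (mu[1] - mu[0]) > min_separation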
The implementation mode two is as follows:
Step b1: processing the target pixel band through the preset Gaussian mixture model, and comparing the target pixel band with the character strokes in the text image;
Step b2: if the distance difference between the target pixel band and the character strokes is smaller than a preset difference, determining that a cross region exists in the target pixel band;
Step b3: if the distance difference between the target pixel band and the character strokes is greater than or equal to the preset difference, determining that no cross region exists in the target pixel band.
The terminal processes the target pixel band through the preset Gaussian mixture model and compares the target pixel band with the character strokes in the text image. The terminal then judges whether the distance difference between the target pixel band and the character strokes is smaller than a preset difference, where the preset difference is a preset distance threshold, for example 0.5 mm. If the distance difference between the target pixel band and the character strokes is smaller than the preset difference, the terminal determines that a cross region exists in the target pixel band; if the distance difference is greater than or equal to the preset difference, the terminal determines that no cross region exists in the target pixel band.
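A rough sketch of this distance test for a horizontal band, assuming the character strokes have already been segmented into a binary mask; the 300 dpi used to convert the 0.5 mm preset difference into pixels, and the use of the minimum row gap as the distance measure, are simplifying assumptions rather than details from the patent.

    import numpy as np

    def band_touches_strokes(band_rows, stroke_mask, max_gap_mm=0.5, dpi=300):
        """band_rows: row indices of the horizontal target pixel band; stroke_mask: boolean 2-D stroke map."""
        max_gap_px = max_gap_mm / 25.4 * dpi                 # preset difference converted to pixels
        stroke_rows = np.where(stroke_mask.any(axis=1))[0]
        if stroke_rows.size == 0:
            return False                                      # no strokes at all, so no cross region
        band_rows = np.asarray(band_rows)
        gap = np.min(np.abs(stroke_rows[:, None] - band_rows[None, :]))
        return gap < max_gap_px                               # closer than the preset difference: crossing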
Step S40, if there is a crossing region in the target pixel band, taking the pixels in the target pixel band except the crossing region as interference lines, and deleting the interference lines in the text image.
If a cross area exists in the target pixel band, that is, character strokes pass through the horizontal writing auxiliary line or through the vertical writing auxiliary line, the terminal takes the portion of the target pixel band outside the cross area as the interference line and deletes the interference line from the text image, for example:
the first method is as follows: the process that the horizontal target pixel zone is gray- > black (one or more) - > gray from top to bottom obtains pixels of upper and lower gray color zones (namely A [0] and A [ N-1]) rows; if a cross region exists in the target pixel band, namely, when character strokes pass through a cross line, the upper part and the lower part of the cross line are intersected with the characters, when the lines A0 and A N-1 are matched with the intersected positions of the characters up and down, whether non-black pixels exist in a line of pixels on the cross line is calculated, if not, namely, no cross region exists, namely, the longitudinal direction is consistent, the pixels of the middle cross line are not required to be removed, and therefore, the character strokes are ensured not to be segmented; namely, when an intersection region with the stroke is found above the transverse target pixel band, but no intersection region with the stroke transverse line is found below the corresponding transverse target pixel band, the intersection region is found in a certain pixel range of the left and right sides right below the transverse target pixel band, if the intersection region is found, the upper and lower alignment forms an oblique line, an extension line with a certain length is constructed upwards, if the extension line is covered by the stroke black pixel, the middle pixel of the upper and lower contact positions is not required to be removed, otherwise, the middle pixel is not reserved, a part of the pixels considered as the transverse line on the transverse target pixel band is removed, and an area considered as the superposition of character strokes is reserved.
The second method is as follows: a vertical-line target pixel band runs, from left to right, gray -> black (one or more columns) -> gray, and the columns of the left and right gray zones (namely columns A[N-1] and A[0]) are obtained. If a cross region exists in the target pixel band, that is, a character stroke passes through the vertical line, the areas to the left and right of the line intersect the characters. When the positions where columns A[N-1] and A[0] intersect the characters match, that is, the directions are consistent, the pixels of the vertical line in the middle are not removed, so the character stroke is not cut.
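A simplified sketch of the first method: the horizontal auxiliary line is erased column by column, and a column is kept only where a stroke touches the band both immediately above and immediately below it, so the stroke is not cut. The darkness threshold, the blank value, and the single-pixel look above/below are assumptions of this sketch; the oblique-stroke handling described above is omitted for brevity.

    import numpy as np

    def erase_horizontal_line(img, top, bottom, dark_thresh=100, blank=255):
        """img: 2-D grayscale array; top/bottom: first and last row index of the target pixel band."""
        out = img.copy()
        h, w = img.shape
        for col in range(w):
            above_dark = top > 0 and img[top - 1, col] < dark_thresh            # stroke just above the band
            below_dark = bottom + 1 < h and img[bottom + 1, col] < dark_thresh  # stroke just below the band
            if not (above_dark and below_dark):       # no stroke passing through this column
                out[top:bottom + 1, col] = blank      # remove the auxiliary-line pixels here
        return out

The second method is the same test transposed: rows become columns and the look is to the left and right of a vertical band.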
In the embodiment, the writing auxiliary line in the text image is obtained as the target pixel band by analyzing the text image; processing a target pixel band through a preset Gaussian mixture model, and judging whether a cross area formed by intersecting with character strokes exists in the target pixel band; and if the target pixel band has the cross area, taking the target pixel band except the cross area as an interference line, and deleting the interference line in the text image. The accurate deletion of the interference line is realized, and the erroneous deletion of the character strokes is effectively avoided.
Further, on the basis of the first embodiment of the present invention, a second embodiment of the text image processing method of the present invention is proposed.
This embodiment is a refinement of step S10 in the first embodiment, and is different from the first embodiment of the present invention in that:
carrying out binarization processing on the text image to obtain a binarization area;
extracting the text binarization areas from the binarization area through a preset character classification model, and aggregating the text binarization areas to generate a binarization image;
and projecting the binary image to obtain an inclination angle of the text in the text image, and adjusting the text in the text image according to the inclination angle to finish the horizontal correction of the text image.
In this embodiment, the terminal first performs graying processing on the text image to obtain a grayscale image, and then performs binarization processing on the grayscale image. Binarization processing means setting the gray value of each pixel point in the grayscale image to 0 or 255; that is, by selecting an appropriate binarization threshold and setting the gray value of each pixel point according to that threshold, an image is obtained that reflects the overall and local characteristics of the grayscale image. Through this binarization processing of the text image, the terminal obtains the binarization area.
A character classification model, obtained through training, is preset in the terminal. The terminal extracts the text information in the binarization area through the preset character classification model, takes the binarization areas corresponding to the text information as text binarization areas, and aggregates all the text binarization areas to generate a binarization image.
The terminal projects the binarized image according to a projection algorithm. The projection algorithm projects the binarized region at different angles, and each angle yields a projection value; the curve formed by these projection values has the shape of a quadratic parabola, whose maximum lies at its vertex. The terminal takes the angle corresponding to the projection value at the vertex of the parabola as the inclination angle of the text, and corrects the text image according to this inclination angle. In this embodiment, the terminal performs direction correction on the text image before analyzing it, which improves the accuracy of the text image analysis.
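A hedged sketch of this projection-based skew estimate, using OpenCV to rotate the binarized text image over a range of candidate angles and keeping the angle whose horizontal projection profile is most sharply peaked (scored here by its variance). The angle range, step, and scoring function are example choices, not values taken from the patent.

    import cv2
    import numpy as np

    def estimate_skew(binary, angle_range=15.0, step=0.5):
        """binary: uint8 binarized image of the text regions (text pixels assumed non-zero)."""
        h, w = binary.shape
        best_angle, best_score = 0.0, -1.0
        for angle in np.arange(-angle_range, angle_range + step, step):
            m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
            rotated = cv2.warpAffine(binary, m, (w, h), flags=cv2.INTER_NEAREST)
            profile = rotated.sum(axis=1).astype(float)   # horizontal projection at this angle
            score = profile.var()                         # peaked profile means rows of aligned text
            if score > best_score:
                best_angle, best_score = angle, score
        return best_angle                                 # inclination angle used for the correction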
Further, on the basis of the above-described embodiment of the present invention, a third embodiment of the text image processing method of the present invention is proposed.
In this embodiment, the terminal pre-constructs a gaussian mixture model to process the text image through the gaussian mixture model, specifically:
when a Gaussian mixture model building instruction is received, obtaining a predefined initial Gaussian model, and adjusting coefficients of the initial Gaussian model according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises an algorithm:
P(X) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(X \mid \mu_k, \lambda)
where P(X) represents the probability of the pixel value over the two classes, K represents the number of classes of the Gaussian mixture model (K = 2), π_k denotes the mixing coefficients, whose sum is 1, λ denotes a constant of the Gaussian mixture model, μ denotes a class center, and X denotes a pixel value.
In this embodiment, the terminal receives a Gaussian mixture model building instruction; the triggering mode of this instruction is not specifically limited. When the terminal receives the Gaussian mixture model building instruction, the terminal obtains the predefined initial Gaussian models and adjusts their coefficients according to the pixels in the text image to obtain the Gaussian mixture model. That is, the terminal first initializes several predefined initial Gaussian models and then processes each pixel in the text image to determine whether it matches one of the initial Gaussian models. If it matches, the terminal keeps that initial Gaussian model in the mixture and updates it according to the pixels in the text image to obtain the Gaussian mixture model; if it does not match, the terminal deletes that initial Gaussian model.
The gaussian mixture model constructed in the embodiment includes an algorithm:
P(X) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(X \mid \mu_k, \lambda)
Here the Gaussian mixture model classifies pixels into two classes, so K = 2; μ denotes a class center; π_k are the mixing coefficients, whose sum is 1; λ represents a constant of the Gaussian mixture model and can be set flexibly according to the specific situation; and X in the formula is a pixel value of the gray band. The Gaussian mixture model divides the pixels of the gray band into two classes, that is, each pixel value is assigned to class K1 or class K2, the two classes being two independent Gaussian (normal) distributions, and P(X) is the probability of the pixel value over the two classes. K1 and K2 correspond to the gray horizontal-line portion and the portion intersecting the strokes, respectively, and are distinguished by the μ values of the two Gaussian distributions: a relatively small μ value for the original color of the gray band of the horizontal line, and a relatively large μ value for the pixels intersecting the characters.
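Written out, assigning a gray-band pixel to class K1 or K2 follows the standard mixture-model responsibility rule; this is only a restatement of the formula and symbol definitions above, not additional matter from the patent:

    \gamma_k(X) = \frac{\pi_k\,\mathcal{N}(X \mid \mu_k, \lambda)}{\sum_{j=1}^{K} \pi_j\,\mathcal{N}(X \mid \mu_j, \lambda)}, \qquad \hat{k}(X) = \arg\max_{k \in \{1, 2\}} \gamma_k(X)

so a pixel whose value lies near the smaller class center is read as the gray line color (K1), and one near the larger class center as a stroke-crossing pixel (K2).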
In this embodiment, a method for constructing a gaussian mixture model is described to realize accurate text image analysis by the constructed gaussian mixture model.
Further, referring to fig. 3, a fourth embodiment of the text image processing method of the present invention is proposed on the basis of the above-mentioned embodiment of the present invention.
This embodiment is a step after step S30 in the first embodiment, and is different from the other embodiments in that:
step S50, if there is no intersection region in the target pixel band, obtaining a pixel value of each pixel point in the target pixel band.
If no cross region exists in the target pixel band, the terminal acquires the pixel value of each pixel point in the target pixel band and compares whether these pixel values are the same. If the pixel values of the pixel points are the same, the terminal determines that the target pixel band is a writing auxiliary line; if the pixel values of the pixel points differ, the terminal outputs prompt information for the user to confirm.
And step S60, when the pixel values of all the pixel points in the target pixel band are the same, taking the target pixel band as an interference line, and deleting the interference line in the text image.
The terminal takes the target pixel zone as an interference line when the pixel values of all pixel points in the target pixel zone are the same, and deletes the interference line in the text image; in this embodiment, when the interference line is deleted, the terminal performs analysis according to the pixel value of the target pixel band, so as to avoid the situation of erroneous deletion and improve the deletion accuracy.
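A minimal sketch of this no-crossing case: the band is treated as an interference line only when every pixel in it has the same value, otherwise it is flagged for user confirmation as described above. The return labels are an assumed convention of this sketch.

    import numpy as np

    def classify_plain_band(band_pixels):
        """band_pixels: array of pixel values from a target pixel band with no cross region."""
        band = np.asarray(band_pixels)
        if np.all(band == band.flat[0]):
            return "interference_line"      # uniform band: safe to delete from the text image
        return "needs_confirmation"         # mixed values: prompt the user instead of deleting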
Furthermore, referring to fig. 4, an embodiment of the present invention further provides a text image processing apparatus, including:
the request receiving module 10 is configured to, when a text image processing request is received, obtain a text image corresponding to the text image processing request, and perform horizontal correction on the text image;
an information obtaining module 20, configured to obtain a target pixel band in the text image when horizontal correction of the text image is completed;
the processing and judging module 30 is configured to process the target pixel band through a preset gaussian mixture model, and judge whether a cross region intersecting with a text stroke in the text image exists in the target pixel band;
and an interference deleting module 40, configured to, if there is a crossing region in the target pixel band, use a portion of the target pixel band other than the crossing region as an interference line, and delete the interference line in the text image.
In one embodiment, the request receiving module 10 includes:
a binarization unit, configured to perform binarization processing on the text image to obtain a binarization area;
the generating and summarizing unit is used for extracting the text binarization areas in the binarization areas through a preset character classification model and summarizing the text binarization areas to generate a binarization image;
and the rotation correction unit is used for projecting the binary image to obtain an inclination angle of the text in the text image, and adjusting the text in the text image according to the inclination angle to finish the horizontal correction of the text image.
In an embodiment, the information obtaining module 20 includes:
the image dividing unit is used for dividing the text image into pixel bands according to a preset direction when the horizontal correction of the text image is finished, wherein the preset direction comprises a horizontal direction and a vertical direction;
and the determining and acquiring unit is used for determining the gray pixel value proportion and/or the black pixel value proportion in the pixel band and acquiring a target pixel band of which the gray pixel value proportion and/or the black pixel value proportion exceed a preset threshold.
In one embodiment, the text image processing apparatus includes:
the model building module is used for obtaining a predefined initial Gaussian model when a Gaussian mixture model building instruction is received, adjusting coefficients of the initial Gaussian model according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises an algorithm:
P(X) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(X \mid \mu_k, \lambda)
where P(X) represents the probability of the pixel value over the two classes, K represents the number of classes of the Gaussian mixture model (K = 2), π_k denotes the mixing coefficients, whose sum is 1, λ denotes a constant of the Gaussian mixture model, μ denotes a class center, and X denotes a pixel value.
In an embodiment, the processing determining module 30 includes:
the pixel division unit is used for analyzing the target pixel band according to the preset quantization values of three RGB channels and dividing the target pixel band into a black pixel band and a gray pixel band;
the classification judgment unit is used for classifying the gray pixel bands through a preset Gaussian mixture model and judging whether a black pixel set exists in the gray pixel bands or not;
a first determination unit, configured to determine that a cross region exists in the target pixel band if a black pixel set exists in the gray pixel band;
a second determining unit, configured to determine that no intersection region exists in the target pixel band if no black pixel set exists in the gray pixel band.
In an embodiment, the processing determining module 30 includes:
the stroke comparison unit is used for processing the target pixel band through a preset Gaussian mixture model and comparing the target pixel band with character strokes in the text image;
a third determination unit, configured to determine that a cross area exists in the target pixel band if a distance difference between the target pixel band and the character stroke is smaller than a preset difference;
and the fourth judging unit is used for judging that no cross area exists in the target pixel band if the distance difference between the target pixel band and the character strokes is greater than or equal to a preset difference.
In one embodiment, the text image processing apparatus includes:
the pixel acquisition module is used for acquiring the pixel value of each pixel point in the target pixel band if no intersection region exists in the target pixel band;
and the interference deleting module is used for taking the target pixel band as an interference line when the pixel values of all the pixel points in the target pixel band are the same, and deleting the interference line in the text image.
The steps implemented by each functional module of the text image processing apparatus may refer to each embodiment of the text image processing method of the present invention, and are not described herein again.
In addition, the embodiment of the invention also provides a computer storage medium.
The computer storage medium stores thereon a computer program that, when executed by a processor, implements operations in the text image processing method provided by the above-described embodiments.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity/action/object from another entity/action/object without necessarily requiring or implying any actual such relationship or order between such entities/actions/objects; the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
For the apparatus embodiment, since it is substantially similar to the method embodiment, it is described relatively simply, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, in that elements described as separate components may or may not be physically separate. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A text image processing method, characterized by comprising the steps of:
when a text image processing request is received, acquiring a text image corresponding to the text image processing request, and horizontally correcting the text image;
when the horizontal correction of the text image is finished, acquiring a target pixel band in the text image;
processing the target pixel band through a preset Gaussian mixture model, and judging whether a cross area intersected with text strokes in the text image exists in the target pixel band;
and if the target pixel band has a cross region, taking the target pixel band except the cross region as an interference line, and deleting the interference line in the text image.
2. The text image processing method according to claim 1, wherein the step of horizontally rectifying the text image comprises:
carrying out binarization processing on the text image to obtain a binarization area;
extracting a text binarization area in the binarization area through a preset character classification model, and summarizing each text binarization area to generate a binarization image;
and projecting the binary image to obtain an inclination angle of the text in the text image, and adjusting the text in the text image according to the inclination angle to finish the horizontal correction of the text image.
3. The text image processing method according to claim 1, wherein the step of acquiring the target pixel band in the text image when the horizontal rectification of the text image is completed comprises:
when the horizontal correction of the text image is finished, dividing the text image into pixel bands according to a preset direction, wherein the preset direction comprises a horizontal direction and a vertical direction;
determining the gray pixel value proportion and/or the black pixel value proportion in the pixel band, and acquiring a target pixel band of which the gray pixel value proportion and/or the black pixel value proportion exceed a preset threshold.
4. The text image processing method according to claim 1, wherein the step of processing the target pixel band by a preset gaussian mixture model and determining whether there is an intersection region intersecting a text stroke in the text image in the target pixel band comprises:
when a Gaussian mixture model building instruction is received, obtaining a predefined initial Gaussian model, and adjusting coefficients of the initial Gaussian model according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises an algorithm:
P(X) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(X \mid \mu_k, \lambda)
where P(X) represents the probability of the pixel value over the two classes, K represents the number of classes of the Gaussian mixture model (K = 2), π_k denotes the mixing coefficients, whose sum is 1, λ denotes a constant of the Gaussian mixture model, μ denotes a class center, and X denotes a pixel value.
5. The text image processing method according to claim 1, wherein the step of processing the target pixel band through a preset gaussian mixture model and determining whether there is an intersection region intersecting a text stroke in the text image in the target pixel band comprises:
analyzing the target pixel band according to the preset quantized values of three RGB channels, and dividing the target pixel band into a black pixel band and a gray pixel band;
classifying the gray pixel bands through a preset Gaussian mixture model, and judging whether a black pixel set exists in the gray pixel bands or not;
if a black pixel set exists in the gray pixel band, a cross region exists in the target pixel band;
if no black pixel set exists in the gray pixel band, no intersection region exists in the target pixel band.
6. The text image processing method according to claim 1, wherein the step of processing the target pixel band through a preset gaussian mixture model and determining whether there is an intersection region intersecting a text stroke in the text image in the target pixel band comprises:
processing the target pixel band through a preset Gaussian mixture model, and comparing the target pixel band with character strokes in the text image;
if the distance difference between the target pixel band and the character strokes is smaller than a preset difference, judging that a cross area exists in the target pixel band;
and if the distance difference between the target pixel band and the character strokes is larger than or equal to a preset difference, judging that no cross area exists in the target pixel band.
7. The method according to any one of claims 1 to 6, wherein the step of processing the target pixel band by a preset Gaussian mixture model and determining whether there is an intersection region intersecting with a text stroke in the text image in the target pixel band is followed by:
if the target pixel zone does not have a cross region, acquiring the pixel value of each pixel point in the target pixel zone;
and when the pixel values of all the pixel points in the target pixel band are the same, taking the target pixel band as an interference line, and deleting the interference line in the text image.
8. A text image processing apparatus characterized by comprising:
the request receiving module is used for acquiring a text image corresponding to a text image processing request and horizontally correcting the text image when the text image processing request is received;
the information acquisition module is used for acquiring a target pixel band in the text image when the horizontal correction of the text image is finished;
the processing and judging module is used for processing the target pixel band through a preset Gaussian mixture model and judging whether a cross area intersected with the text strokes in the text image exists in the target pixel band;
and the interference deleting module is used for taking the target pixel band except the cross area as an interference line and deleting the interference line in the text image if the cross area exists in the target pixel band.
9. A text image processing apparatus characterized by comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein:
the computer program, when executed by the processor, implements the steps of the text image processing method of any one of claims 1 to 7.
10. A computer storage medium, characterized in that the computer storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the text image processing method according to any one of claims 1 to 7.
CN201911306248.XA 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium Active CN111178362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911306248.XA CN111178362B (en) 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911306248.XA CN111178362B (en) 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111178362A true CN111178362A (en) 2020-05-19
CN111178362B (en) 2023-05-26

Family

ID=70652160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911306248.XA Active CN111178362B (en) 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111178362B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6438265B1 (en) * 1998-05-28 2002-08-20 International Business Machines Corp. Method of binarization in an optical character recognition system
CN105069452A (en) * 2015-08-07 2015-11-18 武汉理工大学 Straight line removing method based on local structure analysis
WO2017162069A1 (en) * 2016-03-25 2017-09-28 阿里巴巴集团控股有限公司 Image text identification method and apparatus
CN108805126A (en) * 2017-04-28 2018-11-13 上海斯睿德信息技术有限公司 A kind of long interfering line minimizing technology of text image
CN108681729A (en) * 2018-05-08 2018-10-19 腾讯科技(深圳)有限公司 Text image antidote, device, storage medium and equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814780A (en) * 2020-07-08 2020-10-23 重庆农村商业银行股份有限公司 Bill image processing method, device and equipment and storage medium
CN111814780B (en) * 2020-07-08 2023-05-26 重庆农村商业银行股份有限公司 Bill image processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111178362B (en) 2023-05-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant