CN111178362B - Text image processing method, device, equipment and storage medium - Google Patents

Text image processing method, device, equipment and storage medium Download PDF

Info

Publication number
CN111178362B
CN111178362B (Application CN201911306248.XA)
Authority
CN
China
Prior art keywords
text image
target pixel
text
pixel band
band
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911306248.XA
Other languages
Chinese (zh)
Other versions
CN111178362A (en)
Inventor
何胜
喻宁
冯晶凌
柳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN201911306248.XA
Publication of CN111178362A
Application granted
Publication of CN111178362B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/273 - removing elements interfering with the pattern to be recognised
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/24 - Aligning, centring, orientation detection or correction of the image
    • G06V10/243 - by compensating for image skew or non-uniform image deformations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Facsimile Image Signal Circuits (AREA)
  • Character Input (AREA)

Abstract

The invention discloses a text image processing method, which comprises the following steps: when a text image processing request is received, acquiring a text image corresponding to the text image processing request, and horizontally correcting the text image; when the horizontal correction of the text image is completed, acquiring a target pixel band in the text image; processing the target pixel band through a preset Gaussian mixture model, and judging whether an intersection area intersecting with text strokes in the text image exists in the target pixel band or not; and if the crossing area exists in the target pixel band, taking the other parts except the crossing area in the target pixel band as interference lines, and deleting the interference lines in the text image. The invention also discloses a text image processing device, equipment and a storage medium. The invention improves the accuracy of deleting the interference line in the text image and effectively avoids deleting the writing strokes.

Description

Text image processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of image processing, and in particular, to a text image processing method, apparatus, device, and storage medium.
Background
With the rapid development of computer recognition technology, the application field of optical character detection and recognition technology is becoming wider and wider. In optical character detection, noise in the text image is removed beforehand, and text strokes may be broken or deleted during this noise-removal process.
Current approaches to removing noise from text images can be summarized into two categories. The first category is based on color features: according to the color difference between the interference lines and the text characters, the interference lines are segmented out of the text image, thereby removing them; this method works well for interference lines whose color differs significantly from that of the text, but it cannot remove interference lines whose color is similar or even identical to that of the text characters. The second category is based on width features: according to the difference between the width of the interference line and the width of the text character strokes, when the two widths differ, the interference line can be removed by an appropriate erosion-dilation operation while the text strokes are retained; however, when the width of the interference line is consistent with the width of the character strokes, the erosion-dilation operation removes a large number of character strokes along with the interference line.
Disclosure of Invention
The invention mainly aims to provide a text image processing method, a device, equipment and a storage medium, which aim to solve the technical problem that interference lines in a text image cannot be accurately removed when the current text image is denoised.
To achieve the above object, the present invention provides a text image processing method comprising the following steps:
When a text image processing request is received, acquiring a text image corresponding to the text image processing request, and horizontally correcting the text image;
when the horizontal correction of the text image is completed, acquiring a target pixel band in the text image;
processing the target pixel band through a preset Gaussian mixture model, and judging whether an intersection area intersecting with text strokes in the text image exists in the target pixel band or not;
and if the crossing area exists in the target pixel band, taking the other parts except the crossing area in the target pixel band as interference lines, and deleting the interference lines in the text image.
In one embodiment, the step of horizontally correcting the text image includes:
performing binarization processing on the text image to obtain a binarization area;
extracting text binarization areas in the binarization areas through a preset character classification model, and summarizing the text binarization areas to generate a binarization image;
projecting the binarized image to obtain an inclination angle of a text in the text image, and adjusting the text in the text image according to the inclination angle to finish horizontal correction of the text image.
In one embodiment, the step of acquiring the target pixel band in the text image when the text image level correction is completed includes:
dividing the text image into pixel bands according to a preset direction when the horizontal correction of the text image is completed, wherein the preset direction comprises a horizontal direction and a vertical direction;
and determining a gray pixel value duty ratio and/or a black pixel value duty ratio in the pixel band, and acquiring a target pixel band of which the gray pixel value duty ratio and/or the black pixel value duty ratio exceeds a preset threshold.
In one embodiment, before the step of processing the target pixel band through a preset gaussian mixture model and determining whether an intersection area intersecting with a text stroke in the text image exists in the target pixel band, the method includes:
when a Gaussian mixture model construction instruction is received, a predefined initial Gaussian model is obtained, and coefficients of the initial Gaussian model are adjusted according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises the following algorithm:
P(X) = Σ_{k=1}^{K} π_k · N(X | μ_k, λ)
where P(X) represents the probability of the pixel value over the two classes, K represents the number of Gaussian mixture model classes (K = 2), π_k represents the mixing coefficients, whose sum is 1, λ represents a constant of the Gaussian mixture model, μ represents the class center, and X represents the pixel value.
In one embodiment, the step of processing the target pixel band through a preset gaussian mixture model and determining whether an intersection area intersecting with a text stroke in the text image exists in the target pixel band includes:
analyzing the target pixel band according to the quantized values of the preset RGB three channels, and dividing the target pixel band into a black pixel band and a gray pixel band;
classifying the gray pixel bands through a preset Gaussian mixture model, and judging whether black pixel sets exist in the gray pixel bands or not;
if a black pixel set exists in the gray pixel band, an intersection area exists in the target pixel band;
if there is no black pixel set in the gray pixel band, there is no intersection region in the target pixel band.
In one embodiment, the step of processing the target pixel band through a preset gaussian mixture model and determining whether an intersection area intersecting with a text stroke in the text image exists in the target pixel band includes:
processing the target pixel band through a preset Gaussian mixture model, and comparing the target pixel band with character strokes in the text image;
if the distance difference between the target pixel zone and the character strokes is smaller than a preset difference value, judging that a crossing area exists in the target pixel zone;
and if the distance difference between the target pixel band and the character strokes is greater than or equal to a preset difference value, judging that no crossing area exists in the target pixel band.
In one embodiment, after the step of processing the target pixel band by a preset gaussian mixture model and determining whether there is an intersection area intersecting with a text stroke in the text image in the target pixel band, the method includes:
if the target pixel zone does not have the intersection area, acquiring the pixel value of each pixel point in the target pixel zone;
and when the pixel values of all the pixel points in the target pixel band are the same, taking the target pixel band as an interference line, and deleting the interference line in the text image.
In addition, in order to achieve the above object, the present invention also provides a text image processing apparatus including:
the request receiving module is used for acquiring a text image corresponding to the text image processing request and horizontally correcting the text image when the text image processing request is received;
the information acquisition module is used for acquiring a target pixel band in the text image when the horizontal correction of the text image is completed;
the processing judging module is used for processing the target pixel band through a preset Gaussian mixture model and judging whether an intersection area intersecting with the text strokes in the text image exists in the target pixel band or not;
and the interference deleting module is used for taking the target pixel band except the intersection area as an interference line and deleting the interference line in the text image if the intersection area exists in the target pixel band.
In addition, in order to achieve the above object, the present invention also provides a text image processing apparatus;
the text image processing apparatus includes: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein:
the computer program when executed by the processor implements the steps of the text image processing method as described above.
In addition, in order to achieve the above object, the present invention also provides a computer storage medium;
the computer storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the text image processing method as described above.
According to the text image processing method, device, equipment and storage medium provided by the embodiment of the invention, through analyzing the text image, a writing auxiliary line in the text image is obtained as a target pixel band; processing a target pixel zone through a preset Gaussian mixture model, and judging whether a crossing area formed by crossing the character strokes exists in the target pixel zone or not; if the crossing area exists in the target pixel band, the other parts except the crossing area in the target pixel band are used as interference lines, and the interference lines in the text image are deleted. The accurate deletion of the interference line is realized, and the erroneous deletion of the character strokes is effectively avoided.
Drawings
FIG. 1 is a schematic diagram of a device architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a text image processing method according to a first embodiment of the present invention;
FIG. 3 is a flowchart of a text image processing method according to a fourth embodiment of the present invention;
fig. 4 is a schematic functional block diagram of an embodiment of a text image processing apparatus according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As shown in fig. 1, fig. 1 is a schematic structural diagram of a terminal (also called text image processing device) of a hardware running environment according to an embodiment of the present invention, where the text image processing device may be formed by a separate text image processing device, or may be formed by a combination of other devices and a text image processing device.
The terminal of the embodiment of the invention can be a fixed terminal or a mobile terminal, such as a networked smart air conditioner, a smart lamp, a smart power supply, a smart speaker, a self-driving car, a personal computer (PC), a smartphone, a tablet computer, an e-book reader, a portable computer, and the like.
As shown in fig. 1, the terminal may include: a processor 1001 (e.g., a Central Processing Unit, CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (WiFi) interface). The memory 1005 may be a high-speed RAM memory or a stable (non-volatile) memory, such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.
Optionally, the terminal may further include a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, and a WiFi module; the input unit may be, for example, a display screen or a touch screen; besides WiFi, the wireless network interface may optionally be Bluetooth, a probe interface, or the like. The sensors include, for example, light sensors and motion sensors. In particular, the light sensor may include an ambient light sensor and a proximity sensor; of course, the mobile terminal may also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described in detail here.
It will be appreciated by those skilled in the art that the terminal structure shown in fig. 1 is not limiting of the terminal and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in fig. 1, the computer software product is stored in a storage medium (also called a computer storage medium, computer medium, readable storage medium, computer-readable storage medium, or simply medium), which may be a non-volatile readable storage medium such as a RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the method according to the embodiments of the present invention. The memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a computer program.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be used to call a computer program stored in the memory 1005 and execute steps in the text image processing method provided in the following embodiment of the present invention.
Referring to fig. 2, in a first embodiment of the text image processing method of the present invention, the text image processing method includes:
step S10, when a text image processing request is received, a text image corresponding to the text image processing request is acquired, and horizontal correction is carried out on the text image.
The text image processing method in this embodiment is applied to a terminal, the terminal receives a text image processing request, and the triggering mode of the text image processing request is not particularly limited, that is, the text image processing request may be actively triggered by a user, for example, the user clicks a "text image processing" button in a display interface of the terminal to trigger the text image processing request; in addition, the text image processing request may also be automatically triggered by the terminal, for example, the terminal performs image scanning, the terminal determines that the scanned image contains text information, and the terminal automatically triggers the text image processing request.
When receiving a text image processing request, the terminal acquires a text image corresponding to the text image processing request, wherein the text image refers to a text image containing handwriting and writing auxiliary lines, the text image can be obtained by scanning a handwriting file by a user, and the text image can also be obtained by shooting the handwriting file by the user, and the format of the text image in the embodiment is not particularly limited, that is, the text image can be in bmp, jpg, png, tif or gif format, and the like.
The purpose of text image processing in this embodiment is to remove the interference lines in the text image. The interference lines here refer to writing auxiliary lines introduced during printing, such as horizontal guide lines, vertical guide lines, tian-grid (田字格) auxiliary lines, or mi-grid (米字格) auxiliary lines used for writing. Before processing the text image, the terminal first performs horizontal correction on it and then recognizes the corrected text image. In this embodiment, horizontal correction of the text image means that the terminal translates, tilts, and rotates the characters in the text image so that, after correction, the text in the text image lies in the horizontal direction.
And step S20, when the horizontal correction of the text image is completed, acquiring a target pixel band in the text image.
When the horizontal correction of the text image is completed, the terminal divides the text image into different pixel bands: for example, the terminal takes pixels with the same ordinate but different abscissas as one pixel band (a horizontal band), or takes pixels with the same abscissa but different ordinates as one pixel band (a vertical band).
The terminal divides pixel values into two ranges in advance according to their magnitude. If a pixel value falls within the first pixel range, the terminal judges that the corresponding pixel point belongs to a blank area; if the pixel value falls within the second pixel range, the terminal judges that the corresponding pixel point belongs to a non-blank area.
The terminal acquires the pixel values of all pixel points in each pixel band and counts the ratio of the non-blank area in each band. If the non-blank ratio of a pixel band exceeds a preset threshold, the terminal judges that the pixel band corresponds to a writing auxiliary line; if it does not exceed the preset threshold, the terminal judges that the pixel band is a normal writing line. The terminal takes a pixel band whose non-blank ratio exceeds the preset threshold as the target pixel band; the preset threshold can be set flexibly according to the specific situation, for example to 80%. The terminal then acquires the target pixel band and analyzes it to remove the writing auxiliary line, as detailed in step S30 below.
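Before step S30, a minimal sketch of the band-selection logic just described follows. It assumes the corrected page is available as a grayscale NumPy array; the blank/non-blank split value and the 80% threshold are illustrative, and the function and parameter names are not from the patent.

```python
import numpy as np

def find_target_rows(gray, blank_threshold=200, ratio_threshold=0.8):
    """Return indices of horizontal pixel bands (rows) whose non-blank ratio
    exceeds ratio_threshold, i.e. candidate writing auxiliary lines.

    gray: 2-D uint8 array of the horizontally corrected text image.
    blank_threshold: pixels brighter than this are treated as the blank area.
    """
    non_blank = gray < blank_threshold            # True where the pixel is gray/black
    ratios = non_blank.mean(axis=1)               # non-blank ratio of every row band
    return np.where(ratios > ratio_threshold)[0]  # rows dominated by a long line

# Example: candidate_rows = find_target_rows(corrected_image)
```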
and step S30, processing the target pixel band through a preset Gaussian mixture model, and judging whether an intersection area intersected with the text strokes in the text image exists in the target pixel band or not.
The terminal processes the target pixel band through the preset Gaussian mixture model, and whether an intersection area intersected with text strokes in the text image exists in the target pixel band can be effectively judged.
That is, in this embodiment, two implementation manners of processing a target pixel band by a preset gaussian mixture model and determining that there is an intersection area in the target pixel band are provided, specifically:
the implementation mode is as follows:
step a1, analyzing the target pixel band according to the quantized values of three preset RGB channels, and dividing the target pixel band into a black pixel band and a gray pixel band;
step a2, classifying the gray pixel bands through a preset Gaussian mixture model, and judging whether a black pixel set exists in the gray pixel bands or not;
step a3, if a black pixel set exists in the gray pixel band, an intersection area exists in the target pixel band;
and a step a4, if the black pixel set does not exist in the gray pixel band, the crossing area does not exist in the target pixel band.
The terminal classifies the target pixel band into a black pixel band and a gray pixel band (for a computer, the two are distinguished by the quantized values of the three RGB channels). The terminal then uses the Gaussian mixture model to classify the gray pixel band into two classes: one is a gray pixel set, which is the original color of the target pixel band; the other is a black pixel set, which consists of the pixels where the target pixel band intersects characters. According to this classification of the gray pixel band within the target pixel band, the terminal judges whether the target pixel band intersects text strokes in the text image to form an intersection area: if the gray pixel band contains both a gray pixel set and a black pixel set, an intersection area exists in the target pixel band; if the gray pixel band contains only a gray pixel set and no black pixel set, no intersection area exists in the target pixel band.
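A rough sketch of this implementation is given below. scikit-learn's GaussianMixture stands in for the preset Gaussian mixture model of the embodiment; which class center corresponds to the stroke intersection depends on the pixel encoding (a standard grayscale convention with black = 0 is assumed here), and the separation gap is an illustrative value.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gray_band_has_intersection(gray_band_pixels, min_center_gap=30.0):
    """Classify the pixels of a gray pixel band into two classes with a
    two-component Gaussian mixture model and report whether a darker (black)
    pixel set exists, i.e. whether character strokes cross the band.

    gray_band_pixels: 1-D array of grayscale values belonging to the gray band.
    """
    X = np.asarray(gray_band_pixels, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
    centers = gmm.means_.ravel()                  # the two class centers (mu values)
    labels = gmm.predict(X)
    dark_class = int(np.argmin(centers))          # class of the darker pixels
    dark_count = int(np.sum(labels == dark_class))
    # An intersection area exists when a black pixel set is present and the two
    # class centers are clearly separated; min_center_gap is an illustrative value.
    return dark_count > 0 and (centers.max() - centers.min()) > min_center_gap
```

With such a helper, the decision of steps a2 to a4 reduces to calling it on the gray pixel band extracted from the target pixel band.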
The implementation mode II is as follows:
step b1, processing the target pixel band through a preset Gaussian mixture model, and comparing the target pixel band with character strokes in the text image;
step b2, if the distance difference between the target pixel zone and the character strokes is smaller than a preset difference value, judging that a crossing area exists in the target pixel zone;
and b3, if the difference value of the distances between the target pixel band and the character strokes is larger than or equal to a preset difference value, judging that no crossing area exists in the target pixel band.
The terminal processes the target pixel band through the preset Gaussian mixture model and compares the target pixel band with the character strokes in the text image, judging whether the distance difference between the target pixel band and the character strokes is smaller than a preset difference value; the preset difference value is a preset distance threshold, for example 0.5 mm. If the distance difference between the target pixel band and the character strokes is smaller than the preset difference value, the terminal judges that an intersection area exists in the target pixel band; if the distance difference is greater than or equal to the preset difference value, the terminal judges that no intersection area exists in the target pixel band.
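One hedged way to realize this distance comparison is a distance transform over the stroke mask, as sketched below; max_gap_px stands in for the preset difference value (e.g. 0.5 mm converted to pixels at the scan resolution), and all names are illustrative rather than taken from the patent.

```python
import numpy as np
import cv2

def band_crosses_strokes(stroke_mask, band_mask, max_gap_px=3.0):
    """Report whether the target pixel band lies within max_gap_px of the
    nearest character-stroke pixel, i.e. whether an intersection area exists.

    stroke_mask, band_mask: boolean arrays marking stroke pixels / band pixels.
    """
    # Distance (in pixels) from every location to the nearest stroke pixel:
    # stroke pixels are set to 0 so the transform measures distance to them.
    not_stroke = np.where(stroke_mask, 0, 255).astype(np.uint8)
    dist = cv2.distanceTransform(not_stroke, cv2.DIST_L2, 3)
    return float(dist[band_mask].min()) < max_gap_px
```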
And step S40, if the crossing area exists in the target pixel band, taking the other parts except the crossing area in the target pixel band as interference lines, and deleting the interference lines in the text image.
If there is an intersection area in the target pixel band, that is, when the text stroke passes through the horizontal writing auxiliary line or the text stroke passes through the vertical writing auxiliary line, the terminal takes the target pixel band except the intersection area as an interference line, and deletes the interference line in the text image, for example:
mode one: the horizontal target pixel zone is the gray- > black (one or more) - > gray process from top to bottom, obtain the gray zone pixel (namely A [0] and A [ N-1 ]) row pixel up and down; if a crossing area exists in the target pixel band, namely, when the character strokes pass through the transverse lines, the upper and lower parts of the transverse lines are intersected with the characters, when the intersecting positions of the A0 row and the A N-1 row are matched up and down with the characters, namely, whether a non-black pixel exists in a row of pixels above the transverse lines is calculated, if the non-black pixel exists, namely, no crossing area exists, namely, when the longitudinal directions are consistent, the middle transverse line pixels are not removed, so that the character strokes are not cut; namely, a region intersecting with the strokes is found above the horizontal target pixel band, but when the corresponding horizontal target pixel band is not found right below the horizontal target pixel band, the region intersecting with the strokes is found in a certain pixel range right below left and right, if the region intersecting with the strokes is found, oblique lines are formed by upper and lower alignment, an extension line with a certain length is constructed upwards, if the extension line is covered by the black strokes, intermediate pixels at upper and lower contact positions are not required to be removed, otherwise, the region is not reserved, and the effect that a part of pixels which are regarded as horizontal lines on the horizontal target pixel band are removed is realized, and the region which is regarded as overlapping of character strokes is reserved.
Mode two: viewed from left to right, a vertical target pixel band follows the pattern gray -> black (one or more columns) -> gray, and the terminal obtains the columns of gray-zone pixels to its left and right (namely A[0] and A[N-1]). If an intersection area exists in the target pixel band, i.e., when character strokes pass through the vertical line, the left and right sides of the vertical line intersect the characters; when the intersection positions in columns A[0] and A[N-1] match the characters on both sides, i.e., are consistent across the line, the vertical-line pixels in between are not removed, thus ensuring that the character strokes are not cut.
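A much-simplified sketch of mode one is given below: line pixels are whitened column by column unless the rows just above and below the band both contain stroke pixels at that column, in which case the crossing is kept. The oblique-stroke handling described above is omitted, and the threshold and names are illustrative. Mode two is the same logic applied to columns instead of rows.

```python
import numpy as np

def remove_horizontal_line(gray, top, bottom, black_threshold=80, white=255):
    """Delete a horizontal interference line occupying rows top..bottom (inclusive),
    keeping the columns where character strokes cross the line.

    gray: 2-D uint8 array, modified in place; rows top-1 and bottom+1 play the
    role of the A[0] / A[N-1] gray-zone rows referred to above.
    """
    above = gray[top - 1, :] < black_threshold     # stroke pixel directly above the line
    below = gray[bottom + 1, :] < black_threshold  # stroke pixel directly below the line
    crossing = above & below                       # vertically consistent -> stroke crosses
    gray[top:bottom + 1, ~crossing] = white        # wipe line pixels where nothing crosses
    return gray
```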
In the embodiment, a writing auxiliary line in a text image is obtained as a target pixel band through text image analysis; processing a target pixel zone through a preset Gaussian mixture model, and judging whether a crossing area formed by crossing the character strokes exists in the target pixel zone or not; if the crossing area exists in the target pixel band, the other parts except the crossing area in the target pixel band are used as interference lines, and the interference lines in the text image are deleted. The accurate deletion of the interference line is realized, and the erroneous deletion of the character strokes is effectively avoided.
Further, on the basis of the first embodiment of the present invention, a second embodiment of the text image processing method of the present invention is proposed.
This embodiment is a refinement of step S10 in the first embodiment, and differs from the first embodiment of the present invention in that:
performing binarization processing on the text image to obtain a binarization area;
extracting text binarization areas in the binarization areas through a preset character classification model, and summarizing the text binarization areas to generate a binarization image;
projecting the binarized image to obtain an inclination angle of a text in the text image, and adjusting the text in the text image according to the inclination angle to finish horizontal correction of the text image.
In this embodiment, the terminal first converts the text image to grayscale to obtain a gray image, and then binarizes the gray image. Binarization means selecting an appropriate binarization threshold and setting the gray value of each pixel point in the gray image to 0 or 255 according to that threshold, so as to obtain an image that reflects the overall and local characteristics of the gray image. By binarizing the text image, the terminal obtains a binarization area.
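A minimal sketch of this graying-and-binarization step using OpenCV; Otsu's method stands in here for the "appropriate binarization threshold" and is an assumption, not the patent's stated choice, and text_image is an illustrative variable name.

```python
import cv2

# text_image: BGR image loaded with cv2.imread
gray = cv2.cvtColor(text_image, cv2.COLOR_BGR2GRAY)                            # graying process
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)   # 0/255 binarization
```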
The terminal extracts text information in the binarization area through the preset character classification model, takes the binarization area corresponding to the text information as the text binarization area, and gathers all the text binarization areas to generate a binarization image.
The terminal projects the binarized image according to a projection algorithm: the binarized region is projected at different angles, and each angle yields a projection value. The curve formed by these projection values has the shape of a quadratic parabola whose maximum lies at its vertex; the terminal takes the angle corresponding to the projection value at the vertex as the inclination angle of the text and corrects the text image according to this inclination angle. In this embodiment, the terminal performs direction correction on the text image before analyzing it, thereby improving the accuracy of the text image analysis.
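A minimal sketch of the projection step, assuming OpenCV and NumPy. Instead of explicitly fitting the parabola described above, it scores the horizontal projection profile at each candidate angle and keeps the best one; the angle range, step, and the variance score are illustrative choices, not the patent's exact formulation.

```python
import numpy as np
import cv2

def estimate_skew_angle(binary, angle_range=10.0, step=0.5):
    """Estimate the text inclination angle of a binarized image by projecting
    it at different angles and taking the angle with the sharpest row profile.

    binary: 2-D uint8 image with text pixels as 255 and background as 0.
    """
    h, w = binary.shape
    center = (w / 2, h / 2)
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-angle_range, angle_range + step, step):
        rot = cv2.getRotationMatrix2D(center, angle, 1.0)
        rotated = cv2.warpAffine(binary, rot, (w, h), flags=cv2.INTER_NEAREST)
        profile = rotated.sum(axis=1)            # horizontal projection at this angle
        score = profile.var()                    # peaks sharply when text rows are level
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# The text image can then be rotated by the estimated angle with cv2.warpAffine
# to complete the horizontal correction.
```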
Further, on the basis of the above-described embodiment of the present invention, a third embodiment of the text image processing method of the present invention is proposed.
In this embodiment, the terminal builds a gaussian mixture model in advance to process the text image through the gaussian mixture model, specifically:
when a Gaussian mixture model construction instruction is received, a predefined initial Gaussian model is obtained, and coefficients of the initial Gaussian model are adjusted according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises the following algorithm:
P(X) = Σ_{k=1}^{K} π_k · N(X | μ_k, λ)
where P(X) represents the probability of the pixel value over the two classes, K represents the number of Gaussian mixture model classes (K = 2), π_k represents the mixing coefficients, whose sum is 1, λ represents a constant of the Gaussian mixture model, μ represents the class center, and X represents the pixel value.
In this embodiment, the triggering manner of the Gaussian mixture model construction instruction is not specifically limited. When the terminal receives the Gaussian mixture model construction instruction, it acquires the predefined initial Gaussian models and adjusts their coefficients according to the pixels in the text image to obtain the Gaussian mixture model. That is, the terminal first initializes several predefined initial Gaussian models, then processes each pixel in the text image and determines whether it matches one of the initial Gaussian models; if so, the terminal classifies the pixel into that model and updates the initial Gaussian model according to the pixels in the text image, obtaining the Gaussian mixture model; if not, the terminal deletes the initial Gaussian model.
The gaussian mixture model constructed in this embodiment includes an algorithm:
P(X) = Σ_{k=1}^{K} π_k · N(X | μ_k, λ)
where the Gaussian mixture model is divided into two classes, so K = 2; μ represents the class center; π_k represents the mixing coefficients, whose sum is 1; λ represents a constant of the Gaussian mixture model, which can be set flexibly according to the specific situation; and X in the formula is a pixel value of the gray band. The Gaussian mixture model divides the pixels of the gray band into two classes, i.e., each pixel value can be assigned to class K1 or class K2, which are two independent Gaussian (normal) distributions, and P(X) is the probability of the pixel value over the two classes. K1 and K2 correspond, respectively, to the gray horizontal-line part and to the part intersected by strokes, and are distinguished by the μ values of the two class Gaussian distributions: the original color of the gray band of the horizontal-line image has a relatively smaller μ value, while pixels intersected with characters have a relatively larger μ value.
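For clarity, the classification rule implied by the formula can be written as the standard mixture responsibility; the expressions below are an added sketch in common GMM notation, not notation taken from the patent drawing.

```latex
% Responsibility of class k for a gray-band pixel value X (K = 2):
\gamma_k(X) = \frac{\pi_k \, \mathcal{N}(X \mid \mu_k, \lambda)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(X \mid \mu_j, \lambda)}
% The pixel is assigned to the class with the larger responsibility:
k^{*}(X) = \arg\max_{k \in \{1, 2\}} \gamma_k(X)
```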
In this embodiment, a method of constructing a gaussian mixture model is described to achieve accurate text image analysis by the constructed gaussian mixture model.
Further, referring to fig. 3, a fourth embodiment of the text image processing method of the present invention is proposed on the basis of the above-described embodiment of the present invention.
This embodiment is a step subsequent to step S30 in the first embodiment, and differs from the other embodiments in that:
and S50, if the crossing area does not exist in the target pixel band, acquiring the pixel value of each pixel point in the target pixel band.
If the pixel values of the pixel points are the same, the terminal judges that the target pixel zone is a writing auxiliary line; if the pixel values of the pixel points are different, the terminal outputs prompt information for the user to confirm.
And S60, when the pixel values of all the pixel points in the target pixel band are the same, taking the target pixel band as an interference line, and deleting the interference line in the text image.
When the pixel values of all pixel points in a target pixel zone are the same, the terminal takes the target pixel zone as an interference line and deletes the interference line in a text image; in the embodiment, when the interference line is deleted, the terminal analyzes according to the pixel value of the target pixel band, so that the situation of error deletion is avoided, and the deleting accuracy is improved.
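The uniformity check of this embodiment reduces to a few lines; the sketch below assumes a grayscale NumPy image, uses illustrative names, and simply overwrites a uniform band with the background value.

```python
import numpy as np

def delete_if_uniform(gray, band_rows, background=255):
    """If every pixel in the target pixel band has the same value, treat the band
    as an interference line and delete it by overwriting it with the background."""
    band = gray[band_rows, :]
    if band.size and np.all(band == band.flat[0]):   # all pixel values identical
        gray[band_rows, :] = background              # delete the interference line
        return True
    return False
```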
In addition, referring to fig. 4, an embodiment of the present invention further proposes a text image processing apparatus, including:
the request receiving module 10 is configured to obtain a text image corresponding to a text image processing request when receiving the text image processing request, and horizontally correct the text image;
an information obtaining module 20, configured to obtain a target pixel band in the text image when the horizontal correction of the text image is completed;
a processing judging module 30, configured to process the target pixel band through a preset gaussian mixture model, and judge whether an intersection area intersecting with a text stroke in the text image exists in the target pixel band;
and the interference deleting module 40 is configured to take the target pixel band except the intersection area as an interference line if the intersection area exists in the target pixel band, and delete the interference line in the text image.
In one embodiment, the request receiving module 10 includes:
the binarization unit is used for carrying out binarization processing on the text image to obtain a binarization area;
the generation summarizing unit is used for extracting text binarization areas in the binarization areas through a preset character classification model and summarizing the text binarization areas to generate a binarization image;
and the rotation correction unit is used for projecting the binarized image to obtain the inclination angle of the text in the text image, and adjusting the text in the text image according to the inclination angle so as to finish the horizontal correction of the text image.
In one embodiment, the information acquisition module 20 includes:
the image dividing unit is used for dividing the text image into pixel bands according to a preset direction when the horizontal correction of the text image is completed, wherein the preset direction comprises a horizontal direction and a vertical direction;
and the determining and acquiring unit is used for determining a gray pixel value duty ratio and/or a black pixel value duty ratio in the pixel band and acquiring a target pixel band of which the gray pixel value duty ratio and/or the black pixel value duty ratio exceeds a preset threshold value.
In one embodiment, the text image processing apparatus includes:
the model construction module is used for acquiring a predefined initial Gaussian model when a Gaussian mixture model construction instruction is received, and adjusting coefficients of the initial Gaussian model according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises an algorithm:
Figure BDA0002320326700000131
the P (X) represents the probability of the pixel value being in two classes, the K represents the gaussian mixture model classification class k=2, the pi k The sum of the mixing coefficients is denoted 1, λ denotes the constant of the gaussian mixture model, μ denotes the class center, and X denotes the pixel value.
In one embodiment, the processing determining module 30 includes:
the pixel dividing unit is used for analyzing the target pixel band according to the quantized values of the preset RGB three channels and dividing the target pixel band into a black pixel band and a gray pixel band;
the classification judging unit is used for classifying the gray pixel bands through a preset Gaussian mixture model and judging whether black pixel sets exist in the gray pixel bands or not;
a first judging unit, configured to, if a black pixel set exists in the gray pixel band, cause a crossing region to exist in the target pixel band;
and the second judging unit is used for judging that if the black pixel set does not exist in the gray pixel band, the crossing area does not exist in the target pixel band.
In one embodiment, the processing determining module 30 includes:
the stroke comparison unit is used for processing the target pixel band through a preset Gaussian mixture model and comparing the target pixel band with character strokes in the text image;
a third judging unit, configured to judge that an intersection area exists in the target pixel band if a distance difference between the target pixel band and the character strokes is smaller than a preset difference;
and the fourth judging unit is used for judging that no crossing area exists in the target pixel band if the distance difference between the target pixel band and the character strokes is larger than or equal to a preset difference value.
In one embodiment, the text image processing apparatus includes:
the pixel acquisition module is used for acquiring pixel values of all pixel points in the target pixel band if the intersection area does not exist in the target pixel band;
and the deleting interference module is used for taking the target pixel band as an interference line when the pixel values of all the pixel points in the target pixel band are the same, and deleting the interference line in the text image.
The steps of implementing each functional module of the text image processing apparatus may refer to each embodiment of the text image processing method of the present invention, which is not described herein again.
In addition, the embodiment of the invention also provides a computer storage medium.
The computer storage medium has stored thereon a computer program which, when executed by a processor, implements the operations in the text image processing method provided by the above embodiment.
It should be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity/operation/object from another entity/operation/object without necessarily requiring or implying any actual such relationship or order between such entities/operations/objects; the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points. The apparatus embodiments described above are merely illustrative, in which the units illustrated as separate components may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the objectives of the present invention. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (9)

1. A text image processing method, characterized in that the text image processing method comprises the steps of:
when a text image processing request is received, acquiring a text image corresponding to the text image processing request, and horizontally correcting the text image;
when the horizontal correction of the text image is completed, acquiring a target pixel band in the text image;
processing the target pixel band through a preset Gaussian mixture model, and judging whether an intersection area intersecting with text strokes in the text image exists in the target pixel band or not;
if the crossing area exists in the target pixel band, taking the other parts except the crossing area in the target pixel band as interference lines, and deleting the interference lines in the text image;
before the step of processing the target pixel band through a preset Gaussian mixture model and judging whether an intersection area intersecting with a text stroke in the text image exists in the target pixel band, the method comprises the following steps:
when a Gaussian mixture model construction instruction is received, a predefined initial Gaussian model is obtained, and coefficients of the initial Gaussian model are adjusted according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises an algorithm:
P(X) = Σ_{k=1}^{K} π_k · N(X | μ_k, λ)
where P(X) represents the probability of the pixel value over the two classes, K represents the classification category of the Gaussian mixture model, K = 2, π_k represents the mixing coefficients, whose sum is 1, λ represents the constant of the Gaussian mixture model, μ represents the class center, and X represents the pixel value.
2. The text image processing method of claim 1, wherein the step of horizontally correcting the text image includes:
performing binarization processing on the text image to obtain a binarization area;
extracting text binarization areas in the binarization areas through a preset character classification model, and summarizing the text binarization areas to generate a binarization image;
projecting the binarized image to obtain an inclination angle of a text in the text image, and adjusting the text in the text image according to the inclination angle to finish horizontal correction of the text image.
3. The text image processing method of claim 1, wherein the step of acquiring the target pixel band in the text image when the text image level correction is completed comprises:
dividing the text image into pixel bands according to a preset direction when the horizontal correction of the text image is completed, wherein the preset direction comprises a horizontal direction and a vertical direction;
and determining a gray pixel value duty ratio and/or a black pixel value duty ratio in the pixel band, and acquiring a target pixel band of which the gray pixel value duty ratio and/or the black pixel value duty ratio exceeds a preset threshold.
4. The text image processing method of claim 1, wherein the step of processing the target pixel band by a preset gaussian mixture model and judging whether there is an intersection area intersecting text strokes in the text image in the target pixel band includes:
analyzing the target pixel band according to the quantized values of the preset RGB three channels, and dividing the target pixel band into a black pixel band and a gray pixel band;
classifying the gray pixel bands through a preset Gaussian mixture model, and judging whether black pixel sets exist in the gray pixel bands or not;
if a black pixel set exists in the gray pixel band, an intersection area exists in the target pixel band;
if there is no black pixel set in the gray pixel band, there is no intersection region in the target pixel band.
5. The text image processing method of claim 1, wherein the step of processing the target pixel band by a preset gaussian mixture model and judging whether there is an intersection area intersecting text strokes in the text image in the target pixel band includes:
processing the target pixel band through a preset Gaussian mixture model, and comparing the target pixel band with character strokes in the text image;
if the distance difference between the target pixel zone and the character strokes is smaller than a preset difference value, judging that a crossing area exists in the target pixel zone;
and if the distance difference between the target pixel band and the character strokes is greater than or equal to a preset difference value, judging that no crossing area exists in the target pixel band.
6. The text image processing method of any one of claims 1 to 5, wherein after the step of processing the target pixel band by a preset gaussian mixture model and determining whether there is an intersection area in the target pixel band intersecting text strokes in the text image, the method comprises:
if the target pixel zone does not have the intersection area, acquiring the pixel value of each pixel point in the target pixel zone;
and when the pixel values of all the pixel points in the target pixel band are the same, taking the target pixel band as an interference line, and deleting the interference line in the text image.
7. A text image processing apparatus, characterized in that the text image processing apparatus comprises:
the request receiving module is used for acquiring a text image corresponding to the text image processing request and horizontally correcting the text image when the text image processing request is received;
the information acquisition module is used for acquiring a target pixel band in the text image when the horizontal correction of the text image is completed;
the processing judging module is used for processing the target pixel band through a preset Gaussian mixture model and judging whether an intersection area intersecting with the text strokes in the text image exists in the target pixel band or not;
the interference deleting module is used for taking the target pixel band except the intersection area as an interference line and deleting the interference line in the text image if the intersection area exists in the target pixel band;
the model construction module is used for acquiring a predefined initial Gaussian model when a Gaussian mixture model construction instruction is received, and adjusting coefficients of the initial Gaussian model according to pixels in the text image to obtain the Gaussian mixture model, wherein the Gaussian mixture model comprises an algorithm:
P(X) = Σ_{k=1}^{K} π_k · N(X | μ_k, λ)
where P(X) represents the probability of the pixel value over the two classes, K represents the classification category of the Gaussian mixture model, K = 2, π_k represents the mixing coefficients, whose sum is 1, λ represents the constant of the Gaussian mixture model, μ represents the class center, and X represents the pixel value.
8. A text image processing apparatus, characterized in that the text image processing apparatus comprises: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein:
the computer program, when executed by the processor, implements the steps of the text image processing method as claimed in any one of claims 1 to 6.
9. A computer storage medium, characterized in that the computer storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the text image processing method according to any of claims 1 to 6.
CN201911306248.XA 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium Active CN111178362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911306248.XA CN111178362B (en) 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911306248.XA CN111178362B (en) 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111178362A CN111178362A (en) 2020-05-19
CN111178362B (en) 2023-05-26

Family

ID=70652160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911306248.XA Active CN111178362B (en) 2019-12-16 2019-12-16 Text image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111178362B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814780B (en) * 2020-07-08 2023-05-26 重庆农村商业银行股份有限公司 Bill image processing method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6438265B1 (en) * 1998-05-28 2002-08-20 International Business Machines Corp. Method of binarization in an optical character recognition system
CN105069452A (en) * 2015-08-07 2015-11-18 武汉理工大学 Straight line removing method based on local structure analysis
WO2017162069A1 (en) * 2016-03-25 2017-09-28 阿里巴巴集团控股有限公司 Image text identification method and apparatus
CN108681729A (en) * 2018-05-08 2018-10-19 腾讯科技(深圳)有限公司 Text image antidote, device, storage medium and equipment
CN108805126A (en) * 2017-04-28 2018-11-13 上海斯睿德信息技术有限公司 A kind of long interfering line minimizing technology of text image


Also Published As

Publication number Publication date
CN111178362A (en) 2020-05-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant