
CN111461126A - Space recognition method and device in text line, electronic equipment and storage medium

Info

Publication number: CN111461126A
Application number: CN202010231850.8A
Authority: CN
Inventors: 尚太章
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2020-03-23
Filing date: 2020-03-23
Publication date: 2020-07-28
Anticipated expiration: 2040-03-23
Also published as: WO2021190155A1; CN111461126B

Abstract

The application discloses a method and a device for identifying spaces in a text line, an electronic device, and a storage medium, and relates to the technical field of image processing. The method comprises the following steps: acquiring a text grayscale image, wherein the text grayscale image contains only a single line of text; calculating the sum of the pixel values of each row of pixel points in a preset direction in the text grayscale image, wherein the preset direction is perpendicular to the arrangement direction of the characters in the single-line text; and taking a connected domain formed by the pixel points whose pixel value sums fall within a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the pixel value sums corresponding to spaces in the text grayscale image lie. This technical solution makes it possible to determine the spaces in the single-line text.

Description

Space recognition method and device in text line, electronic equipment and storage medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for identifying a space in a text line, an electronic device, and a storage medium.

Background

When an image contains a line of characters, the spaces in that line need to be extracted to determine which characters are separated by spaces, so that the real text information, including the spaces, can be obtained.

Disclosure of Invention

In view of the above problem, the present application provides a method and an apparatus for identifying a space in a text line, an electronic device, and a storage medium to address it.

In a first aspect, an embodiment of the present application provides a method for identifying a space in a text line, the method including: acquiring a text grayscale image, where the text grayscale image contains only a single line of text; calculating the sum of the pixel values of each row of pixel points in a preset direction in the text grayscale image, where the preset direction is perpendicular to the arrangement direction of the characters in the single-line text; and taking a connected domain formed by the pixel points whose pixel value sums fall within a first pixel value interval as a space in the single-line text, where the first pixel value interval is the interval in which the pixel value sums corresponding to spaces in the text grayscale image lie.

In a second aspect, an embodiment of the present application provides an apparatus for space recognition in a text line, the apparatus including: the image acquisition module is used for acquiring a text gray image, and the text gray image only comprises a single line of text; the pixel value acquisition module is used for calculating the sum of pixel values of each row of pixel points in a preset direction in the text gray-scale image, wherein the preset direction is a direction vertical to the character arrangement direction in the single-line text; and the space determining module is used for taking a connected domain formed by pixels corresponding to the sum of pixel values in a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is an interval in which the sum of pixel values corresponding to the space in the text gray image is located.

In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a memory; one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs being executed by the processors for performing the methods described above.

In a fourth aspect, the present application provides a computer-readable storage medium, in which a program code is stored, and the program code can be called by a processor to execute the above method.

According to the method, apparatus, electronic device, and storage medium for space recognition in a text line provided by the embodiments of the present application, in a text grayscale image containing only a single line of text, the sum of the pixel values of each row of pixel points in the preset direction perpendicular to the character arrangement direction is calculated, and whether the pixel points corresponding to a given sum lie in a space is determined according to whether that sum falls within the range of sums corresponding to spaces. The connected domain formed by the pixel points determined to lie in a space is taken as a space, thereby determining the spaces in the single-line text.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on them without creative effort.

Fig. 1 is a flowchart illustrating a method for identifying a space in a text line according to an embodiment of the present application.

Fig. 2 shows a schematic diagram of an arrangement of pixel points according to an embodiment of the present application.

Fig. 3 is a flowchart illustrating a method for space recognition in a text line according to another embodiment of the present application.

Fig. 4 shows a schematic text grayscale image provided by an embodiment of the present application.

Fig. 5 shows a fitting curve graph between pixel values and the number of pixel points in a text gray scale image provided by the embodiment of the present application.

Fig. 6 shows a schematic diagram after the pixel values of the background portion are unified, which is provided in the embodiment of the present application.

Fig. 7 shows an exemplary text gray scale map after a closing operation is performed on the text gray scale map shown in fig. 6 according to an embodiment of the present application.

Fig. 8 shows a schematic diagram of Fig. 7 after color inversion, provided by an embodiment of the present application.

Fig. 9 is a diagram showing the statistical results obtained by summing the pixel values for each column of pixels in fig. 8.

Fig. 10 is a functional block diagram of a space recognition apparatus in a text line according to an embodiment of the present application.

Fig. 11 shows a block diagram of an electronic device according to an embodiment of the present application.

Fig. 12 shows a storage unit according to an embodiment of the present application, which stores or carries program code for implementing a space recognition method in a text line according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

Since the characters in the picture cannot be directly edited, copied, or cut, it is usually necessary to recognize the characters and obtain characters that can be expressed in text form, so that the obtained text can be edited, copied, or cut.

For Chinese characters in pictures, each character stands alone, and whether or not spaces are recognized, the characters can be arranged in order at regular intervals to form text carrying the real text information. For other languages, such as English, however, each word is composed of letters, and the letters of different words are drawn from the same alphabet. When a continuous line of characters appears, extracting the spaces between words is therefore essential: if the spaces cannot be extracted, each recognized line of text is a string of letters run together in which the individual words cannot be told apart, which makes subsequent machine processing difficult and makes the text hard for humans to read and understand.

Therefore, the embodiment of the application provides a space recognition method, a device, an electronic device and a storage medium in a text line, and the space of a single line of text in a text gray-scale map is determined by whether the sum of the calculated pixel values is in the interval where the sum of the pixel values corresponding to the space in the text gray-scale map is located. The method, the apparatus, the electronic device, and the storage medium for recognizing a space in a text line provided by the embodiments of the present application will be described in detail below with specific embodiments.

Referring to fig. 1, a method for recognizing a space in a text line according to an embodiment of the present application is shown. Specifically, the method comprises the following steps:

Step S110: acquiring a text grayscale image, where the text grayscale image contains only a single line of text.

The text grayscale image is a grayscale image that contains only one line of text. Identifying a space in the text line of the text grayscale image therefore means identifying a space in that single line of text.

In the embodiment of the present application, the arrangement direction of the single-line text in the text grayscale image is not limited: the text may be arranged horizontally, vertically, or in another direction. The embodiments of the present application are described using a horizontal arrangement as an example.

Optionally, the way of identifying whether the characters in the text grayscale image are arranged horizontally or vertically is not limited in the embodiment of the present application. For example, the image may be processed as horizontally arranged by default, or as vertically arranged by default. As another example, the character arrangement direction may be determined from two mutually perpendicular sides of the text grayscale image, with the direction in which the longer side extends taken as the character arrangement direction.
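The longer-side rule can be sketched as follows. This is a minimal illustration, not the method of the application: the function name `text_direction` and the handling of a square image (treated as horizontal) are assumptions made only for this example.

```python
def text_direction(height, width):
    """Guess the character arrangement direction from the image side lengths.

    Returns "horizontal" when the width is the longer side, otherwise
    "vertical". A square image is treated as horizontal, which is an
    assumption of this sketch rather than something the text specifies.
    """
    return "horizontal" if width >= height else "vertical"


def preset_direction(height, width):
    """The preset direction is perpendicular to the arrangement direction."""
    return "vertical" if text_direction(height, width) == "horizontal" else "horizontal"
```

For a 32 x 400 crop of a text line, `text_direction(32, 400)` gives `"horizontal"`, so sums would be taken along the vertical direction, that is, column by column.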

Step S120: calculating the sum of the pixel values of each row of pixel points in a preset direction in the text grayscale image, where the preset direction is perpendicular to the arrangement direction of the characters in the single-line text.

In the text gray-scale image, the direction of the character arrangement in the single-line text is defined as the character arrangement direction, and the direction perpendicular to the character arrangement direction is defined as a preset direction. For example, in a single line of text arranged horizontally, the horizontal direction is the direction in which the text is arranged, and the vertical direction is the preset direction.

In the embodiment of the application, the sum of the pixel values of each row of pixel points in the preset direction in the text grayscale image can be calculated. A row of pixel points in the preset direction is a line of pixel points arranged along the preset direction. For example, in a single-line text with horizontally arranged characters, a row of pixel points in the vertical direction is a column of pixel points, so the sum of the pixel values of each column of pixel points in the text grayscale image is calculated. Fig. 2 is a schematic diagram of the pixel points in a text grayscale image with horizontally arranged characters: column I1 includes the pixel points (I1, J1), (I1, J2), (I1, J3); column I2 includes (I2, J1), (I2, J2), (I2, J3); column I3 includes (I3, J1), (I3, J2), (I3, J3); and so on. Here the preset direction is vertical, so calculating the sum of the pixel values of each row of pixel points in the preset direction means calculating the sum of the gray values of each column of pixel points, which yields seven sums of pixel values, one for each of columns I1 through I7.
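For horizontally arranged text, the per-column summation can be sketched as below. The image is represented as a plain list of rows, and the 3 x 7 pixel values are invented for illustration; they are not taken from Fig. 2.

```python
def column_sums(gray):
    """Sum the pixel values of every column of a 2-D grayscale image.

    `gray` is a list of rows, each row a list of integer pixel values (0-255).
    For horizontally arranged text, each column is one row of pixel points
    in the preset (vertical) direction.
    """
    if not gray:
        return []
    return [sum(column) for column in zip(*gray)]


# Illustrative 3-row, 7-column image (values are made up for this sketch).
image = [
    [255, 0, 255, 255, 0,   0, 255],
    [255, 0, 255, 255, 0, 255, 255],
    [255, 0, 255, 255, 0,   0, 255],
]
sums = column_sums(image)  # one sum per column, seven in total
```

With a white background, columns that contain only background pixels reach the largest possible sum (3 x 255 = 765 here), while columns crossed by dark strokes give smaller sums.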

Optionally, because pixel points are densely packed, each space may span multiple rows of pixel points. To increase the calculation speed, in this embodiment of the application, the sum of the pixel values of every two adjacent rows, or of every several adjacent rows, of pixel points in the preset direction may be calculated instead.

Optionally, for the same reason, the sum of the pixel values of one or more rows of pixel points in the preset direction may also be calculated at intervals of one or more rows, skipping the rows in between.
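Both speed-ups can be sketched for horizontally arranged text as follows. The function names and the parameters `group` and `step` are hypothetical names chosen for this example.

```python
def grouped_column_sums(gray, group=2):
    """Sum pixel values over each block of `group` adjacent columns."""
    width = len(gray[0]) if gray else 0
    return [
        sum(sum(row[i:i + group]) for row in gray)
        for i in range(0, width, group)
    ]


def strided_column_sums(gray, step=2):
    """Sum pixel values of every `step`-th column, skipping those between.

    Returns a dict mapping the index of each sampled column to its sum,
    so that a sum can later be related back to the columns it stands for.
    """
    width = len(gray[0]) if gray else 0
    return {i: sum(row[i] for row in gray) for i in range(0, width, step)}
```

Grouping halves (or better) the number of sums to compare against the interval, while striding also avoids reading the skipped columns at all; both trade positional resolution for speed.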

Step S130: taking a connected domain formed by the pixel points whose pixel value sums fall within a first pixel value interval as a space in the single-line text, where the first pixel value interval is the interval in which the pixel value sums corresponding to spaces in the text grayscale image lie.

To make text easy to recognize, in a single-line text picture the characters generally share the same or similar colors, the background has the same or similar colors, and the characters and the background differ strongly in color. That is, the pixel values of the pixel points forming the text are the same or close to one another, the pixel values of the pixel points forming the background are the same or close to one another, and the two differ substantially, for example by more than a preset pixel difference. The background is everything in the single-line text picture other than the text, including the spaces, the areas above, below, to the left and to the right of the text line, and any other areas. Consequently, in a single-line text picture, the pixel values of the pixel points forming the text differ substantially from those of the pixel points forming a space.

Therefore, in the embodiment of the present application, the sums of pixel values calculated, in the manner described in the preceding step, for pixel points located in a space fall within a certain pixel value interval, which is defined as the first pixel value interval. The first pixel value interval is distinct from the interval in which the sums calculated in the same manner for pixel points located in text may fall.

Connected components formed by pixel points corresponding to the sum of pixel values in the first pixel value interval can be recognized as a space in a single line of text. Namely, the area formed by the pixel points corresponding to the sum of the pixel values in the first pixel value interval is determined as a blank.

If the calculated sum is the sum of the pixel values of each row of pixel points in the preset direction, or of every two or more adjacent rows of pixel points in the preset direction, the pixel points corresponding to a sum of pixel values may be all the pixel points whose pixel values were used in calculating that sum.

If the sum of the pixel values of one or more rows of pixel points in the preset direction is calculated at intervals of one or more rows, the pixel points corresponding to a sum of pixel values may include both the pixel points whose pixel values were used in calculating that sum and the skipped pixel points that were left out of the calculation. For example, in a horizontally arranged single-line text, the sums of pixel values are calculated for the first column, the third column, and every other odd-numbered column. The pixel points corresponding to the sum for each odd-numbered column may then include the pixel points of that odd-numbered column and those of the even-numbered column it skips; in this example, the even-numbered column skipped for each odd-numbered column is the even-numbered column whose number is one less than that of the odd-numbered column.

In the embodiment of the application, for the text grayscale image containing the single-line text, the direction perpendicular to the arrangement direction of the characters in the single-line text is taken as the preset direction, and the sum of the pixel values of each row of pixel points in the preset direction is calculated. Then, based on the first pixel value interval in which the sums for pixel points belonging to spaces in the text grayscale image lie, it is determined which calculated sums fall within that interval, and the connected domain formed by the pixel points corresponding to those sums is taken as a space in the single-line text, thereby accurately identifying the spaces in the single-line text.
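The interval test and the grouping into connected domains can be sketched as below for per-column sums. Treating each maximal run of adjacent qualifying columns as one connected domain is an assumption of this sketch, and the interval bounds `low` and `high` as well as the sample sums are illustrative values only.

```python
def find_spaces(sums, low, high):
    """Return (start, end) column ranges, end exclusive, whose sums lie in [low, high].

    Each maximal run of adjacent qualifying columns is reported as one
    candidate space.
    """
    runs = []
    start = None
    for i, s in enumerate(sums):
        inside = low <= s <= high
        if inside and start is None:
            start = i                    # a new run begins
        elif not inside and start is not None:
            runs.append((start, i))      # the current run ends before column i
            start = None
    if start is not None:
        runs.append((start, len(sums)))  # run reaches the last column
    return runs


# Illustrative per-column sums for a 3-pixel-high image with a white background.
example_sums = [765, 0, 765, 765, 0, 255, 765]
spaces = find_spaces(example_sums, low=700, high=765)
```

Runs that touch the first or last column may correspond to the margins beside the text line rather than to gaps between words; how such runs are treated is left open here.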

In the space identification method in a text line provided in another embodiment of the present application, the color of the background portion may be unified, so that the pixel values of the space portion are more uniform, the sums of pixel values calculated for spaces differ less from one another and are more concentrated, and the first pixel value interval against which the sums are checked can be set more accurately. Referring to fig. 3, the method includes:

Step S210: acquiring a text grayscale image, where the text grayscale image contains only a single line of text.

In the embodiment of the present application, a specific obtaining manner of the text grayscale map is not limited. In the text gray scale image, the size of the text, the ratio between the text height and the picture height, the ratio between the text width and the picture width, and the like are not limited.

Optionally, a single-line text picture containing only a single line of text may be obtained by extracting a single line of text from a text picture, and the single-line text picture may then be converted into the text grayscale image through image preprocessing. The way of extracting the single line of text is not limited in the embodiment of the present application; for example, it may be extracted by a deep learning algorithm, such as the TextBoxes family of algorithms, the EAST family of algorithms, SegLink, and the like.

Optionally, the text gray-scale map may also be obtained by image preprocessing a single-line text picture which only includes a line of text.

In the embodiment of the present application, the image preprocessing may include one or more of the following:

if the single-line text picture is not a gray image, such as an RGB three-channel picture, graying processing can be performed on the single-line text picture, and the single-line text picture is converted into a gray image to serve as the text gray image;

denoising the text grayscale image, for example by median blurring (median filtering);

and carrying out equalization processing on the text gray-scale image so as to enable the pixel value distribution in the text gray-scale image to be more balanced and prevent the excessive deviation of the pixel values in the image.

Optionally, when the image preprocessing includes two or more of these processing modes, they may be applied in the order listed above: denoising is performed after grayscale conversion, and equalization is performed after denoising, which reduces the difficulty of each processing step and improves the effectiveness of the processing.

In addition, optionally, in the embodiment of the present application, if the text grayscale image acquired is already a grayscale image, image preprocessing operations such as denoising and equalization may be performed on it directly after it is acquired.
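The preprocessing order described above (grayscale conversion, then denoising, then equalization) can be sketched with the standard library alone. Several choices here are assumptions of the sketch and are not stated in the application: the ITU-R BT.601 luma weights for grayscale conversion, the 3 x 3 window of the median filter, and the classic cumulative-histogram form of equalization.

```python
from statistics import median


def to_gray(rgb):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray values.

    The BT.601 weights are an assumption; any graying method would do.
    """
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def median_filter3(gray):
    """3x3 median filter; border pixels reuse the nearest valid neighbours."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            row.append(int(median(window)))
        out.append(row)
    return out


def equalize(gray):
    """Histogram equalization over the 0-255 range."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [0] * 256, 0
    for v in range(256):
        running += hist[v]
        cdf[v] = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # only one gray level present: nothing to spread
        return [row[:] for row in gray]
    scale = total - cdf_min
    return [[round((cdf[v] - cdf_min) / scale * 255) for v in row] for row in gray]


def preprocess(rgb):
    """Gray conversion, then denoising, then equalization."""
    return equalize(median_filter3(to_gray(rgb)))
```

In practice an image library would normally supply these operations; the pure-Python versions are shown only to make each step explicit.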

Step S220: acquiring the segmentation pixel value in the text grayscale image.

Because the text grayscale image contains noise and is not itself binarized, the space portion does not have a single pure pixel value. The segmentation pixel value can be used to determine which pixel values are more likely to belong to spaces, so that those pixel values can be unified.

That is, the segmentation pixel value in the text grayscale image separates spaces from characters more accurately: a pixel value on one side of the segmentation pixel value is more likely to be a space pixel value than a character pixel value, while a pixel value on the other side is more likely to be a character pixel value than a space pixel value.

The two sides of the segmentation pixel value are the pixel values greater than it and the pixel values smaller than it. Which side is more likely to be spaces and which is more likely to be characters depends on the actual pixel values of the spaces and the characters in the text grayscale image.

Therefore, in the embodiment of the present application, the segmentation pixel value in the text grayscale image may be obtained, and the pixel values on the side more likely to be spaces may be unified into a single pixel value, so that the sums of pixel values calculated for spaces are more concentrated. To keep spaces and characters distinguishable, the unified pixel value differs from the pixel values on the side more likely to be characters.

Because the color of the background portion, which includes the spaces, generally differs greatly from that of the characters, and the background contains more pixel points than the characters do, a coordinate system may optionally be established with the pixel value as the abscissa and the number of pixel points as the ordinate, so as to determine how the number of pixel points varies with the pixel value. In this coordinate system, a fitted curve relating each pixel value to the number of pixel points having that value can be obtained, and the largest of all the local maxima of the fitted curve is taken as the maximum. The pixel value at this maximum is most likely the pixel value of the background, and hence most likely the pixel value of a space.

Since pixel values on one side of the segmentation pixel value must be more likely to be text pixel values and those on the other side more likely to be space pixel values, the segmentation pixel value should correspond to few pixel points and should most likely lie between the text pixel values and the space pixel values. The segmentation pixel value may therefore be chosen as follows: a minimum adjacent to the maximum is obtained, and the pixel value at that minimum is taken as the segmentation pixel value. Being a minimum means that few pixel points have that value; being adjacent to the maximum means that the value probably lies between the character pixel values and the space pixel values.

In addition, since the segmentation pixel value lies between the space pixel value and the character pixel value, either the minimum adjacent to the maximum on its left or the minimum adjacent to it on its right may be selected, depending on the actual text grayscale image. The minimum to the left of the maximum is the one whose pixel value is smaller than the pixel value at the maximum; the minimum to the right of the maximum is the one whose pixel value is larger than the pixel value at the maximum.

After equalization, the pixel values in the text grayscale image are more evenly distributed, the background and the characters are more clearly distinguished in color, and the difference between the background pixel values and the character pixel values is larger. Because the background includes the spaces, the pixel value of the background can stand for the pixel value of the spaces, so processing the background pixel values also processes the space pixel values. If the background is closer to white and the characters are closer to black, the segmentation pixel value should be smaller than the pixel value at the maximum, and the minimum adjacent to the maximum on its left is selected; if the background is closer to black and the characters are closer to white, the segmentation pixel value should be larger than the pixel value at the maximum, and the minimum adjacent to the maximum on its right is selected.
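The search for the segmentation pixel value can be sketched as follows. The application speaks of a fitted curve without naming a fitting method, so the moving average used here is only a stand-in, and the names `histogram`, `smooth`, `segmentation_value`, `radius` and `side` are hypothetical. The adjacent minimum is found by walking downhill from the peak until the curve stops decreasing.

```python
def histogram(gray):
    """Count how many pixel points have each pixel value 0-255."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    return hist


def smooth(hist, radius=2):
    """Moving average, used here as a stand-in for the fitted curve."""
    n = len(hist)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out


def segmentation_value(curve, side):
    """Return (peak_value, split_value) for the given curve.

    side="left":  light background, split value below the peak value.
    side="right": dark background, split value above the peak value.
    If the curve does not decrease on that side, the peak itself is returned.
    """
    peak = max(range(len(curve)), key=curve.__getitem__)
    step = -1 if side == "left" else 1
    i = peak
    # Walk downhill from the peak until the curve stops decreasing.
    while 0 <= i + step < len(curve) and curve[i + step] < curve[i]:
        i += step
    return peak, i
```

A typical call would be `segmentation_value(smooth(histogram(gray)), side="left")` for dark text on a light background. On noisy histograms a wider smoothing radius may be needed so that small ripples near the peak are not mistaken for the adjacent minimum.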

In one embodiment, whether the background is closer to white or to black may be set by default, and processing proceeds directly in the manner corresponding to the default color. For example, the background of the text grayscale image is assumed by default to be white or closer to white, and, in line with that assumption, the pixel value at the minimum to the left of the maximum is selected as the segmentation pixel value.

Optionally, in this embodiment, if the actual color of the background differs from the default background color, the colors of the background and the text in the text grayscale image may be converted to the default colors by color inversion. If the default background is white or closer to white, but the background in the actual text grayscale image is black or closer to black, the pixel value of each pixel point can be replaced by 255 minus its current pixel value, thereby inverting black and white. For example, if the pixel value of a certain pixel point is 214, the converted pixel value is 255 - 214 = 41.
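The inversion is a per-pixel subtraction from 255, sketched here on a list-of-rows image (the function name `invert` is chosen for this example):

```python
def invert(gray):
    """Invert an 8-bit grayscale image: each value v becomes 255 - v."""
    return [[255 - v for v in row] for row in gray]
```

Applying `invert` twice returns the original image, so the conversion loses no information.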

In another embodiment, a preset pixel value may be used to decide whether the background is closer to white or to black. In this embodiment, if the pixel value of the background is greater than the preset pixel value, the background is determined to be closer to white; if the pixel value of the background is less than or equal to the preset pixel value, the background is determined to be closer to black. The specific preset pixel value is not particularly limited and may be set according to actual requirements, for example to a mid-range gray value such as 127 or 128.

Optionally, in this embodiment, because the background accounts for the largest number of pixel points, the pixel value at the maximum may be used to represent the pixel value of the background, that is, the pixel value at the maximum is taken as the background pixel value.

Optionally, because the corners of the text grayscale image are usually part of the background, the pixel value of the background may be represented by one of the following: the average pixel value of the four corners of the text grayscale image, the average pixel value of one or more of the four corners, the maximum pixel value among the pixel points in one or more of the four corners, and the like.
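One of these options, the average over all four corners combined with the preset-pixel-value test, can be sketched as follows. The corner patch size `size` is a hypothetical parameter, and the default threshold of 127 is just one of the example values mentioned above.

```python
def corner_mean(gray, size=2):
    """Average pixel value over the four size x size corner patches."""
    h, w = len(gray), len(gray[0])
    size = min(size, h, w)
    values = []
    for ys in (range(0, size), range(h - size, h)):
        for xs in (range(0, size), range(w - size, w)):
            values.extend(gray[y][x] for y in ys for x in xs)
    return sum(values) / len(values)


def background_is_light(background_value, threshold=127):
    """True if the background is closer to white, per the preset pixel value."""
    return background_value > threshold
```

If `background_is_light(corner_mean(gray))` is false while the rest of the processing assumes a light background, the image can be inverted first.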

In addition, optionally, the segmentation pixel value may also be obtained as follows: in the coordinate system established with the pixel value as the abscissa and the number of pixel points as the ordinate, the pixel value with the largest number of corresponding pixel points is obtained, and the number of pixel points corresponding to that pixel value is defined as the maximum. A minimum adjacent to the maximum is then obtained, and the pixel value at that minimum is taken as the segmentation pixel value. The specific manner of obtaining the minimum adjacent to the maximum is as described above and is not repeated here.

Step S230: if the pixel value of the background in the text grayscale image is greater than the segmentation pixel value, setting the pixel points whose pixel values are greater than the segmentation pixel value to a first pixel value, where the first pixel value is greater than or equal to the pixel value of the background; if the pixel value of the background in the text grayscale image is smaller than the segmentation pixel value, setting the pixel points whose pixel values are smaller than the segmentation pixel value to a second pixel value, where the second pixel value is smaller than or equal to the pixel value of the background.

If the background is closer to white and the pixel value of the background is greater than the segmentation pixel value, the pixel values of the background can be converted into one and the same pixel value. Specifically, the pixel points whose pixel values are greater than the segmentation pixel value may be set to a first pixel value, where the first pixel value is greater than or equal to the pixel value of the background. Optionally, in this embodiment of the present application, the background may be uniformly converted to white, that is, the first pixel value is 255.

If the background is closer to black and the pixel value of the background is smaller than the segmentation pixel value, the pixel values of the background can likewise be converted into one and the same pixel value. Specifically, the pixel points whose pixel values are smaller than the segmentation pixel value may be set to a second pixel value, where the second pixel value is smaller than or equal to the pixel value of the background. Optionally, in this embodiment of the application, the background may be uniformly converted to black, that is, the second pixel value is 0.
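The two cases of step S230 can be sketched in one function. The parameter names `split`, `background_value`, `first` and `second` are chosen for this example, with the defaults 255 and 0 following the optional values mentioned above.

```python
def unify_background(gray, split, background_value, first=255, second=0):
    """Set background-side pixel values to a single value.

    Light background (background_value > split): values above `split`
    become `first`. Dark background (background_value < split): values
    below `split` become `second`. Other pixel values are left unchanged.
    """
    if background_value > split:
        return [[first if v > split else v for v in row] for row in gray]
    if background_value < split:
        return [[second if v < split else v for v in row] for row in gray]
    return [row[:] for row in gray]  # background equals split: nothing to unify
```

Only the background side is flattened; character-side pixel values are kept as they are, so the image is not binarized by this step.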

Optionally, in this embodiment of the application, it may be determined whether a pixel point of the background should be set to the first pixel value or the second pixel value by obtaining a pixel value of the background in the text grayscale image.

Optionally, in this embodiment, which color the background is closer to may also be set by default, and the pixel value corresponding to that color is applied. If the background is by default closer to white, the pixel points whose pixel values are greater than the segmentation pixel value are directly set to the first pixel value.

Optionally, if the actual color of the background does not match the default color, the pixel values in the text grayscale image may be inverted. For example, if the background of the text grayscale image is by default closer to white, but the background in the actual text grayscale image is closer to black, the pixel value of each pixel point is replaced by 255 minus its current pixel value, thereby inverting black and white.

In the embodiment of the present application, the background includes the spaces, so converting the background to a single pixel value also converts the spaces to that same pixel value.
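The background unification of step S230 and the optional black-and-white inversion can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the implementation of the application: the image is assumed to be a list of rows of integer gray values from 0 to 255, and the names `unify_background`, `invert`, and `light_background` are hypothetical.

```python
from typing import List

Gray = List[List[int]]  # assumed layout: rows of gray values in 0..255


def unify_background(img: Gray, split: int, light_background: bool = True,
                     first_value: int = 255, second_value: int = 0) -> Gray:
    """Set every background pixel to one common value.

    light_background=True : pixels above the segmentation value become first_value.
    light_background=False: pixels below the segmentation value become second_value.
    """
    out = []
    for row in img:
        if light_background:
            out.append([first_value if p > split else p for p in row])
        else:
            out.append([second_value if p < split else p for p in row])
    return out


def invert(img: Gray) -> Gray:
    """Black-and-white inversion: each pixel becomes 255 minus its current value."""
    return [[255 - p for p in row] for row in img]
```

With a segmentation pixel value of 213, a light-background image `[[250, 220, 30]]` would become `[[255, 255, 30]]`; pixels equal to the segmentation value are left unchanged in this sketch.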

Step S240: calculate the sum of the pixel values of each row of pixel points in the text grayscale image in a preset direction, where the preset direction is the direction perpendicular to the arrangement direction of the characters in the single-line text.

Optionally, in this embodiment of the application, a closing operation may be performed on the text grayscale image before the sums are calculated. A closing operation can eliminate narrow gaps and long thin gaps, remove small holes, and fill breaks in contour lines. It further reduces noise in the space portion and makes its pixel values cleaner, so that the sums of pixel values calculated for the space portion suffer less interference and are more concentrated.

Step S250: take a connected domain formed by pixel points corresponding to the sum of pixel values in a first pixel value interval as a space in the single-line text, where the first pixel value interval is the interval in which the sum of pixel values corresponding to a space in the text grayscale image is located.

In the text grayscale image, the sum of the pixel values of each row of pixel points in the direction perpendicular to the character arrangement direction is calculated; for horizontally arranged text, this is the sum of the pixel values of each column of pixel points. Since the pixel values of the spaces are unified, the sums obtained for the space portion are concentrated in one interval. Whether a calculated sum was obtained from pixel points of the space portion can therefore be determined by checking whether that sum lies in the interval in which the sums corresponding to spaces are located. A connected domain formed by pixel points corresponding to sums of pixel values in the first pixel value interval is taken as a space in the single-line text.
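The summation in the preset direction can be illustrated with a short Python sketch. It assumes the image is a list of rows of gray values; the function name `projection_sums` and the flag `horizontal_text` are hypothetical. For horizontally arranged text the preset direction is vertical, so one sum is produced per pixel column.

```python
from typing import List


def projection_sums(img: List[List[int]], horizontal_text: bool = True) -> List[int]:
    """Sum pixel values along the direction perpendicular to the text.

    Horizontal text: one sum per column. Vertical text: one sum per row.
    """
    if not img:
        return []
    if horizontal_text:
        # zip(*img) iterates over the columns of the row-major image
        return [sum(column) for column in zip(*img)]
    return [sum(row) for row in img]
```

The result has one entry per position along the character arrangement direction, which is the sequence that the first pixel value interval is later applied to.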

Optionally, the connected domain formed by the pixel points corresponding to sums of pixel values in the first pixel value interval may be determined by checking the sums in sequence, from one end of the text grayscale image to the other along the text arrangement direction, for whether each sum is within the first pixel value interval. When a sum within the first pixel value interval is detected, its position is regarded as the start of a connected domain, and detection continues in the same direction. When a sum outside the first pixel value interval is then detected, the position of the preceding sum is taken as the end of that connected domain, and one connected domain is thereby determined.
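The sequential scan described above amounts to finding maximal runs of consecutive positions whose sums fall inside the first pixel value interval. A minimal Python sketch follows; the name `find_runs`, the inclusive interval bounds `low` and `high`, and the inclusive `(start, end)` representation are assumptions made for illustration.

```python
from typing import List, Tuple


def find_runs(sums: List[int], low: int, high: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of consecutive sums in [low, high]."""
    runs = []
    start = None  # index where the current run began, or None when outside a run
    for i, s in enumerate(sums):
        inside = low <= s <= high
        if inside and start is None:
            start = i                      # first in-interval sum: a run begins
        elif not inside and start is not None:
            runs.append((start, i - 1))    # the previous position closed the run
            start = None
    if start is not None:
        runs.append((start, len(sums) - 1))  # run reaches the end of the image
    return runs
```

Each returned pair corresponds to one candidate connected domain, and its width in pixels is `end - start + 1`.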

Optionally, since characters themselves are separated by a certain interval, for example the gaps between the letters of a word, the width of a space may be constrained in order to reduce misrecognition of spaces: a width interval of the space is set, and among the connected domains formed by pixel points corresponding to sums of pixel values in the first pixel value interval, those whose width is within the width interval of the space are taken as spaces in the single-line text. It is understood that the width of a space is its extent in the character arrangement direction. For example, in a text grayscale image in which characters are arranged horizontally, the space width is the length of the space in the horizontal direction.

In one embodiment, the width interval of the space may be set according to the character width.

In this embodiment, the character width in the text gray map may be acquired, and the character width may be a width in the character arrangement direction. And setting a width interval of the blank space according to the character width.

In this embodiment, since a single-line text image contains a plurality of characters, a plurality of character widths can be obtained. One character width may be determined from the obtained widths of the plurality of characters and used to set the width interval of the space.

Optionally, the width of each character in the text grayscale image may be obtained, and the median of the obtained widths of all the characters is taken as the character width used to set the width interval of the space.

Optionally, the width of each character in the text grayscale image may also be obtained, and the average of the obtained widths of all the characters is taken as the character width used to set the width interval of the space.

The width of the character can also be determined according to the range of the interval where the sum of the pixel values is located. Specifically, the width of each connected component formed by the pixel points corresponding to the sum of the pixel values in the second pixel value interval may be used as the width of a single character. That is, each connected domain formed by the pixel points corresponding to the sum of the pixel values in the second pixel value interval corresponds to a character, and the width of each connected domain is taken as the width of the corresponding character.

The second pixel value interval is the interval in which the sums of pixel values corresponding to characters in the text grayscale image are located; in other words, it is the interval containing the sum of pixel values of each row of pixel points in the preset direction within a region that includes characters and no spaces. For example, in the pixel arrangement diagram shown in fig. 2, if the pixel point (I2, J2) is a pixel point of a character, the sum of the pixel values of row I2 is a sum of pixel values corresponding to a character in the text grayscale image. In addition, it will be appreciated that text and spaces have different pixel values, and that the second pixel value interval is different from the first pixel value interval.

Optionally, when the width interval of the space is set according to the character width, the width of a space is usually larger than one proportion of the character width and smaller than another proportion of it, so the range from the character width multiplied by the first proportion to the character width multiplied by the second proportion may be used as the width interval of the space. For example, if the first proportion is one third and the second proportion is one half, the width interval of the space runs from one third of the character width to one half of the character width. The specific proportions are not limited in the embodiment of the present application and may be set according to actual conditions.
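As an illustration of deriving the width interval from the character width, the following Python sketch takes the median of the measured character widths and scales it by two proportions. The function name `space_width_interval` is hypothetical, and the default proportions of one third and one half are simply the example values from the text, not recommended settings.

```python
from statistics import median
from typing import Sequence, Tuple


def space_width_interval(char_widths: Sequence[float],
                         low_ratio: float = 1 / 3,
                         high_ratio: float = 1 / 2) -> Tuple[float, float]:
    """Return (min_width, max_width) for a space, from the median character width."""
    width = median(char_widths)  # robust against one unusually wide or narrow glyph
    return (width * low_ratio, width * high_ratio)
```

For character widths of 10, 12, 14, 12 and 30 pixels, the median is 12, so the interval would run from about 4 to 6 pixels. Using the average instead of the median only requires replacing `median` with `statistics.mean`.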

Optionally, characters of different font sizes have different character widths and correspondingly different space widths. In this embodiment, a correspondence between character widths and space width intervals may be set. After the character width in the text grayscale image is determined, the width interval of the space is determined according to this correspondence.

In one embodiment, the width of a space is larger than the gaps of the characters themselves and smaller than the non-character region that may be formed after the end of the text. In this embodiment, therefore, the connected domain whose width is in the middle among the connected domains formed by pixel points corresponding to sums of pixel values in the first pixel value interval may be obtained, and a certain proportion is added to and subtracted from the width of that connected domain to form the space width interval. For example, if the width of the middle-width connected domain is 6 and the proportion added and subtracted is one third, that is 2, the resulting space width interval is 4 to 8.
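The worked example above (middle width 6, proportion one third, interval 4 to 8) can be reproduced with a small Python sketch. It assumes that "middle width" means the median of the candidate widths, which is an interpretation rather than something the text states; `statistics.median_low` is used so that the chosen value is always the width of an actual connected domain. The function name is hypothetical.

```python
from statistics import median_low
from typing import Sequence, Tuple


def interval_from_middle_run(run_widths: Sequence[int],
                             ratio: float = 1 / 3) -> Tuple[float, float]:
    """Build a space width interval around the middle candidate width.

    Assumption: the middle-width connected domain is the (low) median of the widths.
    """
    middle = median_low(run_widths)
    delta = middle * ratio
    return (middle - delta, middle + delta)
```

With candidate widths of 1, 2, 6, 7 and 30 pixels, the middle width is 6, so the interval is approximately 4 to 8: narrow inter-letter gaps and the wide trailing region both fall outside it.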

In one embodiment, the width interval of the space may be set in advance according to the general space width.

In the embodiment of the application, once the width interval of the space has been determined, each connected domain whose width is within the width interval, among the connected domains formed by pixel points corresponding to sums of pixel values in the first pixel value interval, is determined to be a space in the single-line text. A connected domain whose width is outside the width interval is not regarded as a space in the text grayscale image.

Optionally, in the text grayscale image, a connected domain whose width exceeds the width interval may also be determined to be the blank region after the end of the single-line text.
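The width-based selection can be sketched in Python as a classification of the candidate connected domains. The function name `classify_runs` and the label strings are hypothetical; runs are assumed to be inclusive `(start, end)` positions along the character arrangement direction.

```python
from typing import List, Tuple


def classify_runs(runs: List[Tuple[int, int]],
                  min_width: float,
                  max_width: float) -> List[Tuple[int, int, str]]:
    """Label each candidate run by comparing its width with the space width interval."""
    labelled = []
    for start, end in runs:
        width = end - start + 1
        if width < min_width:
            label = "too_narrow"   # e.g. the gap between letters of one word
        elif width > max_width:
            label = "too_wide"     # e.g. the blank region after the end of the line
        else:
            label = "space"
        labelled.append((start, end, label))
    return labelled
```

Only the runs labelled `"space"` would be kept as spaces in the single-line text.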

In this embodiment of the application, after the pixel values of the background are unified, the first pixel value interval that is set may differ depending on whether the unified background pixel value is closer to white or closer to black.

In one embodiment, if the background is uniformly black, the pixel values of the background pixel points are all 0, so the pixel values of the spaces are all 0, and the sum of any number of pixel values of 0 is 0. Even if the space portion contains noise points, few of them remain after the above processing, and because noise points are expressed as gray values, the sums of pixel values of the space portion are concentrated in a small interval and are essentially unaffected by factors such as the height of the text grayscale image and the font size. In this embodiment, for effective fault tolerance, the first pixel value interval may be set to a small range: an interval within the pixel values smaller than 127, such as 5 to 40, may be set as the first pixel value interval.

In this embodiment, the pixel values of the text pixel points are greater than 0 and the maximum of their sum is not fixed, so the second pixel value interval is an interval unbounded on the right, that is, the interval of values greater than some minimum pixel value. For effective fault tolerance, the first pixel value interval and the second pixel value interval may intersect, with the minimum pixel value of the first pixel value interval smaller than the minimum pixel value of the second pixel value interval. For example, the first pixel value interval is set to 5 to 40, and the second pixel value interval is set to values greater than 10.

In an embodiment, if the pixel values of the background pixel points are unified to a pixel value other than 0, for example a value greater than the preset pixel value, then text grayscale images of different heights in the preset direction yield different sums for the space portion: the minimum of those sums differs, and the maximum that may be reached is likewise not fixed. The range of the first pixel value interval is therefore indefinite, and the first pixel value interval may be an interval unbounded on the right, that is, the interval of values greater than some minimum pixel value.

Alternatively, in this embodiment, different first pixel value intervals may be set corresponding to different heights. And selecting a corresponding first pixel value interval according to the current actual height of the text gray-scale map.

Optionally, in this embodiment, a default first pixel value interval may be set, corresponding to a default height. The original text grayscale image can be scaled proportionally to the default height, and the text grayscale image at the default height is then used for determining the space positions: the sum of pixel values of each row of pixel points in the preset direction is obtained for the image at the default height, the connected domains formed by pixel points corresponding to sums in the first pixel value interval are obtained, and the spaces are determined from those connected domains. After the positions of the spaces in the image at the default height are determined, the positions of the spaces in the original text grayscale image are determined according to the proportional relation between the image at the default height and the original text grayscale image.
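The scaling to a default height and the mapping of space positions back to the original image can be sketched as follows. This is a simplified Python illustration for horizontally arranged text: nearest-neighbour sampling is an assumption chosen for brevity (the text does not name an interpolation method), and the names `resize_to_height` and `map_runs_back` are hypothetical.

```python
import math
from typing import List, Tuple

Gray = List[List[int]]


def resize_to_height(img: Gray, new_h: int) -> Tuple[Gray, float]:
    """Nearest-neighbour resize that keeps the aspect ratio; returns (image, scale)."""
    h, w = len(img), len(img[0])
    scale = new_h / h
    new_w = max(1, round(w * scale))
    resized = [
        [img[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]
    return resized, scale


def map_runs_back(runs: List[Tuple[int, int]], scale: float,
                  orig_w: int) -> List[Tuple[int, int]]:
    """Convert inclusive column runs found in the resized image to original columns."""
    mapped = []
    for start, end in runs:
        s = math.floor(start / scale)
        e = math.ceil((end + 1) / scale) - 1
        mapped.append((max(0, s), min(orig_w - 1, e)))
    return mapped
```

For example, if an image is halved in size (scale 0.5), a space found at columns 1 to 2 of the resized image maps back to columns 2 to 5 of the original.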

Correspondingly, in this embodiment, since the text is close to black and its pixel values are small, the second pixel value interval may be a bounded interval from one pixel value to another pixel value. However, since a row of pixel points passing through text also contains some background pixels, whose number differs between text grayscale images of different heights, the range of the second pixel value interval also differs.

Therefore, in this embodiment, different second pixel value intervals may be set corresponding to different heights. And selecting a corresponding second pixel value interval according to the current actual height of the text gray-scale map.

Optionally, in this embodiment, a default second pixel value interval may be set, and the default second pixel value interval corresponds to a default height. The original text gray-scale map can be converted to the default height in proportion, the character width is calculated according to the text gray-scale map at the default height and the second pixel value interval, and the width interval of the blank space is determined according to the character width.

In addition, in this embodiment, for fault tolerance, the first pixel value interval and the second pixel value interval may intersect, and the minimum pixel value of the first pixel value interval is greater than the minimum pixel value of the second pixel value interval.

In the embodiment of the present application, each processing operation on the text grayscale image may assume by default either that the background is closer to white or that the background is closer to black. If the actual background color in the text grayscale image does not match the default, the image can be color-inverted before it is processed.

In the embodiment of the application, the pixel points of the background portion are determined by means of the segmentation pixel value, and their color is unified. Since the space portion belongs to the background, its pixel values are in theory unified as well. When the sums of pixel values are calculated, the sums obtained for the space portion are therefore concentrated even under the influence of noise, so the spaces can be selected by means of the first pixel value interval, and the spaces of the single line of text in the text grayscale image are thereby determined.

The embodiment of the present application describes a space identification method in the text line through a specific use scenario.

The gray-scale image of the acquired text is shown in fig. 4, where the background part is closer to white and the text part is closer to black, but includes much noise. Optionally, image preprocessing such as median blurring and equalization may be performed on the text gray scale image, so that the text gray scale image after image preprocessing is used as a text gray scale image for subsequent processing.

In the text gray-scale map, a fitting curve between a pixel value and the number of pixels corresponding to each pixel value may be obtained by using the pixel value as an abscissa and the number of pixels as an ordinate, and the obtained fitting curve is shown as a curve L in fig. 5.

In the curve L, the largest of the maxima can be determined to be the maximum point m1. Since the background portion of the text grayscale image is closer to white, the minimum adjacent to the left of this maximum is selected, that is, the first minimum point to the left of m1, shown as the minimum point m2 in fig. 5. The pixel value corresponding to m2 is taken as the segmentation pixel value; in fig. 5, the pixel value corresponding to m2 is 213, which is used here as an example.
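The selection of the segmentation pixel value from the histogram curve can be sketched in Python. The text obtains a fitting curve without specifying the fitting method; the sketch below substitutes a simple moving average for the fit, which is an assumption. The names `histogram`, `smooth`, and `split_value` are hypothetical. The walk starts at the highest peak and moves downhill (to the left for a light background) until the curve stops decreasing, which yields the adjacent minimum.

```python
from typing import List


def histogram(img: List[List[int]]) -> List[int]:
    """Count how many pixels take each gray value 0..255."""
    counts = [0] * 256
    for row in img:
        for p in row:
            counts[p] += 1
    return counts


def smooth(counts: List[float], radius: int = 2) -> List[float]:
    """Moving average used here as a stand-in for the fitting curve (assumption)."""
    n = len(counts)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out


def split_value(curve: List[float], light_background: bool = True) -> int:
    """Index of the minimum adjacent to the highest peak, on the text side of it."""
    top = max(curve)
    peaks = [i for i, c in enumerate(curve) if c == top]
    # Light background: walk left from the peak. Dark background: walk right.
    i = peaks[0] if light_background else peaks[-1]
    step = -1 if light_background else 1
    while 0 <= i + step < len(curve) and curve[i + step] < curve[i]:
        i += step
    return i
```

A typical call would be `split_value(smooth(histogram(img)))`. On a flat valley the walk stops at the edge of the valley nearest the peak, so a larger smoothing radius may be needed for noisy histograms.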

In the text grayscale image the background is closer to white. That the background is closer to white can also be determined from the pixel value corresponding to the maximum point, which represents the pixel value of the background. The pixel points in the text grayscale image whose pixel values are greater than the segmentation pixel value are set to the first pixel value. In the embodiment of the present application, 255 is used as the first pixel value, so pixel points with a pixel value greater than 213 are set to 255, and the resulting text grayscale image is shown in fig. 6.

To further remove noise, a closing operation may be performed on the text grayscale image shown in fig. 6, and the resulting text grayscale image is shown in fig. 7. As can be seen from fig. 7, the text grayscale image after the closing operation has less noise and the space portion is cleaner.
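A grayscale closing operation is a dilation (local maximum) followed by an erosion (local minimum). The following Python sketch illustrates it with a 3x3 window; the window size, the clipping of windows at the image border, and the names `_filter3x3` and `close3x3` are assumptions for illustration, since the text does not specify a structuring element.

```python
from typing import Callable, List

Gray = List[List[int]]


def _filter3x3(img: Gray, pick: Callable[[List[int]], int]) -> Gray:
    """Apply pick (max or min) over each pixel's 3x3 neighbourhood, clipped at borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(pick(window))
        out.append(row)
    return out


def close3x3(img: Gray) -> Gray:
    """Closing: dilation (max filter) followed by erosion (min filter)."""
    return _filter3x3(_filter3x3(img, max), min)
```

On a white background, an isolated dark noise pixel is removed by the dilation and does not come back in the erosion, while a dark region at least as large as the window is restored to its original extent.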

In the embodiment of the present application, for convenience of calculation, the first pixel value interval and the second pixel value interval may be set for the case in which the background is black, that is, the pixel value of the background is 0. Therefore, the text grayscale image shown in fig. 7 may be color-inverted, that is, the pixel value of each pixel point is set to 255 minus its current pixel value, so that all white parts with a pixel value of 255 are converted into black parts with a pixel value of 0. The resulting text grayscale image is shown in fig. 8.

The sums of pixel values are then counted for fig. 8. Since the characters in this text grayscale image are arranged horizontally, the sum of pixel values of each column is counted. The statistical results obtained are shown in fig. 9, in which the ordinate represents the sum of pixel values and the abscissa represents the position of each pixel point along the character arrangement direction, that is, the order of the columns. For example, the ordinate value corresponding to the abscissa 100 in fig. 9 represents the sum of the pixel values of the 100th column of pixel points.

It is then determined whether each sum of pixel values is within the first pixel value interval. In the embodiment of the present application, 5 to 40 is taken as the first pixel value interval as an example. The statistical results shown in fig. 9 can be checked from left to right to determine whether each ordinate value is in the range of 5 to 40. The abscissas corresponding to consecutive ordinate values within the range are determined, and the pixel columns corresponding to those abscissas are joined to form a connected domain formed by pixel points corresponding to sums of pixel values in the first pixel value interval.

In addition, a width interval of the space may be set. For example, one third to one half of the character width is determined as the width interval of the space. For how the character width is determined, see the foregoing embodiment.

According to the width interval of the space, each connected domain determined by the first pixel value interval whose width lies within the width interval is determined to be a space, and each connected domain whose width lies outside the width interval is determined not to be a space. The spaces in the text line can therefore be effectively determined by the space identification method in the text line.

The space recognition method may be used in optical character recognition (OCR), after text detection has produced a single line of text. The method extracts the positions of the spaces; the words separated by the spaces are sent to an OCR recognition module and recognized one word at a time, and the recognized words are then joined with spaces to obtain the final complete single-line text containing spaces. This avoids producing text in which all words run together without spaces and which a human reader cannot make out.
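The OCR use described above can be sketched as splitting the line image at the detected spaces, recognizing each word crop, and joining the results with space characters. In this Python sketch, `recognize` stands for whatever OCR recognition module is available; it is a hypothetical callable supplied by the caller, not an API of any particular library. Spaces are assumed to be inclusive column ranges in a horizontally arranged line.

```python
from typing import Callable, List, Tuple

Gray = List[List[int]]


def word_ranges(width: int, spaces: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the inclusive column ranges that lie between the detected spaces."""
    ranges = []
    cursor = 0
    for start, end in sorted(spaces):
        if start > cursor:
            ranges.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor < width:
        ranges.append((cursor, width - 1))
    return ranges


def read_line(img: Gray, spaces: List[Tuple[int, int]],
              recognize: Callable[[Gray], str]) -> str:
    """Recognize each word crop separately and join the words with single spaces."""
    words = []
    for start, end in word_ranges(len(img[0]), spaces):
        crop = [row[start:end + 1] for row in img]
        words.append(recognize(crop))
    return " ".join(words)
```

For a 10-column line with one space at columns 3 to 4, the word crops cover columns 0 to 2 and 5 to 9.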

The embodiment of the present application further provides a device 300 for identifying a space in a text line. As shown in fig. 10, the device 300 includes: an image obtaining module 310, configured to obtain a text grayscale image, where the text grayscale image includes only a single line of text; a pixel value obtaining module 320, configured to calculate the sum of pixel values of each row of pixel points in a preset direction in the text grayscale image, where the preset direction is the direction perpendicular to the text arrangement direction in the single-line text; and a space determining module 330, configured to take a connected domain formed by pixel points corresponding to the sum of pixel values in a first pixel value interval as a space in the single-line text, where the first pixel value interval is the interval in which the sum of pixel values corresponding to a space in the text grayscale image is located.

Optionally, the apparatus may further include a segmentation module, which includes: a segmentation pixel value determining unit, configured to obtain a segmentation pixel value in the text grayscale image; and a pixel value setting unit, configured to set each pixel point whose pixel value is greater than the segmentation pixel value to a first pixel value if the pixel value of the background in the text grayscale image is greater than the segmentation pixel value, where the first pixel value is greater than or equal to the pixel value of the background, and to set each pixel point whose pixel value is smaller than the segmentation pixel value to a second pixel value if the pixel value of the background in the text grayscale image is smaller than the segmentation pixel value, where the second pixel value is smaller than or equal to the pixel value of the background.

Optionally, the segmentation pixel value determining unit is configured to: obtain a fitting curve between the pixel values and the number of pixel points corresponding to each pixel value, with the pixel value as the abscissa and the number of pixel points as the ordinate; obtain the largest of all the maxima in the fitting curve; obtain a minimum adjacent to that maximum; and take the pixel value corresponding to the obtained minimum as the segmentation pixel value.

Optionally, the apparatus may further include a denoising module, configured to perform a close operation on the text grayscale map.

Optionally, the space determining module 330 may be configured to obtain a character width in the text gray map; setting a width interval of a blank according to the character width; and determining the connected domain with the width within the width interval in the connected domain formed by the pixel points corresponding to the sum of the pixel values within the first pixel value interval as a blank in the single-line text.

Optionally, the space determining module 330 may be configured to obtain a width of each character in the text gray map; and taking the median of the obtained widths of all the characters as the character width.

Optionally, the space determining module 330 may be configured to use a width of each connected component formed by the pixel points corresponding to the sum of the pixel values in a second pixel value interval as a width of a single character, where the second pixel value interval is different from the first pixel value interval, and the second pixel value interval is an interval in which the sum of the pixel values corresponding to the character in the text grayscale image is located.

Optionally, if the pixel value of the background in the text grayscale image is greater than a preset pixel value, the first pixel value interval and the second pixel value interval intersect, and the minimum pixel value of the first pixel value interval is greater than the minimum pixel value of the second pixel value interval; if the pixel value of the background in the text grayscale image is smaller than or equal to the preset pixel value, the first pixel value interval and the second pixel value interval intersect, and the minimum pixel value of the first pixel value interval is smaller than the minimum pixel value of the second pixel value interval.

Optionally, the apparatus may further include an equalization module, configured to perform equalization processing on the text grayscale map.

The space identification method and device in a text line provided by the embodiments of the present application find the optimal segmentation point intelligently, so that the positions of spaces in the text can be extracted effectively.

It will be clear to those skilled in the art that, for convenience and brevity of description, the various method embodiments described above may be referred to one another; for the specific working processes of the above-described devices and modules, reference may be made to corresponding processes in the foregoing method embodiments, which are not described herein again.

In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or other type of coupling.

In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. Each module may be configured in different electronic devices, or may be configured in the same electronic device, and the embodiments of the present application are not limited thereto.

Referring to fig. 11, a block diagram of an electronic device 500 according to an embodiment of the present disclosure is shown. The electronic device may include one or more processors 510 (only one shown), memory 520, and one or more programs. Wherein the one or more programs are stored in the memory 520 and configured to be executed by the one or more processors 510. The one or more programs are executed by the processor for performing the methods described in the foregoing embodiments.

The processor 510 may include one or more processing cores. The processor 510 connects the various parts of the electronic device 500 using various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 520 and by calling data stored in the memory 520. Optionally, the processor 510 may be implemented in at least one hardware form among Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 510 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU is primarily responsible for the operating system, user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It is understood that the modem may also be implemented separately, in a standalone communication chip, rather than being integrated into the processor 510.

The memory 520 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 520 may be used to store instructions, programs, code sets, or instruction sets. The memory 520 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function, instructions for implementing the various method embodiments described above, and the like. The data storage area may store data created by the electronic device in use, and the like.

Referring to fig. 12, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 700 has stored therein program code that can be called by a processor to execute the methods described in the above-described method embodiments.

The computer-readable storage medium 700 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 700 includes a non-volatile computer-readable storage medium. The computer-readable storage medium 700 has storage space for program code 710 that performs any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 710 may, for example, be compressed in a suitable form.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. A method of space recognition in a line of text, the method comprising:

acquiring a text gray image, wherein the text gray image only comprises a single line of text;

calculating the sum of pixel values of each row of pixel points in the text gray-scale image in a preset direction, wherein the preset direction is a direction vertical to the arrangement direction of characters in the single-line text;

and taking a connected domain formed by pixel points corresponding to the sum of pixel values in a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is an interval in which the sum of pixel values corresponding to the space in the text gray image is located.

2. The method of claim 1, wherein before calculating the sum of the pixel values of each line of pixel points along the preset direction in the text grayscale image, the method further comprises:

acquiring a segmentation pixel value of the text grayscale image;

if the pixel value of the background in the text grayscale image is larger than the segmentation pixel value, setting each pixel point whose pixel value is larger than the segmentation pixel value to a first pixel value, wherein the first pixel value is larger than or equal to the pixel value of the background;

and if the pixel value of the background in the text grayscale image is smaller than the segmentation pixel value, setting each pixel point whose pixel value is smaller than the segmentation pixel value to a second pixel value, wherein the second pixel value is smaller than or equal to the pixel value of the background.
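
As a hedged sketch of the background flattening in claim 2: the function below assumes 8-bit pixel values and assumes the caller already knows whether the background is brighter than the segmentation pixel value. The name `flatten_background`, its parameters, and the default fill values 255 and 0 are illustrative choices, not values given in the application.

```python
from typing import List, Optional


def flatten_background(
    gray: List[List[int]],
    split: int,
    background_is_bright: bool,
    fill: Optional[int] = None,
) -> List[List[int]]:
    """Replace every pixel on the background side of `split` with one value.

    Bright background: pixels above `split` become `fill` (default 255,
    which is >= any 8-bit background value).
    Dark background: pixels below `split` become `fill` (default 0,
    which is <= any background value).
    """
    if fill is None:
        fill = 255 if background_is_bright else 0
    if background_is_bright:
        return [[fill if p > split else p for p in row] for row in gray]
    return [[fill if p < split else p for p in row] for row in gray]


# Assumed example rows: uneven bright background around one dark stroke,
# then uneven dark background around one bright stroke.
print(flatten_background([[200, 40, 230]], split=128, background_is_bright=True))
print(flatten_background([[20, 200, 5]], split=128, background_is_bright=False))
```

Flattening makes every background pixel contribute the same amount to a projection sum, so lines of pure background produce a single, predictable sum.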

3. The method of claim 2, wherein acquiring the segmentation pixel value of the text grayscale image comprises:

obtaining a fitted curve of the number of pixel points against pixel value, with the pixel value as the horizontal coordinate and the number of pixel points having that pixel value as the vertical coordinate;

obtaining the largest of all local maxima of the fitted curve;

acquiring a local minimum adjacent to that largest local maximum;

and taking the pixel value corresponding to the acquired local minimum as the segmentation pixel value.
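
The selection in claim 3 might be sketched as below, under several stated assumptions. A moving average of the histogram stands in for the fitted curve (the claim does not say how the curve is fitted), the global maximum of the curve is treated as the largest local maximum, and the caller chooses on which side of the peak to look for the adjacent minimum through the hypothetical flag `toward_lower`. All function names are illustrative.

```python
from typing import List, Sequence


def histogram(gray: List[List[int]], levels: int = 256) -> List[int]:
    """Count how many pixel points have each pixel value."""
    counts = [0] * levels
    for row in gray:
        for p in row:
            counts[p] += 1
    return counts


def smooth(counts: Sequence[float], radius: int = 2) -> List[float]:
    """Moving average, used here as a simple stand-in for curve fitting."""
    n = len(counts)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out


def split_value(curve: Sequence[float], toward_lower: bool) -> int:
    """Pixel value of the minimum next to the highest peak of `curve`."""
    peak = max(range(len(curve)), key=curve.__getitem__)
    step = -1 if toward_lower else 1
    i = peak
    # Walk downhill from the peak until the curve stops decreasing.
    while 0 <= i + step < len(curve) and curve[i + step] < curve[i]:
        i += step
    return i


# Assumed toy curve: a small peak at index 1 and the highest peak at index 5.
curve = [1, 5, 2, 1, 3, 9, 4]
print(split_value(curve, toward_lower=True))
```

For a real image the curve would be `smooth(histogram(gray))`. With a bright background the highest peak usually belongs to the background, so searching toward lower pixel values is a plausible default, but that choice is an assumption of this sketch.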

4. The method of claim 1, wherein before calculating the sum of the pixel values of each line of pixel points along the preset direction in the text grayscale image, the method further comprises:

performing a closing operation on the text grayscale image.

5. The method of claim 1, wherein taking the connected domain formed by the pixel points whose sum of pixel values lies within the first pixel value interval as a space in the single-line text comprises:

acquiring the character width in the text grayscale image;

setting a width interval for a space according to the character width;

and, among the connected domains formed by the pixel points whose sum of pixel values lies within the first pixel value interval, determining each connected domain whose width lies within the width interval to be a space in the single-line text.

6. The method of claim 5, wherein acquiring the character width in the text grayscale image comprises:

acquiring the width of each character in the text grayscale image;

and taking the median of the acquired widths of all the characters as the character width.

7. The method of claim 6, wherein acquiring the width of each character in the text grayscale image comprises:

taking the width of each connected domain formed by the pixel points whose sum of pixel values lies within a second pixel value interval as the width of a single character, wherein the second pixel value interval is different from the first pixel value interval, and the second pixel value interval is the interval in which the sums of pixel values corresponding to characters in the text grayscale image lie.
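
The width test of claims 5 to 7 might look like the following sketch. It assumes that the candidate space runs and the character runs have already been extracted as (start, end) pairs with the end exclusive. The factors `min_factor` and `max_factor` that turn the median character width into a width interval are hypothetical; the claims only state that the interval is set according to the character width.

```python
from statistics import median
from typing import List, Tuple

Run = Tuple[int, int]  # (start, end) along the text direction, end exclusive


def filter_spaces(
    candidate_runs: List[Run],
    char_runs: List[Run],
    min_factor: float = 0.5,   # assumed lower bound, as a fraction of character width
    max_factor: float = 2.0,   # assumed upper bound, as a multiple of character width
) -> List[Run]:
    """Keep only the candidate runs whose width fits the space width interval."""
    widths = [end - start for start, end in char_runs]
    char_width = median(widths)            # claim 6: median of all character widths
    low, high = min_factor * char_width, max_factor * char_width
    return [(s, e) for s, e in candidate_runs if low <= e - s <= high]


# Assumed toy data: four characters and the three gaps between them.
char_runs = [(0, 10), (12, 22), (30, 38), (40, 52)]
gap_runs = [(10, 12), (22, 30), (38, 40)]
print(filter_spaces(gap_runs, char_runs))
```

Using the median rather than the mean keeps one unusually wide or narrow connected domain, such as two touching characters, from shifting the width interval.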

8. The method of claim 7, wherein if the pixel value of the background in the text grayscale image is greater than a preset pixel value, the first pixel value interval intersects the second pixel value interval, and the minimum pixel value of the first pixel value interval is greater than the minimum pixel value of the second pixel value interval;

and if the pixel value of the background in the text grayscale image is smaller than or equal to the preset pixel value, the first pixel value interval intersects the second pixel value interval, and the minimum pixel value of the first pixel value interval is smaller than the minimum pixel value of the second pixel value interval.

9. The method according to any one of claims 1 to 8, wherein before calculating the sum of the pixel values of each line of pixel points along the preset direction in the text grayscale image, the method further comprises: performing equalization processing on the text grayscale image.
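
Claim 9 does not name a particular equalization technique. Assuming that ordinary histogram equalization is meant, a minimal sketch for an 8-bit image held as a list of pixel rows could be the following; the function name `equalize` and the handling of a constant image are choices made for this example.

```python
from typing import List


def equalize(gray: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Histogram equalization through the cumulative distribution of pixel values."""
    counts = [0] * levels
    for row in gray:
        for p in row:
            counts[p] += 1
    total = sum(counts)

    # Cumulative count of pixel points at or below each pixel value.
    cdf = []
    running = 0
    for c in counts:
        running += c
        cdf.append(running)

    cdf_min = next((v for v in cdf if v > 0), 0)
    if total == cdf_min:
        # Empty or single-valued image: nothing to spread out.
        return [row[:] for row in gray]

    scale = levels - 1
    lut = [max(0, round((v - cdf_min) * scale / (total - cdf_min))) for v in cdf]
    return [[lut[p] for p in row] for row in gray]


# Assumed example row: three distinct gray levels spread over the full range.
print(equalize([[50, 50, 100, 200, 200]]))
```

Stretching the contrast in this way tends to separate the background and character peaks of the histogram, which may make a later segmentation pixel value easier to locate.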

10. An apparatus for recognizing a space in a text line, the apparatus comprising:

an image acquisition module, configured to acquire a text grayscale image, wherein the text grayscale image comprises only a single line of text;

a pixel value acquisition module, configured to calculate the sum of the pixel values of each line of pixel points along a preset direction in the text grayscale image, wherein the preset direction is perpendicular to the direction in which the characters of the single-line text are arranged;

and a space determining module, configured to take a connected domain formed by the pixel points whose sum of pixel values lies within a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the sums of pixel values corresponding to spaces in the text grayscale image lie.

11. An electronic device, comprising:

one or more processors;

a memory;

one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method of any one of claims 1 to 9.

12. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 9.