CN111461126B

CN111461126B - Space recognition method and device in text line, electronic equipment and storage medium

Info

Publication number: CN111461126B
Application number: CN202010231850.8A
Authority: CN
Inventors: 尚太章
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2020-03-23
Filing date: 2020-03-23
Publication date: 2023-08-18
Anticipated expiration: 2040-03-23
Also published as: CN111461126A; WO2021190155A1

Abstract

The application discloses a method and a device for recognizing spaces in a text line, an electronic device, and a storage medium, and relates to the technical field of image processing. The method comprises: acquiring a text gray scale map, wherein the text gray scale map comprises only a single line of text; calculating the sum of pixel values of each row of pixel points in a preset direction in the text gray scale map, wherein the preset direction is perpendicular to the text arrangement direction of the single-line text; and taking a connected domain formed by the pixel points whose sum of pixel values lies in a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the sums of pixel values corresponding to spaces in the text gray scale map lie. The technical scheme can determine the spaces in the single-line text.

Description

Space recognition method and device in text line, electronic equipment and storage medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a method and apparatus for identifying a space in a text line, an electronic device, and a storage medium.

Background

If an image contains a line of characters, the spaces in that line need to be extracted to determine which characters are separated by spaces, so that the real text information, including the spaces, can be obtained.

Disclosure of Invention

In view of the above, the present application proposes a method, an apparatus, an electronic device, and a storage medium for identifying spaces in text lines, so as to address the above-mentioned problem.

In a first aspect, an embodiment of the present application provides a method for identifying a space in a text line, the method including: acquiring a text gray scale map, wherein the text gray scale map includes only a single line of text; calculating the sum of pixel values of each row of pixel points in a preset direction in the text gray scale map, wherein the preset direction is perpendicular to the text arrangement direction of the single-line text; and taking a connected domain formed by the pixel points whose sum of pixel values lies in a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the sums of pixel values corresponding to spaces in the text gray scale map lie.

In a second aspect, an embodiment of the present application provides a space recognition apparatus in a text line, the apparatus including: an image acquisition module, configured to acquire a text gray scale map, wherein the text gray scale map includes only a single line of text; a pixel value acquisition module, configured to calculate the sum of pixel values of each row of pixel points in a preset direction in the text gray scale map, wherein the preset direction is perpendicular to the text arrangement direction of the single-line text; and a space determining module, configured to take a connected domain formed by the pixel points whose sum of pixel values lies in a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the sums of pixel values corresponding to spaces in the text gray scale map lie.

In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being executed by the processors for performing the methods described above.

In a fourth aspect, embodiments of the present application provide a computer readable storage medium having program code stored therein, the program code being callable by a processor to perform the method described above.

According to the space recognition method and device in a text line, the electronic device, and the storage medium provided by the application, in a text gray scale map that includes only a single line of text, the sum of pixel values of each row of pixel points in the preset direction perpendicular to the text arrangement direction is calculated. Whether the pixel points corresponding to a given sum lie in a space is then determined according to whether that sum falls within the interval of sums corresponding to spaces. The connected domain formed by the pixel points determined to lie in a space is taken as a space, thereby determining the spaces in the single-line text.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a method for identifying spaces in text lines according to an embodiment of the present application.

Fig. 2 is a schematic diagram illustrating pixel arrangement according to an embodiment of the application.

Fig. 3 is a flowchart illustrating a method for identifying spaces in text lines according to another embodiment of the present application.

Fig. 4 shows a schematic text gray scale provided by an embodiment of the present application.

Fig. 5 shows a graph of a fit between pixel values and the number of pixels in a gray-scale image of text provided by an embodiment of the present application.

Fig. 6 is a schematic diagram after unifying pixel values of a background portion according to an embodiment of the present application.

Fig. 7 is a schematic diagram of the text gray scale map of fig. 6 after a closing operation, according to an embodiment of the present application.

Fig. 8 is a schematic diagram of fig. 7 after color inversion, according to an embodiment of the present application.

Fig. 9 shows a graph of the statistical result after calculating the sum of pixel values for each column of pixel points in fig. 8.

Fig. 10 is a functional block diagram of a space recognition device in text lines according to an embodiment of the present application.

Fig. 11 shows a block diagram of an electronic device according to an embodiment of the present application.

Fig. 12 shows a storage unit for storing or carrying program code that implements the space recognition method in a text line according to an embodiment of the present application.

Detailed Description

In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present application with reference to the accompanying drawings.

Since text in a picture cannot be directly edited, copied, cut, or otherwise processed as text, it generally needs to be recognized and converted into text form, after which it can be edited, copied, cut, or otherwise processed.

When Chinese characters in a picture are processed, each character can stand alone as a unit of text, and the characters form text carrying the real text information whether or not spaces are recognized. For other languages, however, such as English, each word is composed of letters, and the letters of different words are drawn from the same alphabet. When a continuous line of characters appears, extracting the spaces between words is therefore very important. If the spaces cannot be extracted, each line of recognized text becomes a run of letters joined together, and the individual words cannot be told apart, which makes subsequent machine processing difficult and makes the result hard for a human reader to understand.

Therefore, the embodiments of the application provide a space recognition method and device in a text line, an electronic device, and a storage medium, which determine the spaces of a single-line text in a text gray scale map by checking whether each calculated sum of pixel values falls within the interval of sums corresponding to spaces in the text gray scale map. The method, the device, the electronic device, and the storage medium are described in detail below through specific embodiments.

Referring to fig. 1, a method for identifying spaces in text lines according to an embodiment of the present application is shown. Specifically, the method comprises the following steps:

Step S110: acquiring a text gray scale map, wherein the text gray scale map only comprises a single line of text.

The text gray scale map is a gray scale map that includes only one line of text, i.e., a single line of text in the text gray scale map. Spaces in a line of text in the gray scale map of the text are identified, i.e., spaces in the single line of text are identified.

In the embodiment of the application, the arrangement direction of the single line of text in the text gray scale map is not limited: the text may be arranged horizontally, vertically, or in another direction. The embodiments of the present application are described taking horizontal arrangement as an example.

Optionally, the embodiment of the present application does not limit how it is determined whether the text in the text gray scale map is arranged horizontally or vertically. For example, the text may be treated as horizontally arranged by default, or as vertically arranged by default. As another example, the text arrangement direction may be determined from the two mutually perpendicular sides of the text gray scale map, with the direction in which the longer side extends taken as the text arrangement direction.
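The longer-side rule mentioned above can be sketched in a few lines of Python. The function name `arrangement_direction` is hypothetical, and treating a square image as horizontal is an assumption, since the description does not address that case.

```python
def arrangement_direction(width, height):
    """Guess the text arrangement direction from the image dimensions.

    The longer side is taken as the arrangement direction. A square image
    is treated as horizontal here, which is an assumption of this sketch.
    """
    return "horizontal" if width >= height else "vertical"


direction = arrangement_direction(320, 40)
```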

Step S120: calculating the sum of pixel values of each row of pixel points in the preset direction in the text gray level diagram, wherein the preset direction is a direction perpendicular to the word arrangement direction in the single-row text.

In the text gray level diagram, the direction of character arrangement in a single line text is defined as a character arrangement direction, and the direction perpendicular to the character arrangement direction is defined as a preset direction. For example, in a single line of text arranged horizontally, the horizontal direction is the text arrangement direction, and the vertical direction is the preset direction.

In the embodiment of the application, the sum of pixel values of each row of pixel points in the preset direction in the text gray scale map can be calculated. A row of pixel points in the preset direction is a row whose pixel points are arranged along the preset direction. For example, in a single line of horizontally arranged text, a row of pixel points in the vertical direction is a column of pixel points, so the sum of pixel values of each column of pixel points in the text gray scale map is calculated. Fig. 2 is a schematic diagram of the pixel points in a text gray scale map with horizontally arranged text: column I1 includes the pixel points (I1, J1), (I1, J2), (I1, J3); column I2 includes (I2, J1), (I2, J2), (I2, J3); column I3 includes (I3, J1), (I3, J2), (I3, J3); and so on. The preset direction is the vertical direction, so calculating the sum of pixel values of each row of pixel points in the preset direction means calculating the sum of gray values of each column, which yields seven sums, one for each of columns I1 to I7.

Alternatively, since the pixel points are closely arranged and each space may span multiple rows of pixel points, to increase the calculation speed the sum of pixel values may instead be calculated over every two adjacent rows, or every several adjacent rows, of pixel points in the preset direction.

Alternatively, since the pixel points are closely arranged and each space may span multiple rows of pixel points, to increase the calculation speed the sums may be calculated at intervals: the sum of pixel values of one or more rows of pixel points in the preset direction is calculated, then one or more rows are skipped, and so on.
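As an illustration of the per-column summation for horizontally arranged text, the following Python sketch represents the text gray scale map as a list of rows of integers in the range 0 to 255. The function name `column_sums` and the sample image are assumptions made for this example only.

```python
def column_sums(gray):
    """Return the sum of pixel values of every column of a grayscale image.

    `gray` is a list of rows; each row is a list of integers in 0..255.
    """
    if not gray:
        return []
    width = len(gray[0])
    return [sum(row[x] for row in gray) for x in range(width)]


# 3 rows x 7 columns: white (255) background with dark (0) text pixels.
image = [
    [255, 0, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 0, 255],
    [255, 0, 0, 255, 255, 0, 255],
]
sums = column_sums(image)  # one sum per column, seven in total
```

In this sample, columns made up only of background pixels sum to 3 x 255 = 765, while columns that cross text give smaller sums.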
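The two speed-up variants described above, summing over groups of adjacent columns and summing only every few columns, might look as follows for horizontally arranged text. The function names and the `group` and `step` parameters are hypothetical.

```python
def grouped_column_sums(gray, group=2):
    """Sum pixel values over each block of `group` adjacent columns."""
    width = len(gray[0]) if gray else 0
    return [
        sum(row[x] for row in gray
            for x in range(start, min(start + group, width)))
        for start in range(0, width, group)
    ]


def strided_column_sums(gray, step=2):
    """Sum only every `step`-th column; return (column_index, sum) pairs."""
    width = len(gray[0]) if gray else 0
    return [(x, sum(row[x] for row in gray)) for x in range(0, width, step)]


image = [
    [255, 0, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 0, 255],
    [255, 0, 0, 255, 255, 0, 255],
]
grouped = grouped_column_sums(image, group=2)
strided = strided_column_sums(image, step=2)
```

Both variants produce fewer sums than the per-column version, at the cost of coarser localization of the space boundaries.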

Step S130: taking a connected domain formed by the pixel points whose sum of pixel values lies in a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the sums of pixel values corresponding to spaces in the text gray scale map lie.

To make text easy to recognize, within a single text picture the text colors are generally the same or close, the background colors are the same or close, and the color difference between the text and the background is large. That is, the pixel points forming the text have the same or similar pixel values, the pixel points forming the background have the same or similar pixel values, and the two groups differ considerably, for example by more than a preset pixel difference. The background is the part of the single-line text picture outside the text; it includes the spaces and the regions above, below, to the left of, and to the right of the text line. In a single-line text picture, the pixel values of the pixel points forming the text therefore differ significantly from those of the pixel points forming the spaces.

Therefore, in the embodiment of the present application, when the pixel points in a space are summed in the manner of the previous step, the resulting sums fall within a certain interval, which is defined as the first pixel value interval. The first pixel value interval is distinctive: it differs from the interval in which the sums fall when pixel points belonging to text are summed in the same manner.

The connected domain formed by the pixel points corresponding to the sum of the pixel values in the first pixel value interval can be identified as a space in the single-line text. That is, a region composed of pixels corresponding to the sum of pixel values in the first pixel value interval is determined as a space.

When the sum of pixel values is calculated for each row of pixel points in the preset direction, or for every two or more adjacent rows of pixel points in the preset direction, the pixel points corresponding to a sum of pixel values may be all the pixel points whose pixel values were used to calculate that sum.

If the sums are calculated at intervals, that is, the sum of pixel values of one or more rows of pixel points in the preset direction is calculated every one or more rows, the pixel points corresponding to a sum of pixel values may include both the pixel points whose pixel values were used to calculate that sum and the skipped pixel points that did not take part in the calculation. For example, in a single line of horizontally arranged text, the sum of pixel values of the first column, the sum of pixel values of the third column, and so on are calculated. The pixel points corresponding to the sum for each odd column may then include the pixel points of that odd column and the pixel points of the skipped even column; in this example, the skipped even column associated with each odd column is the adjacent even column with the smaller column number.

In the embodiment of the application, for the text gray scale map containing the single-line text, the direction perpendicular to the text arrangement direction is taken as the preset direction, and the sum of pixel values of each row of pixel points in the preset direction is calculated. Each sum is then checked against the first pixel value interval, that is, the interval in which the sums for pixel points belonging to spaces in the text gray scale map may lie, and the connected domain formed by the pixel points whose sums lie in the first pixel value interval is taken as a space in the single-line text, so that the spaces in the single-line text are accurately identified.
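The odd-column example can be made concrete with a small helper that lists which columns a sampled column stands for. The helper `columns_covered` is hypothetical and uses 1-based column numbers to match the wording of the example; assigning each skipped column to the next sampled column is the reading taken from that example.

```python
def columns_covered(sampled_column, step):
    """List the 1-based columns represented by one sampled column.

    With sums taken every `step` columns (columns 1, 1 + step, ...), each
    sampled column also stands for the skipped columns just before it.
    """
    first = max(1, sampled_column - step + 1)
    return list(range(first, sampled_column + 1))


covered_by_third = columns_covered(3, step=2)   # column 3 and skipped column 2
covered_by_first = columns_covered(1, step=2)   # nothing is skipped before column 1
```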
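For horizontally arranged text, the connected domains reduce to runs of consecutive columns whose sums lie in the first pixel value interval. The sketch below assumes the interval bounds `low` and `high` are already known; the function name `find_spaces` and the sample values are illustrative only.

```python
def find_spaces(col_sums, low, high):
    """Return (start, end) column ranges, end exclusive, of consecutive
    columns whose sums lie in the closed interval [low, high]."""
    spaces = []
    start = None
    for x, s in enumerate(col_sums):
        inside = low <= s <= high
        if inside and start is None:
            start = x                      # a new run begins
        elif not inside and start is not None:
            spaces.append((start, x))      # the current run ends
            start = None
    if start is not None:                  # run reaching the last column
        spaces.append((start, len(col_sums)))
    return spaces


col_sums = [765, 0, 255, 765, 765, 0, 765]
runs = find_spaces(col_sums, low=700, high=765)
```

Runs touching the first or last column correspond to the left and right background margins of the picture, which the description also counts as background.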

In the space recognition method provided by another embodiment of the application, the color of the background portion is unified so that the pixel values of the space portion are more uniform. The sums of pixel values calculated for the space portion then differ less from one another and are more concentrated, which makes it easier to set an accurate first pixel value interval for those sums. Referring to fig. 3, the method includes:

Step S210: acquiring a text gray scale map, wherein the text gray scale map only comprises a single line of text.

In the embodiment of the present application, the specific method for acquiring the text gray scale map is not limited. In the text gray scale map, the size of the text, the ratio of the text height to the picture height, the ratio of the text width to the picture width, and the like are not limited.

Alternatively, a single-line text picture including only a single line of text may be obtained by extracting a single line of text from a text picture, and the single-line text picture may then be converted into the text gray scale map through image preprocessing. The method of extracting the single line of text is not limited in the embodiment of the present application; for example, it may be extracted by a deep learning algorithm, such as the TextBoxes family of algorithms, the EAST family of algorithms, SegLink, and the like.

Alternatively, the text gray scale map may be obtained by performing image preprocessing on a single line text picture that itself includes only one line of text.

Wherein, in the embodiment of the application, the image preprocessing can comprise one or more of the following:

if the single-line text picture is not a grayscale picture, for example an RGB three-channel picture, converting it into a grayscale picture and using the grayscale picture as the text gray scale map;

denoising the text gray scale map, for example by median blurring;

equalizing the text gray scale map, so that the pixel value distribution in the text gray scale map is more balanced and the pixel values in the picture are not excessively skewed.

Optionally, when the image preprocessing includes two or more of these operations, they may be performed in the order described above: denoising after grayscale conversion, and equalization after denoising. This reduces the difficulty of each step and improves the effectiveness of the processing.

In addition, optionally, in the embodiment of the present application, image preprocessing operations such as denoising and equalization may be performed after the text gray scale map has been obtained.
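The three preprocessing operations and their order can be sketched in plain Python as below. The description does not name a grayscale formula, a median window size, or an equalization method, so the BT.601 luma weights, the 3x3 window, and the standard histogram equalization used here are all assumptions, as are the function names.

```python
from statistics import median


def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to gray values (BT.601 weights, assumed)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def median_blur3(gray):
    """3x3 median filter; border pixels use the neighbours that exist."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(int(median(window)))
        out.append(row)
    return out


def equalize(gray):
    """Histogram equalization over the 0..255 range."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:                   # flat image: nothing to stretch
        return [row[:] for row in gray]
    return [[round((cdf[v] - cdf_min) / (total - cdf_min) * 255) for v in row]
            for row in gray]


def preprocess(rgb):
    """Grayscale conversion, then denoising, then equalization."""
    return equalize(median_blur3(to_gray(rgb)))
```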

Step S220: obtaining the segmentation pixel value in the text gray scale map.

Because of noise in the text gray scale map, and because the text gray scale map is not binarized, the space portion does not have a single pure pixel value. Using a segmentation pixel value, the pixel values that are more likely to belong to spaces can be identified and unified.

That is, the segmentation pixel value in the text gray scale map separates spaces from text more accurately: pixel values on one side of it are more likely to be space pixel values than text pixel values, and pixel values on the other side are more likely to be text pixel values than space pixel values.

The two sides of the segmentation pixel value are the pixel values larger than it and the pixel values smaller than it. Which side is more likely to be spaces and which is more likely to be text is determined by the actual pixel values of the spaces and the text in the text gray scale map.

Therefore, in the embodiment of the application, the segmentation pixel value in the text gray scale map can be obtained, and the pixel values on the side more likely to be spaces can be unified into a single pixel value, so that the sums calculated for the spaces are more concentrated. To distinguish spaces from text effectively, the unified pixel value differs from the pixel values on the side more likely to be text.

Since the background portion, which includes the spaces, generally differs considerably in color from the text and contains more pixel points than the text, optionally a coordinate system may be established with the pixel value as the abscissa and the number of pixel points as the ordinate, to show how the number of pixel points varies with the pixel value. In this coordinate system, a fitted curve between the pixel values and the number of pixel points corresponding to each pixel value can be obtained, and the largest of all the local maxima of the fitted curve is found; this is referred to as the maximum value. The pixel value corresponding to the maximum value is the pixel value of the background, and therefore also the pixel value of the spaces.

Since pixel values on one side of the segmentation pixel value should more likely be text pixel values and those on the other side should more likely be space pixel values, the segmentation pixel value should correspond to few pixel points and should lie between the text pixel values and the space pixel values. Therefore, a minimum value adjacent to the maximum value may be obtained, and the pixel value corresponding to that minimum value is used as the segmentation pixel value. Being a minimum indicates that few pixel points have that pixel value; being adjacent to the maximum value indicates that the corresponding pixel value probably lies between the text pixel values and the space pixel values.

In addition, so that the segmentation pixel value lies between the space pixel values and the text pixel values, either the minimum value adjacent to the maximum value on its left or the one adjacent on its right is selected, according to the actual text gray scale map. The minimum value to the left of the maximum value is the one whose pixel value is smaller than the pixel value corresponding to the maximum value; the minimum value to the right of the maximum value is the one whose pixel value is larger than the pixel value corresponding to the maximum value.

After equalization, the pixel value distribution in the text gray scale map is more balanced, the background and the text are more distinct in color, and the difference between the background pixel values and the text pixel values is larger. Because the background includes the spaces, the pixel value of the background represents the pixel value of the spaces, and processing the background pixel values also processes the space pixel values. If the background is closer to white and the text closer to black, the segmentation pixel value should be smaller than the pixel value corresponding to the maximum value, and the minimum value adjacent to and to the left of the maximum value is selected. If the background is closer to black and the text closer to white, the segmentation pixel value should be larger than the pixel value corresponding to the maximum value, and the minimum value adjacent to and to the right of the maximum value is selected.

In one embodiment, whether the background is closer to white or to black may be set by default, and processing proceeds directly in the manner corresponding to the default color. For example, if by default the background in the text gray scale map is white or closer to white, the text gray scale map is processed as having a background closer to white, and the pixel value corresponding to the minimum value to the left of the maximum value is selected as the segmentation pixel value.
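A minimal sketch of selecting the segmentation pixel value from the histogram follows. The description speaks of a fitted curve without saying how it is fitted; here a simple moving average stands in for it, which is an assumption, and the function names and the `radius` parameter are hypothetical.

```python
def histogram(gray):
    """Count the number of pixel points for each pixel value 0..255."""
    counts = [0] * 256
    for row in gray:
        for v in row:
            counts[v] += 1
    return counts


def smooth(counts, radius=2):
    """Moving average, used here as a stand-in for the fitted curve."""
    n = len(counts)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out


def split_value(curve, background_is_light=True):
    """Return (peak, split): the index of the global maximum and the index
    of the adjacent minimum, found by walking downhill from the peak.

    The walk goes left for a light background and right for a dark one.
    """
    peak = max(range(len(curve)), key=curve.__getitem__)
    step = -1 if background_is_light else 1
    i = peak
    while 0 <= i + step < len(curve) and curve[i + step] <= curve[i]:
        i += step
    return peak, i


# Toy curves indexed by pixel value (shortened for readability).
light_curve = [0, 5, 9, 5, 2, 1, 3, 8, 20, 8]
dark_curve = [20, 8, 3, 1, 2, 5, 9, 5, 0, 0]
light_result = split_value(light_curve, background_is_light=True)
dark_result = split_value(dark_curve, background_is_light=False)
```

In practice the curve passed to `split_value` would be `smooth(histogram(gray))`; more smoothing makes the walk less likely to stop at a noise-induced dip.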

Alternatively, in this embodiment, if the actual color of the background differs from the default background color, the colors of the background and the text in the text gray scale map may be converted to the default colors by color inversion. If the default background is white or closer to white, but the background in the actual text gray scale map is black or closer to black, the pixel value of each pixel point can be replaced by 255 minus its current pixel value, so that black and white are inverted. For example, if the pixel value of a pixel point is 214, the converted pixel value is 255 - 214 = 41.

In another embodiment, a preset pixel value may be used to decide whether the background is closer to white or to black. In this embodiment, if the pixel value of the background is greater than the preset pixel value, the background is determined to be closer to white; if the pixel value of the background is less than or equal to the preset pixel value, the background is determined to be closer to black. The specific preset pixel value is not limited and may be set according to practical requirements, for example to a middle gray value such as 127 or 128.

Alternatively, in this embodiment, since the background contains the most pixel points, the pixel value of the background may be represented by the pixel value corresponding to the maximum value; that is, the pixel value corresponding to the maximum value is taken as the background pixel value.

Optionally, since the corners of the text gray scale map are usually part of the background, the pixel value of the background may be represented by any one of the following: the average pixel value of the four corners; the average pixel value of one or more of the four corners; the pixel value shared by the most pixel points in the four corners; or the pixel value shared by the most pixel points in one or more of the four corners.

Alternatively, the segmentation pixel value may be obtained as follows: in a coordinate system with the pixel value as the abscissa and the number of pixel points as the ordinate, the pixel value with the largest number of corresponding pixel points is found, and the number of pixel points corresponding to that pixel value is defined as the maximum value. A minimum value adjacent to the maximum value is then obtained, and the pixel value corresponding to that minimum value is taken as the segmentation pixel value. The specific manner of obtaining the minimum value adjacent to the maximum value is described above and is not repeated here.
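The inversion amounts to replacing every pixel value v with 255 - v. A short Python sketch, with the hypothetical function name `invert`:

```python
def invert(gray):
    """Swap black and white by replacing each pixel value v with 255 - v."""
    return [[255 - v for v in row] for row in gray]


flipped = invert([[214, 0], [255, 41]])
```

Applying `invert` twice returns the original image.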
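The corner-based estimate, combined with the preset-value test for light versus dark backgrounds, might be sketched as follows. The corner patch size `size`, the function names, and the default preset of 127 are assumptions made for illustration.

```python
from collections import Counter


def corner_pixels(gray, size=2):
    """Collect the pixel values in the four size-by-size corner patches."""
    h, w = len(gray), len(gray[0])
    ys = sorted(set(range(min(size, h))) | set(range(max(h - size, 0), h)))
    xs = sorted(set(range(min(size, w))) | set(range(max(w - size, 0), w)))
    return [gray[y][x] for y in ys for x in xs]


def background_from_corners(gray, size=2, use_mode=False):
    """Estimate the background as the mean, or the most common value,
    of the corner pixels."""
    pixels = corner_pixels(gray, size)
    if use_mode:
        return Counter(pixels).most_common(1)[0][0]
    return sum(pixels) / len(pixels)


def background_is_light(background_value, preset=127):
    """Closer to white if above the preset pixel value, else closer to black."""
    return background_value > preset


gray = [
    [240, 250, 0, 0, 250, 250],
    [250, 250, 0, 0, 250, 250],
    [250, 250, 0, 0, 250, 250],
    [250, 250, 0, 0, 250, 250],
]
mean_bg = background_from_corners(gray)
mode_bg = background_from_corners(gray, use_mode=True)
```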

Step S230: if the pixel value of the background in the text gray scale map is larger than the segmentation pixel value, setting the pixel points whose pixel values are larger than the segmentation pixel value to a first pixel value, wherein the first pixel value is larger than or equal to the pixel value of the background; and if the pixel value of the background in the text gray scale map is smaller than the segmentation pixel value, setting the pixel points whose pixel values are smaller than the segmentation pixel value to a second pixel value, wherein the second pixel value is smaller than or equal to the pixel value of the background.

If the background is closer to white, the pixel value of the background is larger than the segmentation pixel value, and the pixel values of the background can be converted into a single pixel value. Specifically, the pixel points whose pixel values are greater than the segmentation pixel value may be set to a first pixel value that is greater than or equal to the pixel value of the background. Alternatively, in the embodiment of the present application, the background may be uniformly converted to white, that is, the first pixel value is 255.

If the background is closer to black, the pixel value of the background is smaller than the segmentation pixel value, and the pixel values of the background can be converted into a single pixel value. Specifically, the pixel points whose pixel values are smaller than the segmentation pixel value may be set to a second pixel value that is smaller than or equal to the pixel value of the background. Alternatively, in the embodiment of the present application, the background may be uniformly converted to black, that is, the second pixel value is 0.

Alternatively, in the embodiment of the present application, it may be determined whether the pixel point of the background should be set to the first pixel value or the second pixel value by acquiring the pixel value of the background in the text gray scale map.

Alternatively, in this embodiment, it is also possible to default which color the background is closer to, and set the pixel value corresponding to the closer color. If the default background is closer to white, the pixel point with the pixel value larger than the segmentation pixel value is directly set as the first pixel value.

Alternatively, if the actual color of the background does not match the default color condition, the pixel values in the text gray scale map may be flipped. For example, if the background in the default text gray scale map is closer to white, but the background in the actual text gray scale map is closer to black, the pixel value of each pixel point is converted into a difference value of 255 minus the current pixel value, so as to realize the inversion of black and white color.

In the embodiment of the application, the background comprises a space, and the pixel value of the background is converted into the same pixel value, so that the pixel value of the space is converted into the same pixel value.
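The background unification and black-and-white inversion described above can be sketched as follows. This is a minimal illustration, assuming the text gray scale map is held as a list of rows of integers in the range 0 to 255; the function names `unify_background` and `invert` and the flag `background_is_light` are hypothetical and do not come from the application.

```python
def unify_background(gray, split_value, background_is_light=True):
    """Return a copy of `gray` with every background pixel set to one value.

    `gray` is a list of rows of 0-255 integers (an assumed representation).
    `split_value` is the segmentation pixel value.
    """
    if background_is_light:
        # Pixels brighter than the segmentation value are treated as background
        # and set to the first pixel value, here 255 (white).
        return [[255 if p > split_value else p for p in row] for row in gray]
    # Pixels darker than the segmentation value are treated as background
    # and set to the second pixel value, here 0 (black).
    return [[0 if p < split_value else p for p in row] for row in gray]


def invert(gray):
    """Flip black and white: each pixel becomes 255 minus its current value."""
    return [[255 - p for p in row] for row in gray]
```

With a segmentation pixel value of 213, `unify_background(gray, 213)` leaves a pixel of exactly 213 unchanged, because only values strictly greater than the segmentation value are converted.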

Step S240: calculating the sum of pixel values of each row of pixel points in the preset direction in the text gray level diagram, wherein the preset direction is a direction perpendicular to the word arrangement direction in the single-row text.

Optionally, in the embodiment of the present application, a closing operation may be performed on the text gray scale map before the sums are calculated. Because the closing operation eliminates narrow breaks and long, thin gaps, removes small holes, and fills breaks in contour lines, it further reduces the noise in the space portions. The pixel values of the space portions are therefore purer, the sums of pixel values calculated for the space portions are less disturbed, and those sums are more concentrated.

Step S250: and taking a connected domain formed by pixel points corresponding to the sum of pixel values in a first pixel value interval as a space in the single-line text, wherein the first pixel value interval is an interval in which the sum of pixel values corresponding to the blank in the text gray level diagram is located.

In the text gray scale, the sum of the pixel values of each row of pixel points in the direction perpendicular to the text arrangement direction is calculated, i.e. the pixel values of each row of pixel points are summed. Since the pixel values of the spaces are unified, the sum of the pixel values obtained in the space portion is more concentrated in one section, and therefore, whether or not the sum of the obtained pixel values is the sum of the pixel values obtained in the space portion can be determined based on whether or not the sum of the calculated pixel values is in the section in which the sum of the pixel values corresponding to the spaces is located. And taking a connected domain formed by pixel points corresponding to the sum of pixel values in the first pixel value interval as a blank space in the single-line text.

Optionally, the connected domain formed by the pixel points corresponding to the sums of pixel values in the first pixel value interval may be determined as follows. Starting from one end of the text gray scale map in the text arrangement direction and proceeding toward the other end, it is detected in sequence whether each sum of pixel values is in the first pixel value interval. When a sum of pixel values within the first pixel value interval is detected, it is taken as the start of a connected domain, and detection continues in the same direction. When a sum of pixel values outside the first pixel value interval is detected, the preceding sum of pixel values is taken as the end of the connected domain, thereby determining one connected domain.
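The summation and the sequential scan described above can be sketched as follows. This is an illustrative sketch only, assuming horizontally arranged text (so each sum is taken down one column) and an image held as a list of rows of integers; `column_sums` and `find_runs` are hypothetical names.

```python
def column_sums(gray):
    """Sum the pixel values down each column of a list-of-rows image."""
    return [sum(col) for col in zip(*gray)]


def find_runs(sums, low, high):
    """Return (start, end) column pairs, end inclusive, for each maximal
    stretch of consecutive sums that lie within [low, high]."""
    runs = []
    start = None
    for i, s in enumerate(sums):
        inside = low <= s <= high
        if inside and start is None:
            start = i                      # first in-interval sum: a run begins
        elif not inside and start is not None:
            runs.append((start, i - 1))    # the previous column ended the run
            start = None
    if start is not None:
        runs.append((start, len(sums) - 1))  # run extends to the end of the line
    return runs
```

Each returned pair corresponds to one connected domain; its width in columns is `end - start + 1`.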

Optionally, since the text itself also contains gaps, such as the gaps between the letters of a word, the width of a space may be constrained in order to reduce misidentification of spaces. A width interval of the space is set, and, among the connected domains formed by the pixel points corresponding to the sums of pixel values in the first pixel value interval, those whose width falls within the width interval are taken as spaces in the single-line text. It is understood that the width of a space is its width in the text arrangement direction. For example, in a text gray scale map in which characters are arranged horizontally, the width of a space is its length in the horizontal direction.

In one embodiment, the width interval of the space may be set according to the character width.

In this embodiment, the character width in the text gray scale map may be acquired, and the character width may be a width in the text arrangement direction. And setting the width interval of the blank according to the character width.

In this embodiment, since a single-line text picture contains a plurality of characters, a plurality of character widths may be obtained. One character width may be determined from the obtained widths of the plurality of characters and used to set the width interval of the space.

Optionally, the width of each character in the text gray scale map may be obtained, and the median of the widths of all the characters obtained is taken as the character width used to set the width interval of the space.

Optionally, the width of each character in the text gray scale map may be obtained, and the average of the widths of all the characters obtained is taken as the character width used to set the width interval of the space.

The width of the character may be determined according to the range of the interval where the sum of the pixel values is located. Specifically, the width of each connected region formed by the pixel points corresponding to the sum of the pixel values in the second pixel value interval may be regarded as the width of a single character. That is, each connected domain formed by the pixel points corresponding to the sum of the pixel values in the second pixel value interval corresponds to one character, and the width of each connected domain is used as the width of the corresponding character.

The second pixel value interval is an interval in which the sum of pixel values corresponding to characters in the text gray level diagram is located, or a pixel value interval in which the sum of pixel values of each row of pixel points in the preset direction is located in an area which comprises the characters and does not comprise spaces. The sum of pixel values corresponding to the character is the sum of pixel values of a row of pixel points where the pixel points of the character are located, for example, in the pixel arrangement diagram shown in fig. 2, if the pixel points (I2, J2) are the pixel points of the character, the sum of pixel values of the I2 column is the sum of pixel values corresponding to the character in the text gray scale map. In addition, it is understood that the text and space have different pixel values, and the second pixel value interval is different from the first pixel value interval.

Optionally, when the width interval of the space is set according to the character width, since the width of a space is generally larger than one proportion of the character width and smaller than another proportion of the character width, the interval from the character width multiplied by the one proportion to the character width multiplied by the other proportion may be used as the width interval of the space. For example, if one proportion is one third and the other is one half, the width interval of the space is set to one third to one half of the character width. The specific proportions are not limited in the embodiment of the present application and may be set according to actual conditions.
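As a rough illustration of the median-based character width and the proportional width interval, the following sketch uses the example proportions of one third and one half given above. The function name `space_width_interval` and its parameter names are hypothetical, and the character widths are assumed to have been measured already.

```python
from statistics import median


def space_width_interval(char_widths, low_ratio=1 / 3, high_ratio=1 / 2):
    """Return (min_width, max_width) for a space.

    The interval is derived from the median of the measured character widths,
    which is less sensitive than the mean to one unusually wide character.
    """
    char_width = median(char_widths)
    return char_width * low_ratio, char_width * high_ratio
```

For measured widths of 10, 12, 14, 12 and 30 columns the median is 12, so the interval runs from about 4 to 6 columns; the outlier of 30 does not shift it.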

Optionally, characters of different font sizes have different space widths and also different character widths. In this embodiment, a correspondence between character widths and space width intervals may be set. After the character width in the text gray scale map is determined, the width interval of the space is determined according to this correspondence.

In one embodiment, since a space is wider than the gaps within the text itself and narrower than the text-free region that may be formed at the end of the text, the connected domain whose width is the median among the connected domains formed by the pixel points corresponding to the sums of pixel values in the first pixel value interval may be obtained. A certain proportion of this median width is then added to and subtracted from it to form the space width interval. For example, if the median-width connected domain has a width of 6 and the proportion is one third, that is, 2, the space width interval is 4 to 8.
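The arithmetic of this embodiment can be sketched as below. The sketch assumes that the connected domain with the central width means the one with the median width; `interval_from_median_width` is a hypothetical name.

```python
from statistics import median


def interval_from_median_width(run_widths, ratio=1 / 3):
    """Return (min_width, max_width) centered on the median run width.

    `run_widths` are the widths of all connected domains whose column sums
    fall in the first pixel value interval. Narrow letter gaps and the wide
    end-of-line margin sit at the extremes, so the median is taken as the
    typical space width (an interpretive assumption).
    """
    center = median(run_widths)
    delta = center * ratio
    return center - delta, center + delta
```

With run widths of 2, 6, 6, 7 and 30, the median is 6 and one third of it is 2, which reproduces the interval of 4 to 8 from the example.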

In one embodiment, the space width section may be preset according to a common space width.

In the embodiment of the present application, once the width interval of the space has been determined, the connected domains whose widths fall within the width interval, among the connected domains formed by the pixel points corresponding to the sums of pixel values in the first pixel value interval, are determined as spaces in the single-line text. Connected domains whose widths fall outside the width interval are not regarded as spaces in the text gray scale map.

Alternatively, in the text gray scale map, a connected region having a width larger than the width section may be determined as a blank region portion after the end of a single line of text.
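The width-based selection can be sketched as a three-way classification. This is an illustration only: the name `classify_runs` is hypothetical, runs are assumed to be `(start, end)` column pairs with an inclusive end, and treating every narrower run as an intra-word gap is an assumption drawn from the motivation given earlier.

```python
def classify_runs(runs, min_width, max_width):
    """Split candidate runs into (spaces, letter_gaps, margins) by width."""
    spaces, letter_gaps, margins = [], [], []
    for start, end in runs:
        width = end - start + 1
        if width < min_width:
            letter_gaps.append((start, end))   # too narrow: gap inside a word
        elif width > max_width:
            margins.append((start, end))       # too wide: blank area after the text
        else:
            spaces.append((start, end))        # within the width interval: a space
    return spaces, letter_gaps, margins
```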

In the embodiment of the present application, the first pixel value interval that is set may differ depending on whether, after the pixel values of the background are unified, the background pixel value is closer to white or closer to black.

In one embodiment, if the background is unified to black, that is, the pixel values of the background pixel points are unified to 0, then the pixel values of the spaces are also unified to 0, and the sum of a row of pixel points whose pixel values are all 0 is 0. Even if the space contains noise, few noise points remain after processing and each is represented by its gray value, so the sums of pixel values in the space portion are concentrated in a small interval and are essentially unaffected by factors such as the height of the text gray scale map and the font size. In this embodiment, for effective fault tolerance, the first pixel value interval may be set to a small range: a pixel value interval within the pixel values smaller than 127, such as 5 to 40, is set as the first pixel value interval.

In this embodiment, the pixel values of the pixel points of the text are greater than 0 and the maximum value after summation is uncertain, so the second pixel value interval is an interval unbounded on the right, that is, an interval greater than a minimum pixel value. For effective fault tolerance, the first pixel value interval and the second pixel value interval may intersect, with the minimum pixel value of the first pixel value interval smaller than the minimum pixel value of the second pixel value interval. For example, the first pixel value interval is set to 5 to 40, and the second pixel value interval is set to more than 10.

In one embodiment, if the pixel values of the background pixel points are unified to a non-zero pixel value, for example a value greater than a preset pixel value, the sum of pixel values of the space portion differs with the height of the text gray scale map in the preset direction, and the maximum possible sum is uncertain. That is, the range of the first pixel value interval is not fixed, and the first pixel value interval may be an interval unbounded on the right, that is, an interval greater than a minimum pixel value.

Alternatively, in this embodiment, different first pixel value sections may be set corresponding to different heights. And selecting a corresponding first pixel value interval according to the current actual height of the text gray scale map.

Alternatively, in this embodiment, a default first pixel value interval may be set, where the default first pixel value interval corresponds to a default height. The original text gray level map can be scaled to a default height and then the text gray level map at the default height is used as the text gray level map for determining the space position. After determining the space position in the text gray level diagram under the default height, determining the space position in the original text gray level diagram according to the proportional relation between the text gray level diagram and the original text gray level diagram under the default height. The method comprises the steps of obtaining the sum of pixel values of pixel points of each row in a preset direction for a text gray level diagram at a default height, obtaining a connected domain formed by pixel points corresponding to the sum of pixel values in a first pixel value interval, and determining a space according to the obtained connected domain.
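Mapping the space positions found at the default height back to the original text gray scale map can be sketched as follows. The sketch assumes the map was scaled uniformly, so that horizontal positions scale by the same factor as the height; the name `map_runs_to_original` is hypothetical, and the rounding rule is one possible choice rather than anything specified in the application.

```python
def map_runs_to_original(runs, default_height, original_height):
    """Convert (start, end) column ranges found in the scaled map back to
    column positions in the original map.

    Assumes uniform scaling, so columns scale by original_height / default_height.
    """
    return [
        (round(start * original_height / default_height),
         round(end * original_height / default_height))
        for start, end in runs
    ]
```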

Correspondingly, in this embodiment, since the text is near black, the pixel value is smaller, and the second pixel value interval may be a pixel value interval from one pixel value to another. However, since the text portion further includes a partial background color, the number of pixels is different for the text gray scale images of different heights, and the range of the second pixel value interval is also different.

Thus, alternatively, in this embodiment, different second pixel value sections may be set corresponding to different heights. And selecting a corresponding second pixel value interval according to the current actual height of the text gray scale map.

Alternatively, in this embodiment, a default second pixel value interval may be set, where the default second pixel value interval corresponds to a default height. The original text gray level map can be scaled to a default height, then the character width is calculated according to the text gray level map at the default height and the second pixel value interval, and the space width interval is determined according to the character width.

In this embodiment, for fault tolerance, the first pixel value section and the second pixel value section may intersect, and the minimum pixel value of the first pixel value section may be larger than the minimum pixel value of the second pixel value section.

In the embodiment of the present application, each processing operation on the text gray scale map may assume by default that the background is closer to white or that the background is closer to black. If the actual color of the background in the text gray scale map does not match the default, the text gray scale map may be color-inverted before being processed.

In the embodiment of the application, the pixel points of the background part are determined by dividing the pixel values, the pixel points of the background part are unified in color, the blank part belongs to the background, and the pixel values of the blank part are unified theoretically, so that when the sum of the pixel values is calculated, even if the sum of the pixel values is influenced by noise points, the sum of the pixel values obtained by the blank part is concentrated, the blank can be selected through a first pixel value interval, and the blank of a single line of text in the text gray level diagram is determined.

The embodiment of the application describes a space recognition method in the text line through a specific use scene.

The acquired text gray scale map is shown in fig. 4. The background part is closer to white and the text part is closer to black, but the map contains considerable noise. Optionally, image preprocessing such as median blurring and equalization may be performed on the text gray scale map, and the preprocessed map is then used as the text gray scale map for subsequent processing.

In the text gray scale map, a pixel value is taken as an abscissa, the number of pixel points is taken as an ordinate, a fitting curve between the pixel value and the number of pixel points corresponding to each pixel value is obtained, and the obtained fitting curve is shown as a curve L in fig. 5.

In the curve L, the largest of the local maxima is determined as the extreme point m1. Since the background in this text gray scale map is closer to white, the adjacent minimum to the left of this maximum is selected, that is, the first minimum point to the left of m1, shown as the minimum point m2 in fig. 5. The pixel value corresponding to m2 is taken as the segmentation pixel value. In the embodiment of the present application, the pixel value corresponding to the minimum point m2 in fig. 5 is 213 by way of example.
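The selection of the segmentation pixel value can be sketched as a walk along the curve. This is a simplified sketch under several assumptions: `hist[v]` holds the already fitted (smoothed) pixel count for pixel value `v`, the highest point of the curve is taken as m1, and for a dark background the walk goes to the right instead of the left, which is inferred by symmetry rather than stated in the example. The name `segmentation_value` is hypothetical.

```python
def segmentation_value(hist, background_is_light=True):
    """Return the pixel value of the minimum adjacent to the highest peak.

    For a light background the walk goes left from the peak (toward darker
    values); for a dark background it goes right (an assumption by symmetry).
    """
    peak = max(range(len(hist)), key=lambda v: hist[v])   # extreme point m1
    step = -1 if background_is_light else 1
    v = peak
    # Keep moving while the curve is still falling; stop at the first minimum.
    while 0 <= v + step < len(hist) and hist[v + step] < hist[v]:
        v += step
    return v                                              # minimum point m2
```

On a real 256-bin histogram some smoothing is needed first, since raw counts have many small local minima that would stop the walk too early.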

In this text gray scale map the background is closer to white. This may also be determined, for example, by taking the pixel value corresponding to the maximum point m1 as representative of the pixel value of the background. Pixel points in the text gray scale map whose pixel values are greater than the segmentation pixel value are therefore set to the first pixel value. In the embodiment of the present application, 255 is used as the first pixel value: pixel points with a pixel value greater than 213 in the text gray scale map are set to 255, and the resulting text gray scale map is shown in fig. 6.

To further remove noise, the text gray scale map shown in fig. 6 may be closed, and the obtained text gray scale map is shown in fig. 7. As can be seen from fig. 7, in the text gray scale after the closing operation, the noise is less, and the blank part is purer.
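A gray scale closing operation, that is, a dilation followed by an erosion, can be sketched in plain Python as below. The 3x3 square neighborhood and the handling of image borders (neighbors outside the image are ignored) are assumptions made for illustration; the application does not specify a structuring element. The names `_neighborhood_filter` and `close_gray` are hypothetical.

```python
def _neighborhood_filter(gray, pick):
    """Apply `pick` (max or min) over each pixel's 3x3 neighborhood.
    Neighbors that fall outside the image are ignored."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(pick(window))
        out.append(row)
    return out


def close_gray(gray):
    """Closing = dilation (neighborhood max) then erosion (neighborhood min).
    On a white background this removes dark specks smaller than the
    neighborhood while restoring larger dark strokes to their original extent."""
    dilated = _neighborhood_filter(gray, max)
    return _neighborhood_filter(dilated, min)
```

A single dark pixel on a white background disappears after `close_gray`, whereas a solid 3x3 dark block survives unchanged.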

In the embodiment of the present application, for more convenient calculation, the first pixel value interval and the second pixel value interval may be set for a black background, that is, a background whose pixel value is 0. Therefore, the text gray scale map shown in fig. 7 can be color-inverted, that is, the pixel value of each pixel point is set to 255 minus the current pixel value, so that all white parts with a pixel value of 255 in the text gray scale map are converted into black parts with a pixel value of 0. The resulting text gray scale map is shown in fig. 8.

The sum of pixel values of each row of pixel points in the preset direction is then calculated for fig. 8. Since the characters in this text gray scale map are arranged horizontally, the sum of pixel values of each column is calculated. The result may be as shown in fig. 9, where the ordinate represents the sum of pixel values and the abscissa represents the position of each column of pixel points in the character arrangement direction, that is, the column number. For example, the ordinate value at abscissa 100 in fig. 9 represents the sum of pixel values of the pixel points in the 100th column.

It is then determined whether each sum of pixel values is within the first pixel value interval. The embodiment of the present application takes 5 to 40 as the first pixel value interval as an example. In the statistical result shown in fig. 9, each ordinate value may be checked in order from left to right to confirm whether it is in the range of 5 to 40. The abscissas corresponding to consecutive ordinate values within the range are determined, and the pixel columns corresponding to those abscissas together form a connected domain formed by the pixel points corresponding to the sums of pixel values in the first pixel value interval.

In addition, a space width section may be set. For example, one third of the character width to one half of the character width is determined as the width section of the space. And the determination of the character width can be seen from the previous embodiments.

Based on the space width section, the connected domain within the space width section is determined as a space, and the connected domain not within the space width section is determined as a non-space. So that the space in the text line can be effectively determined by the space recognition method in the text line.

The space recognition method may be used in optical character recognition (OCR), after text has been detected and a single line of text has then been detected. The method extracts the space positions; the separated words are sent to an OCR module for word-by-word recognition, and the recognized words are then joined using the spaces, so that the final complete single line of text with spaces is obtained. This avoids output in which all the words run together without spaces and cannot be read by a human.
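How the detected spaces might feed an OCR stage can be sketched as follows. This is only an outline: `word_segments` and `read_line` are hypothetical names, and `recognize` stands in for whatever OCR module is used, taking a `(start, end)` column range and returning the recognized word.

```python
def word_segments(line_width, spaces):
    """Return the (start, end) column ranges of the words lying between the
    given spaces. `spaces` must be sorted (start, end) pairs, end inclusive."""
    segments = []
    cursor = 0
    for start, end in spaces:
        if start > cursor:
            segments.append((cursor, start - 1))  # columns before this space
        cursor = end + 1
    if cursor < line_width:
        segments.append((cursor, line_width - 1))  # trailing word, if any
    return segments


def read_line(line_width, spaces, recognize):
    """Recognize each word segment and join the results with single spaces."""
    return " ".join(recognize(seg) for seg in word_segments(line_width, spaces))
```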

The embodiment of the application also provides a space recognition device 300 in text lines. As shown in fig. 10, the device 300 includes: a picture obtaining module 310, configured to obtain a text gray scale map, where the text gray scale map includes only a single line of text; a pixel value obtaining module 320, configured to calculate a sum of pixel values of each row of pixel points in a preset direction in the text gray scale map, where the preset direction is a direction perpendicular to a word arrangement direction in the single-line text; and a space determining module 330, configured to take, as a space in the single-line text, a connected domain formed by pixel points corresponding to sums of pixel values in a first pixel value interval, where the first pixel value interval is the interval in which the sums of pixel values corresponding to spaces in the text gray scale map are located.

Optionally, the apparatus may further include a segmentation module, including a segmentation pixel value determining unit, configured to obtain a segmentation pixel value in the text gray scale map; a pixel value setting unit, configured to set, if a pixel value of a background in the text gray scale map is greater than the divided pixel value, a pixel point whose pixel value is greater than the divided pixel value as a first pixel value, where the first pixel value is greater than or equal to the pixel value of the background; and if the pixel value of the background in the text gray scale map is smaller than the segmentation pixel value, setting the pixel point with the pixel value smaller than the segmentation pixel value as a second pixel value, wherein the second pixel value is smaller than or equal to the pixel value of the background.

Optionally, the segmentation pixel value determining unit is configured to: obtain, with pixel values as the abscissa and the number of pixel points as the ordinate, a fitting curve between the pixel values and the number of pixel points corresponding to each pixel value; obtain the largest of all maxima in the fitting curve; obtain a minimum adjacent to that maximum; and take the pixel value corresponding to the obtained minimum as the segmentation pixel value.

Optionally, the device may further include a denoising module, configured to perform a closing operation on the text gray scale map.

Alternatively, the space determining module 330 may be configured to obtain a character width in the text gray scale map; setting a width interval of the blank according to the character width; and determining a connected domain with the width in the width interval in the connected domain formed by the pixel points corresponding to the sum of the pixel values in the first pixel value interval as a blank in the single-line text.

Alternatively, the space determining module 330 may be configured to obtain a width of each character in the text gray scale map; taking the median of the widths of all the acquired characters as the character width.

Alternatively, the space determining module 330 may be configured to use, as the width of a single character, the width of each connected domain formed by the pixel points corresponding to the sum of the pixel values in a second pixel value interval, where the second pixel value interval is different from the first pixel value interval, and the second pixel value interval is an interval where the sum of the pixel values corresponding to the characters in the text gray scale map is located.

Optionally, if the pixel value of the background in the text gray scale map is greater than a preset pixel value, the first pixel value interval and the second pixel value interval are intersected, and the minimum pixel value of the first pixel value interval is greater than the minimum pixel value of the second pixel value interval; if the pixel value of the background in the text gray scale map is smaller than or equal to the preset pixel value, the first pixel value interval and the second pixel value interval are intersected, and the minimum pixel value of the first pixel value interval is smaller than the minimum pixel value of the second pixel value interval.

Optionally, the device may further include an equalization module, configured to perform equalization processing on the text gray map.

According to the space recognition method and device in the text line, the optimal division points are intelligently found, and space positions in the text can be effectively extracted.

It will be apparent to those skilled in the art that, for convenience and brevity of description, the specific working processes of the devices and modules described above may refer to the corresponding processes in the foregoing method embodiments, which are not described herein again.

In several embodiments provided by the present application, the coupling of the modules to each other may be electrical, mechanical, or other.

In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules. The modules may be configured in different electronic devices or may be configured in the same electronic device, and embodiments of the present application are not limited.

Referring to fig. 11, a block diagram of an electronic device 500 according to an embodiment of the application is shown. The electronic device may include one or more processors 510 (only one shown), a memory 520, and one or more programs. Wherein the one or more programs are stored in the memory 520 and configured to be executed by the one or more processors 510. The one or more programs are executed by the processor for performing the methods described in the previous embodiments.

Processor 510 may include one or more processing cores. The processor 510 uses various interfaces and lines to connect the various portions of the electronic device 500, and performs the various functions of the electronic device 500 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 520 and invoking data stored in the memory 520. Optionally, the processor 510 may be implemented in hardware in at least one of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 510 may integrate one or a combination of several of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs and the like; the GPU is responsible for rendering and drawing display content; the modem handles wireless communications. It will be appreciated that the modem may also not be integrated into the processor 510 and may instead be implemented by a separate communication chip.

The memory 520 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Memory 520 may be used to store instructions, programs, code sets, or instruction sets. The memory 520 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function, instructions for implementing the various method embodiments described above, and the like. The stored data area may store data created by the electronic device in use, and the like.

Referring to fig. 12, a block diagram of a computer readable storage medium according to an embodiment of the present application is shown. The computer readable storage medium 700 has stored therein program code that can be invoked by a processor to perform the methods described in the method embodiments described above.

The computer readable storage medium 700 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer readable storage medium 700 comprises a non-transitory computer-readable storage medium. The computer readable storage medium 700 has memory space for program code 710 that performs any of the method steps described above. The program code can be read from or written to one or more computer program products. Program code 710 may, for example, be compressed in a suitable form.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present application and not to limit it. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will appreciate that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. A method of identifying spaces in a text line, the method comprising:

acquiring a text gray scale map, wherein the text gray scale map only comprises a single line of text;

calculating the sum of pixel values of each row of pixel points in a preset direction in the text gray level diagram, wherein the preset direction is a direction perpendicular to the word arrangement direction in the single-row text;

taking the width of each connected domain formed by the pixel points corresponding to the sum of the pixel values in a second pixel value interval as the width of a single character, wherein the second pixel value interval is different from the first pixel value interval, and the second pixel value interval is the interval where the sum of the pixel values corresponding to the characters in the text gray scale map is;

if the pixel value of the background in the text gray scale map is larger than a preset pixel value, the first pixel value interval is intersected with the second pixel value interval, and the minimum pixel value of the first pixel value interval is larger than the minimum pixel value of the second pixel value interval;

if the pixel value of the background in the text gray scale map is smaller than or equal to a preset pixel value, the first pixel value interval is intersected with the second pixel value interval, and the minimum pixel value of the first pixel value interval is smaller than the minimum pixel value of the second pixel value interval;

taking the median of the widths of all the acquired characters as the character width; setting a space width interval according to the character width, wherein the space width interval comprises the interval from the character width multiplied by one ratio to the character width multiplied by another ratio;

and determining, among the connected domains formed by the pixel points corresponding to the sums of pixel values that fall in the first pixel value interval, each connected domain whose width lies within the space width interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the sums of pixel values corresponding to the blanks in the text gray scale map are located.
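The steps of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes horizontal text on a bright background, so the preset direction is vertical and the sums are taken per column, and it replaces the two pixel value intervals with a single hypothetical threshold `blank_min` (sums at or above it count as blank, sums below it count as character). The ratios `low_ratio` and `high_ratio` are likewise illustrative values, not values taken from the claims.

```python
from statistics import median


def column_sums(gray):
    """Sum the pixel values of each column (perpendicular to horizontal text)."""
    return [sum(col) for col in zip(*gray)]


def runs(flags):
    """Return (start, width) for every maximal run of True values."""
    found, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            found.append((start, i - start))
            start = None
    if start is not None:
        found.append((start, len(flags) - start))
    return found


def find_spaces(gray, blank_min, low_ratio=0.5, high_ratio=2.0):
    """Return (start, width) of blank runs whose width looks like a space.

    blank_min, low_ratio and high_ratio are assumptions made for this sketch.
    """
    sums = column_sums(gray)
    is_blank = [s >= blank_min for s in sums]
    char_runs = runs([not b for b in is_blank])
    if not char_runs:
        return []
    # The median character width sets the scale for the space width interval.
    char_width = median(width for _, width in char_runs)
    lower, upper = low_ratio * char_width, high_ratio * char_width
    return [(s, w) for s, w in runs(is_blank) if lower <= w <= upper]
```

Because the interval is derived from the median character width, the narrow gaps between adjacent letters fall below the lower bound and wide margins fall above the upper bound, so only blank runs of roughly one character width are reported as spaces.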

2. The method according to claim 1, further comprising, before calculating a sum of pixel values of each row of pixels in the preset direction in the text gray scale map:

obtaining a segmentation pixel value in the text gray scale map;

if the pixel value of the background in the text gray scale map is larger than the segmentation pixel value, setting the pixel point with the pixel value larger than the segmentation pixel value as a first pixel value, wherein the first pixel value is larger than or equal to the pixel value of the background;

and if the pixel value of the background in the text gray scale map is smaller than the segmentation pixel value, setting the pixel point with the pixel value smaller than the segmentation pixel value as a second pixel value, wherein the second pixel value is smaller than or equal to the pixel value of the background.
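The background flattening of claim 2 might look like the sketch below. The function name and the default replacement values 255 and 0 are assumptions chosen to satisfy the conditions "larger than or equal to the pixel value of the background" and "smaller than or equal to the pixel value of the background" for 8-bit images; the claims do not fix these values.

```python
def flatten_background(gray, split_value, background_value,
                       first_value=255, second_value=0):
    """Overwrite pixels on the background side of split_value.

    first_value and second_value are hypothetical defaults for 8-bit images.
    """
    if background_value > split_value:
        # Bright background: everything brighter than the split becomes uniform.
        return [[first_value if p > split_value else p for p in row]
                for row in gray]
    if background_value < split_value:
        # Dark background: everything darker than the split becomes uniform.
        return [[second_value if p < split_value else p for p in row]
                for row in gray]
    # Background equals the split value: the claim defines no action here.
    return [row[:] for row in gray]
```

Flattening the background in this way removes small brightness variations, so the per-line sums of blank regions cluster more tightly.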

3. The method of claim 2, wherein obtaining the segmentation pixel value in the text gray scale map comprises:

taking pixel values as the abscissa and the number of pixel points as the ordinate, and obtaining a fitted curve relating each pixel value to the number of pixel points having that pixel value;

obtaining the largest of all the local maxima of the fitted curve;

acquiring a local minimum adjacent to that largest maximum;

and taking the pixel value corresponding to the acquired minimum as the segmentation pixel value.
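A rough sketch of the histogram-based selection in claim 3 follows. Several choices are assumptions: a moving average stands in for the fitted curve, the largest local maximum is taken to be the global maximum of that curve, and the caller chooses on which side of the peak to look for the adjacent minimum through the hypothetical `side` parameter, since the claim does not say which neighbour is meant.

```python
def histogram(gray, levels=256):
    """Count how many pixels take each pixel value."""
    counts = [0] * levels
    for row in gray:
        for p in row:
            counts[p] += 1
    return counts


def smooth(counts, radius=2):
    """Moving average, used here as a stand-in for the fitted curve."""
    n = len(counts)
    curve = []
    for i in range(n):
        window = counts[max(0, i - radius):min(n, i + radius + 1)]
        curve.append(sum(window) / len(window))
    return curve


def segmentation_value(curve, side="left"):
    """Walk downhill from the highest peak until the curve starts to rise.

    Flat stretches are crossed, and the walk stops at the end of the curve
    if no rise is found on the chosen side.
    """
    peak = max(range(len(curve)), key=curve.__getitem__)
    step = -1 if side == "left" else 1
    i = peak
    while 0 <= i + step < len(curve) and curve[i + step] <= curve[i]:
        i += step
    return i
```

For a bright background the highest peak lies at large pixel values, so searching to the left of the peak yields a split between the background and the darker text; for a dark background the search would go to the right.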

4. The method according to claim 1, further comprising, before calculating a sum of pixel values of each row of pixels in the preset direction in the text gray scale map:

and performing a closing operation on the text gray scale map.
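For illustration, a gray scale morphological closing (dilation followed by erosion) can be sketched as follows. The square window and its `radius` are assumptions; the claim does not specify a structuring element or the image polarity to which the closing is applied.

```python
def _window_op(gray, radius, op):
    """Apply op (max or min) over a square window clipped to the image."""
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [gray[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            row.append(op(values))
        out.append(row)
    return out


def dilate(gray, radius=1):
    """Gray scale dilation: each pixel becomes the maximum of its window."""
    return _window_op(gray, radius, max)


def erode(gray, radius=1):
    """Gray scale erosion: each pixel becomes the minimum of its window."""
    return _window_op(gray, radius, min)


def close_gray(gray, radius=1):
    """Closing = dilation followed by erosion with the same window."""
    return erode(dilate(gray, radius), radius)
```

Closing fills dark details narrower than the window while leaving larger structures in place, which is why it is commonly used to bridge small gaps before connected domains are measured.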

5. The method according to any one of claims 1 to 4, further comprising, before calculating a sum of pixel values of each row of pixels in a preset direction in the text gray scale map: performing equalization processing on the text gray scale map.
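The equalization of claim 5 is not detailed in the claims. One common reading is histogram equalization, sketched below under that assumption for a non-empty image with integer pixel values in the range 0 to `levels - 1`.

```python
def equalize(gray, levels=256):
    """Histogram equalization via the cumulative distribution of pixel values.

    Assumes a non-empty image; treating "equalization processing" as histogram
    equalization is an assumption of this sketch.
    """
    counts = [0] * levels
    for row in gray:
        for p in row:
            counts[p] += 1
    total = sum(counts)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is nothing to stretch.
        return [row[:] for row in gray]
    scale = (levels - 1) / (total - cdf_min)
    # Lookup table mapping each old pixel value to its equalized value.
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in gray]
```

Stretching the contrast in this way widens the difference between the sums for blank lines and the sums for character lines, which makes the two pixel value intervals easier to separate.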

6. A space recognition apparatus in a text line, the apparatus comprising:

the image acquisition module is used for acquiring a text gray scale map, wherein the text gray scale map comprises only a single line of text;

the pixel value acquisition module is used for calculating the sum of pixel values of each row of pixel points in a preset direction in the text gray scale map, wherein the preset direction is a direction perpendicular to the direction in which the characters are arranged in the single-line text;

the space determining module is used for taking the width of each connected domain formed by the pixel points corresponding to the sums of pixel values that fall in a second pixel value interval as the width of a single character, wherein the second pixel value interval is different from a first pixel value interval, and the second pixel value interval is the interval in which the sums of pixel values corresponding to the characters in the text gray scale map are located; if the pixel value of the background in the text gray scale map is larger than a preset pixel value, the first pixel value interval intersects the second pixel value interval, and the minimum pixel value of the first pixel value interval is larger than the minimum pixel value of the second pixel value interval; if the pixel value of the background in the text gray scale map is smaller than or equal to the preset pixel value, the first pixel value interval intersects the second pixel value interval, and the minimum pixel value of the first pixel value interval is smaller than the minimum pixel value of the second pixel value interval; taking the median of the widths of all the acquired characters as the character width; setting a space width interval according to the character width, wherein the space width interval comprises the interval from the character width multiplied by one ratio to the character width multiplied by another ratio; and determining, among the connected domains formed by the pixel points corresponding to the sums of pixel values that fall in the first pixel value interval, each connected domain whose width lies within the space width interval as a space in the single-line text, wherein the first pixel value interval is the interval in which the sums of pixel values corresponding to the blanks in the text gray scale map are located.

7. An electronic device, comprising:

one or more processors;

a memory;

one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method of any one of claims 1-5.

8. A computer readable storage medium, characterized in that the computer readable storage medium stores program code that can be invoked by a processor to perform the method according to any one of claims 1-5.