CN109871743B

CN109871743B - Text data positioning method and device, storage medium and terminal

Info

Publication number: CN109871743B
Application number: CN201811633052.7A
Authority: CN
Inventors: 刘泉; 吴洋; 杨宇; 陈晨; 魏世康; 田正中; 兰杰; 朱兴
Original assignee: Koubei Shanghai Information Technology Co Ltd
Current assignee: Koubei Shanghai Information Technology Co Ltd
Priority date: 2018-12-29
Filing date: 2018-12-29
Publication date: 2021-01-12
Anticipated expiration: 2038-12-29
Also published as: CN109871743A

Abstract

The invention discloses a text data positioning method and device, a storage medium and a terminal, and relates to the technical field of data processing. Its main aim is to solve the problem that, when text data whose typesetting floats up, down, left or right is positioned by requiring center points to lie on a straight line, the text data cannot be accurately positioned, which increases the error in recognizing the text data. The method mainly comprises: acquiring vertex coordinate data of reference text data, and configuring a boundary relaxation amount for the vertex coordinate data; judging, through the vertex coordinates configured with the boundary relaxation amount, whether target text data and the reference text data belong to one row and/or one column; and if they belong to one row and/or one column, determining the positioning of the reference text data as the positioning of the target text data.

Description

Text data positioning method and device, storage medium and terminal

Technical Field

The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for positioning text data, a storage medium, and a terminal.

Background

With the rapid development of the big data era, converting content recorded on paper into data recorded on a computer has become a convenient means of data processing. When content that contains both pictures and text is uploaded to a computer, the pictures and the text need to be uploaded or entered separately. For example, when a merchant enters menu data for food, the merchant uploads pictures of the food and types in text such as the name and price of each dish.

Currently, text data in a picture can be extracted by Optical Character Recognition (OCR) and then recognized by a specific algorithm. When recognizing text data, such algorithms need to analyze the layout information in the picture, and that analysis depends on the specific positioning of the text data. For example, in menu layout analysis, dish names and prices are analyzed row by row or column by column according to a specific algorithm, with reference to the row or column in which each dish name and price appears. However, text data typeset on a page is not laid out at precise row and column positions; it often floats up, down, left or right. When such text data is positioned by requiring center points to lie on a straight line, the recognition error increases and the text data cannot be accurately positioned, so text data that belongs to one row or one column is omitted or incompletely recognized because of the floating, which affects the accuracy of text data entry.

Disclosure of Invention

In view of the above, the present invention provides a method and an apparatus for locating text data, a storage medium, and a terminal. The main aim is to solve the problem that, when text data is positioned directly by requiring center points to lie on a straight line, the recognition error increases for text data whose typesetting floats up, down, left or right, and the text data cannot be accurately positioned, so that text data belonging to one row or one column is missed or incompletely recognized because of the floating.

According to an aspect of the present invention, there is provided a method for locating text data, including:

acquiring vertex coordinate data of reference text data, and configuring boundary relaxation amount for the vertex coordinate data, wherein the boundary relaxation amount is used for extending boundary values of rows and columns belonging to the reference text data in the vertex coordinate data;

judging whether the target text data and the reference text data belong to one line and/or one column or not through the vertex coordinates configured with the boundary relaxation amount;

and if the target text data belongs to one row and/or one column, determining the positioning of the reference text data as the positioning of the target text data.

Further, the acquiring vertex coordinate data of the reference text data and configuring a boundary relaxation amount for the vertex coordinate data includes:

selecting reference text data from all target text data, and extracting vertex coordinate data of the reference text data;

dividing row coordinate data and column coordinate data from the vertex coordinate data, and respectively configuring row boundary relaxation amount and column boundary relaxation amount for the row coordinate data and the column coordinate data.

Further, the determining whether the target text data and the reference text data belong to one row and/or one column by the vertex coordinates configured with the boundary slack amount includes:

judging whether the line coordinate data configured with the line slack contains the line coordinate data of the target text data; and/or,

and judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition, wherein the preset boundary inclusion condition is used for determining the inclusion relationship between the column coordinate data of the target text data and the column coordinate configured with the column relaxation variable.

Further, the determining whether the column coordinate data of the target text data meets a preset boundary inclusion condition includes:

judging whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data;

if the first area is larger than the second area and the line coordinate data configured with the line slack contains the line coordinate data of the target text data, judging whether a weight value between the first area and the second area is larger than a preset weight value;

and if the first area is smaller than the second area and the column coordinate data configured with the column slack is included in the column coordinate data of the target text data, updating the column coordinate data configured with the column slack according to the column coordinate data of the second area, and executing a step of judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition.

Further, after determining whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data, the method further includes:

if the first area is equal to the second area, comparing the column coordinate data configured with the column slack with the column coordinate data of the target text data, determining whether to update the column coordinate data configured with the column slack according to a comparison result, and judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition;

if the comparison results are different, updating the column coordinate data configured with the column relaxation amount according to the comparison results;

and if the comparison results are the same, determining the column coordinate data of the target text data to meet the preset boundary inclusion condition.

Further, after determining whether the weight value between the first area and the second area is greater than a preset weight value, the method further includes:

and if the weight value is less than or equal to the preset weight value, updating the column coordinate data configured with the column slack according to the column coordinate data of the target text data, and executing the step of judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition.

And if the weight value is greater than the preset weight value, determining that the column coordinate data of the target text data meets a preset boundary inclusion condition.

Further, the determining the location of the reference text data as the location of the target text data may include:

if the line coordinate data configured with the line slack includes the line coordinate data of the target text data, determining the line of the target text data as the line of the reference text data; and/or,

and if the column coordinate data of the target text data meets a preset boundary inclusion condition, determining the column of the target text data as the column of the reference text data.

Further, the method further comprises:

and if the line coordinate data configured with the line slack do not contain the line coordinate data of the target text data, and/or if the column coordinate data of the target text data does not meet the preset boundary containing condition, taking the target text data as the target text data of other reference text data, and executing the text data positioning method again.

Further, after the reference text data is selected from all the target text data and the vertex coordinate data of the reference text data is extracted, the method further includes:

and analyzing the vertex coordinate data by using a preset relaxation amount optimization algorithm to generate a row boundary relaxation amount and a column boundary relaxation amount which are matched with the vertex coordinate data.

According to an aspect of the present invention, there is provided a text data locating apparatus, including:

the configuration module is used for acquiring vertex coordinate data of reference text data and configuring boundary relaxation amount for the vertex coordinate data, wherein the boundary relaxation amount is used for extending boundary values of rows and columns belonging to the reference text data in the vertex coordinate data;

the judging module is used for judging whether the target text data and the reference text data belong to a row and/or a column through the vertex coordinates configured with the boundary relaxation amount;

and the determining module is used for determining the positioning of the reference text data as the positioning of the target text data if the reference text data belongs to one row and/or one column.

Further, the configuration module includes:

the selecting unit is used for selecting reference text data from all target text data and extracting vertex coordinate data of the reference text data;

and the configuration unit is used for dividing row coordinate data and column coordinate data from the vertex coordinate data and respectively configuring row boundary relaxation amount and column boundary relaxation amount for the row coordinate data and the column coordinate data.

Further, the judging module comprises:

a first judging unit, configured to judge whether the line coordinate data configured with the line slack amount includes line coordinate data of the target text data; and/or,

a second judging unit, configured to judge whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition, where the preset boundary inclusion condition is used to determine an inclusion relationship between the column coordinate data of the target text data and the column coordinate configured with the column slack variable.

Further, the second determination unit includes:

a first judging subunit, configured to judge whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data;

a second determining subunit, configured to determine whether a weight value between the first area and the second area is greater than a preset weight value if the first area is greater than the second area and the row coordinate data configured with the row slack includes row coordinate data of the target text data;

a first updating subunit, configured to update, if the first area is smaller than the second area and the column coordinate data with the column slack is included in the column coordinate data of the target text data, the column coordinate data with the column slack according to the column coordinate data of the second area, and perform a step of determining whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition.

Further, the second determination unit further includes:

a second updating subunit, configured to, if the first area is equal to the second area, compare the column coordinate data configured with the column slack with the column coordinate data of the target text data, determine whether to update the column coordinate data configured with the column slack according to a comparison result, and perform a step of determining whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition;

a third updating subunit, configured to update, according to the comparison result, the column coordinate data configured with the column slack amount if the comparison result is different;

and the determining subunit is used for determining the column coordinate data of the target text data as meeting a preset boundary inclusion condition if the comparison results are the same.

Further, the second determination unit further includes:

and the fourth updating subunit is configured to update the column coordinate data configured with the column slack according to the column coordinate data of the target text data if the weight value is less than or equal to the preset weight value, and perform a step of judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition.

A determining subunit, configured to determine, if the weight value is greater than the preset weight value, that the column coordinate data of the target text data satisfies a preset boundary inclusion condition.

Further, the determining module is specifically configured to determine, if the line coordinate data configured with the line slack includes line coordinate data of the target text data, a line to which the target text data belongs as a line to which the reference text data belongs; and/or,

the determining module is specifically further configured to determine the column of the target text data as the column of the reference text data if the column coordinate data of the target text data meets a preset boundary inclusion condition.

Further, the apparatus further comprises:

and the execution module is used for taking the target text data as the target text data of other reference text data and re-executing the text data positioning method if the line coordinate data configured with the line slack does not contain the line coordinate data of the target text data and/or if the column coordinate data of the target text data does not meet the preset boundary containing condition.

Further, the configuration module further comprises:

and the generating unit is used for analyzing the vertex coordinate data by utilizing a preset relaxation amount optimization algorithm and generating a row boundary relaxation amount and a column boundary relaxation amount which are matched with the vertex coordinate data.

According to another aspect of the present invention, a storage medium is provided, and the storage medium stores at least one executable instruction, which causes a processor to perform operations corresponding to the above method for locating text data.

According to still another aspect of the present invention, there is provided a terminal including: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the text data positioning method.

Through the above technical solutions, the embodiments of the present invention have at least the following advantages:

The invention provides a text data positioning method and device, a storage medium and a terminal. Compared with the existing approach of positioning text data by requiring center points to lie on a straight line, the embodiment of the invention configures a slack amount for the vertex coordinates of the reference text data, then traverses the target text data and judges, using the vertex coordinates configured with the slack amount, whether each target text data belongs to one row or one column with the reference text data. If so, the reference text data and the target text data are determined to be in that row or column. This achieves accurate positioning of the text data, avoids text data belonging to the row or column being omitted or left unrecognized, and thereby improves the completeness of text data entry.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a flowchart illustrating a method for locating text data according to an embodiment of the present invention;

FIG. 2 is a flow chart of another method for locating text data according to an embodiment of the present invention;

FIG. 3 is a diagram illustrating line location of text data according to an embodiment of the present invention;

FIG. 4 is a diagram illustrating column alignment of text data according to an embodiment of the present invention;

FIG. 5 is a diagram illustrating another example of column alignment of text data provided by an embodiment of the present invention;

FIG. 6 is a diagram illustrating a column location of text data according to another embodiment of the present invention;

FIG. 7 is a diagram illustrating column alignment of text data according to another embodiment of the present invention;

FIG. 8 is a block diagram of a line locator for text data according to an embodiment of the present invention;

FIG. 9 is a block diagram of another apparatus for line locating text data provided by an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

In order to accurately find out the line position of text data when identifying the text data in a picture and avoid missing the content of the text data when identifying the content of the text data in the prior art, an embodiment of the present invention provides a method for locating text data, as shown in fig. 1, the method includes:

101. Acquiring vertex coordinate data of the reference text data, and configuring a boundary relaxation amount for the vertex coordinate data.

The reference text data is any one of all the text data to be positioned and serves as the reference when the positions of other text data are determined. The text data is text in a picture that needs to be positioned and may include characters, numbers and the like. In the embodiment of the invention, all text data are extracted from the picture by optical character recognition (OCR); that is, positions containing text data are extracted from the layout of the picture, and the content at each position is captured. Each position is described by the vertex coordinates of the corresponding segment of text data, so that the complete text content at that position can be recognized.

In addition, the boundary slack amount is used to extend the boundary values of the row and the column to which the reference text data belongs in the vertex coordinate data, so the slack amounts include a row boundary slack amount for the upper and lower boundaries of the row and a column boundary slack amount for the left and right boundaries of the column. The specific size of the boundary slack amount is not particularly limited in the embodiments of the present invention. For example, the 4 vertices of the reference text data are pos = (lu, ru, ld, rd), with vertex coordinate data lu = (lu_x, lu_y), ru = (ru_x, ru_y), ld = (ld_x, ld_y), and rd = (rd_x, rd_y). With e0 and e1 defined as the boundary relaxation amounts, the vertex coordinate data configured with the boundary relaxation amounts are the row interval [lu_y - e0, ld_y + e1] and the column interval [lu_x - e0, ru_x + e1].
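The configuration of the relaxed intervals can be illustrated with a minimal Python sketch. The `TextBlock` container, the function name `relaxed_intervals` and the sample coordinates are hypothetical and chosen only for illustration; the two returned intervals follow the [lu_y - e0, ld_y + e1] and [lu_x - e0, ru_x + e1] forms given in the text.

```python
from typing import NamedTuple, Tuple

Point = Tuple[float, float]  # (x, y)


class TextBlock(NamedTuple):
    """Four vertices of one OCR text block (hypothetical container)."""
    lu: Point  # left-upper vertex
    ru: Point  # right-upper vertex
    ld: Point  # left-lower vertex
    rd: Point  # right-lower vertex


def relaxed_intervals(pos: TextBlock, e0: float, e1: float):
    """Return the row interval [lu_y - e0, ld_y + e1] and the
    column interval [lu_x - e0, ru_x + e1] of a reference block."""
    lu_x, lu_y = pos.lu
    ru_x, _ = pos.ru
    _, ld_y = pos.ld
    row = (lu_y - e0, ld_y + e1)
    col = (lu_x - e0, ru_x + e1)
    return row, col


# Illustrative reference block; the coordinates are made up.
ref = TextBlock(lu=(10, 20), ru=(90, 20), ld=(10, 40), rd=(90, 40))
row, col = relaxed_intervals(ref, e0=5, e1=5)
print(row)  # (15, 45)
print(col)  # (5, 95)
```

The sketch reads only lu, ru and ld because the intervals in the text are expressed with those three vertices; a skewed block might call for a different choice, which the embodiment leaves open.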

It should be noted that, in the embodiment of the present invention, the text data in a picture generally appears in rows and columns. Recognizing the text data through OCR yields a block structure defined by four vertex coordinates, and the position of that block structure is the position for which text data positioning is performed. At this stage OCR is used only to determine the position of the text data, not to recognize the specific text content.

For example, a menu picture records the name, price and feature description of a dish in one row: "braised pork", "39 yuan" and "specialty dish: made from premium pork". For the convenience of customers, these pieces of text data are placed in one row of the picture and may be displayed in different columns. For each of them, the OCR technology extracts 4 vertex coordinates, which form text block 1 for the "braised pork" text data, text block 2 for the "39 yuan" text data, and text block 3 for the "specialty dish: made from premium pork" text data. Text block 1, text block 2 and text block 3 are the text data that need row and column positioning.

102. Judging whether the target text data and the reference text data belong to one row and/or one column according to the vertex coordinates configured with the boundary relaxation amount.

Judging whether the data belong to one row and/or one column includes judging whether they belong to the same row and judging whether they belong to the same column. For example, the row interval [lu_y - e0, ld_y + e1] configured with the row boundary slack amount and the column interval [lu_x - e0, ru_x + e1] configured with the column boundary slack amount are used to determine whether the target text data is located in the same row or the same column as the text data corresponding to the vertex coordinates pos = (lu, ru, ld, rd).

It should be noted that, in general, the judgment for a row may determine whether the row coordinate data of the target text data lies within [lu_y - e0, ld_y + e1], and the judgment for a column may determine, according to [lu_x - e0, ru_x + e1], whether the column coordinate data of the target text data can be matched into one column with that of the reference text data; this is not limited in this embodiment of the present invention.

103. If the target text data and the reference text data belong to one row and/or one column, determining the positioning of the reference text data as the positioning of the target text data.

For the embodiment of the present invention, if the target text data and the reference text data belong to one row and/or one column, the row and/or column positioning of the reference text data is used as that of the target text data. By using the row width configured with the boundary slack amount as the row width of the entire row, the entire content is recognized when the content of the text data in the row is recognized; likewise, by using the column width configured with the boundary slack amount as the column width of the entire column, the entire content is recognized when the content of the text data in the column is recognized.

It should be noted that, in the embodiment of the present invention, the positioning of the reference text data includes row coordinates and column coordinates, and when the target text data and the reference text data belong to a row and/or a column, it needs to be further determined whether the target text data belongs to a row or a column, or belongs to the same row and the same column, so as to determine the coordinates of the target text data for identifying the content of the text data.

For example, if e0 and e1 are both 4, the row coordinate data of the reference text data configured with the boundary slack amount is [5-4, 3+4], that is, [1, 7]. If the row coordinate data of the target text data is [3, 6], it is contained in [1, 7], which indicates that the target text data and the reference text data belong to the same row, and the target text data is positioned as [5-4, 3+4].

The invention provides a text data positioning method. Compared with the existing approach of positioning text data by requiring center points to lie on a straight line, the embodiment of the invention configures a slack amount for the vertex coordinates of the reference text data, then traverses the target text data and judges, using the vertex coordinates configured with the slack amount, whether each target text data belongs to one row or one column with the reference text data. If the target text data belongs to one row and/or one column, the reference text data and the target text data are determined to be in that row or column. This achieves accurate positioning of the text data, avoids text data belonging to the row or column being omitted or left unrecognized, and thereby improves the completeness of text data entry.
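As a rough illustration of how steps 101 to 103 could fit together for rows, the following Python sketch groups text blocks by relaxed-interval containment. It is a sketch under stated assumptions rather than the patented implementation: the function name `group_into_rows`, the choice of always taking the first unassigned block as the reference, the fixed (non-updated) row interval and the sample data are all hypothetical.

```python
from typing import List, Tuple

# Row coordinate data of one text block: (lu_y, ld_y).
Row = Tuple[float, float]


def group_into_rows(blocks: List[Row], e0: float, e1: float) -> List[List[int]]:
    """Group block indices into rows by relaxed-interval containment."""
    remaining = list(range(len(blocks)))
    rows: List[List[int]] = []
    while remaining:
        # Step 101: take a reference block (assumption: first unassigned one)
        # and extend its row boundaries by the slack amounts.
        ref = remaining.pop(0)
        lu_y, ld_y = blocks[ref]
        lo, hi = lu_y - e0, ld_y + e1
        members, unmatched = [ref], []
        for i in remaining:
            n_lu_y, n_ld_y = blocks[i]
            # Step 102: is the target contained in the relaxed interval?
            if lo <= n_lu_y and n_ld_y <= hi:
                members.append(i)    # Step 103: same row as the reference
            else:
                unmatched.append(i)  # compared against another reference later
        rows.append(members)
        remaining = unmatched
    return rows


# Made-up row coordinates for five blocks on two visual rows.
blocks = [(10, 30), (13, 33), (8, 27), (60, 80), (62, 83)]
print(group_into_rows(blocks, e0=5, e1=5))  # [[0, 1, 2], [3, 4]]
```

Blocks that fail the containment test are kept and retried against a later reference, mirroring the re-execution of the method for target text data that does not belong to the current row.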

An embodiment of the present invention provides another method for positioning text data, as shown in fig. 2, the method includes:

201. Selecting reference text data from all the target text data, and extracting vertex coordinate data of the reference text data.

In order to determine by comparison whether the text data are located in one row or one column, the reference text data is selected from all the target text data; that is, any one text data in a row or column is selected as the reference text data for comparison, and the others are the target text data to be compared. As shown in FIGS. 3 to 7, the text data in the gray part is the reference text data and the others are the target text data. The reference text data may be the first item in a row or column, or any other item. If the first item is selected, all the target text data may be traversed backwards for comparison. If an item other than the first is selected, the target text data may be traversed backwards from the current reference text data for comparison, and may then also be traversed forwards; the embodiments of the present invention are not specifically limited in this respect. In addition, the vertex used as the boundary in the embodiment of the present invention may be chosen arbitrarily from the upper and lower points, the left and right points, or the central point, and is not particularly limited.

It should be noted that, in the embodiment of the present invention, vertex coordinate data of all target text data is recognized by an OCR technology, so that when the target text data is compared with the reference text data, the vertex coordinate data is used for comparison.

202. Dividing row coordinate data and column coordinate data from the vertex coordinate data, and respectively configuring row boundary relaxation amount and column boundary relaxation amount for the row coordinate data and the column coordinate data.

For the embodiment of the present invention, the vertex coordinate data of each text data includes 4 vertices, which together delimit the text content as a text block. For example, each text data is labeled by 4 vertices (lu, ru, ld, rd), with coordinates lu = (lu_x, lu_y), ru = (ru_x, ru_y), ld = (ld_x, ld_y), and rd = (rd_x, rd_y). In order to determine from the vertex coordinate data whether text data belong to one row or one column, the row coordinate data [lu_y, ld_y] and the column coordinate data [lu_x, ru_x] are separated out of the vertex coordinate data.

In order to accurately incorporate the target text data into the row or column to which the reference text data belongs, a row boundary slack amount and a column boundary slack amount are configured for the row coordinate data and the column coordinate data of the reference text data, respectively. The slack amounts e0 and e1 may be set directly to a fraction, such as one half or one third, of the distance between the row or column vertices, or may be chosen by an optimization algorithm, which is not specifically limited in the embodiment of the present invention. For example, the row coordinate data configured with the row boundary slack amount is [lu_y - e0, ld_y + e1], and the column coordinate data configured with the column boundary slack amount is [lu_x - e0, ru_x + e1].

Further, in order to accurately find the configured row boundary slack and column boundary slack, in this embodiment, step 202 may include: and analyzing the vertex coordinate data by using a preset relaxation amount optimization algorithm to generate a row boundary relaxation amount and a column boundary relaxation amount which are matched with the vertex coordinate data.

For the embodiment of the present invention, the preset relaxation amount optimization algorithm may be any intelligent optimization algorithm, such as a neural network algorithm, an ant colony algorithm, and the like. The value range of the boundary relaxation amount is set after the vertex coordinate data is analyzed, and the optimal e0 and e1 are then selected within that range by using the intelligent optimization algorithm, which is not specifically limited in the embodiment of the present invention. In the embodiment of the present invention, it is preferable that half of the average height of the text data is calculated as the boundary slack amount of the text data, with e0 equal to e1.
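The preferred choice described above (half of the average text height, with e0 equal to e1) can be sketched as follows. The function name `default_slack` and the sample heights are assumptions made for illustration only.

```python
def default_slack(heights):
    """Half of the average text height, used for both e0 and e1."""
    if not heights:
        raise ValueError("at least one text height is required")
    return 0.5 * sum(heights) / len(heights)


# Heights ld_y - lu_y of each text block (illustrative values).
heights = [20, 24, 16]
e0 = e1 = default_slack(heights)
print(e0)  # 10.0
```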

203a, judging whether the line coordinate data configured with the line slack amount contains the line coordinate data of the target text data.

In the embodiment of the present invention, whether the target text data and the reference text data are on the same line is determined by whether the line coordinate data of the target text data is contained in the line coordinate data configured with the line slack amount. After the vertex coordinate data of all the target text data is identified by the OCR technology in step 201, the line coordinate data of the target text data is defined as (n_lu_y, n_ld_y), and the line coordinate data configured with the line slack amount is [lu_y - e0, ld_y + e1]. In step 203a, the line coordinate data configured with the line slack amount contains the line coordinate data of the target text data when both conditions n_lu_y >= lu_y - e0 and n_ld_y <= ld_y + e1 are satisfied. As shown in fig. 3, len_y = ld_y - lu_y; if the target text data and the reference text data configured with the line slack amount are defined as one line, the slack-configured line coordinate data of the reference text data contains the line coordinate data of all such target text data. The inclusion relationship in the embodiment of the present invention includes the case where the line coordinate data is the same.
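The containment test of step 203a can be written as a short predicate. This is only a sketch: the function name `in_same_row` and the tuple layout are hypothetical, while the two inequalities follow the conditions stated above, including the case where the boundaries coincide.

```python
def in_same_row(ref_row, tgt_row, e0, e1):
    """True if (n_lu_y, n_ld_y) lies inside [lu_y - e0, ld_y + e1]."""
    lu_y, ld_y = ref_row
    n_lu_y, n_ld_y = tgt_row
    return n_lu_y >= lu_y - e0 and n_ld_y <= ld_y + e1


ref_row = (100, 120)  # illustrative reference block
print(in_same_row(ref_row, (95, 125), e0=10, e1=10))   # True: slightly shifted text
print(in_same_row(ref_row, (140, 160), e0=10, e1=10))  # False: clearly a lower row
```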

For the embodiment of the present invention, step 203b, which is parallel to step 203a, determines whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition.

For the embodiment of the present invention, the widths of the text data in the rows of one column differ, and one column may include one text data, or 2 or 3 text data. The reference text data in the embodiment of the present invention is selected arbitrarily, so it may be wide or narrow. When the reference text data is wide, the column coordinate data of the target text data is included in the column coordinate data in which the column slack amount is configured for the reference text data; when the reference text data is narrow, the column coordinate data in which the column slack amount is configured for the reference text data is included in the column coordinate data of the target text data. In both cases the reference text data and the target text data may be located in one column. Therefore, the preset boundary inclusion condition is used to determine the inclusion relationship between the column coordinate data of the target text data and the column coordinate data configured with the column slack amount. The preset boundary inclusion condition includes a condition when the column coordinate data in which the column slack is configured for the reference text data includes the column coordinate data of the target text data, and a condition when the column coordinate data of the target text data includes the column coordinate data in which the column slack is configured for the reference text data. The inclusion relationship of the embodiment of the present invention includes the case where the column coordinate data is the same.
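The two directions of inclusion described in this paragraph can be illustrated with a small sketch. It covers only the basic inclusion relationship, not the area and weight checks of the later steps, and the helper names `contains` and `mutually_contained` are assumptions.

```python
def contains(outer, inner):
    """True if the interval `inner` lies inside `outer` (equal bounds count)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def mutually_contained(ref_cols_with_slack, tgt_cols):
    """Either the wide reference contains the target, or the wide target
    contains the (slack-widened) narrow reference."""
    return contains(ref_cols_with_slack, tgt_cols) or contains(tgt_cols, ref_cols_with_slack)


print(mutually_contained((0, 120), (20, 80)))    # True: reference is wider
print(mutually_contained((40, 60), (20, 80)))    # True: target is wider
print(mutually_contained((0, 50), (40, 100)))    # False: only a partial overlap
```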

Further, in order to further describe a specific inclusion condition of the preset boundary inclusion condition when the preset boundary inclusion condition belongs to a column, step 203b in the embodiment of the present invention may specifically include: judging whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data; if the first area is larger than the second area and the line coordinate data configured with the line slack contains the line coordinate data of the target text data, judging whether a weight value between the first area and the second area is larger than a preset weight value; and if the first area is smaller than the second area and the column coordinate data configured with the column slack is included in the column coordinate data of the target text data, updating the column coordinate data configured with the column slack according to the column coordinate data of the second area, and executing a step of judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition.

As shown in fig. 4, the column coordinate data of the reference text data is [lu_x, ru_x], the corresponding region is len_x = ru_x - lu_x, and the coordinates configured with the column boundary slack amount are [lu_x - e0, ru_x + e1]. The column coordinate data of the target text data is (n_lu_x, n_ru_x), and the corresponding region is n_ru_x - n_lu_x. If ru_x - lu_x is greater than n_ru_x - n_lu_x, the conditions include: lu_x - e0 < n_lu_x and ru_x + e1 > n_ru_x. When the first area is larger than the second area and the line coordinate data configured with the line slack contains the line coordinate data of the target text data, the width of the target text data is smaller than the width of the reference text data configured with the boundary slack, and the target text data falls within the extent of the reference text data. Further, since a column may contain a plurality of target text data, in order to determine whether the plurality of target text data are in the same column, it is determined whether a weight value between the first area and the second area is greater than a preset weight value. The weight value is calculated as a ratio: if the first area is smaller than the second area, the first area or the second area is used as the denominator, and the difference value between the first area and the second area is used as the numerator.
If n_lu_x > lu_x and n_ru_x < ru_x, the condition is (ru_x - n_lu_x)/(ru_x - lu_x) > weight or (ru_x - n_lu_x)/(n_ru_x - n_lu_x) > weight; if lu_x > n_lu_x and lu_x < n_ru_x, the condition is (n_ru_x - lu_x)/(ru_x - lu_x) > weight or (n_ru_x - lu_x)/(n_ru_x - n_lu_x) > weight. Preferably, the preset weight satisfies 0.5 < weight < 1, for example weight = 0.8, which is not particularly limited in the embodiment of the present invention.
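The weight comparison can be sketched as follows, following the two listed cases literally. The function name `weight_satisfied` is hypothetical, and returning False for arrangements not covered by either case is an assumption of this sketch rather than something the embodiment states.

```python
def weight_satisfied(lu_x, ru_x, n_lu_x, n_ru_x, weight=0.8):
    """Compare the two ratios from the embodiment against the preset weight."""
    ref_len = ru_x - lu_x        # first area (reference width)
    tgt_len = n_ru_x - n_lu_x    # second area (target width)
    if ref_len <= 0 or tgt_len <= 0:
        raise ValueError("both intervals must have positive width")
    if n_lu_x > lu_x and n_ru_x < ru_x:
        numerator = ru_x - n_lu_x
    elif lu_x > n_lu_x and lu_x < n_ru_x:
        numerator = n_ru_x - lu_x
    else:
        return False  # assumption: cases not listed in the text are rejected
    return numerator / ref_len > weight or numerator / tgt_len > weight


print(weight_satisfied(0, 100, 10, 90))    # True: 90/100 exceeds 0.8
print(weight_satisfied(0, 100, -50, 20))   # False: both ratios are below 0.8
```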

In addition, if the first area is smaller than the second area and the column coordinate data configured with the column slack is included in the column coordinate data of the target text data, the width of the target text data is larger than the width of the reference text data configured with the column boundary slack. Therefore, it is necessary to widen the column coordinate data of the current reference text data: the column coordinate data configured with the column slack is updated according to the column coordinate data of the second area, that is, the column coordinate data of the target text data becomes the column coordinate data configured with the column slack. Further, in order to ensure that the column can be extended, the update may follow a specific update rule. For example, as shown in fig. 5, the column coordinate data of the reference text data is [l_b, r_b] with len_x = r_b - l_b, and the column coordinates of the target text data are [n_lu_x, n_ru_x]. If the area formed by the column coordinate data of the target text data is larger than the area formed by the column coordinate data of the reference text data, the excess portion is updated: when n_lu_x < l_b and n_ru_x > r_b, the column coordinates of the reference text data become [n_lu_x, n_ru_x]. Specifically, if n_lu_x < l_b, then l_b <- n_lu_x; if n_ru_x > r_b, then r_b <- n_ru_x.
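The update rule at the end of this paragraph amounts to widening each side of the column only where the target exceeds it. A minimal sketch, with the hypothetical function name `widen_bounds`:

```python
def widen_bounds(l_b, r_b, n_lu_x, n_ru_x):
    """Apply l_b <- n_lu_x and r_b <- n_ru_x where the target exceeds the bounds."""
    if n_lu_x < l_b:
        l_b = n_lu_x
    if n_ru_x > r_b:
        r_b = n_ru_x
    return l_b, r_b


print(widen_bounds(20, 60, 10, 80))  # (10, 80): both sides exceeded
print(widen_bounds(20, 60, 30, 80))  # (20, 80): only the right side exceeded
```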

For the embodiment of the present invention, in order to further describe a specific inclusion condition of the preset boundary inclusion condition when the preset boundary inclusion condition belongs to one column, after the determining whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data, the method further includes: if the first area is equal to the second area, comparing the column coordinate data configured with the column slack with the column coordinate data of the target text data, determining whether to update the column coordinate data configured with the column slack according to a comparison result, and judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition; if the comparison results are different, updating the column coordinate data configured with the column relaxation amount according to the comparison results; and if the comparison results are the same, determining the column coordinate data of the target text data to meet the preset boundary inclusion condition.

For the embodiment of the present invention, when the first area and the second area are equal, the widths of the reference text data and the target text data are the same. To determine whether the two belong to one column, it is necessary to check whether the target text data and the reference text data are arranged in a crossed (offset) manner and, if so, whether they still belong to the same column. Therefore, the column coordinate data in which the column slack is configured is compared with the column coordinate data of the target text data, and whether or not to update the column coordinate data in which the column slack is configured is determined based on the comparison result.

When the comparison results are different, the column coordinate data of the target text data lies to the left or to the right of the column coordinate data configured with the column slack amount. In order to avoid treating text data of other columns as belonging to the current column, a comparison threshold is configured for the column coordinate data extending to the left or right. For example, if the column coordinate data configured with the column slack amount is [l_b - e0, r_b + e1] and the column coordinates of the target text data are [n_lu_x, n_ru_x], then n_lu_x - (l_b - e0) > 0 or n_ru_x - (r_b + e1) > 0 indicates that the target text data is on the right side of the reference text data, and n_lu_x - (l_b - e0) < 0 or n_ru_x - (r_b + e1) < 0 indicates that the target text data is on the left side of the reference text data. Whether to update the boundary value is determined by whether |n_lu_x - (l_b - e0)| / (n_ru_x - n_lu_x) is greater than or equal to a threshold value; if so, the intersection part is large and the two can be placed in one column. Preferably, the threshold value may be one half or one third, which is not particularly limited in the embodiment of the present invention. Updating the column coordinate data configured with the column slack means that the left column coordinate is updated to the smaller of n_lu_x and l_b - e0, and the right column coordinate is updated to the larger of n_ru_x and r_b + e1, as shown in fig. 6, so that the column boundary is enlarged. In addition, when the comparison results are the same, the target text data and the reference text data belong to the same column.
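The threshold test and the enlargement rule can be sketched as below. The sketch follows the wording of the paragraph literally, reads the offset as n_lu_x - (l_b - e0), and uses the hypothetical function name `maybe_enlarge`; it leaves the bounds unchanged when the ratio is below the threshold, which is an assumption about the case the text does not spell out.

```python
def maybe_enlarge(l_b, r_b, e0, e1, n_lu_x, n_ru_x, threshold=0.5):
    """Return the column bounds after the threshold-controlled update."""
    left, right = l_b - e0, r_b + e1              # bounds configured with slack
    ratio = abs(n_lu_x - left) / (n_ru_x - n_lu_x)
    if ratio >= threshold:
        # Left becomes the smaller value, right becomes the larger value.
        return min(n_lu_x, left), max(n_ru_x, right)
    return left, right


# Reference [100, 200] with slack 10 on each side gives [90, 210].
print(maybe_enlarge(100, 200, 10, 10, 40, 140))  # (40, 210): ratio 0.5 meets the threshold
print(maybe_enlarge(100, 200, 10, 10, 80, 180))  # (90, 210): ratio 0.1, no update
```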

For the embodiment of the present invention, in order to further describe a specific inclusion condition of the preset boundary inclusion condition when the text data belongs to one column, after the determining whether a weight value between the first area and the second area is greater than a preset weight value, the method further includes: if the weight value is less than or equal to the preset weight value, updating the column coordinate data configured with the column slack according to the column coordinate data of the target text data, and executing the step of judging whether the column coordinate data of the target text data meets the preset boundary inclusion condition; and if the weight value is greater than the preset weight value, determining that the column coordinate data of the target text data meets the preset boundary inclusion condition.

For the embodiment of the present invention, if the weight value is less than or equal to the preset weight value, the overlapping portion of the first area and the second area is small, and the boundary of the column needs to be determined again. That is, updating the column coordinate data configured with the column slack amount means updating the left column coordinate to the smaller of n_lu_x and l_b - e0 and the right column coordinate to the larger of n_ru_x and r_b + e1, as shown in fig. 7; the determination is then performed again by using the updated column coordinate data, which is not specifically limited in the embodiment of the present invention. In addition, if the weight value is greater than the preset weight value, the overlapping portion of the first area and the second area is large and the two belong to one column, so it may be determined that the column coordinate data of the target text data satisfies the preset boundary inclusion condition.

For the embodiment of the present invention, in step 204a after step 203a, if the line coordinate data configured with the line slack amount includes the line coordinate data of the target text data, the line to which the target text data belongs is determined as the line to which the reference text data belongs.

For the embodiment of the present invention, the line coordinate data configured with the line slack amount includes the line coordinate data of the target text data, that is, the line boundary of the line coordinate data configured with the line slack amount may include the line boundary of the line coordinate data of the target text data, and as shown in fig. 3, the target text data and the reference text data are determined as one line.

In the embodiment of the present invention, in step 204b, which is parallel to step 204a, if the line coordinate data with the line slack does not include the line coordinate data of the target text data, the target text data is used as the target text data of other reference text data, and the text data positioning method is executed again.

For the embodiment of the present invention, if the line coordinate data configured with the line slack does not include the line coordinate data of the target text data, the line boundary of the target text data exceeds the line boundary of the line coordinate data configured with the slack, so the two cannot be regarded as one line; the current target text data can then be positioned again as the target text data of the reference text data of another line.
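Steps 203a, 204a and 204b together describe a traversal in which unmatched target text data is retried against another reference. One possible sketch is given below; the function name `group_rows`, the index-based output and the choice of the first remaining block as the reference are all assumptions, since the embodiment only states that the reference may be any text data.

```python
def group_rows(rows, e0, e1):
    """Group text blocks into lines.

    rows: list of (lu_y, ld_y) per text block.
    Returns a list of groups, each a list of indices into `rows`.
    """
    remaining = list(range(len(rows)))
    groups = []
    while remaining:
        ref = remaining.pop(0)                   # any block may act as the reference
        low, high = rows[ref][0] - e0, rows[ref][1] + e1
        group, rest = [ref], []
        for i in remaining:
            top, bottom = rows[i]
            if low <= top and bottom <= high:    # step 204a: same line as the reference
                group.append(i)
            else:                                # step 204b: retry with another reference
                rest.append(i)
        groups.append(group)
        remaining = rest
    return groups


rows = [(100, 120), (103, 123), (200, 220), (98, 118), (205, 222)]
print(group_rows(rows, e0=10, e1=10))  # [[0, 1, 3], [2, 4]]
```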

For the embodiment of the present invention, in step 204c after step 203b, if the column coordinate data of the target text data meets the preset boundary inclusion condition, the column of the target text data is determined as the column to which the reference text data belongs.

For the embodiment of the present invention, the column coordinate data of the target text data satisfies the preset boundary inclusion condition, that is, the column coordinate data configured with the column slack amount may be included in the column coordinate data of all the target text data, as shown in fig. 4 to 7, the target text data and the reference text data are determined as one column.

For the embodiment of the present invention, in step 204d, which is parallel to step 204c, if the column coordinate data of the target text data does not satisfy the preset boundary inclusion condition, the target text data is used as the target text data of other reference text data, and the text data positioning method is executed again.

For the embodiment of the present invention, if the column coordinate data of the target text data does not satisfy the preset boundary inclusion condition, the column boundary of the target text data exceeds the permitted range around the column boundary of the column coordinate data configured with the slack amount, so the two cannot be regarded as one column; the current target text data can then be positioned again as the target text data of the reference text data of another column.

The embodiment of the invention configures the relaxation amount by using the vertex coordinate of the reference text data, then judges whether each target text data can be determined to be a row or a column with the reference text data by traversing the vertex coordinate configured with the relaxation amount, and determines the reference text data and the target text data to be a row or a column if the target text data belong to the row and/or the column, thereby realizing the accurate positioning of the text data, avoiding the omission or the unrecognizable text data in the row or the column, and improving the integrity of the text data entry.

Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides a device for locating text data, where as shown in fig. 8, the device includes: a configuration module 31, a judgment module 32 and a determination module 33.

A configuration module 31, configured to obtain vertex coordinate data of reference text data, and configure a boundary slack amount for the vertex coordinate data, where the boundary slack amount is used to extend a boundary value of a row and a column of the reference text data in the vertex coordinate data. The configuration module 31 is the program module by which the text data positioning device acquires the vertex coordinate data of the reference text data and configures the boundary slack amount for the vertex coordinate data.

The judging module 32 is configured to judge whether the target text data and the reference text data belong to one row and/or one column through the vertex coordinates configured with the boundary slack. The judging module 32 is the program module by which the text data positioning device judges, using the vertex coordinates configured with the boundary slack amount, whether the target text data and the reference text data belong to one row and/or one column.

A determining module 33, configured to determine the location of the reference text data as the location of the target text data if the target text data and the reference text data belong to one row and/or one column. The determining module 33 is the program module by which the text data positioning device determines the positioning of the reference text data as the positioning of the target text data if the target text data conforms to a preset row and column definition rule.

The invention provides a text data positioning device, compared with the existing text data positioning method that a central point is on a straight line, the embodiment of the invention configures the slack quantity by using the vertex coordinate of the reference text data, then traverses and judges whether each target text data can be determined as a line or a column with the reference text data or not by using the vertex coordinate configured with the slack quantity, and if the target text data belongs to a line and/or a column, the reference text data and the target text data are determined as a line or a column, thereby realizing the accurate positioning of the text data, avoiding the text data belonging to the line or the column from being omitted or unidentified, and further improving the integrity of the text data entry.

Further, as an implementation of the method shown in fig. 2, an embodiment of the present invention provides another text data positioning apparatus, as shown in fig. 9, where the apparatus includes: a configuration module 41, a judgment module 42, a determination module 43, and an execution module 44.

A configuration module 41, configured to obtain vertex coordinate data of reference text data, and configure a boundary slack amount for the vertex coordinate data, where the boundary slack amount is used to extend a boundary value of a row and a column of the reference text data in the vertex coordinate data;

the judging module 42 is configured to judge whether the target text data and the reference text data belong to a row and/or a column by using the vertex coordinates configured with the boundary slack;

a determining module 43, configured to determine the location of the reference text data as the location of the target text data if the reference text data belongs to a row and/or a column.

Further, the configuration module 41 includes:

a selecting unit 4101 configured to select reference text data from all target text data, and extract vertex coordinate data of the reference text data;

an allocating unit 4102, configured to divide row coordinate data and column coordinate data from the vertex coordinate data, and allocate a row boundary slack amount and a column boundary slack amount for the row coordinate data and the column coordinate data, respectively.

Further, the determining module 42 includes:

a first determining unit 4201, configured to determine whether the line coordinate data configured with the line slack amount includes line coordinate data of the target text data; and/or,

a second determining unit 4202, configured to determine whether the column coordinate data of the target text data meets a preset boundary inclusion condition, where the preset boundary inclusion condition is used to determine an inclusion relationship between the column coordinate data of the target text data and the column coordinate where the column slack variable is configured.

Further, the second determination unit 4202 includes:

a first judgment subunit 420201, configured to judge whether or not a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data;

a second determining subunit 420202, configured to determine whether a weight value between the first area and the second area is greater than a preset weight value if the first area is greater than the second area and the line coordinate data configured with the line slack includes the line coordinate data of the target text data;

a first updating subunit 420203, configured to, if the first area is smaller than the second area and the column coordinate data with the column slack is included in the column coordinate data of the target text data, update the column coordinate data with the column slack according to the column coordinate data of the second area, and perform a step of determining whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition.

Further, the second determining unit 4202 further includes:

a second updating subunit 420204, configured to, if the first area is equal to the second area, compare the column coordinate data with the column slack amount with the column coordinate data of the target text data, determine whether to update the column coordinate data with the column slack amount according to a comparison result, and perform a step of determining whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition;

a third updating subunit 420205, configured to update the column coordinate data in which the column slack is arranged, based on the comparison result, if the comparison result is different;

a determining subunit 420206, configured to determine that the column coordinate data of the target text data satisfies a preset boundary inclusion condition if the comparison results are the same.

Further, the second determining unit 4202 further includes:

a fourth updating subunit 420207, configured to, if the weight value is less than or equal to the preset weight value, update the column coordinate data configured with the column slack according to the column coordinate data of the target text data, and perform a step of determining whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition.

a determining subunit 420208, configured to determine, if the weight value is greater than the preset weight value, that the column coordinate data of the target text data satisfies the preset boundary inclusion condition.

Further, the determining module 43 is specifically configured to determine, if the line coordinate data configured with the line slack amount includes the line coordinate data of the target text data, the line to which the target text data belongs as the line to which the reference text data belongs; and/or,

the determining module 43 is further configured to determine the column of the target text data as the column of the reference text data if the column coordinate data of the target text data meets a preset boundary inclusion condition.

Further, the apparatus further comprises:

an executing module 44, configured to, if the row coordinate data with the configured row slack does not include the row coordinate data of the target text data, and/or if the column coordinate data of the target text data does not satisfy a preset boundary inclusion condition, take the target text data as target text data of other reference text data, and re-execute the text data positioning method.

Further, the configuration module 41 further includes:

a generating unit 4103, configured to analyze the vertex coordinate data by using a preset slack optimization algorithm, and generate a row boundary slack and a column boundary slack that match the vertex coordinate data.

The embodiment of the invention configures the relaxation amount by using the vertex coordinate of the reference text data, then judges whether each target text data can be determined as a line or a column with the reference text data by traversing the vertex coordinate configured with the relaxation amount, and determines the reference text data and the target text data as a line or a column if the target text data belong to a line and/or a column, thereby realizing the accurate positioning of the text data, avoiding the omission or unrecognizable text data belonging to a line or a column, and improving the integrity of the text data entry.

According to an embodiment of the present invention, a storage medium is provided, where at least one executable instruction is stored, and the computer executable instruction can execute the method for locating text data in any of the above method embodiments.

Fig. 10 is a schematic structural diagram of a terminal according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the terminal.

As shown in fig. 10, the terminal may include: a processor 502, a communications interface 504, a memory 506, and a communication bus 508.

Wherein: the processor 502, communication interface 504, and memory 506 communicate with one another via a communication bus 508.

A communication interface 504 for communicating with network elements of other devices, such as clients or other servers.

The processor 502 is configured to execute the program 510, and may specifically execute relevant steps in the above embodiment of the text data positioning method.

In particular, program 510 may include program code that includes computer operating instructions.

The processor 502 may be a central processing unit CPU, or an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement an embodiment of the present invention. The terminal comprises one or more processors, which can be the same type of processor, such as one or more CPUs; or may be different types of processors such as one or more CPUs and one or more ASICs.

And a memory 506 for storing a program 510. The memory 506 may comprise high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.

The program 510 may specifically be used to cause the processor 502 to perform the following operations:

judging whether the target text data accords with a preset row and column definition rule or not through the vertex coordinates configured with the boundary slack quantity, wherein the preset row and column definition rule is a rule for determining whether the target text data and the reference text data belong to one row and/or one column;

and if the target text data accords with a preset row and column definition rule, determining the positioning of the reference text data as the positioning of the target text data.

It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for locating text data, comprising:

acquiring vertex coordinate data of reference text data, and configuring a boundary relaxation amount for the vertex coordinate data, wherein the boundary relaxation amount is used for extending a boundary value of a row and a column which belong to the reference text data in the vertex coordinate data, and the reference text data is any text data in text data to be positioned;

judging, through the vertex coordinates configured with the boundary relaxation amount, whether target text data and the reference text data belong to one row or one column, comprising: judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition, specifically comprising: judging whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data; if the first area is larger than the second area, and the row coordinate data configured with the row slack amount includes the row coordinate data of the target text data, judging whether a weight value between the first area and the second area is larger than a preset weight value; if the first area is smaller than the second area and the column coordinate data configured with the column slack amount is included in the column coordinate data of the target text data, updating the column coordinate data configured with the column slack amount according to the column coordinate data of the second area, and executing the step of judging whether the column coordinate data of the target text data meets the preset boundary inclusion condition;

and if the target text data belongs to one row or one column, determining the positioning of the reference text data as the positioning of the target text data.

2. The method of claim 1, wherein obtaining vertex coordinate data of reference text data and configuring a boundary relaxation amount for the vertex coordinate data comprises:

3. The method according to claim 2, wherein the preset boundary inclusion condition is used to determine an inclusion relationship between the column coordinate data of the target text data and the column coordinate data configured with the column slack amount;

the judging whether the target text data and the reference text data belong to one row or one column through the vertex coordinates configured with the boundary relaxation amount comprises:

and judging whether the row coordinate data configured with the row slack amount includes the row coordinate data of the target text data.

4. The method according to claim 1, wherein after determining whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data, the method further comprises:

5. The method according to claim 1, wherein after determining whether the weight value between the first area and the second area is greater than a preset weight value, the method further comprises:

if the weight value is smaller than or equal to the preset weight value, updating the column coordinate data configured with the column slack according to the column coordinate data of the target text data, and executing a step of judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition;

6. The method according to claim 3, wherein the determining the positioning of the reference text data as the positioning of the target text data if the target text data belongs to one row or one column comprises:

if the row coordinate data configured with the row slack amount includes the row coordinate data of the target text data, determining the row of the target text data as the row of the reference text data; or

7. The method of claim 6, further comprising:

and if the row coordinate data configured with the row slack amount does not include the row coordinate data of the target text data, or if the column coordinate data of the target text data does not meet the preset boundary inclusion condition, taking the target text data as the target text data of another reference text data, and executing the text data positioning method again.

8. The method according to claim 2, wherein after the reference text data is selected from all the target text data and the vertex coordinate data of the reference text data is extracted, the method further comprises:

9. A device for locating text data, comprising:

the configuration module is used for acquiring vertex coordinate data of reference text data and configuring boundary relaxation quantity for the vertex coordinate data, wherein the boundary relaxation quantity is used for extending boundary values of rows and columns of the reference text data in the vertex coordinate data, and the reference text data is any text data in text data to be positioned;

the judging module is used for judging, through the vertex coordinates configured with the boundary relaxation quantity, whether target text data and the reference text data belong to one row or one column, including: judging whether the column coordinate data of the target text data meets a preset boundary inclusion condition, specifically including: judging whether a first area formed by the column coordinate data of the reference text data is larger than a second area formed by the column coordinate data of the target text data; if the first area is larger than the second area, and the row coordinate data configured with the row slack amount includes the row coordinate data of the target text data, judging whether a weight value between the first area and the second area is larger than a preset weight value; if the first area is smaller than the second area and the column coordinate data configured with the column slack amount is included in the column coordinate data of the target text data, updating the column coordinate data configured with the column slack amount according to the column coordinate data of the second area, and executing the step of judging whether the column coordinate data of the target text data meets the preset boundary inclusion condition;

and the determining module is used for determining the positioning of the reference text data as the positioning of the target text data if the target text data belongs to one row or one column.

10. The apparatus of claim 9, wherein the configuration module comprises:

11. The apparatus according to claim 10, wherein the preset boundary inclusion condition is used to determine an inclusion relationship between the column coordinate data of the target text data and the column coordinate data configured with the column slack amount; the judging module comprises:

and a first judging unit, used for judging whether the row coordinate data configured with the row slack amount includes the row coordinate data of the target text data.

12. The apparatus of claim 9, wherein the determining module further comprises: a second determination unit, further comprising:

13. The apparatus of claim 9, wherein the determining module further comprises: a second determination unit, further comprising:

a fourth updating subunit, configured to update, if the weight value is less than or equal to the preset weight value, the column coordinate data configured with the column slack according to the column coordinate data of the target text data, and perform a step of determining whether the column coordinate data of the target text data satisfies a preset boundary inclusion condition;

14. The apparatus of claim 11,

the determining module is specifically configured to determine, if the row coordinate data configured with the row slack amount includes the row coordinate data of the target text data, the row to which the target text data belongs as the row to which the reference text data belongs; or

15. The apparatus of claim 14, further comprising:

and the execution module is used for taking the target text data as the target text data of another reference text data and re-executing the text data positioning method if the row coordinate data configured with the row slack amount does not include the row coordinate data of the target text data, or if the column coordinate data of the target text data does not meet the preset boundary inclusion condition.

16. The apparatus of any of claims 9-15, wherein the configuration module further comprises:

17. A storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the method of locating text data according to any one of claims 1 to 8.

18. A terminal, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the text data positioning method according to any one of claims 1-8.