WO2021168703A1 - Character processing and character recognition method, storage medium, and terminal device - Google Patents
- Publication number
- WO2021168703A1 (PCT/CN2020/076828)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
Definitions
- The embodiments of the present disclosure relate to, but are not limited to, text processing technology, and in particular to a character processing method, a character recognition method, a storage medium, and a terminal device.
- After a patient undergoes a physical examination or laboratory test at a physical examination institution or hospital, the paper test form is not easy to keep. Moreover, when the patient goes to another hospital for examination, a series of problems, such as the inability to structure the data in paper test reports and the inability to share data between examination institutions and hospitals or between hospitals, prevents hospitals from making a good assessment of the patient's condition. Patients often have to be re-examined at the new hospital, which wastes a great deal of time, money, and manpower. Therefore, there is a need for a method to structure the data in a patient's paper physical examination report or laboratory test form and to integrate the fragmented laboratory test information. This is of great significance for establishing electronic medical records and for exchanging data between hospitals.
- The table is an important part of laboratory test forms and medical examination reports.
- Recognizing the characters in the table is a problem that needs to be solved.
- An embodiment of the present disclosure provides a character processing method, including: recognizing and obtaining the coordinates of characters in an image; and using a kernel density function to perform a clustering calculation on the differences between the first coordinate values of two adjacent characters to determine the characters belonging to the same row.
- Embodiments of the present disclosure provide a character recognition method, including: preprocessing an image; recognizing the coordinates of characters in the image; and using the aforementioned character processing method to determine the characters in the image that belong to the same row.
- Embodiments of the present disclosure provide a computer-readable storage medium that stores computer-executable instructions, and the computer-executable instructions are used to implement the foregoing method.
- Embodiments of the present disclosure provide a terminal device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor. When the processor executes the program, the steps of the above method are implemented.
- FIG. 1 is a flowchart of a character processing method provided by an embodiment of the present disclosure;
- FIG. 2 is a flowchart of a method for implementing step 120 in the steps shown in FIG. 1;
- FIG. 3 is a flowchart of another method for implementing step 120 in the steps shown in FIG. 1;
- FIG. 4 is a flowchart of a character recognition method provided by an embodiment of the present disclosure;
- FIG. 5 is a flowchart of another character recognition method provided by an embodiment of the present disclosure;
- FIG. 6 is a general flowchart of the algorithm of an exemplary embodiment of the present disclosure;
- FIG. 7 is a flowchart of image preprocessing in an exemplary embodiment of the present disclosure;
- FIG. 8 is an example of a tilted table image in an exemplary embodiment of the present disclosure;
- FIG. 9 is an image after image preprocessing according to an embodiment of the present disclosure;
- FIG. 10 is a flowchart of table reconstruction according to an exemplary embodiment of the present disclosure;
- FIG. 11 is a density curve fitted to the discrete sequence computed by kernel density estimation in an exemplary embodiment of the present disclosure;
- FIG. 12 is a diagram showing a table recognition result of an exemplary embodiment of the present disclosure;
- FIGS. 13a and 13b are diagrams showing the effect of an application program to which the method of an embodiment of the present disclosure is applied;
- FIG. 14 is a schematic structural diagram of a terminal device provided by an embodiment of the present disclosure.
- The present disclosure includes and contemplates combinations with features and elements known to those of ordinary skill in the art.
- The embodiments, features, and elements disclosed in the present disclosure can also be combined with any conventional features or elements to form a unique inventive solution defined by the claims.
- Any feature or element of any embodiment can also be combined with features or elements from other invention solutions to form another unique invention solution defined by the claims. Therefore, it should be understood that any feature shown and/or discussed in this disclosure can be implemented individually or in any suitable combination. Therefore, the embodiments are not subject to other restrictions except for the restrictions made according to the appended claims and their equivalents.
- Various modifications and changes can be made within the protection scope of the appended claims.
- The specification may have presented the method and/or process as a specific sequence of steps. However, to the extent that the method or process does not depend on the specific order of the steps described herein, the method or process should not be limited to the steps in the specific order described. As those of ordinary skill in the art will understand, other sequences of steps are also possible. Therefore, the specific order of the steps set forth in the specification should not be construed as a limitation on the claims. In addition, the claims for the method and/or process should not be limited to performing their steps in the written order. Those skilled in the art can easily understand that these orders can be changed and still remain within the spirit and scope of the embodiments of the present disclosure.
- One method of table recognition based on optical character recognition (OCR) is to recognize the borders in the table, divide the text inside the borders into text blocks, identify the content of each text block, and then combine the text blocks and the borders to form a table.
- However, most medical examination and laboratory test forms are borderless tables, which makes it impossible to distinguish the individual text blocks in the table by their borders.
- In addition, photos of a form may be distorted to some degree. Although the perspective transformation in the Open Source Computer Vision Library (OpenCV) can correct the image to a certain extent, text that was originally on the same horizontal line will still be offset to some degree, which has a large impact on the OCR table recognition result.
- Text that should be on the same line may be recognized as multiple lines, which reduces recognition accuracy and precision.
- The embodiments of the present disclosure provide a character processing method and a character recognition method that can correct the error of recognizing characters belonging to one row as multiple rows because of image deformation. The methods can be applied to an image that includes characters, an image that includes a borderless table, or an image that includes a bordered table.
- FIG. 1 is a flowchart of a character processing method provided by an embodiment of the present disclosure. As shown in the figure, it includes the following steps 110 to 120.
- Step 110: Recognize and obtain the coordinates of the characters in the image.
- An OCR text recognition method can be used to recognize the characters in the table image and obtain their coordinate positions in the image.
- Step 120: Use the kernel density function to perform a clustering calculation on the differences between the first coordinate values of two adjacent characters, and determine the characters belonging to the same row.
- Here, "the same row" is used in a generic sense and covers either the same horizontal row or the same column, depending on how the characters are arranged.
- When "row" means a horizontal row, the characters belonging to the same row are determined.
- When "row" means a column, the characters belonging to the same column are determined.
- When the row is a horizontal row, using the kernel density function in step 120 to perform a clustering calculation on the differences between the first coordinate values of two adjacent characters to determine the characters belonging to the same row includes:
- using the kernel density function to perform a clustering calculation on the differences between the first ordinates of two laterally adjacent characters to determine a first lateral spacing, and determining the characters belonging to the same row according to the first lateral spacing.
- The two laterally adjacent characters include characters that are adjacent in position, for example, characters adjacent in the x-axis direction, as well as characters that are adjacent in content.
- For example, the last character of one line and the first character of the next line can be regarded as adjacent in content, and therefore also count as laterally adjacent characters.
- Horizontal writing, with characters running from left to right and lines running from top to bottom, is taken as an example.
- After the coordinates of all characters in the image are recognized, the coordinates of one or more preset points of each character can be collected in a list, ordered by character from left to right and by line from top to bottom. Each character is one item in the list, and two adjacent items in the list represent two adjacent characters.
- The first ordinate refers to the ordinate of the one or more preset points.
- The difference between the first ordinates of two laterally adjacent characters may be the difference between the first ordinate of one or more points in the first character and the first ordinate of the corresponding one or more points in the second character, including one of the following:
- Case 1: The difference between the ordinate of a point in the first character and the ordinate of the corresponding point in the second character, for example, the difference between the ordinates of the uppermost points of the two characters, or the difference between the ordinates of their lowermost points, or the difference between the ordinates of their center points;
- The uppermost point, the lowermost point, and the center point are only examples for illustration. In other embodiments, other points can be selected for calculation, as long as the standard for selecting points is unified;
- Case 2: The difference between the mean of the ordinates of multiple points in the first character and the mean of the ordinates of the corresponding points in the second character, for example, the difference between the mean of the ordinates of multiple upper end points in the first character and the mean of the ordinates of the corresponding upper end points in the second character.
- For any two laterally adjacent characters, a first character and a second character, the difference between the first ordinate of one or more points in the first character and the first ordinate of the corresponding one or more points in the second character is calculated. The kernel density function is used to perform a clustering calculation on all the calculated differences, and the first lateral spacing is determined according to the calculation result (a minimum value) of the kernel density function. The difference between the first ordinates of two laterally adjacent characters is then judged against the first lateral spacing to determine the characters belonging to the same character line.
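The difference calculation described above can be sketched as follows. This is only an illustration under stated assumptions: each character is reduced to a single preset point (its center), the `(text, x, y)` tuple layout and the function name `adjacent_ordinate_differences` are hypothetical, and the absolute value of the difference is used, which the text does not specify.

```python
from typing import List, Tuple

# Hypothetical layout: (text, x_center, y_center), already in reading order
# (characters left to right, lines top to bottom).
Char = Tuple[str, float, float]

def adjacent_ordinate_differences(chars: List[Char]) -> List[float]:
    """Return |y[i+1] - y[i]| for every pair of adjacent list items."""
    return [abs(b[2] - a[2]) for a, b in zip(chars, chars[1:])]

chars = [
    ("A", 10.0, 100.0), ("B", 30.0, 102.0), ("C", 50.0, 101.0),  # first line, slightly skewed
    ("D", 10.0, 140.0), ("E", 30.0, 141.0),                      # second line
]
diffs = adjacent_ordinate_differences(chars)
```

Small values in `diffs` come from characters on the same skewed line, while the large value comes from the pair that is adjacent only in content (end of one line, start of the next). These are the two populations the clustering step is meant to separate.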
- Kernel density estimation is a method of estimating the underlying distribution of data.
- A kernel function (such as a Gaussian kernel) is placed on each data point.
- All the kernel functions are added to obtain the kernel density estimate of the data set.
- If the bandwidth parameter of the kernel function is different, the density function obtained will be different.
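A minimal sketch of a Gaussian kernel density estimate, written with the Python standard library only, may make the role of the bandwidth concrete. The function names, the sample values, and the grid limits are assumptions chosen for illustration and do not come from the disclosure.

```python
import math

def gaussian_kde(samples, x, bandwidth):
    """Density estimate at x: the average of Gaussian kernels centered on each sample."""
    total = 0.0
    for s in samples:
        z = (x - s) / bandwidth
        total += math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return total / (len(samples) * bandwidth)

def count_modes(samples, bandwidth, lo, hi, steps=700):
    """Count local maxima of the estimated density on an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    d = [gaussian_kde(samples, x, bandwidth) for x in grid]
    return sum(1 for i in range(1, steps) if d[i] > d[i - 1] and d[i] >= d[i + 1])

# Hypothetical coordinate differences: small skew offsets and larger line gaps.
samples = [1.0, 1.5, 2.0, 39.0, 40.0, 41.0]
modes_narrow = count_modes(samples, bandwidth=2.0, lo=-10.0, hi=60.0)
modes_wide = count_modes(samples, bandwidth=30.0, lo=-10.0, hi=60.0)
```

With a narrow bandwidth the two groups of differences show up as separate peaks, whereas a very wide bandwidth smooths them into a single peak. This is why the bandwidth choice affects how many types of spacing the later steps can distinguish.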
- Mean shift (a gradient ascent algorithm) is used to move each data point in the direction in which the density increases fastest until it converges at a local maximum. The points that converge to the same maximum are members of the same cluster.
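The following one-dimensional mean shift sketch illustrates the idea with a Gaussian kernel. It is not taken from the disclosure: the function names, the bandwidth, the tolerances, and the sample differences are all assumed values for illustration.

```python
import math

def mean_shift_1d(samples, bandwidth, tol=1e-6, max_iter=500):
    """Move each sample uphill on the Gaussian KDE until it stops at a mode."""
    modes = []
    for x in samples:
        for _ in range(max_iter):
            weights = [math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples]
            new_x = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes

def label_clusters(modes, merge_tol=1e-2):
    """Give the same label to samples whose converged modes coincide."""
    centers, labels = [], []
    for m in modes:
        for k, c in enumerate(centers):
            if abs(m - c) < merge_tol:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels

# Hypothetical ordinate differences between adjacent characters.
diffs = [0.5, 1.0, 2.0, 1.5, 39.0, 40.0, 41.0]
centers, labels = label_clusters(mean_shift_1d(diffs, bandwidth=3.0))
```

In this toy data the small differences converge to one mode and the large ones to another, so `labels` splits the differences into an "offset" cluster and a "line gap" cluster.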
- Each cluster represents one type of spacing.
- The boundary of a cluster (a minimum of the kernel density function) is also the boundary of that type of spacing, so the minima of the kernel density function can be calculated to determine the corresponding spacing.
- One minimum value represents the maximum value of one type of lateral spacing.
- The first lateral spacing represents the maximum character line offset, that is, the maximum deviation in the y-axis direction between two characters that are adjacent in the x-axis direction within the same line.
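One simple way to locate such cluster boundaries is to evaluate the density on a grid and pick the local minima. The sketch below uses a plain grid scan instead of mean shift, and its function names, bandwidth `h`, and sample differences are assumptions for illustration only.

```python
import math

def kde(samples, x, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm

def density_minima(samples, h, steps=1000):
    """Return grid positions where the estimated density has a local minimum."""
    lo, hi = min(samples), max(samples)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    d = [kde(samples, x, h) for x in grid]
    return [grid[i] for i in range(1, steps) if d[i] < d[i - 1] and d[i] <= d[i + 1]]

# Hypothetical ordinate differences: six same-line offsets and four line gaps.
diffs = [0.0, 1.0, 2.0, 1.0, 0.5, 1.5, 38.0, 40.0, 41.0, 39.0]
minima = density_minima(diffs, h=3.0)
first_lateral_spacing = minima[0]
```

Here the single minimum lies between the two groups of differences, so it can serve as the upper bound of the offset cluster, which is the role the text assigns to the first lateral spacing.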
- When the row is a column, using the kernel density function in step 120 to perform a clustering calculation on the differences between the first coordinate values of two adjacent characters to determine the characters belonging to the same row includes: using the kernel density function to perform a clustering calculation on the differences between the first abscissas of two longitudinally adjacent characters to determine a first longitudinal spacing, and determining the characters belonging to the same column according to the first longitudinal spacing.
- The two longitudinally adjacent characters include characters that are adjacent in position, for example, characters adjacent in the y-axis direction, as well as characters that are adjacent in content.
- For example, the last character of one column and the first character of the next column can be regarded as adjacent in content, and therefore also count as longitudinally adjacent characters.
- After the coordinates of all characters in the image are recognized, the coordinates of one or more preset points of each character can be collected in a list, ordered by character from top to bottom and by column from right to left. Each character is one item in the list, and two adjacent items in the list represent two adjacent characters.
- The first abscissa refers to the abscissa of the one or more preset points.
- The difference between the first abscissas of two longitudinally adjacent characters may be the difference between the first abscissa of one or more points in the third character and the first abscissa of the corresponding one or more points in the fourth character, including one of the following:
- Case 3: The difference between the abscissa of a point in the third character and the abscissa of the corresponding point in the fourth character, for example, the difference between the abscissas of the leftmost points of the two characters, or the difference between the abscissas of their rightmost points, or the difference between the abscissas of their center points;
- The leftmost point, the rightmost point, and the center point are only examples. In other embodiments, other points can be selected for calculation, as long as the standard for selecting points is unified;
- Case 4: The difference between the mean of the abscissas of multiple points in the third character and the mean of the abscissas of the corresponding points in the fourth character, for example, the difference between the mean of the abscissas of multiple left end points in the third character and the mean of the abscissas of the corresponding left end points in the fourth character.
- The kernel density function is used to perform a clustering calculation on all the calculated differences, and the first longitudinal spacing is determined according to the calculation result (a minimum value) of the kernel density function. The difference between the first abscissas of two longitudinally adjacent characters is then judged against the first longitudinal spacing to determine the characters belonging to the same character column.
- The kernel density function performs a clustering calculation on all the calculated differences, and the calculation result represents the one or several types of spacing that may appear in the image.
- One minimum value represents the maximum value of one type of longitudinal spacing.
- The first longitudinal spacing represents the maximum character column offset, that is, the maximum deviation in the x-axis direction between two characters that are adjacent in the y-axis direction within the same column.
- The offset of a character row in the image can be determined by performing a clustering calculation on the differences between the first coordinate values of two adjacent characters. For example, one or more minimum values can be obtained by using the kernel density function. The smallest of these, the first minimum value, represents the maximum character row offset and can be used to determine the characters belonging to the same character row or character column: when the difference between the first coordinate values of two adjacent characters is less than or equal to the first minimum value, the two adjacent characters can be determined to belong to the same character row or character column.
- The method of the embodiments of the present disclosure can correct characters whose positions are shifted because of a tilted shooting angle.
- The kernel density clustering algorithm assigns these shifts to their correct category, so that the shift can be removed and characters that originally belong to the same row or column are recognized as being in the same row or column.
- The clustering calculation on the coordinate differences using the kernel density function may yield a single minimum value, which can be used to correct the character row offset. It may also yield multiple minimum values.
- In that case, the smallest minimum value (the first minimum value) is used to correct the character row offset, and the other one or more minimum values (the second minimum values) reflect the types of spacing present in the image (line spacing for horizontal writing, column spacing for vertical writing). Each second minimum value corresponds to the maximum value of one type of spacing.
- Alternatively, all of the one or more minimum values obtained by clustering the coordinate differences using the kernel density function may represent spacings, with each minimum value corresponding to the maximum value of one type of spacing.
- The following takes the case where the row is a horizontal row as an example.
- The character processing method further includes a step of determining a line spacing and a step of determining a character column group.
- The foregoing step 120 may include the following steps 121 to 124.
- Step 121: For any two laterally adjacent characters, a first character and a second character, calculate the difference between the first ordinate of one or more points in the first character and the first ordinate of the corresponding one or more points in the second character;
- Step 122: Use the kernel density function to perform a clustering calculation on all the calculated differences, determine the first lateral spacing according to the calculation result of the kernel density function (for example, the first minimum value), and judge the difference between the first ordinates of two laterally adjacent characters against the first lateral spacing to determine the characters belonging to the same character line;
- Step 123: Determine a second lateral spacing according to the calculation result of the kernel density function (for example, the second minimum value), and judge the difference between the first ordinates of two laterally adjacent characters against the second lateral spacing to determine the line spacing;
- Step 122 and step 123 can be executed in combination.
- The kernel density function is used to cluster all the calculated differences. Since laterally adjacent characters include not only characters adjacent in position but also characters adjacent in content, clustering the coordinate differences of laterally adjacent characters with the kernel density function yields both the maximum character line offset and the maximum value of one or more line spacings (that is, the second lateral spacing).
- The calculation result of the kernel density function includes one or more minimum values, such as a smallest first minimum value and a second minimum value greater than the first minimum value.
- The first minimum value represents the maximum value of the smallest type of spacing present (the character line offset), that is, the maximum offset in the y-axis direction between two characters that are adjacent in the x-axis direction.
- The first minimum value can be used to correct slanted character lines.
- There can be one or more second minimum values, each of which can represent a row spacing, that is, the spacing between two adjacent rows.
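Given the sorted minimum values, assigning a difference to its spacing type reduces to finding which interval it falls in. The sketch below is an assumption-laden illustration: the function name `spacing_class`, the minima `[5.0, 60.0]`, and the treatment of values exactly on a boundary or above every minimum are choices made here, not details stated in the text.

```python
import bisect

def spacing_class(diff, minima):
    """Return the index of the interval that diff falls in.

    0 means diff <= first minimum (same character line, offset only);
    1 means first minimum < diff <= second minimum (first line-spacing type);
    and so on. A value above every minimum gets index len(minima).
    """
    return bisect.bisect_left(sorted(minima), diff)

minima = [5.0, 60.0]                   # hypothetical first and second minimum values
diffs = [2.0, 1.0, 39.0, 1.0, 85.0]    # hypothetical ordinate differences
classes = [spacing_class(d, minima) for d in diffs]
```

With these numbers, differences of 1.0 and 2.0 are treated as offsets within one line, 39.0 as an ordinary line spacing, and 85.0 as a larger type of spacing.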
- Step 124: Use the kernel density function to perform a clustering calculation on the differences between the second coordinate values of two laterally adjacent characters to determine a third lateral spacing, judge the difference between the second coordinate values of two laterally adjacent characters against the third lateral spacing to decide whether the two characters belong to the same character column group, and finally determine the characters belonging to the same character column group.
- The second coordinate value is a second abscissa, which may be the abscissa of one or more preset points.
- The difference between the second abscissas of two laterally adjacent characters may be the difference between the second abscissa of one or more points in the fifth character and the second abscissa of the corresponding one or more points in the sixth character, including one of the following situations:
- Case 5: The difference between the abscissa of a point in the fifth character and the abscissa of the corresponding point in the sixth character, for example, the difference between the abscissas of the leftmost points of the two characters, or the difference between the abscissas of their rightmost points, or the difference between the abscissas of their center points;
- The leftmost point, the rightmost point, and the center point are only examples. In other embodiments, other points can be selected for calculation, as long as the standard for selecting points is unified;
- Case 7: The difference between the mean of the abscissas of multiple points in the fifth character and the mean of the abscissas of the corresponding points in the sixth character, for example, the difference between the mean of the abscissas of multiple left end points in the fifth character and the mean of the abscissas of the corresponding left end points in the sixth character;
- Case 8: The difference between the abscissas of multiple points in the fifth character and the abscissas of the corresponding multiple points in the sixth character, for example, the difference between the mean of the abscissas of multiple right end points in the fifth character and the mean of the abscissas of multiple left end points in the sixth character, where the abscissa of the sixth character is greater than the abscissa of the fifth character;
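- In some cases, corresponding end points are end points that have the same position or are selected by the same rule.
- In other cases (for example, right end points of one character paired with left end points of the next), corresponding end points are end points that are mirror-symmetric.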
- The third lateral spacing represents the difference in x coordinate between two characters that are adjacent in the x-axis direction, where "adjacent" again includes the case of being adjacent in content.
- The kernel density function is used to perform a clustering calculation on all the calculated differences between second coordinate values.
- The calculation result of the kernel density function includes one or more minimum values, and each minimum value indicates that one type of third lateral spacing exists in the image. For example, the result may include a smallest third minimum value and a fourth minimum value greater than the third minimum value, where the third minimum value represents the spacing between normal characters, such as the distance between "字" and "符" in the word "字符" ("character"). There may be one or more fourth minimum values greater than the third minimum value, which can represent several different spacings.
- For example, the gap between the last character in the first column and the first character in the second column is one type of spacing, and the gap between the last character in the second column and the first character in the third column may be another type. Since the writing direction in this embodiment is horizontal, differences smaller than the third minimum value need not be considered; only differences lying between the third minimum value and a fourth minimum value are considered. When the difference between the second coordinate values of two laterally adjacent characters is greater than the third minimum value and less than the fourth minimum value, or lies between two fourth minimum values, it is determined that the two laterally adjacent characters belong to two different character column groups.
- In this way, the one or more character column groups existing in the image can be determined, and finally the characters belonging to the same character column group can be determined. If there is a table in the image (with or without border lines), there may be a table line between the two characters, and a table column is a character column group.
- The character column groups can be judged line by line. After the character column group judgment is performed on all the characters in the first character line, multiple character column groups are obtained. The judgment is then performed on all the characters in the second character line, whose abscissas are compared with those of the characters in the previously obtained character column groups to determine which character column group they fall into. If the character column groups are table columns, then together with the judgment of the row spacing, the table in the image can be fully recognized.
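The line-by-line judgment can be sketched as two small steps: split each character line where the horizontal gap exceeds the normal character spacing, then match each resulting group to a column found in the first line. The `(text, x)` layout, the function names, the `char_gap` value (standing in for the third minimum value), and the nearest-start matching rule are all assumptions made for this illustration; the text only says that the abscissas are compared.

```python
def split_line_into_groups(chars, char_gap):
    """chars: (text, x) pairs of one character line, sorted by x.

    Start a new group whenever the gap to the previous character exceeds
    char_gap.
    """
    groups = []
    for text, x in chars:
        if groups and x - groups[-1][-1][1] <= char_gap:
            groups[-1].append((text, x))
        else:
            groups.append([(text, x)])
    return groups

def assign_to_columns(groups, column_starts):
    """Map each group to the known column whose start abscissa is nearest."""
    result = {}
    for g in groups:
        start = g[0][1]
        idx = min(range(len(column_starts)), key=lambda i: abs(column_starts[i] - start))
        result[idx] = result.get(idx, "") + "".join(t for t, _ in g)
    return result

line1 = [("W", 10), ("B", 20), ("C", 30), ("5", 120), (".", 130), ("2", 140)]
line2 = [("H", 12), ("G", 22), ("B", 32), ("1", 118), ("3", 128), ("0", 138)]

groups1 = split_line_into_groups(line1, char_gap=15)
column_starts = [g[0][1] for g in groups1]          # columns taken from the first line
row1 = assign_to_columns(groups1, column_starts)
row2 = assign_to_columns(split_line_into_groups(line2, char_gap=15), column_starts)
```

Each resulting dictionary maps a column index to the cell text of that line, so stacking the dictionaries for successive lines gives a simple reconstruction of a borderless table.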
- In the case where the row is a column, the character processing method further includes a step of determining a column spacing and a step of determining a character row group.
- The above-mentioned step 120 may include step 121' to step 124'.
- Step 121' for any two longitudinally adjacent third and fourth characters, calculate the first abscissa of one or more points in the third character and one or more corresponding ones in the fourth character. The difference of the first abscissa of each point;
- Step 122' cluster calculation of all the calculated differences using the kernel density function, determine the first longitudinal distance according to the calculation result of the kernel density function (for example, the fifth minimum value), and compare the longitudinal phase according to the first longitudinal distance. The difference between the first abscissas of two adjacent characters is judged, and the characters belonging to the same character column are determined;
- Step 123' Determine the second longitudinal distance according to the calculation result of the kernel density function (for example, the sixth minimum value), and judge the difference between the first abscissas of two longitudinally adjacent characters according to the second longitudinal distance To determine the column spacing;
- step 122' and step 123' can be executed in combination.
- the kernel density function is used to cluster all the calculated differences. Since longitudinally adjacent characters include not only position-adjacent characters but also content-adjacent characters, clustering the coordinate differences of longitudinally adjacent characters with the kernel density function yields the maximum offset of a character column and the maximum value of one or more column spacings (the second longitudinal distance).
- the calculation result of the kernel density function includes one or more minimum values, for example, the smallest minimum value (the fifth minimum value) and a sixth minimum value greater than the fifth minimum value. The fifth minimum value represents the maximum value of the smallest type of spacing (the character column offset), that is, the maximum offset in the x-axis direction of two characters that are adjacent in the y-axis direction.
- the fifth minimum value can be used to correct a slanted character column. There can be one or more sixth minimum values, which represent one or more column spacings, that is, the spacing between two adjacent columns.
- if the difference between the first abscissas of two longitudinally adjacent characters is less than or equal to the fifth minimum value, it is determined that the two longitudinally adjacent characters belong to the same character column.
- if the difference between the first abscissas of two longitudinally adjacent characters is greater than the fifth minimum value and less than the sixth minimum value, it is determined that the two characters belong to two different character columns, that is, there is a column spacing between the two characters.
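- The two rules above can be sketched as a small classifier. This is only an illustration: the function name, the label strings, and the "other" branch for differences at or above the sixth minimum value are assumptions, and the two thresholds are assumed to have been obtained already from the kernel density calculation.

```python
def classify_abscissa_difference(diff, fifth_min, sixth_min):
    """Classify the first-abscissa difference of two longitudinally adjacent characters.

    fifth_min: maximum offset inside one character column (assumed already known).
    sixth_min: maximum column spacing (assumed already known, larger than fifth_min).
    """
    d = abs(diff)
    if d <= fifth_min:
        return "same_column"      # small offset: both characters are in one column
    if d < sixth_min:
        return "column_spacing"   # the characters are in two different columns
    return "other"                # hypothetical label for anything larger


labels = [
    classify_abscissa_difference(d, fifth_min=5, sixth_min=40)
    for d in (2, -4, 18, 75)
]
```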
- Step 124': use the kernel density function to perform clustering calculation on the differences between the second coordinate values of two longitudinally adjacent characters to determine the third longitudinal distance, and judge the difference between the second coordinate values of two longitudinally adjacent characters according to the third longitudinal distance to determine whether the two characters belong to the same character row group, and finally determine the characters belonging to the same character row group.
- the second coordinate value is the second ordinate, which may be the ordinate of one or more preset points.
- the difference between the second ordinates of the two longitudinally adjacent characters (a seventh character and an eighth character) may be the difference between the second ordinate of one or more points in the seventh character and the second ordinate of the corresponding one or more points in the eighth character, which includes one of the following cases:
- Case 9: the difference between the ordinate of a point in the seventh character and the ordinate of the corresponding point in the eighth character, for example, the difference between the ordinate of the uppermost point of the seventh character and the ordinate of the uppermost point of the eighth character, or the difference between the ordinate of the lowermost point of the seventh character and the ordinate of the lowermost point of the eighth character, or the difference between the ordinate of the center point of the seventh character and the ordinate of the center point of the eighth character;
- the uppermost point, the lowermost point, and the center point are only examples for illustration. In other embodiments, other points can be selected for calculation, as long as the standard for selecting points is unified;
- Case 10: the difference between the ordinate of a point in the seventh character and the ordinate of the corresponding point in the eighth character, for example, the difference between the ordinate of the lowermost point of the seventh character and the ordinate of the uppermost point of the eighth character, where the ordinate of the eighth character is greater than the ordinate of the seventh character;
- Case 11: the difference between the mean value of the ordinates of multiple points in the seventh character and the mean value of the ordinates of the corresponding points in the eighth character, for example, the difference between the mean value of the ordinates of multiple upper end points in the seventh character and the mean value of the ordinates of the corresponding upper end points in the eighth character;
- Case 12: the difference between the mean value of the ordinates of multiple points in the seventh character and the mean value of the ordinates of the corresponding points in the eighth character, for example, the difference between the mean value of the ordinates of multiple lower end points in the seventh character and the mean value of the ordinates of multiple upper end points in the eighth character, where the ordinate of the eighth character is greater than the ordinate of the seventh character;
- in cases 9 and 11, corresponding end points refer to end points that have the same position or the same rules for selecting positions.
- in cases 10 and 12, corresponding end points refer to end points that have mirror symmetry.
- the third longitudinal distance represents the coordinate difference in the y-axis direction between two characters that are adjacent in the y-axis direction, and "adjacent" here includes the content-adjacent case described above.
- the step of determining whether the characters belong to the same character row group can be performed with reference to the step of determining whether the characters belong to the same character row group in step 124, which will not be repeated here.
- the character row group is, for example, a table row.
- FIG. 4 is a flowchart of a character recognition method provided by an embodiment of the disclosure. As shown in the figure, it includes step 310 to step 330.
- Step 310 preprocessing the image
- the preprocessing includes one or more of the following processes: color image conversion to grayscale image, Gaussian filtering, background extraction, contrast compensation, binarization, and perspective transformation.
- Step 320 Identify the coordinates of the characters in the image
- Step 330 Use the character processing method in the foregoing embodiment (for example, FIG. 1, FIG. 2 or FIG. 3) to perform recognition processing on the characters in the image, and determine the characters belonging to the same row.
- the character recognition method further includes step 340 of displaying part or all of the row or rows of characters determined after the recognition processing.
- in step 330, after recognition processing is performed on the characters in the image, the following recognition processing results can be determined:
- when the processing covers rows, the characters belonging to the same row and the number of rows in the image;
- when the processing covers columns, the characters belonging to the same column and the number of columns in the image;
- if the image includes a table, in addition to the above content, the table rows and/or table columns can also be determined.
- part or all of the recognition processing results can be selected and displayed on the interface of the terminal. Taking the table included in the image as an example, it can be one or more of the following two situations:
- the character processing and recognition method provided by the present disclosure has fast recognition speed and high recognition accuracy.
- the kernel density estimation function is used to estimate the offset distance and the spacing between the characters in a text block from the character recognition result, and to carry out line reconstruction and character block reconstruction. This gives high robustness and anti-interference ability, and high recognition accuracy.
- the method of the embodiments of the present disclosure can be used not only for character recognition and frameless form recognition, but also for framed form recognition. Because the embodiment of the present disclosure performs identification analysis on the row spacing and column spacing, the presence or absence of frame lines does not affect the identification result.
- the method of the embodiment of the present disclosure can handle table photos in which the text position is offset due to a tilted shooting angle, uneven lighting, etc.; the clustering nature of the kernel density algorithm classifies these offsets into the correct category (row, column).
- FIG. 6 is an algorithm flow chart of the form recognition method of the exemplary embodiment, which includes the following steps 410 to 430.
- Step 410 image preprocessing
- the image preprocessing process may include one or more of the following steps as required: color image conversion to grayscale image, filtering (such as Gaussian filtering), background extraction, contrast compensation, binarization, and perspective transformation.
- Step 411 Convert the input color image to be recognized into a grayscale image
- Converting a color image to a grayscale image can be achieved using opencv.
- Step 412 Use Gaussian filtering to remove noise in the grayscale image
- Gaussian filtering can use opencv to remove noise.
- Step 413 Perform background extraction on the grayscale image after the noise is removed
- a photo taken in poor light has a relatively dark background, and the characters in the image are usually black, so the background acts as interference and makes the characters inconspicuous. Therefore, the gray level of the background can be estimated through background extraction, and the main characters can be separated from the background.
- the background of a certain point in the picture can be estimated by the set of brighter points in the w*w neighborhood of the point, that is, some whiter points in an area can be used to represent the background of the area.
- w is an empirical value
- the value range of w is the number of pixels greater than 0 and less than the side length. For example, it may be about one-tenth of a certain side length (for example, the shortest side length).
- the processing speed can be improved by shrinking the picture before processing. If an image is several thousand by several thousand pixels, proportional downscaling greatly improves the processing speed without affecting the background extraction result.
- the values of the neighborhood range and the number of brightness samples are empirical values determined according to the image size, and other parameters can be tried according to actual conditions.
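- The neighborhood-based background estimate described above can be sketched in plain Python as follows. The function name, the parameter names `w` and `samples`, the use of the mean of the brightest samples, and the clipping of the window at the image border are all assumptions made for illustration; the disclosure only states that the brighter points in the w*w neighborhood represent the background.

```python
def estimate_background(gray, w=3, samples=2):
    """Estimate the background gray level of every pixel as the mean of the
    `samples` brightest pixels in its w*w neighborhood (clipped at the border)."""
    height, width = len(gray), len(gray[0])
    half = w // 2
    background = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [
                gray[rr][cc]
                for rr in range(max(0, r - half), min(height, r + half + 1))
                for cc in range(max(0, c - half), min(width, c + half + 1))
            ]
            brightest = sorted(window, reverse=True)[:samples]
            background[r][c] = sum(brightest) / len(brightest)
    return background


# A dark "character" pixel (20) surrounded by bright paper.
image = [
    [200, 210, 205],
    [198,  20, 207],
    [202, 204, 199],
]
bg = estimate_background(image, w=3, samples=2)
```

- In this toy image the dark center pixel receives a bright background estimate because only the brightest neighbors contribute, which is the behavior the text describes.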
- Step 414 Compensate the contrast of the gray image after noise removal according to the background extraction result
- the uneven illumination background can be removed by contrast compensation, which can be calculated by the following formula:
- y is the gray value of any pixel after compensation
- p s is the gray value of the original image at the same pixel position obtained in step 102
- p b is the gray value of the background image at the same pixel position extracted in step 103 .
- the above-mentioned contrast compensation method is the realization method of contrast compensation in the open source tool ScanTailor, and the amount of calculation is small and the speed is fast.
- other contrast compensation methods can also be selected, which is not limited in the present disclosure.
- Step 415 Binarize the contrast-compensated image
- binarization can use the local-threshold Sauvola method, an image binarization method that considers the local mean brightness.
- the input of the Sauvola algorithm is a grayscale image, which takes the current pixel as the center, and dynamically calculates the threshold of the pixel according to the gray average and standard deviation in the neighborhood of the current pixel. It can be implemented with scikit-image.
- scikit-image (abbreviated as skimage) is a collection of image processing and computer vision algorithms, including the Sauvola algorithm.
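- As a rough illustration of how a Sauvola-style local threshold behaves, the sketch below computes the commonly cited threshold T = m * (1 + k * (s / R - 1)) from the local mean m and standard deviation s. The parameter values k = 0.2 and R = 128, the 3*3 window, and the helper names are assumptions for this example and are not taken from the disclosure; in practice the `threshold_sauvola` function in `skimage.filters` would normally be used instead.

```python
import math


def sauvola_threshold(window, k=0.2, r=128.0):
    """Threshold for one pixel from the gray values in its neighborhood."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return mean * (1.0 + k * (std / r - 1.0))


def binarize(gray, w=3, k=0.2, r=128.0):
    """Return a 0/255 image; each pixel is compared with its own local threshold."""
    height, width = len(gray), len(gray[0])
    half = w // 2
    result = []
    for row in range(height):
        line = []
        for col in range(width):
            window = [
                gray[rr][cc]
                for rr in range(max(0, row - half), min(height, row + half + 1))
                for cc in range(max(0, col - half), min(width, col + half + 1))
            ]
            threshold = sauvola_threshold(window, k, r)
            line.append(255 if gray[row][col] > threshold else 0)
        result.append(line)
    return result


gray = [
    [200, 200, 200],
    [200,  20, 200],
    [200, 200, 200],
]
binary = binarize(gray)
```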
- Step 416 Perform perspective transformation on the binarized image.
- Perspective transformation is used to solve the problem that affine transformation cannot change the relative positional relationships inside a shape. It is similar to the "free transform" function in Photoshop or the "perspective" function in the GNU Image Manipulation Program (GIMP); both can be implemented with a perspective transformation matrix.
- Perspective transformation uses a four-vertex perspective transformation method. By looking for a perspective transformation matrix, an oblique quadrilateral can be converted into a rectangle.
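- To make the four-vertex idea concrete, the following sketch solves for a 3*3 perspective matrix that maps four corners of an oblique quadrilateral onto the corners of a rectangle. The corner coordinates and helper names are made up for illustration, and the matrix is normalized so that its last element is 1, which is an assumption that holds for ordinary photos. OpenCV offers `cv2.getPerspectiveTransform` and `cv2.warpPerspective` for the same purpose; this plain-Python version only shows the underlying linear system.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """Matrix H with H[2][2] = 1 that maps each src corner to its dst corner."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h, x, y):
    """Apply the perspective matrix to one point."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical corners of a tilted page and of the target rectangle.
src = [(10, 20), (110, 10), (120, 90), (5, 100)]
dst = [(0, 0), (100, 0), (100, 80), (0, 80)]
h = perspective_matrix(src, dst)
```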
- Step 420 Perform general OCR text recognition on the preprocessed image
- FIG. 8 is an example of a medical examination report with an oblique angle taken by a user
- FIG. 9 is an image after the above-mentioned picture preprocessing and general text OCR recognition.
- Step 430 Perform table reconstruction on the content identified by OCR.
- the table reconstruction process of the embodiment of the present disclosure uses the Gaussian kernel density estimation method to reconstruct the result of the general OCR character recognition process to form a table.
- the reconstruction process includes text row reconstruction, table row reconstruction, and table column reconstruction. Through table row reconstruction and table column reconstruction, the cells in the table can be determined, and the inner text block can be reconstructed.
- the reconstruction process does not depend on the borders inside and outside the form, and has strong anti-interference ability against tilt.
- Figure 10 shows the table reconstruction process.
- the kernel density estimation algorithm is used to reconstruct the table to realize OCR table recognition, which overcomes the shortcomings of earlier table recognition, namely reliance on the table frame and low accuracy.
- Kernel density estimation is a method used to estimate unknown density functions in probability theory, and it is one of the non-parametric testing methods.
- the kernel density estimation method does not use prior knowledge about the data distribution and does not attach any assumptions to the data distribution. It is a method to study the characteristics of the data distribution from the data sample itself.
- the table to be identified may be compared with the table in the pre-established table format library before the table reconstruction is performed.
- for example, the table header and/or table footer are compared. If the table header part is the same as a pre-saved table format, the table border is determined according to the pre-saved table format, the table range is determined, and table reconstruction is performed within the determined table range.
- the table reconstruction process includes the following steps 801 to 806.
- in this example, the rows are determined first and then the columns.
- in other embodiments, the columns may be determined first, and then the rows.
- Step 801 After the pre-processed image is recognized by general characters, the upper, lower, left, and right coordinate positions (hereinafter referred to as coordinates) of each character in the image are obtained, and the upper coordinate of each character is recorded;
- the horizontal axis is the x axis and the vertical axis is the y axis.
- Each character can be regarded as a rectangle.
- a rectangle contains at least one character.
- a rectangle includes four sides: the upper side (or top side), the bottom side (or bottom side), the left side and the right side.
- the upper coordinate of the text is the distance from the top side of the rectangle to the x-axis, the left coordinate of the text is the distance from the left side of the rectangle to the y-axis, the lower coordinate of the text is the distance from the bottom side of the rectangle to the x-axis, and the right coordinate of the text is the distance from the right side of the rectangle to the y-axis.
- the upper coordinates of all text can be inserted into an upper coordinate list, and the upper coordinate list is recorded as list1;
- Step 802 traverse list1, calculate the absolute value of the upper coordinate difference of two adjacent characters in list1, and insert the obtained absolute value into a new list—the upper coordinate difference list, the upper coordinate difference list is recorded as list_top;
- calculating the upper coordinate difference is taken as an example for description.
- the lower coordinate difference of two adjacent characters can also be calculated.
- the method to find the distance is the same in the subsequent steps.
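- Step 802 amounts to taking absolute differences of neighboring entries. A minimal sketch, with made-up coordinate values and a hypothetical function name:

```python
def upper_coordinate_differences(list1):
    """Absolute difference of the upper coordinates of each pair of adjacent characters."""
    return [abs(b - a) for a, b in zip(list1, list1[1:])]


# Made-up upper coordinates of six characters in reading order.
list1 = [100, 102, 99, 130, 131, 160]
list_top = upper_coordinate_differences(list1)
```

- Here `list_top` is `[2, 3, 31, 1, 29]`: the small values are offsets inside one text line and the large ones are line spacings.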
- Calculating the upper coordinate difference of two adjacent characters yields the data distribution, which may include the following kinds of spacing: the first horizontal spacing (that is, the offset of the text line, or offset spacing, for example the offset of a text line caused by the tilt of the shooting angle) and the second horizontal spacing (including text line spacing, table line spacing, or other line spacing; other line spacing is, for example, the large distance between the last check item in a medical examination form or laboratory test form and the signature at the end of the form, and there may be one or more such line spacings). The laboratory test form in this example has three types of horizontal spacing as described above: offset spacing, table line spacing (in this example, the text line spacing is the same as the table line spacing), and other spacing.
- Step 803 Use Gaussian kernel density estimation to fit the density curve for the list list_top to find the minimum point of the discrete sequence list_top.
- starting from an initial bandwidth (the bandwidth is the smoothing parameter), the bandwidth is adjusted by a preset step size (such as 0.1), and Gaussian kernel density estimation is used to refit the density curve until the number of minimum values is 3.
- the minimum points of the discrete sequence are the dividing points between the types of horizontal spacing. In other embodiments, depending on the table, the number of types of horizontal spacing may differ. There may be only one type of horizontal spacing, namely the first horizontal spacing. When the upper coordinate difference (or the lower coordinate difference) of two characters is smaller than the first horizontal spacing, the two characters are considered to be on the same line; the first horizontal spacing can therefore be used to correct the character offset caused by the shooting angle.
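- The bandwidth search in step 803 could look roughly like the sketch below. Several details are assumptions, not statements of the disclosure: the bandwidth is increased (rather than decreased) by the step, minima are located on a sampled grid instead of analytically, and the sample data, starting bandwidth, and function names are invented for the example.

```python
import math


def gaussian_kde(x, samples, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    total = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
    return total / (len(samples) * h * math.sqrt(2.0 * math.pi))


def local_minima(xs, ys):
    """x positions where the sampled curve stops falling and starts rising.
    A flat bottom (for example, underflow to 0.0) counts as one minimum."""
    minima, low_start = [], None
    for i in range(1, len(ys)):
        if ys[i] < ys[i - 1]:
            low_start = i
        elif ys[i] > ys[i - 1] and low_start is not None:
            minima.append((xs[low_start] + xs[i - 1]) / 2.0)
            low_start = None
    return minima


def find_spacing_thresholds(diffs, target=3, bandwidth=0.3, step=0.1,
                            max_bandwidth=50.0, grid_step=0.1):
    """Increase the bandwidth until the fitted curve has `target` minima."""
    lo, hi = min(diffs), max(diffs)
    xs = [lo + i * grid_step for i in range(int((hi - lo) / grid_step) + 1)]
    while bandwidth <= max_bandwidth:
        ys = [gaussian_kde(x, diffs, bandwidth) for x in xs]
        minima = local_minima(xs, ys)
        if len(minima) == target:
            return bandwidth, minima
        bandwidth += step
    raise ValueError("no bandwidth produced the requested number of minima")


# Made-up coordinate differences forming four clusters.
diffs = [0, 1, 1, 1, 2, 29, 30, 30, 31, 60, 61, 62, 120, 121, 122]
bandwidth, minima = find_spacing_thresholds(diffs)
```

- With a very small bandwidth every cluster breaks into several peaks, so there are more than three minima; widening the bandwidth smooths each cluster into one peak and leaves only the minima between clusters.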
- Figure 11 is the density curve fitted to the upper coordinate differences of adjacent characters using kernel density estimation.
- the X-axis is the value of the input data, that is, the upper coordinate difference
- the Y-axis is the logarithm of the kernel density estimate at a given upper coordinate difference.
- For the kernel density estimation curve of multiple data points, since waveform synthesis occurs between adjacent peaks, the final curve shape is not closely related to the selected kernel function. Considering the ease of use of the function in waveform synthesis calculation, this embodiment uses a Gaussian kernel function (normal distribution curve) as the kernel function for kernel density estimation.
- the estimated value on the ordinate can be taken as a base-10 logarithm, log10(f), to compress the range of the estimated values.
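- Kernel density estimation superimposes the kernel functions centered on each coordinate difference x_i to form a density curve. In its standard form:
$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$
- n is the sample size (that is, the total number of coordinate differences)
- h is the bandwidth
- K() is the kernel function
- This embodiment uses the Gaussian kernel function.
- the calculation formula of the (standard) Gaussian kernel function is as follows:
$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right)$$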
- with kernel density estimation, it is possible to distinguish multiple types of horizontal spacing.
- Figure 11 only illustrates the density curve waveform and the corresponding minimum point in this example.
- the minimum values of the kernel density function can be calculated directly by differentiation.
- Step 804 Determine the characters belonging to the same row and the characters belonging to the same table row according to the found minimum value
- in this example, the number of types of horizontal spacing is 3, that is, there are 3 minimum values (3 minimum values in Figure 11). The X-axis coordinate corresponding to the first minimum value is the maximum text offset spacing (hereinafter referred to as the text offset spacing), the X-axis coordinate corresponding to the second minimum value is the maximum table line spacing (hereinafter referred to as the table line spacing), and the X-axis coordinate corresponding to the third minimum value is the other spacing, which is a large-distance line spacing (for example, the distance between the last check item in the medical examination form or the laboratory test form and the signature at the end of the form).
- Text offset spacing < table line spacing < large-distance line spacing.
- when the upper coordinate difference between two characters is less than or equal to the text offset spacing, the two characters are considered to be in the same text line.
- when the upper coordinate difference between two characters is greater than the text offset spacing and less than or equal to the table line spacing, the two characters are considered to be in different table rows, that is, there is a table line between the two characters.
- when the upper coordinate difference between two characters is greater than the table line spacing and less than or equal to the large-distance line spacing, it is considered that there is a large-distance line spacing between the two characters.
- the x-coordinate value of the first minimum point is mi; mi is the maximum text offset spacing, and characters whose upper coordinate difference is less than mi are regarded as being in the same text line.
- Take the first character as the first line and traverse the remaining characters in list1. If the absolute value of the difference between a character's upper coordinate and the average upper coordinate of the characters already assigned to a line is less than mi, the character is determined to be on that line; if it is greater than mi, the character is recorded as an independent new text line. After all the characters are traversed, the characters in each line are sorted by left coordinate, and the reconstruction of the text lines is complete.
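- The text line reconstruction just described can be sketched as follows. The dictionary keys `text`, `top`, and `left`, the sample characters, and the value of mi are assumptions for illustration; a character exactly mi away from a line average is treated here as starting a new line, which the disclosure does not specify.

```python
def reconstruct_text_lines(chars, mi):
    """Group characters into text lines by comparing each character's upper
    coordinate with the average upper coordinate of every existing line."""
    lines = []
    for ch in chars:
        for line in lines:
            mean_top = sum(c["top"] for c in line) / len(line)
            if abs(ch["top"] - mean_top) < mi:
                line.append(ch)
                break
        else:
            lines.append([ch])  # no existing line is close enough
    for line in lines:
        line.sort(key=lambda c: c["left"])
    return lines


# Made-up OCR output: two slightly tilted lines, not in reading order.
chars = [
    {"text": "A", "top": 100, "left": 10},
    {"text": "C", "top": 103, "left": 90},
    {"text": "B", "top": 98, "left": 50},
    {"text": "D", "top": 140, "left": 12},
    {"text": "E", "top": 142, "left": 55},
]
lines = reconstruct_text_lines(chars, mi=5)
texts = ["".join(c["text"] for c in line) for line in lines]
```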
- the x-coordinate value of the second minimum point is ni; ni is the maximum table row spacing, and characters whose upper coordinate difference is less than ni are considered to be in the same table row.
- the table line can also be determined on the basis of the determined text lines. The position of the table line can be determined by traversing the upper coordinate differences of the characters to find a coordinate difference greater than mi and less than ni; or the average upper coordinate of each of two text lines can be determined and the absolute value of the difference between the two averages calculated, and if the absolute value is greater than mi and less than ni, there is a table line between the two text lines.
- the determined table line can also be verified. For example, the above methods for determining the table line can be mutual verification methods. So far, the table row reconstruction is complete.
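- The second way of locating table lines, comparing the average upper coordinates of neighboring text lines, might be sketched like this. The input layout (one list of upper coordinates per text line, already in top-to-bottom order), the threshold values, and the function name are assumptions for illustration.

```python
def table_line_positions(line_tops, mi, ni):
    """Return each index i for which a table line lies between text line i
    and text line i + 1, judged by the difference of their average tops."""
    means = [sum(tops) / len(tops) for tops in line_tops]
    return [
        i
        for i in range(len(means) - 1)
        if mi < abs(means[i + 1] - means[i]) < ni
    ]


# Upper coordinates of the characters in four text lines; the last line is
# a signature far below the table.
line_tops = [[100, 103, 98], [140, 142], [181, 179], [400]]
positions = table_line_positions(line_tops, mi=5, ni=60)
```

- In this made-up data the gap before the signature line exceeds ni, so it is not reported as a table line.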
- Step 805: For each text line, calculate the absolute value of the difference between the right coordinate of each character and the left coordinate of the adjacent character on its right (hereinafter referred to as the left-right coordinate difference of two adjacent characters), take all the differences as a discrete sequence, and use Gaussian kernel density estimation to find the minimum points on the fitted curve, similar to step 803. For the table in this embodiment, 3 minimum points (the number of types of column spacing) are obtained through Gaussian kernel density estimation.
- the x-axis coordinates of the 3 minimum points represent the text column spacing, the table column spacing, and other spacing (large-distance column spacing greater than the table column spacing, such as the spacing on both sides of the table). The text column spacing can be regarded as the maximum column spacing between characters in the same cell (the smallest unit of the table, consisting of one row and one column); the text column spacing is smaller than the table column spacing, and the table column spacing is smaller than the other, large-distance column spacing. There may be one or more other spacings. In this embodiment, when the left-right coordinate difference of two adjacent characters is greater than the text column spacing and less than or equal to the table column spacing, the two characters are considered to be in different table columns.
- the description above takes a left-to-right writing habit as an example. If the writing habit is from right to left, calculate, for each text line, the absolute value of the difference between the left coordinate of each character and the right coordinate of the adjacent character on its left. Another adjacency case is that the last character in one row and the first character in another row are also considered adjacent characters.
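- For a left-to-right text line, splitting the characters into cells by their left-right coordinate difference can be sketched as below. This is a simplification: any gap larger than the text column spacing starts a new cell, without separately checking the table column spacing, and the dictionary keys, sample line, and threshold are invented for the example.

```python
def split_into_cells(line, text_col_spacing):
    """Split one text line (characters sorted by left coordinate) into cells.
    A gap wider than text_col_spacing is treated as a column boundary."""
    if not line:
        return []
    cells, current = [], [line[0]]
    for prev, cur in zip(line, line[1:]):
        gap = abs(cur["left"] - prev["right"])
        if gap > text_col_spacing:
            cells.append(current)
            current = [cur]
        else:
            current.append(cur)
    cells.append(current)
    return ["".join(c["text"] for c in cell) for cell in cells]


# Made-up line from a laboratory test form: item abbreviation, then result.
line = [
    {"text": "W", "left": 0, "right": 10},
    {"text": "B", "left": 12, "right": 22},
    {"text": "C", "left": 24, "right": 34},
    {"text": "5", "left": 80, "right": 90},
    {"text": ".", "left": 91, "right": 95},
    {"text": "2", "left": 96, "right": 106},
]
cells = split_into_cells(line, text_col_spacing=8)
```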
- Step 806: The text is reorganized according to the table rows determined in step 804 and the table columns determined in step 805, and the table is reconstructed.
- Using the method of the embodiments of the present disclosure does not depend on the presence or absence of a frame line in the table, because even if there is no frame line, the row spacing still exists, and the column spacing still exists.
- FIG. 12 shows the recognition result of the exemplary embodiment.
- display styles for different types of forms can be preset for display on the terminal interface. The display style corresponding to the current form is determined according to identification of the header and/or footer; after the above recognition method is applied and the recognition result shown in FIG. 12 is obtained, the recognized content is displayed to the user in the preset display style.
- FIGS. 13a and 13b show the display effect of part of the content.
- the display style is set to black large font, and the unit and reference range are reference content, so the display style is set to gray small font.
- the serial number and English abbreviation are not the content that the user cares about, in this example, they are not displayed.
- if the table is a standardized table, the standardized content such as item name, unit, and reference range can be filled in advance when the style is preset, and the item names or serial numbers in the preset style can be used for comparison.
- Figures 13a and 13b are only a display example; in other embodiments, other styles can be set as required.
- the technical solution of the present disclosure can achieve good results on borderless forms in medical scenarios, such as physical examination reports and laboratory test forms.
- the method according to the embodiment of the present disclosure has higher robustness and anti-interference ability, as well as higher recognition speed and accuracy.
- the method of the embodiments of the present disclosure still achieves good results when the borders of the table are missing, and is especially suitable for recognizing borderless tables in medical scenarios, such as medical examination reports and laboratory test forms.
- a terminal device (referred to as a terminal for short) is also provided.
- the terminal device may include a processor, a memory, and a computer program that is stored on the memory and can run on the processor.
- when the processor executes the computer program, it can implement the character processing method or the character recognition method in the embodiments of the present disclosure.
- the terminal device 1300 may include: a processor 1310, a memory 1320, a bus system 1330, and a transceiver 1340, where the processor 1310, the memory 1320, and the transceiver 1340 are connected through the bus system 1330, the memory 1320 is configured to store instructions, and the processor 1310 is configured to execute the instructions stored in the memory 1320 to implement the character processing or character recognition method as described above.
- the transceiver 1340 is configured to receive the coordinates of the characters in the image obtained by recognition and send them to the processor, and is also configured to output the character processing result of the processor.
- the transceiver 1340 is configured to obtain the image to be processed and receive the character recognition result of the processor.
- the terminal may be, for example, a mobile handheld device such as a mobile phone and a tablet computer.
- the transceiver 1340 may include a camera, which obtains the image to be processed by capturing it; or the transceiver 1340 may obtain the image to be processed from another application installed in the mobile phone (for example, an application with an image transmission function); or the transceiver 1340 may obtain the coordinates of the characters in the recognized image from another application installed in the mobile phone (for example, an application with a text recognition function).
- the transceiver 1340 may also include a display module configured to receive and display the character processing result (the result obtained by the processor performing the character processing method) or the character recognition result (the result obtained by the processor performing the character recognition method) output by the processor; or the transceiver 1340 includes other application programs that are configured to receive the character processing result or character recognition result output by the processor and process the received result.
- the processor 1310 may be a central processing unit (CPU); the processor 1310 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- The memory 1320 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1310. A part of the memory 1320 may also include a non-volatile random access memory. For example, the memory 1320 may also store device type information.
- The bus system 1330 may also include a power bus, a control bus, and a status signal bus. For clarity of description, however, the various buses are all marked as the bus system 1330 in FIG. 14.
- The processing performed by the terminal device may be completed by an integrated logic circuit of hardware in the processor 1310 or by instructions in the form of software. That is, the steps of the method disclosed in the embodiments of the present disclosure may be executed and completed by a hardware processor, or by a combination of hardware and software modules in the processor.
- The software module may be located in a storage medium such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
- The storage medium is located in the memory 1320. The processor 1310 reads the information in the memory 1320 and completes the steps of the foregoing method in combination with its hardware. To avoid repetition, the details are not described again here.
- Such software may be distributed on a computer-readable medium, and the computer-readable medium may include a computer storage medium (or a non-transitory medium) and a communication medium (or a transitory medium).
- The term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data).
- Computer storage media include but are not limited to RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tapes, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
- Communication media usually contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transmission mechanism, and may include any information delivery media.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Character Input (AREA)
Claims (16)
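- A character processing method, comprising: identifying the coordinates of characters in an image; and performing a clustering calculation on the differences between first coordinate values of two adjacent characters using a kernel density function, to determine the characters belonging to the same line.
- The method according to claim 1, wherein the line comprises a row, and performing the clustering calculation on the differences between the first coordinate values of two adjacent characters using the kernel density function to determine the characters belonging to the same line comprises: performing a clustering calculation on the differences between first ordinates of two horizontally adjacent characters using the kernel density function to determine a first horizontal spacing, and determining the characters belonging to the same row according to the first horizontal spacing; or the line comprises a column, and performing the clustering calculation on the differences between the first coordinate values of two adjacent characters using the kernel density function to determine the characters belonging to the same line comprises: performing a clustering calculation on the differences between first abscissas of two vertically adjacent characters using the kernel density function to determine a first vertical spacing, and determining the characters belonging to the same column according to the first vertical spacing.
- The method according to claim 2, wherein performing the clustering calculation on the differences between the first ordinates of two horizontally adjacent characters using the kernel density function to determine the first horizontal spacing, and determining the characters belonging to the same row according to the first horizontal spacing, comprises: for any two horizontally adjacent characters, namely a first character and a second character, calculating the difference between the first ordinate of one or more points in the first character and the first ordinate of the corresponding one or more points in the second character; and performing a clustering calculation on all the calculated differences using the kernel density function, determining the first horizontal spacing according to the kernel density function calculation result, and judging the difference between the first ordinates of two horizontally adjacent characters according to the first horizontal spacing, to determine the characters belonging to the same character row.
- The method according to claim 2, wherein performing the clustering calculation on the differences between the first abscissas of two vertically adjacent characters using the kernel density function to determine the first vertical spacing, and determining the characters belonging to the same column according to the first vertical spacing, comprises: for any two vertically adjacent characters, namely a third character and a fourth character, calculating the difference between the first abscissa of one or more points in the third character and the first abscissa of the corresponding one or more points in the fourth character; and performing a clustering calculation on all the calculated differences using the kernel density function, determining the first vertical spacing according to the kernel density function calculation result, and judging the difference between the first abscissas of two vertically adjacent characters according to the first vertical spacing, to determine the characters belonging to the same character column.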
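- The method according to claim 3, further comprising: determining a second horizontal spacing according to the kernel density function calculation result, and judging the difference between the first ordinates of two horizontally adjacent characters according to the second horizontal spacing, to determine a row spacing; and performing a clustering calculation on the differences between second coordinate values of two horizontally adjacent characters using a kernel density function to determine a third horizontal spacing, and judging the difference between the second coordinate values of two horizontally adjacent characters according to the third horizontal spacing, to determine the characters belonging to the same character column group.
- The method according to claim 5, wherein the difference between the second coordinate values of two horizontally adjacent characters comprises: the difference between the second abscissa of one or more points in a fifth character and the second abscissa of the corresponding one or more points in a sixth character, the fifth character and the sixth character being two horizontally adjacent characters.
- The method according to claim 5, wherein judging the difference between the first ordinates of two horizontally adjacent characters according to the first horizontal spacing to determine the characters belonging to the same character row, and determining the second horizontal spacing according to the kernel density function calculation result and judging the difference between the first ordinates of two horizontally adjacent characters according to the second horizontal spacing to determine the row spacing, comprise: the kernel density function calculation result comprises at least two minima, including a smallest first minimum and a second minimum greater than the first minimum; when the difference between the first ordinates of the two horizontally adjacent characters is judged to be less than or equal to the first minimum, determining that the two horizontally adjacent characters belong to the same character row; and when the difference between the first ordinates of the two horizontally adjacent characters is judged to be greater than the first minimum and less than or equal to the second minimum, determining that a row spacing exists between the two horizontally adjacent characters.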
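- The method according to claim 4, further comprising: determining a second vertical spacing according to the kernel density function calculation result, and judging the difference between the first abscissas of two vertically adjacent characters according to the second vertical spacing, to determine a column spacing; and performing a clustering calculation on the differences between second coordinate values of two vertically adjacent characters using a kernel density function to determine a third vertical spacing, and judging the difference between the second coordinate values of two vertically adjacent characters according to the third vertical spacing, to determine the character rows belonging to the same group.
- The method according to claim 8, wherein the difference between the second coordinate values of two vertically adjacent characters comprises: the difference between the second ordinate of one or more points in a seventh character and the second ordinate of the corresponding one or more points in an eighth character, the seventh character and the eighth character being two vertically adjacent characters.
- The method according to claim 8, wherein judging the difference between the first abscissas of two vertically adjacent characters according to the first vertical spacing to determine the characters belonging to the same character column, and determining the second vertical spacing according to the kernel density function calculation result and judging the difference between the first abscissas of two vertically adjacent characters according to the second vertical spacing to determine the column spacing, comprise: the kernel density function calculation result comprises at least two minima, including a smallest fifth minimum and a sixth minimum greater than the fifth minimum; when the difference between the first abscissas of the two vertically adjacent characters is judged to be less than or equal to the fifth minimum, determining that the two vertically adjacent characters belong to the same character column; and when the difference between the first abscissas of the two vertically adjacent characters is judged to be greater than the fifth minimum and less than or equal to the sixth minimum, determining that a column spacing exists between the two vertically adjacent characters.
- The method according to any one of claims 1 to 10, wherein the kernel density function is a Gaussian kernel density function.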
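- A character recognition method, comprising: preprocessing an image; identifying the coordinates of characters in the image; and performing recognition processing on the characters in the image using the character processing method according to any one of claims 1 to 11, to determine the characters belonging to the same line.
- The method according to claim 12, wherein preprocessing the image comprises one or more of the following: color-to-grayscale conversion, Gaussian filtering, background extraction, contrast compensation, binarization, and perspective transformation.
- The method according to claim 12, wherein, after the characters belonging to the same line are determined, the method further comprises: displaying some or all of the one or more lines of characters determined by the recognition processing.
- A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the method according to any one of claims 1 to 11 or any one of claims 12 to 14.
- A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 11 or any one of claims 12 to 14.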
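The row-grouping idea recited in claims 1, 3, 7 and 11 can be illustrated with a minimal Python sketch. It is not the applicant's implementation and forms no part of the claims. It rests on several assumptions: each character is reduced to a single point `(label, x, y)`, characters are already in reading order, the absolute ordinate difference is used, the bandwidth value is arbitrary, and a "minimum" of the kernel density result is read as the position of a local minimum of the estimated density. All function and parameter names are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def density_minima(samples, bandwidth, steps=400):
    """Positions of local minima of the KDE on a uniform grid (ascending)."""
    lo, hi = min(samples), max(samples)
    density = gaussian_kde(samples, bandwidth)
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [density(x) for x in xs]
    return [
        xs[i]
        for i in range(1, steps)
        if ys[i] < ys[i - 1] and ys[i] <= ys[i + 1]
    ]


def group_rows(chars, bandwidth=2.0):
    """Group characters (label, x, y), given in reading order, into rows.

    Assumption: the smallest density minimum serves as the threshold that
    separates "same row" differences from larger ones.
    """
    if len(chars) < 2:
        return [[c[0]] for c in chars]
    # Ordinate difference between each pair of successive characters.
    diffs = [abs(b[2] - a[2]) for a, b in zip(chars, chars[1:])]
    minima = density_minima(diffs, bandwidth)
    # With no interior minimum, every difference is treated as "same row".
    threshold = minima[0] if minima else max(diffs)
    rows = [[chars[0][0]]]
    for diff, char in zip(diffs, chars[1:]):
        if diff <= threshold:
            rows[-1].append(char[0])
        else:
            rows.append([char[0]])
    return rows


# Hypothetical sample: ten characters whose ordinates cluster near 10, 41, 70.
ys = [10, 11, 10, 12, 42, 41, 40, 71, 70, 70]
chars = [(label, 20 * i, y) for i, (label, y) in enumerate(zip("ABCDEFGHIJ", ys))]
rows = group_rows(chars)
print(rows)
```

In the sample, the small differences (0 to 2) and the two large jumps (30 and 31) form two separate density clusters, so the threshold falls between them and the ten characters split into three rows. The same routine applied to abscissa differences of vertically adjacent characters would sketch the column case of claim 4.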
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
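PCT/CN2020/076828 WO2021168703A1 (zh) | 2020-02-26 | 2020-02-26 | Character processing and character recognition methods, storage medium and terminal device |
CN202080000183.0A CN113557520A (zh) | 2020-02-26 | 2020-02-26 | Character processing and character recognition methods, storage medium and terminal device |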
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/076828 WO2021168703A1 (zh) | 2020-02-26 | 2020-02-26 | Character processing and character recognition methods, storage medium and terminal device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021168703A1 (zh) | 2021-09-02 |
Family
ID=77489814
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/076828 WO2021168703A1 (zh) | Character processing and character recognition methods, storage medium and terminal device | 2020-02-26 | 2020-02-26 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113557520A (zh) |
WO (1) | WO2021168703A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
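CN116740059A (zh) * | 2023-08-11 | 2023-09-12 | 济宁金康工贸股份有限公司 | Intelligent regulation method for door and window machining |
CN117037194A (zh) * | 2023-05-10 | 2023-11-10 | 广州方舟信息科技有限公司 | Table recognition method and apparatus for document images, electronic device, and storage medium |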
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116071771A (zh) * | 2023-03-24 | 2023-05-05 | 南京燧坤智能科技有限公司 | Table reconstruction method and apparatus, non-volatile storage medium, and electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110222752A1 (en) * | 2008-04-08 | 2011-09-15 | Three Palm Software | Microcalcification enhancement from digital mammograms |
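CN108597605A (zh) * | 2018-03-19 | 2018-09-28 | 特斯联(北京)科技有限公司 | Big data collection and analysis system for personal healthy living |
CN108628824A (zh) * | 2018-04-08 | 2018-10-09 | 上海熙业信息科技有限公司 | Entity recognition method based on Chinese electronic medical records |
CN109815958A (zh) * | 2019-02-01 | 2019-05-28 | 杭州睿琪软件有限公司 | Laboratory test sheet recognition method and apparatus, electronic device, and storage medium |
CN110837796A (zh) * | 2019-11-05 | 2020-02-25 | 泰康保险集团股份有限公司 | Image processing method and apparatus |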
2020
- 2020-02-26: CN application CN202080000183.0A, published as CN113557520A (zh), active, Pending
- 2020-02-26: WO application PCT/CN2020/076828, published as WO2021168703A1 (zh), active, Application Filing
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
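CN117037194A (zh) * | 2023-05-10 | 2023-11-10 | 广州方舟信息科技有限公司 | Table recognition method and apparatus for document images, electronic device, and storage medium |
CN116740059A (zh) * | 2023-08-11 | 2023-09-12 | 济宁金康工贸股份有限公司 | Intelligent regulation method for door and window machining |
CN116740059B (zh) * | 2023-08-11 | 2023-10-20 | 济宁金康工贸股份有限公司 | Intelligent regulation method for door and window machining |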
Also Published As
Publication number | Publication date |
---|---|
CN113557520A (zh) | 2021-10-26 |
Similar Documents
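Publication | Title |
---|---|
US20210256253A1 (en) | Method and apparatus of image-to-document conversion based on ocr, device, and readable storage medium |
WO2021168703A1 (zh) | Character processing and character recognition methods, storage medium and terminal device |
US11929048B2 (en) | Method and device for marking target cells, storage medium and terminal device |
WO2019174130A1 (zh) | Bill recognition method, server, and computer-readable storage medium |
WO2020253508A1 (zh) | Abnormal cell detection method and apparatus, and computer-readable storage medium |
WO2018233055A1 (zh) | Method and apparatus for entering insurance policy information, computer device, and storage medium |
WO2018233038A1 (zh) | Deep-learning-based license plate recognition method, apparatus, device, and storage medium |
US20200117943A1 (en) | Method and apparatus for positioning text over image, electronic apparatus, and storage medium |
CN108830780B (zh) | Image processing method and apparatus, electronic device, and storage medium |
JP2016517587A (ja) | Classification of objects in digital images captured using mobile devices |
JP2016516245A (ja) | Classification of objects in images using mobile devices |
WO2018049801A1 (zh) | Heuristic finger detection method based on depth map |
BRPI0708452A2 (pt) | Model-based distortion correction method and apparatus |
US10169673B2 (en) | Region-of-interest detection apparatus, region-of-interest detection method, and recording medium |
WO2020143316A1 (zh) | Certificate image extraction method and terminal device |
WO2019200802A1 (zh) | Method for recognizing contract image pictures, electronic apparatus, and readable storage medium |
US20180253852A1 (en) | Method and device for locating image edge in natural background |
US20200012879A1 (en) | Text region positioning method and device, and computer readable storage medium |
WO2020248848A1 (zh) | Intelligent abnormal cell determination method and apparatus, and computer-readable storage medium |
WO2020038312A1 (zh) | Multi-channel tongue body edge detection apparatus, method, and storage medium |
CN107845068A (zh) | Image perspective transformation apparatus and method |
WO2021189856A1 (zh) | Certificate verification method and apparatus, electronic device, and medium |
CN115273115A (zh) | Document element annotation method and apparatus, electronic device, and storage medium |
CN114359932B (zh) | Text detection method, text recognition method, and apparatus |
CN112419207A (zh) | Image correction method, apparatus, and system |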
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20922251; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 20922251; Country of ref document: EP; Kind code of ref document: A1 |
122 | Ep: pct application non-entry in european phase | Ref document number: 20922251; Country of ref document: EP; Kind code of ref document: A1 |
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05.04.2023) |
122 | Ep: pct application non-entry in european phase | Ref document number: 20922251; Country of ref document: EP; Kind code of ref document: A1 |