CN116824594B - Text ordering method for positioning keywords in image - Google Patents

Info

Publication number
CN116824594B
Authority
CN
China
Prior art keywords
connected domain
coordinate
domains
connected domains
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310834541.3A
Other languages
Chinese (zh)
Other versions
CN116824594A (en)
Inventor
韦箫华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Xike Intelligent Technology Co ltd
Original Assignee
Guangdong Xike Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Xike Intelligent Technology Co ltd
Priority to CN202310834541.3A
Publication of CN116824594A
Application granted
Publication of CN116824594B
Active (current legal status)
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Character Input (AREA)

Abstract

The invention discloses a text ordering method for positioning keywords in images, relating to the technical field of machine vision and comprising the following steps: a. sorting all extracted connected domains once from left to right along the X direction and storing the result in a vector R1; b. moving all connected domains from R1 into a vector R0, leaving R1 empty; c. traversing R0 in forward order and, for each connected domain of R0, traversing R1 in reverse order to judge its relative position against the connected domains in R1; a connected domain that meets the conditions is taken out of R0 and put into R1, otherwise step d is executed; d. judging the relative position between the connected domain of R0 and the last connected domain of a vector DR; a connected domain that meets the conditions is taken out of R0 and put into DR, otherwise it is put into R1. The invention greatly reduces the probability of missed detection and false detection of keywords.

Description

Text ordering method for positioning keywords in image
Technical Field
The invention relates to the technical field of machine vision, in particular to a text ordering method for positioning keywords in images.
Background
Locating keywords in an image usually relies on OCR, and ordering the connected domains is an OCR preprocessing step that strongly affects the accuracy of keyword positioning. In this step the connected domains must be put into text order: their center coordinates are ordered from left to right and then from top to bottom, matching the order in which people read an article. The existing method (the 'character' ordering mode of the sort_region function in Halcon) generally splits all connected domains into rows as follows: connected domains whose minimum bounding rectangles overlap in the Y-axis direction by at least a set percentage are assigned to the same row, and the connected domains of each row are then ordered from left to right. This method can only handle images with a regular layout, whereas in practical applications the text layout on an image is often irregular. Applying the method to such images easily produces the errors shown in fig. 2 and 3 and in fig. 4 and 5. In fig. 2, the minimum bounding rectangle of the character "D" overlaps those of the characters "D" and "T" of the word "DOT" in the Y-axis direction, but does not overlap that of the character "O" of "DOT". The OCR recognition result is therefore "CORDDT" instead of the expected "CORDDOT", and the keyword "DOT" cannot be extracted correctly from the incorrect result. In fig. 4, the minimum bounding rectangle of the character "Y" overlaps that of the character "5" in the Y-axis direction, so the final recognition result is "2Y5" instead of the expected "2Y156", and the keyword "156" cannot be extracted correctly from the incorrect result.
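As an illustration of the overlap-based row splitting described in the background, the following Python sketch groups bounding boxes into rows by their vertical overlap. It is not Halcon's implementation: the box layout (x_min, y_min, x_max, y_max), the function names, the use of each row's first box as the comparison reference, the normalization by the shorter box height, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max), with y growing downward.
Box = Tuple[float, float, float, float]


def y_overlap_ratio(a: Box, b: Box) -> float:
    """Vertical overlap of two boxes divided by the height of the shorter box."""
    overlap = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(0.0, overlap) / shorter if shorter > 0 else 0.0


def group_rows_by_overlap(boxes: List[Box], min_ratio: float = 0.5) -> List[List[Box]]:
    """Assign each box to the first row whose reference box it overlaps enough."""
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):  # visit boxes top to bottom
        for row in rows:
            if y_overlap_ratio(row[0], box) >= min_ratio:
                row.append(box)
                break
        else:
            rows.append([box])  # no existing row matched: start a new row
    # Inside each row, order the boxes from left to right.
    return [sorted(row, key=lambda b: b[0]) for row in rows]
```

With boxes that drift vertically, for example [(0, 0, 10, 10), (12, 6, 22, 16), (24, 12, 34, 22)], this sketch produces three separate rows, because each neighboring pair overlaps by only 0.4 of the box height. A fixed overlap percentage is sensitive to exactly this kind of irregular layout.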
To avoid these errors, the invention provides a sorting method for the connected domains of an image, which greatly improves the accuracy of positioning keywords in images.
Disclosure of Invention
Aiming at the defects existing in the prior art, the invention aims to provide a text ordering method for positioning keywords in images.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A text ordering method for positioning keywords in an image, comprising the following steps:
Step 1: after all connected domains in the image are extracted, sorting them from left to right along the X axis by the X coordinate of the upper-left corner of each connected domain's minimum bounding rectangle, and storing the sorted results in order in a vector R1;
Step 2: taking all connected domains out of R1 and storing them in a vector R0, so that R1 becomes an empty vector; taking the first connected domain of R0 out and storing it in DR as the reference connected domain of a new row;
Step 3: traversing R0 in order of increasing index with loop index j, where R0[j] denotes the j-th connected domain in the vector R0, and obtaining the position parameters of R0[j]: min_j, max_j, cy_j, cx_j and up_j;
Step 4: when the traversal of R0 reaches the j-th connected domain, traversing R1 in order of decreasing index with loop index k, and comparing the relative position of R0[j] with each connected domain R1[k] in R1 that satisfies condition three; if conditions one and two are satisfied together with condition three, R1 contains a connected domain belonging to the same row as R0[j], so R0[j] is stored in R1; if R1 has been fully traversed without conditions one, two and three being satisfied simultaneously, executing step 5; otherwise returning to step 3 and traversing the next connected domain in R0, namely R0[j+1];
Step 5: taking the last connected domain of the vector DR, namely the connected domain DR[end] most recently stored in DR, and obtaining the position parameters of DR[end]: min_e, max_e, cy_e and cx_e; comparing the relative position of the connected domain R0[j] that was not stored in R1 in step 4 with DR[end]; if conditions four, five and six are satisfied simultaneously, storing R0[j] in DR, otherwise storing R0[j] in R1;
Step 6: judging whether the traversal of R0 is finished; if not, returning to step 3 and continuing the traversal; otherwise judging whether R1 is empty; if R1 is empty, all connected domains have been ordered and DR holds all the ordered connected domains; if R1 is not empty, not all connected domains have been ordered, so returning to step 2 and continuing until R1 is judged to be empty in step 6.
Further, R0 is initialized as empty and stores the connected domains that have not yet been ordered; R1 is initialized with the connected domains sorted from left to right along the X axis and is subsequently used to temporarily store connected domains that do not yet conform to the ordering rule; DR stores the finally output connected domains.
Further, cx_k denotes the center X coordinate of the minimum bounding rectangle of the k-th connected domain, cy_k denotes the center Y coordinate of the minimum bounding rectangle of the k-th connected domain, max_k denotes the maximum Y coordinate of the minimum bounding rectangle of the k-th connected domain, min_k denotes the minimum Y coordinate of the minimum bounding rectangle of the k-th connected domain, up_k denotes the midpoint between the center point and the upper boundary of the minimum bounding rectangle of the k-th connected domain, H denotes the distance between up_k and min_k, and W denotes the maximum distance between the center X coordinates of two connected domains whose relative positional relationship is judged; W must be set manually according to the actual situation and limits the comparison range.
Further, min_j denotes the minimum Y coordinate of the minimum bounding rectangle of the j-th connected domain in R0, and the other parameters with subscript j are defined analogously.
Further, min_e denotes the minimum Y coordinate of the minimum bounding rectangle of DR[end], and the other parameters with subscript e are defined analogously.
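The parameter definitions above imply simple relations between the quantities. The following is a worked sketch under two assumptions that the text does not state explicitly: image coordinates in which Y increases downward, so that the upper boundary of the rectangle is min_k (this is consistent with H being the distance between up_k and min_k), and a center point that is the midpoint of the rectangle.

```latex
\begin{aligned}
\mathit{cy}_k &= \frac{\mathit{min}_k + \mathit{max}_k}{2}, \\
\mathit{up}_k &= \frac{\mathit{min}_k + \mathit{cy}_k}{2} = \frac{3\,\mathit{min}_k + \mathit{max}_k}{4}, \\
H &= \mathit{up}_k - \mathit{min}_k = \frac{\mathit{max}_k - \mathit{min}_k}{4}.
\end{aligned}
```

For example, min_k = 10 and max_k = 50 give cy_k = 30, up_k = 20 and H = 10, so under these assumptions H is one quarter of the rectangle height.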
Compared with the prior art, the invention has the following beneficial effects:
Taking fig. 7 as an example, the final line segmentation result obtained with the method of the present invention is shown in fig. 14, where connected domains with the same color belong to the same row and connected domains with different colors belong to different rows. As fig. 15 and fig. 16 show, the present invention avoids the erroneous line segmentation that the existing method produces on fig. 2 and fig. 4: every connected domain of the keyword "DOT" in fig. 2 is assigned to the same row, as is every connected domain of the keyword "156" in fig. 4, which helps to extract the required keywords correctly from the subsequent OCR recognition result.
Drawings
FIG. 1 is a flow chart of the text ordering method for positioning keywords in an image according to the present invention;
FIG. 2 is sample diagram one of an image text layout in the prior art;
FIG. 3 shows the line segmentation result of the prior-art method on sample diagram one;
FIG. 4 is sample diagram two of an image text layout in the prior art;
FIG. 5 shows the line segmentation result of the prior-art method on sample diagram two;
FIG. 6 is a diagram defining the position parameters of the minimum bounding rectangle of a connected domain;
FIG. 7 shows the result of sorting the minimum bounding rectangles of the connected domains along the X axis;
FIG. 8 shows conditions one, two and three, which must be satisfied for the connected domain to be ordered and an unordered connected domain to belong to the same row;
FIG. 9 illustrates erroneous line segmentation case one;
FIG. 10 shows erroneous line segmentation case one corrected to the correct line segmentation;
FIG. 11 shows conditions four, five and six, which must be satisfied for the connected domain to be ordered and an unordered connected domain to belong to the same row;
FIG. 12 illustrates erroneous line segmentation case two;
FIG. 13 shows erroneous line segmentation case two corrected to the correct line segmentation;
FIG. 14 shows the result of ordering the connected domains of FIG. 7 according to the present invention;
FIG. 15 shows the result of ordering the connected domains of FIG. 3 according to the present invention;
FIG. 16 shows the result of ordering the connected domains of FIG. 5 according to the present invention.
Detailed Description
Three vectors, named R0, R1 and DR, are defined with the following roles:
R0: initialized as empty; stores the connected domains that have not yet been ordered.
R1: initialized with the connected domains sorted from left to right along the X axis; subsequently used to temporarily store the connected domains that do not yet conform to the ordering rule.
DR: the finally output connected domains.
The variables describing the position of the minimum bounding rectangle of a connected domain are defined as follows:
cx_k: the center X coordinate of the minimum bounding rectangle of the k-th connected domain.
cy_k: the center Y coordinate of the minimum bounding rectangle of the k-th connected domain.
max_k: the maximum Y coordinate of the minimum bounding rectangle of the k-th connected domain.
min_k: the minimum Y coordinate of the minimum bounding rectangle of the k-th connected domain.
up_k: the midpoint between the center point and the upper boundary of the minimum bounding rectangle of the k-th connected domain.
H: the distance between up_k and min_k.
W: the maximum distance between the center X coordinates of two connected domains whose relative positional relationship is judged; this parameter must be set manually according to the actual situation and limits the comparison range.
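The position variables defined above can be held in a small record type. The following Python sketch is only one possible representation: the class name Domain, its field names, and the helper within_w are hypothetical, the Y axis is assumed to increase downward (so the upper boundary is the minimum Y coordinate), and treating W as an inclusive bound on the absolute difference of the center X coordinates is an interpretation of the definition of W.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Minimum bounding rectangle of one connected domain (hypothetical container)."""
    x_left: float   # X coordinate of the upper-left corner
    y_min: float    # min_k: minimum Y coordinate (upper boundary, assuming y grows downward)
    x_right: float
    y_max: float    # max_k: maximum Y coordinate

    @property
    def cx(self) -> float:
        """cx_k: center X coordinate."""
        return (self.x_left + self.x_right) / 2.0

    @property
    def cy(self) -> float:
        """cy_k: center Y coordinate."""
        return (self.y_min + self.y_max) / 2.0

    @property
    def up(self) -> float:
        """up_k: midpoint between the center point and the upper boundary."""
        return (self.y_min + self.cy) / 2.0

    @property
    def h(self) -> float:
        """H: distance between up_k and min_k."""
        return self.up - self.y_min


def within_w(a: Domain, b: Domain, w: float) -> bool:
    """Assumed use of W: compare two domains only if their centers are at most w apart in X."""
    return abs(a.cx - b.cx) <= w
```

For a rectangle spanning Y from 10 to 50, this gives cy = 30, up = 20 and h = 10. Keeping the derived values as properties avoids storing numbers that could drift out of sync with the rectangle itself.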
Referring to fig. 1 to 16, a text ordering method for positioning keywords in an image includes the following steps:
Step 1: after all connected domains in the image are extracted, they are sorted from left to right along the X axis by the X coordinate of the upper-left corner of each connected domain's minimum bounding rectangle, as shown in fig. 7. The sorted results are stored in order in the vector R1.
Step 2: at this point the connected domains in R1 do not yet satisfy the final ordering rule. All connected domains in R1 are taken out and stored in a vector R0, so that R1 becomes an empty vector. The first connected domain of R0 (i.e., R0[0]) is taken out and stored in DR as the reference connected domain of a new row.
Step 3: R0 is traversed in order of increasing index with loop index j, where R0[j] denotes the j-th connected domain in the vector R0. The position parameters of R0[j] are obtained: min_j, max_j, cy_j, cx_j and up_j (where min_j denotes the minimum Y coordinate of the minimum bounding rectangle of the j-th connected domain in R0, and the other parameters are defined analogously).
Step 4: whenever a connected domain in R0 is reached (assume here that it is the j-th connected domain, R0[j]), R1 is traversed in order of decreasing index with loop index k, and the relative position of R0[j] is compared with each connected domain in R1 that satisfies condition three (assume here that it is R1[k]), as shown in fig. 8. If conditions one and two are satisfied together with condition three, R1 contains a connected domain belonging to the same row as R0[j], so R0[j] is stored in R1. This step prevents the erroneous line segmentation shown in fig. 9: connected domains c and a satisfy the same-row condition, but c and b also satisfy it, and b was not judged to belong to the same row as a and has already been stored in R1. In this case c should also be stored in R1, so that b and c can be assigned to the same row in the next round of judgment, as shown in fig. 10. If R1 has been fully traversed without conditions one, two and three being satisfied simultaneously, step 5 is executed; otherwise the method returns to step 3 and traverses the next connected domain in R0, namely R0[j+1].
Step 5: the last connected domain of the vector DR is taken, i.e., the connected domain most recently stored in DR (denoted DR[end]). The position parameters of DR[end] are obtained: min_e, max_e, cy_e and cx_e (where min_e denotes the minimum Y coordinate of the minimum bounding rectangle of DR[end], and the other parameters are defined analogously). The relative position of the connected domain R0[j] that was not stored in R1 in step 4 is compared with DR[end], as shown in fig. 11. If conditions four, five and six are satisfied simultaneously, R0[j] is stored in DR; otherwise R0[j] is stored in R1. This step achieves the result shown in fig. 10, placing b and c in the same row and a in another row. It also avoids the erroneous line segmentation shown in fig. 12, in which connected domains a and b satisfy the same-row condition but c and a do not: step 5 judges the relative position of c against b, finds that c and b satisfy the same-row condition, and therefore assigns a, b and c to the same row, as shown in fig. 13.
Step 6: it is judged whether the traversal of R0 is finished. If not, the method returns to step 3 and continues the traversal. Otherwise it is judged whether R1 is empty: if R1 is empty, all connected domains have been ordered and DR holds all the ordered connected domains; if R1 is not empty, not all connected domains have been ordered, and the method returns to step 2 and continues until R1 is judged to be empty in step 6.
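The control flow of steps 1 to 6 can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: conditions one to six are defined only in fig. 8 and fig. 11, so they are passed in as two caller-supplied predicates, matches_pending (standing for conditions one, two and three together) and matches_row_end (standing for conditions four, five and six together). The function and parameter names are hypothetical, and reading "taken out" in step 2 as removing R0[0] from the traversal is an interpretation.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def order_domains(
    domains: Sequence[T],
    left_x: Callable[[T], float],
    matches_pending: Callable[[T, T], bool],   # stand-in for conditions one, two and three
    matches_row_end: Callable[[T, T], bool],   # stand-in for conditions four, five and six
) -> List[T]:
    r1 = sorted(domains, key=left_x)           # Step 1: sort by upper-left X coordinate
    dr: List[T] = []
    while r1:                                  # Step 6: repeat until R1 is empty
        r0, r1 = r1, []                        # Step 2: move everything from R1 into R0
        dr.append(r0[0])                       # R0[0] becomes the reference of a new row
        for cand in r0[1:]:                    # Step 3: forward traversal of R0
            # Step 4: reverse traversal of R1; a match means the candidate's row
            # is still pending, so the candidate is deferred as well.
            if any(matches_pending(cand, other) for other in reversed(r1)):
                r1.append(cand)
            # Step 5: otherwise compare against DR[end], the last stored domain.
            elif matches_row_end(cand, dr[-1]):
                dr.append(cand)
            else:
                r1.append(cand)
    return dr
```

As a toy usage example, take domains as (x, cy) pairs and use the placeholder predicate `same = lambda a, b: abs(a[1] - b[1]) <= 5` for both arguments. The input [(0, 10), (5, 30), (10, 10), (15, 30), (20, 10)] is then returned as [(0, 10), (10, 10), (20, 10), (5, 30), (15, 30)], and the gradually drifting input [(0, 10), (10, 14), (20, 18)] stays in one row because each candidate is compared with DR[end] rather than with the row's first element. Every pass moves at least R0[0] into DR, so the loop terminates for any choice of predicates.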
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; all technical solutions falling under the concept of the present invention belong to its protection scope. It should be noted that modifications and adaptations made by those skilled in the art without departing from the principles of the present invention are also to be regarded as falling within the protection scope of the present invention.
The foregoing describes one embodiment of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by the present invention.

Claims (3)

1. A text ordering method for positioning keywords in an image, comprising the following steps:
Step 1: after all connected domains in the image are extracted, sorting them from left to right along the X axis by the X coordinate of the upper-left corner of each connected domain's minimum bounding rectangle, and storing the sorted results in order in a vector R1;
Step 2: taking all connected domains out of R1 and storing them in a vector R0, so that R1 becomes an empty vector; taking the first connected domain of R0 out and storing it in DR as the reference connected domain of a new row;
Step 3: traversing R0 in order of increasing index with loop index j, where R0[j] denotes the j-th connected domain in the vector R0, and obtaining the position parameters of R0[j]: min_j, max_j, cy_j, cx_j and up_j, wherein min_j denotes the minimum Y coordinate of the minimum bounding rectangle of the j-th connected domain in R0, max_j denotes the maximum Y coordinate of the minimum bounding rectangle of the j-th connected domain in R0, cx_j denotes the center X coordinate of the minimum bounding rectangle of the j-th connected domain in R0, cy_j denotes the center Y coordinate of the minimum bounding rectangle of the j-th connected domain in R0, and up_j denotes the midpoint between the center point and the upper boundary of the minimum bounding rectangle of the j-th connected domain in R0;
Step 4: when the traversal of R0 reaches the j-th connected domain, traversing R1 in order of decreasing index with loop index k, and comparing the relative position of R0[j] with each connected domain R1[k] in R1 that satisfies condition three; if conditions one and two are satisfied together with condition three, R1 contains a connected domain belonging to the same row as R0[j], so R0[j] is stored in R1; if R1 has been fully traversed without conditions one, two and three being satisfied simultaneously, executing step 5; otherwise returning to step 3 and traversing the next connected domain in R0, namely R0[j+1];
Step 5: taking the last connected domain of the vector DR, namely the connected domain DR[end] most recently stored in DR, and obtaining the position parameters of DR[end]: min_e, max_e, cy_e and cx_e, wherein min_e denotes the minimum Y coordinate of the minimum bounding rectangle of DR[end], max_e denotes the maximum Y coordinate of the minimum bounding rectangle of DR[end], cx_e denotes the center X coordinate of the minimum bounding rectangle of DR[end], and cy_e denotes the center Y coordinate of the minimum bounding rectangle of DR[end]; comparing the relative position of the connected domain R0[j] that was not stored in R1 in step 4 with DR[end]; if conditions four, five and six are satisfied simultaneously, storing R0[j] in DR, otherwise storing R0[j] in R1;
Step 6: judging whether the traversal of R0 is finished; if not, returning to step 3 and continuing the traversal; otherwise judging whether R1 is empty; if R1 is empty, all connected domains have been ordered and DR holds all the ordered connected domains; if R1 is not empty, not all connected domains have been ordered, so returning to step 2 and continuing until R1 is judged to be empty in step 6.
2. The text ordering method for positioning keywords in an image according to claim 1, wherein R0 is initialized as empty and stores the connected domains that have not yet been ordered, R1 is initialized with the connected domains sorted from left to right along the X axis and is subsequently used to temporarily store connected domains that do not conform to the ordering rule, and DR stores the finally output connected domains.
3. The text ordering method for positioning keywords in an image according to claim 2, wherein cx_k denotes the center X coordinate of the minimum bounding rectangle of the k-th connected domain, cy_k denotes the center Y coordinate of the minimum bounding rectangle of the k-th connected domain, max_k denotes the maximum Y coordinate of the minimum bounding rectangle of the k-th connected domain, min_k denotes the minimum Y coordinate of the minimum bounding rectangle of the k-th connected domain, up_k denotes the midpoint between the center point and the upper boundary of the minimum bounding rectangle of the k-th connected domain, H denotes the distance between up_k and min_k, and W denotes the maximum distance between the center X coordinates of two connected domains whose relative positional relationship is judged.
CN202310834541.3A 2023-07-10 2023-07-10 Text ordering method for positioning keywords in image Active CN116824594B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310834541.3A CN116824594B (en) 2023-07-10 2023-07-10 Text ordering method for positioning keywords in image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310834541.3A CN116824594B (en) 2023-07-10 2023-07-10 Text ordering method for positioning keywords in image

Publications (2)

Publication Number Publication Date
CN116824594A (en) 2023-09-29
CN116824594B (en) 2024-04-26

Family

ID=88139221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310834541.3A Active CN116824594B (en) 2023-07-10 2023-07-10 Text ordering method for positioning keywords in image

Country Status (1)

Country Link
CN (1) CN116824594B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105844275A (en) * 2016-03-25 2016-08-10 北京云江科技有限公司 Method for positioning text lines in text image
CN105989366A (en) * 2015-01-30 2016-10-05 深圳市思路飞扬信息技术有限责任公司 Inclination angle correcting method of text image, page layout analysis method of text image, vision assistant device and vision assistant system
WO2022142627A1 (en) * 2020-12-28 2022-07-07 深圳壹账通智能科技有限公司 Address information extraction method and apparatus, device and medium

Also Published As

Publication number Publication date
CN116824594A (en) 2023-09-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Unit 01-05 and 08, Floor 18, No. 15, the Pearl River West Road, Tianhe District, Guangzhou, Guangdong 510000

Applicant after: Guangdong Xike Intelligent Technology Co.,Ltd.

Address before: Unit 01-05, 08, Floor 18, No. 15, the Pearl River West Road, Tianhe District, Guangzhou, Guangdong 510000 (for office use only)

Applicant before: GUANGZHOU SICK SENSOR Co.,Ltd.

Country or region before: China

GR01 Patent grant