CN113780260A - Computer vision-based intelligent barrier-free character detection method - Google Patents


Publication number: CN113780260A (application CN202110849867.4A)
Authority: CN (China)
Prior art keywords: text box, free, text, barrier, color
Legal status: Granted
Application number: CN202110849867.4A
Other languages: Chinese (zh)
Other versions: CN113780260B (en)
Inventors: 卜佳俊, 燕雪雅, 周晟, 王炜, 于智
Current Assignee: Zhejiang University (ZJU)
Original Assignee: Zhejiang University (ZJU)
Application filed by Zhejiang University (ZJU) on 2021-07-27, with priority to CN202110849867.4A
Publication of CN113780260A; application granted and published as CN113780260B
Legal status: Active


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/23 — Clustering techniques
    • G06F 18/232 — Non-hierarchical techniques
    • G06F 18/2321 — Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 — Non-hierarchical techniques using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

A computer vision-based intelligent barrier-free text detection method: first, a web page or app is opened and a screenshot is taken; the image is given basic preprocessing and passed to an OCR (optical character recognition) model, which automatically predicts candidate text box positions and their confidence scores. Similar text boxes are then compared pairwise and fused, and the results are filtered by confidence. Next, the boxes are shape-normalized and edge detection is applied to determine the minimal extent of each text box. Finally, the confirmed text boxes are checked against barrier-free rules, including font size detection and color contrast detection, and boxes that violate the accessibility specification are flagged for developers to review and correct. The method applies to any web page or app, provides a unified intelligent detection scheme for barrier-free text, offers high accuracy, strong applicability and wide generality, and helps advance accessible and aging-friendly internet applications in China.

Description

Computer vision-based intelligent barrier-free character detection method
Technical field: the invention belongs to the field of information accessibility and is applicable to intelligently detecting whether text meets barrier-free application specifications.
Background art:
With the rapid development of internet technology, internet applications such as web pages and apps play an increasingly important role in people's lives. While ordinary users enjoy the convenience of information technology, disadvantaged groups such as visually impaired people and the elderly face greater challenges. In March 2020 China formally implemented the national standard "Information technology — Requirements and testing methods for the accessibility of internet content" (GB/T 37668-2019), establishing a national standard for internet information accessibility and vigorously advancing work on information accessibility and aging-friendly design. The standard emphasizes that information must remain readable at all times, implemented as concrete rules: the visual presentation of text size and color must meet defined criteria.
However, owing to the diversity of programming languages and technologies, the confidentiality of product code, and the complexity of applications, text size and text-background color contrast generally cannot be obtained accurately from source code. Moreover, in the few cases where text is recognized via OCR, the text regions are usually prepared manually in advance, and no existing character recognition technology has been applied against the national standard in the information accessibility field. Therefore, efficiently and accurately identifying candidate text boxes in arbitrarily complex scenes, detecting the text, and locating non-compliant text to drive accessibility improvements of product information has become an important guarantee for reading internet application information.
Summary of the invention:
Aiming at these technical difficulties, the invention provides a computer vision-based intelligent barrier-free text detection method. The method performs non-invasive intelligent detection: without accessing the source code, it rapidly and accurately identifies candidate text box regions in any complex interface, checks each one for violations of the information accessibility specification, and clearly marks the positions of offending text boxes. In addition, the method applies universally to web pages, apps, mini-programs and other internet applications, providing a unified intelligent detection scheme for barrier-free text, with high accuracy, strong applicability and wide generality, helping to advance accessible and aging-friendly internet applications in China.
A computer vision-based intelligent barrier-free text detection method comprises the following steps:
S1: open a web page or app, capture an interface screenshot, and preprocess it into the model's input format: reduce the image width and height to multiples of 32 and add a new leading (batch) dimension to the image matrix;
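As a minimal illustrative sketch of the S1 preprocessing, assuming a NumPy image array; the function name and the use of cropping in place of true interpolated resizing are this sketch's own choices, not from the patent:

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Shrink height and width to the nearest lower multiples of 32 and
    add a leading batch dimension, as in step S1. Cropping is a stand-in
    for resizing; a real pipeline would interpolate."""
    h, w = image.shape[:2]
    new_h = max(32, (h // 32) * 32)
    new_w = max(32, (w // 32) * 32)
    resized = image[:new_h, :new_w]          # placeholder for a true resize
    return np.expand_dims(resized, axis=0)   # shape: (1, new_h, new_w, 3)

img = np.zeros((100, 210, 3), dtype=np.uint8)
batch = preprocess(img)
# batch.shape == (1, 96, 192, 3)
```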
S2: pass the preprocessed image from S1 into an OCR model to identify candidate text boxes and their confidence, representing the result with a geometry map and a score map: the geometry map gives each pixel's distances to the four edges of its text box, and the score map gives the confidence that each pixel belongs to text;
s3: screening the possible text boxes identified in the S2 to determine a final text box set;
S31: preliminarily screen the candidate text boxes identified in S2 by confidence, discarding boxes below the confidence threshold score_map_thresh;
s32: calculating rectangular coordinates corresponding to four vertexes of the text box after primary screening according to a geometry map in S2;
S33: suppress and fuse similar text boxes by local non-maximum suppression and compute a new confidence;
s331: computing IoU values for local non-maximum suppression for the text box;
S332: if the IoU of two text boxes exceeds the IoU threshold nms_thresh, fuse them; otherwise continue traversing the next pair of text boxes;
S333: for the text boxes g and p to be fused in S332, the vertex coordinate matrix of the fused text box q is given by (1) and its confidence by (2), where the first 8 entries of a text box vector are the x and y coordinates of its four vertices and the 9th entry is the box confidence;
q[:8]=(g[8]*g[:8]+p[8]*p[:8])/(g[8]+p[8]) (1)
q[8]=(g[8]+p[8]) (2)
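Equations (1) and (2) — the confidence-weighted vertex merge used in EAST-style locality-aware NMS — can be sketched as follows (the helper name is this sketch's own; the 9-element box vector layout follows the patent's description):

```python
import numpy as np

def weighted_merge(g: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Fuse two text boxes per equations (1) and (2): entries 0..7 are the
    four vertex (x, y) coordinates, entry 8 is the confidence. Vertices
    are averaged weighted by confidence; confidences are summed."""
    q = np.zeros(9)
    q[:8] = (g[8] * g[:8] + p[8] * p[:8]) / (g[8] + p[8])
    q[8] = g[8] + p[8]
    return q

g = np.array([0, 0, 10, 0, 10, 4, 0, 4, 0.9])
p = np.array([2, 0, 12, 0, 12, 4, 2, 4, 0.1])
q = weighted_merge(g, p)
# q[0] -> 0.2 (pulled mostly toward g), q[8] -> 1.0
```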
S34: screen the fused text boxes a second time against the new confidence threshold box_thresh, discarding boxes whose confidence is below box_thresh;
S35: compute each text box's width and height from the rectangular coordinates obtained in S32; if either the width or the height is smaller than the length threshold length_thresh, discard the box;
S4: normalize the shapes of the text boxes screened in S3, expanding non-rectangular boxes into rectangles; then perform character edge detection: take the RGB matrix of the pixels along the box's upper boundary, examine how the color channel values vary, and move the boundary downward in steps of 1 pixel; when the color channel values along a boundary row differ sharply, the boundary is considered to have touched the character edge; detect upward from the lower boundary in the same way, so that the upper and lower edges of the text box are essentially flush with the characters, determining the minimal extent of the text box;
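The boundary-shrinking idea in S4 can be sketched roughly as below; the patent does not specify how a "large difference of the color channel values" is measured, so the per-row max-minus-min spread and the diff_thresh value are assumptions of this sketch:

```python
import numpy as np

def shrink_vertical(box_pixels: np.ndarray, diff_thresh: float = 30.0):
    """Tighten the top and bottom of a rectangular text box crop (H, W, 3),
    per step S4: walk the top boundary down (and the bottom boundary up)
    one pixel row at a time until a row's color values vary sharply, which
    is taken as touching the character edge. diff_thresh is an assumed
    tuning parameter, not from the patent."""
    h = box_pixels.shape[0]

    def row_varies(r: int) -> bool:
        # a row whose channel values span a wide range is assumed to hit text
        row = box_pixels[r].astype(float)
        return (row.max(axis=0) - row.min(axis=0)).max() > diff_thresh

    top, bottom = 0, h - 1
    while top < bottom and not row_varies(top):
        top += 1
    while bottom > top and not row_varies(bottom):
        bottom -= 1
    return top, bottom

# white background with a black "stroke" occupying rows 3..5
crop = np.full((10, 8, 3), 255, dtype=np.uint8)
crop[3:6, 2:6] = 0
top, bottom = shrink_vertical(crop)
# top == 3, bottom == 5
```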
S5: perform barrier-free font size detection on the text boxes determined in S4: convert the box height from pixels to pt, and if the height is less than 18 pt, mark on the image the position of each text box that violates the font size requirement of the barrier-free rules;
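The pixel-to-pt conversion in S5 depends on an assumed screen density, which the patent does not state; the sketch below assumes the common 96 dpi convention (1 px = 0.75 pt):

```python
def px_to_pt(height_px: float, dpi: float = 96.0) -> float:
    """Convert a text box height from pixels to points (1 pt = 1/72 inch).
    At the assumed 96 dpi, pt = px * 72 / 96 = px * 0.75."""
    return height_px * 72.0 / dpi

def violates_font_size_rule(height_px: float, min_pt: float = 18.0) -> bool:
    """Step S5: flag a box whose height converts to less than 18 pt."""
    return px_to_pt(height_px) < min_pt

# a 20 px tall box is 15 pt at 96 dpi -> flagged
# a 32 px tall box is 24 pt -> passes
```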
S6: perform barrier-free color contrast detection on the text boxes determined in S4, judging the color contrast between the characters and the background; if the contrast ratio is less than 4.5, the text color is hard to distinguish from the background, violating the text color contrast requirement of the barrier-free rules, and the position of each non-compliant text box is marked on the image;
S61: each pixel in the text box has three color values R, G and B; divide each value by 255 to obtain sR, sG and sB respectively;
S62: compute the linearized value tR of channel R from the sR obtained in S61: if sR ≤ 0.03928, tR is given by (3), otherwise by (4); compute tG and tB in the same way;
tR=sR/12.92,(sR≤0.03928) (3)
tR=((sR+0.055)/1.055)^2.4,(sR>0.03928) (4)
S63: compute the relative luminance t of a single pixel by (5); computing t for every pixel in the text box compresses the three-channel RGB matrix into a one-dimensional matrix;
t=0.2126*tR+0.7152*tG+0.0722*tB (5)
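Equations (3) through (5) match the WCAG relative-luminance computation; a direct transcription follows (the function names are this sketch's own):

```python
def channel_linearize(s: float) -> float:
    """Equations (3) and (4): sRGB channel linearization, s in [0, 1]."""
    return s / 12.92 if s <= 0.03928 else ((s + 0.055) / 1.055) ** 2.4

def relative_luminance(r: int, g: int, b: int) -> float:
    """Steps S61-S63: scale 8-bit channels to [0, 1], linearize each,
    then combine with the weights of equation (5)."""
    tR = channel_linearize(r / 255.0)
    tG = channel_linearize(g / 255.0)
    tB = channel_linearize(b / 255.0)
    return 0.2126 * tR + 0.7152 * tG + 0.0722 * tB

# relative_luminance(255, 255, 255) -> 1.0 (white)
# relative_luminance(0, 0, 0) -> 0.0 (black)
```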
S64: apply K-means clustering with K = 2 to the luminance values of the pixels within the text box, and take the mean value of each cluster to represent the luminance of the characters and of the background respectively;
S65: compute the contrast ratio of the characters and the background according to the contrast ratio formula (6).
ratio=(max(lum1,lum2)+0.05)/(min(lum1,lum2)+0.05) (6)
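Steps S64 and S65 can be sketched with a minimal hand-rolled two-cluster K-means over the luminance values followed by equation (6); the initialization and iteration count here are this sketch's choices (a real pipeline might call scikit-learn's KMeans instead):

```python
import numpy as np

def contrast_ratio(lums: np.ndarray, iters: int = 20) -> float:
    """Steps S64-S65: split per-pixel luminances into two clusters with a
    minimal K-means (K = 2), take each cluster's mean as the text /
    background luminance, and apply equation (6)."""
    c = np.array([lums.min(), lums.max()], dtype=float)  # initial centers
    for _ in range(iters):
        # assign each luminance to its nearest center, then recenter
        labels = np.abs(lums[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = lums[labels == k].mean()
    lum1, lum2 = c
    return (max(lum1, lum2) + 0.05) / (min(lum1, lum2) + 0.05)

# pure black text on a pure white background -> ratio of ~21, the maximum
lums = np.array([0.0] * 30 + [1.0] * 70)
```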
Specifically, the OCR model used in step S2 is EAST (An Efficient and Accurate Scene Text Detector); its basic network structure is PVANet, features are extracted by 4 convolutional stages and fused to produce the final prediction.
Preferably, the confidence threshold score_map_thresh for the preliminary text box screening in step S31 is 0.8.
Preferably, the IoU threshold nms_thresh for fusing text boxes in step S332 is 0.2.
Preferably, the threshold box_thresh for the secondary screening of fused text boxes in step S34 is 0.1.
Preferably, the length threshold length_thresh for filtering undersized text boxes in step S35 is 5.
In summary, the invention provides a computer vision-based intelligent barrier-free text detection method with the following beneficial effects: (1) Non-invasive: barrier-free text detection is performed without touching the application source code. (2) Universal: the method applies to web pages, apps, mini-programs and other internet applications, providing a unified barrier-free text detection scheme. (3) Intelligent: no manual preprocessing of text regions in the interface is required; OCR technology effectively extracts image features and automatically locates text box positions, supporting accurate detection in complex and varied business scenarios. (4) Standards-driven: targeting the information accessibility field, detection follows the national standard, effectively locating non-compliant text and driving barrier-free improvements of product information.
Description of the drawings:
To more clearly illustrate the embodiments of the invention and the prior art, the drawings used in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is the overall flow chart of the computer vision-based intelligent barrier-free text detection method provided by the invention;
FIG. 2 is the flow chart of text box screening in the method;
FIG. 3 is the flow chart of font size rule detection in the method;
FIG. 4 is the flow chart of color contrast rule detection in the method;
FIG. 5 is an example diagram of font size rule detection in the method;
FIG. 6 is an example diagram of color contrast rule detection in the method.
Detailed description of the embodiments:
exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In the embodiment, taking a certain APP as an example, the method comprises the following specific steps:
Steps S1 to S6 are carried out as described above.
FIG. 1 is the overall flow chart of the computer vision-based intelligent barrier-free text detection method.
FIG. 2 is the flow chart of text box screening in the computer vision-based intelligent barrier-free text detection method; steps S3 to S35 are carried out as described above.
FIG. 3 is the flow chart of font size rule detection in the computer vision-based intelligent barrier-free text detection method; step S5 is carried out as described above.
FIG. 4 is the flow chart of color contrast rule detection in the computer vision-based intelligent barrier-free text detection method; steps S6 to S65 are carried out as described above.

Claims (6)

1. A barrier-free character intelligent detection method based on computer vision is characterized by comprising the following steps:
s1: opening a webpage or App, acquiring an interface screenshot, preprocessing the data format of a model input image, respectively reducing the width and the height of the image to integral multiples of 32, and adding a full 0 dimension to an image matrix;
s2: transferring the preprocessed image in the S1 into an OCR model, identifying a text box which may exist and the confidence coefficient of the text box, and representing the identification result by using a geometry map and a score map, wherein the geometry map represents the distance from the pixel point to four edges of the text box, and the score map represents the confidence coefficient that each pixel point in the image is a character;
s3: screening the possible text boxes identified in the S2 to determine a final text box set;
s31: primarily screening the possible text boxes identified in the step S2 according to the confidence level, and discarding the text boxes below the confidence threshold score_map_thresh;
s32: calculating rectangular coordinates corresponding to four vertexes of the text box after primary screening according to a geometry map in S2;
s33: inhibiting and fusing similar text boxes according to the local maximum value, and calculating a new confidence coefficient;
s331: computing IoU values for local non-maximum suppression for the text box;
s332: if the IoU values of the two text boxes are larger than the IoU threshold nms_thresh, fusion is needed, otherwise, the next two text boxes are continuously traversed;
s333: for the text box g and the text box p needing to be fused in the S332, the vertex coordinate matrix value of the fused text box q is (1), and the confidence coefficient is (2), wherein the first 8 parameters of the text box matrix represent the x and y coordinates of four vertices, and the 9 th parameter represents the confidence coefficient of the text box;
q[:8]=(g[8]*g[:8]+p[8]*p[:8])/(g[8]+p[8]) (1)
q[8]=(g[8]+p[8]) (2)
s34: performing secondary screening on the fused text box according to the new confidence threshold box_thresh, and discarding the text box with the confidence lower than box_thresh;
s35: calculating the width and height of the text box according to the rectangular coordinates obtained in the step S32, and if either the width or the height is smaller than the length threshold length_thresh, discarding the text box;
s4: the shapes of the text boxes screened in the S3 are regulated, and the non-rectangular text boxes are expanded to be rectangular; then, character edge detection is carried out, an RGB matrix of a pixel point corresponding to the upper boundary of the character is taken, the change of the values of the color channels of all colors is judged, the boundary is downwards moved by taking 1 pixel as a step length, and if the difference of the color channel values of a certain boundary is large, the boundary is considered to be contacted with the character edge; in the same way, upward detection is carried out on the lower boundary of the matrix, so that the upper edge and the lower edge of the text box are basically flush with the characters, and the minimum range of the shape of the text box is determined;
s5: carrying out obstacle-free font size rule detection on the text box determined in the S4, converting the height of the text box into pt units by taking pixels as units, and marking out the position of the text box which does not conform to the requirements on the font size in the obstacle-free rule if the height of the text box is less than 18 pt;
s6: detecting the text box determined in the S4 according to the barrier-free color contrast rule, judging the color contrast difference between the characters and the background, if the difference value is less than 4.5, indicating that the color difference between the characters and the background is not easy to distinguish, namely, the requirements for the color contrast of the characters in the barrier-free rule are violated, and marking out the position of the text box which does not conform to the standard on the image;
s61: in the text box, R, G, B three color numerical values exist in each pixel point, and each color numerical value is divided by 255 to obtain sR, sG and sB respectively;
s62: calculating the contrast tR of the color R for sR obtained in S61, wherein if sR is less than or equal to 0.03928, tR is obtained by calculating in (3), otherwise, tR is obtained by calculating in (4); computing tG and tB in the same way;
tR=sR/12.92,(sR≤0.03928) (3)
tR=((sR+0.055)/1.055)^2.4,(sR>0.03928) (4)
s63: calculating the color contrast of a single pixel to obtain (5), and calculating the contrast of each pixel point in the text box, thereby compressing the RGB three-dimensional matrix into a one-dimensional matrix;
t=0.2126*tR+0.7152*tG+0.0722*tB (5)
s64: and performing binary clustering on the color contrast of the pixels in the text box range by using a K-Means clustering algorithm with K being 2, and calculating the average value of the contrast in each cluster to respectively represent the color contrast of characters and background.
S65: and (6) calculating the contrast difference value of the characters and the background color according to the contrast ratio calculation standard (6).
ratio=(max(lum1,lum2)+0.05)/(min(lum1,lum2)+0.05) (6)
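Equation (6) and the 4.5 threshold from step S6 together give the pass/fail decision. A sketch with illustrative function names:

```python
def contrast_ratio(lum1, lum2):
    """Equation (6): contrast ratio between two relative luminances."""
    hi, lo = max(lum1, lum2), min(lum1, lum2)
    return (hi + 0.05) / (lo + 0.05)

def violates_contrast_rule(lum_text, lum_bg, min_ratio=4.5):
    """A text box fails the barrier-free color rule of step S6 when the
    character/background contrast ratio falls below 4.5."""
    return contrast_ratio(lum_text, lum_bg) < min_ratio
```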
2. The computer vision-based intelligent barrier-free character detection method is characterized in that: in step S2, the OCR model used is EAST (An Efficient and Accurate Scene Text Detector); its base network structure is PVANet, features are extracted through four convolution stages and then fused to produce the final prediction.
3. The computer vision-based intelligent barrier-free character detection method is characterized in that: in step S31, the confidence threshold score_map_thresh for the preliminary screening of text boxes is preferably 0.8.
4. The computer vision-based intelligent barrier-free character detection method is characterized in that: in step S332, the IoU threshold nms_thresh for deciding whether to merge text boxes is preferably 0.2.
5. The computer vision-based intelligent barrier-free character detection method is characterized in that: in step S34, the threshold box_thresh for the second filtering of merged text boxes is preferably 0.1.
6. The computer vision-based intelligent barrier-free character detection method is characterized in that: in step S35, the side-length threshold length_thresh for filtering out text boxes with too small an area is preferably 5.
CN202110849867.4A 2021-07-27 2021-07-27 Barrier-free character intelligent detection method based on computer vision Active CN113780260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110849867.4A CN113780260B (en) 2021-07-27 2021-07-27 Barrier-free character intelligent detection method based on computer vision


Publications (2)

Publication Number Publication Date
CN113780260A true CN113780260A (en) 2021-12-10
CN113780260B CN113780260B (en) 2023-09-19

Family

ID=78836122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110849867.4A Active CN113780260B (en) 2021-07-27 2021-07-27 Barrier-free character intelligent detection method based on computer vision

Country Status (1)

Country Link
CN (1) CN113780260B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0411388D0 (en) * 2000-03-14 2004-06-23 Intel Corp Generalized text localization in images
CN101989303A (en) * 2010-11-02 2011-03-23 浙江大学 Automatic barrier-free network detection method
KR20140049525A (en) * 2014-01-22 2014-04-25 가천대학교 산학협력단 System and method for displaying visual information based on haptic display for blind person
CN103838823A (en) * 2014-01-22 2014-06-04 浙江大学 Website content accessible detection method based on web page templates
CN108229397A (en) * 2018-01-04 2018-06-29 华南理工大学 Method for text detection in image based on Faster R-CNN
CN109740542A (en) * 2019-01-07 2019-05-10 福建博思软件股份有限公司 Method for text detection based on modified EAST algorithm
CN110619331A (en) * 2019-09-20 2019-12-27 江苏鸿信系统集成有限公司 Color distance-based color image field positioning method
CN110874618A (en) * 2020-01-19 2020-03-10 同盾控股有限公司 OCR template learning method and device based on small sample, electronic equipment and medium
US20210110194A1 (en) * 2019-10-14 2021-04-15 Hangzhou Dianzi University Method for automatic extraction of data from graph


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Yi Jian; Peng Yuxin; Xiao Jianguo: "Video Text Recognition Method Based on Color Clustering and Multi-Frame Fusion", Journal of Software, vol. 22, no. 012 *
Li Yixin; Ma Jinwen: "Development and Challenges of Text Detection Algorithms", Journal of Signal Processing, no. 04 *
Wang Yinghan; Gao Fei; Bu Jiajun; Yu Zhi; Chen Ronghua: "A Web Page Distance Weight Learning Method for Information Accessibility Detection", Bulletin of Science and Technology, no. 09 *
Zhao Ying; Fu Peilei: "Research on a Semi-Automatic Web Accessibility Evaluation Tool Based on Subjective and Objective Detection Methods", Journal of Intelligence, no. 008 *

Also Published As

Publication number Publication date
CN113780260B (en) 2023-09-19

Similar Documents

Publication Publication Date Title
KR100248917B1 (en) Pattern recognizing apparatus and method
CN111563509B (en) Tesseract-based substation terminal row identification method and system
CN108229485B (en) Method and apparatus for testing user interface
CN111368903B (en) Model performance optimization method, device, equipment and storage medium
US8019164B2 (en) Apparatus, method and program product for matching with a template
CN107256379A (en) Information collecting method, mobile terminal and storage medium based on image recognition
CN113139445A (en) Table recognition method, apparatus and computer-readable storage medium
CN114549993B (en) Method, system and device for grading line segment image in experiment and readable storage medium
CN110196917B (en) Personalized LOGO format customization method, system and storage medium
CN106372624A (en) Human face recognition method and human face recognition system
CN111368682A (en) Method and system for detecting and identifying station caption based on faster RCNN
CN112241730A (en) Form extraction method and system based on machine learning
CN114663904A (en) PDF document layout detection method, device, equipment and medium
CN111738252B (en) Text line detection method, device and computer system in image
CN113221878A (en) Detection frame adjusting method and device applied to signal lamp detection and road side equipment
CN112052730A (en) 3D dynamic portrait recognition monitoring device and method
CN115205883A (en) Data auditing method, device, equipment and storage medium based on OCR (optical character recognition) and NLP (non-line language)
CN116704542A (en) Layer classification method, device, equipment and storage medium
KR20140137254A (en) Terminal, server, system and method for providing location information using character recognition
CN113780260A (en) Computer vision-based intelligent barrier-free character detection method
CN113807315B (en) Method, device, equipment and medium for constructing object recognition model to be recognized
CN113705559B (en) Character recognition method and device based on artificial intelligence and electronic equipment
CN112084103A (en) Interface test method, device, equipment and medium
US20050060329A1 (en) Data classification supporting method and apparatus, program and recording medium recording the program
CN111950354A (en) Seal home country identification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant