CN113434072B - Mobile terminal application control identification method based on computer vision - Google Patents

Mobile terminal application control identification method based on computer vision

Info

Publication number
CN113434072B
CN113434072B
Authority
CN
China
Prior art keywords
control
computer vision
image
mobile
straight line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110597673.XA
Other languages
Chinese (zh)
Other versions
CN113434072A (en)
Inventor
卜佳俊
张建锋
周晟
刘美含
王炜
于智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202110597673.XA (CN113434072B)
Priority to PCT/CN2021/098490 (WO2022252239A1)
Publication of CN113434072A
Application granted
Publication of CN113434072B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses a computer-vision-based method for identifying the controls of mobile applications. Combining hardware and software and exploiting the system's accessibility features, it realizes a non-intrusive control identification method with strong generality and a low error rate. First, the screen reader and the target software are opened, and after each mechanical-arm operation a screenshot is uploaded to the server. Second, the screenshot is preprocessed, and its focus-frame color is extracted and the image expanded to obtain a single-channel image. Edge detection and straight-line detection are then performed on the single-channel image, and noise is filtered out to obtain the center coordinates of the control. Finally, the function of the control is determined by computer-vision means. Repeating these steps builds the app's page control tree. The method applies to complex scenes, achieves an understanding of each control's functional meaning, has strong generality, and can be used in scenarios such as automated testing of mobile applications, page structure decomposition, and human-computer interaction analysis.

Description

Mobile terminal application control identification method based on computer vision
Technical field: the invention relates to a non-intrusive, computer-vision-based algorithm for identifying controls in mobile applications, and belongs to the technical field of computer software.
Background art:
With the development of the mobile internet, the number of mobile applications has grown explosively and software design has become increasingly complex. Demands such as automated testing, page structure decomposition, and human-computer interaction analysis of mobile applications therefore grow by the day, and all of them rest on control identification based on the graphical user interface (GUI), i.e., automatically identifying the interactive visual components in a GUI. For example, to guarantee the product quality of mobile applications, automated GUI testing is often required, and the currently mainstream "record and replay" approach needs to know in advance the number and positions of controls in the GUI and the interactive operations available.
Most common control identification methods identify controls by their attributes and fall into three categories: coordinate-based, source-code-based, and control-tree-based identification. They suffer mainly from the following shortcomings: (1) they cannot achieve an understanding of each control's functional meaning. Existing methods merely classify controls by their attributes but cannot truly identify the specific purpose of each control; moreover, they fail when attribute values are null or duplicated. (2) They cannot handle complex scenes, such as pages with interaction logic like pop-ups and sub-pages. (3) They lack generality: because Android and iOS differ in control identification logic, control invocation logic, and so on, a control identification scheme cannot be reused across platforms.
Summary of the invention:
To address the above problems and difficulties, the invention provides a mobile terminal application control identification method based on computer vision. Compared with coordinate-based control identification, the method adapts to devices of different platforms and resolutions and is more general; compared with source-code-based identification, it is non-intrusive, i.e., it needs no access to the software's source code, can be used in scenarios such as black-box testing, and has a wider range of application; compared with control-tree-based identification, it is unaffected by the page hierarchy and control positions and can flexibly handle a variety of complex scenes. In addition, the method achieves semantic understanding of each control: beyond determining a control's position and attributes, it can identify the specific purpose of each control.
A mobile terminal application control identification method based on computer vision specifically comprises the following steps:
S101: open the screen reader. By inspecting the GUI of the mobile application and the additional information the application provides for its accessibility features, its main role is to describe on-screen elements by voice and to frame them with a focus frame.
S102: open the corresponding software; the mechanical arm performs one operation on the screen, after which a screenshot is captured, uploaded to the server, and preprocessed.
S103: for the screenshot obtained in S102, determine the RGB matrix corresponding to the color of the focus frame, and superimpose it on different backgrounds to obtain an RGB range.
S104: extract the pixels within the RGB range obtained in S103 from the screenshot obtained in S102 to obtain a single-channel image.
S105: expand the image of S104, perform edge detection on the single-channel image obtained in S104, and detect the edges of the focus frame.
S106: perform straight-line detection on the edges obtained in S105, and obtain the rectangular coordinate equation of each straight line.
S107: screen the equations obtained in S106 to obtain the straight lines corresponding to the focus frame, calculate the center coordinates, length, and width of the control, and convert them into proportions of the screen.
S108: according to the center coordinates of the control and the screen proportions of its length and width obtained in S107, determine the function of the rectangular screenshot framed by the focus frame by computer-vision means.
S109: pair the control coordinates obtained in S107 with the text recognized in S108, and construct a page tree whose nodes are these key-value pairs.
S110: if the control to be clicked is known, traverse the page tree obtained in S109 to find the node corresponding to the control; the path from the parent node to the target node is the operation path after opening the APP. The physical on-screen coordinates are obtained from the control's percentage position in the image, and the mechanical arm double-clicks the controls in turn until the target control is found.
In step S101, the screen reader used by the method takes the following specific form: S201, on Android the screen reader is TalkBack, and on iOS it is VoiceOver.
Specifically, in step S102, the screenshot must satisfy the following requirement: S301, the image must be in PNG format.
Specifically, in step S102, the operations of the mechanical arm are: S401, swipe left, swipe right, and double-click.
Specifically, in step S102, the image preprocessing consists of: S501, cropping out the portion of the screen occupied by the graphical user interface of the mobile software.
Specifically, in step S103, the RGB matrix range is obtained as follows: S601, determine the several RGB matrices corresponding to the focus frame to obtain an initial range; S602, superimpose backgrounds of different gray levels on the initial range obtained in S601 to obtain the final range.
Specifically, in step S104, the single-channel image is obtained as follows: S701, traverse the image matrix obtained in S102 against the RGB matrix range obtained in S103; S702, set pixels whose values fall within the RGB matrix range to 1; S703, set pixels whose values fall outside the RGB matrix range to 0.
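For illustration, S701–S703 can be sketched with OpenCV's inRange, which performs exactly this per-pixel range test; the BGR bounds below are hypothetical stand-ins for the focus-frame range measured in S601–S602, and in-range pixels are marked 255 rather than 1, which is equivalent up to scaling:

```python
import cv2
import numpy as np

# A minimal sketch of S701-S703. OpenCV loads images in BGR channel order,
# so the (hypothetical) bounds below are given as BGR, not RGB.
img = cv2.imread("screenshot.png")             # screenshot from S102
lower_bgr = np.array([0, 100, 0])              # assumed lower bound
upper_bgr = np.array([80, 255, 80])            # assumed upper bound
mask = cv2.inRange(img, lower_bgr, upper_bgr)  # 255 inside the range, 0 outside
```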
Specifically, in step S105, the image is expanded as follows: S801, splice matrices of width 50 with all pixel values 0 onto the left and right sides of the image; S802, splice matrices of height 50 with all pixel values 0 onto the upper and lower sides of the image spliced in S801.
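A minimal sketch of S801–S802 using OpenCV's border padding, continuing from the mask above; the expansion ensures that a focus frame flush with a screen edge still yields four detectable straight lines:

```python
import cv2

# S801-S802: splice 50-pixel-wide all-zero (black) borders onto all four
# sides of the single-channel image `mask` obtained in S104.
padded = cv2.copyMakeBorder(mask, top=50, bottom=50, left=50, right=50,
                            borderType=cv2.BORDER_CONSTANT, value=0)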
Specifically, in step S105, edge detection proceeds as follows: S901, apply Gaussian denoising to the image; S902, calculate the gradient of the denoised image obtained in S901, and from the gradient calculate the edge magnitude and angle of the image; S903, perform non-maximum suppression along the gradient direction according to the edge magnitude and angle obtained in S902; S904, perform double-threshold edge linking to obtain the edges.
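The S901–S904 pipeline (Gaussian denoising, gradient magnitude and direction, non-maximum suppression, double-threshold edge linking) is the classical Canny detector, so a sketch can lean on OpenCV directly; the kernel size and the two thresholds are assumed values that would need tuning:

```python
import cv2

blurred = cv2.GaussianBlur(padded, (5, 5), 0)  # S901: Gaussian denoising
# S902-S904: gradient computation, non-maximum suppression, and
# double-threshold hysteresis linking are all performed inside cv2.Canny.
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)
```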
Specifically, in step S106, straight-line detection proceeds as follows: S1001, convert the coordinates of each edge point of the image obtained in S904 to polar coordinates; S1002, calculate the line equation corresponding to each point, points sharing a common line equation lying on the same straight line; S1003, count the number of pixels on each line; S1004, if the pixel count of a line obtained in S1003 exceeds a certain threshold, keep the line; S1005, if it does not exceed the threshold, discard the line.
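The voting scheme of S1001–S1005 is the standard Hough transform; a sketch over the edge image above, where the vote threshold of 200 is an assumed value:

```python
import cv2
import numpy as np

# Each detected line is returned as (rho, theta) in the polar form
# rho = x*cos(theta) + y*sin(theta); `threshold` is the minimum pixel
# (vote) count of S1004-S1005.
raw = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=200)
lines = [] if raw is None else [tuple(line[0]) for line in raw]
```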
Specifically, in step S106, the rectangular coordinate equation of a straight line is obtained as follows: S1101, convert the polar coordinate equation into a rectangular coordinate equation.
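Because a focus frame's lines are axis-aligned, the conversion of S1101 reduces to two cases: theta ≈ 0 gives a vertical line x = rho, and theta ≈ π/2 gives a horizontal line y = rho. A sketch, with the 0.1-radian tolerance being an assumption:

```python
import numpy as np

def split_axis_aligned(lines, tol=0.1):
    """Convert (rho, theta) pairs into rectangular-coordinate equations,
    keeping only (near-)vertical lines x = rho and horizontal lines y = rho."""
    vertical = [rho for rho, theta in lines if abs(theta) < tol]
    horizontal = [rho for rho, theta in lines if abs(theta - np.pi / 2) < tol]
    return vertical, horizontal
```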
In step S107, the straight lines corresponding to the focus frame are screened as follows: S1201, if the spacing in pixels between two adjacent lines matches a certain fixed value, the lines are taken to be those of the focus frame; S1202, if it does not, the line is regarded as an interference line.
Specifically, in step S107, the center coordinates of the control are calculated as follows: S1301, average the vertical lines to obtain the abscissa of the control; S1302, average the horizontal lines to obtain the ordinate of the control.
In step S107, the length and width of the control are calculated as follows: S1401, for the vertical lines, the difference between the maximum and the minimum gives the width of the control; S1402, for the horizontal lines, the difference between the maximum and the minimum gives the length of the control.
Specifically, in step S107, the percentage position of the control's center on the screen is calculated as follows: S1501, divide the abscissa by the image width to obtain the x-axis percentage; S1502, divide the ordinate by the image height to obtain the y-axis percentage.
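Combining S1301–S1502, the control's geometry follows directly from the kept lines; a sketch that assumes the four frame lines have already been screened per S1201–S1202, and that undoes the 50-pixel expansion of S801–S802:

```python
def control_geometry(vertical, horizontal, img_w, img_h, pad=50):
    xs = [v - pad for v in vertical]    # undo the S801 left/right padding
    ys = [h - pad for h in horizontal]  # undo the S802 top/bottom padding
    cx = sum(xs) / len(xs)              # S1301: abscissa = mean of verticals
    cy = sum(ys) / len(ys)              # S1302: ordinate = mean of horizontals
    width = max(xs) - min(xs)           # S1401: width of the control
    length = max(ys) - min(ys)          # S1402: length of the control
    return (cx / img_w, cy / img_h), (width, length)  # S1501-S1502

vertical, horizontal = split_axis_aligned(lines)
# 1080x2340 is an assumed screen resolution, purely for illustration.
center_pct, size = control_geometry(vertical, horizontal, 1080, 2340)
```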
specifically, in step S108, the method for specifically determining the control function is: s1601, performing character recognition by using an OCR (optical character recognition) to obtain characters corresponding to the control; and S1602, if no characters are detected in S1601, performing image matching, and determining functions of the database according to the constructed database.
Specifically, in step S109, the page tree is constructed as follows: S1701, take the key-value pair formed by the control's center coordinates and its function as a node of the tree; S1702, set an empty node as the root node, serving as the parent of all controls on the mobile application's home page; S1703, the controls of the page reached by clicking a control become children of the clicked control, and the page tree is built by analogy.
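A minimal sketch of the S1701–S1703 page tree, with nodes pairing a control's center coordinates and recognized function; the field and variable names are illustrative:

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple, List

@dataclass
class ControlNode:
    center_pct: Optional[Tuple[float, float]]  # (x%, y%), None for the root
    function: Optional[str]                    # recognized label, None for root
    children: List["ControlNode"] = field(default_factory=list)

root = ControlNode(None, None)                 # S1702: empty root node
search = ControlNode((0.50, 0.06), "search")   # a home-page control
root.children.append(search)
# S1703: controls on the page reached by clicking `search` become its children
search.children.append(ControlNode((0.50, 0.50), "search history"))
```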
In summary, the invention creates a non-intrusive, computer-vision-based control identification method for mobile applications, with the following beneficial effects: (1) it achieves an understanding of the functional meaning of each control: besides locating a control, the page hierarchy and the control's function can be known; (2) it can be applied to complex scenes, such as pages with interaction logic like pop-ups and sub-pages; (3) it is general, and suits different platforms and device models.
Description of the drawings:
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed in their description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram of the hardware-software interaction of the computer-vision-based non-intrusive control identification algorithm for mobile applications provided by the invention;
FIG. 2 is the general flow chart of the algorithm;
FIG. 3 is an example of the image expansion method in the general flow chart;
FIG. 4 is the flow chart of edge detection in the general flow chart;
FIG. 5 is the flow chart of straight-line detection in the general flow chart.
The specific embodiments are as follows:
exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In this embodiment, taking a certain APP as an example, the specific steps are as follows:
S101: turn on the screen reader.
S102: open the APP; the mechanical arm performs one "swipe right" operation on the screen, after which a screenshot is captured and uploaded.
S103: for the screenshot obtained in S102, determine the RGB matrix range corresponding to the color of the focus frame.
S104: extract the pixels within the RGB range obtained in S103 from the screenshot obtained in S102 to obtain a single-channel image.
S105: expand the single-channel image of S104 and perform edge detection on it.
S106: perform straight-line detection on the edges obtained in S105 and obtain the rectangular coordinate equation of each straight line.
S107: screen the equations obtained in S106 to obtain the straight lines corresponding to the focus frame, calculate the center coordinates, length, and width of the control, and convert them into proportions of the screen.
S108: according to the center coordinates of the control and the screen proportions of its length and width obtained in S107, determine the function of the rectangular screenshot framed by the focus frame by computer-vision means.
S109: pair the control coordinates obtained in S107 with the text recognized in S108, and construct a page tree whose nodes are these key-value pairs.
S110: for the control to be clicked, traverse the page tree obtained in S109 to find its node, obtain the physical on-screen coordinates from the control's percentage position in the image, and let the mechanical arm double-click the controls in turn until the target control is found.
FIG. 1 shows the hardware-software interaction of the algorithm, and FIG. 2 its general flow.
FIG. 3 illustrates the image expansion method in the general flow: S801, splice matrices of width 50 with all pixel values 0 onto the left and right sides of the image; S802, splice matrices of height 50 with all pixel values 0 onto the upper and lower sides of the image spliced in S801.
FIG. 4 shows the edge detection flow: S901, apply Gaussian denoising to the image; S902, calculate the gradient of the denoised image obtained in S901, and from the gradient calculate the edge magnitude and angle of the image; S903, perform non-maximum suppression along the gradient direction according to the edge magnitude and angle obtained in S902; S904, perform double-threshold edge linking to obtain the edges.
FIG. 5 shows the straight-line detection flow: S1001, convert the coordinates of each edge point of the image obtained in S904 to polar coordinates; S1002, calculate the line equation corresponding to each point, points sharing a common line equation lying on the same straight line; S1003, count the number of pixels on each line; S1004, if the pixel count of a line obtained in S1003 exceeds a certain threshold, keep the line; S1005, if it does not exceed the threshold, discard the line.

Claims (17)

1. A mobile terminal application control identification method based on computer vision, characterized by comprising the following steps:
S101: opening a screen reader, whose main role, by inspecting the GUI (graphical user interface) of the mobile application and the additional information the application provides for its accessibility features, is to describe on-screen elements by voice and to frame them with a focus frame;
S102: opening the corresponding software; the mechanical arm performs one operation on the screen, after which a screenshot is captured, uploaded to the server, and preprocessed;
S103: for the screenshot obtained in S102, determining the RGB matrix corresponding to the color of the focus frame, and superimposing it on different backgrounds to obtain an RGB range;
S104: extracting the pixels within the RGB range obtained in S103 from the screenshot obtained in S102 to obtain a single-channel image;
S105: expanding the image of S104, performing edge detection on the single-channel image obtained in S104, and detecting the edges of the focus frame;
S106: performing straight-line detection on the edges obtained in S105, and obtaining the rectangular coordinate equation of each straight line;
S107: screening the equations obtained in S106 to obtain the straight lines corresponding to the focus frame, calculating the center coordinates, length, and width of the control, and converting them into proportions of the screen;
S108: according to the center coordinates of the control and the screen proportions of its length and width obtained in S107, determining the function of the rectangular screenshot framed by the focus frame by computer-vision means;
S109: pairing the control coordinates obtained in S107 with the text recognized in S108, and constructing a page tree whose nodes are these key-value pairs;
S110: if the control to be clicked is known, traversing the page tree obtained in S109 to find the node corresponding to the control, the path from the parent node to the target node being the operation path after opening the APP; obtaining the physical on-screen coordinates from the control's percentage position in the image, the mechanical arm double-clicking the controls in turn until the target control is found.
2. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S101 the screen reader used takes the following specific form:
S201: the Android screen reader is TalkBack, and the iOS screen reader is VoiceOver.
3. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S102 the screenshot must satisfy:
S301: the image must be in PNG format.
4. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S102 the operations of the mechanical arm are:
S401: swipe left, swipe right, and double-click.
5. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S102 the image preprocessing consists of:
S501: cropping out the portion of the screen occupied by the graphical user interface of the mobile software.
6. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S103 the RGB matrix range is obtained as follows:
S601: determining the several RGB matrices corresponding to the focus frame to obtain an initial range; S602: superimposing backgrounds of different gray levels on the initial range obtained in S601 to obtain the final range.
7. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S104 the single-channel image is obtained as follows:
S701: traversing the image matrix obtained in S102 against the RGB matrix range obtained in S103; S702: setting pixels whose values fall within the RGB matrix range to 1; S703: setting pixels whose values fall outside the RGB matrix range to 0.
8. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S105 the image is expanded as follows:
S801: splicing matrices of width 50 with all pixel values 0 onto the left and right sides of the image; S802: splicing matrices of height 50 with all pixel values 0 onto the upper and lower sides of the image spliced in S801.
9. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S105 edge detection proceeds as follows:
S901: applying Gaussian denoising to the image; S902: calculating the gradient of the denoised image obtained in S901, and from the gradient the edge magnitude and angle of the image; S903: performing non-maximum suppression along the gradient direction according to the edge magnitude and angle obtained in S902; S904: performing double-threshold edge linking to obtain the edges.
10. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S106 straight-line detection proceeds as follows:
S1001: converting the coordinates of each edge point of the image obtained in S904 to polar coordinates; S1002: calculating the line equation corresponding to each point, points sharing a common line equation lying on the same straight line; S1003: counting the number of pixels on each line; S1004: if the pixel count of a line obtained in S1003 exceeds a certain threshold, keeping the line; S1005: if it does not exceed the threshold, discarding the line.
11. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S106 the rectangular coordinate equation of a straight line is obtained as follows:
S1101: converting the polar coordinate equation into a rectangular coordinate equation.
12. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S107 the straight lines corresponding to the focus frame are screened as follows:
S1201: if the spacing in pixels between two adjacent lines matches a certain fixed value, the lines are taken to be those of the focus frame; S1202: if it does not, the line is regarded as an interference line.
13. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S107 the center coordinates of the control are calculated as follows:
S1301: averaging the vertical lines to obtain the abscissa of the control; S1302: averaging the horizontal lines to obtain the ordinate of the control.
14. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S107 the length and width of the control are calculated as follows:
S1401: for the vertical lines, the difference between the maximum and the minimum gives the width of the control; S1402: for the horizontal lines, the difference between the maximum and the minimum gives the length of the control.
15. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S107 the percentage position of the control's center on the screen is calculated as follows:
S1501: dividing the abscissa by the image width to obtain the x-axis percentage; S1502: dividing the ordinate by the image height to obtain the y-axis percentage.
16. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S108 the function of the control is determined as follows:
S1601: performing optical character recognition (OCR) to obtain the text corresponding to the control; S1602: if no text is detected in S1601, performing image matching and determining the control's function from the pre-built database.
17. The computer-vision-based mobile terminal application control identification method of claim 1, wherein in step S109 the page tree is constructed as follows:
S1701: taking the key-value pair formed by the control's center coordinates and its function as a node of the tree; S1702: setting an empty node as the root node to serve as the parent node of all controls of the mobile application's home page; S1703: the controls of the page reached by clicking a control becoming children of the clicked control, the page tree being built by analogy.
CN202110597673.XA 2021-05-31 2021-05-31 Mobile terminal application control identification method based on computer vision Active CN113434072B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110597673.XA CN113434072B (en) 2021-05-31 2021-05-31 Mobile terminal application control identification method based on computer vision
PCT/CN2021/098490 WO2022252239A1 (en) 2021-05-31 2021-06-05 Computer vision-based mobile terminal application control identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110597673.XA CN113434072B (en) 2021-05-31 2021-05-31 Mobile terminal application control identification method based on computer vision

Publications (2)

Publication Number Publication Date
CN113434072A CN113434072A (en) 2021-09-24
CN113434072B true CN113434072B (en) 2022-06-07

Family

Family ID: 77803292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110597673.XA Active CN113434072B (en) 2021-05-31 2021-05-31 Mobile terminal application control identification method based on computer vision

Country Status (2)

Country Link
CN (1) CN113434072B (en)
WO (1) WO2022252239A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007148040A1 (en) * 2006-06-19 2007-12-27 British Telecommunications Public Limited Company Apparatus & method for selecting menu items
CN105045489A (en) * 2015-08-27 2015-11-11 广东欧珀移动通信有限公司 Button control method and apparatus
CN109922363A (en) * 2019-03-15 2019-06-21 青岛海信电器股份有限公司 A kind of graphical user interface method and display equipment of display screen shot

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080195958A1 (en) * 2007-02-09 2008-08-14 Detiege Patrick J Visual recognition of user interface objects on computer
US8918739B2 (en) * 2009-08-24 2014-12-23 Kryon Systems Ltd. Display-independent recognition of graphical user interface control
CN108509342A (en) * 2018-04-04 2018-09-07 成都中云天下科技有限公司 A kind of precisely quick App automated testing methods
CN110990238B (en) * 2019-11-13 2021-09-21 南京航空航天大学 Non-invasive visual test script automatic recording method based on video shooting
CN112181255A (en) * 2020-10-12 2021-01-05 深圳市欢太科技有限公司 Control identification method and device, terminal equipment and storage medium
CN112657176A (en) * 2020-12-31 2021-04-16 华南理工大学 Binocular projection man-machine interaction method combined with portrait behavior information
CN112597065B (en) * 2021-03-03 2021-05-18 浙江口碑网络技术有限公司 Page testing method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007148040A1 (en) * 2006-06-19 2007-12-27 British Telecommunications Public Limited Company Apparatus & method for selecting menu items
CN105045489A (en) * 2015-08-27 2015-11-11 广东欧珀移动通信有限公司 Button control method and apparatus
CN109922363A (en) * 2019-03-15 2019-06-21 青岛海信电器股份有限公司 A kind of graphical user interface method and display equipment of display screen shot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TinyLink 2.0: Integrating Device, Cloud, and Client Development for IoT Applications; Gaoyang Guan et al.; Proceedings of the 26th Annual International Conference on Mobile Computing and Networking; 2020-04-13; full text *
Research and Implementation of Robot Binocular Stereo Vision Ranging Technology; Zhang Peng et al.; Computer Measurement & Control; 2013-12-31; full text *

Also Published As

Publication number Publication date
WO2022252239A1 (en) 2022-12-08
CN113434072A (en) 2021-09-24

Similar Documents

Publication Publication Date Title
US11886799B2 (en) Determining functional and descriptive elements of application images for intelligent screen automation
CN110942074B (en) Character segmentation recognition method and device, electronic equipment and storage medium
CN109684803B (en) Man-machine verification method based on gesture sliding
KR20220013298A (en) Method and device for recognizing characters
CN110598686B (en) Invoice identification method, system, electronic equipment and medium
CN110379020B (en) Laser point cloud coloring method and device based on generation countermeasure network
CN114549993B (en) Method, system and device for grading line segment image in experiment and readable storage medium
CN111274957A (en) Webpage verification code identification method, device, terminal and computer storage medium
CN106991303B (en) Gesture verification code identification method and device
CN111460355A (en) Page parsing method and device
KR20210113620A (en) Object recognition method and device, electronic device, storage medium
CN115115740A (en) Thinking guide graph recognition method, device, equipment, medium and program product
CN113434072B (en) Mobile terminal application control identification method based on computer vision
CN111241897A (en) Industrial checklist digitization by inferring visual relationships
CN112052730A (en) 3D dynamic portrait recognition monitoring device and method
CN116052193B (en) RPA interface dynamic form picking and matching method and system
CN111444834A (en) Image text line detection method, device, equipment and storage medium
WO2024021081A1 (en) Method and apparatus for detecting defect on surface of product
CN115082941A (en) Form information acquisition method and device for form document image
CN114821596A (en) Text recognition method and device, electronic equipment and medium
CN115147752A (en) Video analysis method and device and computer equipment
CN114067145A (en) Passive optical splitter detection method, device, equipment and medium
Tonge et al. Automatic Number Plate Recognition
CN111817916A (en) Test method, device, equipment and storage medium based on mobile terminal cluster
CN110826564A (en) Small target semantic segmentation method and system in complex scene image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant