CN115981995A - Test case generation method, system and storage medium based on application screen recordings


Info

Publication number: CN115981995A
Application number: CN202211398616.XA
Authority: CN (China)
Prior art keywords: screen recording, action, frame, touch, input
Other languages: Chinese (zh)
Inventor: 师江帆
Current assignee: Hangzhou Longce Technology Co., Ltd.
Original assignee: Hangzhou Longce Technology Co., Ltd.
Application filed by Hangzhou Longce Technology Co., Ltd.
Legal status: Pending


Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a test case generation method, system and storage medium based on application screen recordings. The method comprises the following steps: acquiring a screen recording of an operated application program; performing touch point detection, keyboard detection and cursor detection on the screen recording to obtain screen recording detection results; recognizing the screen recording according to the detection results to obtain a screen recording recognition result; and generating a test case according to the recognition result. The invention automates test case generation in UI testing and reproduces the recorded session from the generated test case. It effectively saves labor cost, is simple to deploy and operate, offers high security and a simple workflow, effectively helps testers reproduce application bugs, improves test efficiency, and achieves a high reproduction success rate for the generated test cases.

Description

Test case generation method, system and storage medium based on application screen recordings
Technical Field
The invention relates to the technical field of artificial intelligence and machine vision, and in particular to a test case generation method, system and storage medium based on application screen recordings.
Background
UI testing (user interface testing) is an indispensable part of a tester's work. A UI test of an application (APP for short) consists of a series of UI operation events that mimic a user's real behavior when using a particular function of the application on a smart mobile terminal such as a mobile phone.
For example, to test the function "search for a computer in the Taobao APP and add it to the shopping cart", the tester performs the following operations: 1. swipe on the phone's home screen to find the Taobao APP, then tap its icon; 2. tap the login button and enter a user name and password to log in; 3. type the text "computer" in the search box and search; 4. tap the link for a computer to open its detail page, then tap the "add to shopping cart" button. These operation events record the user's real behavior while searching for a computer in the Taobao APP and adding it to the shopping cart.
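Purely for illustration (this structure is not part of the patent), such a sequence of operation events can be pictured as structured, replayable actions; every coordinate and field name below is hypothetical:

```python
# Illustrative only: the recorded operation events expressed as the kind of
# structured actions a generated test case could replay. All values are made up.
taobao_search_case = [
    {"type": "slide", "from": (540, 1600), "to": (540, 600)},           # find the Taobao APP
    {"type": "single_click", "at": (220, 840)},                         # tap the APP icon
    {"type": "single_click", "at": (540, 1200)},                        # tap the login button
    {"type": "input", "box": (80, 300, 1000, 380), "text": "user name"},
    {"type": "input", "box": (80, 420, 1000, 500), "text": "password"},
    {"type": "input", "box": (60, 120, 900, 200), "text": "computer"},  # search box
    {"type": "single_click", "at": (540, 760)},                         # open the item's detail page
    {"type": "single_click", "at": (820, 1820)},                        # "add to shopping cart"
]
```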
Testers can record these user operations on the APP by recording the screen (as screenshots or as screen video), which brings the following advantages:
(1) It helps the tester understand how the user interacts with the APP;
(2) it assists the tester in handling error reports and feature requests from end users;
(3) it helps the tester reproduce a software crash or locate a software bug.
Recording and analyzing the screen is therefore of great significance. Moreover, generating a test case from the screen recording and replaying it to reproduce the recorded session can greatly help a tester eliminate errors and collect timing statistics whenever a repeated sequence of operations is needed; generating and playing back test cases from screen recordings is thus both necessary and of practical value.
In the prior art, however, testers mostly analyze screen recordings manually to collect the relevant information. This is time-consuming and labor-intensive, the analysis results are inaccurate, and no test case is generated from the recording to reproduce it. The few technologies that can analyze screen recordings automatically still have the following disadvantages:
(1) Poor security. Third-party software must be installed, unsafe third-party instruments must be connected, or the operating system hosting the application (e.g., the Android system) must be modified or rooted (i.e., superuser permission of the application's operating system must be obtained).
(2) High cost. Additional software and equipment must be deployed for normal operation.
(3) Low precision. A test case cannot be generated from the screen recording to reproduce it, or the reproduction success rate is low.
(4) Long time consumption and inconvenient operation. The APP must interact with extra equipment, and an underlying framework must be installed or embedded, which is neither friendly nor convenient to use.
Disclosure of Invention
In view of this, the invention provides a test case generation method, system and storage medium based on application screen recordings, to solve the problem that existing UI test technologies can neither automatically analyze screen recordings nor automatically generate test cases from them to reproduce the recorded sessions.
The invention provides a test case generation method based on application screen recordings, comprising the following steps:
acquiring a screen recording of an operated application program;
performing touch point detection, keyboard detection and cursor detection on the screen recording to obtain screen recording detection results;
recognizing the screen recording according to the detection results to obtain a screen recording recognition result;
and generating a test case according to the recognition result.
Optionally, the screen recording comprises n consecutive frames, and the screen recording detection results comprise a first detection result and a second detection result;
performing touch point detection, keyboard detection and cursor detection on the screen recording to obtain the detection results comprises:
performing touch point detection on the screen recording with a YOLOv7-based object detection method to obtain the first detection result;
performing keyboard detection and cursor detection on the screen recording with the YOLOv7-based object detection method to obtain the second detection result;
the first detection result comprises whether at least one touch action exists in the screen recording and, when at least one exists, the first action start time, first action end time, touch action type and touch action location of each touch action;
the second detection result comprises whether at least one input action exists in the screen recording and, when at least one exists, the second action start time, second action end time, input action location and input text content of each input action;
wherein the touch action types comprise single click, double click, long press and slide.
Optionally, performing touch point detection on the screen recording with the YOLOv7-based object detection method to obtain the first detection result comprises:
Step 211: extracting each frame of the screen recording in frame order; when the i-th frame is extracted, detecting it with the YOLOv7 object detection method and judging whether a touch point exists in the i-th frame; if yes, executing step 212, otherwise executing step 216; where 1 ≤ i ≤ n;
Step 212: when i = 1, judging that a touch action exists in the i-th frame, determining the recording time of the i-th frame as the first action start time of the corresponding touch action, returning to step 211, and continuing to extract the (i+1)-th frame;
when 1 < i < n, judging that a touch action exists in the i-th frame, and executing step 213;
when i = n, judging that a touch action exists in the i-th frame, determining the recording time of the i-th frame as the first action end time of the corresponding touch action, and executing step 2110;
Step 213: judging whether a touch point exists in the (i-1)-th frame; if yes, executing step 214, otherwise executing step 215;
Step 214: returning to step 211, continuing to extract the (i+1)-th frame, and executing step 2110 when i+1 > n;
Step 215: determining the recording time of the i-th frame as the first action start time of the corresponding touch action, returning to step 211, continuing to extract the (i+1)-th frame, and executing step 2110 when i+1 > n;
Step 216: when i = 1, judging that no touch action exists in the i-th frame, returning to step 211, and continuing to extract the (i+1)-th frame;
when 1 < i < n, judging that no touch action exists in the i-th frame, and executing step 217;
when i = n, judging that no touch action exists in the i-th frame, and executing step 2110;
Step 217: judging whether a touch point exists in the (i-1)-th frame; if yes, executing step 218, otherwise executing step 219;
Step 218: judging that a touch action exists in the (i-1)-th frame, determining the recording time of the (i-1)-th frame as the first action end time of the corresponding touch action, returning to step 211, continuing to extract the (i+1)-th frame, and executing step 2110 when i+1 > n;
Step 219: returning to step 211, continuing to extract the (i+1)-th frame, and executing step 2110 when i+1 > n;
Step 2110: when the screen recording contains at least one touch action, obtaining the touch action type and touch action location of each touch action in order of the first action start times of the touch actions, completing touch point detection of the screen recording.
Optionally, step 2110 comprises:
Step 2110.1: when the screen recording contains at least one touch action, for the j-th touch action, obtaining the number of screen-recording frames spanned by the j-th touch action from its corresponding first action start time and first action end time; judging whether this frame count is greater than or equal to a first preset frame value; if yes, executing step 2110.2, otherwise judging the j-th touch action to be a misdetection and executing step 2110.6; where 1 ≤ j ≤ m1, and m1 is the total number of touch actions in the screen recording;
Step 2110.2: judging whether the frame count is greater than or equal to a second preset frame value; if yes, executing step 2110.3, otherwise executing step 2110.4; where the second preset frame value is greater than the first preset frame value;
Step 2110.3: acquiring the start contact position and end contact position corresponding to the j-th touch action, and obtaining its start-to-end contact position difference from them; judging whether this difference is greater than or equal to a first position difference threshold; if yes, judging the touch action type of the j-th touch action to be slide and executing step 2110.6; otherwise judging it to be long press and executing step 2110.6;
Step 2110.4: acquiring the start contact position and end contact position corresponding to the j-th touch action, and obtaining its start-to-end contact position difference from them; judging whether the difference is smaller than the first position difference threshold; if yes, judging the touch action type of the j-th touch action to be single click and executing step 2110.6; otherwise executing step 2110.5;
Step 2110.5: judging whether the start-to-end contact position difference is smaller than or equal to a second position difference threshold and whether two overlapping touch points exist between the first action start time and first action end time of the j-th touch action; if both hold, judging the touch action type of the j-th touch action to be double click and executing step 2110.6; if at least one does not hold, judging it to be slide and executing step 2110.6; where the second position difference threshold is greater than the first position difference threshold;
Step 2110.6: obtaining the touch action location of the j-th touch action from its touch action type and the corresponding start and end contact positions, and executing step 2110.7;
Step 2110.7: letting j = j + 1 and obtaining the touch action type and touch action location of the next touch action by the method of steps 2110.1 to 2110.6, until the touch action types and touch action locations of all touch actions in the screen recording have been obtained in order, completing touch point detection of the screen recording.
Optionally, performing keyboard detection and cursor detection on the screen recording with the YOLOv7-based object detection method to obtain the second detection result comprises:
Step 221: extracting each frame of the screen recording in frame order; when the k-th frame is extracted, detecting it with the YOLOv7 object detection method and judging whether a keyboard exists in the k-th frame; if yes, executing step 222, otherwise executing step 228; where 1 ≤ k ≤ n;
Step 222: when k = 1, judging that an input action exists in the k-th frame, determining the recording time of the k-th frame as the second action start time of the corresponding input action, and then executing step 224;
when 1 < k < n, judging that an input action exists in the k-th frame, and executing step 223;
when k = n, judging that an input action exists in the k-th frame, determining the recording time of the k-th frame as the second action end time of the corresponding input action, and executing step 2212;
Step 223: judging whether a keyboard exists in the (k-1)-th frame; if yes, executing step 224, otherwise executing step 227;
Step 224: judging whether a cursor exists in the k-th frame; if yes, executing step 225, otherwise executing step 226;
Step 225: updating the preset cursor start and end positions of the corresponding input action according to the detected cursor location, and executing step 226;
Step 226: returning to step 221, continuing to extract the (k+1)-th frame, and executing step 2212 when k+1 > n;
Step 227: determining the recording time of the k-th frame as the second action start time of the corresponding input action, and then executing step 224;
Step 228: when k = 1, judging that no input action exists in the k-th frame, returning to step 221, and continuing to extract the (k+1)-th frame;
when 1 < k < n, judging that no input action exists in the k-th frame, and executing step 229;
when k = n, judging that no input action exists in the k-th frame, and executing step 2212;
Step 229: judging whether a keyboard exists in the (k-1)-th frame; if yes, executing step 2210, otherwise executing step 2211;
Step 2210: determining the recording time of the (k-1)-th frame as the second action end time of the corresponding input action, returning to step 221, continuing to extract the (k+1)-th frame, and executing step 2212 when k+1 > n;
Step 2211: returning to step 221, continuing to extract the (k+1)-th frame, and executing step 2212 when k+1 > n;
Step 2212: when the screen recording contains at least one input action, determining the input action location and input text content of each input action in order of the second action start times of the input actions, completing keyboard detection and cursor detection of the screen recording.
Optionally, step 2212 comprises:
Step 2212.1: when the screen recording contains at least one input action, for the l-th input action, extracting from the screen recording a first frame corresponding to its second action start time and a second frame corresponding to its second action end time; extracting the input box region according to the updated preset cursor start and end positions of the l-th input action together with the first and second frames, and obtaining the input action location and input text content of the l-th input action from the input box region; where 1 ≤ l ≤ m2, and m2 is the total number of input actions in the screen recording;
Step 2212.2: letting l = l + 1 and obtaining the input action location and input text content of the next input action by the method of step 2212.1, until the input action locations and input text contents of all input actions in the screen recording have been obtained in order.
Optionally, recognizing the screen recording according to the detection results to obtain the screen recording recognition result comprises:
when at least one touch action exists in the screen recording, performing character recognition and image recognition on each touch action according to its touch action type and touch action location, to obtain a first recognition result for each touch action;
when at least one input action exists in the screen recording, performing character recognition on each input action according to its input action location and input text content, to obtain a second recognition result for each input action;
and obtaining the screen recording recognition result from all first recognition results and all second recognition results.
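The patent does not name a particular character recognizer; as an illustration only, character recognition on a located action region could use an off-the-shelf OCR engine such as Tesseract, where the crop coordinates and the Chinese-plus-English language setting are assumptions:

```python
import cv2
import pytesseract

def recognize_text(frame, box, lang="chi_sim+eng"):
    """Crop an action's located region from a frame and OCR it.

    box: (x0, y0, x1, y1) from the touch/input action location; the language
    setting assumes a Chinese + English UI. Requires the Tesseract binary.
    """
    x0, y0, x1, y1 = map(int, box)
    gray = cv2.cvtColor(frame[y0:y1, x0:x1], cv2.COLOR_BGR2GRAY)  # OCR is often steadier on grayscale
    return pytesseract.image_to_string(gray, lang=lang).strip()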
In addition, the invention provides a test case generation system based on application screen recordings, which applies the above test case generation method and comprises:
a screen recording module for acquiring the screen recording of the operated application program;
a detection module for performing touch point detection, keyboard detection and cursor detection on the screen recording to obtain the screen recording detection results;
a recognition module for recognizing the screen recording according to the detection results to obtain the screen recording recognition result;
and a generation module for generating a test case according to the recognition result.
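For orientation only, a minimal structural sketch of how these four modules might compose into a pipeline; the class and method names are hypothetical, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class TestCaseGenerationSystem:
    """Structural sketch only; the module interfaces below are hypothetical."""
    screen_recorder: object   # screen recording module (S1)
    detector: object          # detection module: touch point / keyboard / cursor (S2)
    recognizer: object        # recognition module: per-action recognition (S3)
    generator: object         # generation module: emits the replayable test case (S4)

    def run(self, app_session):
        recording = self.screen_recorder.capture(app_session)
        detections = self.detector.detect(recording)
        recognition = self.recognizer.recognize(recording, detections)
        return self.generator.generate(recognition)
```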
In addition, the invention provides a test case generation system based on application screen recordings, comprising a processor, a memory, and a computer program stored in the memory and runnable on the processor; when the computer program runs, it implements the method steps of the above test case generation method based on application screen recordings.
In addition, the invention provides a computer storage medium comprising at least one instruction which, when executed, implements the method steps of the above test case generation method based on application screen recordings.
The beneficial effects of the invention are as follows: a screen recording of the user's operation of the application is captured, and touch point detection, keyboard detection and cursor detection are then performed on it, so that the recording is analyzed automatically and the user's series of operation actions (touch actions and keyboard input actions) is detected, yielding the screen recording detection results. After the user's operations are detected, the recording is recognized based on the detection results, so that the operations recorded in it are automatically converted into a reproducible screen recording recognition result, from which the corresponding test case is finally generated.
The test case generation method, system and storage medium based on application screen recordings automate test case generation in UI testing and reproduce the recording from the automatically generated test case. They effectively save labor cost and require no additional software or equipment; no third-party software has to be installed, no unsafe instruments connected, and the operating system hosting the application neither modified nor rooted, so security is high. The workflow is simple, operation is easy, testers are effectively helped to reproduce application bugs, test efficiency improves, and the reproduction success rate of the test cases is high.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the invention in any way, and in which:
Fig. 1 is a flowchart of the test case generation method based on application screen recordings in an embodiment of the invention;
Fig. 2 is a flowchart of obtaining the screen recording detection results in the first embodiment of the invention;
Fig. 3 is a detailed flowchart of touch point detection on the screen recording in the first embodiment of the invention;
Fig. 4 is a detailed flowchart of keyboard detection and cursor detection on the screen recording in the first embodiment of the invention;
Fig. 5 is a structural diagram of the test case generation system based on application screen recordings in the second embodiment of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by a person skilled in the art without inventive effort on the basis of these embodiments fall within the scope of protection of the invention.
The first embodiment
As shown in Fig. 1, a test case generation method based on application screen recordings comprises:
S1: acquiring a screen recording of an operated application program;
S2: performing touch point detection, keyboard detection and cursor detection on the screen recording to obtain screen recording detection results;
S3: recognizing the screen recording according to the detection results to obtain a screen recording recognition result;
S4: generating a test case according to the recognition result.
In this embodiment, a screen recording is obtained by recording the user's operation of the application program. Touch point detection, keyboard detection and cursor detection are then performed on the recording, so that it can be analyzed automatically: the user's series of operation actions (touch actions and keyboard input actions) is detected, yielding the screen recording detection results. Once the user's operations are detected, the recording is recognized based on the detection results, so that the operations recorded in it are automatically converted into a reproducible screen recording recognition result, from which the corresponding test case is finally generated.
The test case generation method based on application screen recordings of this embodiment automates test case generation in UI testing and reproduces the recording from the automatically generated test case. It effectively saves labor cost and requires no additional software or equipment; no third-party software has to be installed, no unsafe instruments connected, and the operating system hosting the application neither modified nor rooted, so security is high. The workflow is simple, operation is easy, testers are effectively helped to reproduce application bugs, test efficiency improves, and the reproduction success rate of the test cases is high.
Specifically, in this embodiment the application runs on the Android system of a smart terminal such as a mobile phone, and the test case generation method of this embodiment implements UI tests of applications on Android phones. In S1, screen recording during application operation can be achieved by enabling the "show touches" option in Android developer mode; the resulting screen recording is then used for the subsequent automatic analysis.
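As a hedged illustration of the input to the later steps (not prescribed by the patent), a screen-recording video can be split into the n consecutive, timestamped frames that S2 consumes, for example with OpenCV; the file name is a placeholder:

```python
import cv2

def load_frames(video_path: str):
    """Split a screen-recording video into (timestamp, frame) pairs."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if FPS metadata is missing
    frames = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append((index / fps, frame))   # the frame's recording time in seconds
        index += 1
    cap.release()
    return frames

frames = load_frames("recording.mp4")         # hypothetical file name
print(f"loaded {len(frames)} consecutive frames")
```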
Preferably, the screen recording comprises n consecutive frames, and the screen recording detection results comprise a first detection result and a second detection result.
as shown in fig. 2, S2 includes:
s21: performing touch point detection on the screen recording record based on a Yolov7 target detection method to obtain a first detection result;
s22: and performing keyboard detection and cursor detection on the screen recording record based on the Yolov7 target detection method to obtain a second detection result.
The screen recording records a series of sequential operation actions performed while operating the application, so n consecutive frames make it convenient to detect and recognize these actions in order, enabling orderly automatic analysis and testing. The operation actions fall into two categories: input actions and touch actions. An input action means entering text on the user interface; the position of the entered text is determined by cursor movement, and its content by the keyboard, so input actions in the recording can be analyzed automatically by keyboard detection and cursor detection based on the YOLOv7 object detection method. A touch action means a single click, double click, slide, long press or similar gesture on the user interface; whenever a finger touches the interface, a touch point (specifically, a white touch marker) is left at the finger position, so touch actions in the recording can be analyzed automatically by touch point detection based on the YOLOv7 object detection method.
The YOLOv7 object detection method itself is prior art, and its details are not repeated here.
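The patent treats YOLOv7 as a black box, so the following is only a sketch of what per-frame detection could look like, assuming a YOLOv7 model fine-tuned on touch point, keyboard and cursor classes and exported to ONNX with NMS included; the model file, class indices and 640x640 resizing are all assumptions:

```python
import cv2
import numpy as np
import onnxruntime as ort

CLASSES = {0: "touch_point", 1: "keyboard", 2: "cursor"}  # assumed training classes

session = ort.InferenceSession("yolov7_screen.onnx")      # hypothetical fine-tuned model

def detect(frame: np.ndarray, conf_thres: float = 0.5):
    """Return [(class_name, confidence, (x0, y0, x1, y1)), ...] for one frame."""
    h, w = frame.shape[:2]
    img = cv2.cvtColor(cv2.resize(frame, (640, 640)), cv2.COLOR_BGR2RGB)
    blob = img.transpose(2, 0, 1)[None].astype(np.float32) / 255.0
    # Assumes an export with NMS whose rows are [batch, x0, y0, x1, y1, class_id, score]
    rows = session.run(None, {session.get_inputs()[0].name: blob})[0]
    out = []
    for _, x0, y0, x1, y1, cls, score in rows:
        if score >= conf_thres:
            box = (x0 * w / 640, y0 * h / 640, x1 * w / 640, y1 * h / 640)
            out.append((CLASSES.get(int(cls), "unknown"), float(score), box))
    return out
```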
Specifically, the first detection result comprises whether at least one touch action exists in the screen recording and, when at least one exists, the first action start time, first action end time, touch action type and touch action location of each touch action;
the second detection result comprises whether at least one input action exists in the screen recording and, when at least one exists, the second action start time, second action end time, input action location and input text content of each input action.
With the first detection result, the specific information contained in a touch action can be recognized efficiently and accurately whenever the recording contains one during the subsequent recognition; likewise, with the second detection result, the specific information of an input action can be recognized efficiently and accurately whenever the recording contains one. This in turn makes it convenient to automatically generate the test case corresponding to the screen recording.
Specifically, the touch action types in this embodiment comprise single click, double click, long press and slide.
These types cover the gestures involved in touch actions; with them, touch actions can be detected more comprehensively, so that test cases consistent with the real operations can be generated more accurately later.
Note that in this embodiment the order of S21 and S22 may be swapped, i.e., keyboard detection and cursor detection may be performed before touch point detection; S21 and S22 may also be performed simultaneously. The two detection processes do not interfere with each other, can proceed independently, and each outputs its detection result when finished. After keyboard detection and cursor detection, if no input action exists in the recording, a corresponding "no input action" result is output.
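Since S21 and S22 are stated to be independent, a minimal sketch of running them concurrently; the two worker functions stand for the detection routines sketched later in this embodiment and are passed in as arguments:

```python
from concurrent.futures import ThreadPoolExecutor

def run_detections(frames, touch_point_detection, keyboard_cursor_detection):
    """Run S21 and S22 concurrently; both only read the shared frame list."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(touch_point_detection, frames)
        second = pool.submit(keyboard_cursor_detection, frames)
        # (first detection result, second detection result)
        return first.result(), second.result()
```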
Preferably, S21 comprises:
S211: extracting each frame of the screen recording in frame order; when the i-th frame is extracted, detecting it with the YOLOv7 object detection method and judging whether a touch point exists in the i-th frame; if yes, executing S212, otherwise executing S216; where 1 ≤ i ≤ n;
S212: when i = 1, judging that a touch action exists in the i-th frame, determining the recording time of the i-th frame as the first action start time of the corresponding touch action, returning to S211, and continuing to extract the (i+1)-th frame;
when 1 < i < n, judging that a touch action exists in the i-th frame, and executing S213;
when i = n, judging that a touch action exists in the i-th frame, determining the recording time of the i-th frame as the first action end time of the corresponding touch action, and executing S2110;
S213: judging whether a touch point exists in the (i-1)-th frame; if yes, executing S214, otherwise executing S215;
S214: returning to S211, continuing to extract the (i+1)-th frame, and executing S2110 when i+1 > n;
S215: determining the recording time of the i-th frame as the first action start time of the corresponding touch action, returning to S211, continuing to extract the (i+1)-th frame, and executing S2110 when i+1 > n;
S216: when i = 1, judging that no touch action exists in the i-th frame, returning to S211, and continuing to extract the (i+1)-th frame;
when 1 < i < n, judging that no touch action exists in the i-th frame, and executing S217;
when i = n, judging that no touch action exists in the i-th frame, and executing S2110;
S217: judging whether a touch point exists in the (i-1)-th frame; if yes, executing S218, otherwise executing S219;
S218: judging that a touch action exists in the (i-1)-th frame, determining the recording time of the (i-1)-th frame as the first action end time of the corresponding touch action, returning to S211, continuing to extract the (i+1)-th frame, and executing S2110 when i+1 > n;
S219: returning to S211, continuing to extract the (i+1)-th frame, and executing S2110 when i+1 > n;
S2110: when the screen recording contains at least one touch action, obtaining the touch action type and touch action location of each touch action in order of the first action start times of the touch actions, completing touch point detection of the screen recording.
Since the screen recording consists of n consecutive frames, judging whether each frame contains a touch action, and, when one exists, determining its start and end times (the first action start time and first action end time), touch action type and touch action location, converts the recorded operation behavior into a series of ordered operation events; the frames are therefore extracted and judged in frame order. Every type of touch action produces a touch point in T consecutive frames, so after a frame (say the i-th) is extracted, it is checked for a touch point: if a touch point is present, the frame contains a touch action; otherwise it does not. This realizes the per-frame judgment of whether a touch action exists.
When a touch action exists in the i-th frame, the touch point situation of the previous frame (the (i-1)-th) determines whether the recording time of the i-th frame is the start time (the first action start time), the end time (the first action end time) or an intermediate time of the touch action. If a touch point appears in the (i-1)-th frame, the i-th and (i-1)-th frames belong to the same continuous touch action, and the recording time of the i-th frame can tentatively be taken as the end time of that action (when the (i+1)-th frame is extracted and also contains a touch point, all three frames belong to the same touch action; during the judgment of the (i+1)-th frame the end time is updated using the touch point situation of the i-th frame, after which the recording time of the i-th frame is in fact an intermediate time of the action). If no touch point exists in the (i-1)-th frame, no touch action exists there, and the i-th frame marks the start time of the touch action. This realizes the judgment of the frame's time when a touch action exists in the i-th frame.
When no touch action exists in the i-th frame, the touch point situation of the previous frame (the (i-1)-th) determines the first action end time of a touch action present in the (i-1)-th frame: if a touch point appears in the (i-1)-th frame but not in the i-th, the recording time of the (i-1)-th frame is the end time of the touch action present in the (i-1)-th frame.
By the same reasoning, each frame can be judged in frame order, and the first action start time and first action end time of each touch action determined.
Once the first action start time and first action end time of each complete touch action in the recording have been detected in this way, further detailed judgment based on the characteristics of each touch action type yields the type of the complete touch action (the touch action type) and the position corresponding to it (the touch action location).
In this way, automatic touch point detection on screen recordings is realized quickly with a simple algorithmic framework, without purchasing or installing unsafe software or instruments; the approach offers high security, low cost, convenient deployment, simple operation and high efficiency.
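As a hedged sketch under stated assumptions, the S211-S219 scan can be written as a small state machine over per-frame detections; has_touch_point() stands in for the YOLOv7 touch point detector and is not a name from the patent:

```python
def touch_point_detection(frames, has_touch_point):
    """Scan frames in order and return (start_time, end_time) per touch action.

    frames: list of (timestamp, image); has_touch_point(image) -> bool.
    A sketch of S211-S219 only; type classification (S2110) is done separately.
    """
    actions = []              # completed (first action start, first action end) pairs
    start = None              # start time of the touch action in progress
    prev_has_point, prev_t = False, None
    for idx, (t, img) in enumerate(frames):
        has_point = has_touch_point(img)
        if has_point and not prev_has_point:
            start = t                        # S212/S215: a new touch action begins
        if not has_point and prev_has_point:
            actions.append((start, prev_t))  # S218: the previous frame ended the action
            start = None
        if has_point and idx == len(frames) - 1:
            actions.append((start, t))       # S212 with i = n: the last frame closes it
        prev_has_point, prev_t = has_point, t
    return actions
```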
Preferably, S2110 comprises:
S2110.1: when the screen recording contains at least one touch action, for the j-th touch action, obtaining the number of screen-recording frames spanned by the j-th touch action from its corresponding first action start time and first action end time; judging whether this frame count is greater than or equal to a first preset frame value; if yes, executing S2110.2, otherwise judging the j-th touch action to be a misdetection and executing S2110.6; where 1 ≤ j ≤ m1, and m1 is the total number of touch actions in the screen recording;
S2110.2: judging whether the frame count is greater than or equal to a second preset frame value; if yes, executing S2110.3, otherwise executing S2110.4; where the second preset frame value is greater than the first preset frame value;
S2110.3: acquiring the start contact position and end contact position corresponding to the j-th touch action, and obtaining its start-to-end contact position difference from them; judging whether this difference is greater than or equal to a first position difference threshold; if yes, judging the touch action type of the j-th touch action to be slide and executing S2110.6; otherwise judging it to be long press and executing S2110.6;
S2110.4: acquiring the start contact position and end contact position corresponding to the j-th touch action, and obtaining its start-to-end contact position difference from them; judging whether the difference is smaller than the first position difference threshold; if yes, judging the touch action type of the j-th touch action to be single click and executing S2110.6; otherwise executing S2110.5;
S2110.5: judging whether the start-to-end contact position difference is smaller than or equal to a second position difference threshold and whether two overlapping touch points exist between the first action start time and first action end time of the j-th touch action; if both hold, judging the touch action type of the j-th touch action to be double click and executing S2110.6; if at least one does not hold, judging it to be slide and executing S2110.6; where the second position difference threshold is greater than the first position difference threshold;
S2110.6: obtaining the touch action location of the j-th touch action from its touch action type and the corresponding start and end contact positions, and executing S2110.7;
S2110.7: letting j = j + 1 and obtaining the touch action type and touch action location of the next touch action by the methods of S2110.1 to S2110.6, until the touch action types and touch action locations of all touch actions in the screen recording have been obtained in order, completing touch point detection of the screen recording.
A single-click touch action, besides the condition that a touch point appears in T consecutive frames, additionally satisfies: (a1) the number of frames T in which the touch point appears continuously (i.e., the number of frames between the first action start time and the first action end time) satisfies t1 ≤ T ≤ t2, where t1 is the first preset frame value (taken as 3 in this embodiment) and t2 is the second preset frame value (taken as FPS × 2/3 in this embodiment, FPS being the frame rate of the recording); (a2) the touch point position is essentially unchanged between frames, i.e., the difference between the start contact position (the position of the touch point when the finger touch first appears, corresponding to the first action start time) and the end contact position (the position of the touch point just before it disappears, corresponding to the first action end time) is smaller than the first position difference threshold D1, taken as 5 pixels in this embodiment.
A double-click touch action, besides the condition that a touch point appears in T consecutive frames, additionally satisfies: (b1) the number of frames T in which the touch point appears continuously satisfies t1 ≤ T ≤ t2 (t1 and t2 as in the single-click type); (b2) the touch point position is essentially unchanged between frames, i.e., the difference between the start contact position and the end contact position (both as in the single-click type) is smaller than the second position difference threshold D2, taken as the diameter of the circular touch point × 2/3 in this embodiment; (b3) some frame contains 2 high-confidence touch points, i.e., 2 overlapping touch points (which can be judged from the transparency and chromaticity of the touch points).
A long-press touch action, besides the condition that a touch point appears in T consecutive frames, additionally satisfies: (c1) the number of frames T in which the touch point appears continuously satisfies T > t2 (t2 as in the single-click type); (c2) the touch point position is essentially unchanged between frames, i.e., the difference between the start and end contact positions (both as in the single-click type) is smaller than D1 (5 pixels in this embodiment).
A slide touch action, besides the condition that a touch point appears in T consecutive frames, additionally satisfies: (d1) the number of frames T in which the touch point appears continuously satisfies T ≥ t1 (t1 as in the single-click type); (d2) the difference between the start and end contact positions (both as in the single-click type) is greater than or equal to D1 (5 pixels in this embodiment).
Based on these characteristics of the touch action types, when a touch action is detected in the recording, the number of frames it spans (the number of recording frames between its first action start time and first action end time) is determined. Comparing this frame count with the first and second preset frame values, combined with comparing the start-to-end contact position difference with the first and second position difference thresholds, efficiently, accurately and automatically determines the touch action type. Meanwhile, because the first action start time corresponds to the start of the touch action and the first action end time to its end, the position corresponding to the touch action (the touch action location) can be determined accurately from the start and end contact positions.
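A minimal sketch of the S2110 classification rules with this embodiment's threshold values plugged in; the overlapping-touch-point test of S2110.5 is abstracted into a boolean argument, and the default marker diameter is an assumption:

```python
import math

def classify_touch(frame_count, start_pos, end_pos, has_overlapping_points,
                   fps=30.0, d1=5.0, marker_diameter=24.0):
    """Classify one touch action per S2110.1-S2110.5.

    frame_count: frames between the first action start and end times;
    start_pos/end_pos: (x, y) start and end contact positions.
    """
    t1, t2 = 3, fps * 2 / 3          # first / second preset frame values
    d2 = marker_diameter * 2 / 3     # second position difference threshold
    dist = math.dist(start_pos, end_pos)
    if frame_count < t1:
        return "misdetection"        # S2110.1: too short to be a real action
    if frame_count >= t2:            # S2110.3: long duration
        return "slide" if dist >= d1 else "long_press"
    if dist < d1:                    # S2110.4: short duration, no movement
        return "single_click"
    if dist <= d2 and has_overlapping_points:
        return "double_click"        # S2110.5: small offset plus overlapping points
    return "slide"
```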
Specifically, in S2110.3 and S2110.4, the start contact position of the j-th touch action can be determined by extracting from the screen recording the frame corresponding to its first action start time and detecting the touch point position in it with the YOLOv7 object detection method; similarly, the end contact position can be determined by extracting the frame corresponding to its first action end time and detecting the touch point position in it with the YOLOv7 object detection method.
Note that one screen recording may contain multiple touch actions, and S211 extracts the frames sequentially in frame order, so the first action start time and first action end time of each touch action are detected in turn. Once the first action end time of a touch action is detected, its touch action type and touch action location can be determined from the frames at its first action start time and first action end time by the methods of S2110.1 to S2110.6; detection then continues with the first action start and end times of the next touch action and the determination of its type and location. For each touch action, the detection of its first action start and end times and the determination of its type and location are independent and do not interfere with each other. Therefore, in a concrete implementation of S2110, as soon as the recording contains a touch action and its first action end time has been detected, the corresponding type and location determination can be carried out without waiting for the remaining frames of the recording to be processed; throughout touch point detection, S2110 can be invoked right after each touch action's first action end time is determined, so the touch actions are processed to completion one by one.
Furthermore, when an application is operated, two or more touch actions may appear in the same frame, i.e., the next touch action starts before the previous one has finished, so a frame may contain multiple touch points. During per-frame touch point detection, when multiple touch points are detected in the current frame, they can be classified by the relative positions of all touch points (touch points belonging to the same touch action fall into the same class). For example, when the current frame contains 2 touch points: if their position difference is essentially 0, they can be treated as the same class (touch points of one touch action); if their position difference is large (comparable against a predetermined threshold), they can be treated as different classes (touch points of 2 different touch actions). In addition, for a current frame with i > 1, touch points can be classified by the relative positions of the current frame's touch points and the previous frame's touch points (if any): for example, with 2 touch points in the current frame, the position difference between each of them and every touch point of the previous frame is computed, and the touch points with the smallest position difference across the two frames fall into the same class (i.e., belong to the same touch action). After the touch points in frames containing several of them have been classified this way, the subsequent judgments can be made per class of touch points according to this embodiment's touch point detection method (S212 to S215 and S2110), as sketched below.
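A hedged sketch of that grouping: each touch point in the current frame is attached to the nearest open track from the previous frame, and points farther away than a threshold start a new track; the threshold value is an assumption:

```python
import math

def group_touch_points(frame_points, same_action_dist=40.0):
    """Group per-frame touch point centers into touch-action tracks.

    frame_points: list over frames, each a list of (x, y) centers.
    Returns a list of tracks, each a list of (frame_index, (x, y)).
    """
    tracks = []           # each track collects the points of one touch action
    open_tracks = []      # tracks whose action may still be in progress
    for i, points in enumerate(frame_points):
        next_open = []
        for p in points:
            # attach to the nearest still-open track from the previous frame
            best = min(open_tracks,
                       key=lambda tr: math.dist(tr[-1][1], p),
                       default=None)
            if best is not None and math.dist(best[-1][1], p) <= same_action_dist:
                best.append((i, p))
                open_tracks.remove(best)
                next_open.append(best)
            else:
                track = [(i, p)]      # a new touch action begins here
                tracks.append(track)
                next_open.append(track)
        open_tracks = next_open       # tracks not extended this frame are closed
    return tracks
```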
Specifically, the flowchart of touch point detection on the screen recording in S21 of this embodiment is shown in Fig. 3. Before frame extraction, the whole process is state-initialized: the thresholds are set, and initial default values of the first action start time and first action end time are set. During subsequent frame extraction, the judgments are made against these thresholds, and the first action start and end times of each touch action are updated from the initial defaults. After each frame is processed, the state is reset, i.e., the thresholds and initial defaults from state initialization continue to be used when detecting the next frame. When touch point detection has finished over the entire recording, the touch point detection process ends.
Preferably, S22 includes:
S221: sequentially extract each frame of screen recording in the screen recording record according to the sequence of the frames; when the k-th frame screen recording is extracted, detect it with the Yolov7 target detection method and judge whether a keyboard exists in the k-th frame screen recording; if yes, execute S222, otherwise execute S228; wherein 1 ≤ k ≤ n;
S222: when k = 1, judge that an input action exists in the k-th frame screen recording, determine the screen recording time of the k-th frame as the second action starting time of the corresponding input action, and then execute S224;
when 1 < k < n, judge that an input action exists in the k-th frame screen recording, and execute S223;
when k = n, judge that an input action exists in the k-th frame screen recording, determine the screen recording time of the k-th frame as the second action ending time of the corresponding input action, and execute S2212;
S223: judge whether an input keyboard exists in the (k-1)-th frame screen recording; if yes, execute S224, otherwise execute S227;
S224: judge whether a cursor exists in the k-th frame screen recording; if yes, execute S225, otherwise execute S226;
S225: update the preset cursor start and end positions of the corresponding input action according to the detected cursor location, and execute S226;
S226: return to S221 and continue to extract the (k+1)-th frame screen recording in the screen recording record; when k + 1 > n, execute S2212;
S227: determine the screen recording time of the k-th frame screen recording as the second action starting time of the corresponding input action, and then execute S224;
S228: when k = 1, judge that no input action exists in the k-th frame screen recording, return to S221, and continue to extract the (k+1)-th frame screen recording in the screen recording record;
when 1 < k < n, judge that no input action exists in the k-th frame screen recording, and execute S229;
when k = n, judge that no input action exists in the k-th frame screen recording, and execute S2212;
S229: judge whether a keyboard exists in the (k-1)-th frame screen recording; if yes, execute S2210, otherwise execute S2211;
S2210: determine the screen recording time of the (k-1)-th frame screen recording as the second action ending time of the corresponding input action, then return to S221 and continue to extract the (k+1)-th frame screen recording in the screen recording record; when k + 1 > n, execute S2212;
S2211: return to S221 and continue to extract the (k+1)-th frame screen recording in the screen recording record; when k + 1 > n, execute S2212;
S2212: when the screen recording record contains at least one input action, sequentially determine the input action location and the input text content corresponding to each input action according to the order of the second action starting times of the input actions, completing the keyboard detection and cursor detection of the screen recording record.
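The per-frame logic of S221 to S2211 can be summarized as a small state machine. The following sketch is illustrative only, assuming stub detectors in place of the YOLOv7 model; it returns, for each input action, the second action starting time, second action ending time and the accumulated preset cursor start/end positions:

```python
def has_keyboard(frame):
    """Stand-in for the YOLOv7 keyboard detector."""
    raise NotImplementedError

def find_cursor_x(frame):
    """Stand-in for the YOLOv7 cursor detector; returns the cursor's x
    coordinate, or None when no cursor is visible."""
    raise NotImplementedError

def scan_input_actions(frames, times):
    actions = []
    start, span, prev_kb, prev_t = None, None, False, 0.0  # state initialization
    for frame, t in zip(frames, times):
        kb = has_keyboard(frame)                 # S221
        if kb:
            if not prev_kb:
                start = t                        # S222/S227: second action starting time
            cx = find_cursor_x(frame)            # S224
            if cx is not None:                   # S225: update preset cursor start/end
                span = (cx, cx) if span is None else (min(span[0], cx), max(span[1], cx))
        elif prev_kb:                            # S229/S2210: keyboard just disappeared
            actions.append((start, prev_t, span))  # previous frame ends the action
            start, span = None, None             # state reset
        prev_kb, prev_t = kb, t                  # S226/S2211: move to the next frame
    if prev_kb:                                  # k = n with the keyboard still shown
        actions.append((start, prev_t, span))
    return actions
```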
In the screen recording record, an input action, if present, is likewise composed of a sequence of operation actions: the characters to be input are selected on the keyboard and entered at the position in the input box indicated by the cursor. Whether an input action exists in a given frame of screen recording therefore depends mainly on the keyboard and the cursor, and when an input action does exist, its specific information depends mainly on the position of the input box (i.e. the input action location) and the input text content. Hence, as with touch point detection, the frames are extracted and judged in sequence: first it is judged whether a keyboard exists in the extracted frame (e.g. the k-th frame screen recording); when a keyboard exists in the frame, an input action exists in that frame, and when no keyboard exists, no input action exists. This realizes the judgment of whether an input action exists in each frame of screen recording.
When an input action exists in the k-th frame screen recording, it is further necessary to judge, from the keyboard condition of the previous frame (the (k-1)-th frame screen recording), whether the screen recording time of the k-th frame is the starting time (i.e. the second action starting time), the ending time (i.e. the second action ending time), or an intermediate time of the input action. If a keyboard appears in the (k-1)-th frame screen recording, the k-th and (k-1)-th frames belong to the same continuous input action, and the screen recording time of the k-th frame can provisionally be taken as the ending time of that input action (when the (k+1)-th frame is extracted, if a keyboard also exists there, the (k+1)-th, k-th and (k-1)-th frames all belong to the same input action; in the judgment of the (k+1)-th frame, the ending time of the input action is updated in light of the keyboard detection of the k-th frame, after which the screen recording time of the k-th frame is in fact an intermediate time of the input action). If no keyboard appears in the (k-1)-th frame screen recording, no input action exists in the (k-1)-th frame, and the k-th frame marks the starting time of the input action. This realizes the judgment of the frame's screen recording time when an input action exists in the k-th frame screen recording.
When no input action exists in the k-th frame screen recording, the second action ending time of an input action present in the (k-1)-th frame can be further judged from the keyboard condition of the previous frame (i.e. the (k-1)-th frame screen recording): if a keyboard appears in the (k-1)-th frame screen recording but no keyboard appears in the k-th frame, the screen recording time of the (k-1)-th frame is the second action ending time of the input action present in the (k-1)-th frame.
When an input action exists in the k-th frame screen recording and the judgment of that frame's screen recording time is complete, it is further judged whether a cursor exists in the frame; when a cursor is detected, the preset cursor start and end positions (i.e. the leftmost and rightmost positions of the input box where the cursor is located) are updated according to the cursor location, which facilitates the subsequent extraction of the input box area and hence the detection of the input action location and the input text content.
On the same principle, each frame is judged in frame order, the second action starting time and second action ending time of each input action are determined, and the corresponding input action location and input text content are then determined in sequence.
Preferably, S2212 includes:
S2212.1: when the screen recording record contains at least one input action, then for the l-th input action in the screen recording record, extract from the screen recording record the first screen recording corresponding to the second action starting time and the second screen recording corresponding to the second action ending time of the l-th input action, extract the input box area according to the updated preset cursor start and end positions of the l-th input action together with the first screen recording and the second screen recording, and obtain the input action location and the input text content corresponding to the l-th input action from the input box area; wherein 1 ≤ l ≤ m₂, and m₂ is the total number of input actions in the screen recording record;
S2212.2: let l = l + 1 and obtain the input action location and the input text content corresponding to the (l+1)-th input action in the screen recording record by the method of S2212.1, until the input action locations and the input text contents of all the input actions in the screen recording record have been obtained in sequence.
When an input action is detected in the screen recording record, the corresponding first and second screen recordings can be extracted according to the second action starting time and second action ending time of the input action; combined with the updated preset cursor start and end positions, the input box area corresponding to the input action can be accurately extracted, and accurate detection of the input action location and the input text content can then be realized from the input box area.
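A minimal sketch of this extraction, assuming the typed text is complete in the second screen recording and that pytesseract stands in for the OCR step (the patent only requires "a conventional OCR algorithm"); the `row` and `pad` parameters are invented for illustration:

```python
import pytesseract  # assumed OCR backend

def read_input_action(second_frame, cursor_span, row, pad=24):
    """second_frame: the frame at the second action ending time, where the
    typed text is complete; cursor_span: the updated preset cursor start/end
    x positions; row: assumed y coordinate of the input box from cursor
    detection; pad: assumed margin in pixels around the box."""
    x0, x1 = int(cursor_span[0]) - pad, int(cursor_span[1]) + pad
    y0, y1 = row - pad, row + pad
    box = second_frame[max(0, y0):y1, max(0, x0):x1]   # input box area
    text = pytesseract.image_to_string(box).strip()    # input text content
    location = ((x0 + x1) // 2, row)                   # input action location
    return location, text
```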
Specifically, the flowchart of keyboard detection and cursor detection for the screen recording in S22 of this embodiment is shown in fig. 4. Before frame extraction, the whole flow is state-initialized: initial default values of the second action starting time, second action ending time and preset cursor start and end positions are set, and during subsequent frame-by-frame detection these values are updated for each input action from the initial defaults. After each frame has been processed, the state is reset. When keyboard detection and cursor detection have been completed for the entire screen recording record, the keyboard detection and cursor detection flow ends.
Preferably, S3 comprises:
S31: when at least one touch action exists in the screen recording record, perform character recognition and image recognition on each touch action according to its touch action type and touch action location, obtaining a first recognition result for each touch action;
when at least one input action exists in the screen recording record, perform character recognition on each input action according to its input action location and input text content, obtaining a second recognition result for each input action;
S32: obtain the screen recording recognition result from all the first recognition results and all the second recognition results.
Based on the touch action location, the position of each type of touch action can be determined. Since the action object of a touch action is usually an icon and/or text, performing character recognition and image recognition on each touch action at its determined position allows the specific icon and/or text acted on to be recognized accurately, efficiently and automatically, yielding the specific type of touch action performed on a specific icon and/or text and hence an accurate first recognition result. Similarly, since the action object of an input action is usually text at a known position, character recognition based on the input action location and the input text content allows the specific content of the input action to be recognized accurately, efficiently and automatically, yielding an accurate second recognition result. Accurate test cases can then be generated, accurate reproduction of the screen recording record is realized, and the reproduction success rate is improved.
Specifically, in the present embodiment, character recognition uses a conventional OCR (Optical Character Recognition) algorithm, and image recognition uses a conventional CNN (convolutional neural network) deep learning method. The specific operation of these methods is prior art and is not described again here.
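As a concrete illustration of the recognition in S31, the following sketch crops a patch around a touch action location, reads any text in it with OCR and classifies the icon with a CNN; pytesseract and the torch classifier are assumed stand-ins, since the patent only specifies "a conventional OCR algorithm" and "a conventional CNN":

```python
import cv2
import pytesseract
import torch

def recognise_touch_target(frame, location, icon_model, labels, half=60):
    """icon_model: an assumed pre-trained CNN classifier; labels: its class
    names; half: assumed half-width of the crop around the touch location."""
    x, y = location
    patch = frame[max(0, y - half):y + half, max(0, x - half):x + half]
    text = pytesseract.image_to_string(patch).strip()      # character recognition
    inp = torch.from_numpy(cv2.resize(patch, (224, 224)))  # image recognition
    inp = inp.permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        icon = labels[icon_model(inp).argmax(dim=1).item()]
    return {"text": text or None, "icon": icon}
```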
It should be noted that, when no touch action exists in the screen recording record, touch actions need not be recognized by the character recognition and image recognition of S31, and the system either outputs no first recognition result or outputs an empty one; similarly, when no input action exists in the screen recording record, input actions need not be recognized by the character recognition of S31, and the system either outputs no second recognition result or outputs an empty one. The recognition processes of the two kinds of operation action are independent and do not interfere with each other.
Specifically, in S4 of this embodiment, the corresponding test case is generated from the first recognition results and second recognition results recorded for each frame of screen recording, so that the test case follows the actual sequence of the operation actions and the screen recording record can be reproduced.
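As an illustration of what such a generated test case might look like, here is a minimal sketch that replays recognised actions through adb input commands; the action dictionary format is an assumed intermediate representation, not the patent's own test-case schema:

```python
def to_adb_script(actions):
    """actions: recognised operation actions sorted by starting time, each an
    assumed dict such as {"type": "click", "x": 540, "y": 960} or
    {"type": "input", "text": "hello"}; returns adb commands that replay
    them on a connected device."""
    cmds = []
    for a in actions:
        t = a["type"]
        if t == "click":
            cmds.append(f"adb shell input tap {a['x']} {a['y']}")
        elif t == "double click":
            cmds += [f"adb shell input tap {a['x']} {a['y']}"] * 2
        elif t == "long press":
            # a swipe that starts and ends at the same point acts as a long press
            cmds.append(f"adb shell input swipe {a['x']} {a['y']} {a['x']} {a['y']} 800")
        elif t == "slide":
            cmds.append(f"adb shell input swipe {a['x']} {a['y']} {a['x2']} {a['y2']} 300")
        elif t == "input":
            # note: spaces must be escaped as %s for `input text`
            cmds.append(f"adb shell input text '{a['text']}'")
    return cmds
```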
The advantages of this embodiment are as follows:
1. Higher safety. No third-party software needs to be installed, no potentially unsafe instruments or equipment need to be connected, the Android version does not need to be modified, and the Android device does not need to be rooted; only the "show touch operations" option in the Android developer mode needs to be enabled (see the snippet after this list).
2. Lower cost and convenient deployment. No other software or equipment needs to be purchased; the software can be deployed as an offline SDK or deployed in containerized form as a service.
3. A simpler, easier-to-use process. With the end-to-end algorithm architecture, the different types of user behavior performed in a recorded video can be obtained accurately simply by supplying the video file recorded from the Android device.
4. Higher precision. Tested on 60 types of software commonly used in China, the action reproduction success rate reaches more than 95%.
5. Less time-consuming. Model compression is applied to raise model inference speed, so a recorded video can be converted into a reproducible test case more quickly.
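For item 1, the "show touch operations" developer option can be toggled from a connected host; a minimal sketch, assuming adb is available on the PATH (`show_touches` is the standard Android system setting behind this option):

```python
import subprocess

# enable the "show touch operations" developer option over adb
subprocess.run(
    ["adb", "shell", "settings", "put", "system", "show_touches", "1"],
    check=True,
)
```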
Embodiment II
As shown in fig. 5, a test case generation system based on application screen recording is applied to the test case generation method based on application screen recording in the first embodiment, and includes:
the screen recording module is used for acquiring screen recording records of the operation application program;
the detection module is used for respectively carrying out touch point detection, keyboard detection and cursor detection on the screen recording records to obtain screen recording detection results;
the identification module is used for identifying the screen recording record according to the screen recording detection result to obtain a screen recording identification result;
and the generating module is used for generating a test case according to the screen recording identification result.
The screen recording module records the user's operation behavior while operating the application program, producing the corresponding screen recording record. The detection module then performs touch point detection, keyboard detection and cursor detection on the screen recording record, automatically analysing it and detecting the series of operation actions in the user's behavior (including touch actions and the input actions of entering text on the keyboard); detecting the user's operation behavior from the screen recording record in this way yields the screen recording detection result. Once the operation behavior has been detected, the recognition module recognizes the screen recording record on the basis of the screen recording detection result, automatically converting the user's recorded operation behavior into a reproducible screen recording recognition result; finally, the generation module generates the corresponding test case from the screen recording recognition result.
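As an illustration of how the four modules of fig. 5 compose into one pipeline, here is a minimal sketch; all class and method names are invented for illustration and are not from the patent:

```python
class ScreenRecordModule:
    def record(self, package): ...          # capture the screen recording of the app

class DetectionModule:
    def detect(self, recording): ...        # touch point, keyboard and cursor detection

class RecognitionModule:
    def recognise(self, recording, detections): ...  # OCR + image recognition

class GenerationModule:
    def generate(self, recognition): ...    # emit the reproducible test case

def build_test_case(package):
    recording = ScreenRecordModule().record(package)
    detections = DetectionModule().detect(recording)
    recognition = RecognitionModule().recognise(recording, detections)
    return GenerationModule().generate(recognition)
```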
The test case generation system based on application screen recording records of this embodiment realizes automatic generation of test cases in UI testing and reproduces the screen recording record from the automatically generated test case. It effectively saves labor cost and requires no additional software or equipment to be deployed; no third-party software needs to be installed, no unsafe instruments or equipment need to be connected, and the operating system hosting the application program does not need to be modified or rooted, so safety is high. The process is simple and the operation straightforward, effectively helping testers reproduce bugs in the application program and improving test efficiency, with a high test case reproduction success rate.
The functions of the test case generation system based on application screen recording records of this embodiment correspond to the steps of the method of Embodiment I; details not elaborated in this embodiment are described in Embodiment I and figs. 1 to 4 and are not repeated here.
Embodiment III
A test case generation system based on application screen recording records comprises a processor, a memory and a computer program stored in the memory and executable on the processor, wherein the computer program, when run, implements the method steps of the test case generation method based on application screen recording records of Embodiment I.
Running the computer program stored in the memory on the processor realizes automatic generation of test cases in UI testing and reproduces the screen recording record from the automatically generated test case; it effectively saves labor cost and requires no additional software or equipment to be deployed; no third-party software needs to be installed, no unsafe instruments or equipment need to be connected, and the operating system hosting the application program does not need to be modified or rooted, so safety is high; the process is simple and the operation straightforward, effectively helping testers reproduce bugs in the application program and improving test efficiency, with a high test case reproduction success rate.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or any conventional processor; the processor is the control centre of the computer device and connects the parts of the whole computer device through various interfaces and lines.
The memory may be used to store the computer program and/or models, and the processor implements the various functions of the computer device by running or executing the computer program and/or models stored in the memory and by invoking the data stored in the memory. The memory may mainly comprise a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function (e.g. a sound playing function, an image playing function); the data storage area may store data created according to the use of the device (e.g. audio data, video data). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer programs. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer programs may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer programs may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The present embodiment also provides a computer storage medium, comprising: at least one instruction which, when executed, implements the method steps of the test case generation method based on application screen recording records of Embodiment I.
Executing the computer storage medium containing at least one instruction realizes automatic generation of test cases in UI testing and reproduces the screen recording record from the automatically generated test case; labor cost is effectively saved and no additional software or equipment needs to be deployed; no third-party software needs to be installed, no unsafe instruments or equipment need to be connected, and the operating system hosting the application program does not need to be modified or rooted, so safety is high; the process is simple, the operation straightforward, testers are effectively helped to reproduce bugs in the application program, test efficiency is improved, and the test case reproduction success rate is high.
Likewise, details of the third embodiment that are not elaborated here are described in Embodiment I, Embodiment II and figs. 1 to 5, and are not repeated.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A test case generation method based on application screen recording is characterized by comprising the following steps:
acquiring screen recording records of an operation application program;
respectively carrying out touch point detection, keyboard detection and cursor detection on the screen recording records to obtain screen recording detection results;
identifying the screen recording record according to the screen recording detection result to obtain a screen recording identification result;
and generating a test case according to the screen recording recognition result.
2. The method for generating test cases based on application screen recording according to claim 1, wherein the screen recording comprises n frames of continuous screen recording; the screen recording detection result comprises a first detection result and a second detection result;
touch point detection, keyboard detection and cursor detection are respectively carried out on the screen recording records to obtain screen recording detection results, and the method comprises the following steps:
performing touch point detection on the screen recording record based on a Yolov7 target detection method to obtain a first detection result;
performing keyboard detection and cursor detection on the screen recording record based on the Yolov7 target detection method to obtain a second detection result;
the first detection result comprises whether at least one touch action exists in the screen recording record, and when at least one touch action exists in the screen recording record, a first action starting time, a first action ending time, a touch action type and touch action positioning which correspond to each touch action;
the second detection result comprises whether at least one input action exists in the screen recording record, and when at least one input action exists in the screen recording record, a second action starting time, a second action ending time, an input action positioning and input text content corresponding to each input action;
wherein the touch action types comprise single click, double click, long press and slide.
3. The method for generating the test case based on the application screen recording record according to claim 2, wherein the performing touch point detection on the screen recording record based on the Yolov7 target detection method to obtain the first detection result comprises:
step 211: sequentially extracting each frame of screen recording in the screen recording record according to the sequence of the frames; when the ith frame is extracted for screen recording, detecting the ith frame for screen recording by adopting the Yolov7 target detection method, and judging whether a contact point exists in the ith frame for screen recording; if yes, go to step 212, otherwise go to step 216; wherein i is more than or equal to 1 and less than or equal to n;
step 212: when i =1, determining that a touch action exists in the screen recording of the ith frame, determining the screen recording time of the screen recording of the ith frame as the first action starting time of the corresponding touch action, returning to step 211, and continuously extracting the screen recording of the (i + 1) th frame in the screen recording record;
when 1 is more than i and less than n, judging that the touch action exists in the ith frame screen recording, and executing the step 213;
when i = n, determining that there is a touch action in the screen recording of the ith frame, and determining the screen recording time of the screen recording of the ith frame as the first action ending time of the corresponding touch action, and executing step 2110;
step 213: judging whether a touch point exists in the screen recording of the (i-1) th frame; if yes, go to step 214, otherwise go to step 215;
step 214: returning to step 211, continuing to extract the screen recording of the (i + 1) th frame in the screen recording record, and executing step 2110 when i +1 is greater than n;
step 215: determining the screen recording time of the ith frame of screen recording as the first action starting time of the corresponding touch action, returning to the step 211, continuously extracting the (i + 1) th frame of screen recording in the screen recording record, and executing the step 2110 when i +1 is greater than n;
step 216: when i =1, judging that no touch action exists in the screen recording of the ith frame, returning to step 211, and continuously extracting the screen recording of the (i + 1) th frame in the screen recording record;
when 1 is more than i and less than n, judging that no touch action exists in the ith frame screen recording, and executing step 217;
when i = n, judging that no touch action exists in the screen recording of the ith frame, and executing step 2110;
step 217: judging whether a touch point exists in the screen recording of the (i-1) th frame; if yes, go to step 218, otherwise go to step 219;
step 218: judging that touch action exists in the screen recording of the (i-1) th frame, determining the screen recording time of the (i-1) th frame as the corresponding first action ending time of the touch action, returning to the step 211, continuously extracting the screen recording of the (i + 1) th frame in the screen recording record, and executing the step 2110 when i +1 is greater than n;
step 219: returning to step 211, continuing to extract the screen recording of the (i + 1) th frame in the screen recording record, and executing step 2110 when i +1 is greater than n;
step 2110: when the screen recording record contains at least one touch action, sequentially obtaining the touch action type and the touch action location corresponding to each touch action according to the sequence of the first action starting time of each touch action, and completing the touch point detection of the screen recording record.
4. The method for generating test cases based on application screen recording according to claim 3, wherein step 2110 includes:
step 2110.1: when the screen recording record comprises at least one touch action, for the j-th touch action in the screen recording record, acquiring the screen recording frame number of the j-th touch action according to the first action starting time and the first action ending time corresponding to the j-th touch action; judging whether the screen recording frame number is greater than or equal to a first preset frame value, if so, executing step 2110.2, otherwise, judging that the j-th touch action is a misjudgment, and executing step 2110.6; wherein j is more than or equal to 1 and less than or equal to m₁, and m₁ is the total number of touch actions in the screen recording record;
step 2110.2: judging whether the screen recording frame number is larger than or equal to a second preset frame value, if so, executing a step 2110.3, otherwise, executing a step 2110.4; wherein the second preset frame value is greater than the first preset frame value;
step 2110.3: acquiring the starting contact point position and the ending contact point position corresponding to the j-th touch action, and obtaining the start-and-end contact point position difference corresponding to the j-th touch action according to the starting contact point position and the ending contact point position; judging whether the start-and-end contact point position difference is greater than or equal to a first position difference threshold value, if so, judging that the touch action type of the j-th touch action is sliding, and executing step 2110.6; otherwise, judging that the touch action type of the j-th touch action is long press, and executing step 2110.6;
step 2110.4: acquiring the starting contact point position and the ending contact point position corresponding to the j-th touch action, and obtaining the start-and-end contact point position difference according to the starting contact point position and the ending contact point position; judging whether the start-and-end contact point position difference is smaller than the first position difference threshold value, if so, judging that the touch action type of the j-th touch action is single click, and executing step 2110.6; otherwise, executing step 2110.5;
step 2110.5: judging whether the start-and-end contact point position difference is smaller than or equal to a second position difference threshold value and whether two coinciding contact points exist between the first action starting time and the first action ending time of the j-th touch action; if both are satisfied, judging that the touch action type of the j-th touch action is double click, and executing step 2110.6; if at least one is not satisfied, judging that the touch action type of the j-th touch action is sliding, and executing step 2110.6; wherein the second position difference threshold value is greater than the first position difference threshold value;
step 2110.6: obtaining the touch action location of the j-th touch action based on the touch action type of the j-th touch action and the corresponding starting and ending contact point positions, and executing step 2110.7;
step 2110.7: let j = j +1, obtain the touch action type and the touch action location of the j +1 th touch action in the screen recording record according to the method from step 2110.1 to step 2110.6 until the touch action types and the touch action locations of all the touch actions in the screen recording record are obtained in sequence, and touch point detection of the screen recording record is completed.
5. The method for generating the test case based on the application screen recording record according to claim 2, wherein the performing keyboard detection and cursor detection on the screen recording record based on the Yolov7 target detection method to obtain the second detection result comprises:
step 221: sequentially extracting each frame of screen recording in the screen recording record according to the sequence of the frames; when the k frame screen recording is extracted, detecting the k frame screen recording by adopting the Yolov7 target detection method, and judging whether a keyboard exists in the k frame screen recording; if yes, go to step 222, otherwise go to step 228; wherein k is more than or equal to 1 and less than or equal to n;
step 222: when k =1, determining that there is an input action in the screen recording of the k-th frame, and after determining the screen recording time of the screen recording of the k-th frame as the second action starting time of the corresponding input action, executing step 224;
when k is more than 1 and less than n, judging that the input action exists in the screen recording of the kth frame, and executing step 223;
when k = n, it is determined that there is an input action in the screen recording of the k-th frame, and the screen recording time of the screen recording of the k-th frame is determined as the second action ending time of the corresponding input action, and step 2212 is executed;
step 223: judging whether an input keyboard exists in the screen recording of the (k-1) th frame, if so, executing a step 224, otherwise, executing a step 227;
step 224: judging whether a cursor exists in the screen recording of the kth frame; if yes, go to step 225, otherwise go to step 226;
step 225: updating the corresponding preset cursor start and end positions of the input action according to the detected cursor positioning, and executing step 226;
step 226: returning to the step 221, continuing to extract the screen recording of the (k + 1) th frame in the screen recording record, and executing the step 2212 when k +1 is greater than n;
step 227: after determining the screen recording time of the screen recording of the kth frame as the second action starting time of the corresponding input action, executing step 224;
step 228: when k =1, judging that no input action exists in the screen recording of the k frame, returning to the step 221, and continuously extracting the screen recording of the (k + 1) th frame in the screen recording record;
when k is more than 1 and less than n, judging that no input action exists in the screen recording of the kth frame, and executing step 229;
when k = n, it is determined that there is no input action for the screen recording of the k-th frame, and step 2212 is executed;
step 229: judging whether a keyboard exists in the screen recording of the (k-1) th frame, if so, executing a step 2210, otherwise, executing a step 2211;
step 2210: after the screen recording time of the (k-1) th frame screen recording is determined as the second action ending time of the corresponding input action, returning to the step 221, continuously extracting the (k + 1) th frame screen recording in the screen recording record, and executing the step 2212 when k +1 is larger than n;
step 2211: returning to the step 221, continuing to extract the screen recording of the (k + 1) th frame in the screen recording record, and executing the step 2212 when k +1 is greater than n;
step 2212: and when the screen recording record contains at least one input action, sequentially determining the input action positioning and the input text content corresponding to each input action according to the sequence of the second action starting moment of each input action, and completing the keyboard detection and cursor detection of the screen recording record.
6. The method for generating the test case based on the application screen recording record according to claim 5, wherein step 2212 comprises:
step 2212.1: when the screen recording record comprises at least one input action, for the l-th input action in the screen recording record, extracting from the screen recording record the first screen recording corresponding to the second action starting moment and the second screen recording corresponding to the second action ending moment of the l-th input action, extracting an input frame area according to the updated preset cursor starting and ending position of the l-th input action together with the first screen recording and the second screen recording, and obtaining the input action positioning and the input text content corresponding to the l-th input action according to the input frame area; wherein l is more than or equal to 1 and less than or equal to m₂, and m₂ is the total number of input actions in the screen recording record;
step 2212.2: making l = l + 1, obtaining the input action positioning and the input text content corresponding to the (l+1)-th input action in the screen recording record according to the method of step 2212.1, until the input action positioning and the input text content of all the input actions in the screen recording record are obtained in sequence.
7. The method for generating the test case based on the application screen recording record according to any one of claims 2 to 6, wherein the identifying the screen recording record according to the screen recording detection result to obtain a screen recording identification result includes:
when at least one touch action exists in the screen recording record, respectively performing character recognition and image recognition on each touch action according to the touch action type and the touch action positioning of each touch action to obtain a first recognition result of each touch action;
when at least one input action exists in the screen recording record, respectively carrying out character recognition on each input action according to the input action positioning corresponding to each input action and the input text content to obtain a second recognition result of each input action;
and obtaining the screen recording recognition result according to all the first recognition results and all the second recognition results.
8. A test case generation system based on application screen recording is applied to the test case generation method based on application screen recording according to any one of claims 1 to 7, and comprises the following steps:
the screen recording module is used for acquiring screen recording records of the operation application program;
the detection module is used for respectively carrying out touch point detection, keyboard detection and cursor detection on the screen recording records to obtain screen recording detection results;
the identification module is used for identifying the screen recording record according to the screen recording detection result to obtain a screen recording identification result;
and the generating module is used for generating a test case according to the screen recording recognition result.
9. A test case generation system based on application screen recording, comprising a processor, a memory and a computer program stored in the memory and executable on the processor, the computer program implementing the method steps of any one of claims 1 to 7 when executed.
10. A computer storage medium, the computer storage medium comprising: at least one instruction which, when executed, implements the method steps of any one of claims 1 to 7.