CN110580170A - software performance risk identification method and device - Google Patents

Publication number
CN110580170A
Authority
CN
China
Prior art keywords
software
risk
characteristic value
execution operation
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910863103.3A
Other languages
Chinese (zh)
Other versions
CN110580170B (en)
Inventor
陈肇权
黄裕建
马泽政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN201910863103.3A
Publication of CN110580170A
Application granted
Publication of CN110580170B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a method and a device for identifying software performance risks. The method comprises: acquiring the execution operation of the software on data and objects; deconstructing the execution operation to generate a behavior characteristic value that describes it; and identifying the performance risk of the software according to the behavior characteristic value and a pre-established software risk characteristic value mapping. The method and device can greatly reduce the cost of software performance risk identification and improve the efficiency of identification and optimization during software development and operation and maintenance.

Description

Software performance risk identification method and device
Technical Field
The invention relates to the technical field of software performance testing, and in particular to a method and a device for identifying software performance risks.
Background
Software performance is a characteristic of software that directly reflects how well a software system, or a component of it, meets requirements such as business processing response and resource consumption. It is also an important index in software quality evaluation, reflecting whether the software can support its expected use. Identifying performance risks during the development and operation and maintenance stages has therefore always been an important subject, and it directly affects the cost of finding and fixing performance problems.
In the traditional software development and operation and maintenance process, identifying performance risks depends heavily on personal experience, or can only be done retroactively, by tracing back after a risk has materialized. This approach carries a high risk cost. Moreover, as business architectures grow more complex and more technical frameworks are introduced, performance risk identification and optimization become increasingly difficult, placing ever higher demands on the skills of development and maintenance personnel and on labor input.
Disclosure of Invention
To address the problems in the prior art, the software performance risk identification method and device provided by the invention greatly reduce the cost of software performance risk identification and improve identification and optimization efficiency in every stage where software performance risks need to be identified, particularly during software development and operation and maintenance.
In order to solve the technical problems, the invention provides the following technical scheme:
In a first aspect, the present invention provides a method for identifying software performance risk, including:
Acquiring the execution operation of the software on data and objects;
Deconstructing the execution operation to generate a behavior characteristic value that describes the execution operation;
Identifying the performance risk of the software according to the behavior characteristic value and a pre-established software risk characteristic value mapping.
Preferably, acquiring the execution operation of the software on the data and objects includes:
Parsing the program code in the software to generate a first abstract syntax tree;
Using a path coverage algorithm, with the entry function of the abstract syntax tree as the starting point, retrieving the functions of the abstract syntax tree and their calling relations, so as to generate the static execution operation of the program code;
Parsing the log files in the software in chronological order to generate a log file parsing result;
Aggregating the log file parsing results by serial number and transaction code to generate the static execution operation of the log file.
Preferably, acquiring the execution operation of the software on the data and objects further includes:
Setting probes in the process memory and the network communication layer of the program code and of the log file, respectively;
Monitoring the running logic of the software, the network interaction requests and the network interaction responses in real time through the probes;
Generating, by a bypass method, the dynamic execution operation of the program code and the log file according to the running logic of the software, the network interaction requests and the network interaction responses.
Preferably, deconstructing the execution operation to generate a behavior characteristic value that describes the execution operation includes:
Parsing the static execution operation and the dynamic execution operation of the program code to generate a second abstract syntax tree;
Traversing the second abstract syntax tree to extract a first behavior characteristic value;
Parsing the static execution operation of the log file and extracting a second behavior characteristic value according to keywords;
The behavior characteristic value includes the first behavior characteristic value and the second behavior characteristic value.
Preferably, the method further includes pre-establishing the software risk characteristic value mapping, which comprises: establishing the software risk characteristic value mapping according to threshold ranges of the behavior characteristic values and the corresponding software risk categories and levels.
Preferably, establishing the software risk characteristic value mapping according to the preset threshold ranges of the behavior characteristic values and the corresponding software performance risk categories and levels includes:
Establishing the software risk characteristic value mapping according to the respective preset threshold ranges of the returned data volume of query results, operation time consumption, operation concurrency, throughput, thread pool size, connection pool size, operating system CPU and memory resource consumption, statement concurrency and predicate matching index, together with their corresponding software performance risk categories and levels.
Preferably, identifying the performance risk of the software according to the behavior characteristic value and the pre-established software risk characteristic value mapping includes:
Querying, in the software risk characteristic value mapping, the performance risk identification result of the software corresponding to the behavior characteristic value.
In a second aspect, the invention provides a device for identifying software performance risk, the device comprising:
An execution operation acquisition unit, configured to acquire the execution operation of the software on data and objects;
A characteristic value extraction unit, configured to deconstruct the execution operation and generate a behavior characteristic value that describes the execution operation;
A risk identification unit, configured to identify the performance risk of the software according to the behavior characteristic value and a pre-established software risk characteristic value mapping.
Preferably, the execution operation acquisition unit includes:
A syntax tree generation module, configured to parse the program code in the software to generate a first abstract syntax tree;
A first static generation module, configured to use a path coverage algorithm, with the entry function of the abstract syntax tree as the starting point, to retrieve the functions of the abstract syntax tree and their calling relations, so as to generate the static execution operation of the program code;
A parsing result generation module, configured to parse the log files in the software in chronological order to generate a log file parsing result;
A second static generation module, configured to aggregate the log file parsing results by serial number and transaction code to generate the static execution operation of the log file.
Preferably, the execution operation acquisition unit further includes:
A probe setting module, configured to set probes in the process memory and the network communication layer of the program code and of the log file, respectively;
A monitoring module, configured to monitor the running logic of the software, the network interaction requests and the network interaction responses in real time through the probes;
A dynamic generation module, configured to generate, by a bypass method, the dynamic execution operation of the program code and the log file according to the running logic of the software, the network interaction requests and the network interaction responses.
Preferably, the characteristic value extraction unit includes:
An abstract generation module, configured to parse the static execution operation and the dynamic execution operation of the program code to generate a second abstract syntax tree;
A first characteristic value extraction module, configured to traverse the second abstract syntax tree to extract a first behavior characteristic value;
A second characteristic value extraction module, configured to parse the static execution operation of the log file and extract a second behavior characteristic value according to keywords;
The behavior characteristic value includes the first behavior characteristic value and the second behavior characteristic value.
Preferably, the device further includes a mapping generation unit, configured to pre-establish the software risk characteristic value mapping;
The mapping generation unit is specifically configured to establish the software risk characteristic value mapping according to threshold ranges of the behavior characteristic values and the corresponding software risk categories and levels.
Preferably, the mapping generation unit is further specifically configured to establish the software risk characteristic value mapping according to the respective preset threshold ranges of the returned data volume of query results, operation time consumption, operation concurrency, throughput, thread pool size, connection pool size, operating system CPU and memory resource consumption, statement concurrency and predicate matching index, together with their corresponding software performance risk categories and levels.
Preferably, the risk identification unit is specifically configured to query, in the software risk characteristic value mapping, the performance risk identification result of the software corresponding to the behavior characteristic value.
In a third aspect, the present invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for identifying a software performance risk when executing the program.
In a fourth aspect, the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method for identifying a risk of software performance.
As can be seen from the above, the method and device provided by the invention obtain the software behavior, that is, the process by which the software performs operations on data and objects, from the program code and log files in the software; extract from that behavior the characteristic values that describe it; and finally identify the performance risk of the software according to those characteristic values. The method is applicable to software development and operation and maintenance, can greatly reduce the cost of performance risk identification, and improves identification and optimization efficiency. It overcomes the difficulty in the prior art that performance risk identification depends heavily on personal experience or can only be done by tracing back after a risk has materialized.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a first flowchart of a method for identifying software performance risk in an embodiment of the invention;
FIG. 2 is a first flowchart of step 100 of the method for identifying software performance risk in an embodiment of the invention;
FIG. 3 is a second flowchart of step 100 of the method for identifying software performance risk in an embodiment of the invention;
FIG. 4 is a first flowchart of step 200 of the method for identifying software performance risk in an embodiment of the invention;
FIG. 5 is a second flowchart of step 200 of the method for identifying software performance risk in an embodiment of the invention;
FIG. 6 is a second flowchart of the method for identifying software performance risk in an embodiment of the invention;
FIG. 7 is a flowchart of step 300 of the method for identifying software performance risk in an embodiment of the invention;
FIG. 8 is a flowchart of the method for identifying software performance risk in a specific application example of the invention;
FIG. 9 is a schematic diagram of software performance risk identification in a specific application example of the invention;
FIG. 10 is a first schematic structural diagram of a device for identifying software performance risk in an embodiment of the invention;
FIG. 11 is a first schematic structural diagram of the execution operation acquisition unit in an embodiment of the invention;
FIG. 12 is a second schematic structural diagram of the execution operation acquisition unit in an embodiment of the invention;
FIG. 13 is a first schematic structural diagram of the characteristic value extraction unit in an embodiment of the invention;
FIG. 14 is a second schematic structural diagram of the characteristic value extraction unit in an embodiment of the invention;
FIG. 15 is a second schematic structural diagram of a device for identifying software performance risk in an embodiment of the invention;
FIG. 16 is a schematic structural diagram of an electronic device in an embodiment of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments that a person skilled in the art derives from them without creative effort fall within the protection scope of the invention.
An embodiment of the invention provides a specific implementation of the software performance risk identification method. Referring to FIG. 1, the method includes the following steps:
Step 100: Acquire the execution operation of the software on data and objects.
The execution of operations on data and objects by software is also referred to as software behavior, that is, the process by which software performs operations on data and objects. When software behavior is written, design and coding decisions may introduce performance risks that degrade software performance.
Step 200: Deconstruct the execution operation to generate a behavior characteristic value that describes the execution operation.
Specifically, a parser appropriate to the channel through which the software behavior was acquired is invoked, and the software behavior is deconstructed into behavior features described by a series of characteristic values. The characteristic values can be regarded as descriptions of the actions and attributes in the software behavior.
Step 300: Identify the performance risk of the software according to the behavior characteristic value and the pre-established software risk characteristic value mapping.
Specifically, the software risk characteristic value mapping is queried for the performance risk identification result corresponding to the behavior characteristic value, thereby determining the software performance risk corresponding to that specific behavior characteristic value.
As can be seen from the above, the method provided by the invention obtains the software behavior, that is, the process by which the software performs operations on data and objects, from the program code and log files in the software; extracts from that behavior the characteristic values that describe it; and finally identifies the software performance risk according to those characteristic values. The method is applicable to software development and operation and maintenance, can greatly reduce the cost of performance risk identification, and improves identification and optimization efficiency. It overcomes the difficulty in the prior art that performance risk identification depends heavily on personal experience or can only be done by tracing back after a risk has materialized.
In one embodiment, referring to FIG. 2, step 100 includes:
Step 101a: Parse the program code in the software to generate a first abstract syntax tree.
Step 102a: Using a path coverage algorithm, with the entry function of the abstract syntax tree as the starting point, retrieve the functions of the abstract syntax tree and their calling relations to generate the static execution operation of the program code.
Step 101b: Parse the log files in the software in chronological order to generate a log file parsing result.
Step 102b: Aggregate the log file parsing results by serial number and transaction code to generate the static execution operation of the log file.
Steps 101a and 102a generate the software behavior corresponding to the program code. They are a static scanning method: software behavior is collected from the program code by file parsing, without actually executing any operation. Likewise, steps 101b and 102b generate the software behavior corresponding to the log files and are also a static scanning method.
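Steps 101a and 102a can be illustrated with a minimal sketch using Python's standard `ast` module. It assumes the program under analysis is Python source, and it substitutes a plain depth-first reachability walk for the path coverage algorithm, which the text does not specify in detail. The sample source and the function names `build_call_graph` and `reachable_from` are hypothetical.

```python
import ast

# Hypothetical program code to be scanned statically (never executed).
SOURCE = '''
def load_users():
    return query("SELECT ID, NAME FROM USER")

def query(sql):
    return []

def report():
    users = load_users()
    return len(users)

def unused():
    return query("SELECT 1")
'''

def build_call_graph(source):
    """Map each top-level function name to the names it calls directly."""
    tree = ast.parse(source)  # the "first abstract syntax tree"
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            calls = []
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    if child.func.id not in calls:
                        calls.append(child.func.id)
            graph[node.name] = calls
    return graph

def reachable_from(graph, entry):
    """Depth-first walk of the call graph, starting at the entry function."""
    order, stack, seen = [], [entry], set()
    while stack:
        name = stack.pop()
        if name in seen or name not in graph:
            continue  # skip repeats and calls to functions defined elsewhere
        seen.add(name)
        order.append(name)
        stack.extend(reversed(graph[name]))
    return order

graph = build_call_graph(SOURCE)
static_ops = reachable_from(graph, "report")
print(static_ops)
```

Only functions reachable from the entry function appear in the result, so code that is never called from the entry point does not contribute to the collected static behavior.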
In one embodiment, referring to FIG. 3, step 100 further includes:
Step 101c: Set probes in the process memory and the network communication layer of the program code and of the log file, respectively.
The probe (NET probe) in step 101c extracts the key information the target requires from massive volumes of data, provides early-warning indications, and flexibly generates various messages and information icons, helping the user grasp network activity in a timely, comprehensive and accurate manner and thereby improving real-time tracking and fast response capabilities.
Step 102c: Monitor the running logic of the software, the network interaction requests and the network interaction responses in real time through the probes.
Step 103c: Generate, by a bypass method, the dynamic execution operation of the program code and the log file according to the running logic of the software, the network interaction requests and the network interaction responses.
In contrast to steps 101a to 102b, steps 101c to 103c intercept software behavior dynamically: probes are added in the process memory or the network communication layer, the running logic of the software or the requests and responses of network interaction are monitored in real time in bypass mode, and software behavior is collected by outputting copies of the data packets.
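The bypass idea, observing a call and copying what was seen to a side channel without changing the call's result, can be sketched in-process with a decorator. This is only an analogy under stated assumptions: the probes described in the text sit in process memory or the network communication layer, whereas the `probe` decorator, the `captured` list and the `send_request` function below are hypothetical stand-ins.

```python
import functools
import time

captured = []  # side channel that receives copies of observed behavior

def probe(func):
    """Wrap func so each call is copied to `captured` without altering its result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record a copy of the interaction; the caller never sees this.
            captured.append({
                "operation": func.__name__,
                "args": args,
                "elapsed_ms": (time.perf_counter() - start) * 1000.0,
            })
    return wrapper

@probe
def send_request(payload):
    # Hypothetical network interaction; here it only echoes the payload.
    return {"status": "ok", "echo": payload}

response = send_request("ping")
print(response, captured[0]["operation"])
```

The monitored function returns exactly what it would return without the probe, which is the property that makes the collection a bypass rather than an in-line modification.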
In one embodiment, referring to FIG. 4, step 200 includes:
Step 201a: Parse the execution operation of the program code to generate a second abstract syntax tree.
The execution operation in step 201a includes both the static execution operation and the dynamic execution operation of the program code.
An abstract syntax tree (AST) is an abstract representation of the syntactic structure of source code. It represents the syntactic structure of the programming language as a tree in which each node denotes a construct in the source code. The syntax is "abstract" in that it does not represent every detail of the concrete syntax. For example, nested parentheses are implicit in the tree structure and do not appear as nodes, whereas a conditional statement such as if-condition-then can be represented by a node with two branches.
Step 202a: Traverse the second abstract syntax tree to extract the first behavior characteristic value.
Traversal in step 202a means visiting each node in the tree (or graph) once, in turn, along some search route. The operation performed when a node is visited depends on the application; it may be checking the node's value, updating it, and so on. Different traversal methods visit the nodes in different orders.
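As an illustration of step 202a, the sketch below traverses a Python AST with `ast.NodeVisitor` and derives two characteristic values: the maximum loop nesting depth and the number of calls. The choice of these two features and the key names `MAX_LOOP_DEPTH` and `CALLS` are assumptions made for the example; the text does not enumerate the features extracted from the syntax tree.

```python
import ast

class LoopFeatureVisitor(ast.NodeVisitor):
    """Visit every node once, tracking loop nesting depth and call count."""

    def __init__(self):
        self.depth = 0
        self.max_loop_depth = 0
        self.call_count = 0

    def _visit_loop(self, node):
        self.depth += 1
        self.max_loop_depth = max(self.max_loop_depth, self.depth)
        self.generic_visit(node)  # descend into the loop body
        self.depth -= 1

    # for and while loops are handled identically
    visit_For = _visit_loop
    visit_While = _visit_loop

    def visit_Call(self, node):
        self.call_count += 1
        self.generic_visit(node)

# Hypothetical code fragment; it is parsed, not executed.
code = '''
for branch in branches:
    for user in branch.users:
        total = add(total, user.balance)
'''

visitor = LoopFeatureVisitor()
visitor.visit(ast.parse(code))
features = {"MAX_LOOP_DEPTH": visitor.max_loop_depth, "CALLS": visitor.call_count}
print(features)
```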
In one embodiment, referring to FIG. 5, step 200 further includes:
Step 201b: Parse the execution operation of the log file and extract a second behavior characteristic value according to keywords.
The execution operation in step 201b is specifically the static execution operation of the log file. For software behavior collected from logs, the parser performs syntactic and lexical analysis of the text and aggregates the multiple log lines that share a serial number. After parsing, characteristic values are generated by extracting keywords. For example, in one embodiment, a program behavior collected by static log auditing is:
2017/06/26 16:27:37,005 cashcoderecv-logType=->core4ctp:Operation message=operation begin execute,free memory:|1224779368|;
2017/06/26 16:27:37,006 cashcoderecv-connect to 127.0.0.1;
2017/06/26 16:27:37,073 cashcoderecv-select mothod:parallel=10;
2017/06/26 16:27:37,074 cashcoderecv-SQL=>SELECT ID,NAME FROM USER WHERE STATUS=1;
2017/06/26 16:27:37,271 cashcoderecv-RECORDNUM=20000;
2017/06/26 16:27:37,292 cashcoderecv-logType=->core4ctp:Operation message=operation end execute.it cost|287|ms;
After syntactic and lexical analysis and multi-line log aggregation, the software behavior is broken down as follows:
1. Access the USER table of the database at 127.0.0.1
2. Query the record set whose STATUS field equals 1
3. The result returns 20000 records
4. The operation takes 5 seconds
After the keywords are extracted, the behavior features are described in JSON format as follows: {"TYPE": "QUERYDB", "ADDRESS": "127.0.0.1", "TABLENAME": "USER", "CONDICTIONS": {"STATUS": "1"}, "RECORDNUM": "20000", "TIME": "5s", "PARALLEL": "10"}.
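The aggregation and keyword extraction described above can be sketched with regular expressions. The sketch makes several assumptions: the token before the first "-" in each message is treated as the aggregation key, the patterns in `PATTERNS` are invented for this sample log, and only a subset of the fields is extracted (timing is left out). The key spelling `CONDICTIONS` follows the JSON in the text.

```python
import json
import re
from collections import defaultdict

# Four of the sample log lines from the embodiment above.
LOG_LINES = [
    "2017/06/26 16:27:37,074 cashcoderecv-SQL=>SELECT ID,NAME FROM USER WHERE STATUS=1;",
    "2017/06/26 16:27:37,006 cashcoderecv-connect to 127.0.0.1;",
    "2017/06/26 16:27:37,271 cashcoderecv-RECORDNUM=20000;",
    "2017/06/26 16:27:37,073 cashcoderecv-select mothod:parallel=10;",
]

LINE = re.compile(r"^(\S+ \S+) (\w+)-(.*?);?$")

# Keyword patterns; assumed for this sketch, not taken from the patent.
PATTERNS = {
    "ADDRESS": re.compile(r"connect to ([\d.]+)"),
    "PARALLEL": re.compile(r"parallel=(\d+)"),
    "TABLENAME": re.compile(r"SQL=>SELECT .* FROM (\w+)"),
    "RECORDNUM": re.compile(r"RECORDNUM=(\d+)"),
}
CONDITION = re.compile(r"WHERE (\w+)=(\w+)")

def aggregate(lines):
    """Group messages by the token before '-', in chronological order."""
    groups = defaultdict(list)
    for line in sorted(lines):  # fixed-width timestamps sort lexicographically
        match = LINE.match(line)
        if match:
            _, key, message = match.groups()
            groups[key].append(message)
    return groups

def extract_features(messages):
    """Extract keyword-based characteristic values from one aggregated group."""
    features = {}
    for message in messages:
        for key, pattern in PATTERNS.items():
            match = pattern.search(message)
            if match:
                features[key] = match.group(1)
        condition = CONDITION.search(message)
        if condition:
            features["TYPE"] = "QUERYDB"  # assumption: a WHERE clause means a query
            features["CONDICTIONS"] = {condition.group(1): condition.group(2)}
    return features

features = extract_features(aggregate(LOG_LINES)["cashcoderecv"])
print(json.dumps(features))
```

Grouping before extraction matters because a single behavior is spread over several log lines; extracting per line would yield fragments rather than one feature record.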
In addition, the behavior characteristic value in step 200 includes the first behavior characteristic value in step 202a and the second behavior characteristic value in step 201b.
In particular, when the target object is a data store, such as a relational database common in the industry (ORACLE, MYSQL, DB2 and the like) or a distributed cache (SSDB, REDIS and the like), multidimensional features of the stored data need to be collected. When risk identification is performed on data access behavior, the scale and efficiency of the data access must be judged in combination with the analyzed data characteristics. In that case, step 200 queries and scans the database with collection rules by means of dynamic retrieval statistics. The collection rules are defined by configuration, and each rule corresponds to one evaluation dimension. For example, after the four collection dimensions TABLENAME, ROWNUM, COLUMNS and INDEX are configured in advance, the USER table is analyzed and the following features are collected and stored in JSON format: {"TABLENAME": "USER", "ROWNUM": "24718", "COLUMNS": {"ID": "NUMBER", "NAME": "VARCHAR(10)", "BRANCHID": "VARCHAR(20)", "STATUS": "NUMBER"}, "INDEX": {"PK_USER": "ID"}}.
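A rule-per-dimension collector of this kind can be sketched against an in-memory SQLite database using Python's standard `sqlite3` module. SQLite stands in for the databases named in the text purely so the example is self-contained; the table contents, the index name `IDX_USER_BRANCH` and the `RULES` dictionary are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "USER" (ID INTEGER PRIMARY KEY, NAME VARCHAR(10), '
    "BRANCHID VARCHAR(20), STATUS INTEGER)"
)
conn.executemany(
    'INSERT INTO "USER" (NAME, BRANCHID, STATUS) VALUES (?, ?, ?)',
    [("a", "b1", 1), ("b", "b1", 0), ("c", "b2", 1)],
)
conn.execute('CREATE INDEX IDX_USER_BRANCH ON "USER" (BRANCHID)')

def collect_rownum(conn, table):
    """ROWNUM dimension: number of rows in the table."""
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]

def collect_columns(conn, table):
    """COLUMNS dimension: column name mapped to declared type."""
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return {row[1]: row[2] for row in rows}

def collect_index(conn, table):
    """INDEX dimension: index name mapped to its indexed columns."""
    result = {}
    for row in conn.execute(f'PRAGMA index_list("{table}")').fetchall():
        name = row[1]
        info = conn.execute(f'PRAGMA index_info("{name}")').fetchall()
        result[name] = [col[2] for col in info]
    return result

# Each configured rule corresponds to one evaluation dimension.
RULES = {"ROWNUM": collect_rownum, "COLUMNS": collect_columns, "INDEX": collect_index}

features = {"TABLENAME": "USER"}
for dimension, rule in RULES.items():
    features[dimension] = rule(conn, "USER")
print(features)
```

Keeping the rules in a dictionary means a new evaluation dimension can be added by registering one more function, which mirrors the configurable rule definition described in the text.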
In an embodiment, referring to FIG. 6, the method for identifying a software performance risk further includes:
Step 400: Generate the software risk characteristic value mapping.
In implementing step 400, the mapping may be established according to threshold ranges of behavior characteristic values and the corresponding software risk categories and levels. More specifically, it may be established according to the respective preset threshold ranges of the returned data volume of query results, operation time consumption, operation concurrency, throughput, thread pool size, connection pool size, operating system CPU and memory resource consumption, statement concurrency and predicate matching index, together with their corresponding software performance risk categories and levels. For example, in one embodiment four risk features are defined: A. the returned data volume exceeds 1000; B. the time taken exceeds 500 milliseconds; C. the statement concurrency exceeds 100; D. the query predicate has no matching index. Taking the output of the behavior disassembly unit 2 and the data analysis unit 3 in the embodiments above: {"TYPE": "QUERYDB", "ADDRESS": "127.0.0.1", "TABLENAME": "USER", "CONDICTIONS": {"STATUS": "1"}, "RECORDNUM": "20000", "TIME": "5s", "PARALLEL": "10"}. This behavior hits three risk features at once: the returned data volume exceeds 1000 (RECORDNUM is 20000), the time taken exceeds 500 milliseconds (TIME is 5s), and the query predicate has no matching index (INDEX does not include the STATUS field). It is therefore identified as a risk behavior.
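The four risk features A to D and the worked example can be checked with a small sketch. It assumes that `PARALLEL` is the statement concurrency compared by feature C and that an index "matches" when every predicate field appears among the indexed columns; both readings are interpretations, and the helper `time_ms` and the `RISK_FEATURES` table are hypothetical names.

```python
# Behavior and data features as given in the embodiments above.
behavior = {
    "TYPE": "QUERYDB", "ADDRESS": "127.0.0.1", "TABLENAME": "USER",
    "CONDICTIONS": {"STATUS": "1"},
    "RECORDNUM": "20000", "TIME": "5s", "PARALLEL": "10",
}
data = {
    "TABLENAME": "USER", "ROWNUM": "24718",
    "COLUMNS": {"ID": "NUMBER", "NAME": "VARCHAR(10)",
                "BRANCHID": "VARCHAR(20)", "STATUS": "NUMBER"},
    "INDEX": {"PK_USER": "ID"},
}

def time_ms(text):
    """Convert '5s' or '287ms' to milliseconds."""
    if text.endswith("ms"):
        return float(text[:-2])
    if text.endswith("s"):
        return float(text[:-1]) * 1000.0
    raise ValueError(f"unknown time unit: {text}")

# Each risk feature is a predicate over the behavior and data features.
RISK_FEATURES = {
    "A": lambda b, d: int(b["RECORDNUM"]) > 1000,      # returned data volume
    "B": lambda b, d: time_ms(b["TIME"]) > 500,        # time taken
    "C": lambda b, d: int(b["PARALLEL"]) > 100,        # statement concurrency (assumed)
    "D": lambda b, d: not set(b["CONDICTIONS"]) <= set(d["INDEX"].values()),
}

hits = [name for name, rule in RISK_FEATURES.items() if rule(behavior, data)]
is_risk = bool(hits)
print(hits, is_risk)
```

Feature D is the only one that needs the data features as well as the behavior features, which is why the data access case requires both inputs.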
Note that step 400 may also be implemented with other relevant parameters; it is not limited to establishing the software risk characteristic value mapping from the respective preset threshold ranges of the returned data volume of query results, operation time consumption, operation concurrency, throughput, thread pool size, connection pool size, operating system CPU and memory resource consumption, statement concurrency and predicate matching index, together with their corresponding software performance risk categories and levels.
In one embodiment, referring to FIG. 7, step 300 includes:
Step 301: Query, in the software risk characteristic value mapping, the performance risk identification result of the software corresponding to the behavior characteristic value.
Different characteristic value threshold ranges correspond to different software performance risk identification results. Determining which threshold range a specific characteristic value falls in therefore determines the identification result for that value.
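A threshold-range lookup of this kind can be sketched with the standard `bisect` module. The categories, level names and upper bounds in `RISK_MAP` are invented for illustration (only the 1000 and 500 boundaries echo the example features above); the text does not fix concrete levels.

```python
from bisect import bisect_left

# Hypothetical mapping: sorted range boundaries plus one level per range.
# A value v falls in range i when bounds[i-1] < v <= bounds[i].
RISK_MAP = {
    "RECORDNUM": {
        "category": "large result set",
        "bounds": [1000, 10000],
        "levels": ["none", "medium", "high"],
    },
    "TIME_MS": {
        "category": "slow operation",
        "bounds": [500, 3000],
        "levels": ["none", "medium", "high"],
    },
}

def identify(feature, value):
    """Return (category, level) for the threshold range containing value."""
    entry = RISK_MAP[feature]
    # bisect_left keeps a value equal to a bound in the lower range,
    # so "exceeds 1000" is a strict comparison.
    index = bisect_left(entry["bounds"], value)
    return entry["category"], entry["levels"][index]

print(identify("RECORDNUM", 20000))
print(identify("TIME_MS", 287))
```

Storing the ranges as data rather than as code keeps the mapping queryable and easy to extend with further characteristic values.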
To further illustrate the present solution, the present invention provides a specific application example of the software performance risk identification method. Referring to fig. 8 and fig. 9, the application example includes the following steps.
S0: Collect software behavior.
It can be understood that the software behavior in step S0 refers to the process by which the software performs operations on data and objects. Specifically, software behaviors such as "application-side loop calculation", "program call", and "database table access and data aggregation processing" are acquired through static auditing means, such as code scanning and log mining, and through dynamic interception means, such as packet capture at the communication layer.
S1: Structurally deconstruct the collected software behavior.
The collected behavior is deconstructed into behavior feature values, that is, a series of feature values that describe the behavior and are stored in a structured format.
S2: Analyze the data characteristics.
Run retrieval and statistical queries on the data in the software's database, and collect the data characteristics using predefined rules.
S3: Identify the software performance risk according to the deconstructed behavior.
It can be understood that a risk feature is a value or value range of a behavior feature value; if the feature value hits that value or range, the behavior is defined as having a potential performance hazard. For example, for data access behaviors, a value greater than 1000 for the feature "SQL statement execution cost" is defined as a risk feature. In addition, risk features are maintained in a structured and configurable manner, which facilitates querying and extension.
It should be noted that, if the software behavior involves a data access operation, the data characteristics generated in S2 need to be queried during identification, so that the performance risk of the data access operation is evaluated comprehensively.
Each feature value is compared in a loop against the full set of preset risk features. If the feature value matches a risk feature's defined value or falls within its value range, the behavior is considered to have a potential performance hazard in the dimension of that feature value.
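The comparison loop of S3 can be sketched in Python as follows. This is a hedged illustration under stated assumptions: only the "SQL statement execution cost greater than 1000" rule comes from the text; the rule layout, the `OPS` table, the `identify_risks` function, and the second rule are hypothetical. Risk features are kept as plain data so they remain configurable.

```python
import operator

# Comparison operators that a configured risk feature may use.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}

# Structured, configurable risk features. Only the SQL cost rule is taken
# from the text; the log level rule is a hypothetical example.
RISK_FEATURES = [
    {"feature": "sql_execution_cost", "op": ">", "value": 1000},
    {"feature": "log_level", "op": "==", "value": "DEBUG"},
]


def identify_risks(behavior):
    """Compare every feature value of one behavior against all risk features."""
    hits = []
    for rule in RISK_FEATURES:
        name = rule["feature"]
        if name in behavior and OPS[rule["op"]](behavior[name], rule["value"]):
            hits.append(name)
    return hits
```

For instance, `identify_risks({"sql_execution_cost": 1500, "log_level": "INFO"})` returns `["sql_execution_cost"]`, while a cost of exactly 1000 produces no hit because the rule is a strict "greater than".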
S4: Risk prompting and automatic optimization.
Notify the developers responsible for the program of the identified software performance risk so that they can confirm it manually. At the same time, propose optimization suggestions for the risk features that were hit.
Specifically, the contact details of the research and development and operation and maintenance personnel responsible for the risky software behavior are obtained, the responsible persons are notified by mail, short message or similar means, and the risk is reviewed and confirmed manually. Optimization suggestions are then made for the risky behaviors: an optimization algorithm is maintained for each kind of risk feature, the risk features contained in a risky behavior are analyzed, and an optimization suggestion is calculated and proposed. For example, suppose the risky behavior of an HTTP interactive request is identified with two risk features, namely synchronous blocking while waiting for the response, and logging at the DEBUG level. When this behavior is analyzed, the method first prompts that the log level be adjusted to ERROR, and then recommends that the program interaction be changed to asynchronous submission with background processing.
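The HTTP example above can be sketched as a small lookup from risk features to suggestions. This is an assumption-laden illustration: the feature names, the `SUGGESTIONS` table, and the use of a numeric priority to reproduce the order described in the text (log level first, then asynchronous submission) are all hypothetical.

```python
# Hypothetical table: risk feature -> (priority, optimization suggestion).
# A lower priority number means the suggestion is presented earlier.
SUGGESTIONS = {
    "debug_logging": (1, "Adjust the log level to ERROR."),
    "sync_blocking_wait": (
        2,
        "Change the interaction to asynchronous submission with background processing.",
    ),
}


def suggest(risk_features):
    """Return the suggestions for the given risk features, ordered by priority."""
    known = [SUGGESTIONS[name] for name in risk_features if name in SUGGESTIONS]
    return [text for _, text in sorted(known)]
```

Risk features without a registered suggestion are simply skipped here; a fuller design would presumably route them to manual review.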
Based on the same inventive concept, an embodiment of the present application further provides a software performance risk identification device, which can be used to implement the method described in the above embodiments, as set out in the following embodiments. Because the device solves the problem on a principle similar to that of the software performance risk identification method, its implementation can refer to the implementation of the method, and repeated parts are not described again. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
The embodiment of the present invention provides a specific implementation manner of a software performance risk identification device capable of implementing the software performance risk identification method, and referring to fig. 10, the software performance risk identification device specifically includes the following contents:
An execution operation acquisition unit 10, configured to acquire the execution operations of the software on data and objects;
A feature value extraction unit 20, configured to deconstruct the execution operation and generate a behavior feature value that can describe the execution operation;
A risk identification unit 30, configured to identify the performance risk of the software according to the behavior feature value and a pre-established software risk characteristic value mapping.
Preferably, referring to fig. 11, the execution operation acquiring unit 10 includes:
A syntax tree generating module 101a, configured to parse the program code in the software to generate a first abstract syntax tree;
A static generation first module 102a, configured to retrieve the functions of the abstract syntax tree and their call relationships by using a path coverage algorithm, with the entry function of the abstract syntax tree as the starting point, so as to generate a static execution operation of the program code;
An analysis result generation module 103a, configured to parse the log files in the software in chronological order to generate a log file analysis result;
A static generation second module 104a, configured to aggregate the log file analysis results according to the serial number and the transaction code, so as to generate a static execution operation of the log file.
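The log-side half of this unit (modules 103a and 104a) can be sketched in Python as below. The sketch assumes a hypothetical log line layout of `<ISO timestamp> <serial number> <transaction code> <message>`; the real log format, and the function names `parse_line` and `aggregate`, are not given in the source.

```python
from collections import defaultdict
from datetime import datetime


def parse_line(line):
    """Split one log line of the assumed layout into its four fields."""
    timestamp, serial, txn_code, message = line.split(" ", 3)
    return datetime.fromisoformat(timestamp), serial, txn_code, message


def aggregate(lines):
    """Order log records by time, then group them by (serial number, transaction code)."""
    records = sorted(parse_line(line) for line in lines if line.strip())
    grouped = defaultdict(list)
    for _, serial, txn_code, message in records:
        grouped[(serial, txn_code)].append(message)
    return dict(grouped)
```

Each resulting group is the time-ordered sequence of operations recorded for one transaction, which is one plausible reading of the "static execution operation of the log file".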
Preferably, referring to fig. 12, the execution operation acquiring unit further includes:
A probe setting module 101b, configured to set probes in the process memory and the network communication layer of the program code, and in the process memory and the network communication layer of the log file, respectively;
A monitoring module 102b, configured to monitor, in real time through the probes, the logic of the software operation, the network interaction requests and the network interaction responses;
A dynamic generation module 103b, configured to generate, by using a bypass method, a dynamic execution operation of the program code and the log file according to the logic of the software operation, the network interaction requests and the network interaction responses.
Preferably, referring to fig. 13, the feature value extraction unit 20 includes:
An abstraction generating module 201a, configured to parse the static execution operation and the dynamic execution operation of the program code to generate a second abstract syntax tree;
The first feature value extraction module 202a is configured to perform a traversal operation on the second abstract syntax tree to extract a first behavior feature value.
Preferably, referring to fig. 14, the feature value extraction unit 20 further includes:
The second characteristic value extraction module 201b is configured to analyze static execution operations of the log file, and extract a second behavior characteristic value according to a keyword;
The behavior feature value includes the first behavior feature value and the second behavior feature value.
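The traversal performed by module 202a can be illustrated with Python's standard `ast` module. This is a sketch under assumptions: the source does not say which programming language is analyzed or which features are extracted, so parsing Python source and counting loops, loop nesting depth, and calls are stand-ins chosen for the example.

```python
import ast


def extract_features(source):
    """Walk the syntax tree of the source and collect simple behavior feature values."""
    tree = ast.parse(source)
    features = {"loops": 0, "max_loop_depth": 0, "calls": 0}

    def visit(node, depth):
        if isinstance(node, (ast.For, ast.While)):
            # Entering a loop: count it and track the deepest nesting seen.
            features["loops"] += 1
            depth += 1
            features["max_loop_depth"] = max(features["max_loop_depth"], depth)
        elif isinstance(node, ast.Call):
            features["calls"] += 1
        for child in ast.iter_child_nodes(node):
            visit(child, depth)

    visit(tree, 0)
    return features
```

A feature such as `max_loop_depth` could then be checked against a threshold range in the software risk characteristic value mapping, in the same way as any other behavior feature value.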
Preferably, referring to fig. 15, the software performance risk identification device further comprises a mapping generation unit 40, configured to generate the software risk characteristic value mapping. The mapping generation unit is specifically configured to establish the software risk characteristic value mapping according to the behavior characteristic value threshold ranges and the corresponding software risk categories and levels.
Preferably, the mapping generation unit is further specifically configured to establish the software risk characteristic value mapping according to the respective preset threshold ranges of the amount of data returned by a query, the operation time consumption, the operation concurrency, the throughput, the thread pool size, the connection pool size, the operating system CPU and memory resource consumption, the statement concurrency, and the predicate matching index, together with the software performance risk categories and levels corresponding to those threshold ranges.
Preferably, the risk identification unit is specifically configured to query a performance risk identification result of the software corresponding to the behavior characteristic value in the software risk characteristic value map.
As can be seen from the above description, the software performance risk identification device provided by the present invention obtains, from the program code and the log files of the software, the software behavior through which the software performs operations on data and objects; extracts from that behavior the feature values that describe it; and identifies the performance risk of the software according to those feature values. The device is suitable for software research and development as well as operation and maintenance, can greatly reduce the cost of performance risk identification, and improves the efficiency of identification and optimization. It overcomes the difficulty in the prior art that, during software research and development and operation and maintenance, performance risk identification depends heavily on human experience, or can only be carried out by tracing back after the risk has already materialized.
The embodiment of the present application further provides a specific implementation manner of an electronic device, which is capable of implementing all steps in the identification method of software performance risk in the foregoing embodiment, and referring to fig. 16, the electronic device specifically includes the following contents:
A processor 1201, a memory 1202, a communication interface 1203, and a bus 1204;
The processor 1201, the memory 1202 and the communication interface 1203 complete communication with each other through the bus 1204; the communication interface 1203 is configured to implement information transmission between related devices, such as a server-side device, a storage device, and a client device.
The processor 1201 is configured to call the computer program in the memory 1202, and executes the computer program to implement all the steps of the software performance risk identification method in the above embodiments. For example, when executing the computer program, the processor implements the following steps:
Step 100: Acquire the execution operation of the software on data and objects.
Step 200: Deconstruct the execution operation to generate a behavior characteristic value that can describe the execution operation.
Step 300: Identify the performance risk of the software according to the behavior characteristic value and the pre-established software risk characteristic value mapping.
As can be seen from the above description, the electronic device in the embodiment of the present application obtains, from the program code and the log files of the software, the software behavior through which the software performs operations on data and objects; extracts from that behavior the feature values that describe it; and finally identifies the performance risk of the software according to those feature values. The approach is suitable for software research and development as well as operation and maintenance, can greatly reduce the cost of performance risk identification, and improves the efficiency of identification and optimization. It overcomes the difficulty in the prior art that, during software research and development and operation and maintenance, performance risk identification depends heavily on human experience, or can only be carried out by tracing back after the risk has already materialized.
Embodiments of the present application further provide a computer-readable storage medium capable of implementing all the steps of the software performance risk identification method in the foregoing embodiments. The computer-readable storage medium stores a computer program which, when executed by a processor, implements all the steps of the software performance risk identification method in the foregoing embodiments. For example, when executing the computer program, the processor implements the following steps:
Step 100: Acquire the execution operation of the software on data and objects.
Step 200: Deconstruct the execution operation to generate a behavior characteristic value that can describe the execution operation.
Step 300: Identify the performance risk of the software according to the behavior characteristic value and the pre-established software risk characteristic value mapping.
As can be seen from the above description, the computer-readable storage medium in the embodiment of the present application obtains, from the program code and the log files of the software, the software behavior through which the software performs operations on data and objects; extracts from that behavior the feature values that describe it; and finally identifies the performance risk of the software according to those feature values. The approach is suitable for software research and development as well as operation and maintenance, can greatly reduce the cost of performance risk identification, and improves the efficiency of identification and optimization. It overcomes the difficulty in the prior art that, during software research and development and operation and maintenance, performance risk identification depends heavily on human experience, or can only be carried out by tracing back after the risk has already materialized.
The embodiments in this specification are described in a progressive manner; identical and similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, because the hardware-plus-program embodiments are substantially similar to the method embodiment, they are described briefly; for relevant details, refer to the corresponding parts of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Although the present application provides method steps as in an embodiment or a flowchart, more or fewer steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an actual apparatus or client product executes, it may execute sequentially or in parallel (e.g., in the context of parallel processors or multi-threaded processing) according to the embodiments or methods shown in the figures.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and implementation of the invention are explained herein through specific embodiments, and the description of the embodiments is intended only to help in understanding the method and core idea of the invention. Meanwhile, a person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (16)

1. A method for identifying software performance risk, comprising:
acquiring the execution operation of the software on data and objects;
deconstructing the execution operation to generate a behavior characteristic value which can describe the execution operation;
and identifying the performance risk of the software according to the behavior characteristic value and the pre-established software risk characteristic value mapping.
2. The method according to claim 1, wherein the obtaining of the execution operation of the software on the data and the object comprises:
analyzing program codes in the software to generate a first abstract syntax tree;
searching the function of the abstract syntax tree and the calling relation of the abstract syntax tree by using a path covering algorithm and taking the entry function of the abstract syntax tree as an initial end point so as to generate static execution operation of the program code;
Analyzing the log files according to the time sequence of the log files in the software to generate a log file analysis result;
and aggregating the analysis results of the log file according to the serial number and the transaction code to generate static execution operation of the log file.
3. The method according to claim 2, wherein the obtaining of the execution operation of the software on the data and the object further comprises:
setting probes in the process memory and the network communication layer of the program code, and in the process memory and the network communication layer of the log file, respectively;
Monitoring the logic, network interaction request and network interaction response of the software operation in real time through a probe;
And generating dynamic execution operation of the code program and the log file according to the logic of the software operation, the network interaction request and the network interaction response by utilizing a bypass method.
4. The identification method according to claim 2, wherein the deconstructing the execution operation and generating a behavior feature value that can describe the execution operation comprises:
analyzing the execution operation of the program code to generate a second abstract syntax tree;
Traversing the second abstract syntax tree to extract a first behavior characteristic value;
Analyzing the execution operation of the log file, and extracting a second behavior characteristic value according to the keyword;
The behavior feature value includes the first behavior feature value and the second behavior feature value.
5. The identification method according to claim 1, characterized in that the step of pre-establishing said software risk characteristic value mapping comprises: establishing the software risk characteristic value mapping according to the behavior characteristic value threshold range and the corresponding software risk category and level.
6. The method according to claim 5, wherein the step of establishing the software risk feature value mapping according to the preset threshold range of the behavior feature value and the corresponding software performance risk category and level thereof comprises:
establishing the software risk characteristic value mapping according to the respective preset threshold ranges of the amount of data returned by a query, the operation time consumption, the operation concurrency, the throughput, the thread pool size, the connection pool size, the operating system CPU and memory resource consumption, the statement concurrency and the predicate matching index, and the corresponding software performance risk categories and levels thereof.
7. The method according to claim 5, wherein the identifying the performance risk of the software according to the behavior characteristic value and a pre-established software risk characteristic value mapping comprises:
querying the software risk characteristic value mapping for the performance risk identification result of the software corresponding to the behavior characteristic value.
8. An apparatus for identifying software performance risk, comprising:
The execution operation acquisition unit is used for acquiring the execution operation of the software on the data and the object;
a characteristic value extraction unit, configured to deconstruct the execution operation, and generate a behavior characteristic value that may describe the execution operation;
and the risk identification unit is used for identifying the performance risk of the software according to the behavior characteristic value and the pre-established software risk characteristic value mapping.
9. The apparatus according to claim 8, wherein the execution operation acquisition unit includes:
The syntax tree generating module is used for analyzing the program codes in the software to generate a first abstract syntax tree;
A static generation first module, configured to retrieve, by using a path overlay algorithm, a function of the abstract syntax tree and a call relationship of the abstract syntax tree with an entry function of the abstract syntax tree as an initial end point, so as to generate a static execution operation of the program code;
The analysis result generation module is used for analyzing the log files according to the time sequence of the log files in the software so as to generate a log file analysis result;
And the static generation second module is used for aggregating the analysis results of the log file according to the serial number and the transaction code so as to generate static execution operation of the log file.
10. The identification device according to claim 9, wherein the execution operation acquisition unit further includes:
The probe setting module is used for respectively setting a probe in the process memory of the program code, the network communication layer and the process memory and the network communication layer in the log file;
the monitoring module is used for monitoring the logic of the software operation, the network interaction request and the network interaction response in real time through the probe;
And the dynamic generation module is used for generating dynamic execution operation of the code program and the log file according to the logic of the software operation, the network interaction request and the network interaction response by utilizing a bypass method.
11. The identification device according to claim 9, wherein the feature value extraction unit includes:
the abstract generating module is used for analyzing the execution operation of the program code to generate a second abstract syntax tree;
The first characteristic value extraction module is used for performing traversal operation on the second abstract syntax tree to extract a first behavior characteristic value;
The second characteristic value extraction module is used for analyzing the execution operation of the log file and extracting a second behavior characteristic value according to the keyword;
The behavior feature value includes the first behavior feature value and the second behavior feature value.
12. The identification device of claim 8, further comprising: a mapping generation unit, configured to establish the software risk characteristic value mapping in advance;
The mapping generation unit is specifically configured to establish the software risk characteristic value mapping according to the behavior characteristic value threshold range and the corresponding software risk category and level.
13. The identification device according to claim 12, wherein the mapping generation unit is further specifically configured to establish the software risk feature value mapping according to a preset threshold range of each query result of the returned data volume, the operation time consumption, the operation concurrency, the throughput, the thread pool size, the connection pool size, the operating system CPU memory resource consumption, the statement concurrency, and the predicate matching index, and a software performance risk category and a software performance risk level corresponding to the threshold range.
14. The identification device according to claim 12, wherein the risk identification unit is specifically configured to query a performance risk identification result of the software corresponding to the behavior feature value in the software risk feature value map.
15. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method for identifying a software performance risk according to any one of claims 1 to 7 are implemented when the program is executed by the processor.
16. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for identifying a software performance risk according to any one of claims 1 to 7.
CN201910863103.3A 2019-09-12 2019-09-12 Method and device for identifying software performance risk Active CN110580170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910863103.3A CN110580170B (en) 2019-09-12 2019-09-12 Method and device for identifying software performance risk

Publications (2)

Publication Number Publication Date
CN110580170A true CN110580170A (en) 2019-12-17
CN110580170B CN110580170B (en) 2023-07-21

Family

ID=68811784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910863103.3A Active CN110580170B (en) 2019-09-12 2019-09-12 Method and device for identifying software performance risk

Country Status (1)

Country Link
CN (1) CN110580170B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503562A (en) * 2015-09-06 2017-03-15 阿里巴巴集团控股有限公司 A kind of Risk Identification Method and device
CN107679404A (en) * 2017-08-31 2018-02-09 百度在线网络技术(北京)有限公司 Method and apparatus for determining software systems potential risk
CN109214171A (en) * 2018-08-29 2019-01-15 深信服科技股份有限公司 A kind of detection method of software, device, equipment and medium
CN109544166A (en) * 2018-11-05 2019-03-29 阿里巴巴集团控股有限公司 A kind of Risk Identification Method and device
US20190207974A1 (en) * 2017-12-29 2019-07-04 Cyphort Inc. System for query injection detection using abstract syntax trees
CN110135693A (en) * 2019-04-12 2019-08-16 北京中科闻歌科技股份有限公司 A kind of Risk Identification Method, device, equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021228036A1 (en) * 2020-05-14 2021-11-18 International Business Machines Corporation Modification of codified infrastructure for orchestration in multi-cloud environment
US11200048B2 (en) 2020-05-14 2021-12-14 International Business Machines Corporation Modification of codified infrastructure for orchestration in a multi-cloud environment
CN113836907A (en) * 2021-09-06 2021-12-24 北京好欣晴移动医疗科技有限公司 Text clustering picture identification method, device and system
CN113836907B (en) * 2021-09-06 2023-07-18 好心情健康产业集团有限公司 Text clustering picture identification method, device and system

Also Published As

Publication number Publication date
CN110580170B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
US6581052B1 (en) Test generator for database management systems
US9747335B2 (en) Generic operator framework
US20110219360A1 (en) Software debugging recommendations
US20050102613A1 (en) Generating a hierarchical plain-text execution plan from a database query
CN112988782B (en) Hive-supported interactive query method and device and storage medium
CN110543356A (en) Abnormal task detection method, device, equipment and computer storage medium
US10915535B2 (en) Optimizations for a behavior analysis engine
Biswas et al. Boa meets python: A boa dataset of data science software in python language
US11269880B2 (en) Retroreflective clustered join graph generation for relational database queries
CN111597243A (en) Data warehouse-based abstract data loading method and system
CN110909016A (en) Database-based repeated association detection method, device, equipment and storage medium
CN110580170B (en) Method and device for identifying software performance risk
CN116483850A (en) Data processing method, device, equipment and medium
EP3816814A1 (en) Crux detection in search definitions
EP3293645B1 (en) Iterative evaluation of data through simd processor registers
Gombos et al. P-Spar(k)ql: SPARQL evaluation method on Spark GraphX with parallel query plan
CN116069808A (en) Method and device for determining dependency information of database stored procedures, and electronic equipment
US11281671B2 (en) Retroreflective join graph generation for relational database queries
CN108763474B (en) Method, device and storage medium for acquiring transaction correlation and executing regression test
Lindvall et al. A comparison of latency for MongoDB and PostgreSQL with a focus on analysis of source code
KR101714985B1 (en) The method and device of inspection of nested query parallelism in a distributed parallel database
CN111190886B (en) Database access-oriented computation flow graph construction method, access method and device
CN117290355B (en) Metadata map construction system
Peltomaa Elasticsearch-based data management proof of concept for continuous integration
Xin et al. MEET DB2: automated database migration evaluation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant