CN116149626A - Program characteristic representation method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN116149626A
Authority
CN
China
Prior art keywords
logic
program
sample
characterization
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310188066.7A
Other languages
Chinese (zh)
Inventor
崔宝江
卜文杭
王子奇
杨俊�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202310188066.7A
Publication of CN116149626A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/313 Logic programming, e.g. PROLOG programming language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/76 Adapting program code to run in a different environment; Porting

Abstract

The application provides a program characteristic representation method, a program characteristic representation device, electronic equipment and a storage medium, and relates to the technical field of computers. The method comprises the following steps: extracting features of a program to be characterized through a program characterization model to obtain a logic feature characterization diagram; and analyzing the logic characteristic characterization graph through the program characterization model, and outputting a program characteristic representation result of the program to be characterized. The apparatus is for performing the above method. According to the method, the logic characteristics of the program to be characterized are analyzed, the unified characterization model is utilized to express the deep logic characteristics of programs written in different programming languages, the similar logic essence implied by different programming languages is revealed, and theoretical support is provided for downstream tasks such as logic vulnerability cause analysis and migration mining.

Description

Program characteristic representation method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technology, and in particular, to a method and apparatus for representing program features, an electronic device, and a storage medium.
Background
At present, logic vulnerabilities are an increasingly important focus of analysis in the security field, and their formal representation and analysis have likewise become a key problem. Because such vulnerabilities are currently found mainly by manually locating and analyzing their trigger logic, the industry is widely attempting to discover potential vulnerabilities through automated analysis methods.
However, the forms currently used to characterize logic vulnerabilities are often tightly coupled to the programming language and architecture of the system in which the vulnerability resides. As a result, the deep logic features of programs written in different programming languages are not characterized in a uniform form, and their intermediate representations cannot be directly migrated or compared.
Disclosure of Invention
An object of the embodiments of the present application is to provide a program feature representation method and apparatus, an electronic device, and a storage medium that analyze the logic features of a program to be characterized and use a unified characterization model to express the deep logic features of programs written in different programming languages.
In a first aspect, an embodiment of the present application provides a method for representing program features, the method including: performing feature extraction on a program to be characterized through a program characterization model to obtain a logic feature characterization graph, where the nodes of the logic feature characterization graph represent the data logic structure of the program to be characterized and the edges represent its execution logic structure; and analyzing the logic feature characterization graph through the program characterization model and outputting a program feature representation result for the program to be characterized. The program characterization model is generated in advance: feature extraction is performed on sample characterization features to generate a sample logic feature map, and the model is generated based on that sample logic feature map.
According to the technical scheme, the logic characteristic characterization diagram of the program to be characterized is analyzed and represented by using the program characterization model, so that the logic characteristic characterization diagram can be generated by extracting the data logic structure and the execution logic structure of the program written in different programming languages. The logic characteristic characterization graph reveals similar logic essence implied by different programming languages, so that migration comparison can be directly carried out on the different programming languages, and theoretical support is provided for downstream tasks such as logic vulnerability cause analysis, migration mining and the like.
In some embodiments, feature extraction is performed on a program to be characterized through a program characterization model to obtain a logic feature characterization graph, including: extracting a data logic structure and an execution logic structure of a program to be characterized through a program characterization model; and generating a logic characteristic characterization graph according to the data logic structure and the execution logic structure.
According to the embodiment of the application, the logic characteristic characterization graph is generated according to the extracted data logic structure and execution logic structure of the program to be characterized, so that the logic characteristic characterization graph can reflect the logic structure characteristics of the program to be characterized in multiple aspects.
In some embodiments, the data logic structure includes a file level, a class level, a function level, and a statement level, and generating the logic feature characterization graph according to the data logic structure and the execution logic structure includes: generating a first logic feature map, for the file level of the program to be characterized, according to the file level and the execution logic structure of the file level; generating a second logic feature map, for the class level, according to the class level and the execution logic structure of the class level; generating a third logic feature map, for the function level, according to the function level and the execution logic structure of the function level; generating a fourth logic feature map, for the statement level, according to the statement level and the execution logic structure of the statement level; and generating the logic feature characterization graph according to the first, second, third, and fourth logic feature maps.
According to this embodiment, the data logic structure is subdivided into multiple levels, a logic feature map is generated for each level, and the logic feature characterization graph is generated from those per-level maps. The graph therefore shows every level of the program to be characterized and the logical structure between levels, and presents the program's internal logic more intuitively. In addition, different programming languages are unified into a language-independent abstract logic expression at the file, class, function, and statement levels respectively.
In some embodiments, generating the logic feature characterization graph from the first, second, third, and fourth logic feature maps includes: rejecting contradictory logic features among the four logic feature maps to obtain the logic feature characterization graph of the program to be characterized. Contradictory logic features are cases in which the same logic feature is expressed differently in the first, second, third, and fourth logic feature maps.
According to the embodiment of the application, the contradictory logic features in each level are removed, so that the splicing process of the logic feature diagrams of each level can be smoothly carried out, and finally, the spliced logic feature characterization diagrams represent correct internal logic in the program to be characterized.
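The rejection of contradictory features during splicing can be sketched as follows. This is only an illustration under stated assumptions: each per-level map is modelled as a dictionary from a `(source, target)` pair to a relation name, a contradiction is taken to mean that the same pair carries different relation names in different maps, and every contradictory pair is dropped. The function name `merge_level_maps` and the sample entries are hypothetical; the application does not specify a data format or a resolution rule.

```python
from collections import defaultdict


def merge_level_maps(*level_maps):
    """Merge per-level maps {(source, target): relation}, dropping contradictory pairs."""
    seen = defaultdict(set)
    for level_map in level_maps:
        for pair, relation in level_map.items():
            seen[pair].add(relation)
    # Keep a pair only when every map that mentions it agrees on the relation.
    merged = {pair: next(iter(rels)) for pair, rels in seen.items() if len(rels) == 1}
    rejected = {pair: sorted(rels) for pair, rels in seen.items() if len(rels) > 1}
    return merged, rejected


# Hypothetical per-level maps for a tiny program.
file_map = {("pkg", "a.py"): "include"}
class_map = {("a.py", "A"): "include", ("B", "A"): "inherit"}
function_map = {("A.f", "B.g"): "call", ("B", "A"): "use"}  # disagrees with class_map
statement_map = {("s1", "s2"): "execute"}

merged, rejected = merge_level_maps(file_map, class_map, function_map, statement_map)
```

Returning the rejected pairs alongside the merged graph keeps the conflict visible for later inspection instead of silently discarding it.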
In some embodiments, generating a first logical feature map of a file hierarchy of a program to be characterized from the file hierarchy and an execution logical structure of the file hierarchy includes: obtaining the file position and the file type of the program to be characterized according to a file characteristic analysis method; and generating a first logic characteristic diagram of a file hierarchy of the program to be characterized through a file tree analysis method according to the file position and the file type.
According to the embodiment of the application, the logic characteristics of the file hierarchy are analyzed by using the related method of file analysis, and the logic characteristic diagram of the file hierarchy is obtained. The method lays a foundation for constructing logic feature graphs of other levels while representing the logic features of the file level.
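A minimal sketch of a file-level map, assuming that "file position" is the path relative to the program root and that "file type" can be approximated from the file extension. The table `FILE_TYPES`, the function `file_level_map`, and the relation name `include` are hypothetical choices for illustration; the application does not describe its file feature analysis or file tree analysis in this detail.

```python
import os
import tempfile

# Hypothetical extension-to-type table; the application does not list one.
FILE_TYPES = {".py": "python", ".java": "java", ".go": "go", ".c": "c", ".cpp": "cpp"}


def file_level_map(root: str):
    """Walk a directory tree and return (nodes, edges) for the file hierarchy."""
    nodes, edges = {}, []
    for dirpath, dirnames, filenames in os.walk(root):
        parent = os.path.relpath(dirpath, root)  # file position relative to the root
        nodes.setdefault(parent, "package")
        for d in sorted(dirnames):
            child = os.path.normpath(os.path.join(parent, d))
            nodes[child] = "package"
            edges.append((parent, "include", child))
        for f in sorted(filenames):
            child = os.path.normpath(os.path.join(parent, f))
            nodes[child] = FILE_TYPES.get(os.path.splitext(f)[1], "unknown")
            edges.append((parent, "include", child))
    return nodes, edges


# Build a throwaway tree so the sketch runs without touching real files.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "pkg"))
    open(os.path.join(root, "pkg", "app.py"), "w").close()
    open(os.path.join(root, "Main.java"), "w").close()
    nodes, edges = file_level_map(root)
```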
In some embodiments, the second, third, and fourth logic feature maps are generated by: determining the running mode of the program to be characterized from the file type, where the running modes include a compiled running mode and an interpreted running mode; and generating the second, third, and fourth logic feature maps according to the running mode of the program to be characterized.
According to the embodiment of the application, the running modes of the program to be characterized are obtained according to the file types, so that the logic feature diagrams are built for the programming languages of different running modes in a targeted mode, and the logic features of different levels of the program to be characterized are accurately expressed.
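One way to picture the dispatch on running mode is a lookup from file type to mode, followed by a choice of analysis route. The sketch below is an assumption-laden illustration: the table `RUN_MODE_BY_EXTENSION`, the classification of each extension (for example, treating Java as compiled), and the function names `run_mode` and `choose_analysis` are all hypothetical and not taken from the application.

```python
import os

# Illustrative assumption: which file types count as compiled vs. interpreted.
RUN_MODE_BY_EXTENSION = {
    ".c": "compiled", ".cpp": "compiled", ".go": "compiled", ".java": "compiled",
    ".py": "interpreted", ".js": "interpreted", ".rb": "interpreted",
}


def run_mode(path: str) -> str:
    """Return 'compiled', 'interpreted', or 'unknown' for a source file path."""
    extension = os.path.splitext(path)[1].lower()
    return RUN_MODE_BY_EXTENSION.get(extension, "unknown")


def choose_analysis(path: str) -> str:
    """Pick a (hypothetical) route for building the class, function and statement maps."""
    mode = run_mode(path)
    if mode == "compiled":
        return "compiled-language analysis"
    if mode == "interpreted":
        return "interpreted-language analysis"
    raise ValueError(f"unsupported file type: {path}")
```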
In some embodiments, the method further includes: acquiring a sample characterization program, where the sample characterization program includes a sample data logic structure and a sample execution logic structure, and the sample data logic structure includes a sample file level, a sample class level, a sample function level, and a sample statement level; generating a first sample logic feature map, for the sample file level of the sample characterization program, according to the sample file level and the sample execution logic structure of the sample file level; generating a second sample logic feature map, for the sample class level, according to the sample class level and the sample execution logic structure of the sample class level; generating a third sample logic feature map, for the sample function level, according to the sample function level and the sample execution logic structure of the sample function level; generating a fourth sample logic feature map, for the sample statement level, according to the sample statement level and the sample execution logic structure of the sample statement level; splicing the first, second, third, and fourth sample logic feature maps into a sample logic feature characterization graph; and generating the program characterization model according to the sample logic feature characterization graph.
According to the embodiment of the application, the program characterization model is built by using the sample characterization program, so that the accuracy and the universality of the program characterization model are improved.
In a second aspect, embodiments of the present application provide a program feature representation apparatus, the apparatus including: a feature extraction module, configured to perform feature extraction on a program to be characterized through a program characterization model to obtain a logic feature characterization graph, where the nodes of the logic feature characterization graph represent the data logic structure of the program to be characterized and the edges represent its execution logic structure; and a representation module, configured to analyze the logic feature characterization graph through the program characterization model and output a program feature representation result for the program to be characterized. The program characterization model is generated in advance: feature extraction is performed on sample characterization features to generate a sample logic feature map, and the model is generated based on that sample logic feature map.
According to the embodiment of the application, corresponding method steps are executed by utilizing different modules, so that the generated logic characteristic characterization graph reveals similar logic essence implied by different programming languages, and migration comparison can be directly carried out on the different programming languages, so that theoretical support is provided for downstream tasks such as logic vulnerability cause analysis, migration mining and the like.
In a third aspect, an embodiment of the present application provides an electronic device, including: the device comprises a processor, a memory, a storage medium and a bus, wherein the processor and the memory are communicated with each other through the bus; the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method steps of the first aspect.
In a fourth aspect, embodiments of the present application provide a non-transitory computer readable storage medium comprising: the computer-readable storage medium stores computer instructions that cause the computer to perform the method steps of the first aspect.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for representing program features according to an embodiment of the present application;
FIG. 2 is a schematic diagram of analysis and characterization of a program to be characterized according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a data logic structure according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of an execution logic structure according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a relationship of a four-level hierarchy according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of an exception handling method in the stitching process of the logic feature diagrams of various levels according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a process for generating a logic feature map of each level according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a method for generating a logic feature map of a compiled programming language according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a method for generating a logic feature map of an interpreted programming language according to an embodiment of the present application;
FIG. 10 is a schematic diagram of an abnormal operation path identifying and extracting method for a programming language with an exception handling mechanism according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a device for representing program features according to an embodiment of the present application;
FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the drawings are designed solely for the purposes of illustration and description and are not intended to limit the scope of the application. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this application, illustrates operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to the flow diagrams and one or more operations may be removed from the flow diagrams as directed by those skilled in the art.
In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
It is noted that all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions.
In the description of the embodiments of the present application, the technical terms "first," "second," etc. are used merely to distinguish between different objects and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated, a particular order or a primary or secondary relationship. In the description of the embodiments of the present application, the meaning of "plurality" is two or more unless explicitly defined otherwise.
In the description of the embodiments of the present application, the term "and/or" is merely an association relationship describing an association object, which means that three relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
It can be understood that the method for representing program features provided in the embodiments of the present application may be applied to a terminal device (may also be referred to as an electronic device) and a server; the terminal equipment can be a smart phone, a tablet personal computer, a personal digital assistant (Personal Digital Assistant, PDA) and the like; the server may be an application server or a Web server.
In order to facilitate understanding of the technical solution provided in the embodiments of the present application, an application scenario of a method for representing program features provided in the embodiments of the present application is described below by taking a server as an execution body as an example.
FIG. 1 is a flowchart of a method for representing program features according to an embodiment of the present application. As shown in FIG. 1, the method includes:
Step 101, the server performs feature extraction on the program to be characterized through the program characterization model to obtain a logic feature characterization graph.
The nodes of the logic characteristic representation graph represent the data logic structure of the program to be represented, and the edges of the logic characteristic representation graph represent the execution logic structure of the program to be represented.
In a specific implementation, the server acquires the program to be characterized before extracting features. The program to be characterized is a program whose logic features need to be analyzed and which is to be characterized using the program characterization model. It may be written in a common programming language such as Java, Python, Go, C, or C++. It may be a single executable program file, or a program group formed by multiple executable files that run in cooperation with one another. The program to be characterized may be input by a user or read from a storage medium, such as a hard disk, a USB flash drive, or a virtual storage medium. In practice, the form, acquisition mode, and language of the program to be characterized can be chosen according to the actual situation, and the present application does not specifically limit them.
When performing feature extraction on the program to be characterized, the server extracts the program's data logic structure and execution logic structure, treats each data logic structure as a node and each execution logic structure as an edge, and determines how nodes and edges are connected according to the execution relationships among the data logic structures, thereby generating the logic feature characterization graph of the program. The data logic structure refers to the data types contained in the program to be characterized and the actual storage mode corresponding to each data type, where the actual storage mode is the actual storage allocation scheme that corresponds to a given data type name. The execution logic structure refers to the levels within the program to be characterized and the manner of execution between those levels.
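As a minimal sketch of such a graph, assuming a simple in-memory representation, data logic structures can be stored as nodes and execution relations as labelled edges. The class `LogicGraph`, its methods, and the sample entity names are hypothetical and serve only to illustrate the node and edge roles described above.

```python
from dataclasses import dataclass, field


@dataclass
class LogicGraph:
    """Hypothetical container: nodes are data logic structures, edges are execution relations."""
    nodes: dict = field(default_factory=dict)  # name -> level ("file", "class", ...)
    edges: set = field(default_factory=set)    # (source, relation, target)

    def add_node(self, name: str, level: str) -> None:
        self.nodes[name] = level

    def add_edge(self, source: str, relation: str, target: str) -> None:
        # An edge is only meaningful between two known data logic structures.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.add((source, relation, target))


graph = LogicGraph()
graph.add_node("main.py", "file")
graph.add_node("Parser", "class")
graph.add_node("Parser.parse", "function")
graph.add_edge("main.py", "contains", "Parser")
graph.add_edge("Parser", "contains", "Parser.parse")
```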
Step 102, the server analyzes the logic characteristic characterization graph through the program characterization model and outputs a program characteristic representation result of the program to be characterized.
The program characterization model is generated in advance: feature extraction is performed on sample characterization features to generate a sample logic feature map, and the model is generated based on that sample logic feature map.
In the specific implementation process, after obtaining the logic characteristic characterization diagram of the program to be characterized, the server analyzes the logic characteristic characterization diagram through the program characterization model, so that the program characteristics of the program to be characterized are represented.
According to the technical scheme, the logic characteristic characterization diagram of the program to be characterized is analyzed and represented by using the program characterization model, so that the logic characteristic characterization diagram can be generated by extracting the data logic structure and the execution logic structure of the program written in different programming languages. The logic characteristic characterization graph reveals similar logic essence implied by different programming languages, so that migration comparison can be directly carried out on the different programming languages, and theoretical support is provided for downstream tasks such as logic vulnerability cause analysis, migration mining and the like.
In some embodiments, the server performs feature extraction on a program to be characterized through a program characterization model to obtain a logic feature characterization graph, including: the server extracts a data logic structure and an execution logic structure of the program to be characterized through the program characterization model, and generates a logic characteristic characterization graph according to the data logic structure and the execution logic structure.
In a specific implementation process, when the server extracts the data logic structure of the program to be characterized, the server extracts the minimum data logic unit of the program to be characterized. For example, the extracted data logic structure may be a file to which the program to be characterized belongs, a class, a function, a statement, etc. included in the program to be characterized. Similarly, when the execution logic structures are extracted, the execution relationship between the respective data logic structures is extracted. For example, the execution relationship may be call, include, inherit, etc. And finally, generating a logic characteristic characterization graph based on the extracted data logic units and the execution relationship. In the feature extraction, the extracted data logic structure or the execution logic structure may be defined according to the actual situation, which is not particularly limited in the present application.
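To make the extraction step concrete for one language, the sketch below uses Python's standard `ast` module to pull classes and functions (data logic units) and inherit, include, and call relations (execution relationships) out of a small source string. The application does not say which parser or tooling it uses, so this is a stand-in technique; the function names `extract` and `_calls` and the sample source are hypothetical, and only direct calls to plainly named functions are detected.

```python
import ast

SOURCE = '''
class Base:
    def run(self):
        return helper()

class Child(Base):
    pass

def helper():
    return 1
'''


def extract(source: str):
    """Return (units, relations) for one Python source string."""
    tree = ast.parse(source)
    units, relations = [], []
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            units.append(("class", node.name))
            for base in node.bases:
                if isinstance(base, ast.Name):
                    relations.append((node.name, "inherit", base.id))
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    qualified = f"{node.name}.{item.name}"
                    units.append(("function", qualified))
                    relations.append((node.name, "include", qualified))
                    relations.extend(_calls(qualified, item))
        elif isinstance(node, ast.FunctionDef):
            units.append(("function", node.name))
            relations.extend(_calls(node.name, node))
    return units, relations


def _calls(caller: str, func: ast.FunctionDef):
    """Collect direct calls to plainly named functions inside one function body."""
    return [
        (caller, "call", n.func.id)
        for n in ast.walk(func)
        if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
    ]


units, relations = extract(SOURCE)
```

A language-independent pipeline would need an equivalent front end per language; this example covers only the Python case.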
According to the embodiment of the application, the logic characteristic characterization graph is generated according to the extracted data logic structure and execution logic structure of the program to be characterized, so that the logic characteristic characterization graph can reflect the logic structure characteristics of the program to be characterized in multiple aspects.
In some embodiments, the data logic structure includes a file level, a class level, a function level, and a statement level, and the server generating the logic feature characterization graph according to the data logic structure and the execution logic structure includes: the server generates a first logic feature map, for the file level of the program to be characterized, according to the file level and the execution logic structure of the file level; the server generates a second logic feature map, for the class level, according to the class level and the execution logic structure of the class level; the server generates a third logic feature map, for the function level, according to the function level and the execution logic structure of the function level; the server generates a fourth logic feature map, for the statement level, according to the statement level and the execution logic structure of the statement level; and the server generates the logic feature characterization graph according to the first, second, third, and fourth logic feature maps.
FIG. 2 is a schematic diagram of the analysis and characterization of a program to be characterized according to an embodiment of the present application. As shown in FIG. 2, the horizontal axis represents the analysis levels of the program to be characterized and the vertical axis represents its characterization levels. For analysis, the program is divided into a file level, a class level, a function level, and a statement level. For characterization, the program's data logic structure, its execution logic structure, and the ways these are arranged and combined are characterized.
In a specific implementation, when characterizing the data logic structure of the program to be characterized, the server extracts the data organization of each level and uses it to characterize the data logic structure. The data organization of the file level includes code packages, code files, and the like; that of the class level includes code classes, classless code blocks, and the like; that of the function level includes functions, classless function code blocks, and the like; and that of the statement level includes statement types, variable declaration manners, variable declaration types, and the like. Any data organization at any level can be represented by a node in the logic feature characterization graph. The data organization types of each level can be selected according to the actual situation, and the present application does not specifically limit them.
It should be noted that a "classless code block" is a free code block that does not belong to any class; it covers code that is not organized under any class structure. A "classless function" is a function that does not belong to any class but has complete function characteristics. A "classless function code block" is a set of code instructions that neither belongs to any class nor has complete function characteristics, but that can logically be regarded as a function because the instructions are densely arranged in the physical organization of the code. A function may also be referred to as a method.
For ease of understanding, FIG. 3 is a schematic diagram of a data logic structure provided in an embodiment of the present application. As shown in FIG. 3, taking a class (for example, class_a) as the data logic structure, the logic structure includes three parts:
(1) Entity name. It is generally given by the code and is globally unique.
(2) Category label. It is determined by the nature of the entity; here the entity is a class implementation, so it carries the "Class" label. Each entity has one and only one category label.
(3) Attributes. All properties that may relate to an entity, other than its category, are recorded as attributes. Two attributes of class_a are shown here: the external visibility Public, which is True because the code marks the class as public, and the internal text content Body, shown here in simplified form as #classf.
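The three-part node structure described above can be sketched as a small data class. This is only an illustrative sketch: the class name `LogicNode` and its field names are assumptions made for this example and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class LogicNode:
    """One graph node for a data logic structure (hypothetical name)."""
    name: str    # (1) entity name, assumed to be globally unique
    label: str   # (2) exactly one category label, e.g. "Class"
    # (3) every other property of the entity is stored as an attribute
    attributes: Dict[str, Any] = field(default_factory=dict)


# The class_a example of Fig. 3: public visibility and a simplified body.
class_a = LogicNode(
    name="class_a",
    label="Class",
    attributes={"Public": True, "Body": "#classf"},
)
```

Keeping the category label as a dedicated field, rather than as one more attribute, mirrors the requirement that each entity has one and only one category label.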
When characterizing the execution logic structure of the program to be characterized, the server extracts the execution logic features of each level: the execution logic features of the file level include inclusion, application, and the like; those of the class level include inheritance, implementation, use, and the like; those of the function level include calling, inheritance, overriding, and the like; those of the statement level include execution, jump, and the like. Any execution logic feature of any level can be represented by an edge in the logic feature characterization graph, that is, a directed or undirected edge from one graph node representing a data logic structure to another such graph node represents an execution logic relationship. The name of the relationship is given by the name of the edge.
For ease of understanding, fig. 4 is a schematic diagram of an execution logic structure provided in an embodiment of the present application. Fig. 4 takes as an example the execution logic in which one class temporarily declares another class and invokes a logic operation of that class.
In this example, the operating subject class_a and the operation recipient class_b have already been identified, according to the representation of the data logic structure described above, as two graph nodes named class_a and class_b. The temporary declaration command is marked as a directed edge from the class_a node to the class_b node, with the relationship label "< Include >" (meaning "contain").
It should be noted that a directed or undirected edge may point from a graph node representing a data logic structure back to the same node. However, two or more directed or undirected edges that have the same edge name and point from the same source graph node to the same target graph node may not be created; this is the "edge non-repeating rule". If an operation would violate this rule, only the edge created later is retained, and the remaining duplicate edges are deleted from the graph. Features other than the "type" feature can be marked on an edge instance as the characterization requires. In other words, only the use of the edge name is constrained; the remaining edge attributes can be defined freely according to the actual acquisition and analysis requirements and are not limited by this representation method.
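A minimal sketch of the edge non-repeating rule follows, assuming directed edges keyed by the triple (source, target, edge name). The class `LogicGraph` and its methods are hypothetical names introduced only for illustration.

```python
from typing import Any, Dict, Tuple

EdgeKey = Tuple[str, str, str]  # (source node name, target node name, edge name)


class LogicGraph:
    """Tiny directed graph that enforces the edge non-repeating rule (sketch)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}
        self.edges: Dict[EdgeKey, Dict[str, Any]] = {}

    def add_node(self, name: str, label: str, **attributes: Any) -> None:
        self.nodes[name] = {"label": label, **attributes}

    def add_edge(self, src: str, dst: str, name: str, **attributes: Any) -> None:
        # Self-loops (src == dst) are allowed. Storing edges under the key
        # (src, dst, name) means a repeated edge overwrites the earlier one,
        # so only the edge created later survives.
        self.edges[(src, dst, name)] = dict(attributes)


g = LogicGraph()
g.add_node("class_a", "Class")
g.add_node("class_b", "Class")
g.add_edge("class_a", "class_b", "Include", line=3)
g.add_edge("class_a", "class_b", "Include", line=9)  # duplicate: later edge kept
g.add_edge("class_a", "class_a", "Use")              # self-loop is permitted
```

For undirected edges, the same idea would need a key that ignores endpoint order, for example a sorted pair of node names; that variant is not shown here.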
Through the obtained data logic structure and execution logic structure of each level, a logic feature map corresponding to each level can be generated, and the orchestration combination mode shown in fig. 2 includes a file level logic feature map, a class level logic feature map, a function level logic feature map and a statement level logic feature map.
With this method, any program can be divided into the four-level structure "file - class - function - statement". Fig. 5 is a schematic diagram of the relationship between the four levels provided in an embodiment of the present application. As shown in fig. 5, on each level a logic feature map of that level is constructed independently from the data logic structure and execution logic structure elements contained in that level, according to the organization relationships that actually exist in the code. In addition, the logic feature maps of adjacent levels are spliced into a multi-level logic feature characterization graph: the file level and the class level are spliced at file nodes; the class level and the function level are spliced at class nodes; the function level and the statement level are spliced at function nodes. The specific splicing process is as follows: for logic feature maps of adjacent levels, node pairs with the same name are found by a graph searching method; the two nodes of each same-name pair are merged, and all relationship edges connected to either node are retained. The logic feature characterization graph of the program to be characterized is finally obtained. It should be noted that each newly generated merged node must still satisfy the aforementioned "edge non-repeating rule".
The specific implementation process of the graph searching method is as follows:
Step 1, for the logic feature maps of adjacent levels, creating for each map a list array containing the names of all nodes in that map, and dividing the arrays into two stacks according to the logic feature level.
Step 2, using an array matching algorithm to search the array pairs whose two members belong to different stacks for node name pairs with the same name (that is, referring to the same logic element).
Step 3, for each node name pair found in step 2, returning to the original logic feature maps represented by the arrays, locating the nodes referred to by the names through graph search, and combining them two by two into node pairs. These node pairs are the required "node pairs having the same name".
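The three search steps and the subsequent merge can be sketched as follows. The sketch makes several assumptions: each logic feature map is a plain dictionary with a `nodes` mapping and an `edges` set, name matching is done with a set intersection instead of a specific array matching algorithm, and the function names are hypothetical.

```python
from typing import Any, Dict, Set, Tuple

Edge = Tuple[str, str, str]  # (source name, target name, edge name)
Graph = Dict[str, Any]       # {"nodes": {name: attributes}, "edges": set of Edge}


def find_same_name_nodes(upper: Graph, lower: Graph) -> Set[str]:
    """Steps 1-2: list the node names of each map and match equal names."""
    upper_names = list(upper["nodes"])
    lower_names = list(lower["nodes"])
    return set(upper_names) & set(lower_names)


def splice(upper: Graph, lower: Graph) -> Graph:
    """Step 3 plus merging: fuse same-name nodes and keep all their edges."""
    nodes = dict(upper["nodes"])
    for name, attrs in lower["nodes"].items():
        # A shared name yields one merged node carrying both attribute sets.
        nodes[name] = {**nodes.get(name, {}), **attrs}
    # A set of (source, target, name) triples cannot hold duplicates, so the
    # merged graph still satisfies the edge non-repeating rule.
    edges = set(upper["edges"]) | set(lower["edges"])
    return {"nodes": nodes, "edges": edges}


file_level = {
    "nodes": {"pkg": {"label": "Package"}, "a.py": {"label": "File"}},
    "edges": {("pkg", "a.py", "Include")},
}
class_level = {
    "nodes": {"a.py": {"language": "Python"}, "class_a": {"label": "Class"}},
    "edges": {("a.py", "class_a", "Include")},
}
merged = splice(file_level, class_level)
```

In this example the file node `a.py` appears in both maps, so it becomes the single splice point between the file level and the class level.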
According to the embodiment of the application, the data logic structure is subdivided into multiple levels, a logic feature map is generated for each level, and the logic feature characterization graph is generated from the logic feature maps of all levels, so that the graph shows each level of the program to be characterized as well as the logic structure features between levels, and presents the internal logic characteristics of the program more intuitively. In addition, different programming languages are unified into language-independent abstract logic representations at the file level, the class level, the function level and the statement level respectively.
In some embodiments, the server generating the logic feature characterization graph according to the first logic feature map, the second logic feature map, the third logic feature map and the fourth logic feature map includes: the server removes contradictory logic features among the first logic feature map, the second logic feature map, the third logic feature map and the fourth logic feature map to obtain the logic feature characterization graph of the program to be characterized; a contradictory logic feature is the same logic feature being expressed differently in the first logic feature map, the second logic feature map, the third logic feature map and the fourth logic feature map.
In a specific implementation, after obtaining the first, second, third and fourth logic feature maps, the server needs to splice the four levels of logic feature maps into the logic feature characterization graph. During splicing, contradictory logic features may exist between different levels and cause the splicing to fail. That is, the splicing operation may violate the "edge non-repeating rule", so an exception handling method must be executed to try to repair the violation and thereby remove these contradictory logic features. Fig. 6 is a schematic diagram of an exception handling method in the splicing process of the logic feature maps of each level according to an embodiment of the present application. As shown in fig. 6, the method includes:
Step 601, taking the lower-level logic feature map as the reference map, discarding the parts of the other maps that contradict the features of the reference map, and trying to splice again. For exception handling, the four levels of the data logic structure are ordered from high to low as the file level, the class level, the function level and the statement level.
Step 602, taking the logic feature map whose final generation time is later as the reference map, discarding the parts of the other maps that contradict the features of the reference map, and trying to splice again.
Step 603, randomly selecting one map in the logic feature map set as the reference map, discarding the parts of the other maps that contradict the features of the reference map, and trying to splice again.
And finally, splicing the logic characteristic diagram after exception processing into a logic characteristic characterization diagram.
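The choice of a reference map in steps 601 to 603 can be sketched as below. This is a rough sketch under strong assumptions that the application does not state: each map records its level and a generation timestamp, and a "contradiction" is taken to mean that the same node name carries a different category label in the two maps. All names are hypothetical.

```python
import random
from typing import Any, Dict, List

# Levels ordered from high to low, as stated for exception handling.
LEVEL_RANK = {"file": 0, "class": 1, "function": 2, "statement": 3}


def pick_reference(graphs: List[Dict[str, Any]], step: int) -> Dict[str, Any]:
    """Select the reference map for the given exception handling step."""
    if step == 601:   # the lowest-level map wins
        return max(graphs, key=lambda g: LEVEL_RANK[g["level"]])
    if step == 602:   # the most recently generated map wins
        return max(graphs, key=lambda g: g["created_at"])
    return random.choice(graphs)  # step 603: random reference


def drop_contradictions(reference: Dict[str, Any], other: Dict[str, Any]) -> Dict[str, Any]:
    """Remove from `other` the nodes whose label disagrees with the reference."""
    ref_nodes = reference["nodes"]
    bad = {n for n, label in other["nodes"].items()
           if n in ref_nodes and ref_nodes[n] != label}
    nodes = {n: label for n, label in other["nodes"].items() if n not in bad}
    # Edges touching a discarded node are discarded with it.
    edges = {e for e in other["edges"] if e[0] not in bad and e[1] not in bad}
    return {**other, "nodes": nodes, "edges": edges}


file_map = {"level": "file", "created_at": 2.0,
            "nodes": {"a.py": "File", "x": "Class"},
            "edges": {("a.py", "x", "Include")}}
class_map = {"level": "class", "created_at": 1.0,
             "nodes": {"x": "Function"}, "edges": set()}

reference = pick_reference([file_map, class_map], 601)
cleaned = drop_contradictions(reference, file_map)
```

After the conflicting parts are dropped, the cleaned maps would be passed to the splicing procedure again.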
According to the embodiment of the application, the contradictory logic features in each level are removed, so that the splicing process of the logic feature diagrams of each level can be smoothly carried out, and finally, the spliced logic feature characterization diagrams represent correct internal logic in the program to be characterized.
In some embodiments, the server generates a first logical feature map of a file hierarchy of a program to be characterized according to the file hierarchy and an execution logical structure of the file hierarchy, including: the server obtains the file position and the file type of the program to be characterized according to the file characteristic analysis method, and generates a first logic characteristic diagram of the file hierarchy of the program to be characterized according to the file position and the file type through the file tree analysis method.
In a specific implementation, the "file feature analysis method" refers to using publicly available tools provided by the operating system to obtain features of the program file package to be analyzed that resemble those of programming language files, such as external naming formats and internal text formats, so as to infer the locations of the program files it contains and the language types of the code used. The file feature analysis method may be a feature analysis method based on file statistical information, specifically including a location method based on file extensions, a suspicious code file screening method based on file size, and the like. It may also be a feature analysis method based on file content information, specifically including a regular-expression text matching method based on specific code formats, a natural language processing (NLP) recognition method based on randomly sampled segments of code text, and the like. The file feature analysis method can be selected according to actual conditions, which is not specifically limited in this application.
The "file tree analysis method" refers to analyzing the program file package with publicly available tools provided by the operating system to obtain the organization structure information of each folder and file in the package, and then generating the file organization structure from that information. The file organization structure generated by the file tree analysis method is a tree that takes the root directory as its root node and has directed containment edges between directories and files, so it can serve as the basic skeleton of the first logic feature map of the file level of the program to be characterized. The file tree analysis method may be a file hash analysis method, a chained file analysis method, or the like, and can be selected according to actual conditions, which is not specifically limited in this application.
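As an illustration of the two methods, the sketch below combines an extension-based file feature analysis with a directory walk that builds the containment tree. The extension table, the labels "Directory" and "File", the edge name "Include" and the function name are assumptions made for this example; `os.walk` stands in for whichever operating system tool is actually used.

```python
import os
from typing import Dict, List, Tuple

# Assumed extension-to-language table; extend as needed.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python", ".java": "Java", ".c": "C",
    ".cpp": "C++", ".lua": "Lua", ".php": "PHP",
}


def build_file_level_graph(root: str):
    """Walk `root` and return (nodes, edges) of a file-level containment tree."""
    nodes: Dict[str, Dict[str, str]] = {root: {"label": "Directory"}}
    edges: List[Tuple[str, str, str]] = []
    for dirpath, dirnames, filenames in os.walk(root):
        for d in dirnames:
            child = os.path.join(dirpath, d)
            nodes[child] = {"label": "Directory"}
            edges.append((dirpath, child, "Include"))
        for f in filenames:
            ext = os.path.splitext(f)[1].lower()
            language = EXTENSION_TO_LANGUAGE.get(ext)
            if language is None:
                continue  # not recognised as a code file by the extension check
            child = os.path.join(dirpath, f)
            nodes[child] = {"label": "File", "language": language}
            edges.append((dirpath, child, "Include"))
    return nodes, edges
```

A content-based check, such as regular-expression matching on the file text, could replace or complement the extension lookup when extensions are missing or misleading.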
The file position is the location where the program to be characterized is stored, and the file type is the type of programming language used by the program to be characterized. Before generating the first logic feature map of the file level, the server needs to determine, based on a preset analysis range, which file parts of the program to be characterized are to be analyzed, that is, the analysis depth. The preset analysis range may be, for example, one or several sub-program function packages in a program file package, one or several sub-program files in a program file package, or one or several specific functions in a program file; it can be determined according to actual conditions, which is not specifically limited in this application.
Fig. 7 is a schematic diagram of a generation process of a logic feature map of each level according to an embodiment of the present application, where, as shown in fig. 7, the generation process includes the following steps:
Step 701, after obtaining the file analysis depth, analyzing the file position and the file type of the program to be characterized through the file feature analysis method.
Step 702, generating the first logic feature map of the file level through the file tree analysis method according to the file position and the file type.
Step 703, classifying the program to be characterized according to the file type to obtain its running mode. The running modes include a compiled running mode and an interpreted running mode.
Step 704, based on the running mode, executing the corresponding logic feature map generation method to generate the logic feature maps of the class level, the function level and the statement level of the program to be characterized.
If the running mode of the program to be characterized is the compiled running mode, step 7041 is executed, and the logic feature maps of the class level, the function level and the statement level are generated with the generation method for compiled programming languages; if the running mode is the interpreted running mode, step 7042 is executed, and the logic feature maps of the class level, the function level and the statement level are generated with the generation method for interpreted programming languages.
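The branch in steps 703 and 704 amounts to a dispatch on the running mode. The sketch below assumes a simple language-to-mode table based on the languages named in this description; the two generator functions are placeholders that only return a description string, and all names are hypothetical.

```python
from typing import Callable, Dict

# Assumed mapping from file type (language) to running mode.
RUN_MODE: Dict[str, str] = {
    "Java": "compiled", "C": "compiled", "C++": "compiled",
    "Python": "interpreted", "Lua": "interpreted", "PHP": "interpreted",
}


def generate_from_ir(path: str) -> str:
    """Placeholder for step 7041 (intermediate-representation based)."""
    return f"IR-based maps for {path}"


def generate_from_ast(path: str) -> str:
    """Placeholder for step 7042 (abstract-syntax-tree based)."""
    return f"AST-based maps for {path}"


GENERATORS: Dict[str, Callable[[str], str]] = {
    "compiled": generate_from_ir,
    "interpreted": generate_from_ast,
}


def generate_lower_level_maps(path: str, language: str) -> str:
    """Steps 703-704: classify by file type, then run the matching generator."""
    mode = RUN_MODE[language]
    return GENERATORS[mode](path)
```

For example, `generate_lower_level_maps("a.py", "Python")` selects the AST-based branch, while a Java or C/C++ file selects the IR-based branch.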
It should be noted that, the logic feature map of each level constructed should conform to the specification of the logic feature map of each level in the program characterization model described in the above embodiment.
According to the embodiment of the application, the logic characteristics of the file hierarchy are analyzed by using the related method of file analysis, and the logic characteristic diagram of the file hierarchy is obtained. The method lays a foundation for constructing logic feature graphs of other levels while representing the logic features of the file level.
In some embodiments, the second logic feature map, the third logic feature map, and the fourth logic feature map are generated by: the server obtains the running mode of the program to be characterized according to the file type; the operation modes comprise a compiling operation mode and an interpretation operation mode; and the server generates a second logic characteristic diagram, a third logic characteristic diagram and a fourth logic characteristic diagram according to the running mode of the program to be characterized.
In a specific implementation process, a second logic feature map, a third logic feature map and a fourth logic feature map of the program to be characterized are generated according to step 703 and step 704 shown in fig. 7.
Fig. 8 is a schematic diagram of a method for generating a logic feature map of a compiled programming language according to an embodiment of the present application, and as shown in fig. 8, a specific implementation process of step 7041 includes the following steps:
Step 801, compiling the source code, or using an intermediate representation compilation tool, to obtain an intermediate representation (Intermediate Representation, IR) of the target program file, where the intermediate representation may be LLVM IR or a form similar to LLVM IR. The intermediate representation compilation tool may be the boot tool for the Java language, the LLVM compiler for C/C++, or the like. It should be understood that, for each language, the compilation tool corresponding to that language should be used.
Step 802, based on a preset analysis requirement range, constructing the logic feature maps from the relevant data in the intermediate representation in top-down level order. It should be noted that the top-down level order is the class level, the function level and the statement level.
Step 803, repeating step 802 until the preset analysis granularity range is reached or the intermediate representation obtained in step 801 cannot be analyzed further.
Fig. 9 is a schematic diagram of a method for generating a logic feature map for an interpreted programming language according to an embodiment of the present application. As shown in fig. 9, a specific implementation of step 7042 includes the following steps:
Step 901, obtaining an abstract syntax tree (AST) representation of the target program file based on an interpreter or lexical analyzer of the programming language. It should be noted that the interpreter may be the Jython interpreter for Python or the official interpreter for Lua, and the lexical analyzer may be the AST module of Python, the opcode-related interpretation module of PHP, or the like. It should be understood that, for each language, the interpreter or lexical analyzer corresponding to that language should be used.
Step 902, based on a preset analysis requirement range, constructing the logic feature maps from the relevant data in the AST representation in top-down level order. For the meaning of the top-down order, refer to the above embodiment; it is not repeated here.
Step 903, repeating step 902 until the preset analysis granularity range is reached or the AST representation obtained in step 901 cannot be analyzed further.
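For the Python case, steps 901 and 902 can be illustrated with the standard `ast` module. The sketch walks one source string top-down through the class, function and statement levels. The node naming scheme, the edge names "Include" and "Execute", and the function name are assumptions made for this example, and only sequential statements inside methods are handled.

```python
import ast
from typing import List, Tuple

SOURCE = '''
class Greeter:
    def greet(self, name):
        message = "hi " + name
        return message
'''


def build_graph(source: str):
    """Return (nodes, edges) for the class, function and statement levels."""
    tree = ast.parse(source)                 # step 901: obtain the AST
    nodes: List[Tuple[str, str]] = []        # (node name, label)
    edges: List[Tuple[str, str, str]] = []   # (source, target, edge name)
    for cls in [n for n in tree.body if isinstance(n, ast.ClassDef)]:
        nodes.append((cls.name, "Class"))
        for fn in [n for n in cls.body if isinstance(n, ast.FunctionDef)]:
            fn_name = f"{cls.name}.{fn.name}"
            nodes.append((fn_name, "Function"))
            edges.append((cls.name, fn_name, "Include"))
            previous = fn_name
            for stmt in fn.body:
                # Statement nodes are named by line and labelled by AST type.
                stmt_name = f"{fn_name}:{stmt.lineno}"
                nodes.append((stmt_name, type(stmt).__name__))
                edges.append((previous, stmt_name, "Execute"))
                previous = stmt_name
    return nodes, edges


nodes, edges = build_graph(SOURCE)
```

Branches and loops would need additional "jump" edges between statement nodes; they are left out to keep the sketch short.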
It should be noted that, when constructing the edges and nodes of the logic feature maps in step 802 and step 902, the described execution process of the program should cover both the execution logic of the code in the normal running state and the execution logic of the code in the exception-handling running state. Fig. 10 is a schematic diagram of a method for identifying and extracting exception running paths for a programming language with an exception handling mechanism according to an embodiment of the present application. As shown in fig. 10, the method includes the following steps:
Step 1001, constructing an execution logic representation graph for the normal running state based on the execution logic of the code in the normal running state in the IR or AST representation.
Step 1002, traversing from the run start node defined by each function, and searching for the set of nodes that contain an exception-throwing command.
Step 1003, for each exception-throwing command node in the node set, adding an exception execution path edge from that command node to the exception execution exit of the function execution block.
Step 1004, if the exception execution exit is the defined exit of the function, adding exception execution path edges from the defined exit node to all function nodes that call the function.
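Steps 1002 to 1004 can be illustrated for Python source with the standard `ast` module. The sketch below is simplified and rests on assumptions not stated in the application: every `raise` is treated as leaving through the function's defined exit (enclosing `try` blocks are ignored), only direct calls by plain function name are recognised, and the node names and the edge name "Exception" are hypothetical.

```python
import ast
from typing import Dict, List, Set, Tuple

SOURCE = '''
def parse(text):
    if not text:
        raise ValueError("empty")
    return text.strip()

def main():
    return parse(" x ")
'''


def exception_edges(source: str) -> List[Tuple[str, str, str]]:
    tree = ast.parse(source)
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    # Record which functions call which (direct calls by plain name only).
    callers: Dict[str, Set[str]] = {name: set() for name in functions}
    for name, fn in functions.items():
        for node in ast.walk(fn):
            if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                    and node.func.id in functions):
                callers[node.func.id].add(name)
    edges: List[Tuple[str, str, str]] = []
    for name, fn in functions.items():
        # Step 1002: collect the exception-throwing command nodes.
        raises = [n for n in ast.walk(fn) if isinstance(n, ast.Raise)]
        for r in raises:
            # Step 1003: edge from the throwing node to the function's exit.
            edges.append((f"{name}:{r.lineno}", f"{name}:exit", "Exception"))
        if raises:
            # Step 1004: edge from the exit to every function that calls it.
            for caller in sorted(callers[name]):
                edges.append((f"{name}:exit", caller, "Exception"))
    return edges


edges = exception_edges(SOURCE)
```

A fuller version would first check whether a `raise` is enclosed by a handler in the same function, in which case the exception exit would be the handler rather than the function's defined exit.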
According to the embodiment of the application, the running modes of the program to be characterized are obtained according to the file types, so that the logic feature diagrams are built for the programming languages of different running modes in a targeted mode, and the logic features of different levels of the program to be characterized are accurately expressed.
In some embodiments, the method further comprises: acquiring a sample characterization program; the sample characterization program comprises a sample data logic structure and a sample execution logic structure; the sample data logic structure comprises a sample file level, a sample class level, a sample function level and a sample statement level; generating a first sample logic feature map of a sample file hierarchy of a sample characterization program according to the sample file hierarchy and a sample execution logic structure of the sample file hierarchy; generating a second sample logic feature map of the sample class hierarchy of the sample characterization program according to the sample class hierarchy and the sample execution logic structure of the sample class hierarchy; generating a third sample logic feature map of the sample function level of the sample characterization program according to the sample function level and the sample execution logic structure of the sample function level; generating a fourth sample logic feature diagram of the sample sentence level of the sample characterization program according to the sample sentence level and the sample execution logic structure of the sample sentence level; the first sample logic feature diagram, the second sample logic feature diagram, the third sample logic feature diagram and the fourth sample logic feature diagram are spliced into a sample logic feature characterization diagram; and generating the program characterization model according to the sample logic characteristic characterization graph.
In a specific implementation, for the processes of obtaining the sample characterization program and generating the sample logic feature map of each level, refer to the above embodiments; they are not described again here.
According to the embodiment of the application, the program characterization model is built by using the sample characterization program, so that the accuracy and the universality of the program characterization model are improved.
In summary, the program feature representation method provided in the embodiments of the present application overcomes the inherent drawback of conventional program feature representations based on a specific machine code instruction set, namely that they are tightly coupled to the features of a specific code language. Meanwhile, compared with conventional graph-structured code characterization models, the program characterization model provided in the embodiments of the present application has a stronger static characterization capability for exception handling; compared with conventional static full-code analysis methods, the feature map generation method provided in the embodiments of the present application can characterize more specific code structures with the same computing power, so that security personnel can study and mine the deep logic of logic vulnerabilities through static code logic analysis.
Fig. 11 is a schematic structural diagram of a program feature representation device according to an embodiment of the present application. As shown in fig. 11, the device includes a feature extraction module 1101 and a representation module 1102.
The feature extraction module 1101 is configured to perform feature extraction on a program to be characterized through a program characterization model to obtain a logic feature characterization graph; the nodes of the logic feature characterization graph represent the data logic structure of the program to be characterized, and the edges of the logic feature characterization graph represent the execution logic structure of the program to be characterized. The representation module 1102 is configured to analyze the logic feature characterization graph through the program characterization model and output a program feature representation result of the program to be characterized. The program characterization model is generated in advance by performing feature extraction on sample characterization features to generate a sample logic feature map, and then generating the model based on the sample logic feature map.
On the basis of the above embodiment, the feature extraction module 1101 is specifically configured to: extracting a data logic structure and an execution logic structure of a program to be characterized through a program characterization model; and generating a logic characteristic characterization graph according to the data logic structure and the execution logic structure.
On the basis of the above embodiment, the feature extraction module 1101 is specifically configured to: the data logic structure comprises a file level, a class level, a function level and a statement level; generating a logic characteristic representation graph according to the data logic structure and the execution logic structure, wherein the logic characteristic representation graph comprises: generating a first logic feature map of a file hierarchy of a program to be characterized according to the file hierarchy and an execution logic structure of the file hierarchy; generating a second logic feature diagram of the class hierarchy of the program to be characterized according to the class hierarchy and the execution logic structure of the class hierarchy; generating a third logic feature diagram of the function level of the program to be characterized according to the function level and the execution logic structure of the function level; generating a fourth logic feature diagram of the sentence level of the program to be characterized according to the sentence level and the execution logic structure of the sentence level; and generating a logic characteristic characterization graph according to the first logic characteristic graph, the second logic characteristic graph, the third logic characteristic graph and the fourth logic characteristic graph.
On the basis of the above embodiment, the feature extraction module 1101 is specifically configured to: rejecting contradictory logic features among the first logic feature map, the second logic feature map, the third logic feature map and the fourth logic feature map to obtain a logic feature characterization map of a program to be characterized; the contradictory logic features refer to different logic features expressed by the same logic features in the first logic feature diagram, the second logic feature diagram, the third logic feature diagram and the fourth logic feature diagram.
On the basis of the above embodiment, the feature extraction module 1101 is specifically configured to: obtaining the file position and the file type of the program to be characterized according to a file characteristic analysis method; and generating a first logic characteristic diagram of a file hierarchy of the program to be characterized through a file tree analysis method according to the file position and the file type.
On the basis of the above embodiment, the feature extraction module 1101 is specifically configured to: obtaining the running mode of the program to be characterized according to the file type; the operation modes comprise a compiling operation mode and an interpretation operation mode; and generating a second logic characteristic diagram, a third logic characteristic diagram and a fourth logic characteristic diagram according to the running mode of the program to be characterized.
On the basis of the above embodiment, the apparatus further includes a model generating module, specifically configured to: acquiring a sample characterization program; the sample characterization program comprises a sample data logic structure and a sample execution logic structure; the sample data logic structure comprises a sample file level, a sample class level, a sample function level and a sample statement level; generating a first sample logic feature map of a sample file hierarchy of a sample characterization program according to the sample file hierarchy and a sample execution logic structure of the sample file hierarchy; generating a second sample logic feature map of the sample class hierarchy of the sample characterization program according to the sample class hierarchy and the sample execution logic structure of the sample class hierarchy; generating a third sample logic feature map of the sample function level of the sample characterization program according to the sample function level and the sample execution logic structure of the sample function level; generating a fourth sample logic feature diagram of the sample sentence level of the sample characterization program according to the sample sentence level and the sample execution logic structure of the sample sentence level; the first sample logic feature diagram, the second sample logic feature diagram, the third sample logic feature diagram and the fourth sample logic feature diagram are spliced into a sample logic feature characterization diagram; and generating the program characterization model according to the sample logic characteristic characterization graph.
According to the embodiment of the application, corresponding method steps are executed by utilizing different modules, so that the generated logic characteristic characterization graph reveals similar logic essence implied by different programming languages, and migration comparison can be directly carried out on the different programming languages, so that theoretical support is provided for downstream tasks such as logic vulnerability cause analysis, migration mining and the like.
Fig. 12 is a schematic structural diagram of an electronic device provided in an embodiment of the present application, as shown in fig. 12, where the electronic device includes a processor 1201, a memory 1202, and a bus 1203; wherein the processor 1201 and the memory 1202 communicate with each other via the bus 1203. The processor 1201 is configured to invoke program instructions in the memory 1202 to perform the methods provided by the method embodiments described above.
The processor 1201 may be an integrated circuit chip having signal processing capabilities. The processor 1201 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps, and logical blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 1202 may include, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), and the like.
This embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium. The computer program comprises program instructions which, when executed by a computer, cause the computer to perform the methods provided by the above method embodiments, for example: extracting features of a program to be characterized through a program characterization model to obtain a logic feature characterization graph, where the nodes of the logic feature characterization graph represent the data logic structure of the program to be characterized and the edges represent its execution logic structure; and analyzing the logic feature characterization graph through the program characterization model and outputting a program feature representation result of the program to be characterized. The program characterization model is generated in advance by performing feature extraction on sample characterization features to produce a sample logic feature map and then building the model based on that sample logic feature map.
This embodiment also provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example: extracting features of a program to be characterized through a program characterization model to obtain a logic feature characterization graph, where the nodes of the logic feature characterization graph represent the data logic structure of the program to be characterized and the edges represent its execution logic structure; and analyzing the logic feature characterization graph through the program characterization model and outputting a program feature representation result of the program to be characterized. The program characterization model is generated in advance by performing feature extraction on sample characterization features to produce a sample logic feature map and then building the model based on that sample logic feature map.
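To make the node and edge roles concrete, the following is a minimal sketch that builds such a graph for a small Python program using the standard `ast` module. It is an illustration under assumptions, not the extraction procedure of the application: only function-level nodes and direct call edges are handled, and the `build_graph` helper and the `"calls"` label are hypothetical.

```python
import ast

SOURCE = '''
def helper(x):
    return x + 1

def main():
    return helper(2)
'''


def build_graph(source):
    """Return (nodes, edges) for the top-level functions of a Python source.

    nodes maps a function name to its level in the data logic structure;
    edges holds (caller, "calls", callee) triples as execution logic.
    """
    tree = ast.parse(source)
    functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    nodes = {fn.name: "function" for fn in functions}
    edges = set()
    for fn in functions:
        for sub in ast.walk(fn):
            # Only direct calls by bare name to a known function are recorded.
            if (
                isinstance(sub, ast.Call)
                and isinstance(sub.func, ast.Name)
                and sub.func.id in nodes
            ):
                edges.add((fn.name, "calls", sub.func.id))
    return nodes, edges


nodes, edges = build_graph(SOURCE)
print(nodes)
print(edges)
```

A fuller treatment would add file, class and statement nodes and further edge kinds such as branches and loops; the analysis step that turns the graph into a program feature representation result is performed by the program characterization model and is not sketched here.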
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division of the units is merely a division by logical function, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interface, device or unit, and may be in electrical, mechanical or other form.
Further, the units described as separate units may or may not be physically separate, and units displayed as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Furthermore, functional modules in various embodiments of the present application may be integrated together to form a single portion, or each module may exist alone, or two or more modules may be integrated to form a single portion.
The above description is only of the embodiments shown in the present application and is not intended to limit the scope of the present application, and various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method of representing a program feature, the method comprising:
extracting features of the program to be characterized through a program characterization model to obtain a logic feature characterization diagram; the nodes of the logic characteristic representation graph represent the data logic structure of the program to be represented, and the edges of the logic characteristic representation graph represent the execution logic structure of the program to be represented;
analyzing the logic characteristic characterization graph through the program characterization model, and outputting a program characteristic representation result of the program to be characterized; the program characterization model is generated in advance by performing feature extraction on sample characterization features to produce a sample logic feature map and then building the model based on the sample logic feature map.
2. The method according to claim 1, wherein the feature extraction of the program to be characterized by the program characterization model to obtain a logic feature characterization graph includes:
extracting the data logic structure and the execution logic structure of the program to be characterized through the program characterization model;
and generating the logic characteristic characterization graph according to the data logic structure and the execution logic structure.
3. The method of claim 2, wherein the data logic structure comprises a file hierarchy, a class hierarchy, a function hierarchy, and a statement hierarchy; the generating the logic characteristic characterization graph according to the data logic structure and the execution logic structure comprises the following steps:
generating a first logic feature map of the file hierarchy of the program to be characterized according to the file hierarchy and an execution logic structure of the file hierarchy;
generating a second logic feature diagram of the class hierarchy of the program to be characterized according to the class hierarchy and an execution logic structure of the class hierarchy;
generating a third logic characteristic diagram of the function level of the program to be characterized according to the function level and an execution logic structure of the function level;
generating a fourth logic feature diagram of the statement level of the program to be characterized according to the statement level and an execution logic structure of the statement level;
and generating the logic characteristic characterization graph according to the first logic characteristic graph, the second logic characteristic graph, the third logic characteristic graph and the fourth logic characteristic graph.
4. The method of claim 3, wherein the generating the logic feature characterization graph from the first logic feature graph, the second logic feature graph, the third logic feature graph, and the fourth logic feature graph comprises:
rejecting contradictory logic features among the first logic feature map, the second logic feature map, the third logic feature map and the fourth logic feature map to obtain the logic feature characterization graph of the program to be characterized; a contradictory logic feature is a logic feature that is expressed differently in different ones of the first logic feature map, the second logic feature map, the third logic feature map and the fourth logic feature map.
5. A method according to claim 3, wherein said generating a first logical feature map of the file hierarchy of the program to be characterized from the file hierarchy and from the execution logic structure of the file hierarchy comprises:
obtaining the file position and the file type of the program to be characterized according to a file characteristic analysis method;
and generating a first logic characteristic diagram of the file level of the program to be characterized according to the file position and the file type through a file tree analysis method.
6. The method of claim 5, wherein the second logic feature map, the third logic feature map, and the fourth logic feature map are generated by:
obtaining the running mode of the program to be characterized according to the file type; the operation modes comprise a compiling operation mode and an interpretation operation mode;
and generating the second logic characteristic diagram, the third logic characteristic diagram and the fourth logic characteristic diagram according to the running mode of the program to be characterized.
7. The method according to any one of claims 1-6, further comprising:
acquiring a sample characterization program; the sample characterization program comprises a sample data logic structure and a sample execution logic structure; the sample data logic structure comprises a sample file hierarchy, a sample class hierarchy, a sample function hierarchy and a sample statement hierarchy;
generating a first sample logic feature map of the sample file hierarchy of the sample characterization program according to the sample file hierarchy and a sample execution logic structure of the sample file hierarchy;
generating a second sample logic feature map of the sample class hierarchy of the sample characterization program according to the sample class hierarchy and a sample execution logic structure of the sample class hierarchy;
generating a third sample logic feature map of the sample function hierarchy of the sample characterization program according to the sample function hierarchy and a sample execution logic structure of the sample function hierarchy;
generating a fourth sample logic feature map of the sample statement hierarchy of the sample characterization program according to the sample statement hierarchy and a sample execution logic structure of the sample statement hierarchy;
combining the first sample logic feature map, the second sample logic feature map, the third sample logic feature map and the fourth sample logic feature map into a sample logic feature characterization map;
and generating the program characterization model according to the sample logic characteristic characterization graph.
8. A program feature representing apparatus, the apparatus comprising:
the feature extraction module is used for extracting features of the program to be characterized through a program characterization model to obtain a logic feature characterization diagram; the nodes of the logic characteristic representation graph represent the data logic structure of the program to be represented, and the edges of the logic characteristic representation graph represent the execution logic structure of the program to be represented;
the representation module is used for analyzing the logic characteristic representation graph through the program representation model and outputting a program characteristic representation result of the program to be represented; the program characterization model is generated in advance by performing feature extraction on sample characterization features to produce a sample logic feature map and then building the model based on the sample logic feature map.
9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of a method of representing a program feature according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it has stored thereon a computer program which, when being executed by a processor, performs the steps of a method of representing program features as claimed in any one of claims 1 to 7.
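As an illustration of the file-level steps recited in claims 5 and 6, the following minimal sketch derives a file position, a file type and a running mode from a file path. The extension-to-mode table and the `file_features` helper are assumptions made for this example; the claims do not enumerate file types or say how the running mode is looked up.

```python
from pathlib import PurePosixPath

# Assumed mapping from file extension to running mode (illustrative only).
RUN_MODE_BY_SUFFIX = {
    ".c": "compiled",
    ".cpp": "compiled",
    ".go": "compiled",
    ".py": "interpreted",
    ".js": "interpreted",
    ".php": "interpreted",
}


def file_features(path):
    """Return the position, type and running mode inferred from a path."""
    p = PurePosixPath(path)
    return {
        "location": str(p.parent),
        "type": p.suffix,
        "run_mode": RUN_MODE_BY_SUFFIX.get(p.suffix, "unknown"),
    }


print(file_features("src/app/main.py"))
print(file_features("src/core/engine.c"))
```

In this sketch the running mode would then select how the class-level, function-level and statement-level maps are produced, for example by analysing compiler output for compiled files and source text for interpreted ones; that selection logic is not specified in the claims and is omitted.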
CN202310188066.7A 2023-02-24 2023-02-24 Program characteristic representation method and device, electronic equipment and storage medium Pending CN116149626A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310188066.7A CN116149626A (en) 2023-02-24 2023-02-24 Program characteristic representation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310188066.7A CN116149626A (en) 2023-02-24 2023-02-24 Program characteristic representation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116149626A true CN116149626A (en) 2023-05-23

Family

ID=86359925

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310188066.7A Pending CN116149626A (en) 2023-02-24 2023-02-24 Program characteristic representation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116149626A (en)

Similar Documents

Publication Publication Date Title
CN109976761B (en) Software development kit generation method and device and terminal equipment
US11036614B1 (en) Data control-oriented smart contract static analysis method and system
US7493596B2 (en) Method, system and program product for determining java software code plagiarism and infringement
TWI577539B (en) Computer-implemented method, computer-readable storage memory, and system for runtime system
US8473899B2 (en) Automatic optimization of string allocations in a computer program
CN110688122B (en) Method and device for compiling and executing intelligent contract
CN110704064B (en) Method and device for compiling and executing intelligent contract
CN110704063A (en) Method and device for compiling and executing intelligent contract
US7761856B2 (en) Defining expressions in a meta-object model of an application
CN111913878B (en) Byte code instrumentation method, device and storage medium based on program analysis result
CN108920566B (en) Method, device and equipment for operating SQLite database
US11868465B2 (en) Binary image stack cookie protection
CN108197020A (en) Plug-in unit method of calibration, electronic equipment and computer storage media
US20220108023A1 (en) Docker image vulnerability inspection device and method for performing docker file analysis
CN111159301A (en) Data creating method, device, equipment and storage medium based on intelligent contract
CN110333872A (en) A kind of processing method of application, device, equipment and medium
Silva et al. Identifying classes in legacy JavaScript code
CN116149626A (en) Program characteristic representation method and device, electronic equipment and storage medium
CN112463596B (en) Test case data processing method, device and equipment and processing equipment
EP2535813A1 (en) Method and device for generating an alert during an analysis of performance of a computer application
CN115686467A (en) Type inference in dynamic languages
US7661092B1 (en) Intelligent reuse of local variables during bytecode compilation
CN111273913B (en) Method and device for outputting application program interface data represented by specifications
US10311392B2 (en) Just in time compilation (JIT) for business process execution
CN117235746B (en) Source code safety control platform based on multidimensional AST fusion detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination