CN114611114A - Code auditing method, device, equipment and storage medium - Google Patents

Publication number
CN114611114A
CN114611114A CN202210242202.1A CN202210242202A CN114611114A CN 114611114 A CN114611114 A CN 114611114A CN 202210242202 A CN202210242202 A CN 202210242202A CN 114611114 A CN114611114 A CN 114611114A
Authority
CN
China
Prior art keywords
code
audit
auditing
audited
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210242202.1A
Other languages
Chinese (zh)
Inventor
黎梦然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd filed Critical Ping An Puhui Enterprise Management Co Ltd
Priority to CN202210242202.1A priority Critical patent/CN114611114A/en
Publication of CN114611114A publication Critical patent/CN114611114A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033 Test or assess software

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Computer Security & Cryptography (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the field of artificial intelligence and discloses a code auditing method, device, equipment and storage medium. The method comprises the following steps: acquiring a code audit specification set and generating a code audit specification library from it; acquiring the code to be audited and scanning it to obtain a code identifier; using the code identifier to retrieve the corresponding code audit specification from the code audit specification library; and, based on that specification, auditing the code to be audited with a preset code auditing model to obtain the corresponding code audit result. The invention also relates to blockchain technology: the code identifier can be stored in a blockchain.

Description

Code auditing method, device, equipment and storage medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to a code auditing method, device, equipment and storage medium.
Background
In the prior art, code auditing is an advanced penetration-testing service: an analysis of the code to be audited that aims to discover program errors, security vulnerabilities and violations of coding specifications. At present, manual auditing of the code remains the final safeguard in the design, development and application of software.
In the related art, one approach to code auditing is to audit a program's code manually. This wastes human resources, demands strong code error-correction skills from the auditors, and is inefficient. The other approach is to audit code by regular-expression matching.
Disclosure of Invention
The embodiment of the invention provides a code auditing method, a code auditing device, code auditing equipment and a storage medium, which are used for solving the technical problem of low accuracy rate during code auditing.
The invention provides a code auditing method in a first aspect, which comprises the following steps: acquiring a code audit specification set, and generating a code audit specification library according to the code audit specification set; acquiring a code to be audited and scanning the code to be audited to obtain a code identifier; acquiring a corresponding code audit specification from the code audit specification library through the code identification; and code auditing is carried out on the code to be audited through a preset code auditing model based on the code auditing specification, so as to obtain a corresponding code auditing result.
Optionally, in a first implementation manner of the first aspect of the present invention, the obtaining a code audit specification set, and generating a code audit specification library according to the code audit specification set includes: acquiring the audit specification set, wherein the audit specification set comprises at least two audit specifications; analyzing the at least two audit specifications to determine a specification description file corresponding to each audit specification; according to the specification description file corresponding to each audit specification, performing type division on each audit specification, and determining the specification type corresponding to each audit specification; analyzing the storage positions of the at least two audit specifications according to the specification type corresponding to each audit specification, and determining the storage position corresponding to each audit specification; and storing the at least two audit specifications according to the storage position corresponding to each audit specification to obtain a corresponding code audit specification library.
Optionally, in a second implementation manner of the first aspect of the present invention, the performing type division on each audit specification according to a specification description file corresponding to each audit specification, and determining a specification type corresponding to each audit specification includes: inputting the specification description file corresponding to each audit specification and a preset standard semantic into a preset semantic analysis model to perform semantic similarity calculation, so as to obtain a semantic analysis result; and analyzing each audit specification according to the semantic analysis result, and determining the specification type corresponding to each audit specification.
Optionally, in a third implementation manner of the first aspect of the present invention, the code auditing, based on the code auditing specification, is performed on the code to be audited through a preset code auditing model, and obtaining a corresponding code auditing result includes: inputting the code to be audited into a grammar analyzer in the code auditing model for lexical analysis, and generating a corresponding lexical unit array; constructing an abstract syntax tree according to the lexical unit array to obtain a corresponding abstract syntax tree; and code auditing is carried out according to the abstract syntax tree, and a corresponding code auditing result is obtained.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the inputting the code to be audited into a grammar analyzer in the code auditing model to perform lexical analysis, and generating the corresponding lexical unit array includes: inputting the code to be audited into the grammar analyzer, and decomposing the code to be audited into a plurality of corresponding lexical units through the grammar analyzer; and performing sequence analysis on the plurality of lexical units to obtain a plurality of corresponding lexical unit sequences, and combining the plurality of lexical units according to the plurality of lexical unit sequences to generate the corresponding lexical unit array.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the constructing an abstract syntax tree according to the lexical unit array to obtain a corresponding abstract syntax tree includes: according to the lexical unit array, carrying out syntactic analysis on the code to be audited to obtain a corresponding syntactic analysis result; constructing a syntax analysis tree based on the syntactic analysis result and a preset code language specification to obtain the syntax analysis tree corresponding to the code to be audited; and calling the node object creating function corresponding to each node in the syntax analysis tree to create the node, generating a plurality of created node objects, and generating the corresponding abstract syntax tree according to the plurality of node objects.
Optionally, in a sixth implementation manner of the first aspect of the present invention, after performing code audit on the code to be audited through a preset code audit model based on the code audit specification to obtain a corresponding code audit result, the method further includes: acquiring the code audit result, and extracting characteristic data from the code audit result to obtain corresponding characteristic data; inputting the characteristic data into a preset false alarm discrimination model to carry out false alarm analysis, and obtaining a corresponding analysis result; and scanning the analysis result; if the analysis result shows that a false alarm exists, determining the corresponding false alarm information and transmitting the false alarm information together with the code audit result to an information prompt terminal; and if the analysis result shows that no false alarm exists, transmitting the code audit result to the information prompt terminal.
The second aspect of the present invention provides a code auditing apparatus, including: the generating module is used for acquiring a code audit specification set and generating a code audit specification library according to the code audit specification set; the scanning module is used for acquiring a code to be audited and scanning the code to be audited to obtain a code identifier; the acquisition module is used for acquiring the corresponding code audit specification from the code audit specification library through the code identification; and the auditing module is used for performing code auditing on the code to be audited through a preset code auditing model based on the code auditing specification to obtain a corresponding code auditing result.
Optionally, in a first implementation manner of the second aspect of the present invention, the generating module specifically includes:
the obtaining unit is used for obtaining the audit specification set, wherein the audit specification set comprises at least two audit specifications;
the analysis unit is used for analyzing the at least two audit specifications and determining a specification description file corresponding to each audit specification;
the dividing unit is used for carrying out type division on each audit specification according to the specification description file corresponding to each audit specification and determining the specification type corresponding to each audit specification;
the analysis unit is used for analyzing the storage positions of the at least two audit specifications according to the specification type corresponding to each audit specification and determining the storage position corresponding to each audit specification;
and the storage unit is used for storing the at least two audit specifications according to the storage position corresponding to each audit specification to obtain a corresponding code audit specification library.
Optionally, in a second implementation manner of the second aspect of the present invention, the dividing unit is specifically configured to: inputting the specification description file corresponding to each audit specification and a preset standard semantic into a preset semantic analysis model to perform semantic similarity calculation, so as to obtain a semantic analysis result; and analyzing each audit specification according to the semantic analysis result, and determining the specification type corresponding to each audit specification.
Optionally, in a third implementation manner of the second aspect of the present invention, the audit module specifically includes:
the generating unit is used for inputting the codes to be audited into a grammar analyzer in the code auditing model for lexical analysis and generating corresponding lexical unit arrays;
the construction unit is used for constructing an abstract syntax tree according to the lexical unit array to obtain a corresponding abstract syntax tree;
and the auditing unit is used for performing code auditing according to the abstract syntax tree to obtain a corresponding code auditing result.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the generating unit is specifically configured to: inputting the code to be audited into the syntactic analyzer, and performing lexical unit decomposition on the code to be audited through the syntactic analyzer to obtain a plurality of corresponding lexical units; and performing sequence analysis on the plurality of lexical units to obtain a plurality of corresponding lexical unit sequences, and combining the plurality of lexical units according to the plurality of lexical unit sequences to generate a corresponding lexical unit array.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the construction unit is specifically configured to: according to the lexical unit array, carrying out syntactic analysis on the code to be audited to obtain a corresponding syntactic analysis result; constructing a syntax analysis tree based on the syntactic analysis result and a preset code language specification to obtain the syntax analysis tree corresponding to the code to be audited; and calling the node object creating function corresponding to each node in the syntax analysis tree to create the node, generating a plurality of created node objects, and generating the corresponding abstract syntax tree according to the plurality of node objects.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the code auditing apparatus further includes:
the extraction module is used for acquiring the code auditing result and extracting the characteristic data of the code auditing result to obtain corresponding characteristic data;
the analysis module is used for inputting the characteristic data into a preset false alarm discrimination model to carry out false alarm analysis so as to obtain a corresponding analysis result;
and the transmission module is used for scanning the analysis result; if the analysis result shows that a false alarm exists, determining the corresponding false alarm information and transmitting the false alarm information together with the code audit result to an information prompt terminal; and if the analysis result shows that no false alarm exists, transmitting the code audit result to the information prompt terminal.
A third aspect of the present invention provides a computer apparatus comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the computer device to perform the code auditing method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform the code auditing method described above.
In the technical scheme provided by the invention, the server determines the corresponding code type from the code identifier and downloads the corresponding code audit specification from the code audit specification library according to that code type. This avoids false alarms during code auditing that are caused by different code specifications having been adopted during development, and improves the efficiency and accuracy of code auditing. In the embodiment of the invention, the auditing is executed by a model that is based on a machine learning algorithm and trained on a large amount of data: the code to be audited is audited through the preset code auditing model to obtain the corresponding code audit result, which can improve the accuracy of code auditing.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a code auditing method in an embodiment of the present invention;
FIG. 2 is a schematic diagram of another embodiment of a code auditing method in an embodiment of the present invention;
FIG. 3 is a schematic diagram of an embodiment of a code auditing apparatus in the embodiment of the present invention;
FIG. 4 is a schematic diagram of another embodiment of a code auditing apparatus in an embodiment of the present invention;
FIG. 5 is a diagram of an embodiment of a computer device in an embodiment of the invention.
Detailed Description
The embodiment of the invention provides a code auditing method, a code auditing device, code auditing equipment and a storage medium, which are used for solving the technical problem of low accuracy rate during code auditing.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be implemented in other sequences than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. The artificial intelligence is a theory, a method, a technology and an application system which simulate, extend and expand human intelligence by using a digital computer or a machine controlled by the digital computer, sense the environment, acquire knowledge and obtain the best result by using the knowledge. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
For ease of understanding, a detailed flow of an embodiment of the present invention is described below. Referring to FIG. 1, an embodiment of the code auditing method in an embodiment of the present invention includes:
101. Acquiring a code audit specification set, and generating a code audit specification library according to the code audit specification set;
It is to be understood that the execution subject of the present invention may be a code auditing apparatus, and may also be a server, which is not limited herein. The embodiment of the present invention is described by taking a server as the execution subject.
It should be noted that the audit specification set is a set of predefined audit specifications and includes at least two audit specifications. The audit specification set may be determined according to the specific requirements of an enterprise. Every audit specification in the set should have a corresponding specification description file that describes the function, meaning and the like of that audit specification. Specifically, the server acquires the code audit specification set and generates the code audit specification library according to the code audit specification set.
102. Acquiring a code to be audited and scanning the code to be audited to obtain a code identifier;
It should be noted that the code identifier may be a serial number that indicates the correspondence between the code and the code specification. The code identifier is generated according to a preset identifier generation rule, that is, a rule for generating a new code identifier from an existing one. For example, the rule may set a unit value, and each new code identifier is obtained by adding that unit value to the previous code identifier. Specifically, the server acquires the code to be audited and scans it to determine the corresponding code identifier. It should be emphasized that, to further ensure the privacy and security of the code identifier, the code identifier may also be stored in a node of a blockchain.
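The increment-by-unit-value rule described in step 102 can be sketched as follows. This is a minimal illustration only: the class name `CodeIdGenerator`, the starting value and the unit value are assumptions and do not come from the patent text.

```python
import itertools


class CodeIdGenerator:
    """Hypothetical generator: each new identifier is the previous one plus a unit value."""

    def __init__(self, start=0, unit=1):
        # The first identifier handed out is start + unit.
        self._counter = itertools.count(start + unit, unit)

    def next_id(self):
        return next(self._counter)


gen = CodeIdGenerator(start=100, unit=1)
first = gen.next_id()
second = gen.next_id()
print(first, second)
```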
103. Acquiring a corresponding code audit specification from a code audit specification library through a code identifier;
Specifically, the server determines the corresponding code type from the code identifier and downloads the corresponding code audit specification from the code audit specification library according to that code type. This avoids false alarms caused by different code specifications having been adopted during development, and improves the efficiency and accuracy of code auditing.
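The two-stage lookup in step 103 (code identifier to code type, then code type to audit specification) might look like the sketch below. Both dictionaries and the function name `fetch_specs` are hypothetical stand-ins for the server's storage; the patent does not specify how these mappings are held.

```python
# Hypothetical in-memory stand-ins for the server's data.
CODE_TYPE_BY_ID = {101: "python", 102: "java"}
SPEC_LIBRARY = {
    "python": ["no-eval"],
    "java": ["close-streams"],
}


def fetch_specs(code_id):
    """Resolve the code type from the identifier, then return that type's specifications."""
    code_type = CODE_TYPE_BY_ID.get(code_id)
    if code_type is None:
        raise KeyError(f"unknown code identifier: {code_id}")
    return SPEC_LIBRARY.get(code_type, [])


print(fetch_specs(101))
```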
104. And performing code audit on the code to be audited through a preset code audit model based on the code audit specification to obtain a corresponding code audit result.
Specifically, the code auditing model detects defects and security vulnerabilities in the code to be audited by analyzing the syntax, semantics and data flow of the program's code. The audit result of the code auditing model contains the detailed data of each detected problem, where the problems are those detected directly by the code auditing model.
In the embodiment of the invention, the server determines the corresponding code type through the code identification, and downloads the corresponding code audit specification from the code audit specification library according to the code type, so that the problem of false alarm caused by adopting different code specifications during development can be avoided, and the efficiency and the accuracy of code audit are improved.
Referring to fig. 2, another embodiment of the code auditing method according to the embodiment of the present invention includes:
201. Acquiring a code audit specification set, and generating a code audit specification library according to the code audit specification set;
Specifically, the server obtains an audit specification set, wherein the audit specification set comprises at least two audit specifications; analyzes the at least two audit specifications and determines the specification description file corresponding to each audit specification; performs type division on each audit specification according to its specification description file to determine the specification type corresponding to each audit specification; analyzes the storage positions of the at least two audit specifications according to the specification type corresponding to each audit specification and determines the storage position corresponding to each audit specification; and stores the at least two audit specifications according to the storage position corresponding to each audit specification to obtain the corresponding code audit specification library.
It should be noted that this step establishes a broad code audit specification library. Because the specification description files express information such as the function and meaning of each audit specification, they can be used to classify the audit specifications. The classification items specifically include the code language type and the code plug-in type; that is, different code language types and plug-in types have their own audit specifications, so a user can look up audit specifications in a targeted way according to the language type of the code to be audited and the plug-in type it uses. The code language types may specifically be java, js, python and the like. Since different existing audit specification sets may contain audit specifications with the same content, redundancy in the generated code audit specification library is avoided as follows: after each audit specification has been classified by type, the audit specifications belonging to the same type are de-duplicated and recorded at the corresponding database storage position. The server then stores the at least two audit specifications according to the storage position corresponding to each audit specification to obtain the corresponding code audit specification library.
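The classify, de-duplicate and store flow described above can be sketched as follows. The `AuditSpec` fields, the use of a dictionary keyed by language type as the "storage position", and the choice to treat identical description text as duplicate content are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditSpec:
    """One audit specification; all field names are illustrative."""
    rule_id: str
    language: str      # code language type, e.g. "java", "js", "python"
    description: str   # stands in for the specification description file


def build_spec_library(spec_set):
    """Group specifications by language type, dropping duplicate content."""
    if len(spec_set) < 2:
        raise ValueError("the specification set must hold at least two specifications")
    library = defaultdict(list)
    seen = set()
    for spec in spec_set:
        key = (spec.language, spec.description)
        if key in seen:  # same content already stored for this type
            continue
        seen.add(key)
        library[spec.language].append(spec)
    return dict(library)


specs = [
    AuditSpec("R1", "python", "Do not call eval on untrusted input"),
    AuditSpec("R2", "java", "Close streams in a finally block"),
    AuditSpec("R3", "python", "Do not call eval on untrusted input"),  # duplicate of R1
]
library = build_spec_library(specs)
print({language: [s.rule_id for s in rules] for language, rules in library.items()})
```

Keying the library by language type keeps the later lookup by code type a single dictionary access; a plug-in type could be added to the key in the same way.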
Optionally, according to the specification description file corresponding to each audit specification, performing type division on each audit specification, and determining the specification type corresponding to each audit specification may include: the server inputs the specification description file corresponding to each audit specification and a preset standard semantic into a preset semantic analysis model to carry out semantic similarity calculation, so as to obtain a semantic analysis result; and the server analyzes each audit specification according to the semantic analysis result and determines the specification type corresponding to each audit specification.
In the embodiment of the invention, the server analyzes the specification description file with a semantic analysis model based on a deep learning network to confirm the specific semantics of each audit specification. Specifically, the server uses the semantic analysis model to calculate the semantic value of the specification description file and then calculates the similarity between that value and the semantic value of the standard semantics. When the difference between the two semantic values is smaller than a preset threshold, the specification description file is considered equivalent to the standard semantics, which yields the semantic analysis result for the specification description file. The server then analyzes each audit specification according to the semantic analysis result and determines the specification type corresponding to each audit specification.
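To make the threshold comparison concrete, the sketch below classifies a specification description by its similarity to per-type standard texts. It deliberately swaps the deep learning semantic model for a bag-of-words cosine similarity so that it runs without any trained model; the `STANDARDS` texts, the threshold of 0.3 and all function names are assumptions. Note that it accepts a match when similarity is at or above the threshold, which is the mirror image of the "difference smaller than a threshold" wording in the text.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for the semantic model: a simple word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(description, standards, threshold=0.3):
    """Return the type whose standard text is most similar, or None below the threshold."""
    vec = embed(description)
    best_type, best_score = None, 0.0
    for spec_type, standard_text in standards.items():
        score = cosine(vec, embed(standard_text))
        if score > best_score:
            best_type, best_score = spec_type, score
    return best_type if best_score >= threshold else None


STANDARDS = {  # hypothetical "standard semantics" per specification type
    "python": "rule for python source code",
    "java": "rule for java source code",
}
print(classify("python rule forbidding eval in source code", STANDARDS))
```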
202. Acquiring a code to be audited and scanning the code to be audited to obtain a code identifier;
Specifically, in this embodiment, the implementation of step 202 is similar to that of step 102 and is not described again here.
203. Acquiring a corresponding code audit specification from a code audit specification library through a code identifier;
Specifically, in this embodiment, the implementation of step 203 is similar to that of step 103 and is not described again here.
204. Inputting the code to be audited into a grammar analyzer in the code auditing model for lexical analysis, and generating the corresponding lexical unit array;
Specifically, the server inputs the code to be audited into the grammar analyzer, and the grammar analyzer decomposes the code to be audited into a plurality of corresponding lexical units; the server performs sequence analysis on the plurality of lexical units to obtain a plurality of corresponding lexical unit sequences, and combines the plurality of lexical units according to the plurality of lexical unit sequences to generate the corresponding lexical unit array.
It should be noted that the server inputs the code to be audited into the lexical analyzer, and the lexical analyzer generates the lexical units. A lexical unit sequence is a set of lexical units. Each lexical unit includes information such as the lexical unit name, content, type, line number and column number: the name is the abstract symbolic representation of the lexical unit, the content records the specific text corresponding to the lexical unit, the type indicates the kind of lexical unit, and the line number and column number record where the lexical unit is located in the code to be audited. The server then performs sequence analysis on the plurality of lexical units to obtain the corresponding lexical unit sequences and combines the lexical units according to those sequences to generate the corresponding lexical unit array.
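As an illustration of a lexical unit array carrying name, content, type, line number and column number, the sketch below uses Python's standard `tokenize` module on Python source. The patent does not name a lexer or a target language, so this choice, the function name `lex` and the dictionary keys are assumptions.

```python
import io
import token
import tokenize

# Layout-only tokens that carry no content of their own.
_SKIPPED = {token.NEWLINE, token.NL, token.INDENT, token.DEDENT, token.ENDMARKER}


def lex(source):
    """Decompose source text into an ordered array of lexical units."""
    units = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        units.append({
            "name": token.tok_name[tok.exact_type],  # abstract symbol, e.g. EQUAL
            "content": tok.string,                   # the text of the unit
            "type": token.tok_name[tok.type],        # broad category, e.g. OP
            "line": tok.start[0],                    # 1-based line number
            "column": tok.start[1],                  # 0-based column number
        })
    return units


units = lex("x = eval(data)\n")
for unit in units:
    print(unit)
```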
205. Constructing an abstract syntax tree according to the lexical unit array to obtain a corresponding abstract syntax tree;
Specifically, the server performs syntactic analysis on the code to be audited according to the lexical unit array to obtain a corresponding syntactic analysis result; the server constructs a syntax analysis tree based on the syntactic analysis result and a preset code language specification to obtain a syntax analysis tree corresponding to the code to be audited; and the server calls the node object creating function corresponding to each node in the syntax analysis tree to create the node, generates a plurality of created node objects, and generates a corresponding abstract syntax tree according to the plurality of node objects.
It should be noted that the abstract syntax tree includes nodes that represent the syntax structure of the code to be audited, and each node represents one syntax structure in that code. A node that has sub-nodes represents a syntax construct supported by the programming language, and a node without sub-nodes represents a lexical unit. In this embodiment, the abstract syntax tree is constructed mainly through two processes, lexical analysis and syntactic analysis: the server constructs the syntax analysis tree based on the syntactic analysis result and the preset code language specification to obtain the syntax analysis tree corresponding to the code to be audited; the server then calls the node object creating function corresponding to each node in the syntax analysis tree to create the node, generates a plurality of created node objects, and generates the corresponding abstract syntax tree according to the plurality of node objects.
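Step 205 can be sketched as follows, assuming a deliberately small expression grammar in place of the "preset code language specification". The lexical unit array is simplified to `(name, content)` pairs, and `parse`, `build_ast`, `NODE_FACTORIES` and the `create_*` functions are hypothetical names, not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    value: str = ""
    children: list["Node"] = field(default_factory=list)


def parse(tokens):
    """Syntactic analysis: turn (name, content) pairs into a nested-tuple parse tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        name, text = peek()
        if name == "NUMBER":
            advance()
            return ("number", text)
        if name == "IDENT":
            advance()
            return ("name", text)
        if text == "(":
            advance()
            inner = expr()
            if peek()[1] != ")":
                raise SyntaxError("expected ')'")
            advance()
            return inner
        raise SyntaxError(f"unexpected token {text!r}")

    def term():
        left = factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            left = ("binary", op, left, factor())
        return left

    def expr():
        left = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            left = ("binary", op, left, term())
        return left

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("trailing tokens after expression")
    return tree


# One node object creating function per parse-tree node kind.
def create_binary(op, left, right):
    return Node("BinaryOp", op, [build_ast(left), build_ast(right)])


def create_number(text):
    return Node("Number", text)


def create_name(text):
    return Node("Name", text)


NODE_FACTORIES = {"binary": create_binary, "number": create_number, "name": create_name}


def build_ast(tree):
    """Call the creating function that corresponds to each parse-tree node."""
    kind, *args = tree
    return NODE_FACTORIES[kind](*args)
```

In the resulting tree, `BinaryOp` nodes have sub-nodes and stand for a syntax construct, while `Number` and `Name` nodes have none and stand for single lexical units, matching the distinction drawn above.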
206. Performing code auditing according to the abstract syntax tree to obtain a corresponding code auditing result.
Specifically, in this embodiment, the specific implementation of step 206 is similar to step 104 described above, and is not described herein again.
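As an illustration of auditing over an abstract syntax tree, the sketch below uses Python's standard `ast` module to walk the tree of a Python source string and report calls to a small set of banned functions. The rule identifiers (`SEC001`, `SEC002`) and the shape of each finding are assumptions made for the example; the patent does not define a rule format.

```python
import ast

# Hypothetical audit specification: banned function name -> rule identifier.
BANNED_CALLS = {"eval": "SEC001", "exec": "SEC002"}


class AuditVisitor(ast.NodeVisitor):
    """Collects one finding per call to a banned function."""

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Name) and func.id in BANNED_CALLS:
            self.findings.append(
                {
                    "rule_id": BANNED_CALLS[func.id],
                    "line": node.lineno,
                    "column": node.col_offset + 1,  # col_offset is 0-based
                    "message": f"call to {func.id}() is not allowed",
                }
            )
        self.generic_visit(node)  # keep walking into nested calls


def audit(source: str) -> list[dict]:
    """Parse the source into an AST and return the audit findings."""
    visitor = AuditVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings
```

For example, `audit("x = 1\ny = eval(input())\n")` reports a single `SEC001` finding located on line 2.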
Optionally, after step 206, the method may include: the server acquires the code auditing result and extracts feature data from it to obtain corresponding feature data; the server inputs the feature data into a preset false alarm discrimination model for false alarm analysis to obtain a corresponding analysis result; and the server scans the analysis result. If the analysis result indicates that a false alarm exists, the server determines the corresponding false alarm information and transmits the false alarm information together with the code auditing result to the information prompt terminal; if the analysis result indicates that no false alarm exists, the server transmits the code auditing result to the information prompt terminal.
It should be noted that a false alarm discrimination model must first be trained. In the embodiment of the present invention, the false alarm discrimination model includes a first false alarm discrimination model and a second false alarm discrimination model based on different machine learning algorithms. The server analyzes the syntax, semantics and data flow of the code to be audited according to the abstract syntax tree, and detects the defects and security vulnerabilities present in the code to be audited. A machine learning algorithm is then used to analyze the code auditing result: for each defect or security vulnerability reported in the result, the probability that it is a false alarm is calculated, and if the probability is greater than a threshold value, the problem is considered a false alarm. After obtaining the code auditing result, the server may first traverse the detail data of each problem and determine any problem whose detection standard identifier does not belong to a preset rule identifier set to be a false alarm. The server subsequently transmits the false alarm information and the code auditing result to the information prompt terminal.
In the embodiment of the invention, the server extracts the feature data of each problem from the detail data of the code audit result and inputs the feature data into the false alarm judgment model, so that the false alarm problem in the problems is automatically identified.
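The two checks described above (rule identifier membership, then a probability threshold) can be sketched as follows. This is only an illustration: the feature names, the weights in `stand_in_model`, the rule identifier set and the threshold are all invented for the example, and a single hand-written logistic score stands in for the trained first and second discrimination models, whose algorithms the text does not name.

```python
import math

# Hypothetical preset rule identifier set and decision threshold.
KNOWN_RULE_IDS = {"SEC001", "SEC002"}
THRESHOLD = 0.5


def extract_features(finding: dict) -> list[float]:
    """Turn the detail data of one problem into a numeric feature vector."""
    return [
        1.0 if finding.get("in_test_file") else 0.0,
        float(finding.get("times_suppressed", 0)),
    ]


def stand_in_model(features: list[float]) -> float:
    """Placeholder for a trained model: returns a false alarm probability."""
    weights = [2.0, 0.8]  # illustrative values, not learned from data
    bias = -1.5
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def split_false_alarms(findings, model=stand_in_model, threshold=THRESHOLD):
    """Return (confirmed problems, false alarms) for a list of findings."""
    confirmed, false_alarms = [], []
    for finding in findings:
        if finding["rule_id"] not in KNOWN_RULE_IDS:
            # Detection standard identifier outside the preset rule set.
            false_alarms.append(finding)
        elif model(extract_features(finding)) > threshold:
            # Model considers the problem more likely spurious than real.
            false_alarms.append(finding)
        else:
            confirmed.append(finding)
    return confirmed, false_alarms
```

Both lists would then be sent to the information prompt terminal, with the false alarm list serving as the false alarm information.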
Referring to fig. 3, an embodiment of a code auditing apparatus according to an embodiment of the present invention includes:
the generating module 301 is configured to obtain a code audit specification set, and generate a code audit specification library according to the code audit specification set;
the scanning module 302 is configured to obtain a code to be audited and scan the code to be audited to obtain a code identifier;
an obtaining module 303, configured to obtain, through the code identifier, a corresponding code audit specification from the code audit specification library;
and the auditing module 304 is used for performing code auditing on the code to be audited through a preset code auditing model based on the code auditing specification to obtain a corresponding code auditing result.
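The cooperation of the four modules above can be sketched as a small pipeline. The sketch rests on several assumptions that the text does not state: the code identifier is taken to be a language name derived from the file extension, an audit specification is a dictionary with a single `max_line_length` rule, and all function names are hypothetical.

```python
from pathlib import PurePath

# Assumed mapping from file extension to code identifier.
EXTENSION_TO_ID = {".py": "python", ".java": "java"}


def generate_spec_library(spec_set: list[dict]) -> dict[str, dict]:
    """Generating module: index the audit specification set by code identifier."""
    return {spec["code_id"]: spec for spec in spec_set}


def scan_code_identifier(file_name: str) -> str:
    """Scanning module: derive the code identifier of the code to be audited."""
    return EXTENSION_TO_ID[PurePath(file_name).suffix]


def obtain_spec(library: dict[str, dict], code_id: str) -> dict:
    """Obtaining module: look up the matching audit specification."""
    return library[code_id]


def audit_code(source: str, spec: dict) -> list[tuple[int, str]]:
    """Auditing module: apply the specification and collect (line, message) results."""
    results = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        if len(line) > spec["max_line_length"]:
            results.append((line_no, "line too long"))
    return results


def run_pipeline(file_name: str, source: str, spec_set: list[dict]):
    """Chain the four modules in the order described above."""
    library = generate_spec_library(spec_set)
    code_id = scan_code_identifier(file_name)
    spec = obtain_spec(library, code_id)
    return audit_code(source, spec)
```

A real auditing module would run the lexical analysis, tree construction and tree-based checks of steps 204 to 206 instead of the line-length check used here as a placeholder.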
Referring to fig. 4, another embodiment of the code auditing apparatus according to the embodiment of the present invention includes:
the generating module 301 is configured to obtain a code audit specification set, and generate a code audit specification library according to the code audit specification set;
the scanning module 302 is configured to obtain a code to be audited and scan the code to be audited to obtain a code identifier;
an obtaining module 303, configured to obtain, through the code identifier, a corresponding code audit specification from the code audit specification library;
and the auditing module 304 is used for performing code auditing on the code to be audited through a preset code auditing model based on the code auditing specification to obtain a corresponding code auditing result.
Optionally, the obtaining module 303 specifically includes:
an obtaining unit 3031, configured to obtain the audit specification set, where the audit specification set includes at least two audit specifications;
an analyzing unit 3032, configured to analyze the at least two audit specifications, and determine a specification description file corresponding to each audit specification;
a dividing unit 3033, configured to perform type division on each audit specification according to the specification description file corresponding to each audit specification, and determine a specification type corresponding to each audit specification;
an analyzing unit 3034, configured to perform storage location analysis on the at least two audit specifications according to the specification type corresponding to each audit specification, and determine a storage location corresponding to each audit specification;
and the storage unit 3035 is configured to store the at least two audit specifications according to the storage location corresponding to each audit specification to obtain a corresponding code audit specification library.
Optionally, the dividing unit 3033 is specifically configured to: inputting the specification description file corresponding to each audit specification and a preset standard semantic into a preset semantic analysis model to perform semantic similarity calculation, so as to obtain a semantic analysis result; and analyzing each audit specification according to the semantic analysis result, and determining the specification type corresponding to each audit specification.
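The type division and storage performed by units 3031 to 3035 can be sketched as follows. A bag-of-words cosine similarity stands in for the "preset semantic analysis model", which the text does not specify; the standard semantics, the type names, and the use of the type name as the storage location are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical preset standard semantics, one reference text per specification type.
STANDARD_SEMANTICS = {
    "security": "injection vulnerability unsafe input",
    "style": "naming indentation line length format",
}


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[word] * vb[word] for word in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def classify(description: str) -> str:
    """Pick the specification type whose standard semantics is most similar."""
    return max(
        STANDARD_SEMANTICS,
        key=lambda t: cosine_similarity(description, STANDARD_SEMANTICS[t]),
    )


def build_library(specs: dict[str, str]) -> dict[str, list[str]]:
    """Store each specification under the location given by its type."""
    library = defaultdict(list)
    for name, description in specs.items():
        library[classify(description)].append(name)
    return dict(library)
```

With two specifications, one describing unsafe input and one describing line length, `build_library` files the first under `security` and the second under `style`.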
Optionally, the auditing module 304 specifically includes:
the generating unit 3041 is configured to input the to-be-audited code into a syntax analyzer in the code auditing model to perform lexical analysis, and generate a corresponding lexical unit array;
a constructing unit 3042, configured to perform abstract syntax tree construction according to the lexical unit array, so as to obtain a corresponding abstract syntax tree;
and the auditing unit 3043 is configured to perform code auditing according to the abstract syntax tree to obtain a corresponding code auditing result.
Optionally, the generating unit 3041 is specifically configured to: input the code to be audited into the syntax analyzer, and perform lexical unit decomposition on the code to be audited through the syntax analyzer to obtain a plurality of corresponding lexical units; and perform sequence analysis on the plurality of lexical units to obtain a plurality of corresponding lexical unit sequences, and combine the plurality of lexical units according to the plurality of lexical unit sequences to generate a corresponding lexical unit array.
Optionally, the constructing unit 3042 is specifically configured to: perform syntactic analysis on the code to be audited according to the lexical unit array to obtain a corresponding syntactic analysis result; construct a syntax analysis tree based on the syntactic analysis result and a preset code language specification to obtain a syntax analysis tree corresponding to the code to be audited; and call the node object creating function corresponding to each node in the syntax analysis tree to create the node, generate a plurality of created node objects, and generate a corresponding abstract syntax tree according to the plurality of node objects.
Optionally, the code auditing apparatus further includes:
an extraction module 305, configured to obtain the code audit result, and perform feature data extraction on the code audit result to obtain corresponding feature data;
the analysis module 306 is used for inputting the characteristic data into a preset false alarm discrimination model to carry out false alarm analysis so as to obtain a corresponding analysis result;
and a transmission module 307, configured to scan the analysis result; if the analysis result indicates that a false alarm exists, determine corresponding false alarm information and transmit the false alarm information and the code auditing result to an information prompt terminal; and if the analysis result indicates that no false alarm exists, transmit the code auditing result to the information prompt terminal.
Fig. 5 is a schematic structural diagram of a computer device 500 according to an embodiment of the present invention. The computer device 500 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 510 (e.g., one or more processors), a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) storing applications 533 or data 532. The memory 520 and the storage media 530 may be transient or persistent storage. The program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a sequence of instructions operating on the computer device 500. Further, the processor 510 may be configured to communicate with the storage medium 530 and to execute, on the computer device 500, the series of instruction operations stored in the storage medium 530.
The computer device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input-output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the configuration illustrated in FIG. 5 does not limit the computer device, which may include more or fewer components than those illustrated, combine some components, or use a different arrangement of components.
The present invention also provides a computer device, which includes a memory and a processor, wherein the memory stores computer readable instructions, and when the computer readable instructions are executed by the processor, the processor executes the steps of the code auditing method in the above embodiments.
The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and which may also be a volatile computer readable storage medium, having stored therein instructions, which, when executed on a computer, cause the computer to perform the steps of the code auditing method.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read only memory, a random access memory, a magnetic disk or an optical disk.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. The blockchain, which is essentially a decentralized database, is a string of data blocks associated by using a cryptographic method, each data block contains information of a batch of network transactions for verifying the validity (anti-counterfeiting) of the information and generating a next block, and the blockchain may include a blockchain bottom platform, a platform product service layer, an application service layer, and the like.

Claims (10)

1. A code auditing method, comprising:
acquiring a code audit specification set, and generating a code audit specification library according to the code audit specification set;
acquiring a code to be audited and scanning the code to be audited to obtain a code identifier;
acquiring a corresponding code audit specification from the code audit specification library through the code identifier;
and performing code auditing on the code to be audited through a preset code auditing model based on the code audit specification, so as to obtain a corresponding code auditing result.
2. The code auditing method of claim 1, wherein said acquiring a code audit specification set and generating a code audit specification library according to the code audit specification set comprises:
acquiring the audit specification set, wherein the audit specification set comprises at least two audit specifications;
analyzing the at least two audit specifications to determine a specification description file corresponding to each audit specification;
according to the specification description file corresponding to each audit specification, performing type division on each audit specification, and determining the specification type corresponding to each audit specification;
analyzing the storage positions of the at least two audit specifications according to the specification type corresponding to each audit specification, and determining the storage position corresponding to each audit specification;
and storing the at least two audit specifications according to the storage position corresponding to each audit specification to obtain a corresponding code audit specification library.
3. The code auditing method according to claim 2, wherein said performing type division on each audit specification according to the specification description file corresponding to each audit specification, and determining the specification type corresponding to each audit specification comprises:
inputting the specification description file corresponding to each audit specification and a preset standard semantic into a preset semantic analysis model to perform semantic similarity calculation, so as to obtain a semantic analysis result;
and analyzing each audit specification according to the semantic analysis result, and determining the specification type corresponding to each audit specification.
4. The code auditing method of claim 1, wherein said performing code auditing on the code to be audited through a preset code auditing model based on the code audit specification to obtain a corresponding code auditing result comprises:
inputting the code to be audited into a syntax analyzer in the code auditing model for lexical analysis, and generating a corresponding lexical unit array;
constructing an abstract syntax tree according to the lexical unit array to obtain a corresponding abstract syntax tree;
and performing code auditing according to the abstract syntax tree to obtain a corresponding code auditing result.
5. The code auditing method of claim 4, wherein said inputting the code to be audited into a syntax analyzer in the code auditing model for lexical analysis, and generating a corresponding lexical unit array comprises:
inputting the code to be audited into the syntax analyzer, and performing lexical unit decomposition on the code to be audited through the syntax analyzer to obtain a plurality of corresponding lexical units;
and performing sequence analysis on the plurality of lexical units to obtain a plurality of corresponding lexical unit sequences, and combining the plurality of lexical units according to the plurality of lexical unit sequences to generate a corresponding lexical unit array.
6. The code auditing method of claim 4, wherein said constructing an abstract syntax tree according to said lexical unit array to obtain a corresponding abstract syntax tree comprises:
performing syntactic analysis on the code to be audited according to the lexical unit array to obtain a corresponding syntactic analysis result;
constructing a syntax analysis tree based on the syntax analysis result and a preset code language specification to obtain a syntax analysis tree corresponding to the code to be audited;
and calling a node object creating function corresponding to each node in the syntax analysis tree to create the node, generating a plurality of created node objects and generating a corresponding abstract syntax tree according to the plurality of node objects.
7. The code auditing method according to any one of claims 1-6, wherein after said performing code auditing on the code to be audited through a preset code auditing model based on the code audit specification to obtain a corresponding code auditing result, the method further comprises:
acquiring the code auditing result, and extracting characteristic data of the code auditing result to obtain corresponding characteristic data;
inputting the characteristic data into a preset false alarm discrimination model to carry out false alarm analysis, and obtaining a corresponding analysis result;
and scanning the analysis result; if the analysis result indicates that a false alarm exists, determining corresponding false alarm information and transmitting the false alarm information and the code auditing result to an information prompt terminal; and if the analysis result indicates that no false alarm exists, transmitting the code auditing result to the information prompt terminal.
8. A code auditing apparatus, comprising:
the generating module is used for acquiring a code audit specification set and generating a code audit specification library according to the code audit specification set;
the scanning module is used for acquiring codes to be audited and scanning the codes to be audited to obtain code identifiers;
the acquisition module is used for acquiring the corresponding code audit specification from the code audit specification library through the code identifier;
and the auditing module is used for performing code auditing on the code to be audited through a preset code auditing model based on the code auditing specification to obtain a corresponding code auditing result.
9. A computer device, characterized in that the computer device comprises: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invoking the instructions in the memory to cause the computer device to perform the code auditing method of any one of claims 1-7.
10. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the code auditing method of any one of claims 1-7.
CN202210242202.1A 2022-03-11 2022-03-11 Code auditing method, device, equipment and storage medium Pending CN114611114A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210242202.1A CN114611114A (en) 2022-03-11 2022-03-11 Code auditing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210242202.1A CN114611114A (en) 2022-03-11 2022-03-11 Code auditing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114611114A true CN114611114A (en) 2022-06-10

Family

ID=81862527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210242202.1A Pending CN114611114A (en) 2022-03-11 2022-03-11 Code auditing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114611114A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination