CN108549538A - Code detection method, apparatus, storage medium, and test terminal - Google Patents
Code detection method, apparatus, storage medium, and test terminal
- Publication number
- CN108549538A; CN201810321498.XA
- Authority
- CN
- China
- Prior art keywords
- file
- code
- lexical unit
- unit sequence
- subcode
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3624—Software debugging by performing operations on the source code, e.g. via a compiler
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Stored Programmes (AREA)
Abstract
An embodiment of the invention discloses a code detection method, apparatus, storage medium, and test terminal. The embodiment obtains a code file to be detected; builds the global symbol table corresponding to the code file to be detected; obtains the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected; updates the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table; and determines the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table. The scheme performs global detection on the code file to be detected rather than only local detection on each subcode file in isolation, which improves the accuracy of code detection.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a code detection method, apparatus, storage medium, and test terminal.
Background
In software project development, once code has been written it usually has to undergo various kinds of detection to ensure properties such as usability and correctness. The resulting code detection result helps developers locate problems hidden in the code so that the code can be repaired accordingly.
In the prior art, the code file of a project generally contains multiple subcode files, and each subcode file in the code file is detected separately. For example, the code in one subcode file is analyzed to obtain the local information of that subcode file, local detection is performed according to that local information, and a local detection result for the subcode file is obtained. The code in another subcode file is then detected locally in the same way to obtain its local detection result, and so on, until all subcode files in the code file have been detected.
In researching and practicing the prior art, the inventors found that existing code detection schemes detect each subcode file in a code file independently and that the detection process is relatively simple, so the result obtained is only a set of local detection results, one for each subcode file. The accuracy of the detection result is therefore low.
Summary of the invention
Embodiments of the present invention provide a code detection method, apparatus, storage medium, and test terminal, intended to improve the accuracy of code detection.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
A code detection method, including:
Obtaining a code file to be detected;
Building the global symbol table corresponding to the code file to be detected;
Obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected;
Updating the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table;
Determining the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
A code detection apparatus, including:
A first acquisition unit, configured to obtain a code file to be detected;
A construction unit, configured to build the global symbol table corresponding to the code file to be detected;
A second acquisition unit, configured to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected;
An updating unit, configured to update the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table;
A determination unit, configured to determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
A storage medium storing a plurality of instructions, the instructions being suitable for loading by a processor to execute the steps of any code detection method provided by the embodiments of the present invention.
A test terminal, including at least one memory and at least one processor; the memory stores a program, and the processor calls the program to execute the steps of any code detection method provided by the embodiments of the present invention.
Embodiments of the present invention can build the global symbol table corresponding to the code file to be detected, which may include the global information of that code file, and obtain the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected, where the local symbol table may include the local information of the subcode file. The lexical unit sequence table is then updated according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table, and the detection result of the code file to be detected is determined according to the local symbol table, the global symbol table, and the updated lexical unit sequence table. Because the scheme updates the lexical unit sequence table using the global and local symbol tables, and derives a global detection result for the entire code file to be detected from the updated lexical unit sequence table together with both symbol tables, it performs global detection on the code file rather than only local detection on each subcode file in isolation, which improves the accuracy of code detection.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a scenario diagram of the code detection system provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of the code detection method provided by an embodiment of the present invention;
Fig. 3 is a partial schematic diagram of the lexical unit sequence table provided by an embodiment of the present invention;
Fig. 4 is a partial schematic diagram of the abstract syntax tree provided by an embodiment of the present invention;
Fig. 5 is a partial schematic diagram of the structure tree formed by the local symbol table of a subcode file provided by an embodiment of the present invention;
Fig. 6 is a partial schematic diagram of the structure tree formed by the local symbol table of another subcode file provided by an embodiment of the present invention;
Fig. 7 is a partial schematic diagram of the structure tree corresponding to the global symbol table provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of code detection provided by an embodiment of the present invention;
Fig. 9 is another flow diagram of the code detection method provided by an embodiment of the present invention;
Fig. 10 is a flow diagram of updating the lexical unit sequence table provided by an embodiment of the present invention;
Fig. 11 is a structural diagram of the code detection apparatus provided by an embodiment of the present invention;
Fig. 12 is another structural diagram of the code detection apparatus provided by an embodiment of the present invention;
Fig. 13 is another structural diagram of the code detection apparatus provided by an embodiment of the present invention;
Fig. 14 is another structural diagram of the code detection apparatus provided by an embodiment of the present invention;
Fig. 15 is a structural diagram of the terminal provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Embodiments of the present invention provide a code detection method, apparatus, storage medium, and test terminal.
Referring to Fig. 1, which is a scenario diagram of the code detection system provided by an embodiment of the present invention, the code detection system may include a code detection apparatus. The apparatus may be integrated in a terminal that has a storage unit, is equipped with a microprocessor, and has computing capability, such as a tablet computer, a laptop, or a desktop computer. It is mainly used to obtain the code file to be detected. The code file may be generated with a code programming software development tool, or obtained from a server: for example, the apparatus may send an acquisition request for the code file to the server and receive the code file that the server returns based on that request. The apparatus may also filter out invalid code content in the code file, such as redundant characters or comment content, to obtain the code file to be detected.
The apparatus then builds the global symbol table corresponding to the code file to be detected. The global symbol table may include information such as the classes, functions, and variables of all subcode files in the code file to be detected. For example, the apparatus may first obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, yielding a lexical unit sequence table set; then, according to each lexical unit sequence table in that set, build the local symbol table corresponding to each subcode file in the code file to be detected, yielding a local symbol table set; and then build the global symbol table according to the local symbol table set. A lexical unit sequence table may include the string value and attribute information of each lexical unit. A local symbol table may include information such as the classes, functions, and variables of a subcode file, and one local symbol table corresponds to one subcode file.
The apparatus may also obtain the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected. For example, it may take the lexical unit sequence table of each subcode file from the lexical unit sequence table set and the local symbol table of each subcode file from the local symbol table set; alternatively, it may perform lexical analysis on each subcode file in the code file to be detected to obtain the corresponding lexical unit sequence table, and build the corresponding local symbol table according to that lexical unit sequence table. After obtaining the lexical unit sequence table, local symbol table, and global symbol table, the apparatus may update the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table. Finally, it may determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table; and so on.
In addition, the code detection system may also include a server. The server may store the code files, lexical unit sequence tables, local symbol tables, global symbol tables, and so on uploaded by the terminal. The server may also receive an acquisition request for a code file sent by the terminal and send the code file to the terminal based on that request; or receive an acquisition request for a local symbol table sent by the terminal and send the local symbol table to the terminal based on that request; and so on.
It should be noted that the scenario diagram of the code detection system shown in Fig. 1 is only an example. The code detection system and scenario described in the embodiments of the present invention are intended to explain the technical solutions of the embodiments more clearly and do not limit those technical solutions. Those of ordinary skill in the art will appreciate that, as code detection systems evolve and new business scenarios emerge, the technical solutions provided by the embodiments of the present invention are equally applicable to similar technical problems.
Detailed descriptions are given below.
This embodiment is described from the perspective of the code detection apparatus, which may be integrated in a terminal that has a storage unit, is equipped with a microprocessor, and has computing capability, such as a tablet computer, a laptop, or a desktop computer.
A code detection method, including: obtaining a code file to be detected; building the global symbol table corresponding to the code file to be detected; obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected; updating the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table; and determining the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
Referring to Fig. 2, which is a flow diagram of the code detection method provided by an embodiment of the present invention, the code detection method may include:
In step S101, a code file to be detected is obtained.
Here, code may be a source file written by a developer in a language supported by a development tool; it is a definite system of rules that represents information in discrete form using characters, symbols, or signal elements. The code may be a source file written in C++, C, Java, or another language; the specific content is not limited here.
The code detection apparatus first obtains the code file to be detected. The code file to be detected may be the code file of a software project and may include one or more subcode files.
In some embodiments, the step of obtaining the code file to be detected may include:
(1) Obtaining a code file, and extracting invalid code content from the code file according to code rules.
(2) Filtering out the invalid code content to obtain the code file to be detected.
Specifically, the code detection apparatus may obtain the code file from a locally pre-stored code library, in which case the code file may have been generated in advance by the code detection apparatus using a code programming development tool. Alternatively, the apparatus may send a code file acquisition request to a server and receive the code file that the server returns based on that request; in this case the code file may have been uploaded to the server by the code detection apparatus or another terminal and stored by the server. It will be understood that the code file may also be obtained in other ways; the specific manner is not limited here.
After obtaining the code file, the code detection apparatus may preprocess it to filter out content that subsequent processing does not need. For example, it may filter out the parts unrelated to valid code so as to provide a normalized character stream for the subsequent lexical analysis. The preprocessing may include extracting invalid code content from the code file according to code rules. When the code file is a source file written in C++, the code rules may be the writing rules of the C++ language; when the code file is a source file written in C, the code rules may be the writing rules of the C language; when the code file is a source file written in Java, the code rules may be the writing rules of the Java language; and so on. The invalid code content may include comment content, preprocessing instructions, or other content; the specific content is not limited here.
After obtaining the invalid code content, the code detection apparatus may filter it out, that is, delete the invalid code content from the code file to obtain the code file to be detected. When the code file includes multiple subcode files, the apparatus may traverse the subcode files, extract the invalid code content from each one, and filter it out.
Optionally, the step of extracting invalid code content from the code file according to code rules may include:
(a) Extracting redundant characters from the code file according to the code rules;
(b) Extracting comment content from the code file according to the comment identifiers in the code rules;
(c) Extracting preprocessing instructions from the code file according to the preprocessing identifiers in the code rules;
(d) Setting the redundant characters, comment content, and preprocessing instructions as the invalid code content.
Specifically, the code detection apparatus may extract redundant characters from the code file according to the code rules. The redundant characters may include redundant spaces, redundant blank lines, redundant brackets, and the like. For example, when the code file is a source file written in C++, three redundant blank lines may be extracted from the code file according to the code rules of the C++ language. As another example, when the code file is a source file written in Java, five redundant spaces may be extracted from six consecutive spaces in the code file according to the code rules of the Java language; and so on.
The code detection apparatus may extract comment content from the code file according to the comment identifiers in the code rules. A comment identifier may be a double slash "//", or the pair "/*" and "*/", and so on. For example, when the code file is a source file written in C++, the apparatus may search the code file for the comment identifier "//" according to the code rules of the C++ language and extract the comment content from the line on which "//" appears; the comment content consists of "//" and everything after it on that line.
As another example, when the code file is a source file written in C++, the apparatus may, according to the code rules of the C++ language, search the code file for the opening comment identifier "/*" and the closing comment identifier "*/", and extract the comment content between "/*" and "*/", including the "/*" and "*/" themselves.
The code detection apparatus may extract preprocessing instructions from the code file according to the preprocessing identifiers in the code rules. A preprocessing identifier may include "#" and the like, and the preprocessing instructions may include #define, #if, #pragma, and so on. For example, when the code file is a source file written in C++, the apparatus may search the code file for the preprocessing identifier "#" according to the code rules of the C++ language and extract the preprocessing instruction from the line on which "#" appears; the preprocessing instruction consists of "#" and the content that follows it.
After obtaining the redundant characters, comment content, and preprocessing instructions, the code detection apparatus may set them as the invalid code content, thereby extracting the invalid code content from the code file according to the code rules.
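The filtering described above can be illustrated with a minimal sketch. It assumes C++-style rules, uses regular expressions rather than a real preprocessor, and does not protect string literals that happen to contain "//", "/*", or "#"; the function name filter_invalid_content and the sample input are hypothetical.

```python
import re

# Assumed patterns for C/C++-style sources. String literals containing
# "//", "/*" or "#" are NOT handled by this simplified sketch.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)               # /* ... */
LINE_COMMENT = re.compile(r"//[^\n]*")                            # // to end of line
PREPROCESSOR = re.compile(r"^[ \t]*#[^\n]*$", re.MULTILINE)       # lines starting with #
EXTRA_SPACES = re.compile(r"[ \t]{2,}")                           # runs of spaces/tabs


def filter_invalid_content(source: str) -> str:
    """Remove comments, preprocessing lines and redundant whitespace."""
    text = BLOCK_COMMENT.sub(" ", source)
    text = LINE_COMMENT.sub("", text)
    text = PREPROCESSOR.sub("", text)
    text = EXTRA_SPACES.sub(" ", text)
    # Drop blank lines and surrounding whitespace on each remaining line.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


sample = """#define LIMIT 42
/* loop over
   indices */
int  main() {      // entry point


    return 0;
}
"""
cleaned = filter_invalid_content(sample)
print(cleaned)
```

A production filter would more likely work on the token stream or use a real preprocessor, since regular expressions cannot tell a comment marker inside a string literal from a real comment.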
In step S102, the global symbol table corresponding to the code file to be detected is built.
After obtaining the code file to be detected, the code detection apparatus may build the corresponding global symbol table. The global symbol table may be a data structure that holds the symbols contained in each subcode file of the code file to be detected, such as classes, functions, and variables, together with their related information.
In some embodiments, the step of building the global symbol table corresponding to the code file to be detected may include:
(1) Obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, yielding a lexical unit sequence table set.
(2) Building, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, yielding a local symbol table set.
(3) Building the global symbol table according to the local symbol table set.
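Step (3) can be pictured with a minimal sketch that merges per-file local symbol tables into one global table. The dictionary layout (the keys "classes", "functions", and "variables"), the file names, and the function build_global_symbol_table are assumptions made for illustration; the text does not prescribe a concrete layout or merge rule.

```python
from collections import defaultdict


def build_global_symbol_table(local_tables):
    """Merge per-file local symbol tables into one global table.

    local_tables maps a subcode file name to a dict of the form
    {"classes": [...], "functions": [...], "variables": [...]}
    (a hypothetical layout chosen for this sketch).
    """
    global_table = {
        "classes": defaultdict(list),
        "functions": defaultdict(list),
        "variables": defaultdict(list),
    }
    for file_name, table in local_tables.items():
        for kind in global_table:
            for symbol in table.get(kind, []):
                # Record every file in which the symbol appears.
                global_table[kind][symbol].append(file_name)
    return global_table


local_tables = {
    "a.cpp": {"classes": ["Parser"], "functions": ["parse"], "variables": ["depth"]},
    "b.cpp": {"classes": [], "functions": ["main", "parse"], "variables": []},
}
global_table = build_global_symbol_table(local_tables)
print(dict(global_table["functions"]))
```

Keeping the list of files per symbol is one way to let a later check see that "parse" is visible from more than one subcode file, which is information a purely local analysis would not have.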
Specifically, the code detection apparatus may first obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, yielding a lexical unit sequence table set; for example, the lexical unit sequence table of each subcode file may be obtained through lexical analysis. A lexical unit sequence table may include multiple lexical units (also called morphemes or Tokens); for example, "if" and "for" are each a lexical unit. After lexical analysis, a subcode file yields the set of all its lexical units, which is the lexical unit sequence table of that subcode file (also called a TokenList). A section of code in a subcode file corresponds to one lexical unit section (that is, a Token section, also called a TokenSection), which is a sequence of one or more Tokens.
It should be noted that, to obtain the lexical unit sequence tables more efficiently, the code detection apparatus may start multiple threads and obtain the lexical unit sequence table of each subcode file in parallel across those threads, so that multiple lexical unit sequence tables are obtained quickly. The code detection apparatus may of course also obtain the lexical unit sequence tables serially; the specific manner is not limited here.
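The parallel variant can be sketched as follows, assuming a thread pool from the Python standard library and a deliberately simplified tokenizer; the names lex and lex_all and the sample sources are hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Simplified tokenizer for the sketch: a run of word characters,
# or any single non-space, non-word character.
TOKEN = re.compile(r"\w+|[^\w\s]")


def lex(source):
    """Return the list of token strings for one subcode file."""
    return TOKEN.findall(source)


def lex_all(sub_files, max_workers=4):
    """Tokenize every subcode file, one task per file, on a thread pool."""
    names = list(sub_files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        token_lists = list(pool.map(lex, (sub_files[name] for name in names)))
    return dict(zip(names, token_lists))


sub_files = {"a.cpp": "int x = 1;", "b.cpp": "x = x + 2;"}
token_list_set = lex_all(sub_files)
print(token_list_set)
```

In CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so this sketch shows only the structure of the approach; an implementation in C++ or one based on processes would be needed for a real speedup.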
In some embodiments, the step of obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain the lexical unit sequence table set, may include:
(1) Performing lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file.
(2) Standardizing each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain a standard code file set composed of standard subcode files.
(3) Obtaining the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, to obtain the lexical unit sequence table set.
Specifically, in the process of obtaining the lexical unit sequence table set, the code detection apparatus may first perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file. The lexical unit sequence tables of the subcode files make up the lexical unit sequence table set, which may include the lexical unit sequence tables of one or more subcode files.
Lexical analysis is the process of converting the character sequence in a subcode file into a sequence of lexical units. It mainly reads in the character stream output by preprocessing, groups the characters into morphemes, and generates and outputs a lexical unit sequence in which each lexical unit corresponds to one morpheme. The entire lexical unit sequence is the lexical unit sequence table, which is the basic data structure used by subsequent processing and by the upper-layer check items to traverse the code.
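As an illustration of this conversion, the following minimal sketch tokenizes a C++-style "for" header with a regular expression. The pattern covers only identifiers, integer literals, a few two-character operators, and single punctuation characters; it is an assumption made for the sketch, not the tokenizer of the embodiment.

```python
import re

TOKEN_PATTERN = re.compile(r"""
    [A-Za-z_]\w*            # identifiers and keywords
  | \d+                     # integer literals
  | \+\+|--|<=|>=|==|!=     # a few two-character operators
  | [^\s\w]                 # any other single punctuation character
""", re.VERBOSE)


def tokenize(code):
    """Split a code fragment into a flat list of token strings."""
    return TOKEN_PATTERN.findall(code)


tokens = tokenize("for (int index=0; index<42; ++index)")
print(len(tokens), tokens)
```

The two-character operators are listed before the single-character fallback so that "++" is emitted as one lexical unit rather than two "+" units.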
For example, performing lexical analysis on the code segment "for (int index=0; index<42; ++index)" yields a lexical unit sequence composed of 14 lexical units: "for", "(", "int", "index", "=", "0", ";", "index", "<", "42", ";", "++", "index", and ")".
In some embodiments, the step of performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file, may include:
(a) Obtaining the string value of each lexical unit in each subcode file in the code file to be detected.
(b) Obtaining the attribute information associated with each lexical unit.
(c) Generating a doubly linked list according to the string value and attribute information of each lexical unit, to obtain the lexical unit sequence table corresponding to each subcode file; these tables make up the lexical unit sequence table set.
In the process of obtaining a lexical unit sequence table, the code detection apparatus may obtain the string value of each lexical unit in each subcode file in the code file to be detected. For example, the string value of the lexical unit "for" consists of the three characters "f", "o", and "r", and that of the lexical unit "int" consists of the three characters "i", "n", and "t".
The lexical unit sequence table is the most basic unit on which code detection operates. Besides the string values of the lexical units, it may also include attribute information associated with each lexical unit, so the attribute information associated with each lexical unit may be obtained. The attribute information may include a pointer to the next lexical unit, a pointer to the previous lexical unit, the type of the lexical unit, the line number of the lexical unit, and so on.
After the string value and attribute information of each lexical unit in a subcode file are obtained, a doubly linked list may be generated from them to obtain the lexical unit sequence table of that subcode file; the lexical unit sequence tables of all subcode files make up the lexical unit sequence table set. A lexical unit sequence table is essentially a doubly linked list that maintains all the lexical units it contains.
For example, the lexical unit sequence table of the code segment "if(i>0)" can be represented as the doubly linked list shown in Fig. 3. An arrow may represent a pointer from the current lexical unit to its next lexical unit, for example the pointer from the lexical unit "i" to its next lexical unit ">"; or a pointer from the current lexical unit to its previous lexical unit, for example the pointer from the lexical unit "(" to its previous lexical unit "if"; or a pointer from the current lexical unit to the lexical unit paired with it, for example the pointer from the lexical unit "(" to its paired lexical unit ")"; and so on.
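The doubly linked list for "if(i>0)" can be sketched as below. The class name Token, its field names (prev, next, link), and the helper build_token_list are hypothetical; the sketch models only the three kinds of pointer named in the example: next, previous, and paired bracket.

```python
class Token:
    """One lexical unit in a doubly linked token list (illustrative fields)."""

    def __init__(self, text, line):
        self.text = text    # string value of the lexical unit
        self.line = line    # line number in the subcode file
        self.prev = None    # pointer to the previous lexical unit
        self.next = None    # pointer to the next lexical unit
        self.link = None    # pointer to the paired bracket, if any


def build_token_list(texts, line=1):
    """Chain token strings into a doubly linked list and pair up brackets."""
    closers = {")": "(", "]": "[", "}": "{"}
    head = tail = None
    open_brackets = []  # stack of unmatched opening brackets
    for text in texts:
        token = Token(text, line)
        if tail is None:
            head = token
        else:
            tail.next = token
            token.prev = tail
        tail = token
        if text in closers.values():
            open_brackets.append(token)
        elif text in closers and open_brackets and open_brackets[-1].text == closers[text]:
            opener = open_brackets.pop()
            opener.link = token
            token.link = opener
    return head


head = build_token_list(["if", "(", "i", ">", "0", ")"])
left_paren = head.next
print(left_paren.prev.text, left_paren.text, left_paren.link.text)
```

Storing the paired-bracket pointer on the token lets a check jump from "(" straight to the matching ")" without rescanning the list.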
Optionally, the step of obtaining the attribute information associated with each lexical unit may include:
Obtaining the pointers by which each lexical unit points to various information in each subcode file of the code file to be detected, to obtain pointer information; obtaining the characteristic information of each lexical unit; and setting the pointer information and the characteristic information of each lexical unit as the attribute information associated with that lexical unit.
Specifically, the code detection apparatus may obtain the pointers by which each lexical unit points to various information in each subcode file of the code file to be detected. The various information may include the next lexical unit of the current lexical unit, the lexical unit paired with the current lexical unit, the data-flow structure of the lexical unit, and so on.
For example, the apparatus may obtain a pointer to the next lexical unit of the current lexical unit, a pointer to the previous lexical unit of the current lexical unit, a pointer to the lexical unit paired with the current lexical unit (for a left bracket, this is the pointer to the matching right bracket), a pointer to the symbol table entry of the lexical unit (for example, a variable points to a variable object in the global symbol table or a local symbol table, and a function points to a function object in the global symbol table or a local symbol table), a syntax tree structure pointer of the lexical unit (which can be used to maintain the abstract syntax tree structure of the lexical unit), a data-flow structure pointer of the lexical unit, and so on. All of these are pointer information.
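One way to picture the symbol table pointer is a lookup that resolves an identifier against the local symbol table and then the global symbol table. The lookup order (local before global), the dictionary layout, and the function name resolve_symbol are assumptions made for this sketch; the text states only that the pointer may refer to an object in either table.

```python
def resolve_symbol(name, local_table, global_table):
    """Return (scope, entry) for name; the local table is tried first (assumed order)."""
    if name in local_table:
        return "local", local_table[name]
    if name in global_table:
        return "global", global_table[name]
    return None, None


# Hypothetical symbol table contents for one subcode file and the whole project.
local_table = {"index": {"kind": "variable", "type": "int"}}
global_table = {
    "counter": {"kind": "variable", "type": "int"},
    "main": {"kind": "function", "returns": "int"},
}

print(resolve_symbol("index", local_table, global_table))
print(resolve_symbol("main", local_table, global_table))
```

With access to the global table, an identifier that is not declared in the current subcode file can still be resolved, which a purely local analysis could not do.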
The characteristic information of each lexical unit also needs to be obtained. It may include the
line number of the lexical unit within the code file and the type of the lexical unit, where the
type may be number, character string, variable, function, keyword, and so on. For example, "1"
and "2" can be of numeric type, "main" can be of function type, and "index" and "i" can be of
variable type. The pointer information and the characteristic information together form the
attribute information associated with the lexical unit.
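The attribute information described above can be pictured as a node in a doubly linked list. The following is a minimal sketch, not the patent's implementation: the names Token, build_token_list and PAIRS are hypothetical, and only the next, previous, paired-unit and symbol-table pointers plus the line number and type are modelled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False, repr=False)
class Token:
    """One lexical unit with its attribute information (hypothetical layout)."""
    text: str                        # string value of the lexical unit
    kind: str                        # characteristic info: type of the unit
    line: int                        # characteristic info: line number
    next: Optional["Token"] = None   # pointer to the next lexical unit
    prev: Optional["Token"] = None   # pointer to the previous lexical unit
    link: Optional["Token"] = None   # pointer to the paired unit, e.g. "(" -> ")"
    symbol: Optional[object] = None  # pointer into a symbol table, filled in later

PAIRS = {"(": ")", "{": "}", "[": "]"}

def build_token_list(items):
    """Build a doubly linked list from (text, kind, line) tuples; return its head."""
    head = prev = None
    open_stack = []
    for text, kind, line in items:
        tok = Token(text, kind, line)
        if prev is None:
            head = tok
        else:
            prev.next, tok.prev = tok, prev
        if text in PAIRS:
            open_stack.append(tok)
        elif open_stack and PAIRS[open_stack[-1].text] == text:
            # Connect the closing unit with its opening partner in both directions.
            opener = open_stack.pop()
            opener.link, tok.link = tok, opener
        prev = tok
    return head
```

In this sketch the symbol pointer stays empty until a later linking pass fills it in.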
In some embodiments, the step of standardizing each subcode file in the code file to be detected
according to the lexical unit sequence table corresponding to each subcode file, to obtain the
standard code file set composed of the standard subcode files, may include:
(a) obtaining the standard code format according to the lexical unit sequence table corresponding
to each subcode file and the code standard logical format;
(b) searching each subcode file in the code file to be detected for an object code format, that
is, a format that does not match the standard code format;
(c) modifying the object code format according to the standard code format, to obtain the
standard code file set composed of the standard subcode files.
Because the code of different projects may be written by different developers whose coding styles
differ, code in practice is highly varied. To make building the global symbol table more
efficient and code detection more accurate, the code files can therefore be standardized once the
lexical unit sequence tables have been obtained, for example by applying a number of
simplification steps that unify the code style. None of these simplification steps may change the
code logic; they only perform logically equivalent replacements.
Specifically, after obtaining the lexical unit sequence table corresponding to each subcode file,
the code detecting apparatus can obtain the standard code format according to these lexical unit
sequence tables and the code standard logical format. For example, the logical format of a
function implementation can be that it starts with a left brace and ends with a right brace.
When the code file is a source file written in C++, the code standard logical format can be the
logical format for C++; when the code file is a source file written in Java, it can be the
logical format for Java.
Then, each subcode file in the code file to be detected is searched for an object code format
that does not match the standard code format, and the object code format is modified according to
the standard code format to obtain each standard subcode file; for example, the object code
format can be replaced with the standard code format. The standard subcode files form the
standard code file set. In this way the code file undergoes a logically equivalent replacement
that normalizes its code format.
For example, take the standardization of a conditional expression. Before standardization, a
conditional statement in some subcode file omits the braces, and the subcode file can be as follows:
According to the lexical unit sequence table of the subcode file, the code detecting apparatus can
locate the conditional expression if (i>0) and determine, from the code standard logical format,
that the standard code format requires braces after a conditional expression. It can then find in
the subcode file the object code format that does not match the standard code format, namely that
no braces follow the conditional expression if (i>0). The object code format is modified
according to the standard code format by adding braces after if (i>0), which standardizes the
subcode file and yields the standard subcode file, as follows:
After obtaining the standard code file set, the code detecting apparatus can obtain the lexical
unit sequence table corresponding to each standard subcode file in the set. For example, the
characters added by standardization (such as braces) can be inserted into the lexical unit
sequence table that the subcode file had before standardization; alternatively, the characters
deleted by standardization can be removed from that lexical unit sequence table. Either way the
result is the lexical unit sequence table of the standard subcode file. The lexical unit sequence
tables of all standard subcode files form the lexical unit sequence table collection.
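The brace-insertion example can be sketched directly on a lexical unit sequence. This is an illustrative sketch under simplifying assumptions: the sequence is a plain Python list of strings rather than a linked table, the hypothetical function add_braces_after_if only handles a body that is a single statement ending in ";", and nested brace-less statements are not treated.

```python
def add_braces_after_if(tokens):
    """Return a new token list where `if (...) stmt;` becomes `if (...) { stmt; }`."""
    out = []
    i = 0
    while i < len(tokens):
        out.append(tokens[i])
        if tokens[i] == "if" and i + 1 < len(tokens) and tokens[i + 1] == "(":
            # Copy the parenthesised condition, tracking the nesting depth.
            depth = 0
            j = i + 1
            while j < len(tokens):
                out.append(tokens[j])
                if tokens[j] == "(":
                    depth += 1
                elif tokens[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            i = j + 1
            if i < len(tokens) and tokens[i] != "{":
                # Wrap the single statement, up to and including its ';'.
                out.append("{")
                while i < len(tokens):
                    out.append(tokens[i])
                    i += 1
                    if out[-1] == ";":
                        break
                out.append("}")
            continue
        i += 1
    return out
```

The function only inserts tokens and never reorders or drops any, which keeps the rewrite logically equivalent as the text requires.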
After the lexical unit sequence table collection is obtained, the code detecting apparatus can
build, from each lexical unit sequence table in the collection, the local symbol table
corresponding to each subcode file in the code file to be detected, obtaining the local symbol
table collection. For example, it can traverse a lexical unit sequence table and extract the
classes, functions and variables of the subcode file together with their key features. From this
information it can build a class list, a function list and a variable list, and from those lists
the local symbol table of the subcode file. The local symbol tables of all subcode files form the
local symbol table collection.
In some embodiments, the step of building, from each lexical unit sequence table in the lexical
unit sequence table collection, the local symbol table corresponding to each subcode file in the
code file to be detected, to obtain the local symbol table collection, may include:
(1) building, from each lexical unit sequence table in the collection, the abstract syntax tree
corresponding to each subcode file in the code file to be detected;
(2) building, from the abstract syntax tree, the local symbol table corresponding to each subcode
file in the code file to be detected, obtaining the local symbol table collection.
Specifically, the code detecting apparatus can build, from each lexical unit sequence table in the
collection, the abstract syntax tree (Abstract Syntax Tree, AST) corresponding to each subcode
file in the code file to be detected. An abstract syntax tree is a tree representation of the
abstract syntactic structure of code. It can be a binary tree in which each non-leaf node
represents an operator and the two child nodes of that node represent the two operands of the
operator. The abstract syntax tree structure captures the logical structure of an expression and
the precedence of its operators, which can improve both the accuracy of matching code scenarios
and the efficiency of implementing them.
It should be noted that the abstract syntax tree in the embodiments of the present invention does
not establish logical relationships between code expressions. For example, it does not establish
the relationship between the if statement and the else statement of an if-else statement; an
abstract syntax structure is built only for each single code expression, and no structural
relationship between expressions is recorded. If structural relationships between expressions
were established, a single syntax error in the input code file would make the whole global
abstract syntax tree structure wrong and useless as a reference. The present invention therefore
accepts incomplete code files, or code files that do not compile, as input and builds an abstract
syntax tree structure per expression. If one expression contains an error, only the abstract
syntax tree structure of that expression is affected, not those of other expressions.
For example, consider the following line of code:
String::Format("demo:%d%s%d", Func(1,2), "AST", 1+2*3);
The structure of the abstract syntax tree finally built for this code can be as shown in Figure 4.
It includes parameter 1: "demo:%d%s%d", parameter 2: Func(1,2), parameter 3: "AST", and
parameter 4: 1+2*3. For example, the non-leaf node "*" represents an operator whose two child
nodes "2" and "3" are its two operands, i.e. 2*3; the two child nodes of the operator "+" are "1"
and "*", which gives 1+2*3.
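How operator precedence ends up in the shape of such a binary tree can be shown with a small precedence-climbing parser. The patent does not say which parsing technique it uses, so this is a sketch of one common approach; Node, PRECEDENCE and parse_expression are hypothetical names, and only bare operands and four binary operators are supported.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    value: str                      # operator for non-leaf nodes, operand for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def parse_expression(tokens, min_prec=1):
    """Precedence climbing; consumes tokens from the front of the list."""
    left = Node(tokens.pop(0))      # an operand
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # The right operand may only absorb operators that bind tighter than op.
        right = parse_expression(tokens, PRECEDENCE[op] + 1)
        left = Node(op, left, right)
    return left

tree = parse_expression(["1", "+", "2", "*", "3"])
```

For the fourth argument of the example, the root of the resulting tree is "+", with the leaf "1" on the left and the "*" subtree (children "2" and "3") on the right.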
After the abstract syntax tree is obtained, the local symbol table corresponding to each subcode
file in the code file to be detected can be built from it. The local symbol tables of all subcode
files form the local symbol table collection.
In some embodiments, the step of building, from the abstract syntax tree, the local symbol table
corresponding to each subcode file in the code file to be detected, to obtain the local symbol
table collection, may include:
(a) obtaining, from the abstract syntax tree, the class list, function list and variable list
corresponding to each subcode file in the code file to be detected;
(b) building, from the class list, function list and variable list of each subcode file, the
local symbol table corresponding to that subcode file, obtaining the local symbol table collection.
Specifically, using the tree structure of the abstract syntax tree, the code detecting apparatus
can quickly extract the classes, functions and variables of a subcode file together with their
key features. From this information it can build the class list, function list and variable list
of each subcode file in the code file to be detected, and from these lists the local symbol table
of each subcode file. The local symbol tables of all subcode files form the local symbol table
collection. The local symbol table may also be called SymbolDatabase; it can be regarded as the
symbolization result object of the subcode file.
For example, for source files written in C++, once the lexical unit sequence table of each .cpp
file (including the .h files expanded into it) has been obtained, a corresponding local symbol
table can be built from that lexical unit sequence table. Each local symbol table can contain the
following three kinds of data: (1) a type list (also called Type List), which records all types
in the code file, such as class, struct or namespace, and may include each type name and the key
features of each type; (2) a function list (also called Function List), which records all
functions in the code file and may include each function name and the key features of each
function (for example, its return value); (3) a variable list (also called Variable List), which
records all variables in the code file and may include each variable name and the key features of
each variable.
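The three lists can be modelled as three name-indexed maps. The sketch below is only one possible layout: Symbol, SymbolTable and the defined and features fields are hypothetical names, and the patent does not prescribe how key features are stored.

```python
from dataclasses import dataclass, field

@dataclass
class Symbol:
    name: str                  # qualified name, e.g. "CDemo2::Func"
    kind: str                  # "type", "function" or "variable"
    defined: bool = True       # False when only a declaration was seen
    features: dict = field(default_factory=dict)   # key features, e.g. return value

@dataclass
class SymbolTable:
    types: dict = field(default_factory=dict)      # Type List
    functions: dict = field(default_factory=dict)  # Function List
    variables: dict = field(default_factory=dict)  # Variable List

    def add(self, sym):
        """File the symbol under the list that matches its kind."""
        bucket = {"type": self.types,
                  "function": self.functions,
                  "variable": self.variables}[sym.kind]
        bucket[sym.name] = sym

table = SymbolTable()
table.add(Symbol("CDemo1", "type"))
table.add(Symbol("CDemo2::Func", "function", defined=False))
table.add(Symbol("global_var1", "variable"))
```

Recording whether a symbol is defined or merely declared lets a later merge step prefer the defined entry.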
An illustration follows. For example, the demo.cpp code file may include the subcode files
demo1.cpp, demo.h and demo2.cpp, and the code content of the demo1.cpp subcode file can be as follows:
The demo1.cpp subcode file may include the demo.h subcode file, and the code content of the
demo.h subcode file can be as follows:
When the demo1.cpp subcode file is scanned, a local symbol table similar to the following can be obtained:
The local symbol table of the demo1.cpp subcode file shows that a symbol is missing from
demo1.cpp: the definition of CDemo2::Func is not found. In fact, CDemo2::Func is defined in the
demo2.cpp subcode file, whose code content can be as follows:
When the demo2.cpp subcode file is scanned, a local symbol table similar to the following can be obtained:
The local symbol table of the demo2.cpp subcode file is simpler than that of the demo1.cpp
subcode file. Its most important piece of information is the definition of the function
CDemo2::Func, which is exactly what is missing from the local symbol table of demo1.cpp. This is
the main problem that arises when each subcode file is analyzed in isolation: there is no
cross-file symbol lookup capability.
Therefore, after the local symbol table collection formed by the local symbol tables of the
subcode files has been obtained, the global symbol table can be built from it. Symbols (i.e. type
parameters) such as classes, functions or variables that are missing from some subcode file can
then be found in the global symbol table, which provides the cross-file symbol lookup capability.
In some embodiments, the step of building the global symbol table from the local symbol table
collection may include:
(1) merging the local symbol tables in the local symbol table collection to obtain a merged symbol table;
(2) for identical symbol parameters in the merged symbol table, retaining one of them and deleting
the others, to obtain the global symbol table.
Specifically, the code detecting apparatus can merge the local symbol tables in the local symbol
table collection to obtain a merged symbol table. The merged symbol table may contain identical
symbol parameters, where a symbol parameter may be a class, a function, a variable, and so on.
For identical symbol parameters, one of them can be retained and the others deleted, which yields
the global symbol table. Alternatively, when the merged symbol table contains two identical
symbol parameters of which one is undefined and the other is defined, the defined symbol
parameter can be retained and the undefined one deleted during merging.
For example, local symbol tables have been built separately for the demo1.cpp and demo2.cpp
subcode files, but the definition of the function CDemo2::Func is missing from demo1.cpp; it is
found in the local symbol table of demo2.cpp. To solve this missing-symbol problem, the global
symbol table can be built from the local symbol tables. The key to building the global symbol
table is handling the lookup of symbols such as classes, functions and variables, and the merge
logic to apply when symbols conflict.
To illustrate the structure of the global symbol table visually, symbol structure trees are used.
The symbol structure tree corresponding to the local symbol table of the demo1.cpp subcode file
can be as shown in Figure 5. As Figure 5 shows, the local symbol table of demo1.cpp may include
classes such as CObject, CDemo1 and CDemo2, as well as Func functions and the global_var1
variable. Func is defined in class CDemo1, while the definition of Func in class CDemo2 is missing.
The symbol structure tree corresponding to the local symbol table of the demo2.cpp subcode file
can be as shown in Figure 6. As Figure 6 shows, the local symbol table of the demo2.cpp subcode
file may include classes such as CObject and CDemo2 and the global_var2 variable, and Func is
defined in class CDemo2.
The global symbol table built from the local symbol tables of the demo1.cpp and demo2.cpp subcode
files can be as shown in Figure 7. As Figure 7 shows, the global symbol table may include classes
such as CObject, CDemo1 and CDemo2, the Func functions, and variables such as global_var1 and
global_var2, with Func defined in both class CDemo1 and class CDemo2. In other words, after the
symbol structure trees of demo1.cpp and demo2.cpp are merged, the definition of the function
CDemo2::Func and the global_var2 variable from demo2.cpp have been merged into the symbol
structure tree of demo1.cpp.
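The merge illustrated by Figures 5 to 7 can be sketched as follows. This is a minimal sketch under assumptions: each local symbol table is a plain dict keyed by qualified name, the defined and file fields are hypothetical, and the only conflict rule modelled is the one the text states (a defined symbol wins over an undefined duplicate; otherwise the first occurrence is kept).

```python
def merge_symbol_tables(local_tables):
    """Merge local symbol tables (qualified name -> symbol dict) into a global one."""
    merged = {}
    for table in local_tables:
        for name, sym in table.items():
            current = merged.get(name)
            # Keep the first occurrence, unless it is undefined and this one is defined.
            if current is None or (not current["defined"] and sym["defined"]):
                merged[name] = sym
    return merged

demo1 = {
    "CDemo1::Func": {"kind": "function", "defined": True, "file": "demo1.cpp"},
    "CDemo2::Func": {"kind": "function", "defined": False, "file": "demo1.cpp"},
    "global_var1": {"kind": "variable", "defined": True, "file": "demo1.cpp"},
}
demo2 = {
    "CDemo2::Func": {"kind": "function", "defined": True, "file": "demo2.cpp"},
    "global_var2": {"kind": "variable", "defined": True, "file": "demo2.cpp"},
}
global_table = merge_symbol_tables([demo1, demo2])
```

After the merge, the entry for CDemo2::Func is the defined one from demo2.cpp, regardless of the order in which the two tables are passed.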
In step S103, the lexical unit sequence table and the local symbol table corresponding to each
subcode file in the code file to be detected are obtained.
To make the code easier to detect, the code detecting apparatus also needs to obtain the lexical
unit sequence table and the local symbol table corresponding to each subcode file in the code
file to be detected.
In some embodiments, the step of obtaining the lexical unit sequence table and the local symbol
table corresponding to each subcode file in the code file to be detected may include:
(1) when the lexical unit sequence table collection and the local symbol table collection have
been stored, obtaining from the lexical unit sequence table collection the lexical unit sequence
table corresponding to each subcode file in the code file to be detected;
(2) obtaining from the local symbol table collection the local symbol table corresponding to each
subcode file in the code file to be detected.
Specifically, while obtaining the global symbol table as described above, the code detecting
apparatus already has to obtain the lexical unit sequence table and the local symbol table
corresponding to each subcode file in the code file to be detected. Therefore, after obtaining
the lexical unit sequence table collection and the local symbol table collection, the code
detecting apparatus can store both collections on the local hard disk, or upload them to a server
so that the server stores them, and so on.
In this case, when obtaining the lexical unit sequence table and local symbol table of each
subcode file, the code detecting apparatus can check whether the local hard disk or the server
holds the lexical unit sequence table collection. If it does, the lexical unit sequence table
corresponding to each subcode file in the code file to be detected can be taken directly from
that collection. Likewise, the apparatus checks whether the local hard disk or the server holds
the local symbol table collection. If it does, the local symbol table corresponding to each
subcode file in the code file to be detected can be taken directly from that collection.
In some embodiments, the step of obtaining the lexical unit sequence table and the local symbol
table corresponding to each subcode file in the code file to be detected may include:
(1) when the lexical unit sequence table collection and the local symbol table collection have not
been stored, performing lexical analysis on each subcode file in the code file to be detected to
obtain the lexical unit sequence table corresponding to each subcode file;
(2) building, from the lexical unit sequence table, the abstract syntax tree corresponding to each subcode file;
(3) building, from the abstract syntax tree, the local symbol table corresponding to each subcode file.
The code detecting apparatus may also choose not to store the lexical unit sequence table
collection and the local symbol table collection obtained while building the global symbol table.
In that case, when obtaining the lexical unit sequence table and local symbol table of each
subcode file, it can check whether the local hard disk or the server holds the lexical unit
sequence table collection; when the collection has not been stored, the code detecting apparatus
needs to obtain the lexical unit sequence table and local symbol table of each subcode file again.
Specifically, the code detecting apparatus can perform lexical analysis on each subcode file in
the code file to be detected to obtain the lexical unit sequence table corresponding to each
subcode file. For example, it can obtain the string value of each lexical unit in each subcode
file and the attribute information associated with each lexical unit, and then generate a doubly
linked list from the string values and attribute information of the lexical units, which gives
the lexical unit sequence table corresponding to each subcode file.
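The lexical analysis step can be approximated with a regular-expression scanner. The following is a rough sketch, not the lexer the patent uses: TOKEN_RE, KEYWORDS and tokenize are hypothetical, the keyword set is a small illustrative subset, comments and preprocessor lines are not handled, and names are tagged "identifier" because telling variables from functions needs the later symbol-table passes.

```python
import re

# Alternatives are tried in order; each match is one lexical unit.
TOKEN_RE = re.compile(r'''
    (?P<number>\d+(?:\.\d+)?)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<name>[A-Za-z_]\w*)
  | (?P<symbol>::|->|[-+*/%=<>!&|]=?|[{}()\[\];,.:])
''', re.VERBOSE)

KEYWORDS = {"if", "else", "for", "while", "return", "int", "void", "class"}

def tokenize(source):
    """Return a list of (string value, type, line number) tuples."""
    tokens = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            kind = match.lastgroup
            if kind == "name":
                kind = "keyword" if match.group() in KEYWORDS else "identifier"
            tokens.append((match.group(), kind, line_no))
    return tokens

tokens = tokenize("int i = 1;\nif (i>0) i = 2;")
```

Each tuple carries the string value plus two pieces of characteristic information (type and line number); turning the list into a doubly linked list is a separate step.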
The code detecting apparatus can also standardize each subcode file in the code file to be
detected according to the lexical unit sequence table of each subcode file, obtaining each
standard subcode file. For example, it can obtain the standard code format according to the
lexical unit sequence table of each subcode file and the code standard logical format; search
each subcode file in the code file to be detected for an object code format that does not match
the standard code format; and modify the object code format according to the standard code
format, obtaining the standard subcode file corresponding to each subcode file.
Then the lexical unit sequence table corresponding to each standard subcode file is obtained, and
it is set as the lexical unit sequence table corresponding to the respective subcode file. From
this lexical unit sequence table, the abstract syntax tree corresponding to each subcode file can
be built, and from the abstract syntax tree the local symbol table corresponding to each subcode
file. For example, the class list, function list and variable list of each subcode file in the
code file to be detected can be obtained from the abstract syntax tree, and the local symbol
table corresponding to each subcode file can be built from these lists.
In step S104, the lexical unit sequence table is updated according to the local symbol table and
the global symbol table, obtaining the updated lexical unit sequence table.
In step S105, the detection result of the code file to be detected is determined according to the
local symbol table, the global symbol table and the updated lexical unit sequence table.
Once the global symbol table and the local symbol table and lexical unit sequence table of each
subcode file in the code file to be detected have been obtained, the code detecting apparatus can
update the lexical unit sequence table according to the local symbol table and the global symbol
table, obtaining the updated lexical unit sequence table. For example, the code detecting
apparatus can traverse the lexical unit sequence table and look up and link the classes,
functions and variables in it. Based on the global symbol table built above and a
context-sensitive symbol lookup algorithm, this provides symbol lookup across files or across
modules and so yields a more accurate code detection result.
The purpose of looking up and linking symbols such as classes, functions and variables is to
associate each class, function or variable in the traversed lexical unit sequence table with the
local or global symbol table in which it resides. When the lexical unit sequence table is later
traversed, the information related to each class, function or variable is then directly
available, which can significantly improve the efficiency of scanning code check items.
In some embodiments, the steps of updating the lexical unit sequence table according to the local
symbol table and the global symbol table to obtain the updated lexical unit sequence table, and of
determining the detection result of the code file to be detected according to the local symbol
table, the global symbol table and the updated lexical unit sequence table, may include:
(1) obtaining a subcode file from the code file to be detected as the current subcode file;
(2) obtaining the lexical unit sequence table and the local symbol table corresponding to the
current subcode file;
(3) updating the lexical unit sequence table of the current subcode file according to the local
symbol table and the global symbol table, obtaining the updated lexical unit sequence table;
(4) determining the detection result of the current subcode file according to the local symbol
table, the global symbol table and the updated lexical unit sequence table;
(5) returning to the step of obtaining a subcode file from the code file to be detected as the
current subcode file, until all subcode files in the code file to be detected have been detected,
thereby obtaining the detection result of the code file to be detected.
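Steps (1) to (5) amount to a loop over the subcode files. The sketch below shows only that control flow; detect_code_file and its three callable parameters (load_tables, link_symbols, run_checks) are hypothetical placeholders for the operations the text describes, not functions defined by the patent.

```python
def detect_code_file(subcode_files, global_table, load_tables, link_symbols, run_checks):
    """Run the per-file detection loop and collect all findings."""
    results = []
    for path in subcode_files:                                    # steps (1) and (5)
        tokens, local_table = load_tables(path)                   # step (2)
        tokens = link_symbols(tokens, local_table, global_table)  # step (3)
        results.extend(run_checks(path, tokens, local_table, global_table))  # step (4)
    return results
```

Passing the operations in as callables keeps the loop independent of whether the tables come from a stored collection or are rebuilt by lexical analysis.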
The code detecting apparatus can update the lexical unit sequence table by means of a
context-sensitive symbol lookup algorithm. With this algorithm it can traverse the lexical unit
sequence table of each subcode file in the code file to be detected, obtain the type parameters
(classes, functions, variables and so on) in the lexical unit sequence table, and link each type
parameter by pointer to the corresponding type parameter in the local symbol table or the global
symbol table, thereby updating the lexical unit sequence table. A context-sensitive symbol lookup
algorithm built on the global symbol table obtained above is an important guarantee for the
correctness of the code detection result.
Specifically, the code detecting apparatus can obtain a subcode file from the code file to be
detected as the current subcode file, then obtain the lexical unit sequence table of the current
subcode file from the lexical unit sequence table collection and its local symbol table from the
local symbol table collection. Alternatively, it can perform lexical analysis on the current
subcode file to obtain its lexical unit sequence table and build the corresponding local symbol
table from that lexical unit sequence table, and so on.
After obtaining the lexical unit sequence table and the local symbol table of the current subcode
file, the code detecting apparatus can update the lexical unit sequence table of the current
subcode file according to the local symbol table and the global symbol table, obtaining the
updated lexical unit sequence table.
In some embodiments, the step of updating the lexical unit sequence table according to the local
symbol table and the global symbol table to obtain the updated lexical unit sequence table may include:
(a) obtaining a type parameter in the lexical unit sequence table, the type parameter including a
type name and a qualified name;
(b) when the type name of the type parameter is not a system type name and does not exist in the
local symbol table, searching the global symbol table for the type name of the type parameter;
(c) when the type name of the type parameter exists in the global symbol table, matching the
qualified name of the type parameter in the lexical unit sequence table against the qualified
name of the type parameter in the global symbol table;
(d) if the match succeeds, linking by pointer the type parameter in the lexical unit sequence
table to the type parameter in the global symbol table, obtaining the updated lexical unit
sequence table.
Specifically, the code detecting apparatus can traverse the lexical unit sequence table of the
current subcode file and obtain the type parameters in it; there may be one or more type
parameters. A type parameter may be a class, a function, a variable, and so on, and consists of a
type name and a qualified name, where the qualified name may have one or more parts. For example,
for the type parameter A::B::C, the type name is C and the qualified name is A::B.
After obtaining a type parameter, the code detecting apparatus can extract its type name and
determine whether that type name is a system type name. System type names may include the type
names stored in the data that ships with the code compilation system, for example main. When the
type name of the type parameter is a system type name, there is no need to keep searching for it
in the local symbol table or the global symbol table; the lookup flow ends and the pointer from
the type parameter to the symbol table is returned as null.
When the type name of the type parameter is not a system type name, the apparatus can further
determine whether the type name exists in the local symbol table. When it does not, the type name
is searched for in the global symbol table. When the type name exists in the global symbol table,
the qualified name of the type parameter in the lexical unit sequence table is matched against
the qualified name of the type parameter in the global symbol table. If the two qualified names
match, the type parameter in the lexical unit sequence table is linked by pointer to the type
parameter in the global symbol table; for example, the symbol table pointer of the type parameter
in the lexical unit sequence table can be set to point to the type parameter in the global symbol
table. This yields the updated lexical unit sequence table.
In some embodiments, after the step of obtaining the type parameter in the lexical unit sequence
table, the code detection method further includes:
(e) when the type name of the type parameter is not a system type name and exists in the local
symbol table, matching the qualified name of the type parameter in the lexical unit sequence
table against the qualified name of the type parameter in the local symbol table;
(f) if the match succeeds, linking by pointer the type parameter in the lexical unit sequence
table to the type parameter in the local symbol table, obtaining the updated lexical unit
sequence table.
When the type name of the type parameter is not a system type name, the code detecting apparatus
can further determine whether the type name exists in the local symbol table. When it does, the
qualified name of the type parameter in the lexical unit sequence table is matched against the
qualified name of the type parameter in the local symbol table. If the two qualified names match,
the type parameter in the lexical unit sequence table is linked by pointer to the type parameter
in the local symbol table; for example, the symbol table pointer of the type parameter in the
lexical unit sequence table can be set to point to the type parameter in the local symbol table.
This yields the updated lexical unit sequence table.
After the lexical unit sequence table of the current subcode file has been updated, the code detection device may determine the detection result of the current subcode file according to the global symbol table, the local symbol table of the current subcode file and the updated lexical unit sequence table. For example, a code check item scan may be performed: based on the global symbol table that has been built and the symbol linking results, the code is scanned for each erroneous-code scenario. That is, the updated lexical unit sequence table is traversed to find the classes, functions, variables and so on that it contains; for each class, function or variable found, the local symbol table and global symbol table in which it is located are consulted, and the detection result of the current subcode file is extracted from the key features stored in the local symbol table or the global symbol table. The code detection device may also output the code error information in a certain format; the specific output format can be set flexibly according to actual needs and is not limited here.
An example follows. As shown in Fig. 8, take the above demo1.cpp and demo2.cpp subcode files. By looking up and associating classes, functions and variables through the global symbol table, problems that cannot be found from a single local symbol table can be detected. In the detection result, the demo.Func function call in the demo1.cpp subcode file is correctly associated with the CDemo2::Func function in the demo2.cpp subcode file. From the global symbol table it is known, as a key feature, that the return value is the null pointer NULL when type equals 1; an error can therefore be reported for the dereference of the null pointer p at line 19 of the demo1.cpp subcode file.
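The check described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the structures `FunctionFeature` and `CallSite`, the function `scanNullDereference` and the report format are hypothetical and are not taken from the patent; they merely show how a key feature cached in the global symbol table ("may return NULL") could be combined with a linked call site to report a dereference.

```cpp
#include <map>
#include <string>
#include <vector>

// Key feature cached for a function in the global symbol table (hypothetical layout).
struct FunctionFeature {
    bool mayReturnNull;
};

// A linked call whose result is used in the scanned file (hypothetical layout).
struct CallSite {
    std::string file;       // subcode file containing the call
    std::string callee;     // qualified name after linking, e.g. "CDemo2::Func"
    std::string resultVar;  // variable that receives the return value
    int derefLine;          // line of an unchecked dereference, 0 if there is none
};

// Reports every call site whose callee may return NULL and whose result is dereferenced.
std::vector<std::string> scanNullDereference(
        const std::map<std::string, FunctionFeature>& globalTable,
        const std::vector<CallSite>& calls) {
    std::vector<std::string> reports;
    for (const CallSite& call : calls) {
        auto it = globalTable.find(call.callee);
        if (it == globalTable.end()) continue;  // unresolved symbol: nothing to check
        if (it->second.mayReturnNull && call.derefLine > 0) {
            reports.push_back(call.file + ":" + std::to_string(call.derefLine) +
                              ": possible null pointer dereference of '" +
                              call.resultVar + "'");
        }
    }
    return reports;
}
```

With a global table entry marking `CDemo2::Func` as possibly returning NULL and a call site in demo1.cpp dereferenced at line 19, the sketch produces a single report for that line.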
After the lexical unit sequence table of the current subcode file has been updated and the detection result of the current subcode file has been determined, another subcode file may be obtained from the code file to be detected as the current subcode file. That is, the process returns to the step of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been examined and the detection result of the code file to be detected is obtained.
As can be seen from the above, the embodiment of the present invention can build a global symbol table corresponding to the code file to be detected, which may contain the global information of the code file to be detected, and can obtain the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected, where the local symbol table may contain the local information of the subcode file. The lexical unit sequence table is then updated according to the local symbol table and the global symbol table to obtain the updated lexical unit sequence table, and the detection result of the code file to be detected is determined according to the local symbol table, the global symbol table and the updated lexical unit sequence table. Because the lexical unit sequence table is updated using both the global symbol table and the local symbol table, and the detection result for the whole code file is obtained from the updated lexical unit sequence table together with both symbol tables, the scheme achieves global detection of the code file to be detected rather than only local detection of individual subcode files, which improves the accuracy of code detection.
The method described in the above embodiments is described in further detail below by way of example.
In this embodiment, the code detection device is a terminal and the code file is a source file written in C++. The code files of a C++ project are taken as input, and a first scan is performed on the C++ code: lexical analysis is performed on each subcode file in the C++ code file to obtain a lexical unit sequence table, a local symbol table corresponding to each subcode file is built from the lexical unit sequence table, and a global symbol table is then built from the local symbol tables of the subcode files, so that after the first scan the key features related to variables, functions and classes are cached in the global symbol table. A second scan is then performed on the C++ code: the lexical unit sequence table and local symbol table of each subcode file are obtained, and the lexical unit sequence table of each subcode file is updated according to the global symbol table and the local symbol table, to establish links for variable uses, function calls and class references. Specific code rule checks are then performed based on the global symbol table, so that the code detection result can be output, providing the upper-layer static code check items with the results they need.
Referring to Fig. 9, Fig. 9 is a schematic flowchart of the code detection method provided by an embodiment of the present invention. The method flow may include:
First, the terminal obtains the code file to be detected.
Here, the code file to be detected is a source file written in C++. The C++ code file may be the code file of a C++ project and may include one or more subcode files; the following description takes a C++ code file that includes multiple subcode files as an example.
It should be noted that, when detecting code, the embodiment of the present invention does not depend on compilation and can perform static code detection, so detection can be carried out even when the code does not compile, without affecting the overall detection flow or result. In addition, C++ code files on systems such as Windows, Linux and Mac are supported, so cross-platform detection can be achieved.
Next, the terminal performs a first scan of the code file to be detected.
The first scan may include preprocessing, lexical analysis and standardization of the code file, among other steps.
Preprocessing:
The preprocessing performed by the terminal on the code file may consist of extracting invalid code content from the code file according to the C++ code rules and filtering that content out to obtain the preprocessed code file. The invalid code content may include superfluous characters, comment content, preprocessing directives and so on. For example, the terminal may extract superfluous characters from the code file according to the C++ code rules; the superfluous characters may include superfluous spaces, superfluous semicolons or superfluous brackets. For instance, three lines of superfluous whitespace may be extracted from the code file according to the C++ code rules.
The terminal also extracts comment content from the code file according to the comment markers in the C++ code rules, where a comment marker may be a double slash "//", or "/*" and "*/", and so on. For example, according to the C++ code rules, the comment marker "//" may be searched for in the code file, and the comment content, consisting of "//" and everything after it, is extracted from the line in which "//" appears.
The terminal may also extract preprocessing directives from the code file according to the preprocessing markers in the C++ code rules; the preprocessing markers may include "#" and so on. For example, according to the C++ code rules, the preprocessing marker "#" may be searched for in the code file, and the preprocessing directive, consisting of "#" and everything after it, is extracted from the line in which "#" appears. The terminal can then delete the invalid code content, such as superfluous characters, comment content and preprocessing directives, from the code file to obtain the preprocessed code file.
As an example, a subcode file in the C++ code file before preprocessing is:
At this point, the invalid code content, such as superfluous characters, comment content and preprocessing directives, can be extracted from the subcode file according to the C++ code rules and filtered out, giving the following preprocessed subcode file:
It should be noted that, after the invalid code content inside the function main(), namely "#ifdef TEST_PRE_CMD", "#else", "printf("TEST_PRE_CMD is not define."); //not define" and "#endif", has been filtered out, empty lines may be retained to indicate the lines that the invalid code content occupied before preprocessing. Alternatively, no empty lines need be retained. This can be set flexibly according to actual needs and is not limited here.
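The line-based filtering described above can be sketched as follows. This is a simplified illustration, not the patented implementation: the function name `filterInvalidContent` is hypothetical, only "//" comments and "#" lines are handled, and "//" inside string literals is not treated specially. Emptied lines are kept so that line numbers remain stable, which corresponds to the option of retaining empty lines.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: blanks out preprocessing lines and strips "//" comments.
// Simplification: ignores /* */ comments and "//" that occurs inside string literals.
std::string filterInvalidContent(const std::string& source) {
    std::istringstream in(source);
    std::string line;
    std::string out;
    while (std::getline(in, line)) {
        std::size_t first = line.find_first_not_of(" \t");
        if (first != std::string::npos && line[first] == '#') {
            line.clear();  // the whole line is a preprocessing directive
        } else {
            std::size_t pos = line.find("//");
            if (pos != std::string::npos) line.erase(pos);  // drop the comment text
            std::size_t last = line.find_last_not_of(" \t");
            line.erase(last == std::string::npos ? 0 : last + 1);  // trim trailing blanks
        }
        out += line;
        out += '\n';  // keep the (possibly empty) line so line numbers are preserved
    }
    return out;
}
```

For instance, the three input lines `#ifdef X`, `int a; // c` and `#endif` become an empty line, `int a;` and another empty line.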
Lexical analysis:
After obtaining the preprocessed code file, the terminal may perform lexical analysis on it to obtain the lexical unit sequence table corresponding to each subcode file in the code file. For example, the terminal may obtain the string value of each lexical unit in each subcode file of the C++ code file, obtain the attribute information associated with each lexical unit, and generate a doubly linked list from the string values and attribute information of the lexical units, giving the lexical unit sequence table corresponding to each subcode file. The attribute information may include pointer information and feature information for each lexical unit. The pointer information may include a pointer to the next lexical unit, a pointer to the previous lexical unit, a pointer to the lexical unit paired with the current lexical unit, and a pointer to the symbol table entry of the lexical unit. The feature information may include the line number of the lexical unit in the code file and the type of the lexical unit; the types of lexical units may include number, string, variable, function, keyword and so on.
This lexical analysis is similar to the lexical analysis process described above and is not repeated here. For example, performing lexical analysis on the code segment for (int i=0;i<10;++i) yields the lexical units "for", "(", "int", "i", "=", "0", ";", "i", "<", "10", ";", "++", "i" and ")", which are then organized into a lexical unit sequence in the form of a doubly linked list.
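A minimal tokenizer that reproduces this example might look like the sketch below. The names `Token`, `TokenKind` and `tokenize` are hypothetical, and the sketch makes several simplifying assumptions: `std::list` stands in for the doubly linked list (it supplies the previous and next links), the pairing pointer and symbol-table pointer are omitted, and keywords such as `for` and `int` are not distinguished from other identifiers.

```cpp
#include <cctype>
#include <list>
#include <string>

enum class TokenKind { Number, Identifier, Symbol };

struct Token {
    std::string text;  // string value of the lexical unit
    TokenKind kind;    // coarse type of the lexical unit
    int line;          // line number in the code file
};

// Hypothetical tokenizer; std::list is a doubly linked list of lexical units.
std::list<Token> tokenize(const std::string& code, int line = 1) {
    std::list<Token> tokens;
    std::size_t i = 0;
    while (i < code.size()) {
        unsigned char c = static_cast<unsigned char>(code[i]);
        if (c == '\n') { ++line; ++i; continue; }
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        TokenKind kind;
        if (std::isdigit(c)) {
            while (i < code.size() && std::isdigit(static_cast<unsigned char>(code[i]))) ++i;
            kind = TokenKind::Number;
        } else if (std::isalpha(c) || c == '_') {
            while (i < code.size() &&
                   (std::isalnum(static_cast<unsigned char>(code[i])) || code[i] == '_')) ++i;
            kind = TokenKind::Identifier;
        } else {
            // "++" and "--" are kept together; every other symbol is one character.
            if ((c == '+' || c == '-') && i + 1 < code.size() && code[i + 1] == code[i]) i += 2;
            else ++i;
            kind = TokenKind::Symbol;
        }
        tokens.push_back({code.substr(start, i - start), kind, line});
    }
    return tokens;
}
```

Calling `tokenize("for (int i=0;i<10;++i)")` yields the fourteen lexical units listed above, in order.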
Standardization:
After obtaining the lexical unit sequence, the terminal may standardize each subcode file in the C++ code file to obtain the standard subcode files. For example, the terminal may derive the standard code format from the lexical unit sequence table of each subcode file and the C++ code standard logical format, search each subcode file of the code file to be detected for target code formats that do not match the standard code format, and modify the target code formats according to the standard code format, giving the standard subcode file corresponding to each subcode file. In this way the code file undergoes a logically equivalent code replacement that normalizes the code format.
For example, taking macro expansion as the standardization performed, a piece of local code before standardization may be as follows:
The code detection device can determine the position of the macro #define from the lexical unit sequence table and determine, according to the code standard logical format, that the standard code format requires macros to be expanded. It can then find in the local code the target code format that does not match the standard code format, namely the unexpanded macro, and modify the target code format according to the standard code format, that is, expand the macro. The code is thereby standardized, and the resulting standard code may be as follows:
return(a>b?a:b);
After obtaining the standard subcode files, the terminal may obtain the lexical unit sequence table corresponding to each standard subcode file. For example, the added characters may be inserted into the lexical unit sequence table of the subcode file before standardization, or the deleted characters may be removed from it, giving the lexical unit sequence table corresponding to the standard subcode file.
Build local symbol table:
After obtaining the lexical unit sequence table corresponding to each standard subcode file in the C++ code file, the terminal may build the local symbol table corresponding to each standard subcode file. For example, the terminal may build an abstract syntax tree from the lexical unit sequence table of each standard subcode file and build the local symbol table corresponding to each subcode file from the abstract syntax tree. The construction of the abstract syntax tree is similar to that described above and is not repeated here.
From the abstract syntax tree, the terminal may obtain the class list, function list and variable list corresponding to each subcode file. The class list may include classes and the key features related to them, the function list may include functions and the key features related to them, and the variable list may include variables and the key features related to them. The local symbol table corresponding to each subcode file is then built from its class list, function list and variable list. The construction of the local symbol table is similar to that described above and is not repeated here.
Build global symbol table:
After obtaining the local symbol table corresponding to each subcode file, the terminal may build the global symbol table from these local symbol tables. For example, the terminal may merge all the local symbol tables in the local symbol table set to obtain a merged symbol table, then, for symbol parameters that appear more than once in the merged symbol table, keep one of them and delete the others, giving the global symbol table. The construction of the global symbol table is similar to that described above and is not repeated here.
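The merge-and-deduplicate step can be sketched as below. The layout is an assumption made for illustration: `Symbol`, `SymbolTable` and `buildGlobalTable` are hypothetical names, symbols are keyed by qualified name, and "identical symbol parameters" is taken to mean entries with the same qualified name, of which the first one encountered is kept.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical symbol record; a real table would also carry key features.
struct Symbol {
    std::string qualifiedName;  // e.g. "CDemo2::Func"
    std::string kind;           // "class", "function" or "variable"
    std::string file;           // subcode file that declares the symbol
};

using SymbolTable = std::map<std::string, Symbol>;  // keyed by qualified name

// Merges the local tables; for duplicate keys only the first entry is kept.
SymbolTable buildGlobalTable(const std::vector<SymbolTable>& locals) {
    SymbolTable global;
    for (const SymbolTable& local : locals) {
        for (const auto& entry : local) {
            global.insert(entry);  // insert() leaves an existing key untouched
        }
    }
    return global;
}
```

Using `std::map::insert` makes the "keep one, delete the others" rule implicit, since an insertion with an already present key has no effect.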
After obtaining the global symbol table, the terminal may perform a second scan of the C++ code file to obtain the lexical unit sequence table and local symbol table corresponding to each subcode file. For example, the terminal may obtain the lexical unit sequence table of each subcode file from the local hard disk or server on which the lexical unit sequence table set was previously stored, and obtain the local symbol table of each subcode file from the local hard disk or server on which the local symbol table set was previously stored.
Alternatively, the terminal may regenerate the lexical unit sequence table and local symbol table of each subcode file by the method described above. Specifically, lexical analysis may be performed on each subcode file in the C++ code file to obtain the lexical unit sequence table corresponding to each subcode file. For example, the string value and associated attribute information of each lexical unit in each subcode file may be obtained, and a doubly linked list may be generated from them, giving the lexical unit sequence table corresponding to each subcode file. The terminal may also standardize each subcode file in the code file to be detected according to its lexical unit sequence table to obtain the standard subcode files, then obtain the lexical unit sequence table corresponding to each standard subcode file, and finally set that table as the lexical unit sequence table of the corresponding subcode file.
The terminal may then build the abstract syntax tree corresponding to each subcode file from the lexical unit sequence table, and build the local symbol table corresponding to each subcode file from the abstract syntax tree. For example, the class list, function list and variable list corresponding to each subcode file in the code file to be detected may be obtained from the abstract syntax tree, and the local symbol table corresponding to each subcode file may be built from these lists.
Class, function and variable lookup and linking:
After obtaining the global symbol table of the C++ code file and the lexical unit sequence table and local symbol table of each subcode file, the terminal may look up and link the classes, functions, variables and so on in the lexical unit sequence table by means of a context-sensitive symbol lookup algorithm, thereby updating the lexical unit sequence table. Compared with simple string matching, the context-sensitive symbol lookup algorithm is more accurate. In addition, the data structures are designed to be as small as possible so as to make full use of the cache and improve lookup efficiency.
For example, the concrete type represented by a lexical unit in the code file is affected by the current context. Take the following code segment as an example:
From this code segment it can be seen that the base class A of class B could refer either to N1::A or to N2::A. When such a case is encountered during lookup, the nearest-scope principle is applied, that is, the nearest type is matched first; therefore the base class A of class B points to N2::A.
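The original code segment is not reproduced in this text. The following hypothetical C++ fragment, which is an assumption and not the patent's own example, shows one situation of the kind described: two namespaces N1 and N2 each declare a type A, and a class B names A without qualification. The member `id` is added only so that the outcome can be checked at compile time.

```cpp
namespace N1 {
struct A { static constexpr int id = 1; };
}
using namespace N1;  // makes N1::A visible from the global scope

namespace N2 {
struct A { static constexpr int id = 2; };
struct B : A {};     // unqualified A: the enclosing namespace N2 is searched first
}

// The C++ compiler itself resolves the base class to the nearest declaration.
static_assert(N2::B::id == 2, "B derives from N2::A");
```

A symbol lookup that follows the nearest-scope principle reaches the same answer as the compiler here, whereas plain string matching on "A" could not tell the two candidates apart.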
Referring to Fig. 10, Fig. 10 is a schematic flowchart of updating the lexical unit sequence table provided by an embodiment of the present invention. The method flow may include:
S201, obtain a type parameter from the lexical unit sequence table of the current subcode file, and split the type parameter by level into type qualified names and a type name.
When updating the lexical unit sequence table by means of the context-sensitive symbol lookup algorithm, the terminal first obtains a subcode file from the C++ code file as the current subcode file, obtains the lexical unit sequence table of the current subcode file, traverses it to obtain the type parameters it contains, such as classes, functions and variables, and splits each type parameter by level into type qualified names and a type name. For example, for the type parameter A::B::C, the type name is C and the qualified names are A and B.
The terminal may then push the type qualified names and the type name of the type parameter onto a stack S in order. For example, for the type parameter A::B::C, A, B and C are pushed onto stack S in turn, so that the type name C is at the top of the stack.
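The level split and the stack push can be sketched as follows. The function names `splitQualified` and `pushLevels` are hypothetical; the sketch assumes that levels are separated only by "::" and does not handle template arguments.

```cpp
#include <stack>
#include <string>
#include <vector>

// Splits "A::B::C" into {"A", "B", "C"}; the last element is the type name.
std::vector<std::string> splitQualified(const std::string& name) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = name.find("::", start)) != std::string::npos) {
        parts.push_back(name.substr(start, pos - start));
        start = pos + 2;
    }
    parts.push_back(name.substr(start));
    return parts;
}

// Pushes the levels in order, so the type name ends up on top of the stack.
std::stack<std::string> pushLevels(const std::string& name) {
    std::stack<std::string> s;
    for (const std::string& part : splitQualified(name)) {
        s.push(part);
    }
    return s;
}
```

For `A::B::C` the top of the returned stack is `C`; popping it exposes the qualified names `B` and then `A`.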
S202, judge whether the type name is a system type name; if so, execute step S203; if not, execute step S204.
The terminal may pop the top element of stack S, that is, take the type name at the top of stack S (not including the qualified names of the type parameter), and then judge whether the type name of the type parameter is a system type name. System type names may include the type names stored in the data that ships with the code compilation system, for example main.
S203, return a null pointer as the symbol-table pointer of the type parameter.
When the type name of the type parameter is a system type name, the type name need not be looked up further in the local symbol table or the global symbol table; instead, the lookup flow ends and a null symbol-table pointer is returned for the type parameter. The terminal may then update the lexical unit sequence table of the current subcode file accordingly, that is, set the symbol-table pointer of the type parameter in the lexical unit sequence table to null.
S204, look up the type name of the type parameter in the local symbol table of the current subcode file.
S205, judge whether the type name exists in the local symbol table; if so, execute step S206; if not, execute step S209.
When the type name of the type parameter is not a system type name, the terminal may look up the type name of the type parameter in the local symbol table of the current subcode file and further judge whether the type name exists in the local symbol table.
It should be noted that the terminal may first search within a preset range of the local symbol table in which the type parameter is located and, when nothing is found, widen the range step by step until all ranges in the local symbol table have been searched.
S206, match the qualified name in the lexical unit sequence table against the qualified name corresponding to the type name in the local symbol table.
S207, judge whether they match; if so, execute step S208; if not, execute step S209.
When the type name of the type parameter exists in the local symbol table, the terminal may pop the qualified name at the top of stack S and then match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table, to judge whether the two match.
It should be noted that, when the type parameter includes multiple qualified names, the qualified names may be matched one by one; only when all of them have matched is the qualified name of the type parameter in the lexical unit sequence table considered to match the qualified name of the type parameter in the local symbol table.
S208, link the type parameter in the lexical unit sequence table to the type parameter in the local symbol table.
If the qualified name of the type parameter in the lexical unit sequence table matches the qualified name of the type parameter in the local symbol table, the terminal may link the type parameter in the lexical unit sequence table to the type parameter in the local symbol table, for example by a pointer link: the symbol-table pointer of the type parameter in the lexical unit sequence table may be set to point to the type parameter in the local symbol table, yielding the updated lexical unit sequence table.
S209, look up the type name of the type parameter in the global symbol table.
S210, judge whether the type name exists in the global symbol table; if so, execute step S211; if not, execute step S203.
When the type name of the type parameter does not exist in the local symbol table, the type name may be looked up in the global symbol table, to judge whether the type name of the type parameter exists in the global symbol table.
It should be noted that the terminal may first search within a preset range of the global symbol table in which the type parameter is located and, when nothing is found, widen the range step by step until all ranges in the global symbol table have been searched.
S211, match the qualified name in the lexical unit sequence table against the qualified name corresponding to the type name in the global symbol table.
S212, judge whether they match; if so, execute step S213; if not, execute step S203.
When the type name of the type parameter exists in the global symbol table, the terminal may pop the qualified name at the top of stack S and match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table, to judge whether the two match.
It should be noted that, when the type parameter includes multiple qualified names, the qualified names may be matched one by one; only when all of them have matched is the qualified name of the type parameter in the lexical unit sequence table considered to match the qualified name of the type parameter in the global symbol table.
S213, link the type parameter in the lexical unit sequence table to the type parameter in the global symbol table.
If the qualified name of the type parameter in the lexical unit sequence table matches the qualified name of the type parameter in the global symbol table, the type parameter in the lexical unit sequence table is linked to the type parameter in the global symbol table, for example by a pointer link: the symbol-table pointer of the type parameter in the lexical unit sequence table may be set to point to the type parameter in the global symbol table, yielding the updated lexical unit sequence table.
It should be noted that, when the lexical unit sequence table includes multiple type parameters, they may be traversed one by one until the lookup and linking of all type parameters is complete.
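The decision flow of steps S202 to S213 can be condensed into the sketch below. It is an illustration under assumptions rather than the patented implementation: `SymbolEntry`, `SymbolTable`, `findIn` and `resolve` are hypothetical names, the qualified names are compared as a whole list instead of being popped from a stack one at a time, and the range-widening search within a table is omitted. A returned null pointer corresponds to step S203.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical symbol table entry.
struct SymbolEntry {
    std::vector<std::string> qualifiers;  // e.g. {"N2"} for N2::A
    std::string name;                     // e.g. "A"
};

// Type name -> all symbols declared with that name.
using SymbolTable = std::multimap<std::string, SymbolEntry>;

// Looks for an entry with the given type name whose qualified names all match.
const SymbolEntry* findIn(const SymbolTable& table,
                          const std::vector<std::string>& qualifiers,
                          const std::string& name) {
    auto range = table.equal_range(name);
    for (auto it = range.first; it != range.second; ++it) {
        if (it->second.qualifiers == qualifiers) return &it->second;
    }
    return nullptr;
}

// Returns the entry to link to, or nullptr when the symbol-table pointer stays null.
const SymbolEntry* resolve(const std::vector<std::string>& qualifiers,
                           const std::string& name,
                           const std::set<std::string>& systemTypes,
                           const SymbolTable& local,
                           const SymbolTable& global) {
    if (systemTypes.count(name) != 0) return nullptr;        // S202 -> S203
    if (const SymbolEntry* hit = findIn(local, qualifiers, name)) {
        return hit;                                          // S204 to S208
    }
    return findIn(global, qualifiers, name);                 // S209 to S213, else S203
}
```

The local table is consulted before the global one, so a symbol declared in the current subcode file takes precedence; a missing name or a qualified-name mismatch in the local table falls through to the global lookup, as in the flow above.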
Code check item scanning:
After the lexical unit sequence table of a subcode file has been updated, the terminal may determine the detection result of the subcode file according to the global symbol table, the local symbol table of the subcode file and the updated lexical unit sequence table. For example, based on the global symbol table that has been built and the symbol linking results, the terminal may scan the code for each erroneous-code scenario. That is, it traverses the updated lexical unit sequence table to find the classes, functions, variables and so on that it contains; for each class, function or variable found, it consults the local symbol table and global symbol table in which that item is located and extracts the detection result of the subcode file from the key features stored in the local symbol table or the global symbol table.
Output of the detection result:
After obtaining the detection result of each subcode file, the terminal may output the code error information in a certain format; the specific output format can be set flexibly according to actual needs and is not limited here.
The embodiment of the present invention can perform static code detection on the code file (without compiling the code) according to the global symbol table, the local symbol table and the lexical unit sequence table. It fully takes into account situations such as missing code files, missing type definitions and syntax errors, and can provide static code detection with accurate and efficient symbolization results such as the global symbol table and the local symbol table, so that code detection has syntax-level, cross-function, semantic-level and, to some degree, logic analysis capability. This not only improves the accuracy and efficiency of the code detection result, but also makes it possible to find potential problems in the code such as defects, performance issues and security issues. The final code detection result can help developers quickly locate problems hidden in the code and reduce later repair costs, so that developers can repair code efficiently and at low cost, improving code quality.
To facilitate better implementation of the code detection method provided by the embodiments of the present invention, an embodiment of the present invention further provides a device based on the above code detection method. The terms have the same meanings as in the above code detection method, and for specific implementation details reference may be made to the description in the method embodiments.
Referring to Fig. 11, Fig. 11 is a schematic structural diagram of the code detection device provided by an embodiment of the present invention. The code detection device may include a first acquisition unit 301, a construction unit 302, a second acquisition unit 303, an updating unit 304, a determination unit 305 and so on.
The first acquisition unit 301 is configured to obtain the code file to be detected.
The code may be a source file written in a language supported by a development tool; it may be regarded as a system of explicit rules in which information is represented in discrete form by characters, symbols or signal elements. The code may be a source file written in C++, C, Java or the like, or a source file written in another language; the specific content is not limited here.
The first acquisition unit 301 first obtains the code file to be detected, which may be the code file of a software project and may include one or more subcode files.
In some embodiments, as shown in Fig. 14, the first acquisition unit 301 may include an extraction subunit 3011, a filtering subunit 3012 and so on, specifically as follows:
the extraction subunit 3011 is configured to obtain a code file and extract invalid code content from the code file according to the code rules;
the filtering subunit 3012 is configured to filter out the invalid code content to obtain the code file to be detected.
Specifically, the extraction subunit 3011 may obtain the code file, for example from a locally pre-stored code library; the code file may have been generated in advance by the code detection device using a code development tool. Alternatively, a code file acquisition request may be sent to a server and the code file returned by the server in response to the request may be received; that code file may have been uploaded to the server by the code detection device or by another terminal and stored by the server. It is understood that the code file may also be obtained in other ways; the specific manner is not limited here.
After obtaining the code file, the extraction subunit 3011 may preprocess it to filter out content that subsequent processing does not need; for example, the parts unrelated to valid code may be filtered out to provide a normalized character stream for the subsequent lexical analysis. The preprocessing may include extracting invalid code content from the code file according to the code rules. When the code file is a source file written in C++, the code rules may be the writing rules of the C++ language; when the code file is a source file written in C, the code rules may be the writing rules of the C language; when the code file is a source file written in Java, the code rules may be the writing rules of the Java language; and so on. The invalid code content may include comment content, preprocessing directives and the like, and may also include other content; the specific content is not limited here.
After the invalid code content has been obtained, the filtering subunit 3012 may filter it out, that is, delete the invalid code content from the code file, to obtain the code file to be detected. When the code file includes multiple subcode files, the subcode files may be traversed, with the invalid code content extracted from each subcode file and filtered out.
Optionally, the extraction subunit 3011 may specifically be configured to:
extract superfluous characters from the code file according to the code rules;
extract comment content from the code file according to the comment markers in the code rules;
extract preprocessing directives from the code file according to the preprocessing markers in the code rules;
set the superfluous characters, comment content and preprocessing directives as the invalid code content.
Specifically, the extraction subunit 3011 may extract redundant characters from the code file according to the code rules. The redundant characters may include redundant spaces, redundant blank lines, redundant brackets, and the like. For example, when the code file is a source file written in the C++ language, three redundant blank lines may be extracted from the code file according to the code rules of the C++ language. As another example, when the code file is a source file written in the Java language, five redundant spaces may be extracted from a run of six consecutive spaces in the code file according to the code rules of the Java language; and so on.
The extraction subunit 3011 may extract comment content from the code file according to the comment markers in the code rules, where a comment marker may be a double slash "//", or "/*" and "*/", and so on. For example, when the code file is a source file written in the C++ language, the comment marker "//" may be searched for in the code file according to the code rules of the C++ language, and the comment content may be extracted from the line in which "//" appears; the comment content includes "//" and everything after it on that line.
As another example, when the code file is a source file written in the C++ language, the starting comment marker "/*" and the terminating comment marker "*/" may be searched for in the code file according to the code rules of the C++ language, and the comment content between "/*" and "*/" (including "/*" and "*/" themselves) may be extracted.
The extraction subunit 3011 may extract preprocessor directives from the code file according to the preprocessing marker in the code rules. The preprocessing marker may include "#" and the like, and the preprocessor directives may include #define, #if, #pragma, and the like. For example, when the code file is a source file written in the C++ language, the preprocessing marker "#" may be searched for in the code file according to the code rules of the C++ language, and the preprocessor directive may be extracted from the line in which "#" appears; the directive includes "#" and the content that follows it.
After the redundant characters, the comment content, and the preprocessor directives are obtained, the extraction subunit 3011 may set them as the invalid code content, thereby extracting the invalid code content from the code file according to the code rules.
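The extraction and filtering steps above might be sketched as follows for C++ input. This is only a minimal Python illustration under stated assumptions, not the implementation of the embodiment: the function name `filter_invalid_content` and the regular expression are hypothetical, and line-continued directives and raw string literals are not handled.

```python
import re

# String and character literals are matched so that comment markers inside
# them are left alone; line comments and block comments are matched for removal.
_LITERAL_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\\n])*"'    # string literal (kept)
    r"|'(?:\\.|[^'\\\n])*'"   # character literal (kept)
    r"|//[^\n]*"              # line comment: "//" and everything after it
    r"|/\*.*?\*/",            # block comment: "/*" through "*/"
    re.DOTALL,
)


def filter_invalid_content(source):
    """Return source with comments, '#' directives and blank lines removed."""

    def replace(match):
        text = match.group(0)
        # Keep literals unchanged; replace a comment with a single space.
        return text if text[0] in "\"'" else " "

    without_comments = _LITERAL_OR_COMMENT.sub(replace, source)

    kept = []
    for line in without_comments.split("\n"):
        stripped = line.strip()
        if not stripped:                 # redundant blank line
            continue
        if stripped.startswith("#"):     # preprocessor directive such as #define
            continue
        kept.append(line.rstrip())       # drop trailing redundant spaces
    return "\n".join(kept)
```

Matching literals in the same pattern as comments is a design choice made here so that a "//" inside a string is not mistaken for a comment marker.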
The construction unit 302 is configured to build the global symbol table corresponding to the code file to be detected.
After the code file to be detected is obtained, the construction unit 302 may build the corresponding global symbol table. The global symbol table may be a data structure that records the symbols contained in each subcode file of the code file to be detected, such as classes, functions, and variables, together with their related information.
In some embodiments, as shown in Figure 12, the construction unit 302 may include a first acquisition subunit 3021, a first construction subunit 3022, a second construction subunit 3023, and the like, specifically as follows:
The first acquisition subunit 3021 is configured to obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table set;
The first construction subunit 3022 is configured to build, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table set;
The second construction subunit 3023 is configured to build the global symbol table according to the local symbol table set.
Specifically, the first acquisition subunit 3021 may first obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain the lexical unit sequence table set. For example, the lexical unit sequence table corresponding to each subcode file may be obtained through lexical analysis. A lexical unit sequence table may include multiple lexical units (also called lexemes or Tokens); for example, "if" or "for" is one lexical unit. The set of all lexical units generated by lexically analyzing a subcode file is the lexical unit sequence table of that subcode file (which may be called a TokenList). A segment of code in a subcode file corresponds to a lexical unit section (that is, a Token section, which may also be called a TokenSection), which is a sequence of one or more Tokens.
It should be noted that, to obtain the lexical unit sequence tables more efficiently, the first acquisition subunit 3021 may invoke multiple threads and have the threads obtain the lexical unit sequence tables of the subcode files in parallel, so that multiple lexical unit sequence tables are obtained quickly. Of course, the first acquisition subunit 3021 may also obtain the lexical unit sequence tables serially; the specific manner is not limited here.
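As a rough illustration of obtaining one lexical unit sequence table per subcode file on several threads, the sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor`. The names `tokenize` and `build_token_lists` and the simplified token pattern are assumptions made for this example only.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def tokenize(code):
    """Very small stand-in for lexical analysis: identifiers, integers, symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def build_token_lists(sub_files, max_workers=4):
    """Map each subcode file name to its token list, one task per file.

    sub_files is a dict of file name -> source text (hypothetical input shape).
    """
    names = list(sub_files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(tokenize, (sub_files[n] for n in names)))
    return dict(zip(names, results))
```

In CPython, threads do not speed up CPU-bound tokenization because of the global interpreter lock; the sketch only shows the per-file task structure, and a process pool or a native implementation would be needed for real parallel speedup.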
In some embodiments, the first acquisition subunit 3021 may include an analysis module, a processing module, an acquisition module, and the like, specifically as follows:
The analysis module is configured to perform lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file;
The processing module is configured to standardize each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to that subcode file, to obtain a standard code file set composed of the standard subcode files;
The acquisition module is configured to obtain the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, to obtain the lexical unit sequence table set.
Specifically, in obtaining the lexical unit sequence table set, the analysis module may first perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file. These lexical unit sequence tables may form the lexical unit sequence table set, which may include the lexical unit sequence tables of one or more subcode files.
Lexical analysis may be the process of converting the character sequence in a subcode file into a sequence of lexical units. It mainly reads the character stream output by the preprocessing, groups the characters into lexemes, and generates and outputs a sequence of lexical units in which each lexical unit corresponds to one lexeme. The entire sequence of lexical units is the lexical unit sequence table, which is the basic data structure that subsequent processing and the upper-layer check items use to traverse the code.
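A minimal sketch of such a lexical analysis step is given below, applied to a for-loop header. The regular expression and the name `tokenize` are assumptions for illustration; a real C++ lexer would also need to handle string literals, floating-point numbers, and the full operator set.

```python
import re

_TOKEN = re.compile(
    r"[A-Za-z_]\w*"                          # identifiers and keywords
    r"|\d+"                                  # integer literals
    r"|\+\+|--|<=|>=|==|!=|&&|\|\||::|->"    # two-character operators
    r"|\S"                                   # any other single non-space character
)


def tokenize(code):
    """Convert a character stream into a list of token strings."""
    return _TOKEN.findall(code)


tokens = tokenize("for (int index=0;index<42;++index)")
```

Listing the two-character operators before the single-character fallback makes "++" come out as one lexical unit rather than two "+" units.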
For example, performing lexical analysis on the code segment "for (int index=0;index<42;++index)" yields a lexical unit sequence composed of the 14 lexical units "for", "(", "int", "index", "=", "0", ";", "index", "<", "42", ";", "++", "index", and ")".
In some embodiments, the analysis module may include a first acquisition submodule, a second acquisition submodule, a generation submodule, and the like, specifically as follows:
The first acquisition submodule is configured to obtain the string value of each lexical unit in each subcode file of the code file to be detected;
The second acquisition submodule is configured to obtain the attribute information associated with each lexical unit;
The generation submodule is configured to generate a doubly linked list according to the string value and attribute information of each lexical unit, to obtain the lexical unit sequence table set composed of the lexical unit sequence tables corresponding to the subcode files.
In obtaining a lexical unit sequence table, the first acquisition submodule may obtain the string value of each lexical unit in each subcode file of the code file to be detected. For example, the string value of the lexical unit "for" consists of the three characters "f", "o", and "r", and the string value of the lexical unit "int" consists of the three characters "i", "n", and "t".
The lexical unit sequence table is the most basic unit on which code detection operates. In addition to the string values of the lexical units, it may also include attribute information associated with each lexical unit. Therefore, the second acquisition submodule may obtain the attribute information associated with each lexical unit, where the attribute information may include a pointer to the next lexical unit, a pointer to the previous lexical unit, the type of the lexical unit, the line number of the lexical unit, and the like.
After the string value and attribute information of each lexical unit in a subcode file are obtained, the generation submodule may generate a doubly linked list from them to obtain the lexical unit sequence table corresponding to each subcode file; these lexical unit sequence tables may form the lexical unit sequence table set. A lexical unit sequence table is essentially a doubly linked list that maintains all of the lexical units in the table.
For example, the lexical unit sequence table of the code segment "if(i>0)" may be represented as the doubly linked list shown in Figure 3. An arrow may represent a pointer from the current lexical unit to its next lexical unit, for example, the pointer from the lexical unit "i" to the next lexical unit ">"; or a pointer from the current lexical unit to its previous lexical unit, for example, the pointer from the lexical unit "(" to the previous lexical unit "if"; or a pointer from the current lexical unit to the lexical unit paired with it, for example, the pointer from the lexical unit "(" to its paired lexical unit ")"; and so on.
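The doubly linked list with next, previous, and pairing pointers might be modeled as in the following sketch. The class name `Token`, its field names, and the helper functions are hypothetical and chosen only for this illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

CLOSERS = {")": "(", "]": "[", "}": "{"}   # closing bracket -> opening bracket


@dataclass(eq=False)
class Token:
    text: str                                   # string value of the lexical unit
    line: int                                   # line number in the file
    prev: Optional["Token"] = field(default=None, repr=False)
    next: Optional["Token"] = field(default=None, repr=False)
    link: Optional["Token"] = field(default=None, repr=False)  # paired bracket


def build_token_list(code):
    """Tokenize code and return the head of a doubly linked token list."""
    head = tail = None
    open_brackets = []
    for line_no, line in enumerate(code.split("\n"), start=1):
        for text in re.findall(r"[A-Za-z_]\w*|\d+|\S", line):
            token = Token(text, line_no)
            if tail is None:
                head = token
            else:
                tail.next = token
                token.prev = tail
            tail = token
            if text in CLOSERS.values():
                open_brackets.append(token)
            elif (text in CLOSERS and open_brackets
                  and open_brackets[-1].text == CLOSERS[text]):
                # Connect the closing bracket and its opener in both directions.
                opener = open_brackets.pop()
                opener.link, token.link = token, opener
    return head


def texts(head):
    """Walk the list forward and collect the string values."""
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out
```

For the segment "if(i>0)", `build_token_list` returns the "if" token; its `next` is "(", whose `link` is the matching ")".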
Optionally, the second acquisition submodule may specifically be configured to: obtain the pointers by which each lexical unit points to various information in each subcode file of the code file to be detected, to obtain pointer information; obtain the feature information of each lexical unit; and set the pointer information and the feature information of each lexical unit as the attribute information associated with that lexical unit.
Specifically, the second acquisition submodule may obtain the pointers by which each lexical unit points to various information in each subcode file of the code file to be detected, where the various information may include the next lexical unit of the current lexical unit, the lexical unit paired with the current lexical unit, the data flow structure of the lexical unit, and the like.
For example, the following may be obtained: a pointer to the next lexical unit of the current lexical unit; a pointer to the previous lexical unit of the current lexical unit; a pointer to the lexical unit paired with the current lexical unit (for example, for a left parenthesis, a pointer to the matching right parenthesis); a symbol table pointer of the lexical unit (for example, a variable points to the variable object in the global symbol table or a local symbol table, and a function points to the function object in the global symbol table or a local symbol table); a syntax tree structure pointer of the lexical unit (which may be used to maintain the abstract syntax tree structure of the lexical unit); a data flow structure pointer of the lexical unit; and so on. These constitute the pointer information.
The second acquisition submodule also needs to obtain the feature information of each lexical unit. The feature information may include the line number at which the lexical unit is located in the code file, the type of the lexical unit, and the like. The type of a lexical unit may be number, string, variable, function, keyword, and so on; for example, "1" and "2" may be of the number type, "main" may be of the function type, and "index" and "i" may be of the variable type. The pointer information and the feature information together are the attribute information associated with the lexical unit.
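One way the type of each lexical unit could be assigned is sketched below. The keyword set is deliberately incomplete, and the rule that an identifier followed by "(" is a function is a heuristic assumed for this example rather than something the embodiment specifies.

```python
# A small, incomplete keyword set used only for this illustration.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "void", "class"}


def classify(tokens):
    """Pair every token string with an assumed type label."""
    result = []
    for i, text in enumerate(tokens):
        following = tokens[i + 1] if i + 1 < len(tokens) else ""
        if text[0].isdigit():
            kind = "number"
        elif text[0] == '"':
            kind = "string"
        elif text in KEYWORDS:
            kind = "keyword"
        elif text[0].isalpha() or text[0] == "_":
            # Heuristic: an identifier directly followed by "(" is a function.
            kind = "function" if following == "(" else "variable"
        else:
            kind = "other"
        result.append((text, kind))
    return result


labeled = dict(classify(["int", "main", "(", ")", "{", "index", "=", "1", ";", "}"]))
```

Keywords are checked before the function heuristic so that "if (" or "for (" is not mistaken for a function call.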
In some embodiments, the processing module may specifically be configured to: obtain a standard code format according to the lexical unit sequence table corresponding to each subcode file and the standard logical format of the code; search each subcode file in the code file to be detected for a target code format that does not match the standard code format; and modify the target code format according to the standard code format, to obtain the standard code file set composed of the standard subcode files.
Different projects may be written by different developers whose coding styles differ, which objectively leads to a wide variety of code. Therefore, to build the global symbol table more efficiently and to improve the accuracy of code detection, the code file may be standardized after the lexical unit sequence tables are obtained; for example, the coding style may be unified and normalized through a number of simplification steps. During standardization, no simplification step may change the code logic; only logically equivalent replacements are performed.
Specifically, after the lexical unit sequence table corresponding to each subcode file is obtained, the processing module may obtain the standard code format according to that lexical unit sequence table and the standard logical format of the code. For example, the logical format of a function implementation may be that it starts with a left brace and ends with a right brace. When the code file is a source file written in the C++ language, the standard logical format may be the logical format of the C++ language; when the code file is a source file written in the Java language, the standard logical format may be the logical format of the Java language.
Then, the processing module searches each subcode file in the code file to be detected for a target code format that does not match the standard code format, and modifies the target code format according to the standard code format to obtain each standard subcode file; for example, the target code format may be replaced with the standard code format. The standard subcode files may form the standard code file set. In this way, the code file undergoes a logically equivalent replacement and acquires a normalized code format.
Take standardizing a conditional expression as an example. Before standardization, a conditional statement in a subcode file omits the braces. The specific subcode file may be as follows:
The processing module may locate the conditional expression if(i>0) according to the lexical unit sequence table corresponding to the subcode file, and determine from the standard logical format of the code that the standard code format requires braces after a conditional expression. It may then find in the subcode file the target code format that does not match the standard code format, namely that no braces follow the conditional expression if(i>0), and modify the target code format according to the standard code format, that is, add braces after the conditional expression if(i>0). The subcode file is thereby standardized into a standard subcode file, which may specifically be as follows:
After the standard code file set is obtained, the acquisition module may obtain the lexical unit sequence table corresponding to each standard subcode file in the set. For example, the added characters (such as braces) may be inserted into the lexical unit sequence table of the subcode file as it was before standardization, to obtain the lexical unit sequence table of the standard subcode file; alternatively, the deleted characters may be deleted from the lexical unit sequence table of the subcode file as it was before standardization, to obtain the lexical unit sequence table of the standard subcode file. The lexical unit sequence tables corresponding to the standard subcode files may form the lexical unit sequence table set.
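The brace-insertion step described above can be sketched directly on a token list. The function `add_braces_after_if` is a hypothetical helper written for this illustration; it assumes the body of a brace-less "if" is a single simple statement that ends at the next ";" and does not handle nested brace-less statements.

```python
def add_braces_after_if(tokens):
    """Insert "{" and "}" around a brace-less single-statement if body."""
    out = []
    i, n = 0, len(tokens)
    while i < n:
        out.append(tokens[i])
        if tokens[i] == "if" and i + 1 < n and tokens[i + 1] == "(":
            # Copy the parenthesized condition, tracking nesting depth.
            depth = 0
            j = i + 1
            while j < n:
                if tokens[j] == "(":
                    depth += 1
                elif tokens[j] == ")":
                    depth -= 1
                out.append(tokens[j])
                j += 1
                if depth == 0:
                    break
            # j is now the index of the first token after the condition.
            if j < n and tokens[j] != "{":
                out.append("{")
                while j < n and tokens[j] != ";":
                    out.append(tokens[j])
                    j += 1
                if j < n:
                    out.append(";")
                    j += 1
                out.append("}")
            i = j
            continue
        i += 1
    return out
```

Because only "{" and "}" tokens are added, the transformation is a logically equivalent replacement in the sense used above; the token values of the original statement are left unchanged.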
After the lexical unit sequence table set is obtained as described above, the first construction subunit 3022 may build, according to each lexical unit sequence table in the set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set. For example, a lexical unit sequence table may be traversed to extract the classes, functions, variables, and other information of the corresponding subcode file, together with the key features related to those classes, functions, and variables. From this information, a class list, a function list, a variable list, and the like may be built, and the local symbol table corresponding to each subcode file may be built from that subcode file's class list, function list, and variable list. The local symbol tables corresponding to the subcode files may form the local symbol table set.
In some embodiments, the first construction subunit 3022 may include a first construction module, a second construction module, and the like, specifically as follows:
The first construction module is configured to build, according to each lexical unit sequence table in the lexical unit sequence table set, the abstract syntax tree corresponding to each subcode file in the code file to be detected;
The second construction module is configured to build, according to the abstract syntax trees, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set.
Specifically, the first construction module may build, according to each lexical unit sequence table in the lexical unit sequence table set, the abstract syntax tree (AST) corresponding to each subcode file in the code file to be detected. An abstract syntax tree is a tree representation of the abstract syntactic structure of code. It may be a binary tree in which each non-leaf node represents an operator and the two child nodes of the non-leaf node represent the two operands of that operator. The abstract syntax tree structure captures the logical structure of an expression and the precedence relationships among its operators, which can improve the accuracy of code scene matching and the efficiency of implementing code scenes.
It should be noted that the abstract syntax tree in the embodiments of the present invention does not establish logical relationships between code expressions. For example, the logical relationship between the if statement and the else statement in an if-else statement is not established; an abstract syntax structure is established only for each individual code expression, and no structural relationship between expressions is established. If structural relationships between expressions were established, then as soon as the input code file contained a syntax error, the global abstract syntax tree structure built from it would be wrong and of no reference value. The present invention therefore accepts incomplete or non-compilable code files as input and builds an abstract syntax tree structure for each single expression. If an expression contains an error, only the partial abstract syntax tree structure of that expression is wrong, and the abstract syntax tree structures of other expressions are not affected.
For example, consider the following code:
String::Format("demo:%d%s%d", Func(1,2), "AST", 1+2*3);
The structure of the abstract syntax tree finally built for this code may be as shown in Figure 4. It includes parameter 1: "demo:%d%s%d", parameter 2: Func(1,2), parameter 3: "AST", and parameter 4: 1+2*3. For example, the non-leaf node "*" represents an operator, and its two child nodes "2" and "3" represent the two operands of that operator, that is, 2*3; the two child nodes of the operator "+" are "1" and "*", which gives 1+2*3.
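To make the operator and operand structure concrete, the sketch below builds a binary tree for a single arithmetic expression using precedence climbing. This is a swapped-in textbook technique chosen for illustration, not necessarily how the embodiment builds its trees; the tuple representation `(operator, left, right)` and all function names are assumptions, and function calls and unary operators are not handled.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_operand(tokens, pos):
    """Parse a number, identifier, or parenthesized sub-expression."""
    if tokens[pos] == "(":
        tree, pos = parse_expression(tokens, pos + 1)
        return tree, pos + 1            # skip the closing ")"
    return tokens[pos], pos + 1


def parse_expression(tokens, pos=0, min_prec=1):
    """Return (tree, next_pos); tree is a leaf string or (op, left, right)."""
    left, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Operators of higher precedence bind tighter on the right-hand side.
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos


def build_ast(expression):
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|\S", expression)
    tree, _ = parse_expression(tokens)
    return tree


tree = build_ast("1+2*3")
```

For "1+2*3" the root is "+", with the leaf "1" as its left child and the "*" node (children "2" and "3") as its right child, matching the structure described for parameter 4.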
After the abstract syntax trees are obtained, the second construction module may build, according to them, the local symbol table corresponding to each subcode file in the code file to be detected; the local symbol tables corresponding to the subcode files may form the local symbol table set.
In some embodiments, the second construction module may specifically be configured to: obtain, according to the abstract syntax trees, the class list, function list, and variable list corresponding to each subcode file in the code file to be detected; and build, according to the class list, function list, and variable list corresponding to each subcode file, the local symbol table corresponding to that subcode file, to obtain the local symbol table set.
Specifically, the second construction module may use the tree structure of an abstract syntax tree to quickly extract the classes, functions, variables, and other information of the corresponding subcode file, together with the key features related to those classes, functions, and variables. From this information, the class list, function list, variable list, and the like corresponding to each subcode file in the code file to be detected may be built, and the local symbol table corresponding to each subcode file may be built from that subcode file's class list, function list, and variable list. The local symbol tables corresponding to the subcode files may form the local symbol table set. A local symbol table may also be called a SymbolDatabase; it may be the symbolization result object corresponding to a subcode file.
For example, for source files written in the C++ language, after the lexical unit sequence table corresponding to each .cpp file (including the .h files expanded into it) is obtained, a corresponding local symbol table may be built from that lexical unit sequence table. Each local symbol table may contain the following three kinds of data: (1) a type list (Type List), which may record all the types in the code file, such as class, struct, or namespace, and may include each type name and the key features of each type; (2) a function list (Function List), which may record all the functions in the code file and may include each function name and the key features of each function (for example, the return value of the function); and (3) a variable list (Variable List), which may record all the variables in the code file and may include each variable name and the key features of each variable.
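A local symbol table with these three lists might be represented as in the sketch below. The class layout, the field names, and the `missing_definitions` helper are hypothetical; the symbol names are the ones this description uses for demo1.cpp, and the stored feature values are placeholders rather than data from the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FunctionSymbol:
    name: str               # qualified name, e.g. "CDemo1::Func"
    has_definition: bool    # False when only a declaration was seen


@dataclass
class SymbolDatabase:
    """Local symbol table of one subcode file (illustrative layout only)."""
    file_name: str
    types: Dict[str, str] = field(default_factory=dict)        # Type List
    functions: Dict[str, FunctionSymbol] = field(default_factory=dict)  # Function List
    variables: Dict[str, str] = field(default_factory=dict)    # Variable List

    def missing_definitions(self):
        """Names of functions that are declared but not defined in this file."""
        return sorted(n for n, f in self.functions.items() if not f.has_definition)


demo1 = SymbolDatabase("demo1.cpp")
for type_name in ("CObject", "CDemo1", "CDemo2"):
    demo1.types[type_name] = "class"
demo1.functions["CDemo1::Func"] = FunctionSymbol("CDemo1::Func", True)
demo1.functions["CDemo2::Func"] = FunctionSymbol("CDemo2::Func", False)
demo1.variables["global_var1"] = "global"   # placeholder key feature
```

Calling `demo1.missing_definitions()` reports the function whose definition has to be looked up in another file.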
This is illustrated below. For example, the demo.cpp code file may include subcode files such as demo1.cpp, demo.h, and demo2.cpp. The code content of the demo1.cpp subcode file may be as follows:
The demo1.cpp subcode file may include the demo.h subcode file. The code content of the demo.h subcode file may be as follows:
When the demo1.cpp subcode file is scanned, a local symbol table similar to the following may be obtained:
The local symbol table of the demo1.cpp subcode file shows that a symbol is missing in demo1.cpp: the definition of CDemo2::Func is not found. In fact, CDemo2::Func is defined in the demo2.cpp subcode file, whose code content may be as follows:
When the demo2.cpp subcode file is scanned, a local symbol table similar to the following may be obtained:
The local symbol table of the demo2.cpp subcode file is simpler than that of the demo1.cpp subcode file. Its most important information is the definition of the CDemo2::Func function, which is exactly what is missing from the local symbol table of demo1.cpp. This is a significant problem that arises when each subcode file is processed on its own: the ability to look up symbols across files is missing.
Therefore, after the local symbol table set composed of the local symbol tables of the subcode files is obtained, the second construction subunit 3023 may build the global symbol table from the local symbol table set, so that symbols (that is, symbol parameters) such as classes, functions, or variables that are missing in a given subcode file can be found in the global symbol table, which provides cross-file symbol lookup.
In some embodiments, the second construction subunit 3023 is specifically configured to:
merge the local symbol tables in the local symbol table set, to obtain a merged symbol table; and
for identical symbol parameters in the merged symbol table, retain one of them and delete the others, to obtain the global symbol table.
Specifically, the second construction subunit 3023 may merge the local symbol tables in the local symbol table set to obtain a merged symbol table. Because the merged symbol table may contain identical symbol parameters, which may include classes, functions, variables, and the like, one of the identical symbol parameters may be retained and the others deleted to obtain the global symbol table. Alternatively, when the merged symbol table contains two identical symbol parameters of which one is undefined and the other is defined, the defined symbol parameter may be retained and the undefined one deleted during merging.
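The merge rule described above, keeping one copy of each identical symbol and preferring a defined copy over an undefined one, can be sketched as follows. The dictionary layout and the function name `merge_symbol_tables` are assumptions for this example; the symbol names follow the demo1.cpp and demo2.cpp example in this description.

```python
def merge_symbol_tables(local_tables):
    """Merge local symbol tables into one global table.

    Each local table maps a symbol name to a dict with the keys
    "kind" ("class", "function" or "variable") and "defined" (bool).
    """
    merged = {}
    for table in local_tables:
        for name, info in table.items():
            existing = merged.get(name)
            # Keep the first copy, unless a later copy supplies the definition.
            if existing is None or (info["defined"] and not existing["defined"]):
                merged[name] = dict(info)
    return merged


demo1 = {
    "CObject": {"kind": "class", "defined": True},
    "CDemo1": {"kind": "class", "defined": True},
    "CDemo2": {"kind": "class", "defined": True},
    "CDemo1::Func": {"kind": "function", "defined": True},
    "CDemo2::Func": {"kind": "function", "defined": False},
    "global_var1": {"kind": "variable", "defined": True},
}
demo2 = {
    "CObject": {"kind": "class", "defined": True},
    "CDemo2": {"kind": "class", "defined": True},
    "CDemo2::Func": {"kind": "function", "defined": True},
    "global_var2": {"kind": "variable", "defined": True},
}
global_table = merge_symbol_tables([demo1, demo2])
```

After the merge, `global_table` holds a single entry per symbol, and the entry for CDemo2::Func is the defined one contributed by demo2.cpp.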
For example, local symbol tables are built separately for the demo1.cpp and demo2.cpp subcode files, but the definition of the CDemo2::Func function is missing from demo1.cpp; the definition of that function is in the local symbol table of demo2.cpp. Therefore, to resolve the missing-symbol problem, the global symbol table may be built from the local symbol tables. The key to building the global symbol table is the lookup of symbols such as classes, functions, and variables, and the merge logic when symbols conflict.
To illustrate the structure of the global symbol table visually, symbol structure trees are used below. The symbol structure tree corresponding to the local symbol table of the demo1.cpp subcode file may be as shown in Figure 5. As Figure 5 shows, the local symbol table of the demo1.cpp subcode file may include classes such as CObject, CDemo1, and CDemo2, as well as Func functions, the global_var1 variable, and so on; Func is defined in the CDemo1 class, whereas the definition of Func is missing in the CDemo2 class.
The symbol structure tree corresponding to the local symbol table of the demo2.cpp subcode file may be as shown in Figure 6. As Figure 6 shows, the local symbol table of the demo2.cpp subcode file may include classes such as CObject and CDemo2, the global_var2 variable, and so on; Func is defined in the CDemo2 class.
The global symbol table built from the local symbol tables of the demo1.cpp and demo2.cpp subcode files may be as shown in Figure 7. As Figure 7 shows, the global symbol table may include classes such as CObject, CDemo1, and CDemo2, the Func functions, and variables such as global_var1 and global_var2; Func is defined in both the CDemo1 and CDemo2 classes. That is, after the symbol structure trees of demo1.cpp and demo2.cpp are merged, the definition of the CDemo2::Func function and the global_var2 variable from demo2.cpp have been merged into the symbol structure tree of demo1.cpp.
The second acquisition unit 303 is configured to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected.
To facilitate code detection, the second acquisition unit 303 needs to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected.
In some embodiments, the second acquisition unit 303 may specifically be configured to:
when the lexical unit sequence table set and the local symbol table set are stored, obtain from the lexical unit sequence table set the lexical unit sequence table corresponding to each subcode file in the code file to be detected; and
obtain from the local symbol table set the local symbol table corresponding to each subcode file in the code file to be detected.
Specifically, in obtaining the global symbol table as described above, the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected need to be obtained. Therefore, after obtaining the lexical unit sequence table set and the local symbol table set, the code detection apparatus may store them on a local hard disk; alternatively, it may upload them to a server, which stores them; and so on.
In this case, when obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file, the second acquisition unit 303 may determine whether the local hard disk, the server, or the like stores the lexical unit sequence table set; if so, the lexical unit sequence table corresponding to each subcode file in the code file to be detected may be obtained directly from the stored set. It may likewise determine whether the local hard disk, the server, or the like stores the local symbol table set; if so, the local symbol table corresponding to each subcode file in the code file to be detected may be obtained directly from the stored set.
In some embodiments, the second acquisition unit 303 may specifically be configured to:
when the lexical unit sequence table set and the local symbol table set are not stored, perform lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file; build, according to the lexical unit sequence table, the abstract syntax tree corresponding to each subcode file; and build, according to the abstract syntax tree, the local symbol table corresponding to each subcode file.
The code detection device may not have stored the lexical unit sequence table set, the local symbol table set, and so on that were obtained while building the global symbol table. In this case, while obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file, the second acquisition unit 303 can determine whether the local hard disk, the server, or the like stores a lexical unit sequence table set; if not, the code detection device needs to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file again.
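The reuse-or-rebuild decision described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name load_or_build, the JSON file format, and the build callback are hypothetical and are not specified by the embodiment, which only says that the tables may be kept on a local hard disk or a server.

```python
import json
from pathlib import Path


def load_or_build(cache_path: Path, build):
    """Return the stored tables if present; otherwise build and store them.

    `build` is a hypothetical callback that performs lexical analysis and
    symbol table construction and returns JSON-serializable data.
    """
    if cache_path.exists():
        # Tables were stored earlier (e.g. while the global symbol table
        # was being built), so read them back instead of recomputing.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    tables = build()
    cache_path.write_text(json.dumps(tables), encoding="utf-8")
    return tables
```

A second call with the same path reads the stored file and does not invoke the callback again.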
Specifically, the second acquisition unit 303 can perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file. For example, it can obtain the string value of each lexical unit in each subcode file, together with the attribute information associated with each lexical unit, and then generate a doubly linked list from the string values and attribute information; this list is the lexical unit sequence table corresponding to the subcode file.
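A minimal sketch of a lexical unit sequence table as a doubly linked list is given below. The token pattern, the attribute set (kind and line number), and the names Token, tokenize, and texts are assumptions made for illustration; the embodiment does not define which attributes are recorded or how lexical units are delimited.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: identifiers, integers, "::", or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|::|\S")


@dataclass(eq=False)
class Token:
    text: str                        # string value of the lexical unit
    kind: str                        # attribute: identifier / number / punct
    line: int                        # attribute: 1-based source line
    prev: Optional["Token"] = None   # link to the previous lexical unit
    next: Optional["Token"] = None   # link to the next lexical unit
    symbol: object = None            # symbol table pointer, set when linked


def classify(text: str) -> str:
    if text[0].isalpha() or text[0] == "_":
        return "identifier"
    if text.isdigit():
        return "number"
    return "punct"


def tokenize(source: str) -> Optional[Token]:
    """Return the head of the doubly linked token list (None if empty)."""
    head = tail = None
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            tok = Token(match.group(), classify(match.group()), line_no)
            if tail is None:
                head = tail = tok
            else:
                tail.next, tok.prev = tok, tail
                tail = tok
    return head


def texts(head: Optional[Token]) -> list:
    """Collect the string values by walking the list forwards."""
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out
```

The symbol field is left empty here; it stands for the pointer that is later set when a lexical unit is linked to a symbol table entry.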
The second acquisition unit 303 can also standardize each subcode file in the code file to be detected according to the corresponding lexical unit sequence table, obtaining a standard subcode file for each one. For example, it can derive a standard code format from the lexical unit sequence table corresponding to each subcode file and a code standard logical format; search each subcode file in the code file to be detected for target code formats that do not match the standard code format; and modify those target code formats according to the standard code format, obtaining the standard subcode file corresponding to each subcode file.
Then the second acquisition unit 303 obtains the lexical unit sequence table corresponding to each standard subcode file and sets it as the lexical unit sequence table corresponding to the respective subcode file. At this point, it can build the abstract syntax tree corresponding to each subcode file according to the lexical unit sequence table, and build the local symbol table corresponding to each subcode file according to the abstract syntax tree. For example, it can obtain from the abstract syntax tree the class list, function list, and variable list corresponding to each subcode file in the code file to be detected, and build the local symbol table corresponding to each subcode file from these lists.
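The step of deriving a class list, function list, and variable list from an abstract syntax tree can be illustrated with a stand-in: the sketch below uses Python's standard ast module on Python source, not the C++ front end the embodiment implies. The function name build_local_symbol_table and the dictionary layout are assumptions made for illustration.

```python
import ast


def build_local_symbol_table(source: str) -> dict:
    """Collect class, function, and variable names from one source file."""
    tree = ast.parse(source)
    table = {"classes": [], "functions": [], "variables": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            table["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            table["functions"].append(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            # A name in "store" context is an assignment target.
            if node.id not in table["variables"]:
                table["variables"].append(node.id)
    return table
```

Function parameters are not recorded as variables in this sketch, and the traversal order of ast.walk is not meaningful, so callers should not rely on list order.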
The updating unit 304 is configured to update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table.
The determination unit 305 is configured to determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
After the global symbol table and the local symbol table and lexical unit sequence table corresponding to each subcode file in the code file to be detected have been obtained, the updating unit 304 can update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table. For example, the updating unit 304 can traverse the lexical unit sequence table and look up and link the classes, functions, and variables in it. Building on the global symbol table constructed above, it uses a context-sensitive symbol lookup algorithm to look up symbols across files or modules, which yields more accurate code detection results.
The purpose of looking up and linking symbols such as classes, functions, and variables is to associate each class, function, variable, and so on in the traversed lexical unit sequence table with the local symbol table or global symbol table in which it is defined. When the lexical unit sequence table is later traversed, the information about each class, function, or variable is then directly available, which can significantly improve the efficiency of scanning for code check items.
In some embodiments, as shown in Figure 13, the updating unit 304 may include a second acquisition subunit 3041, a third acquisition subunit 3042, an update subunit 3043, and so on, as follows:
The second acquisition subunit 3041 is configured to obtain a subcode file from the code file to be detected as the current subcode file;
The third acquisition subunit 3042 is configured to obtain the lexical unit sequence table and the local symbol table corresponding to the current subcode file;
The update subunit 3043 is configured to update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table;
The determination unit 305 may specifically be configured to: determine the detection result of the current subcode file according to the local symbol table, the global symbol table, and the updated lexical unit sequence table; and trigger the second acquisition subunit to perform the operation of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been checked, obtaining the detection result of the code file to be detected.
The updating unit 304 can update the lexical unit sequence table using the context-sensitive symbol lookup algorithm. This algorithm traverses the lexical unit sequence table of each subcode file in the code file to be detected, obtains the type parameters in it, such as classes, functions, and variables, and pointer-links each type parameter to the corresponding type parameter in the local symbol table or the global symbol table, thereby updating the lexical unit sequence table. Implementing the context-sensitive symbol lookup algorithm on top of the global symbol table obtained above is an important guarantee of the correctness of the code detection results.
Specifically, the second acquisition subunit 3041 can obtain a subcode file from the code file to be detected as the current subcode file. The third acquisition subunit 3042 then obtains the lexical unit sequence table corresponding to the current subcode file from the lexical unit sequence table set and the local symbol table corresponding to the current subcode file from the local symbol table set. Alternatively, the third acquisition subunit 3042 performs lexical analysis on the current subcode file to obtain its lexical unit sequence table, builds the local symbol table corresponding to the current subcode file according to that lexical unit sequence table, and so on.
After the lexical unit sequence table and the local symbol table of the current subcode file have been obtained, the update subunit 3043 can update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table.
In some embodiments, the update subunit 3043 may specifically be configured to:
Obtain a type parameter in the lexical unit sequence table, where the type parameter includes the type name of the type parameter and the qualified name of the type parameter;
When the type name of the type parameter is not a system type name and the type name does not exist in the local symbol table, search the global symbol table for the type name of the type parameter;
When the type name of the type parameter exists in the global symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table;
If the match succeeds, pointer-link the type parameter in the lexical unit sequence table to the type parameter in the global symbol table, obtaining an updated lexical unit sequence table.
Specifically, the update subunit 3043 can traverse the lexical unit sequence table of the current subcode file and obtain the type parameters in it; there may be one or more type parameters. A type parameter may be a class, a function, a variable, and so on, and includes a type name and a qualified name, where the qualified name may have one or more components. For example, for the type parameter A::B::C, the type name is C and the qualified name is A::B.
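The A::B::C example can be expressed as a small helper. The function name split_type_parameter and the choice to return the qualified name as a list of components are assumptions made for illustration.

```python
def split_type_parameter(name: str):
    """Split a scoped name into (type name, qualified-name components)."""
    parts = name.split("::")
    # The last component is the type name; everything before it qualifies it.
    return parts[-1], parts[:-1]
```

An unqualified name such as main yields an empty list of qualifiers.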
After obtaining a type parameter, the update subunit 3043 can extract its type name and then determine whether that type name is a system type name. System type names may include the type names stored in the data that ships with the code compilation system, for example main. When the type name of the type parameter is a system type name, the type name need not be looked up in the local symbol table or the global symbol table; the lookup flow ends and the pointer from the type parameter to the symbol table is returned as null.
When the type name of the type parameter is not a system type name, the update subunit 3043 can further determine whether the type name exists in the local symbol table. When it does not, the update subunit 3043 can search the global symbol table for the type name, that is, determine whether the type name of the type parameter exists in the global symbol table. When it does, the qualified name of the type parameter in the lexical unit sequence table is matched against the qualified name of the type parameter in the global symbol table to determine whether the two match. If they match, the type parameter in the lexical unit sequence table is pointer-linked to the type parameter in the global symbol table; for example, the symbol table pointer of the type parameter in the lexical unit sequence table can be set to point to the type parameter in the global symbol table. This yields the updated lexical unit sequence table.
In some embodiments, the update subunit 3043 may further be configured to:
When the type name of the type parameter is not a system type name and the type name exists in the local symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table;
If the match succeeds, pointer-link the type parameter in the lexical unit sequence table to the type parameter in the local symbol table, obtaining an updated lexical unit sequence table.
When the type name of the type parameter is not a system type name, the update subunit 3043 can further determine whether the type name exists in the local symbol table. When it does, the qualified name of the type parameter in the lexical unit sequence table is matched against the qualified name of the type parameter in the local symbol table to determine whether the two match. If they match, the type parameter in the lexical unit sequence table is pointer-linked to the type parameter in the local symbol table; for example, the symbol table pointer of the type parameter in the lexical unit sequence table can be set to point to the type parameter in the local symbol table. This yields the updated lexical unit sequence table.
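The lookup order described above (system type names first, then the local symbol table, then the global symbol table, each followed by a qualified-name match) can be sketched as follows. The Symbol class, the table layout (type name mapped to a list of entries), and the contents of SYSTEM_TYPE_NAMES are assumptions made for illustration. The text does not say what happens when the type name exists locally but the qualified name does not match, so this sketch returns no link in that case rather than inventing a fallback.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: names already known to the compilation system, e.g. main.
SYSTEM_TYPE_NAMES = {"main"}


@dataclass
class Symbol:
    type_name: str
    qualifiers: tuple                  # ("A", "B") for A::B::C
    features: dict = field(default_factory=dict)


def find(table: dict, type_name: str, qualifiers) -> Optional[Symbol]:
    """Return the entry whose type name and qualified name both match."""
    for sym in table.get(type_name, []):
        if sym.qualifiers == tuple(qualifiers):
            return sym
    return None


def resolve(type_name, qualifiers, local_table, global_table) -> Optional[Symbol]:
    """Return the symbol to link to, or None when the pointer stays empty."""
    if type_name in SYSTEM_TYPE_NAMES:
        return None                    # system type name: lookup ends here
    if type_name in local_table:
        return find(local_table, type_name, qualifiers)
    if type_name in global_table:
        return find(global_table, type_name, qualifiers)
    return None
```

The returned object plays the role of the pointer target: a caller would store it on the lexical unit so that later scans can reach the symbol's stored features directly.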
After the lexical unit sequence table of the current subcode file has been updated, the determination unit 305 can determine the detection result of the current subcode file according to the global symbol table, the local symbol table of the current subcode file, and the updated lexical unit sequence table. For example, it can scan for code check items: based on the global symbol table and the symbol links built above, it scans the code for each erroneous-code scenario. That is, it traverses the updated lexical unit sequence table, finds the classes, functions, variables, and so on in it, consults the local symbol table and global symbol table in which each class, function, or variable found is defined, and derives the detection result of the current subcode file from the key features stored in the local symbol table or global symbol table. The code detection device can also output code error information in a given format; the output format can be set flexibly according to actual needs and is not limited here.
As an illustration, and as shown in Figure 8, take the demo1.cpp and demo2.cpp subcode files described above. With the global symbol table and the lookup and association of classes, functions, and variables, problems can be found that cannot be found with a single local symbol table. In the detection result, the demo.Func function call in the demo1.cpp subcode file is correctly associated with the CDemo2::Func function in the demo2.cpp subcode file. From the global symbol table it is known, as a key feature, that the return value is the null pointer NULL when type equals 1; an error can therefore be reported for the dereference of the null pointer p on line 19 of the demo1.cpp subcode file.
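A heavily simplified sketch of how such a key feature could drive the check is shown below. Everything in it is hypothetical: the statement records, the may_return_null feature name, and line 18 for the assignment are invented for illustration (only the dereference on line 19 comes from the example), and a real checker would work on the linked lexical unit sequence table and track conditions such as type equals 1.

```python
def scan_null_dereferences(statements, global_table):
    """Report dereferences of variables assigned from may-return-NULL calls.

    statements: iterable of (line, kind, variable, callee) records, where
    kind is "assign_call" or "deref" (a hypothetical simplified form).
    global_table: maps a qualified function name to its stored features.
    """
    may_be_null = {}   # variable name -> True if it may hold NULL
    findings = []
    for line, kind, var, callee in statements:
        if kind == "assign_call":
            features = global_table.get(callee, {})
            may_be_null[var] = bool(features.get("may_return_null"))
        elif kind == "deref" and may_be_null.get(var, False):
            findings.append((line, f"null pointer {var} dereferenced"))
    return findings
```

With an empty global symbol table the same statements produce no finding, which mirrors the point of the example: the feature of CDemo2::Func lives in another file and is only visible through the global table.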
Once the lexical unit sequence table of the current subcode file has been updated and the detection result of the current subcode file has been determined, another subcode file can be obtained from the code file to be detected as the current subcode file; that is, the flow returns to the step of obtaining a subcode file from the code file to be detected as the current subcode file. This repeats until all subcode files in the code file to be detected have been checked, yielding the detection result of the code file to be detected.
As can be seen from the above, in this embodiment of the present invention the first acquisition unit 301 obtains the code file to be detected, and the construction unit 302 builds the global symbol table corresponding to the code file to be detected, which may include global information about the code file to be detected. The second acquisition unit 303 obtains the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected; the local symbol table may include local information about the subcode file. The updating unit 304 then updates the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table, and the determination unit 305 determines the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table. Because this solution updates the lexical unit sequence table using the global symbol table and the local symbol table, and derives a global detection result for the entire code file to be detected from the updated lexical unit sequence table, the local symbol table, and the global symbol table, it performs global detection on the code file to be detected rather than only local detection on individual subcode files, which improves the accuracy of code detection.
Correspondingly, an embodiment of the present invention also provides a terminal, which may be the test terminal shown in Figure 15. The terminal may include a radio frequency (RF, Radio Frequency) circuit 601, a memory 602 including one or more computer-readable storage media, an input unit 603, a display unit 604, a sensor 605, an audio circuit 606, a Wireless Fidelity (WiFi, Wireless Fidelity) module 607, a processor 608 including one or more processing cores, a power supply 609, and other components. Those skilled in the art will understand that the terminal structure shown in Figure 15 does not limit the terminal, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently. Specifically:
The RF circuit 601 can be used to receive and send signals while sending and receiving information or during communication. In particular, after receiving downlink information from a base station, it passes the information to one or more processors 608 for processing; it also sends uplink data to the base station. The RF circuit 601 typically includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM, Subscriber Identity Module) card, a transceiver, a coupler, a low noise amplifier (LNA, Low Noise Amplifier), a duplexer, and so on. In addition, the RF circuit 601 can communicate with networks and other devices by wireless communication. The wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM, Global System of Mobile communication), General Packet Radio Service (GPRS, General Packet Radio Service), Code Division Multiple Access (CDMA, Code Division Multiple Access), Wideband Code Division Multiple Access (WCDMA, Wideband Code Division Multiple Access), Long Term Evolution (LTE, Long Term Evolution), e-mail, Short Messaging Service (SMS, Short Messaging Service), and so on.
The memory 602 can be used to store software programs and modules; the processor 608 performs various functional applications and data processing by running the software programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, where the program storage area can store the operating system, the application programs required by at least one function (such as a sound playback function or an image playback function), and so on, and the data storage area can store data created through use of the terminal (such as audio data or a phone book), and so on. In addition, the memory 602 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Correspondingly, the memory 602 may also include a memory controller to provide the processor 608 and the input unit 603 with access to the memory 602.
The input unit 603 can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control. Specifically, in one embodiment, the input unit 603 may include a touch-sensitive surface and other input devices. The touch-sensitive surface, also called a touch display screen or touchpad, collects touch operations by the user on or near it (for example, operations performed on or near the touch-sensitive surface with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connected device according to a preset program. Optionally, the touch-sensitive surface may include two parts, a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 608, and can receive and execute commands sent by the processor 608. The touch-sensitive surface can be implemented in several types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface, the input unit 603 may also include other input devices. Specifically, the other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys or an on/off key), a trackball, a mouse, a joystick, and so on.
The display unit 604 can be used to display information input by the user or provided to the user, as well as the various graphical user interfaces of the terminal; these graphical user interfaces can be composed of graphics, text, icons, video, and any combination thereof. The display unit 604 may include a display panel, which can optionally be configured as a liquid crystal display (LCD, Liquid Crystal Display), an organic light-emitting diode (OLED, Organic Light-Emitting Diode) display, or the like. Further, the touch-sensitive surface can cover the display panel. After the touch-sensitive surface detects a touch operation on or near it, it passes the operation to the processor 608 to determine the type of touch event, and the processor 608 then provides the corresponding visual output on the display panel according to the type of touch event. Although in Figure 15 the touch-sensitive surface and the display panel are two independent components that implement the input and output functions, in some embodiments the touch-sensitive surface and the display panel can be integrated to implement the input and output functions.
The terminal may also include at least one sensor 605, such as a light sensor, a motion sensor, or other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel according to the ambient light level, and the proximity sensor can turn off the display panel and/or the backlight when the terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications that recognize the terminal's posture (such as switching between landscape and portrait, related games, and magnetometer posture calibration), in vibration-recognition functions (such as a pedometer or tap detection), and so on. Other sensors that the terminal may also be equipped with, such as a gyroscope, barometer, hygrometer, thermometer, or infrared sensor, are not described here.
The audio circuit 606, a loudspeaker, and a microphone can provide an audio interface between the user and the terminal. The audio circuit 606 can convert received audio data into an electrical signal and transmit it to the loudspeaker, which converts it into a sound signal for output. Conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 606 receives and converts into audio data; after the audio data is output to the processor 608 for processing, it is sent through the RF circuit 601 to, for example, another terminal, or output to the memory 602 for further processing. The audio circuit 606 may also include an earphone jack to provide communication between a peripheral earphone and the terminal.
WiFi is a short-range wireless transmission technology. Through the WiFi module 607, the terminal can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although Figure 15 shows the WiFi module 607, it is understood that the module is not an essential part of the terminal and can be omitted as needed without changing the essence of the invention.
The processor 608 is the control center of the terminal. It connects the various parts of the entire terminal through various interfaces and lines, and it performs the terminal's functions and processes data by running or executing the software programs and/or modules stored in the memory 602 and calling the data stored in the memory 602, thereby monitoring the terminal as a whole. Optionally, the processor 608 may include one or more processing cores. Preferably, the processor 608 can integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and so on, and the modem processor mainly handles wireless communication. It is understood that the modem processor need not be integrated into the processor 608.
The terminal also includes a power supply 609 (such as a battery) that powers the components. Preferably, the power supply can be logically connected to the processor 608 through a power management system, so that functions such as charge management, discharge management, and power consumption management are handled by the power management system. The power supply 609 may also include any of one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and other such components.
Although not shown, the terminal may also include a camera, a Bluetooth module, and so on, which are not described here. Specifically, in this embodiment, the processor 608 in the terminal loads the executable files corresponding to the processes of one or more application programs into the memory 602 according to the following instructions, and runs the application programs stored in the memory 602, thereby implementing various functions:
Obtain the code file to be detected; build the global symbol table corresponding to the code file to be detected; obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected; update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table; and determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
Optionally, the step of building the global symbol table corresponding to the code file to be detected may include: obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table set; building, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table set; and building the global symbol table according to the local symbol table set.
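One plausible way to build a global symbol table from a local symbol table set is to merge the per-file tables while recording where each name is defined. The sketch below assumes each local table is a dictionary with classes, functions, and variables lists keyed by file name; this layout and the function name build_global_symbol_table are assumptions, since this passage does not specify the merge procedure.

```python
def build_global_symbol_table(local_tables: dict) -> dict:
    """Merge per-file local symbol tables into one name-indexed table.

    local_tables maps a file name to a dict with optional "classes",
    "functions", and "variables" lists (a hypothetical layout).
    """
    global_table = {}
    for file_name, table in local_tables.items():
        for kind in ("classes", "functions", "variables"):
            for name in table.get(kind, []):
                # Keep every definition so that same-named symbols from
                # different files can be told apart later.
                global_table.setdefault(name, []).append(
                    {"kind": kind, "file": file_name}
                )
    return global_table
```

Indexing by name lets a lookup that fails in the current file's local table continue in the merged table and find definitions from other files.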
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a given embodiment, refer to the detailed description of the code detection method above; they are not repeated here.
As can be seen from the above, this embodiment of the present invention can update the lexical unit sequence table using the global symbol table and the local symbol table, and derive a global detection result for the entire code file to be detected from the updated lexical unit sequence table, the local symbol table, and the global symbol table. It thus performs global detection on the code file to be detected rather than only local detection on individual subcode files, which improves the accuracy of code detection.
Those of ordinary skill in the art will understand that all or some of the steps in the methods of the above embodiments can be completed by instructions, or by instructions controlling the relevant hardware; the instructions can be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present invention provides a storage medium storing a plurality of instructions that can be loaded by a processor to perform the steps of any code detection method provided by the embodiments of the present invention. For example, the instructions can perform the following steps:
Obtain the code file to be detected; build the global symbol table corresponding to the code file to be detected; obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected; update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table; and determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
Optionally, the step of building the global symbol table corresponding to the code file to be detected may include: obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table set; building, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table set; and building the global symbol table according to the local symbol table set.
For the specific implementation of each of the above operations, refer to the preceding embodiments; details are not repeated here.
The storage medium may include read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, and so on.
Because the instructions stored in the storage medium can perform the steps of any code detection method provided by the embodiments of the present invention, they can achieve the beneficial effects achievable by any such method; for details, refer to the preceding embodiments, which are not repeated here.
The code detection method, device, storage medium, and test terminal provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. Meanwhile, for those skilled in the art, the specific implementation and scope of application may vary according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (23)
1. A code detection method, characterized by comprising:
Obtaining a code file to be detected;
Building a global symbol table corresponding to the code file to be detected;
Obtaining a lexical unit sequence table and a local symbol table corresponding to each subcode file in the code file to be detected;
Updating the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table;
Determining a detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
2. The code detection method according to claim 1, characterized in that the step of building the global symbol table corresponding to the code file to be detected comprises:
Obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table set;
Building, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table set;
Building the global symbol table according to the local symbol table set.
3. The code detection method according to claim 2, characterized in that the step of obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain the lexical unit sequence table set, comprises:
Performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file;
Standardizing each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain a standard code file set composed of the standard subcode files;
Obtaining the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, to obtain the lexical unit sequence table set.
4. The code detection method according to claim 3, characterized in that the step of standardizing each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain the standard code file set composed of the standard subcode files, comprises:
Obtaining a standard code format according to the lexical unit sequence table corresponding to each subcode file and a code standard logical format;
Searching each subcode file in the code file to be detected for a target code format that does not match the standard code format;
Modifying the target code format according to the standard code format, to obtain the standard code file set composed of the standard subcode files.
5. The code detection method according to claim 3, characterized in that the step of performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file, comprises:
Obtaining the string value of each lexical unit in each subcode file in the code file to be detected;
Obtaining the attribute information associated with each lexical unit;
Generating a doubly linked list according to the string value and the attribute information of each lexical unit, to obtain the lexical unit sequence table set composed of the lexical unit sequence tables corresponding to the subcode files.
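The doubly linked token list of claim 5 can be illustrated with a minimal sketch. The `Token` class, the `tokenize` and `texts` helpers, the regular expression and the attribute keys (`line`, `column`, `kind`) are all hypothetical names chosen for this illustration; the patent does not specify a concrete data layout or tokenizer.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Token:
    """One lexical unit: its string value, attributes, and both neighbours."""
    text: str
    attrs: dict = field(default_factory=dict)
    prev: Optional["Token"] = field(default=None, repr=False)
    next: Optional["Token"] = field(default=None, repr=False)


# Deliberately tiny tokenizer (an assumption): identifiers, integers,
# and any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def classify(text: str) -> str:
    if text[0].isalpha() or text[0] == "_":
        return "identifier"
    if text.isdigit():
        return "number"
    return "punct"


def tokenize(source: str) -> Optional[Token]:
    """Return the head of a doubly linked list of tokens (None if empty)."""
    head = tail = None
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            tok = Token(match.group(), {
                "line": lineno,
                "column": match.start(),
                "kind": classify(match.group()),
            })
            if tail is None:
                head = tail = tok
            else:
                tail.next = tok   # forward link
                tok.prev = tail   # backward link
                tail = tok
    return head


def texts(head: Optional[Token]) -> list:
    """Walk the list forwards and collect the string values."""
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out
```

Keeping both `prev` and `next` links lets a later pass look at the tokens around a type name in either direction without re-scanning the file.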
6. The code detection method according to claim 5, characterized in that the step of obtaining the attribute information associated with each lexical unit comprises:
Obtaining, for each lexical unit, the pointers to the various kinds of information in each subcode file in the code file to be detected, to obtain pointer information;
Obtaining the characteristic information of each lexical unit;
Setting the pointer information and the characteristic information of each lexical unit as the attribute information associated with that lexical unit.
7. The code detection method according to claim 2, characterized in that the step of building, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set, comprises:
Building, according to each lexical unit sequence table in the lexical unit sequence table set, the abstract syntax tree corresponding to each subcode file in the code file to be detected;
Building, according to the abstract syntax tree, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set.
8. The code detection method according to claim 7, characterized in that the step of building, according to the abstract syntax tree, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set, comprises:
Obtaining, according to the abstract syntax tree, the class list, function list and variable list corresponding to each subcode file in the code file to be detected;
Building the local symbol table corresponding to each subcode file according to the class list, function list and variable list corresponding to that subcode file, to obtain the local symbol table set.
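As a rough illustration of claims 7 and 8, the sketch below derives a class list, function list and variable list from an abstract syntax tree. It is only an analogy: it uses Python's standard `ast` module on Python source, whereas the patent does not name the language being analysed or the tree representation. The function name `build_local_symbol_table` and the dictionary layout are assumptions.

```python
import ast


def build_local_symbol_table(source: str, filename: str = "<subcode file>") -> dict:
    """Collect class, function and variable names from one source file."""
    tree = ast.parse(source, filename=filename)
    table = {"classes": [], "functions": [], "variables": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            table["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            table["functions"].append(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            # A name in "store" context is an assignment target, i.e. a variable.
            if node.id not in table["variables"]:
                table["variables"].append(node.id)
    return table
```

`ast.walk` visits nodes in no guaranteed source order, so callers that need a stable order should sort the lists.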
9. The code detection method according to claim 2, characterized in that the step of building the global symbol table according to the local symbol table set comprises:
Merging the local symbol tables in the local symbol table set, to obtain a merged symbol table;
Retaining one of each group of identical symbol parameters in the merged symbol table and deleting the other parameters in that group, to obtain the global symbol table.
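The merge-and-deduplicate step of claim 9 might look like the following sketch. The representation of a symbol as a `(qualified_name, kind)` pair, the file names and the function name `build_global_symbol_table` are assumptions made for illustration; the claim only requires that identical symbol parameters are kept once.

```python
def build_global_symbol_table(local_tables: dict) -> dict:
    """Merge per-file symbol lists into one table without duplicates.

    local_tables maps a file name to a list of (qualified_name, kind) pairs.
    The result maps each distinct pair to the first file that declared it.
    """
    global_table = {}
    for filename, symbols in local_tables.items():
        for qualified_name, kind in symbols:
            # setdefault keeps the first occurrence and ignores later duplicates.
            global_table.setdefault((qualified_name, kind), filename)
    return global_table
```

Recording the first declaring file is a design choice of this sketch, not something the claim requires; a set of pairs would satisfy the deduplication just as well.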
10. The code detection method according to claim 2, characterized in that the step of obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected comprises:
When the lexical unit sequence table set and the local symbol table set are stored, obtaining from the lexical unit sequence table set the lexical unit sequence table corresponding to each subcode file in the code file to be detected;
Obtaining from the local symbol table set the local symbol table corresponding to each subcode file in the code file to be detected.
11. The code detection method according to claim 2, characterized in that the step of obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected comprises:
When the lexical unit sequence table set and the local symbol table set are not stored, performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file;
Building, according to the lexical unit sequence table, the abstract syntax tree corresponding to each subcode file;
Building, according to the abstract syntax tree, the local symbol table corresponding to each subcode file.
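Claims 10 and 11 together describe a reuse-or-rebuild policy: stored tables are read back when they exist, and are recomputed otherwise. A minimal sketch of that policy follows. The `AnalysisCache` class, its `misses` counter and the stand-in `analyze` function are hypothetical; a real `analyze` would perform the lexical analysis, syntax tree construction and symbol table construction described in claim 11.

```python
class AnalysisCache:
    """Reuse stored per-file analysis results, computing them only once."""

    def __init__(self, analyze):
        self._analyze = analyze   # callable: source text -> analysis result
        self._store = {}          # file name -> stored result
        self.misses = 0           # how many times analysis actually ran

    def get(self, filename: str, source: str):
        if filename not in self._store:       # nothing stored: rebuild (claim 11)
            self.misses += 1
            self._store[filename] = self._analyze(source)
        return self._store[filename]          # stored: reuse (claim 10)


def analyze(source: str) -> dict:
    """Stand-in analysis: whitespace tokens plus the identifier-like ones."""
    tokens = source.split()
    symbols = sorted({t for t in tokens if t.isidentifier()})
    return {"tokens": tokens, "symbols": symbols}
```

This sketch keys the cache on the file name only; a practical version would also need to invalidate an entry when the file content changes.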
12. The code detection method according to any one of claims 1 to 11, characterized in that the steps of updating the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain the updated lexical unit sequence table, and determining the detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table, comprise:
Obtaining a subcode file from the code file to be detected as the current subcode file;
Obtaining the lexical unit sequence table and the local symbol table corresponding to the current subcode file;
Updating the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table;
Determining the detection result of the current subcode file according to the local symbol table, the global symbol table and the updated lexical unit sequence table;
Returning to the step of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been detected, to obtain the detection result of the code file to be detected.
13. The code detection method according to claim 12, characterized in that the step of updating the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain the updated lexical unit sequence table, comprises:
Obtaining a type parameter in the lexical unit sequence table, the type parameter including a type name and a qualified name;
When the type name of the type parameter is not a system type name and does not exist in the local symbol table, searching the global symbol table for the type name of the type parameter;
When the type name of the type parameter exists in the global symbol table, matching the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table;
If the match succeeds, linking the type parameter in the lexical unit sequence table to the type parameter in the global symbol table by pointer, to obtain the updated lexical unit sequence table.
14. The code detection method according to claim 13, characterized in that, after the step of obtaining the type parameter in the lexical unit sequence table, the method further comprises:
When the type name of the type parameter is not a system type name and exists in the local symbol table, matching the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table;
If the match succeeds, linking the type parameter in the lexical unit sequence table to the type parameter in the local symbol table by pointer, to obtain the updated lexical unit sequence table.
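The lookup order of claims 13 and 14 (skip system types, try the local symbol table, otherwise fall back to the global symbol table, and link only when the qualified names match) can be sketched as follows. The `Symbol` and `TypeToken` classes, the `SYSTEM_TYPES` set, the `link_type` function and the representation of each symbol table as a dictionary keyed by type name are assumptions for illustration; the "pointer link" is modelled as an object reference stored in `target`.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of built-in type names that never need resolving.
SYSTEM_TYPES = {"int", "float", "bool", "string", "void"}


@dataclass
class Symbol:
    type_name: str
    qualified_name: str


@dataclass
class TypeToken:
    type_name: str
    qualified_name: str
    target: Optional[Symbol] = None   # the "pointer link" to the declaration


def link_type(token: TypeToken, local_table: dict, global_table: dict) -> bool:
    """Link a type token to its declaration; return True if a link was made."""
    if token.type_name in SYSTEM_TYPES:
        return False
    # Claim 14: use the local table when it knows the type name.
    # Claim 13: otherwise search the global table.
    table = local_table if token.type_name in local_table else global_table
    candidate = table.get(token.type_name)
    if candidate is not None and candidate.qualified_name == token.qualified_name:
        token.target = candidate
        return True
    return False
```

A token whose `target` is still `None` after this pass refers to a type that neither table could confirm, which is the kind of cross-file problem a purely per-file check would miss.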
15. The code detection method according to any one of claims 1 to 11, characterized in that the step of obtaining the code file to be detected comprises:
Obtaining a code file, and extracting invalid code content from the code file according to code rules;
Filtering out the invalid code content, to obtain the code file to be detected.
16. The code detection method according to claim 15, characterized in that the step of extracting invalid code content from the code file according to code rules comprises:
Extracting redundant characters from the code file according to the code rules;
Extracting comment content from the code file according to the comment markers in the code rules;
Extracting preprocessing directives from the code file according to the preprocessing markers in the code rules;
Setting the redundant characters, comment content and preprocessing directives as the invalid code content.
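The filtering of claims 15 and 16 can be approximated with regular expressions, as in the sketch below. It assumes C-style syntax (`//` and `/* */` comments, `#` directives) and treats surplus whitespace and blank lines as the "redundant characters"; these are assumptions, since the claims leave the code rules open. The sketch also ignores string literals, so a `//` inside a string would be stripped by mistake.

```python
import re

# C-style comments: line comments, or block comments spanning lines.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
# Preprocessing directives: lines whose first non-blank character is '#'.
DIRECTIVE_RE = re.compile(r"^[ \t]*#[^\n]*$", re.MULTILINE)


def filter_invalid_content(source: str) -> str:
    """Remove comments, preprocessing directives and redundant whitespace."""
    text = COMMENT_RE.sub("", source)
    text = DIRECTIVE_RE.sub("", text)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

Running this before lexical analysis keeps comments and directives out of the lexical unit sequence table.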
17. A code detection device, characterized by comprising:
A first acquisition unit, configured to obtain a code file to be detected;
A construction unit, configured to build a global symbol table corresponding to the code file to be detected;
A second acquisition unit, configured to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected;
An updating unit, configured to update the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table;
A determination unit, configured to determine the detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table.
18. The code detection device according to claim 17, characterized in that the construction unit comprises:
A first obtaining subunit, configured to obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table set;
A first building subunit, configured to build, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table set;
A second building subunit, configured to build the global symbol table according to the local symbol table set.
19. The code detection device according to claim 18, characterized in that the first obtaining subunit comprises:
An analysis module, configured to perform lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file;
A processing module, configured to standardize each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain a standard code file set composed of the standard subcode files;
An acquisition module, configured to obtain the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, to obtain the lexical unit sequence table set.
20. The code detection device according to any one of claims 17 to 19, characterized in that the updating unit comprises:
A second obtaining subunit, configured to obtain a subcode file from the code file to be detected as the current subcode file;
A third obtaining subunit, configured to obtain the lexical unit sequence table and the local symbol table corresponding to the current subcode file;
An updating subunit, configured to update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table;
The determination unit is specifically configured to: determine the detection result of the current subcode file according to the local symbol table, the global symbol table and the updated lexical unit sequence table; and trigger the second obtaining subunit to perform the operation of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been detected, to obtain the detection result of the code file to be detected.
21. The code detection device according to claim 20, characterized in that the updating subunit is specifically configured to:
Obtain a type parameter in the lexical unit sequence table, the type parameter including a type name and a qualified name;
When the type name of the type parameter is not a system type name and exists in the local symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table; if the match succeeds, link the type parameter in the lexical unit sequence table to the type parameter in the local symbol table by pointer, to obtain the updated lexical unit sequence table;
When the type name of the type parameter is not a system type name and does not exist in the local symbol table, search the global symbol table for the type name of the type parameter;
When the type name of the type parameter exists in the global symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table;
If the match succeeds, link the type parameter in the lexical unit sequence table to the type parameter in the global symbol table by pointer, to obtain the updated lexical unit sequence table.
22. A storage medium, characterized in that the storage medium stores a plurality of instructions, the instructions being suitable for loading by a processor to perform the steps of the code detection method according to any one of claims 1 to 16.
23. A test terminal, characterized in that the test terminal comprises at least one memory and at least one processor; the memory stores a program, and the processor calls the program to perform the steps of the code detection method according to any one of claims 1 to 16.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810321498.XA CN108549538B (en) | 2018-04-11 | 2018-04-11 | Code detection method and device, storage medium and test terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810321498.XA CN108549538B (en) | 2018-04-11 | 2018-04-11 | Code detection method and device, storage medium and test terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108549538A (en) | 2018-09-18 |
CN108549538B (en) | 2021-03-02 |
Family
ID=63514479
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810321498.XA Active CN108549538B (en) | 2018-04-11 | 2018-04-11 | Code detection method and device, storage medium and test terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108549538B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109582575A (en) * | 2018-11-27 | 2019-04-05 | 网易(杭州)网络有限公司 | Game test method and device |
CN110297639A (en) * | 2019-07-01 | 2019-10-01 | 北京百度网讯科技有限公司 | Method and apparatus for detecting code |
CN110309050A (en) * | 2019-05-22 | 2019-10-08 | 深圳壹账通智能科技有限公司 | Detection method, device, server and the storage medium of code specification |
CN110489127A (en) * | 2019-08-12 | 2019-11-22 | 腾讯科技(深圳)有限公司 | Error code determines method, apparatus, computer readable storage medium and equipment |
CN110489973A (en) * | 2019-08-06 | 2019-11-22 | 广州大学 | A kind of intelligent contract leak detection method, device and storage medium based on Fuzz |
CN110879709A (en) * | 2019-11-29 | 2020-03-13 | 五八有限公司 | Detection method and device of useless codes, terminal equipment and storage medium |
CN111651198A (en) * | 2020-04-20 | 2020-09-11 | 北京大学 | Automatic code abstract generation method and device |
CN112276263A (en) * | 2020-10-14 | 2021-01-29 | 宁波市博虹机械制造开发有限公司 | G code-based special motion control method for electric spark forming machine |
CN112651213A (en) * | 2020-12-25 | 2021-04-13 | 军工保密资格审查认证中心 | Safety examination method and device for numerical control program |
CN113946347A (en) * | 2021-09-29 | 2022-01-18 | 北京五八信息技术有限公司 | Function call detection method and device, electronic equipment and readable medium |
CN117149663A (en) * | 2023-10-30 | 2023-12-01 | 合肥中科类脑智能技术有限公司 | Multi-target detection algorithm deployment method and device, electronic equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH03255533A (en) * | 1990-03-06 | 1991-11-14 | Fujitsu Ltd | Symbol managing system in programming language processing system |
CN103780263A (en) * | 2012-10-22 | 2014-05-07 | 株式会社特博睿 | Device and method of data compression and recording medium |
CN105930267A (en) * | 2016-04-15 | 2016-09-07 | 中国工商银行股份有限公司 | Database dictionary based storage process static detection method and system |
CN106227668A (en) * | 2016-07-29 | 2016-12-14 | 腾讯科技(深圳)有限公司 | Data processing method and device |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109582575B (en) * | 2018-11-27 | 2022-03-22 | 网易(杭州)网络有限公司 | Game testing method and device |
CN109582575A (en) * | 2018-11-27 | 2019-04-05 | 网易(杭州)网络有限公司 | Game test method and device |
CN110309050A (en) * | 2019-05-22 | 2019-10-08 | 深圳壹账通智能科技有限公司 | Detection method, device, server and the storage medium of code specification |
CN110297639A (en) * | 2019-07-01 | 2019-10-01 | 北京百度网讯科技有限公司 | Method and apparatus for detecting code |
CN110297639B (en) * | 2019-07-01 | 2023-03-21 | 北京百度网讯科技有限公司 | Method and apparatus for detecting code |
CN110489973A (en) * | 2019-08-06 | 2019-11-22 | 广州大学 | A kind of intelligent contract leak detection method, device and storage medium based on Fuzz |
CN110489127A (en) * | 2019-08-12 | 2019-11-22 | 腾讯科技(深圳)有限公司 | Error code determines method, apparatus, computer readable storage medium and equipment |
CN110489127B (en) * | 2019-08-12 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Error code determination method, apparatus, computer-readable storage medium and device |
CN110879709A (en) * | 2019-11-29 | 2020-03-13 | 五八有限公司 | Detection method and device of useless codes, terminal equipment and storage medium |
CN111651198A (en) * | 2020-04-20 | 2020-09-11 | 北京大学 | Automatic code abstract generation method and device |
CN111651198B (en) * | 2020-04-20 | 2021-04-13 | 北京大学 | Automatic code abstract generation method and device |
CN112276263A (en) * | 2020-10-14 | 2021-01-29 | 宁波市博虹机械制造开发有限公司 | G code-based special motion control method for electric spark forming machine |
CN112651213A (en) * | 2020-12-25 | 2021-04-13 | 军工保密资格审查认证中心 | Safety examination method and device for numerical control program |
CN113946347A (en) * | 2021-09-29 | 2022-01-18 | 北京五八信息技术有限公司 | Function call detection method and device, electronic equipment and readable medium |
CN117149663A (en) * | 2023-10-30 | 2023-12-01 | 合肥中科类脑智能技术有限公司 | Multi-target detection algorithm deployment method and device, electronic equipment and medium |
CN117149663B (en) * | 2023-10-30 | 2024-02-02 | 合肥中科类脑智能技术有限公司 | Multi-target detection algorithm deployment method and device, electronic equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN108549538B (en) | 2021-03-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108549538A (en) | A kind of code detection method, device, storage medium and test terminal | |
US10324909B2 (en) | Omega names: name generation and derivation utilizing nested three or more attributes | |
CN106227774B (en) | Information search method and device | |
CN108763887A (en) | Database manipulation requests verification method, apparatus, server and storage medium | |
US20140359587A1 (en) | Deeply parallel source code compilation | |
US9483508B1 (en) | Omega names: name generation and derivation | |
US20160306736A1 (en) | Translation verification testing | |
US9311077B2 (en) | Identification of code changes using language syntax and changeset data | |
CN110058850A (en) | A kind of development approach of application, device and storage medium | |
US11243750B2 (en) | Code completion with machine learning | |
CN112860265A (en) | Method and device for detecting operation abnormity of source code database | |
CN108959454B (en) | Prompting clause specifying method, device, equipment and storage medium | |
CN110188366A (en) | A kind of information processing method, device and storage medium | |
CN111949328B (en) | Start acceleration method and device, computer equipment and storage medium | |
CN108763222A (en) | Detection, interpretation method and device, server and storage medium are translated in a kind of leakage | |
WO2017167118A1 (en) | Method and device for compiling computer language | |
CN107729015A (en) | A kind of method and apparatus for determining the useless function in engineering code | |
CN113821496B (en) | Database migration method, system, device and computer readable storage medium | |
CN107741901A (en) | A kind of method of testing and device of linked database sentence | |
CN109635175A (en) | Page data joining method, device, readable storage medium storing program for executing and electronic equipment | |
CN109446078A (en) | Code test method and device, storage medium, electronic equipment | |
EP4075320A1 (en) | A method and device for improving the efficiency of pattern recognition in natural language | |
CN112069198B (en) | SQL analysis optimization method and device | |
CN107220349B (en) | Method and system for predicting database release time | |
WO2022256573A1 (en) | System and method for detecting vulnerabilities in object-oriented program code using an object property graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||