CN108549538A - Code detection method, device, storage medium and test terminal


Info

Publication number: CN108549538A
Authority: CN (China)
Prior art keywords: file, code, lexical unit, unit sequence, subcode
Legal status: Granted, active (as listed by Google Patents; an assumption, not a legal conclusion)
Application number: CN201810321498.XA
Other languages: Chinese (zh)
Other versions: CN108549538B (en)
Inventors: 邹越, 袁明凯, 黄斌, 张蓓, 严明, 魏学峰
Current assignee: Shenzhen Tencent Network Information Technology Co Ltd (as listed by Google Patents; may be inaccurate)
Original assignee: Shenzhen Tencent Network Information Technology Co Ltd
Application filed by: Shenzhen Tencent Network Information Technology Co Ltd
Priority: CN201810321498.XA
Publications: CN108549538A (application), CN108549538B (granted)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/3624 Software debugging by performing operations on the source code, e.g. via a compiler

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)

Abstract

Embodiments of the present invention disclose a code detection method, device, storage medium and test terminal. An embodiment obtains a code file to be detected; builds a global symbol table corresponding to the code file to be detected; obtains the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected; updates the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table; and determines the detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table. This scheme performs global detection on the code file to be detected rather than only local detection on each subcode file separately, which improves the accuracy of code detection.

Description

Code detection method, device, storage medium and test terminal
Technical field
The present invention relates to the field of computer technology, and in particular to a code detection method, device, storage medium and test terminal.
Background
In software project development, once code has been written it usually has to undergo various kinds of detection to ensure its usability, correctness and so on. The resulting code detection results can then help developers locate problems hidden in the code so that the code can be repaired accordingly.
In the prior art, the code file of a project generally contains multiple subcode files, and during code detection each subcode file in the code file has to be detected separately. For example, the code in one subcode file is analyzed first to obtain the local information of that subcode file, local detection is performed according to that local information, and a local detection result for the subcode file is obtained. The code in another subcode file is then locally detected in the same way to obtain its local detection result, and so on, until all subcode files in the code file have been detected.
In researching and practicing the prior art, the inventors found that, because existing code detection schemes detect each subcode file in the code file independently, the detection process is relatively simple and the result is only a local detection result for each subcode file in the code file; the accuracy of the resulting detection is therefore low.
Summary of the invention
Embodiments of the present invention provide a code detection method, device, storage medium and test terminal, intended to improve the accuracy of code detection.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
A code detection method, including:
obtaining a code file to be detected;
building a global symbol table corresponding to the code file to be detected;
obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected;
updating the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table;
determining the detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table.
A code detection device, including:
a first acquisition unit, configured to obtain a code file to be detected;
a construction unit, configured to build a global symbol table corresponding to the code file to be detected;
a second acquisition unit, configured to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected;
an updating unit, configured to update the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table;
a determination unit, configured to determine the detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table.
A storage medium storing a plurality of instructions, the instructions being suitable for loading by a processor to execute the steps of any code detection method provided by the embodiments of the present invention.
A test terminal, including at least one processor and at least one memory; the memory stores a program, and the processor calls the program to execute the steps of any code detection method provided by the embodiments of the present invention.
An embodiment of the present invention can build a global symbol table corresponding to the code file to be detected, where the global symbol table may include the global information of the code file to be detected, and can obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected, where the local symbol table may include the local information of a subcode file. The lexical unit sequence table is then updated according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table, and the detection result of the code file to be detected is determined according to the local symbol table, the global symbol table and the updated lexical unit sequence table. Because the scheme updates the lexical unit sequence table using the global symbol table and the local symbol table, and derives a global detection result for the entire code file to be detected from the updated lexical unit sequence table together with the local and global symbol tables, it performs global detection on the code file to be detected rather than only local detection on each subcode file separately, which improves the accuracy of code detection.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a scenario of the code detection system provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the code detection method provided by an embodiment of the present invention;
Fig. 3 is a partial schematic diagram of the lexical unit sequence table provided by an embodiment of the present invention;
Fig. 4 is a partial schematic diagram of the abstract syntax tree provided by an embodiment of the present invention;
Fig. 5 is a partial schematic diagram of the structure tree formed by the local symbol table of a subcode file provided by an embodiment of the present invention;
Fig. 6 is a partial schematic diagram of the structure tree formed by the local symbol table of another subcode file provided by an embodiment of the present invention;
Fig. 7 is a partial schematic diagram of the structure tree corresponding to the global symbol table provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of code detection provided by an embodiment of the present invention;
Fig. 9 is another schematic flowchart of the code detection method provided by an embodiment of the present invention;
Fig. 10 is a schematic flowchart of updating the lexical unit sequence table provided by an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of the code detection device provided by an embodiment of the present invention;
Fig. 12 is another schematic structural diagram of the code detection device provided by an embodiment of the present invention;
Fig. 13 is another schematic structural diagram of the code detection device provided by an embodiment of the present invention;
Fig. 14 is another schematic structural diagram of the code detection device provided by an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of the terminal provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiments of the present invention provide a code detection method, device, storage medium and test terminal.
Referring to Fig. 1, Fig. 1 is a schematic diagram of a scenario of the code detection system provided by an embodiment of the present invention. The code detection system may include a code detection device, which may be integrated in a terminal that has a storage unit, is equipped with a microprocessor and has computing capability, such as a tablet computer, a laptop or a desktop computer. The device is mainly used to obtain the code file to be detected. The code file may be generated by code programming development software; alternatively, for example, an acquisition request for the code file may be sent to a server and the code file returned by the server based on that request received. Invalid code content in the code file, such as redundant characters or comment content, may then be filtered out to obtain the code file to be detected.
The device then builds the global symbol table corresponding to the code file to be detected. The global symbol table may include information such as the classes, functions and variables of all subcode files in the code file to be detected. For example, the lexical unit sequence table corresponding to each subcode file in the code file to be detected may be obtained first, giving a lexical unit sequence table set; then, according to each lexical unit sequence table in that set, the local symbol table corresponding to each subcode file is built, giving a local symbol table set; and the global symbol table is built according to the local symbol table set. Here, a lexical unit sequence table may include the string value, attribute information and so on of each lexical unit; a local symbol table may include information such as the classes, functions and variables of a subcode file, and one local symbol table corresponds to one subcode file.
The device may also obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected. For example, the lexical unit sequence table of each subcode file may be taken from the lexical unit sequence table set and the local symbol table of each subcode file from the local symbol table set; alternatively, lexical analysis may be performed on each subcode file in the code file to be detected to obtain its lexical unit sequence table, and the local symbol table of each subcode file built from that lexical unit sequence table. After the lexical unit sequence table, the local symbol table and the global symbol table are obtained, the lexical unit sequence table may be updated according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table. Finally, the detection result of the code file to be detected may be determined according to the local symbol table, the global symbol table and the updated lexical unit sequence table.
In addition, the code detection system may also include a server, which may be used to store the code files, lexical unit sequence tables, local symbol tables, global symbol tables and so on uploaded by the terminal. The server may also receive an acquisition request for a code file sent by the terminal and send the code file to the terminal based on that request; or it may receive an acquisition request for a local symbol table sent by the terminal and send the local symbol table to the terminal based on that request, and so on.
It should be noted that the scenario of the code detection system shown in Fig. 1 is only an example. The code detection system and scenario described in the embodiments of the present invention are intended to explain the technical solutions of the embodiments more clearly and do not limit them. Those of ordinary skill in the art will appreciate that, as code detection systems evolve and new business scenarios emerge, the technical solutions provided by the embodiments of the present invention remain applicable to similar technical problems.
Detailed descriptions are given below.
This embodiment is described from the perspective of the code detection device, which may be integrated in a terminal that has a storage unit, is equipped with a microprocessor and has computing capability, such as a tablet computer, a laptop or a desktop computer.
A code detection method, including: obtaining a code file to be detected; building a global symbol table corresponding to the code file to be detected; obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected; updating the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table; and determining the detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table.
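The five steps above can be illustrated with a minimal Python sketch. It is not the embodiment's implementation: it assumes a toy language in which "def NAME" declares a function and "call NAME" uses one, and every name in it (lex, build_local_symbols, detect, the "symbol" field) is hypothetical.

```python
import re

# Hypothetical toy tokenizer: identifiers, numbers, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def lex(source):
    """Lexical unit sequence of one subcode file, as a list of dicts."""
    return [{"text": t, "symbol": None} for t in TOKEN_RE.findall(source)]


def build_local_symbols(tokens):
    """Toy rule (an assumption): the identifier after 'def' is a function."""
    return {tokens[i + 1]["text"]: "function"
            for i in range(len(tokens) - 1) if tokens[i]["text"] == "def"}


def detect(files):
    """files maps a subcode file name to its source text."""
    sequences = {name: lex(src) for name, src in files.items()}
    local_tables = {n: build_local_symbols(t) for n, t in sequences.items()}

    # Global symbol table: every symbol, mapped to the file that declares it.
    global_table = {}
    for name, table in local_tables.items():
        for symbol in table:
            global_table.setdefault(symbol, name)

    findings = []
    for name, tokens in sequences.items():
        for i in range(1, len(tokens)):
            if tokens[i - 1]["text"] != "call":
                continue
            text = tokens[i]["text"]
            # Update step: attach the resolved symbol to the lexical unit.
            if text in local_tables[name]:
                tokens[i]["symbol"] = (name, text)
            elif text in global_table:
                tokens[i]["symbol"] = (global_table[text], text)
            else:
                # Detection step: unresolved in every subcode file.
                findings.append((name, text))
    return findings, sequences
```

With purely local detection, a call from one subcode file to a function declared in another would be reported as unresolved; here only names missing from every subcode file are reported.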
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the code detection method provided by an embodiment of the present invention. The code detection method may include:
In step S101, a code file to be detected is obtained.
Here, code may be a source file written by a developer in a language supported by a development tool; it may be a well-defined system of rules in which characters, symbols, signal elements and the like represent information in discrete form. The code may be a source file written in C++, C, Java or the like, or a source file written in another language; the specific content is not limited here.
The code detection device first obtains the code file to be detected. The code file to be detected may be the code file of a software project and may include one or more subcode files.
In some embodiments, the step of obtaining the code file to be detected may include:
(1) obtaining a code file, and extracting invalid code content from the code file according to code rules.
(2) filtering out the invalid code content to obtain the code file to be detected.
Specifically, the code detection device may obtain the code file, for example, from a locally pre-stored code library, in which case the code file may have been generated in advance by the code detection device using a code programming development tool. Alternatively, the device may send a code file acquisition request to a server and receive the code file that the server returns based on that request; this code file may have been uploaded to the server by the code detection device or another terminal and stored by the server. It can be understood that the code file may also be obtained in other ways; the specific content is not limited here.
After obtaining the code file, the code detection device may preprocess it to filter out content that subsequent processing does not need; for example, parts unrelated to valid code may be filtered out so that subsequent lexical analysis receives a well-formed character stream. The preprocessing may include extracting invalid code content from the code file according to code rules. When the code file is a source file written in C++, the code rules may be the writing rules of the C++ language; when the code file is a source file written in C, the code rules may be the writing rules of the C language; when the code file is a source file written in Java, the code rules may be the writing rules of the Java language; and so on. The invalid code content may include comment content, preprocessing directives and so on, and may also include other content; the specific content is not limited here.
After obtaining the invalid code content, the code detection device may filter it out, that is, delete the invalid code content from the code file to obtain the code file to be detected. When the code file includes multiple subcode files, the subcode files may be traversed, the invalid code content extracted from each subcode file, and that invalid code content filtered out.
Optionally, the step of extracting invalid code content from the code file according to code rules may include:
(a) extracting redundant characters from the code file according to the code rules;
(b) extracting comment content from the code file according to the comment markers in the code rules;
(c) extracting preprocessing directives from the code file according to the preprocessing markers in the code rules;
(d) setting the redundant characters, comment content and preprocessing directives as the invalid code content.
Specifically, the code detection device may extract redundant characters from the code file according to the code rules. The redundant characters may include redundant spaces, redundant blank lines, redundant brackets and so on. For example, when the code file is a source file written in C++, three redundant blank lines may be extracted from the code file according to the code rules for C++. As another example, when the code file is a source file written in Java, five redundant spaces may be extracted from six consecutive spaces present in the code file according to the code rules for Java; and so on.
The code detection device may extract comment content from the code file according to the comment markers in the code rules, where a comment marker may be a double slash "//", or "/*" and "*/", and so on. For example, when the code file is a source file written in C++, the comment marker "//" may be looked up in the code file according to the code rules for C++, and the comment content extracted from the line on which "//" appears; the comment content includes "//" and everything after it.
As another example, when the code file is a source file written in C++, the opening comment marker "/*" and the closing comment marker "*/" may be looked up in the code file according to the code rules for C++, and the comment content between "/*" and "*/" (including the "/*" and "*/" themselves) extracted according to the opening and closing markers.
The code detection device may extract preprocessing directives from the code file according to the preprocessing markers in the code rules. A preprocessing marker may include "#" and so on, and the preprocessing directives may include #define, #if, #pragma and so on. For example, when the code file is a source file written in C++, the preprocessing marker "#" may be looked up in the code file according to the code rules for C++, and the preprocessing directive extracted from the line on which "#" appears; the directive includes "#" and everything after it.
After obtaining the redundant characters, comment content and preprocessing directives, the code detection device may set them as the invalid code content, thereby extracting the invalid code content from the code file according to the code rules.
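As a rough illustration of steps (a) to (d), the following Python sketch removes comments, preprocessing directives and redundant whitespace from C-like source text with regular expressions. It is a simplification under stated assumptions, not the embodiment's preprocessor: the function name strip_invalid_content is hypothetical, comment markers inside string literals are not recognized, and directives continued over several lines with a backslash are not handled.

```python
import re


def strip_invalid_content(source: str) -> str:
    """Return C-like source with invalid code content filtered out."""
    # Block comments: from "/*" to the nearest "*/", markers included.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    # Line comments: "//" and everything after it on that line.
    source = re.sub(r"//[^\n]*", "", source)
    # Preprocessing directives: lines whose first non-blank character is "#".
    source = re.sub(r"^[ \t]*#[^\n]*", "", source, flags=re.MULTILINE)
    # Redundant characters: collapse runs of spaces/tabs, drop blank lines.
    lines = (re.sub(r"[ \t]+", " ", line).strip()
             for line in source.splitlines())
    return "\n".join(line for line in lines if line)
```

A production preprocessor would normally scan character by character so that string and character literals are left untouched; the regular expressions above trade that robustness for brevity.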
In step S102, the global symbol table corresponding to the code file to be detected is built.
After obtaining the code file to be detected, the code detection device may build the corresponding global symbol table. The global symbol table may be a data structure holding the symbols contained in each subcode file of the code file to be detected, such as classes and their related information, functions and their related information, and variables and their related information.
In some embodiments, the step of building the global symbol table corresponding to the code file to be detected may include:
(1) obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table set.
(2) building, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table set.
(3) building the global symbol table according to the local symbol table set.
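Step (3) can be pictured as a merge of per-file tables. The Python sketch below assumes, purely for illustration, that each local symbol table is a dictionary with optional "classes", "functions" and "variables" sections; this layout and the name build_global_symbol_table are hypothetical and not prescribed by the embodiment.

```python
def build_global_symbol_table(local_tables):
    """Merge local symbol tables (one per subcode file) into a global one.

    local_tables maps a subcode file name to a dict such as
    {"functions": {"main": {"line": 3}}, "variables": {...}}.
    """
    global_table = {"classes": {}, "functions": {}, "variables": {}}
    for file_name, table in local_tables.items():
        for kind in global_table:
            for symbol, info in table.get(kind, {}).items():
                # Keep every declaration and remember which file it came from,
                # so later checks can see duplicates across subcode files.
                entry = {"file": file_name, **info}
                global_table[kind].setdefault(symbol, []).append(entry)
    return global_table
```

Storing a list of declarations per symbol, rather than a single entry, is a design choice of this sketch: it lets a later check notice, for example, that a function name is declared in two subcode files.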
Specifically, the code detection device may first obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, giving a lexical unit sequence table set; for example, the lexical unit sequence table of each subcode file may be obtained by lexical analysis. A lexical unit sequence table may include multiple lexical units (a lexical unit may also be called a lexeme or a Token); for example, "if" or "for" is one lexical unit. After lexical analysis, a subcode file yields the set of all its lexical units, which is the lexical unit sequence table of that subcode file (also called a TokenList). A section of code in a subcode file corresponds to one lexical unit section (that is, a Token section, also called a TokenSection), which is a sequence of one or more Tokens.
It should be noted that, to obtain the lexical unit sequence tables more efficiently, the code detection device may start multiple threads and have the threads obtain the lexical unit sequence tables of the subcode files in parallel, so that multiple lexical unit sequence tables are obtained quickly. The code detection device may of course also obtain the lexical unit sequence tables serially; the specific content is not limited here.
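One way to picture the parallel variant is a thread pool that tokenizes each subcode file independently. The sketch below uses Python's standard concurrent.futures module; the tokenizer, the function names and the worker count are all assumptions made for illustration.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tokenizer: identifiers, integers, a few two-character
# operators, then any other single non-blank character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\+\+|--|[<>=!]=|\S")


def lex(source):
    """Lexical unit sequence of one subcode file, as a list of strings."""
    return TOKEN_RE.findall(source)


def lex_all(files, max_workers=4):
    """Tokenize every subcode file, one task per file, on a thread pool."""
    names = list(files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        sequences = list(pool.map(lex, (files[name] for name in names)))
    return dict(zip(names, sequences))
```

In CPython, threads mainly help when the per-file work releases the global interpreter lock (for example, file reading); for CPU-bound tokenizing a process pool may be a better fit. The structure of one task per subcode file is the same either way.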
In some embodiments, the step of obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain the lexical unit sequence table set, may include:
(1) performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file.
(2) standardizing each subcode file in the code file to be detected according to its lexical unit sequence table, to obtain a standard code file set composed of the standard subcode files.
(3) obtaining the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, to obtain the lexical unit sequence table set.
Specifically, in obtaining the lexical unit sequence table set, the code detection device may first perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file. These lexical unit sequence tables may form the lexical unit sequence table set, which may include the lexical unit sequence tables of one or more subcode files.
Here, lexical analysis is the process of converting the character sequence in a subcode file into a lexical unit sequence. Lexical analysis mainly reads the character stream output by preprocessing, groups the characters into lexemes, and generates and outputs a lexical unit sequence in which each lexical unit corresponds to one lexeme. The whole lexical unit sequence is the lexical unit sequence table, which is the basic data structure that subsequent processing and the upper-layer check items use to traverse the code.
For example, performing lexical analysis on the code segment "for (int index = 0; index < 42; ++index)" yields a lexical unit sequence composed of the 14 lexical units "for", "(", "int", "index", "=", "0", ";", "index", "<", "42", ";", "++", "index" and ")".
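The 14-unit example can be reproduced with a small regular-expression tokenizer. This is a minimal sketch that covers only the token shapes appearing in the example (identifiers, integers, a few two-character operators and single symbols); the function name tokenize is hypothetical.

```python
import re

# Order matters: two-character operators are tried before single symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\+\+|--|[<>=!]=|\S")


def tokenize(code):
    """Split a code segment into its lexical units, skipping whitespace."""
    return TOKEN_RE.findall(code)


tokens = tokenize("for (int index = 0; index < 42; ++index)")
print(len(tokens), tokens)
```

A full C++ lexer would also need string and character literals, floating-point numbers and the complete operator set, which this sketch leaves out.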
In some embodiments, the step of performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file, may include:
(a) obtaining the string value of each lexical unit in each subcode file in the code file to be detected.
(b) obtaining the attribute information associated with each lexical unit.
(c) generating a doubly linked list according to the string value and attribute information of each lexical unit, to obtain the lexical unit sequence table set composed of the lexical unit sequence tables corresponding to the subcode files.
In obtaining the lexical unit sequence table, the code detection device may obtain the string value of each lexical unit in each subcode file in the code file to be detected. For example, the lexical unit "for" consists of the three characters "f", "o" and "r", and the lexical unit "int" consists of the three characters "i", "n" and "t".
The lexical unit sequence table is the most basic structure on which code detection is carried out. In addition to the string value of each lexical unit, it may also include attribute information associated with the lexical unit, so the attribute information associated with each lexical unit may be obtained. The attribute information may include a pointer to the next lexical unit, a pointer to the previous lexical unit, the type of the lexical unit, the line number of the lexical unit, and so on.
After the string value and attribute information of each lexical unit in a subcode file are obtained, a doubly linked list may be generated from them to obtain the lexical unit sequence table corresponding to each subcode file; these lexical unit sequence tables may form the lexical unit sequence table set. A lexical unit sequence table is in essence a doubly linked list that maintains all the lexical units in the table.
For example, as shown in Fig. 3, the lexical unit sequence table of the code segment "if(i>0)" can be represented as the doubly linked list shown in Fig. 3. An arrow may represent the pointer from the current lexical unit to its next lexical unit, for example, the pointer from the lexical unit "i" to its next lexical unit ">"; or an arrow may represent the pointer from the current lexical unit to its previous lexical unit, for example, the pointer from the lexical unit "(" to its previous lexical unit "if"; or an arrow may represent the pointer from the current lexical unit to the lexical unit paired with it, for example, the pointer from the lexical unit "(" to its paired lexical unit ")"; and so on.
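The structure described for Fig. 3 can be sketched in Python as a doubly linked list whose nodes also carry a pairing pointer. The class and field names (Token, prev, next, link) are hypothetical, and the bracket matching uses a simple stack, which is an assumption about how the pairing pointers could be filled in.

```python
class Token:
    """One lexical unit: string value plus next/previous/paired pointers."""

    def __init__(self, text):
        self.text = text
        self.prev = None   # previous lexical unit
        self.next = None   # next lexical unit
        self.link = None   # paired lexical unit, e.g. "(" <-> ")"


def build_token_list(texts):
    """Link the lexical units into a doubly linked list; return the head."""
    closers = {")": "(", "]": "[", "}": "{"}
    head = tail = None
    open_stack = []
    for text in texts:
        tok = Token(text)
        if tail is None:
            head = tok
        else:
            tail.next = tok
            tok.prev = tail
        tail = tok
        if text in closers.values():
            open_stack.append(tok)
        elif text in closers and open_stack and open_stack[-1].text == closers[text]:
            opener = open_stack.pop()
            opener.link = tok
            tok.link = opener
    return head


head = build_token_list(["if", "(", "i", ">", "0", ")"])
```

Starting from head, following next visits "if", "(", "i", ">", "0", ")" in order, while the link pointer of "(" jumps directly to its matching ")".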
Optionally, the step of obtaining the attribute information associated with each lexical unit may include:
obtaining, for each lexical unit, the pointers that point to various kinds of information in each subcode file in the code file to be detected, to obtain pointer information; obtaining the characteristic information of each lexical unit; and setting the pointer information and the characteristic information of each lexical unit as the attribute information associated with that lexical unit.
Specifically, the code detection device may obtain, for each lexical unit, the pointers that point to various kinds of information in each subcode file in the code file to be detected, where the various kinds of information may include the next lexical unit of the current lexical unit, the lexical unit paired with the current lexical unit, the data flow structure of the lexical unit, and so on.
For example, the pointer for the next lexical unit for being directed toward current lexical unit can be obtained, be directed toward current lexical unit A upper lexical unit pointer, be directed toward with current lexical unit pairing lexical unit pointer (for example, for left bracket For, that is, be directed toward the pointer of right parenthesis), be directed toward lexical unit symbol table pointer (for example, what variable was directed toward is global Variable object in symbol table or local symbol table, what function was directed toward is function in global symbol table or local symbol table Object etc.), the syntax tree structure pointer (the abstract syntax tree construction that can be used for safeguarding lexical unit) and morphology of lexical unit Data flow architecture pointer of unit etc., these are pointer information.
At this time, it is also necessary to obtain the characteristic information of each lexical unit, wherein characteristic information may include lexical unit institute The type etc. of line number and lexical unit in code file, the type of lexical unit may include number, character string, variable, The types such as function and keyword, for example, " 1 " and " 2 " etc. can be numeric type, " main " can be type function, " index " and " i " etc. can be types of variables, etc..The characteristic information of these pointer informations and lexical unit be and morphology The associated attribute information of unit.
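The doubly linked lexical unit sequence table described above can be sketched as follows. This is a minimal illustration in Python, not the patented implementation: the class name `Token`, the field names, the regular expression, and the coarse type labels ("keyword", "number", "string", "identifier", "symbol") are assumptions made for the example, and the symbol table, syntax tree, and data flow pointers are omitted.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

KEYWORDS = {"if", "else", "for", "while", "return"}   # assumed, incomplete
PAIRS = {"(": ")", "{": "}", "[": "]"}
TOKEN_RE = re.compile(r'"[^"\n]*"|\d+|[A-Za-z_]\w*|\S')


@dataclass(eq=False)
class Token:
    text: str                      # string value of the lexical unit
    line: int                      # characteristic information: line number
    kind: str                      # characteristic information: coarse type
    prev: Optional["Token"] = field(default=None, repr=False)
    next: Optional["Token"] = field(default=None, repr=False)
    link: Optional["Token"] = field(default=None, repr=False)  # paired unit


def classify(text: str) -> str:
    if text in KEYWORDS:
        return "keyword"
    if text[0].isdigit():
        return "number"
    if text[0] == '"':
        return "string"
    if text[0].isalpha() or text[0] == "_":
        return "identifier"        # variable or function, not distinguished here
    return "symbol"


def tokenize(source: str) -> Optional[Token]:
    """Build the doubly linked list and return its first lexical unit."""
    head = tail = None
    open_stack = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for text in TOKEN_RE.findall(line):
            tok = Token(text, line_no, classify(text))
            if tail is None:
                head = tok
            else:
                tail.next, tok.prev = tok, tail
            tail = tok
            if text in PAIRS:
                open_stack.append(tok)
            elif open_stack and PAIRS[open_stack[-1].text] == text:
                opener = open_stack.pop()
                opener.link, tok.link = tok, opener   # pair "(" with ")"
    return head


head = tokenize("if(i>0)")
left_paren = head.next
```

Here `left_paren.prev` is the lexical unit "if" and `left_paren.link` is the matching ")", mirroring the three kinds of arrows described for Fig. 3.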
In some embodiments, the step of standardizing each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain a standard code file set composed of the standard subcode files, may include:
(a) Obtain a standard code format according to the lexical unit sequence table corresponding to each subcode file and the code standard logical format.
(b) Search each subcode file in the code file to be detected for a target code format that does not match the standard code format.
(c) Modify the target code format according to the standard code format, to obtain the standard code file set composed of the standard subcode files.
The code of different projects may be written by different developers whose coding styles differ, so code in practice is highly varied. Therefore, to improve the efficiency of building the global symbol table and the accuracy of code detection, the code file can be standardized after the lexical unit sequence table is obtained; for example, the code style can be unified through a number of simplification steps. During standardization, no simplification step may change the code logic; only logically equivalent replacements are made.
Specifically, after obtaining the lexical unit sequence table corresponding to each subcode file, the code detecting apparatus can obtain the standard code format according to that lexical unit sequence table and the code standard logical format. For example, the logical format of a function implementation may be that it starts with a left brace and ends with a right brace. When the code file is a source file written in C++, the code standard logical format may be a logical format for C++; when the code file is a source file written in Java, the code standard logical format may be a logical format for Java.
Then, each subcode file in the code file to be detected is searched for a target code format that does not match the standard code format, and the target code format is modified according to the standard code format to obtain each standard subcode file; for example, the target code format can be replaced with the standard code format. The standard subcode files form the standard code file set. In this way the code file undergoes a logically equivalent replacement that normalizes its code format.
For example, take the standardization of conditional expressions. Before standardization, the statement following a conditional expression in some subcode file omits the braces, and the subcode file may be as follows:
The code detecting apparatus can determine, from the lexical unit sequence table corresponding to the subcode file, the position of the conditional expression if(i>0), and determine from the code standard logical format that the standard code format requires braces after a conditional expression. It can then find in the subcode file a target code format that does not match the standard code format, namely that no braces follow the conditional expression if(i>0), and modify the target code format according to the standard code format by adding braces after if(i>0). The subcode file is thereby standardized into a standard subcode file, which may be as follows:
After the standard code file set is obtained, the code detecting apparatus can obtain the lexical unit sequence table corresponding to each standard subcode file in the set. For example, the characters added by standardization (such as braces) can be inserted into the lexical unit sequence table of the subcode file before standardization, or the characters deleted by standardization can be removed from it, to obtain the lexical unit sequence table corresponding to the standard subcode file. The lexical unit sequence tables corresponding to the standard subcode files form the lexical unit sequence table set.
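The brace-insertion standardization described above can be sketched as a transformation on a flat list of lexical unit strings. This is an illustrative Python sketch under simplifying assumptions: the function name `add_braces` is hypothetical, the list stands in for the doubly linked sequence table, and the body of the `if` is assumed to be a single simple statement ending at the next ";" (a body such as a `for` loop, whose header itself contains ";", is not handled).

```python
from typing import List


def add_braces(tokens: List[str]) -> List[str]:
    """Wrap the single statement after `if (...)` in braces when they are missing."""
    out: List[str] = []
    i, n = 0, len(tokens)
    while i < n:
        out.append(tokens[i])
        if tokens[i] == "if" and i + 1 < n and tokens[i + 1] == "(":
            # Find the ")" that closes the condition, allowing nested parentheses.
            depth, j = 0, i + 1
            while j < n:
                if tokens[j] == "(":
                    depth += 1
                elif tokens[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            if j < n:
                out.extend(tokens[i + 1:j + 1])        # copy the condition
                k = j + 1
                if k < n and tokens[k] != "{":
                    end = k
                    while end < n and tokens[end] != ";":
                        end += 1
                    if end < n:
                        out.append("{")
                        out.extend(tokens[k:end + 1])  # the unbraced statement
                        out.append("}")
                        i = end + 1
                        continue
                i = j + 1
                continue
        i += 1
    return out


before = ["if", "(", "i", ">", "0", ")", "x", "=", "1", ";"]
after = add_braces(before)
```

Only "{" and "}" are inserted, so the change is a logically equivalent replacement, and a sequence that already has braces passes through unchanged.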
After the lexical unit sequence table set is obtained, the code detecting apparatus can build, from each lexical unit sequence table in the set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set. For example, the lexical unit sequence table can be traversed to extract the classes, functions, and variables of the subcode file together with their key features. From this information a class list, a function list, and a variable list can be built, and the local symbol table corresponding to each subcode file can be built from that subcode file's class list, function list, and variable list. The local symbol tables corresponding to the subcode files form the local symbol table set.
In some embodiments, the step of building, from each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set, may include:
(1) Build, from each lexical unit sequence table in the lexical unit sequence table set, the abstract syntax tree corresponding to each subcode file in the code file to be detected.
(2) Build, from the abstract syntax tree, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set.
Specifically, the code detecting apparatus can build, from each lexical unit sequence table in the lexical unit sequence table set, the abstract syntax tree (AST) corresponding to each subcode file in the code file to be detected. An abstract syntax tree is a tree representation of the abstract syntactic structure of code. It may be a binary tree in which each non-leaf node represents an operator and the two child nodes of that non-leaf node represent the two operands of the operator. The abstract syntax tree structure captures the logical structure of an expression and the precedence relationships among its operators; this property can improve the accuracy of code scenario matching and the efficiency of implementing that matching.
It should be noted that the abstract syntax tree in the embodiments of the present invention does not establish logical relationships between code expressions. For example, it does not establish the logical relationship between the if statement and the else statement in an if-else block; it establishes an abstract syntax structure only for each single code expression and does not establish structural relationships between one expression and another. If structural relationships between expressions were established, then a single syntax error in the input code file would make the resulting global abstract syntax tree structure wrong and useless as a reference. The present invention therefore accepts incomplete code files, or code files that cannot be compiled, as input and builds an abstract syntax tree structure per expression. If some expression contains an error, only the abstract syntax tree structure of that part is wrong, and the abstract syntax tree structures of the other expressions are unaffected.
For example, consider the following code segment:
String::Format("demo:%d%s%d", Func(1,2), "AST", 1+2*3);
The structure of the abstract syntax tree finally built for this code segment may be as shown in Fig. 4. It includes parameter 1: "demo:%d%s%d", parameter 2: Func(1,2), parameter 3: "AST", and parameter 4: 1+2*3. For example, the non-leaf node "*" represents an operator, and its two child nodes "2" and "3" represent the two operands of that operator, giving 2*3; the two child nodes of the operator "+" are "1" and "*", giving 1+2*3.
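How operator precedence shapes the binary tree for parameter 4 can be sketched with a small precedence-climbing parser. This Python sketch is an illustration only: the patent does not state which parsing technique is used, and the names `Node`, `PRECEDENCE`, and `parse_expression` are assumptions. It handles just the four binary arithmetic operators on already separated lexical units.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Node:
    op: str                 # operator held by a non-leaf node
    left: "Tree"            # first operand
    right: "Tree"           # second operand


Tree = Union[str, Node]     # a leaf is the operand text itself
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_expression(tokens: List[str], pos: int = 0,
                     min_prec: int = 1) -> Tuple[Tree, int]:
    """Parse one expression starting at `pos`; return (tree, next position)."""
    left: Tree = tokens[pos]
    pos += 1
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # A higher minimum precedence on the right side makes operators
        # left-associative and lets "*" bind tighter than "+".
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = Node(op, left, right)
    return left, pos


tree, _ = parse_expression(["1", "+", "2", "*", "3"])
```

The root of `tree` is "+", whose children are the leaf "1" and the subtree for "*" with leaves "2" and "3", which is the shape described for Fig. 4. Because each call parses one expression in isolation, a malformed expression elsewhere in the file would not affect this tree.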
After the abstract syntax tree is obtained, the local symbol table corresponding to each subcode file in the code file to be detected can be built from it, and the local symbol tables corresponding to the subcode files form the local symbol table set.
In some embodiments, the step of building, from the abstract syntax tree, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set, may include:
(a) Obtain, from the abstract syntax tree, the class list, function list, and variable list corresponding to each subcode file in the code file to be detected.
(b) Build the local symbol table corresponding to each subcode file from that subcode file's class list, function list, and variable list, to obtain the local symbol table set.
Specifically, the code detecting apparatus can use the tree structure of the abstract syntax tree to quickly extract the classes, functions, and variables of a subcode file together with their key features. From this information it can build the class list, function list, and variable list corresponding to each subcode file in the code file to be detected, and from those lists it can build the local symbol table corresponding to each subcode file. The local symbol tables corresponding to the subcode files form the local symbol table set. The local symbol table may also be called a SymbolDatabase, which is the symbolization result object corresponding to a subcode file.
For example, for source files written in C++, after the lexical unit sequence table corresponding to each .cpp file (including the .h files expanded into it) is obtained, a corresponding local symbol table can be built from that lexical unit sequence table. Each local symbol table may contain the following three kinds of data: (1) a type list (Type List), which records all types in the code file, such as class, struct, or namespace, and may include each type name and the key features of each type; (2) a function list (Function List), which records all functions in the code file and may include each function name and the key features of each function (for example, the return value of the function); (3) a variable list (Variable List), which records all variables in the code file and may include each variable name and the key features of each variable.
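The three-list layout of a local symbol table can be sketched as a small data structure. This Python sketch is illustrative: the class names `Symbol` and `LocalSymbolTable`, the `defined` flag, and the `features` dictionary are assumptions, and the sample entries merely reuse the symbol names that the description associates with demo1.cpp.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Symbol:
    name: str                                   # e.g. "CDemo2::Func"
    kind: str                                   # "type", "function" or "variable"
    defined: bool = True                        # False when only declared or used
    features: Dict[str, str] = field(default_factory=dict)  # key features


@dataclass
class LocalSymbolTable:
    file_name: str
    types: List[Symbol] = field(default_factory=list)       # Type List
    functions: List[Symbol] = field(default_factory=list)   # Function List
    variables: List[Symbol] = field(default_factory=list)   # Variable List

    def add(self, symbol: Symbol) -> None:
        """File the symbol under the list that matches its kind."""
        target = {"type": self.types,
                  "function": self.functions,
                  "variable": self.variables}[symbol.kind]
        target.append(symbol)

    def undefined(self) -> List[str]:
        """Names whose definition is missing from this subcode file."""
        everything = self.types + self.functions + self.variables
        return [s.name for s in everything if not s.defined]


table = LocalSymbolTable("demo1.cpp")
table.add(Symbol("CDemo1", "type"))
table.add(Symbol("CDemo2", "type"))
table.add(Symbol("CDemo1::Func", "function"))
table.add(Symbol("CDemo2::Func", "function", defined=False))
table.add(Symbol("global_var1", "variable"))
```

Calling `table.undefined()` on this sample lists `CDemo2::Func`, the kind of missing symbol that a single-file table cannot resolve on its own.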
As an illustration, the demo.cpp code file may include subcode files such as demo1.cpp, demo.h, and demo2.cpp. The code content of the demo1.cpp subcode file may be as follows:
The demo1.cpp subcode file may include the demo.h subcode file, whose code content may be as follows:
When the demo1.cpp subcode file is scanned, a local symbol table similar to the following can be obtained:
The local symbol table of the demo1.cpp subcode file shows that a symbol is missing from demo1.cpp: the definition of CDemo2::Func is not found. In fact, CDemo2::Func is defined in the demo2.cpp subcode file, whose code content may be as follows:
When the demo2.cpp subcode file is scanned, a local symbol table similar to the following can be obtained:
The local symbol table of the demo2.cpp subcode file is simpler than that of the demo1.cpp subcode file. Its most important information is the definition of the CDemo2::Func function, which is exactly what is missing from the local symbol table of demo1.cpp. This is the main problem with processing each subcode file in isolation: the ability to look up symbols across files is missing.
Therefore, after the local symbol table set formed by the local symbol tables of the subcode files is obtained, a global symbol table can be built from the local symbol table set. Symbols such as classes, functions, or variables (that is, type parameters) that are missing from some subcode file can then be found in the global symbol table, which provides the ability to look up symbols across files.
In some embodiments, the step of building the global symbol table from the local symbol table set may include:
(1) Merge the local symbol tables in the local symbol table set to obtain a merged symbol table.
(2) For identical symbol parameters in the merged symbol table, keep one of them and delete the others, to obtain the global symbol table.
Specifically, the code detecting apparatus can merge the local symbol tables in the local symbol table set to obtain a merged symbol table. The merged symbol table may contain identical symbol parameters, where a symbol parameter may be a class, a function, a variable, and so on. For identical symbol parameters in the merged symbol table, one can be kept and the others deleted, to obtain the global symbol table. Alternatively, when the merged symbol table contains two identical symbol parameters of which one is undefined and the other is defined, the defined symbol parameter can be kept and the undefined one deleted during merging.
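The merge rule above (keep one copy of each identical symbol, preferring a defined one over an undefined one) can be sketched as follows. This is a minimal Python illustration: representing each local symbol table as a dictionary from symbol name to an entry with `defined` and `file` keys is an assumption, as is the function name `build_global_symbol_table`.

```python
from typing import Dict, Iterable

Entry = Dict[str, object]          # assumed keys: "defined" (bool), "file" (str)
SymbolTable = Dict[str, Entry]


def build_global_symbol_table(local_tables: Iterable[SymbolTable]) -> SymbolTable:
    """Merge local tables, keeping one entry per symbol name."""
    merged: SymbolTable = {}
    for table in local_tables:
        for name, entry in table.items():
            kept = merged.get(name)
            # Keep the first entry seen, unless it is undefined and the
            # new one carries the definition.
            if kept is None or (not kept["defined"] and entry["defined"]):
                merged[name] = entry
    return merged


demo1 = {
    "CObject": {"defined": True, "file": "demo1.cpp"},
    "CDemo1::Func": {"defined": True, "file": "demo1.cpp"},
    "CDemo2::Func": {"defined": False, "file": "demo1.cpp"},
    "global_var1": {"defined": True, "file": "demo1.cpp"},
}
demo2 = {
    "CObject": {"defined": True, "file": "demo2.cpp"},
    "CDemo2::Func": {"defined": True, "file": "demo2.cpp"},
    "global_var2": {"defined": True, "file": "demo2.cpp"},
}
global_table = build_global_symbol_table([demo1, demo2])
```

In the result, `CObject` appears once and `CDemo2::Func` resolves to the defined entry from demo2.cpp, which is what makes the cross-file lookup possible.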
For example, local symbol tables are built separately for the demo1.cpp subcode file and the demo2.cpp subcode file, but the definition of the CDemo2::Func function is missing from demo1.cpp; that definition is in the local symbol table of demo2.cpp. Therefore, to solve the missing symbol problem, a global symbol table can be built from the local symbol tables. The key to building the global symbol table is handling the lookup of symbols such as classes, functions, and variables and the merge logic when symbols conflict.
For example, to illustrate the structure of the global symbol table visually, symbol structure trees are used. The symbol structure tree corresponding to the local symbol table of the demo1.cpp subcode file may be as shown in Fig. 5. As Fig. 5 shows, the local symbol table of demo1.cpp may include classes such as CObject, CDemo1, and CDemo2, as well as the Func function and the global_var1 variable; Func is defined in class CDemo1, while the definition of Func in class CDemo2 is missing.
The symbol structure tree corresponding to the local symbol table of the demo2.cpp subcode file may be as shown in Fig. 6. As Fig. 6 shows, the local symbol table of demo2.cpp may include classes such as CObject and CDemo2 and the global_var2 variable; Func is defined in class CDemo2.
The global symbol table built from the local symbol tables of demo1.cpp and demo2.cpp may be as shown in Fig. 7. As Fig. 7 shows, the global symbol table may include classes such as CObject, CDemo1, and CDemo2, the Func function, and variables such as global_var1 and global_var2; Func is defined in both class CDemo1 and class CDemo2. That is, after the symbol structure trees of demo1.cpp and demo2.cpp are merged, the definition of the CDemo2::Func function and the global_var2 variable from demo2.cpp have been merged into the symbol structure tree of demo1.cpp.
In step S103, the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected are obtained.
To detect the code, the code detecting apparatus also needs to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected.
In some embodiments, the step of obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected may include:
(1) When the lexical unit sequence table set and the local symbol table set are stored, obtain from the lexical unit sequence table set the lexical unit sequence table corresponding to each subcode file in the code file to be detected.
(2) Obtain from the local symbol table set the local symbol table corresponding to each subcode file in the code file to be detected.
Specifically, while obtaining the global symbol table as described above, the code detecting apparatus has to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected. Therefore, after obtaining the lexical unit sequence table set and the local symbol table set, the code detecting apparatus can store them on a local hard disk, or upload them to a server that stores them, and so on.
Then, when obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file, the code detecting apparatus can check whether the local hard disk or the server stores the lexical unit sequence table set; if so, the lexical unit sequence table corresponding to each subcode file in the code file to be detected can be obtained directly from that set. It can likewise check whether the local hard disk or the server stores the local symbol table set; if so, the local symbol table corresponding to each subcode file in the code file to be detected can be obtained directly from that set.
In some embodiments, the step of obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected may include:
(1) When the lexical unit sequence table set and the local symbol table set are not stored, perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file.
(2) Build the abstract syntax tree corresponding to each subcode file from the lexical unit sequence table.
(3) Build the local symbol table corresponding to each subcode file from the abstract syntax tree.
The code detecting apparatus may also choose not to store the lexical unit sequence table set and the local symbol table set obtained while building the global symbol table. In that case, when obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file, the code detecting apparatus can check whether the local hard disk or the server stores the lexical unit sequence table set; if not, it needs to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file again.
Specifically, the code detecting apparatus can perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file. For example, it can obtain the string value of each lexical unit in each subcode file and the attribute information associated with each lexical unit, and generate a doubly linked list from the string values and attribute information of the lexical units, to obtain the lexical unit sequence table corresponding to each subcode file.
The code detecting apparatus can also standardize each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to that subcode file, to obtain each standard subcode file. For example, it can obtain the standard code format according to the lexical unit sequence table corresponding to each subcode file and the code standard logical format; search each subcode file in the code file to be detected for a target code format that does not match the standard code format; and modify the target code format according to the standard code format, to obtain the standard subcode file corresponding to each subcode file.
Then the lexical unit sequence table corresponding to each standard subcode file is obtained and set as the lexical unit sequence table corresponding to the respective subcode file. The abstract syntax tree corresponding to each subcode file can now be built from the lexical unit sequence table, and the local symbol table corresponding to each subcode file can be built from the abstract syntax tree. For example, the class list, function list, and variable list corresponding to each subcode file in the code file to be detected can be obtained from the abstract syntax tree, and the local symbol table corresponding to each subcode file can be built from those lists.
In step S104, the lexical unit sequence table is updated according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table.
In step S105, the detection result of the code file to be detected is determined according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
After obtaining the global symbol table and the local symbol table and lexical unit sequence table corresponding to each subcode file in the code file to be detected, the code detecting apparatus can update the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table. For example, the code detecting apparatus can traverse the lexical unit sequence table and look up and link the classes, functions, and variables in it. Based on the global symbol table built above and a context-sensitive symbol lookup algorithm, this provides symbol lookup across files or across modules, so that more accurate code detection results are obtained.
The purpose of looking up and linking symbols such as classes, functions, and variables is to associate the classes, functions, and variables encountered while traversing the lexical unit sequence table with the local symbol table or global symbol table in which they reside. The relevant information about a class, function, or variable is then available during traversal of the lexical unit sequence table, which can significantly improve the scanning efficiency of code check items.
In some embodiments, the steps of updating the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table, and determining the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table, may include:
(1) Obtain a subcode file from the code file to be detected as the current subcode file.
(2) Obtain the lexical unit sequence table and the local symbol table corresponding to the current subcode file.
(3) Update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table.
(4) Determine the detection result of the current subcode file according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
(5) Return to the step of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been detected, to obtain the detection result of the code file to be detected.
The code detecting apparatus can update the lexical unit sequence table by means of a context-sensitive symbol lookup algorithm. With this algorithm, the lexical unit sequence table of each subcode file in the code file to be detected is traversed to obtain the type parameters in it, such as classes, functions, and variables, and each type parameter is linked by pointer to the corresponding type parameter in the local symbol table or the global symbol table, thereby updating the lexical unit sequence table. Implementing the context-sensitive symbol lookup algorithm on top of the global symbol table obtained above is an important guarantee of the correctness of the code detection results.
Specifically, the code detecting apparatus can obtain a subcode file from the code file to be detected as the current subcode file. It can then obtain the lexical unit sequence table corresponding to the current subcode file from the lexical unit sequence table set and the local symbol table corresponding to the current subcode file from the local symbol table set. Alternatively, it can perform lexical analysis on the current subcode file to obtain the corresponding lexical unit sequence table and build the local symbol table corresponding to the current subcode file from that lexical unit sequence table. And so on.
After obtaining the lexical unit sequence table and the local symbol table of the current subcode file, the code detecting apparatus can update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table.
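The per-file loop, including the choice between reusing stored tables and rebuilding them, can be sketched as a small driver. This Python sketch shows control flow only: the function name `detect_all` and the three callables `analyze`, `link`, and `check` are hypothetical placeholders for lexical analysis plus symbol table construction, symbol linking, and check item scanning, and the lambdas in the usage are trivial stand-ins rather than real implementations.

```python
from typing import Callable, Dict, List, Tuple

Tokens = List[str]
SymbolTable = Dict[str, object]


def detect_all(
    sub_files: Dict[str, str],
    stored_tokens: Dict[str, Tokens],
    stored_symbols: Dict[str, SymbolTable],
    global_table: SymbolTable,
    analyze: Callable[[str], Tuple[Tokens, SymbolTable]],
    link: Callable[[Tokens, SymbolTable, SymbolTable], Tokens],
    check: Callable[[Tokens, SymbolTable, SymbolTable], str],
) -> Dict[str, str]:
    """Run steps (1) to (5) over every subcode file and collect the results."""
    results: Dict[str, str] = {}
    for name, source in sub_files.items():                    # step (1)
        if name in stored_tokens and name in stored_symbols:  # step (2), stored
            tokens, local_table = stored_tokens[name], stored_symbols[name]
        else:                                                 # step (2), rebuilt
            tokens, local_table = analyze(source)
        updated = link(tokens, local_table, global_table)     # step (3)
        results[name] = check(updated, local_table, global_table)  # step (4)
    return results                                            # after step (5)


results = detect_all(
    sub_files={"demo1.cpp": "demo . Func", "demo2.cpp": "int global_var2 ;"},
    stored_tokens={"demo1.cpp": ["demo", ".", "Func"]},
    stored_symbols={"demo1.cpp": {"CDemo1": "class"}},
    global_table={},
    analyze=lambda source: (source.split(), {}),
    link=lambda tokens, local, glob: list(tokens),
    check=lambda tokens, local, glob: f"{len(tokens)} lexical units scanned",
)
```

In this usage demo1.cpp is served from the stored tables while demo2.cpp, which has no stored entry, goes through `analyze`.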
In some embodiments, the step of updating the lexical unit sequence table according to the local symbol table and the global symbol table to obtain an updated lexical unit sequence table may include:
(a) Obtain a type parameter in the lexical unit sequence table, where the type parameter includes the type name of the type parameter and the qualified name of the type parameter.
(b) When the type name of the type parameter is not a system type name and the type name of the type parameter does not exist in the local symbol table, look up the type name of the type parameter in the global symbol table.
(c) When the type name of the type parameter exists in the global symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table.
(d) If the match succeeds, link by pointer the type parameter in the lexical unit sequence table to the type parameter in the global symbol table, to obtain the updated lexical unit sequence table.
Specifically, the code detecting apparatus can traverse the lexical unit sequence table of the current subcode file to obtain the type parameters in it; there may be one or more type parameters. A type parameter may be a class, a function, a variable, and so on, and includes the type name of the type parameter and the qualified name of the type parameter, where the qualified name may have one or more components. For example, for the type parameter A::B::C, the type name is C and the qualified name is A::B.
After obtaining a type parameter, the code detecting apparatus can extract its type name and then determine whether that type name is a system type name. System type names may include the type names stored in the data that ships with the code compilation system, for example main. When the type name of the type parameter is a system type name, the type name need not be looked up further in the local symbol table or the global symbol table; the lookup ends and the pointer from the type parameter to the symbol table is returned as null.
When the type name of the type parameter is not a system type name, it can further be determined whether the type name exists in the local symbol table. When it does not exist in the local symbol table, the type name can be looked up in the global symbol table, that is, it is determined whether the type name exists in the global symbol table. When the type name exists in the global symbol table, the qualified name of the type parameter in the lexical unit sequence table is matched against the qualified name of the type parameter in the global symbol table. If the match succeeds, the type parameter in the lexical unit sequence table is linked by pointer to the type parameter in the global symbol table; for example, the symbol table pointer of the type parameter in the lexical unit sequence table can be set to point to the type parameter in the global symbol table. The updated lexical unit sequence table is thereby obtained.
In some embodiments, after the step of obtaining the type parameter in the lexical unit sequence table, the code detection method further includes:
(e) When the type name of the type parameter is not a system type name and the type name of the type parameter exists in the local symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table.
(f) If the match succeeds, link by pointer the type parameter in the lexical unit sequence table to the type parameter in the local symbol table, to obtain the updated lexical unit sequence table.
When the type name of the type parameter is not a system type name, the code detecting apparatus can further determine whether the type name exists in the local symbol table. When it does, the qualified name of the type parameter in the lexical unit sequence table is matched against the qualified name of the type parameter in the local symbol table. If the match succeeds, the type parameter in the lexical unit sequence table is linked by pointer to the type parameter in the local symbol table; for example, the symbol table pointer of the type parameter in the lexical unit sequence table can be set to point to the type parameter in the local symbol table. The updated lexical unit sequence table is thereby obtained.
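Steps (a) to (f) can be sketched as a single lookup function. This Python sketch is illustrative and makes several assumptions: each symbol table is a dictionary from type name to a list of candidate entries, each entry stores its qualified name under a `qualifier` key, the set `SYSTEM_TYPE_NAMES` is a made-up sample, and a failed qualified-name match simply returns `None` (the description does not say what happens in that case).

```python
from typing import Dict, List, Optional, Tuple

Entry = Dict[str, str]                    # assumed keys: "qualifier", "file"
SymbolTable = Dict[str, List[Entry]]

SYSTEM_TYPE_NAMES = {"int", "char", "main"}   # assumed sample, not exhaustive


def split_name(full_name: str) -> Tuple[str, str]:
    """Split "A::B::C" into the qualified name "A::B" and the type name "C"."""
    parts = full_name.split("::")
    return "::".join(parts[:-1]), parts[-1]


def resolve(full_name: str, local_table: SymbolTable,
            global_table: SymbolTable) -> Optional[Entry]:
    """Return the symbol table entry to link to, or None for a null pointer."""
    qualifier, type_name = split_name(full_name)
    if type_name in SYSTEM_TYPE_NAMES:
        return None                       # system type name: stop the lookup
    # Prefer the local symbol table; fall back to the global one only when
    # the type name is absent locally.
    table = local_table if type_name in local_table else global_table
    for candidate in table.get(type_name, []):
        if candidate["qualifier"] == qualifier:
            return candidate              # qualified names match: link
    return None


local_table = {"CDemo1": [{"qualifier": "", "file": "demo1.cpp"}]}
global_table = {
    "CDemo1": [{"qualifier": "", "file": "demo1.cpp"}],
    "Func": [{"qualifier": "CDemo1", "file": "demo1.cpp"},
             {"qualifier": "CDemo2", "file": "demo2.cpp"}],
}
linked = resolve("CDemo2::Func", local_table, global_table)
```

In this usage `Func` is not in the local table, so the global table is searched, and the qualified name `CDemo2` selects the entry from demo2.cpp rather than the `CDemo1` one.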
After updating the lexical unit sequence table of the current subcode file, the code detecting apparatus can determine the detection result of the current subcode file according to the global symbol table, the local symbol table of the current subcode file, and the updated lexical unit sequence table. For example, code check item scanning can be performed: based on the constructed global symbol table and the symbol linking result, a code scan is carried out for each erroneous-code scenario. That is, the updated lexical unit sequence table is traversed to find the classes, functions, variables and so on that it contains; for each class, function or variable found, the local symbol table and global symbol table in which it resides are consulted, and the detection result of the current subcode file is extracted from the key features stored in the local or global symbol table. The code detecting apparatus can also output code error information in a certain format; the specific output format can be set flexibly according to actual needs and is not limited here.
As an illustration, as shown in Fig. 8, take the above demo1.cpp and demo2.cpp subcode files as an example. Lookup and association of classes, functions and variables based on the global symbol table can reveal problems that cannot be found from a single local symbol table. In the detection result, the demo.Func function call in the demo1.cpp subcode file is correctly associated with the CDemo2::Func function in the demo2.cpp subcode file. From the global symbol table it is known that, when type equals 1, the return value is the null pointer NULL; based on this key feature, an error can be reported for the dereference of the null pointer p on line 19 of the demo1.cpp subcode file.
After the lexical unit sequence table of the current subcode file has been updated and the detection result of the current subcode file has been determined, another subcode file can be obtained from the code file to be detected as the current subcode file. That is, the process returns to the step of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been detected and the detection result of the code file to be detected is obtained.
From the foregoing, the embodiment of the present invention can build a global symbol table corresponding to the code file to be detected, which may contain the global information of that code file, and can obtain the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected, where a local symbol table may contain the local information of its subcode file. The lexical unit sequence table is then updated according to the local symbol table and the global symbol table, and the detection result of the code file to be detected is determined according to the local symbol table, the global symbol table and the updated lexical unit sequence table. Because the lexical unit sequence table is updated using both the global symbol table and the local symbol table, and the result is derived from the updated lexical unit sequence table together with both symbol tables, the scheme performs a global detection of the whole code file to be detected rather than only a local detection of each subcode file in isolation, which improves the accuracy of code detection.
Based on the method described in the above embodiment, a further detailed example is given below.
In this embodiment the code detecting apparatus is a terminal and the code file is a source file written in C++. The code file of a C++ project is taken as input, and a first scan is performed on the C++ code: lexical analysis is performed on each subcode file in the C++ code file to obtain a lexical unit sequence table, the local symbol table corresponding to each subcode file is built from its lexical unit sequence table, and the global symbol table is then built from the local symbol tables of all subcode files, so that after the first scan the key features related to variables, functions and classes are cached in the global symbol table. A second scan is then performed on the C++ code: the lexical unit sequence table and local symbol table of each subcode file in the C++ code file are obtained, and the lexical unit sequence table of each subcode file is updated according to the global symbol table and the local symbol table, so as to establish links for variable uses, function calls and class uses. Specific code rule checks are then carried out based on the global symbol table, so that the code detection result can be output and the upper-layer static code check items are supplied with the results they need.
Referring to Fig. 9, Fig. 9 is the flow diagram of code detection method provided in an embodiment of the present invention.This method flow May include:
First, terminal obtains code file to be detected.
Here the code file to be detected is, for example, a source file written in C++. The C++ code file can be the code file of a C++ project and may include one or more subcode files; the description below assumes that the C++ code file includes multiple subcode files.
It should be noted that, when detecting code, the embodiment of the present invention does not depend on compilation and can perform static code detection, so detection is possible even when the code does not compile, without affecting the overall detection flow or result. In addition, C++ code files on systems such as Windows, Linux and Mac are supported, so cross-platform detection can be achieved.
Secondly, the terminal performs a first scan of the code file to be detected.
The first scan may include preprocessing, lexical analysis and standardization of the code file, among other steps.
Preprocessing:
When the terminal preprocesses the code file, it can extract invalid code content from the code file according to the C++ code rules and filter that content out to obtain the preprocessed code file. The invalid code content may include redundant characters, comment content, preprocessing directives and so on. For example, the terminal can extract redundant characters from the code file according to the C++ code rules; the redundant characters may include redundant spaces, redundant semicolons or redundant brackets. For instance, three lines of redundant spaces can be extracted from the code file according to the C++ code rules.
The terminal also extracts comment content from the code file according to the comment markers in the C++ code rules, where a comment marker can be a double slash "//", or "/*" and "*/", and so on. For example, the comment marker "//" can be searched for in the code file according to the C++ code rules, and the comment content, consisting of "//" and everything after it, can be extracted from the line on which "//" appears.
The terminal can also extract preprocessing directives from the code file according to the preprocessing markers in the C++ code rules; a preprocessing marker may include "#" and so on. For example, the preprocessing marker "#" can be searched for in the code file according to the C++ code rules, and the preprocessing directive, consisting of "#" and everything after it, can be extracted from the line on which "#" appears. The terminal can then delete the invalid code content, such as redundant characters, comment content and preprocessing directives, from the code file to obtain the preprocessed code file.
As an illustration, a subcode file in the C++ code file before preprocessing is:
The invalid code content, such as redundant characters, comment content and preprocessing directives, can then be extracted from the subcode file according to the C++ code rules and filtered out, and the preprocessed subcode file is:
It should be noted that after the invalid code content inside the function main(), such as "#ifdef TEST_PRE_CMD", "#else", "printf("TEST_PRE_CMD is not define."); //not define" and "#endif", has been filtered out, blank lines can be kept to mark the lines that the invalid code content occupied before preprocessing. Alternatively, no blank lines need be kept. This can be set flexibly according to actual needs and is not limited here.
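The filtering step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name stripInvalidContent is hypothetical, only "//" comments and lines beginning with "#" are handled, and a blank line is kept for every removed line so that line numbers stay stable.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not named in the source text): removes "//" comments
// and preprocessing directives, keeping an empty line in place of each
// removed line so that the remaining code keeps its original line numbers.
std::string stripInvalidContent(const std::string& source) {
    std::istringstream in(source);
    std::string line;
    std::string out;
    while (std::getline(in, line)) {
        // Drop everything from "//" to the end of the line.
        // Simplification: a "//" inside a string literal is not recognised.
        std::size_t comment = line.find("//");
        if (comment != std::string::npos) line.erase(comment);

        // A line whose first non-blank character is '#' is a directive.
        std::size_t first = line.find_first_not_of(" \t");
        if (first != std::string::npos && line[first] == '#') line.clear();

        // Trim the trailing whitespace left behind by the removals.
        std::size_t last = line.find_last_not_of(" \t");
        line.erase(last == std::string::npos ? 0 : last + 1);

        out += line;
        out += '\n';
    }
    return out;
}
```

With this sketch, the input "#ifdef X\nint a; // c\n#endif\n" becomes "\nint a;\n\n": both directive lines are replaced by empty lines and the trailing comment is removed.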
Lexical analysis:
After obtaining the preprocessed code file, the terminal can perform lexical analysis on it to obtain the lexical unit sequence table corresponding to each subcode file in the code file. For example, the terminal can obtain the string value of each lexical unit in each subcode file of the C++ code file, obtain the attribute information associated with each lexical unit, and generate a doubly linked list from the string values and attribute information, which gives the lexical unit sequence table corresponding to each subcode file. The attribute information may include the pointer information and the feature information of each lexical unit. The pointer information may include a pointer to the next lexical unit, a pointer to the previous lexical unit, a pointer to the lexical unit paired with the current one, and a pointer to the symbol table entry of the lexical unit. The feature information may include the line number of the lexical unit in the code file and the type of the lexical unit, where the type may be a number, string, variable, function, keyword and so on.
This lexical analysis is similar to the lexical analysis process described above and is not repeated here. For example, performing lexical analysis on the code segment for (int i=0;i<10;++i) yields the lexical units "for", "(", "int", "i", "=", "0", ";", "i", "<", "10", ";", "++", "i" and ")"; these lexical units then form the lexical unit sequence in the form of a doubly linked list.
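The tokenization of the for statement can be sketched as follows. This is a simplified illustration rather than the lexer of the embodiment: the names Token and tokenize and the list of two-character operators are assumptions, std::list stands in for the doubly linked list, and the pointers to the paired lexical unit and to the symbol table are omitted.

```cpp
#include <cassert>
#include <cctype>
#include <list>
#include <string>

// Illustrative record for one lexical unit; field names are assumptions.
struct Token {
    std::string text;  // string value of the lexical unit
    int line;          // line number of the lexical unit in the file
};

// Splits a code fragment into lexical units stored in a doubly linked list.
// Handles identifiers, numbers, a few two-character operators and
// single-character punctuation only.
std::list<Token> tokenize(const std::string& code) {
    std::list<Token> tokens;
    int line = 1;
    std::size_t i = 0;
    while (i < code.size()) {
        unsigned char c = static_cast<unsigned char>(code[i]);
        if (c == '\n') { ++line; ++i; continue; }
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalnum(c) || c == '_') {
            // Consume a whole identifier, keyword or number.
            while (i < code.size() &&
                   (std::isalnum(static_cast<unsigned char>(code[i])) ||
                    code[i] == '_')) {
                ++i;
            }
        } else {
            // Default to one character, widen to two for known operators.
            static const std::string twoChar[] = {"++", "--", "==",
                                                  "<=", ">=", "::"};
            i = start + 1;
            for (const std::string& op : twoChar) {
                if (code.compare(start, 2, op) == 0) { i = start + 2; break; }
            }
        }
        tokens.push_back({code.substr(start, i - start), line});
    }
    return tokens;
}
```

Calling tokenize("for (int i=0;i<10;++i)") produces the same 14 lexical units as listed in the text, starting with "for" and ending with ")".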
Standardization:
After obtaining the lexical unit sequences, the terminal can standardize each subcode file in the C++ code file to obtain the standard subcode files. For example, the terminal can derive the standard code format from the lexical unit sequence table of each subcode file and the C++ standard code logic format, search each subcode file of the code file to be detected for target code whose format does not match the standard code format, and modify that target code according to the standard code format to obtain the standard subcode file corresponding to each subcode file. In this way the code file undergoes a logically equivalent replacement that normalizes its code format.
For example, taking macro expansion as the standardization, a local code fragment before standardization may be as follows:
The code detecting apparatus can locate the macro #define according to the lexical unit sequence table and determine, according to the standard code logic format, that the standard code format requires the macro to be expanded. It can then search the local code for target code that does not match the standard code format, here the unexpanded macro, and modify it according to the standard code format, that is, expand the macro. The code is thereby standardized, and the resulting standard code may be as follows:
return(a>b?a:b);
After obtaining the standard subcode files, the terminal can obtain the lexical unit sequence table corresponding to each standard subcode file. For example, characters added by standardization can be inserted into the lexical unit sequence table of the subcode file as it was before standardization, or characters removed by standardization can be deleted from it, which gives the lexical unit sequence table corresponding to the standard subcode file.
Build local symbol table:
After obtaining the lexical unit sequence table corresponding to each standard subcode file in the C++ code file, the terminal can build the local symbol table corresponding to each standard subcode file. For example, the terminal can build an abstract syntax tree from the lexical unit sequence table of each standard subcode file and build the local symbol table corresponding to each subcode file from the abstract syntax tree. The construction of the abstract syntax tree is similar to that described above and is not repeated here.
From the abstract syntax tree the terminal can obtain the class list, function list and variable list corresponding to each subcode file. The class list may contain classes and class-related key features, the function list may contain functions and function-related key features, and the variable list may contain variables and variable-related key features. The local symbol table corresponding to each subcode file is then built from its class list, function list and variable list. The construction of the local symbol table is similar to that described above and is not repeated here.
Build global symbol table:
After obtaining the local symbol table corresponding to each subcode file, the terminal can build the global symbol table from these local symbol tables. For example, the terminal can merge all local symbol tables in the local symbol table set to obtain a merged symbol table, then keep one of each group of identical symbol parameters in the merged symbol table and delete the others, which gives the global symbol table. The construction of the global symbol table is similar to that described above and is not repeated here.
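The merge-and-deduplicate step can be sketched as follows. This is an illustration under assumptions: the names Symbol, SymbolTable and buildGlobalTable are hypothetical, symbols are keyed by a name string, and "keep one of the identical entries" is implemented as "keep the first one encountered", which the text does not specify.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Illustrative symbol record; field names are assumptions.
struct Symbol {
    std::string kind;        // "class", "function" or "variable"
    std::string keyFeature;  // cached property used by later check items
};

// A symbol table keyed by symbol name, for example "CDemo2::Func".
using SymbolTable = std::map<std::string, Symbol>;

// Merges all local tables into one global table. When the same name occurs
// in several local tables, only the first occurrence is kept.
SymbolTable buildGlobalTable(const std::vector<SymbolTable>& locals) {
    SymbolTable global;
    for (const SymbolTable& local : locals) {
        for (const auto& entry : local) {
            // map::insert leaves an already present key untouched.
            global.insert(entry);
        }
    }
    return global;
}
```

Keying the table by name makes the duplicate removal implicit in the insertion; a real implementation would likely also need to decide which of two conflicting declarations carries the more useful key features.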
After obtaining the global symbol table, the terminal can perform a second scan of the C++ code file to obtain the lexical unit sequence table and local symbol table corresponding to each subcode file. For example, the terminal can obtain the lexical unit sequence table of each subcode file from the local hard disk or server on which the lexical unit sequence table set was stored in advance, and obtain the local symbol table of each subcode file from the local hard disk or server on which the local symbol table set was stored in advance.
Alternatively, the terminal can regenerate the lexical unit sequence table and local symbol table of each subcode file by the method described above. Specifically, it can perform lexical analysis on each subcode file in the C++ code file to obtain the corresponding lexical unit sequence table. For example, the string value and associated attribute information of each lexical unit in each subcode file can be obtained, and a doubly linked list can be generated from them, which gives the lexical unit sequence table of each subcode file. The terminal can also standardize each subcode file in the code file to be detected according to its lexical unit sequence table to obtain the standard subcode files. It then obtains the lexical unit sequence table of each standard subcode file and finally sets it as the lexical unit sequence table of the corresponding subcode file.
The terminal can then build the abstract syntax tree corresponding to each subcode file from the lexical unit sequence table and build the local symbol table corresponding to each subcode file from the abstract syntax tree. For example, the class list, function list and variable list of each subcode file in the code file to be detected can be obtained from the abstract syntax tree, and the local symbol table corresponding to each subcode file can be built from them.
Class, function and variable lookup and linking:
After obtaining the global symbol table corresponding to the C++ code file and the lexical unit sequence table and local symbol table corresponding to each subcode file in it, the terminal can use a context-sensitive symbol lookup algorithm to look up and link the classes, functions, variables and so on in the lexical unit sequence table, thereby updating the lexical unit sequence table. Compared with simple string matching, the context-sensitive symbol lookup algorithm is more accurate. In addition, the data structures are designed to be as small as possible so as to make full use of the cache and improve lookup efficiency.
For example, the concrete type represented by a lexical unit in the code file is affected by the current context. Take the following code segment as an example:
In this code segment the question is whether the base class A of class B refers to N1::A or to N2::A. When this situation is encountered during lookup, the nearest-match principle can be applied, that is, the nearest type is matched first; therefore the base class A of class B refers to N2::A.
Referring to Fig. 10, Figure 10 is the flow diagram of update lexical unit sequence table provided in an embodiment of the present invention.It should Method flow may include:
S201, obtain a type parameter in the lexical unit sequence table of the current subcode file, and split the type parameter by level into type qualified names and a type name.
When updating the lexical unit sequence table with the context-sensitive symbol lookup algorithm, the terminal first obtains a subcode file from the C++ code file as the current subcode file and obtains its lexical unit sequence table. It traverses the lexical unit sequence table to obtain the type parameters in it, such as classes, functions and variables, and splits each type parameter by level into type qualified names and a type name. For example, for the type parameter A::B::C, the type name is C and the qualified names are A and B.
The terminal can then push the type qualified names and the type name of the type parameter onto a stack S in order. For example, for the type parameter A::B::C, A, B and C can be pushed onto stack S in that order, so that the type name C is at the top of the stack.
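The level split and the push order can be sketched as follows. The function name splitQualifiedName is hypothetical, and std::stack is used as an assumed stand-in for the stack S of the text.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Splits a type parameter such as "A::B::C" on "::" and pushes the parts in
// left-to-right order, so the type name ends up on top of the stack with
// its qualified names beneath it.
std::stack<std::string> splitQualifiedName(const std::string& name) {
    std::stack<std::string> parts;
    std::size_t begin = 0;
    while (true) {
        std::size_t sep = name.find("::", begin);
        if (sep == std::string::npos) {
            parts.push(name.substr(begin));  // last part: the type name
            break;
        }
        parts.push(name.substr(begin, sep - begin));  // one qualified name
        begin = sep + 2;                              // skip past "::"
    }
    return parts;
}
```

For "A::B::C" the stack holds three elements and popping it yields C, then B, then A, which is the order in which the lookup steps consume the type name and the qualified names.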
S202, determine whether the type name is a system type name; if so, execute step S203; if not, execute step S204.
The terminal can take the top element off stack S, that is, the type name at the top of the stack (not including the qualified names of the type parameter), and then determine whether the type name of the type parameter is a system type name. System type names may include the type names stored in the built-in data of the code compilation system, for example, main.
S203, return a null pointer as the type parameter's pointer to the symbol table.
When the type name of the type parameter is a system type name, there is no need to go on looking for the type name in the local or global symbol table; the lookup flow ends and a null pointer is returned as the type parameter's pointer to the symbol table. The terminal can then update the lexical unit sequence table of the current subcode file accordingly, that is, set the symbol table pointer of the type parameter in the lexical unit sequence table to null.
S204, look up the type name of the type parameter in the local symbol table of the current subcode file.
S205, determine whether the type name exists in the local symbol table; if so, execute step S206; if not, execute step S209.
When the type name of the type parameter is not a system type name, the terminal can look up the type name in the local symbol table of the current subcode file and determine whether it exists there.
It should be noted that the terminal can first search within a preset range of the local symbol table in which the type parameter is located; when nothing is found, the search range is expanded step by step until all ranges of the local symbol table have been searched.
S206, match the qualified name in the lexical unit sequence table against the qualified name corresponding to the type name in the local symbol table.
S207, determine whether they match; if so, execute step S208; if not, execute step S209.
When the type name of the type parameter exists in the local symbol table, the terminal can take the qualified name at the top of stack S, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table, and determine whether the match succeeds.
It should be noted that when the type parameter has multiple qualified names, they can be matched one by one; only when all qualified names have been matched successfully does the qualified name of the type parameter in the lexical unit sequence table count as matching that of the type parameter in the local symbol table.
S208, link the type parameter in the lexical unit sequence table to the type parameter in the local symbol table.
If the qualified name of the type parameter in the lexical unit sequence table matches that of the type parameter in the local symbol table, the terminal can link the two, for example by a pointer link: the pointer by which the type parameter in the lexical unit sequence table refers to the symbol table can be set to point to the type parameter in the local symbol table, which yields the updated lexical unit sequence table.
S209, look up the type name of the type parameter in the global symbol table.
S210, determine whether the type name exists in the global symbol table; if so, execute step S211; if not, execute step S203.
When the type name of the type parameter does not exist in the local symbol table, the type name can be looked up in the global symbol table to determine whether it exists there.
It should be noted that the terminal can first search within a preset range of the global symbol table in which the type parameter is located; when nothing is found, the search range is expanded step by step until all ranges of the global symbol table have been searched.
S211, match the qualified name in the lexical unit sequence table against the qualified name corresponding to the type name in the global symbol table.
S212, determine whether they match; if so, execute step S213; if not, execute step S203.
When the type name of the type parameter exists in the global symbol table, the terminal can take the qualified name at the top of stack S, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table, and determine whether the match succeeds.
It should be noted that when the type parameter has multiple qualified names, they can be matched one by one; only when all qualified names have been matched successfully does the qualified name of the type parameter in the lexical unit sequence table count as matching that of the type parameter in the global symbol table.
S213, link the type parameter in the lexical unit sequence table to the type parameter in the global symbol table.
If the qualified name of the type parameter in the lexical unit sequence table matches that of the type parameter in the global symbol table, the two are linked, for example by a pointer link: the pointer by which the type parameter in the lexical unit sequence table refers to the symbol table can be set to point to the type parameter in the global symbol table, which yields the updated lexical unit sequence table.
It should be noted that when the lexical unit sequence table contains multiple type parameters, they can be traversed one by one until the lookup and linking of all type parameters is complete.
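The decision order of steps S202 to S213 can be sketched as follows. This is a simplified illustration, not the implementation of the embodiment: the names Symbol, SymbolTable, findIn and resolve are assumptions, each table holds at most one entry per type name, the qualified names are compared as a single string, and the step-by-step widening of the search range is omitted.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Illustrative symbol entry; field names are assumptions.
struct Symbol {
    std::string qualifier;  // for A::B::C this is "A::B"; empty if none
    std::string name;       // for A::B::C this is "C"
};

// Tables keyed by the unqualified type name (one entry per name here).
using SymbolTable = std::map<std::string, Symbol>;

// Returns the entry whose name and qualifier both match, or nullptr.
const Symbol* findIn(const SymbolTable& table, const std::string& qualifier,
                     const std::string& name) {
    auto it = table.find(name);
    if (it == table.end() || it->second.qualifier != qualifier) return nullptr;
    return &it->second;
}

// System type names resolve to a null pointer; otherwise the local table is
// tried first and the global table second. A miss in both also gives null.
const Symbol* resolve(const std::string& qualifier, const std::string& name,
                      const std::set<std::string>& systemTypes,
                      const SymbolTable& local, const SymbolTable& global) {
    if (systemTypes.count(name) != 0) return nullptr;  // S202 -> S203
    if (const Symbol* s = findIn(local, qualifier, name)) {
        return s;                                      // S204 to S208
    }
    return findIn(global, qualifier, name);            // S209 to S213
}
```

The returned pointer plays the role of the symbol table pointer stored in the lexical unit: non-null when a link was established, null when the type is a system type or could not be resolved.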
Code check item scanning:
After updating the lexical unit sequence table of a subcode file, the terminal can determine the detection result of the subcode file according to the global symbol table, the local symbol table of the subcode file, and the updated lexical unit sequence table. For example, based on the constructed global symbol table and the symbol linking result, the terminal can carry out a code scan for each erroneous-code scenario. That is, it traverses the updated lexical unit sequence table to find the classes, functions, variables and so on that it contains; for each class, function or variable found, it consults the local symbol table and global symbol table in which that item resides, and extracts the detection result of the subcode file from the key features stored in the local or global symbol table.
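One possible shape of such a check item is sketched below. All names (FunctionSymbol, Token, findRiskyCalls) and the boolean key feature mayReturnNull are assumptions made for illustration; the text only states that key features are cached in the symbol tables and read back during the scan.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative function entry in a symbol table.
struct FunctionSymbol {
    std::string qualifiedName;
    bool mayReturnNull;  // assumed key feature cached during the first scan
};

// Illustrative lexical unit after the lookup-and-link step.
struct Token {
    std::string text;
    int line;
    const FunctionSymbol* symbol;  // link into a symbol table, or nullptr
};

// Traverses the updated lexical unit sequence and reports the line of every
// lexical unit linked to a function that may return a null pointer. A real
// check item would additionally track whether the returned value is tested
// before it is dereferenced.
std::vector<int> findRiskyCalls(const std::vector<Token>& tokens) {
    std::vector<int> lines;
    for (const Token& t : tokens) {
        if (t.symbol != nullptr && t.symbol->mayReturnNull) {
            lines.push_back(t.line);
        }
    }
    return lines;
}
```

Because the check reads the key feature through the link rather than re-parsing the callee, it works the same way whether the linked function is defined in the current subcode file or in another one.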
Outputting the detection result:
After obtaining the detection result of each subcode file, the terminal can output code error information in a certain format; the specific output format can be set flexibly according to actual needs and is not limited here.
The embodiment of the present invention can perform static code detection on the code file (without compiling the code) according to the global symbol table, the local symbol table and the lexical unit sequence table. It takes full account of situations such as missing code files, missing type definitions and syntax errors, and can supply static code detection with accurate and efficient symbolization results such as the global symbol table and the local symbol tables, so that code detection has syntax-level, cross-function, semantic-level and, to some extent, logic analysis capability. This not only improves the accuracy and efficiency of code detection, but can also uncover potential problems in the code such as defects, performance problems and security problems. The final code detection result helps developers quickly locate hidden problems in the code and reduces later repair costs, so that developers can repair code efficiently and at low cost, and code quality can be improved.
To better implement the code detection method provided by the embodiment of the present invention, the embodiment of the present invention also provides an apparatus based on the above code detection method. The terms have the same meanings as in the above code detection method, and implementation details can be found in the description of the method embodiment.
Referring to Fig. 11, Fig. 11 is a structural schematic diagram of the code detecting apparatus provided by the embodiment of the present invention. The code detecting apparatus may include a first acquisition unit 301, a construction unit 302, a second acquisition unit 303, an updating unit 304, a determination unit 305 and so on.
The first acquisition unit 301 is configured to obtain the code file to be detected.
The code can be a source file written in a language supported by the developer's development tools, and can be a set of explicit rules that represent information in discrete form using characters, symbols or signal elements. The code can be a source file written in C++, C, Java or the like, or in another language; this is not limited here.
The first acquisition unit 301 first obtains the code file to be detected, which can be the code file of a software project and may include one or more subcode files.
In some embodiments, as shown in Fig. 14, the first acquisition unit 301 may include an extraction subunit 3011, a filtering subunit 3012 and so on, specifically as follows:
The extraction subunit 3011 is configured to obtain a code file and extract invalid code content from the code file according to code rules;
The filtering subunit 3012 is configured to filter out the invalid code content to obtain the code file to be detected.
Specifically, the extraction subunit 3011 can obtain the code file. For example, it can obtain the code file from a locally pre-stored code library, where the code file may have been generated in advance by the code detecting apparatus using a code development tool. Alternatively, it can send a code file acquisition request to a server and receive the code file that the server returns in response, where the code file may have been uploaded to the server by the code detecting apparatus or another terminal and stored by the server. It can be understood that the code file can also be obtained in other ways; this is not limited here.
After the code file has been obtained, the extraction subunit 3011 can preprocess it to filter out content that subsequent processing does not need. For example, the parts unrelated to valid code can be filtered out so that the subsequent lexical analysis receives a normalized character stream. The preprocessing may include extracting invalid code content from the code file according to code rules. When the code file is a source file written in C++, the code rules can be the writing rules of the C++ language; when the code file is a source file written in C, the code rules can be the writing rules of the C language; when the code file is a source file written in Java, the code rules can be the writing rules of the Java language; and so on. The invalid code content may include comment content, preprocessing directives and the like, and may also include other content; this is not limited here.
After the invalid code content has been obtained, the filtering subunit 3012 can filter it out, that is, delete the invalid code content from the code file to obtain the code file to be detected. When the code file includes multiple subcode files, the subcode files can be traversed, and invalid code content can be extracted from each subcode file and filtered out.
Optionally, the extraction subunit 3011 can specifically be configured to:
extract redundant characters from the code file according to the code rules;
extract comment content from the code file according to the comment markers in the code rules;
extract preprocessing directives from the code file according to the preprocessing markers in the code rules;
set the redundant characters, comment content and preprocessing directives as the invalid code content.
Specifically, extraction subelement 3011 can extract redundant characters from the code file according to the code rules. The redundant characters may include redundant spaces, redundant blank lines, redundant brackets and the like. For example, when the code file is a source file written in C++, three redundant blank lines can be extracted from the code file according to the code rules of the C++ language. As another example, when the code file is a source file written in Java, five redundant spaces can be extracted from a run of six consecutive spaces in the code file according to the code rules of the Java language; and so on.
Extraction subelement 3011 can extract comment content from the code file according to the comment markers in the code rules, where a comment marker can be the double slash "//", or "/*" and "*/", and so on. For example, when the code file is a source file written in C++, the comment marker "//" can be searched for in the code file according to the code rules of the C++ language, and the comment content can be extracted from the line in which "//" appears; this comment content includes "//" and everything after it on that line.
As another example, when the code file is a source file written in C++, the opening comment marker "/*" and the closing comment marker "*/" can be searched for in the code file according to the code rules of the C++ language, and the comment content between "/*" and "*/" (including "/*" and "*/" themselves) can be extracted.
Extraction subelement 3011 can extract preprocessing instructions from the code file according to the preprocessing markers in the code rules. A preprocessing marker may be "#" or the like, and the preprocessing instructions may include #define, #if, #pragma and the like. For example, when the code file is a source file written in C++, the preprocessing marker "#" can be searched for in the code file according to the code rules of the C++ language, and the preprocessing instruction can be extracted from the line in which "#" appears; this preprocessing instruction includes "#" and the content that follows it.
After obtaining the redundant characters, the comment content and the preprocessing instructions, extraction subelement 3011 can set them as the invalid code content, thereby extracting the invalid code content from the code file according to the code rules.
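The filtering steps described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name filter_invalid_content is hypothetical, and the sketch deliberately ignores comment markers that appear inside string literals.

```python
import re

def filter_invalid_content(source: str) -> str:
    """Remove comments, preprocessing lines and redundant whitespace.

    Hypothetical helper for illustration; it does not handle "//" or "/*"
    occurring inside string literals.
    """
    # Block comments: everything between /* and */, markers included.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    # Line comments: from // to the end of the line.
    source = re.sub(r"//[^\n]*", "", source)
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        # Preprocessing instructions: first non-blank character is '#'.
        if stripped.startswith("#"):
            continue
        # Redundant blank lines are dropped entirely.
        if not stripped:
            continue
        # Runs of spaces or tabs collapse to a single space.
        kept.append(re.sub(r"[ \t]+", " ", stripped))
    return "\n".join(kept)
```

Applied to a small C++ fragment, the function leaves only the statements themselves, which is the kind of normalized character stream the later lexical analysis expects.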
Construction unit 302 is used to build the global symbol table corresponding to the code file to be detected.
After the code file to be detected is obtained, construction unit 302 can build the corresponding global symbol table. The global symbol table can be a data structure that holds the symbols contained in each subcode file of the code file to be detected, such as classes, functions and variables, together with their related information.
In some embodiments, as shown in Fig. 12, construction unit 302 may include a first acquisition subelement 3021, a first construction subelement 3022, a second construction subelement 3023 and the like, specifically as follows:
First acquisition subelement 3021, used to obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, yielding a lexical unit sequence table set;
First construction subelement 3022, used to build, from each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, yielding a local symbol table set;
Second construction subelement 3023, used to build the global symbol table from the local symbol table set.
Specifically, first acquisition subelement 3021 can first obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, yielding the lexical unit sequence table set; for example, the lexical unit sequence table of each subcode file can be obtained through lexical analysis. A lexical unit sequence table may include multiple lexical units (a lexical unit may also be called a lexeme or a Token); for example, if and for are each one lexical unit. The set of all lexical units generated from a subcode file by lexical analysis is the lexical unit sequence table of that subcode file (which may be called a TokenList). A section of code in a subcode file corresponds to a lexical unit section (a Token section, which may also be called a TokenSection) comprising a sequence of one or more Tokens.
It should be noted that, to obtain the lexical unit sequence tables more efficiently, first acquisition subelement 3021 can call multiple threads and obtain the lexical unit sequence table of each subcode file in parallel, so that multiple lexical unit sequence tables are obtained quickly. Of course, first acquisition subelement 3021 can also obtain the lexical unit sequence tables serially; the specific manner is not limited here.
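As a rough sketch of the parallel acquisition described above, each subcode file can be handed to a worker thread that produces its token list. The names tokenize and tokenize_all and the simplified token pattern are assumptions made for illustration only. Note that in CPython, threads mainly help when the work involves file I/O; for purely CPU-bound tokenizing a process pool may be more appropriate.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Simplified token pattern (assumption): names, integers, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokenize(text):
    """Return the list of token strings for one subcode file."""
    return TOKEN_RE.findall(text)

def tokenize_all(sources, max_workers=4):
    """Tokenize every subcode file in parallel.

    sources maps a file name to its (already filtered) text; the result
    maps each file name to its token list.
    """
    names = list(sources)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        token_lists = list(pool.map(tokenize, (sources[n] for n in names)))
    return dict(zip(names, token_lists))
```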
In some embodiments, first acquisition subelement 3021 may include an analysis module, a processing module, an acquisition module and the like, specifically as follows:
Analysis module, used to perform lexical analysis on each subcode file in the code file to be detected, yielding the lexical unit sequence table corresponding to each subcode file;
Processing module, used to standardize each subcode file in the code file to be detected according to its lexical unit sequence table, yielding a standard code file set composed of the standard subcode files;
Acquisition module, used to obtain the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, yielding the lexical unit sequence table set.
Specifically, in the process of obtaining the lexical unit sequence table set, the analysis module can first perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file. These lexical unit sequence tables form the lexical unit sequence table set, which may include the lexical unit sequence tables of one or more subcode files.
Lexical analysis is the process of converting the character sequence in a subcode file into a lexical unit sequence. It reads the character stream output by the preprocessing, groups the characters into lexemes, and generates and outputs a lexical unit sequence in which each lexical unit corresponds to one lexeme. The entire lexical unit sequence is the lexical unit sequence table, which is the basic data structure used by subsequent processing and by the upper-layer check items when they traverse the code.
For example, performing lexical analysis on the code segment for (int index=0; index<42; ++index) yields a lexical unit sequence composed of 14 lexical units: "for", "(", "int", "index", "=", "0", ";", "index", "<", "42", ";", "++", "index" and ")".
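The 14-unit result for this code segment can be reproduced with a small regular-expression tokenizer. This is only a sketch: the pattern below is an assumption that covers identifiers, integers and a handful of two-character operators, not a full C++ lexer.

```python
import re

# Alternatives are tried left to right, so two-character operators
# such as "++" are matched before the single-character fallback.
TOKEN_RE = re.compile(r"""
    [A-Za-z_]\w*               # identifiers and keywords
  | \d+                        # integer literals
  | \+\+ | -- | <= | >= | == | !=   # a few two-character operators
  | \S                         # any other single non-blank character
""", re.VERBOSE)

def tokenize(code):
    """Split a code fragment into a list of lexical unit strings."""
    return TOKEN_RE.findall(code)

tokens = tokenize("for (int index=0; index<42; ++index)")
print(len(tokens), tokens)
```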
In some embodiments, the analysis module may include a first acquisition submodule, a second acquisition submodule, a generation submodule and the like, specifically as follows:
First acquisition submodule, used to obtain the string value of each lexical unit in each subcode file of the code file to be detected;
Second acquisition submodule, used to obtain the attribute information associated with each lexical unit;
Generation submodule, used to generate a doubly linked list from the string value and attribute information of each lexical unit, yielding the lexical unit sequence table set composed of the lexical unit sequence tables corresponding to the subcode files.
In the process of obtaining a lexical unit sequence table, the first acquisition submodule can obtain the string value of each lexical unit in each subcode file of the code file to be detected. For example, the string value of the lexical unit "for" consists of the three characters "f", "o" and "r", and the string value of the lexical unit "int" consists of the three characters "i", "n" and "t".
The lexical unit sequence table is the most basic unit for code detection. Besides the string value of each lexical unit, it can also include attribute information associated with the lexical unit. The second acquisition submodule can therefore obtain the attribute information associated with each lexical unit, where the attribute information may include a pointer to the next lexical unit, a pointer to the previous lexical unit, the type of the lexical unit, the line number of the lexical unit, and so on.
After the string value and attribute information of each lexical unit in a subcode file are obtained, the generation submodule can generate a doubly linked list from them to obtain the lexical unit sequence table corresponding to each subcode file; these lexical unit sequence tables form the lexical unit sequence table set. A lexical unit sequence table is essentially a doubly linked list that maintains all the lexical units in the table.
For example, the lexical unit sequence table of the code "if(i>0)" can be expressed as the doubly linked list shown in Fig. 3. An arrow can represent a pointer to the next lexical unit of the current lexical unit, for example the pointer from lexical unit "i" to its next lexical unit ">"; or a pointer to the previous lexical unit of the current lexical unit, for example the pointer from lexical unit "(" to its previous lexical unit "if"; or a pointer from the current lexical unit to the lexical unit paired with it, for example the pointer from lexical unit "(" to its matching lexical unit ")"; and so on.
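A doubly linked token list with next, previous and pairing pointers, as in the "if(i>0)" example, might be modelled as below. The class name Token, the field name link and the helper build_token_list are hypothetical names chosen for this sketch.

```python
class Token:
    """One lexical unit: string value plus a few attribute fields."""
    def __init__(self, text, line):
        self.text = text      # string value of the lexical unit
        self.line = line      # line number in the subcode file
        self.next = None      # pointer to the next lexical unit
        self.prev = None      # pointer to the previous lexical unit
        self.link = None      # pointer to the paired bracket, if any

def build_token_list(texts, line=1):
    """Chain token strings into a doubly linked list and pair up brackets."""
    pairs = {")": "(", "]": "[", "}": "{"}
    head = tail = None
    open_stack = []           # opening brackets still waiting for a partner
    for text in texts:
        tok = Token(text, line)
        if tail is None:
            head = tok
        else:
            tail.next = tok
            tok.prev = tail
        tail = tok
        if text in pairs.values():
            open_stack.append(tok)
        elif text in pairs and open_stack and open_stack[-1].text == pairs[text]:
            opener = open_stack.pop()
            opener.link = tok  # "(" points to its ")"
            tok.link = opener  # and ")" points back to "("
    return head

head = build_token_list(["if", "(", "i", ">", "0", ")"])
```

Here head.next is the "(" token, and following its link pointer jumps directly to the matching ")" without walking the tokens in between.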
Optionally, the second acquisition submodule can specifically be used to: obtain the pointers by which each lexical unit points to various kinds of information in each subcode file of the code file to be detected, yielding pointer information; obtain the characteristic information of each lexical unit; and set the pointer information and the characteristic information of each lexical unit as the attribute information associated with that lexical unit.
Specifically, the second acquisition submodule can obtain the pointers by which each lexical unit points to various kinds of information in each subcode file of the code file to be detected, where the various kinds of information may include the next lexical unit of the current lexical unit, the lexical unit paired with the current lexical unit, the data flow structure of the lexical unit, and so on.
For example, the following can be obtained: a pointer to the next lexical unit of the current lexical unit; a pointer to the previous lexical unit of the current lexical unit; a pointer to the lexical unit paired with the current lexical unit (for a left bracket, this is the pointer to the right bracket); a symbol table pointer of the lexical unit (for example, a variable points to a variable object in the global symbol table or a local symbol table, and a function points to a function object in the global symbol table or a local symbol table); a syntax tree structure pointer of the lexical unit (which can be used to maintain the abstract syntax tree structure of the lexical unit); a data flow structure pointer of the lexical unit; and so on. These constitute the pointer information.
The second acquisition submodule also needs to obtain the characteristic information of each lexical unit. The characteristic information may include the line number of the lexical unit in the code file, the type of the lexical unit, and so on. The type of a lexical unit may be number, string, variable, function, keyword and the like; for example, "1" and "2" can be of the number type, "main" can be of the function type, "index" and "i" can be of the variable type, and so on. The pointer information and the characteristic information of a lexical unit together form the attribute information associated with that lexical unit.
In some embodiments, the processing module can specifically be used to: obtain a standard code format according to the lexical unit sequence table corresponding to each subcode file and the standard logical format of the code; search each subcode file in the code file to be detected for an object code format that does not match the standard code format; and modify the object code format according to the standard code format, yielding the standard code file set composed of the standard subcode files.
The code of different projects may be written by different developers, whose coding styles may differ, so code is objectively diverse. Therefore, to build the global symbol table more efficiently and to improve the accuracy of code detection, the code file can be standardized after the lexical unit sequence tables are obtained; for example, the code style can be unified through a number of simplification steps. During standardization, no simplification step may change the code logic; only logically equivalent replacements are made.
Specifically, after obtaining the lexical unit sequence table corresponding to each subcode file, the processing module can obtain the standard code format according to these lexical unit sequence tables and the standard logical format of the code. For example, the logical format of a function implementation can be that it starts with a left brace and ends with a right brace. When the code file is a source file written in C++, the standard logical format can be the logical format of the C++ language; when the code file is a source file written in Java, the standard logical format can be the logical format of the Java language.
Then, the processing module searches each subcode file in the code file to be detected for an object code format that does not match the standard code format, and modifies the object code format according to the standard code format to obtain each standard subcode file; for example, the object code format can be replaced with the standard code format. The standard subcode files form the standard code file set. In this way the code file undergoes a logically equivalent replacement and its code format is normalized.
For example, taking the standardization of conditional expressions: before standardization, the statement following a conditional expression in some subcode file omits its braces, and the subcode file can specifically be as follows:
According to the lexical unit sequence table corresponding to the subcode file, the processing module can locate the conditional expression if (i>0), and determine from the standard logical format of the code that the standard code format requires braces after a conditional expression. It can then find the object code format in the subcode file that does not match the standard code format, namely that no braces follow the conditional expression if (i>0), and modify the object code format according to the standard code format, that is, add braces after the conditional expression if (i>0). The subcode file is thereby standardized into a standard subcode file, which can specifically be as follows:
After the standard code file set is obtained, the acquisition module can obtain the lexical unit sequence table corresponding to each standard subcode file in the set. For example, the added characters (such as braces) can be inserted into the lexical unit sequence table of the subcode file as it was before standardization, yielding the lexical unit sequence table of the standard subcode file; or the deleted characters can be removed from the lexical unit sequence table of the subcode file as it was before standardization, yielding the lexical unit sequence table of the standard subcode file. The lexical unit sequence tables of the standard subcode files form the lexical unit sequence table set.
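The brace-insertion step for conditional expressions can be sketched on a flat token list as follows. The function add_braces_to_if is a hypothetical helper; it assumes the brace-less body is a single simple statement ending at the next ";" and does not attempt to cover loops or other constructs.

```python
def add_braces_to_if(tokens):
    """Wrap the single statement after a brace-less `if (...)` in braces."""
    out = []
    i = 0
    while i < len(tokens):
        out.append(tokens[i])
        if tokens[i] == "if" and i + 1 < len(tokens) and tokens[i + 1] == "(":
            # Copy the parenthesised condition, tracking nesting depth.
            depth = 0
            j = i + 1
            while j < len(tokens):
                out.append(tokens[j])
                if tokens[j] == "(":
                    depth += 1
                elif tokens[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            i = j + 1
            if i < len(tokens) and tokens[i] != "{":
                # Brace-less body: copy up to and including the next ';'.
                out.append("{")
                while i < len(tokens):
                    out.append(tokens[i])
                    i += 1
                    if out[-1] == ";":
                        break
                out.append("}")
            continue
        i += 1
    return out

before = ["if", "(", "i", ">", "0", ")", "x", "=", "1", ";", "y", "=", "2", ";"]
after = add_braces_to_if(before)
```

Because only the two brace tokens are inserted, the rewritten sequence stays logically equivalent to the original, in line with the requirement that standardization must not change the code logic.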
After the lexical unit sequence table set is obtained as described above, first construction subelement 3022 can build, from each lexical unit sequence table in the set, the local symbol table corresponding to each subcode file in the code file to be detected, yielding the local symbol table set. For example, a lexical unit sequence table can be traversed to extract the classes, functions and variables of the corresponding subcode file, together with the key characteristics related to those classes, functions and variables. From this information a class list, a function list, a variable list and the like can be built, and the local symbol table corresponding to each subcode file can be built from its class list, function list and variable list. The local symbol tables of the subcode files form the local symbol table set.
In some embodiments, first construction subelement 3022 may include a first construction module, a second construction module and the like, specifically as follows:
First construction module, used to build, from each lexical unit sequence table in the lexical unit sequence table set, the abstract syntax tree corresponding to each subcode file in the code file to be detected;
Second construction module, used to build, from the abstract syntax trees, the local symbol table corresponding to each subcode file in the code file to be detected, yielding the local symbol table set.
Specifically, the first construction module can build, from each lexical unit sequence table in the lexical unit sequence table set, the abstract syntax tree (Abstract Syntax Tree, AST) corresponding to each subcode file in the code file to be detected. An abstract syntax tree is a tree representation of the abstract syntactic structure of the code. It can be a binary tree in which each non-leaf node represents an operator and the two child nodes of the non-leaf node represent the two operands of that operator. The abstract syntax tree structure captures the logical structure of an expression and the precedence relationships of its operators; this property can improve the accuracy of code scene matching and the efficiency of implementing it.
It should be noted that the abstract syntax tree in the embodiments of the present invention does not establish logical relationships between code expressions. For example, the logical relationship between the if statement and the else statement of an if-else statement block is not established; an abstract syntax structure is built only for each individual code expression, and no structural relationship between one expression and another is established. If structural relationships between expressions were established, then once the input code file contained a syntax error, the global abstract syntax tree structure built from it would be wrong and of no reference value. The present invention therefore supports code files that are incomplete or that cannot be compiled as input, and builds an abstract syntax tree structure for each single expression. If some expression contains an error, only the abstract syntax tree structure of that part is wrong, and the abstract syntax tree structures of other expressions are not affected.
For example, consider the following code:
String::Format("demo:%d%s%d", Func(1,2), "AST", 1+2*3);
The structure of the abstract syntax tree finally built for this code can be as shown in Fig. 4. It includes parameter 1: "demo:%d%s%d", parameter 2: Func(1,2), parameter 3: "AST", and parameter 4: 1+2*3. For example, the non-leaf node "*" represents an operator, and its two child nodes "2" and "3" represent the two operands of that operator, that is, 2*3; the two child nodes of the operator "+" are "1" and "*", which gives 1+2*3.
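The binary tree for the fourth parameter, 1+2*3, can be built with a small precedence-climbing parser. This is one common technique, used here as an assumption; the source does not state which parsing method is used. The names Node, parse_expression and to_prefix are illustrative, and parentheses are not handled.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    value: str                      # operator for inner nodes, operand for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None

# Higher number binds tighter, so "*" is grouped before "+".
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def parse_expression(tokens, min_prec=1):
    """Precedence climbing; consumes tokens from the front of the list."""
    left = Node(tokens.pop(0))      # an operand becomes a leaf node
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # The right operand may only absorb operators that bind tighter.
        right = parse_expression(tokens, PRECEDENCE[op] + 1)
        left = Node(op, left, right)
    return left

def to_prefix(node):
    """Render the tree in prefix form to make its shape visible."""
    if node.left is None:
        return node.value
    return f"({node.value} {to_prefix(node.left)} {to_prefix(node.right)})"

tree = parse_expression(re.findall(r"\d+|[-+*/]", "1+2*3"))
print(to_prefix(tree))
```

The root is "+", with children "1" and "*", and "*" in turn has children "2" and "3", matching the structure described for Fig. 4.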
After the abstract syntax trees are obtained, the second construction module can build from them the local symbol table corresponding to each subcode file in the code file to be detected; the local symbol tables of the subcode files form the local symbol table set.
In some embodiments, the second construction module can specifically be used to: obtain, from the abstract syntax trees, the class list, function list and variable list corresponding to each subcode file in the code file to be detected; and build the local symbol table corresponding to each subcode file from its class list, function list and variable list, yielding the local symbol table set.
Specifically, the second construction module can use the tree structure of the abstract syntax tree to quickly extract the classes, functions and variables of a subcode file, together with the key characteristics related to those classes, functions and variables. From this information the class list, function list, variable list and the like of each subcode file in the code file to be detected can be built, and the local symbol table corresponding to each subcode file can be built from its class list, function list and variable list. The local symbol tables of the subcode files form the local symbol table set. A local symbol table may also be called a SymbolDatabase; it is the symbolization result object of the corresponding subcode file.
For example, for source files written in C++, after the lexical unit sequence table of each .cpp file (including the .h files expanded into it) is obtained, a corresponding local symbol table can be built from that lexical unit sequence table. Each local symbol table can contain the following three types of data: (1) a type list (which may also be called the Type List), used to record all types in the code file, such as class, struct or namespace; it may include the name of each type and the key features of each type; (2) a function list (which may also be called the Function List), used to record all functions in the code file; it may include the name of each function and the key features of each function (for example, the return value of the function); (3) a variable list (which may also be called the Variable List), used to record all variables in the code file; it may include the name of each variable and the key features of each variable.
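A local symbol table holding the three lists described above might be laid out as in the following sketch. Only the name SymbolDatabase comes from the text; the Symbol class, its fields (kind, defined, features) and the add method are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Symbol:
    name: str                  # e.g. a qualified name such as "CDemo1::Func"
    kind: str                  # "type", "function" or "variable"
    defined: bool = True       # False when only a declaration was seen
    features: dict = field(default_factory=dict)   # key features, e.g. return type

@dataclass
class SymbolDatabase:
    """Local symbol table of one subcode file (hypothetical layout)."""
    file: str
    types: dict = field(default_factory=dict)       # Type List
    functions: dict = field(default_factory=dict)   # Function List
    variables: dict = field(default_factory=dict)   # Variable List

    def add(self, symbol):
        """File the symbol under the list that matches its kind."""
        table = {"type": self.types,
                 "function": self.functions,
                 "variable": self.variables}[symbol.kind]
        table[symbol.name] = symbol
        return symbol

db = SymbolDatabase("demo1.cpp")
db.add(Symbol("CDemo1", "type", features={"keyword": "class"}))
db.add(Symbol("CDemo1::Func", "function", features={"returns": "void"}))
db.add(Symbol("global_var1", "variable"))
```

Keying each list by name keeps lookups cheap when the upper-layer checks later resolve a token to its symbol.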
An illustration follows. For example, the demo.cpp code file may include subcode files such as demo1.cpp, demo.h and demo2.cpp, and the code content of the demo1.cpp subcode file can be as follows:
The demo1.cpp subcode file may include the demo.h subcode file, and the code content of the demo.h subcode file can be as follows:
When the demo1.cpp subcode file is scanned, a local symbol table similar to the following can be obtained:
From the local symbol table of the demo1.cpp subcode file it can be seen that a symbol is missing in demo1.cpp: the definition of CDemo2::Func is not found. In fact, CDemo2::Func is defined in the demo2.cpp subcode file, whose code content can be as follows:
When the demo2.cpp subcode file is scanned, a local symbol table similar to the following can be obtained:
The local symbol table of the demo2.cpp subcode file is simpler than that of the demo1.cpp subcode file. Its most important information is the definition of the CDemo2::Func function, which is exactly what is missing from the local symbol table of the demo1.cpp subcode file. This is a significant problem that arises when each subcode file is processed individually: the ability to look up symbols across files is missing.
Therefore, after the local symbol table set formed by the local symbol tables of the subcode files is obtained, second construction subelement 3023 can build the global symbol table from the local symbol table set, so that symbols such as classes, functions or variables that are missing in a given subcode file can be found in the global symbol table, which provides the ability to look up symbols across files.
In some embodiments, second construction subelement 3023 is specifically used to:
Merge the local symbol tables in the local symbol table set to obtain a merged symbol table;
Retain one of any identical symbols in the merged symbol table and delete the other copies of that symbol, yielding the global symbol table.
Specifically, second construction subelement 3023 can merge the local symbol tables in the local symbol table set to obtain a merged symbol table. The merged symbol table may contain identical symbols, where a symbol may be a class, a function, a variable and the like, so one of the identical symbols in the merged symbol table can be retained and the other copies deleted to obtain the global symbol table. Alternatively, when the merged symbol table contains two identical symbols of which one is undefined and the other is defined, the defined symbol can be retained and the undefined one deleted during merging.
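The merge rule just described (keep one copy of identical symbols, and prefer the defined copy over the undefined one) can be sketched as follows. The dictionary layout and the function name merge_symbol_tables are assumptions; the sample entries mirror the CDemo2::Func example from the text.

```python
def merge_symbol_tables(local_tables):
    """Merge local symbol tables into one global symbol table.

    Each local table maps a qualified symbol name to a record of the form
    {"kind": ..., "defined": bool, "file": ...} (hypothetical layout).
    """
    global_table = {}
    for table in local_tables:
        for name, symbol in table.items():
            existing = global_table.get(name)
            # Keep the first copy, unless it is undefined and the new
            # copy carries the definition.
            if existing is None or (not existing["defined"] and symbol["defined"]):
                global_table[name] = symbol
    return global_table

demo1 = {
    "CDemo1::Func": {"kind": "function", "defined": True, "file": "demo1.cpp"},
    "CDemo2::Func": {"kind": "function", "defined": False, "file": "demo1.cpp"},
    "global_var1": {"kind": "variable", "defined": True, "file": "demo1.cpp"},
}
demo2 = {
    "CDemo2::Func": {"kind": "function", "defined": True, "file": "demo2.cpp"},
    "global_var2": {"kind": "variable", "defined": True, "file": "demo2.cpp"},
}
merged = merge_symbol_tables([demo1, demo2])
```

After the merge, looking up CDemo2::Func returns the defined entry from demo2.cpp, which is the cross-file lookup that a single local symbol table cannot provide.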
For example, local symbol tables are built separately for the demo1.cpp and demo2.cpp subcode files, but the definition of the CDemo2::Func function is missing from the local symbol table of demo1.cpp; the definition of this function is in the local symbol table of the demo2.cpp subcode file. Therefore, to solve the missing-symbol problem, a global symbol table can be built from the local symbol tables. The key to building the global symbol table is handling the lookup of symbols such as classes, functions and variables, and the merging logic applied when symbols conflict.
For example, to illustrate the structure of the global symbol table visually, symbol structure trees are used. The symbol structure tree corresponding to the local symbol table of the demo1.cpp subcode file can be as shown in Fig. 5. As Fig. 5 shows, the local symbol table of the demo1.cpp subcode file may include classes such as CObject, CDemo1 and CDemo2, as well as the Func function, the global_var1 variable and so on, where Func is defined in the CDemo1 class and the definition of Func is missing in the CDemo2 class.
The symbol structure tree corresponding to the local symbol table of the demo2.cpp subcode file can be as shown in Fig. 6. As Fig. 6 shows, the local symbol table of the demo2.cpp subcode file may include classes such as CObject and CDemo2, the global_var2 variable and so on, where Func is defined in the CDemo2 class.
The global symbol table built from the local symbol tables of the demo1.cpp and demo2.cpp subcode files can be as shown in Fig. 7. As Fig. 7 shows, the global symbol table may include classes such as CObject, CDemo1 and CDemo2, the Func function, and variables such as global_var1 and global_var2, where Func is defined in both the CDemo1 class and the CDemo2 class. That is, after the symbol structure trees of the demo1.cpp and demo2.cpp subcode files are merged, the definition of the CDemo2::Func function and the global_var2 variable from the demo2.cpp subcode file have been merged into the symbol structure tree of the demo1.cpp subcode file.
Second acquisition unit 303 is used to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected.
To facilitate detection of the code, second acquisition unit 303 needs to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected.
In some embodiments, second acquisition unit 303 can specifically be used to:
When the lexical unit sequence table set and the local symbol table set have been stored, obtain from the lexical unit sequence table set the lexical unit sequence table corresponding to each subcode file in the code file to be detected;
Obtain from the local symbol table set the local symbol table corresponding to each subcode file in the code file to be detected.
Specifically, obtaining the global symbol table as described above requires obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected. Therefore, after obtaining the lexical unit sequence table set and the local symbol table set, the code detecting apparatus can store them on the local hard disk; or it can upload them to a server, which stores them; and so on.
In this case, when obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file, second acquisition unit 303 can judge whether the local hard disk, the server or the like stores the lexical unit sequence table set. If it does, the lexical unit sequence table corresponding to each subcode file in the code file to be detected can be obtained directly from the lexical unit sequence table set. It can likewise judge whether the local hard disk, the server or the like stores the local symbol table set. If it does, the local symbol table corresponding to each subcode file in the code file to be detected can be obtained directly from the local symbol table set.
In some embodiments, second acquisition unit 303 can specifically be used to:
When the lexical unit sequence table set and the local symbol table set have not been stored, perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file; build, from the lexical unit sequence table, the abstract syntax tree corresponding to each subcode file; and build, from the abstract syntax tree, the local symbol table corresponding to each subcode file.
The code detecting apparatus may not have stored the lexical unit sequence table set and the local symbol table set obtained while building the global symbol table. In that case, when obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file, second acquisition unit 303 can judge whether the local hard disk, the server or the like stores the lexical unit sequence table set; if it does not, the code detecting apparatus needs to obtain the lexical unit sequence table and the local symbol table corresponding to each subcode file again.
Specifically, the second acquisition unit 303 may perform lexical analysis on each subcode file in the code file to be detected to obtain the lexical unit sequence table corresponding to each subcode file. For example, it may obtain the string value of each lexical unit in each subcode file, obtain the attribute information associated with each lexical unit, and generate a doubly linked list from the string values and attribute information of the lexical units, which yields the lexical unit sequence table corresponding to each subcode file.
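The lexical analysis step can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the `Token` class, the regular expression, and the choice of attributes (token kind and line number) are assumptions made for illustration. It shows only the idea of storing each lexical unit's string value and attributes in a doubly linked list, with a slot reserved for the symbol table pointer that is filled in later.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Token:
    """One lexical unit: its string value plus associated attributes (hypothetical layout)."""
    text: str                      # string value of the lexical unit
    kind: str                      # attribute: "ident", "number" or "op"
    line: int                      # attribute: 1-based source line
    prev: Optional["Token"] = field(default=None, repr=False)
    next: Optional["Token"] = field(default=None, repr=False)
    symbol: Optional[object] = field(default=None, repr=False)  # filled in by symbol linking


# Deliberately tiny token grammar; a real lexer for C++ would be far richer.
TOKEN_RE = re.compile(
    r"(?P<ident>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>::|->|[{}()\[\];,.*&=<>+\-/])"
)


def tokenize(source: str) -> Optional[Token]:
    """Return the head of a doubly linked list of tokens for the given source text."""
    head = tail = None
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            tok = Token(match.group(), match.lastgroup, lineno)
            if tail is None:
                head = tok
            else:
                tail.next = tok
                tok.prev = tail
            tail = tok
    return head


def texts(head: Optional[Token]) -> list:
    """Collect the string values by walking the list forwards."""
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out
```

A doubly linked list lets a later pass move both forwards and backwards from any lexical unit, for example to read the qualifier that precedes a name.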
The second acquisition unit 303 may also standardize each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to that subcode file, obtaining a standard subcode file for each. For example, it may derive a standard code format from the lexical unit sequence table of each subcode file and the standard logical format for code; search each subcode file in the code file to be detected for target code formats that do not match the standard code format; and modify those target code formats according to the standard code format, obtaining the standard subcode file corresponding to each subcode file.
Then, the second acquisition unit 303 obtains the lexical unit sequence table corresponding to each standard subcode file and sets it as the lexical unit sequence table corresponding to the respective subcode file. At this point, an abstract syntax tree corresponding to each subcode file can be built according to the lexical unit sequence table, and the local symbol table corresponding to each subcode file can be built according to the abstract syntax tree. For example, the class list, function list, and variable list corresponding to each subcode file in the code file to be detected can be obtained from the abstract syntax tree, and the local symbol table corresponding to each subcode file can be built from these lists.
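Deriving a local symbol table from an abstract syntax tree can be sketched as follows. This is only an analogy: the embodiment builds its own syntax tree from the lexical unit sequence table (the example files elsewhere in the description are C++), whereas this sketch uses Python's standard `ast` module on Python source as a stand-in. The function name `build_local_symbol_table` and the dictionary layout are assumptions.

```python
import ast


def build_local_symbol_table(source: str, filename: str) -> dict:
    """Collect the class list, function list and variable list of one source file."""
    tree = ast.parse(source, filename=filename)
    table = {"file": filename, "classes": [], "functions": [], "variables": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            table["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            table["functions"].append(node.name)
        elif isinstance(node, ast.Assign):
            # Only plain names are recorded; attribute and subscript targets are skipped.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    table["variables"].append(target.id)
    return table
```

Keeping the three lists separate mirrors the class list, function list, and variable list named in the text; a fuller table would also record scopes and qualified names.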
Updating unit 304, configured to update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table.
Determination unit 305, configured to determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
After the global symbol table and the local symbol table and lexical unit sequence table corresponding to each subcode file in the code file to be detected have been obtained, the updating unit 304 may update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table. For example, the updating unit 304 may traverse the lexical unit sequence table and look up and link the classes, functions, and variables in it. Based on the global symbol table built above, a context-sensitive symbol lookup algorithm provides symbol lookup across files or across modules, which yields more accurate code detection results.
The purpose of looking up and linking symbols such as classes, functions, and variables is to associate the classes, functions, variables, and so on encountered while traversing the lexical unit sequence table with the local symbol table or global symbol table in which they reside. The relevant information about a class, function, or variable is then available during traversal of the lexical unit sequence table, which can significantly improve the scanning efficiency of code check items.
In some embodiments, as shown in Figure 13, the updating unit 304 may include a second acquisition subelement 3041, a third acquisition subelement 3042, an update subelement 3043, and so on, specifically as follows:
Second acquisition subelement 3041, configured to obtain a subcode file from the code file to be detected as the current subcode file;
Third acquisition subelement 3042, configured to obtain the lexical unit sequence table and local symbol table corresponding to the current subcode file;
Update subelement 3043, configured to update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table;
The determination unit 305 may specifically be configured to: determine the detection result of the current subcode file according to the local symbol table, the global symbol table, and the updated lexical unit sequence table; and trigger the second acquisition subelement 3041 to again obtain a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been checked, obtaining the detection result of the code file to be detected.
The updating unit 304 may update the lexical unit sequence table through a context-sensitive symbol lookup algorithm. The algorithm traverses the lexical unit sequence table of each subcode file in the code file to be detected, obtains the type parameters in it (classes, functions, variables, and so on), and links each type parameter by pointer to the corresponding type parameter in the local symbol table or the global symbol table, thereby updating the lexical unit sequence table. Implementing the context-sensitive symbol lookup algorithm on top of the global symbol table obtained above is an important guarantee of the correctness of the code detection results.
Specifically, the second acquisition subelement 3041 may obtain a subcode file from the code file to be detected as the current subcode file. The third acquisition subelement 3042 then obtains the lexical unit sequence table corresponding to the current subcode file from the lexical unit sequence table collection and the local symbol table corresponding to the current subcode file from the local symbol table collection. Alternatively, the third acquisition subelement 3042 performs lexical analysis on the current subcode file to obtain its lexical unit sequence table, and builds the local symbol table corresponding to the current subcode file according to that lexical unit sequence table.
After the lexical unit sequence table and local symbol table of the current subcode file have been obtained, the update subelement 3043 may update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table.
In some embodiments, update subelement 3043 specifically can be used for:
Obtain a type parameter in the lexical unit sequence table, the type parameter including a type name and a qualified name;
When the type name of the type parameter is not a system type name and does not exist in the local symbol table, look up the type name of the type parameter in the global symbol table;
When the type name of the type parameter exists in the global symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table;
If the match succeeds, link the type parameter in the lexical unit sequence table by pointer to the type parameter in the global symbol table, obtaining an updated lexical unit sequence table.
Specifically, the update subelement 3043 may traverse the lexical unit sequence table of the current subcode file and obtain the type parameters in it, of which there may be one or more. A type parameter may be a class, a function, a variable, or the like, and includes a type name and a qualified name; the qualified name may have one or more components. For example, for the type parameter A::B::C, the type name is C and the qualified name is A::B.
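The split of a type parameter into its type name and qualified name can be written in a few lines. The helper name `split_type_parameter` is hypothetical, and the sketch assumes that components are separated by `::` as in the A::B::C example.

```python
def split_type_parameter(name: str) -> tuple:
    """Split 'A::B::C' into its type name 'C' and its qualified name 'A::B'."""
    parts = name.split("::")
    return parts[-1], "::".join(parts[:-1])
```

An unqualified name such as `main` yields an empty qualified name.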
After obtaining a type parameter, the update subelement 3043 may extract its type name and determine whether it is a system type name. System type names may include the type names stored in the data bundled with the code compilation system, for example, main. When the type name of the type parameter is a system type name, there is no need to look it up further in the local symbol table or the global symbol table; the lookup ends and the pointer from the type parameter to the symbol table is returned as null.
When the type name of the type parameter is not a system type name, the update subelement 3043 may further determine whether the type name exists in the local symbol table. When it does not, the type name can be looked up in the global symbol table. When the type name exists in the global symbol table, the qualified name of the type parameter in the lexical unit sequence table is matched against the qualified name of the type parameter in the global symbol table. If the match succeeds, the type parameter in the lexical unit sequence table is linked by pointer to the type parameter in the global symbol table; for example, the pointer from the type parameter in the lexical unit sequence table to the symbol table can be set to point to the type parameter in the global symbol table, obtaining an updated lexical unit sequence table.
In some embodiments, update subelement 3043 also specifically can be used for:
When the type name of the type parameter is not a system type name and exists in the local symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table;
If the match succeeds, link the type parameter in the lexical unit sequence table by pointer to the type parameter in the local symbol table, obtaining an updated lexical unit sequence table.
When the type name of the type parameter is not a system type name, the update subelement 3043 may further determine whether the type name exists in the local symbol table. When it does, the qualified name of the type parameter in the lexical unit sequence table is matched against the qualified name of the type parameter in the local symbol table. If the match succeeds, the type parameter in the lexical unit sequence table is linked by pointer to the type parameter in the local symbol table; for example, the pointer from the type parameter in the lexical unit sequence table to the symbol table can be set to point to the type parameter in the local symbol table, obtaining an updated lexical unit sequence table.
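The lookup order described above (system type names first, then the local symbol table, then the global symbol table, followed by qualified-name matching) can be sketched as follows. The table layout, the `SYSTEM_TYPE_NAMES` set, and the use of exact string equality for matching qualified names are all assumptions; the text does not specify how the match is performed, nor what happens when a name is present in the local symbol table but its qualified name does not match.

```python
from typing import Optional

# Hypothetical set of names supplied by the compilation system; the text gives "main" as an example.
SYSTEM_TYPE_NAMES = {"main"}


def resolve(type_name: str, qualified_name: str,
            local_table: dict, global_table: dict) -> Optional[dict]:
    """Return the symbol table entry a type parameter should point to, or None.

    Both tables map a type name to a list of entries, each entry being a dict
    with at least a "qualified_name" key (hypothetical layout).
    """
    if type_name in SYSTEM_TYPE_NAMES:
        return None                    # lookup ends, pointer stays null
    if type_name in local_table:
        candidates = local_table[type_name]
    elif type_name in global_table:
        candidates = global_table[type_name]
    else:
        return None
    for entry in candidates:
        if entry["qualified_name"] == qualified_name:
            return entry               # the caller stores this as the pointer link
    return None                        # assumption: an unmatched qualified name leaves the pointer null
```

The returned entry is what the lexical unit's symbol table pointer would be set to, so later passes can reach the symbol's stored information without repeating the lookup.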
After the lexical unit sequence table of the current subcode file has been updated, the determination unit 305 may determine the detection result of the current subcode file according to the global symbol table, the local symbol table of the current subcode file, and the updated lexical unit sequence table. For example, code check items can be scanned: based on the global symbol table and the symbol linking results built above, the code is scanned for each error scenario. That is, the updated lexical unit sequence table is traversed to find the classes, functions, variables, and so on in it; for each one found, the local symbol table and global symbol table in which it resides are consulted, and the detection result of the current subcode file is derived from the key features stored in the local symbol table or global symbol table. The code detection device may also output code error information in a given format; the output format can be set flexibly according to actual needs and is not limited here.
As an illustration, as shown in Figure 8, take the demo1.cpp and demo2.cpp subcode files mentioned above. Lookup and association of classes, functions, and variables based on the global symbol table can reveal problems that cannot be found from a single local symbol table. In the detection result, the demo.Func function call in the demo1.cpp subcode file is correctly associated with the CDemo2::Func function in the demo2.cpp subcode file. From the global symbol table it is known that when type is equal to 1 the return value is the null pointer NULL; this is a key feature, so a null pointer dereference of p at line 19 of the demo1.cpp subcode file can be reported.
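The cross-file null pointer check in this example can be approximated by a small sketch. Everything about the data layout is hypothetical: the `null_when` key standing in for the stored key feature, the event dictionaries standing in for the linked lexical unit sequence table, and line 18 for the call (the text gives only line 19 for the dereference). The sketch shows only how a key feature recorded in the global symbol table lets a check in one file use knowledge about a function defined in another.

```python
# Hypothetical global symbol table entry: CDemo2::Func returns NULL when type == 1.
GLOBAL_SYMBOLS = {
    "CDemo2::Func": {"file": "demo2.cpp", "null_when": {"type": 1}},
}


def check_null_dereference(events: list, symbols: dict) -> list:
    """Report (line, variable, callee) for each dereference of a possibly-null result."""
    maybe_null = {}   # variable name -> callee that may have returned NULL
    reports = []
    for event in events:
        if event["op"] == "assign_call":
            feature = symbols.get(event["callee"], {}).get("null_when")
            returns_null = feature is not None and all(
                event["args"].get(name) == value for name, value in feature.items()
            )
            if returns_null:
                maybe_null[event["var"]] = event["callee"]
            else:
                maybe_null.pop(event["var"], None)
        elif event["op"] == "deref" and event["var"] in maybe_null:
            reports.append((event["line"], event["var"], maybe_null[event["var"]]))
    return reports
```

With only the local symbol table of demo1.cpp, the `null_when` feature of `CDemo2::Func` would not be visible, so the dereference could not be flagged.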
After the lexical unit sequence table of the current subcode file has been updated and the detection result of the current subcode file has been determined, another subcode file can be obtained from the code file to be detected as the current subcode file. That is, the step of obtaining a subcode file from the code file to be detected as the current subcode file is executed again, until all subcode files in the code file to be detected have been checked, obtaining the detection result of the code file to be detected.
As can be seen from the above, in the embodiment of the present invention the first acquisition unit 301 obtains the code file to be detected; the construction unit 302 builds the global symbol table corresponding to the code file to be detected, which may include the global information of the code file to be detected; and the second acquisition unit 303 obtains the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected, where the local symbol table may include the local information of the subcode file. The updating unit 304 then updates the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table, and the determination unit 305 determines the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table. Because this scheme updates the lexical unit sequence table through the global symbol table and the local symbol table, and derives a global detection result for the entire code file to be detected from the updated lexical unit sequence table, the local symbol table, and the global symbol table, it achieves global detection of the code file to be detected rather than being limited to separate local detection of individual subcode files, which improves the accuracy of code detection.
Correspondingly, an embodiment of the present invention also provides a terminal, which may be a test terminal. As shown in Figure 15, the terminal may include a radio frequency (RF, Radio Frequency) circuit 601, a memory 602 including one or more computer-readable storage media, an input unit 603, a display unit 604, a sensor 605, an audio circuit 606, a Wireless Fidelity (WiFi, Wireless Fidelity) module 607, a processor 608 including one or more processing cores, a power supply 609, and other components. Those skilled in the art will understand that the terminal structure shown in Figure 15 does not limit the terminal, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently. Specifically:
The RF circuit 601 can be used to receive and send signals while messages are being sent and received or during a call. In particular, after downlink information from a base station is received, it is handed to one or more processors 608 for processing; uplink data is sent to the base station. In general, the RF circuit 601 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM, Subscriber Identity Module) card, a transceiver, a coupler, a low-noise amplifier (LNA, Low Noise Amplifier), a duplexer, and the like. The RF circuit 601 can also communicate with networks and other devices by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM, Global System for Mobile Communications), General Packet Radio Service (GPRS, General Packet Radio Service), Code Division Multiple Access (CDMA, Code Division Multiple Access), Wideband Code Division Multiple Access (WCDMA, Wideband Code Division Multiple Access), Long Term Evolution (LTE, Long Term Evolution), e-mail, Short Message Service (SMS, Short Message Service), and the like.
The memory 602 can be used to store software programs and modules; the processor 608 runs the software programs and modules stored in the memory 602 to execute various functional applications and data processing. The memory 602 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, the application programs required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data created through use of the terminal (such as audio data or a phone book) and the like. In addition, the memory 602 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices. Correspondingly, the memory 602 may also include a memory controller to provide the processor 608 and the input unit 603 with access to the memory 602.
The input unit 603 can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal input related to user settings and function control. Specifically, in one specific embodiment, the input unit 603 may include a touch-sensitive surface and other input devices. The touch-sensitive surface, also called a touch display screen or touchpad, collects touch operations by the user on or near it (such as operations performed on or near the touch-sensitive surface with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connected device according to a preset program. Optionally, the touch-sensitive surface may include two parts, a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 608, and can receive and execute commands sent by the processor 608. The touch-sensitive surface can be implemented in several types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface, the input unit 603 may also include other input devices, which may include but are not limited to one or more of a physical keyboard, function keys (such as volume control keys or a power key), a trackball, a mouse, a joystick, and the like.
The display unit 604 can be used to display information input by the user or information provided to the user, as well as the various graphical user interfaces of the terminal, which can be composed of graphics, text, icons, video, and any combination of these. The display unit 604 may include a display panel, which may optionally be configured as a liquid crystal display (LCD, Liquid Crystal Display), an organic light-emitting diode (OLED, Organic Light-Emitting Diode) display, or the like. Further, the touch-sensitive surface may cover the display panel. When the touch-sensitive surface detects a touch operation on or near it, it passes the operation to the processor 608 to determine the type of touch event, and the processor 608 then provides the corresponding visual output on the display panel according to the type of touch event. Although in Figure 15 the touch-sensitive surface and the display panel implement the input and output functions as two independent components, in some embodiments the touch-sensitive surface and the display panel can be integrated to implement the input and output functions.
The terminal may also include at least one sensor 605, such as an optical sensor, a motion sensor, or another sensor. Specifically, the optical sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel according to the brightness of the ambient light, and the proximity sensor can turn off the display panel and/or the backlight when the terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and, when at rest, the magnitude and direction of gravity. It can be used for applications that recognize the terminal's attitude (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and for functions related to vibration recognition (such as a pedometer or tap detection). Other sensors that can also be configured in the terminal, such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described here.
The audio circuit 606, a loudspeaker, and a microphone can provide an audio interface between the user and the terminal. The audio circuit 606 can convert received audio data into an electrical signal and transmit it to the loudspeaker, which converts it into a sound signal for output. Conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 606 receives and converts into audio data; the audio data is then output to the processor 608 for processing and sent through the RF circuit 601 to, for example, another terminal, or output to the memory 602 for further processing. The audio circuit 606 may also include an earphone jack to provide communication between a peripheral earphone and the terminal.
WiFi is a short-range wireless transmission technology. Through the WiFi module 607, the terminal can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although Figure 15 shows the WiFi module 607, it is understood that the module is not an essential part of the terminal and can be omitted as needed without changing the essence of the invention.
The processor 608 is the control center of the terminal. It connects the various parts of the entire terminal through various interfaces and lines, and executes the various functions of the terminal and processes data by running or executing the software programs and/or modules stored in the memory 602 and calling the data stored in the memory 602, thereby monitoring the terminal as a whole. Optionally, the processor 608 may include one or more processing cores. Preferably, the processor 608 may integrate an application processor, which mainly handles the operating system, user interface, application programs, and so on, and a modem processor, which mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 608.
The terminal also includes a power supply 609 (such as a battery) that powers the components. Preferably, the power supply can be logically connected to the processor 608 through a power management system, so that charging, discharging, power consumption management, and similar functions are handled through the power management system. The power supply 609 may also include one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and any other such components.
Although not shown, the terminal may also include a camera, a Bluetooth module, and the like, which are not described here. Specifically, in this embodiment, the processor 608 in the terminal loads the executable files corresponding to the processes of one or more application programs into the memory 602 according to the following instructions, and runs the application programs stored in the memory 602, thereby implementing various functions:
Obtain a code file to be detected; build the global symbol table corresponding to the code file to be detected; obtain the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected; update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table; and determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
Optionally, the step of building the global symbol table corresponding to the code file to be detected may include: obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, yielding a lexical unit sequence table collection; building, according to each lexical unit sequence table in the collection, the local symbol table corresponding to each subcode file in the code file to be detected, yielding a local symbol table collection; and building the global symbol table according to the local symbol table collection.
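The last of these steps, building the global symbol table from the local symbol table collection, can be sketched as a merge. The layout of a local symbol table (a file name plus class, function, and variable lists) and the layout of the resulting global table are assumptions; the text does not prescribe either.

```python
def build_global_symbol_table(local_tables: list) -> dict:
    """Merge per-file local symbol tables into one table keyed by symbol name.

    Each local table is assumed to be a dict with the keys "file", "classes",
    "functions" and "variables" (hypothetical layout).
    """
    global_table = {}
    for local in local_tables:
        for kind in ("classes", "functions", "variables"):
            for name in local[kind]:
                # Keep every definition site so that same-named symbols in
                # different files remain distinguishable.
                global_table.setdefault(name, []).append(
                    {"kind": kind, "file": local["file"]}
                )
    return global_table
```

Recording the defining file for each entry is what allows a name that is absent from one subcode file's local symbol table to be found in another file.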
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a given embodiment, refer to the detailed description of the code detection method above, which is not repeated here.
As can be seen from the above, the embodiment of the present invention updates the lexical unit sequence table through the global symbol table and the local symbol table, and derives a global detection result for the entire code file to be detected from the updated lexical unit sequence table, the local symbol table, and the global symbol table. It thereby achieves global detection of the code file to be detected rather than being limited to separate local detection of individual subcode files, which improves the accuracy of code detection.
Those of ordinary skill in the art will understand that all or some of the steps in the various methods of the above embodiments can be completed by instructions, or by instructions controlling the relevant hardware. The instructions can be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present invention provides a storage medium in which a plurality of instructions are stored. The instructions can be loaded by a processor to execute the steps in any of the code detection methods provided by the embodiments of the present invention. For example, the instructions may execute the following steps:
Obtain a code file to be detected; build the global symbol table corresponding to the code file to be detected; obtain the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected; update the lexical unit sequence table according to the local symbol table and the global symbol table, obtaining an updated lexical unit sequence table; and determine the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
Optionally, the step of building the global symbol table corresponding to the code file to be detected may include: obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, yielding a lexical unit sequence table collection; building, according to each lexical unit sequence table in the collection, the local symbol table corresponding to each subcode file in the code file to be detected, yielding a local symbol table collection; and building the global symbol table according to the local symbol table collection.
For the specific implementation of each of the above operations, refer to the preceding embodiments; it is not repeated here.
The storage medium may include: read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.
Because the instructions stored in the storage medium can execute the steps in any of the code detection methods provided by the embodiments of the present invention, they can achieve the beneficial effects achievable by any of those methods. See the preceding embodiments for details, which are not repeated here.
The code detection method, device, storage medium, and test terminal provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. For those skilled in the art, the specific implementation and scope of application may vary in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (23)

1. A code detection method, characterized by comprising:
Obtaining a code file to be detected;
Building a global symbol table corresponding to the code file to be detected;
Obtaining the lexical unit sequence table and local symbol table corresponding to each subcode file in the code file to be detected;
Updating the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table;
Determining the detection result of the code file to be detected according to the local symbol table, the global symbol table, and the updated lexical unit sequence table.
2. The code detection method according to claim 1, characterized in that the step of building the global symbol table corresponding to the code file to be detected comprises:
Obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table collection;
Building, according to each lexical unit sequence table in the lexical unit sequence table collection, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table collection;
Building the global symbol table according to the local symbol table collection.
3. The code detection method according to claim 2, characterized in that the step of obtaining the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table collection, comprises:
Performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file;
Standardizing each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain a standard code file set composed of the standard subcode files;
Obtaining the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, to obtain the lexical unit sequence table collection.
4. The code detection method according to claim 3, characterized in that the step of standardizing each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain the standard code file set composed of the standard subcode files, comprises:
obtaining a standard code format according to the lexical unit sequence table corresponding to each subcode file and a code standard logical format;
searching each subcode file in the code file to be detected for a target code format that does not match the standard code format; and
modifying the target code format according to the standard code format, to obtain the standard code file set composed of the standard subcode files.
5. The code detection method according to claim 3, characterized in that the step of performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file, comprises:
obtaining the string value of each lexical unit in each subcode file in the code file to be detected;
obtaining attribute information associated with each lexical unit; and
generating a doubly linked list according to the string value and the attribute information of each lexical unit, to obtain the lexical unit sequence table set composed of the lexical unit sequence tables corresponding to the subcode files.
6. The code detection method according to claim 5, characterized in that the step of obtaining the attribute information associated with each lexical unit comprises:
obtaining the pointers by which each lexical unit points to the various items of information in each subcode file in the code file to be detected, to obtain pointer information;
obtaining the characteristic information of each lexical unit; and
setting the pointer information and the characteristic information of each lexical unit as the attribute information associated with that lexical unit.
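The lexical unit sequence table of claims 5 and 6 can be pictured with a minimal Python sketch. It is an illustration only, not the implementation disclosed in the specification: the `Token` class, the `classify` helper, the regular expression and the keyword set are all hypothetical, and the attribute information is reduced to a line number and a coarse kind.

```python
import re

# Hypothetical, simplified lexer rules: identifiers, integers, single symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"class", "int", "return"}  # illustrative keyword set only


def classify(text):
    """Return a coarse kind for a lexical unit (assumed characteristic info)."""
    if text in KEYWORDS:
        return "keyword"
    if text[0].isalpha() or text[0] == "_":
        return "identifier"
    if text.isdigit():
        return "number"
    return "punctuation"


class Token:
    """One node of the doubly linked lexical unit sequence."""

    def __init__(self, text, line):
        self.text = text                                   # string value
        self.attrs = {"line": line, "kind": classify(text)}  # attribute info
        self.prev = None                                   # link to previous unit
        self.next = None                                   # link to next unit


def tokenize(source):
    """Build the doubly linked list for one subcode file and return its head."""
    head = tail = None
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            tok = Token(match.group(), line_no)
            if tail is None:
                head = tok
            else:
                tail.next = tok
                tok.prev = tail
            tail = tok
    return head


def to_list(head):
    """Collect the string values by walking the list forwards."""
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out
```

A doubly linked list lets a later stage step both forwards and backwards from any lexical unit, which is convenient when a check needs the neighbours of a token.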
7. The code detection method according to claim 2, characterized in that the step of building, according to each lexical unit sequence table in the lexical unit sequence table set, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set, comprises:
building, according to each lexical unit sequence table in the lexical unit sequence table set, an abstract syntax tree corresponding to each subcode file in the code file to be detected; and
building, according to the abstract syntax tree, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set.
8. The code detection method according to claim 7, characterized in that the step of building, according to the abstract syntax tree, the local symbol table corresponding to each subcode file in the code file to be detected, to obtain the local symbol table set, comprises:
obtaining, according to the abstract syntax tree, a class list, a function list and a variable list corresponding to each subcode file in the code file to be detected; and
building the local symbol table corresponding to each subcode file according to the class list, function list and variable list corresponding to that subcode file, to obtain the local symbol table set.
9. The code detection method according to claim 2, characterized in that the step of building the global symbol table according to the local symbol table set comprises:
merging the local symbol tables in the local symbol table set, to obtain a merged symbol table; and
for identical symbol parameters in the merged symbol table, retaining one of them and deleting the others, to obtain the global symbol table.
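The merge-and-deduplicate step of claim 9 can be sketched as follows. This is a hedged illustration under stated assumptions: each local symbol table is modelled as a plain dictionary keyed by a qualified symbol name, and "identical" is taken to mean "same key"; the function name `build_global_symbol_table` and the `file` field are hypothetical and do not come from the specification.

```python
def build_global_symbol_table(local_tables):
    """Merge per-file local symbol tables into one global table.

    local_tables maps a subcode file name to its local symbol table, which
    in this sketch is a dict from qualified symbol name to symbol info.
    When the same qualified name occurs in several local tables, only the
    first occurrence is retained and the later ones are dropped.
    """
    merged = {}
    for file_name, table in local_tables.items():
        for qualified_name, info in table.items():
            if qualified_name not in merged:
                # Record which file the retained symbol came from (assumed field).
                merged[qualified_name] = {**info, "file": file_name}
    return merged


local_tables = {
    "a.src": {"game.Player": {"kind": "class"}, "game.spawn": {"kind": "function"}},
    "b.src": {"game.Player": {"kind": "class"}, "math.Vector": {"kind": "class"}},
}
global_table = build_global_symbol_table(local_tables)
```

Here `game.Player` appears in both files, so the global table keeps the entry from `a.src` and ends up with three symbols in total.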
10. The code detection method according to claim 2, characterized in that the step of obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected comprises:
when the lexical unit sequence table set and the local symbol table set are stored, obtaining, from the lexical unit sequence table set, the lexical unit sequence table corresponding to each subcode file in the code file to be detected; and
obtaining, from the local symbol table set, the local symbol table corresponding to each subcode file in the code file to be detected.
11. The code detection method according to claim 2, characterized in that the step of obtaining the lexical unit sequence table and the local symbol table corresponding to each subcode file in the code file to be detected comprises:
when the lexical unit sequence table set and the local symbol table set are not stored, performing lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file;
building an abstract syntax tree corresponding to each subcode file according to the lexical unit sequence table; and
building the local symbol table corresponding to each subcode file according to the abstract syntax tree.
12. The code detection method according to any one of claims 1 to 11, characterized in that the steps of updating the lexical unit sequence table according to the local symbol table and the global symbol table to obtain the updated lexical unit sequence table, and determining the detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table, comprise:
obtaining a subcode file from the code file to be detected as a current subcode file;
obtaining the lexical unit sequence table and the local symbol table corresponding to the current subcode file;
updating the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table;
determining a detection result of the current subcode file according to the local symbol table, the global symbol table and the updated lexical unit sequence table; and
returning to the step of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been detected, to obtain the detection result of the code file to be detected.
13. The code detection method according to claim 12, characterized in that the step of updating the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain the updated lexical unit sequence table, comprises:
obtaining a type parameter in the lexical unit sequence table, the type parameter comprising a type name of the type parameter and a qualified name of the type parameter;
when the type name of the type parameter is not a system type name and the type name of the type parameter does not exist in the local symbol table, searching the global symbol table for the type name of the type parameter;
when the type name of the type parameter exists in the global symbol table, matching the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table; and
if the match succeeds, linking the type parameter in the lexical unit sequence table to the type parameter in the global symbol table by pointer, to obtain the updated lexical unit sequence table.
14. The code detection method according to claim 13, characterized in that, after the step of obtaining the type parameter in the lexical unit sequence table, the method further comprises:
when the type name of the type parameter is not a system type name and the type name of the type parameter exists in the local symbol table, matching the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table; and
if the match succeeds, linking the type parameter in the lexical unit sequence table to the type parameter in the local symbol table by pointer, to obtain the updated lexical unit sequence table.
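The lookup order recited in claims 13 and 14 (skip system types, try the local symbol table first, fall back to the global symbol table, then match on the qualified name and link) can be sketched as below. The data shapes are assumptions made for illustration: a token is a dictionary, each symbol table maps a type name to a list of candidate symbols, and the `SYSTEM_TYPES` set and `resolve_type` function are hypothetical names.

```python
SYSTEM_TYPES = {"int", "float", "bool", "string"}  # illustrative built-ins


def resolve_type(token, local_table, global_table):
    """Link a type token to its symbol, preferring the local symbol table.

    Returns the linked symbol, or None if the token is a system type or no
    symbol with a matching qualified name is found.
    """
    name = token["type_name"]
    if name in SYSTEM_TYPES:
        return None  # system types need no lookup
    # The global table is consulted only when the local table lacks the name.
    table = local_table if name in local_table else global_table
    for symbol in table.get(name, []):
        if symbol["qualified_name"] == token["qualified_name"]:
            token["symbol"] = symbol  # the "pointer link" of the claims
            return symbol
    return None


local_table = {"Player": [{"qualified_name": "game.Player"}]}
global_table = {
    "Player": [{"qualified_name": "other.Player"}],
    "Vector": [{"qualified_name": "math.Vector"}],
}
player = {"type_name": "Player", "qualified_name": "game.Player"}
vector = {"type_name": "Vector", "qualified_name": "math.Vector"}
resolve_type(player, local_table, global_table)
resolve_type(vector, local_table, global_table)
```

One consequence of reading the claims this way is that a type name present in the local table but with a non-matching qualified name is left unlinked rather than retried against the global table; whether the actual embodiment behaves so is not stated in the claims.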
15. The code detection method according to any one of claims 1 to 11, characterized in that the step of obtaining the code file to be detected comprises:
obtaining a code file, and extracting invalid code content from the code file according to code rules; and
filtering out the invalid code content, to obtain the code file to be detected.
16. The code detection method according to claim 15, characterized in that the step of extracting the invalid code content from the code file according to the code rules comprises:
extracting redundant characters from the code file according to the code rules;
extracting comment content from the code file according to a comment identifier in the code rules;
extracting preprocessing instructions from the code file according to a preprocessing identifier in the code rules; and
setting the redundant characters, the comment content and the preprocessing instructions as the invalid code content.
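The filtering described in claims 15 and 16 can be illustrated with a small sketch. The claims do not fix a source language, so the code below assumes C-style conventions (`//` and `/* */` comments, `#` preprocessing lines) and treats trailing whitespace and blank lines as the redundant characters; the function name `strip_invalid_content` is hypothetical. As a further simplification, comment markers inside string literals are not handled.

```python
import re

# Assumed C-style identifiers for comments and preprocessing instructions.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT = re.compile(r"//[^\n]*")
PREPROCESSOR = re.compile(r"^[ \t]*#[^\n]*$", re.MULTILINE)


def strip_invalid_content(source):
    """Remove comments, preprocessing lines and redundant whitespace."""
    source = BLOCK_COMMENT.sub("", source)   # comment content (block form)
    source = LINE_COMMENT.sub("", source)    # comment content (line form)
    source = PREPROCESSOR.sub("", source)    # preprocessing instructions
    # Redundant characters: trailing whitespace and now-empty lines.
    lines = (line.rstrip() for line in source.splitlines())
    return "\n".join(line for line in lines if line)


raw = "#include <stdio.h>\n/* header */\nint x = 1;   // counter\n\n\nint y = 2;\n"
cleaned = strip_invalid_content(raw)
```

Running the filter before lexical analysis keeps comments and directives out of the lexical unit sequence table, so later stages only see code that can affect the detection result.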
17. A code detection device, characterized by comprising:
a first acquisition unit, configured to obtain a code file to be detected;
a construction unit, configured to build a global symbol table corresponding to the code file to be detected;
a second acquisition unit, configured to obtain a lexical unit sequence table and a local symbol table corresponding to each subcode file in the code file to be detected;
an updating unit, configured to update the lexical unit sequence table according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table; and
a determination unit, configured to determine a detection result of the code file to be detected according to the local symbol table, the global symbol table and the updated lexical unit sequence table.
18. The code detection device according to claim 17, characterized in that the construction unit comprises:
a first obtaining subunit, configured to obtain the lexical unit sequence table corresponding to each subcode file in the code file to be detected, to obtain a lexical unit sequence table set;
a first building subunit, configured to build, according to each lexical unit sequence table in the lexical unit sequence table set, a local symbol table corresponding to each subcode file in the code file to be detected, to obtain a local symbol table set; and
a second building subunit, configured to build the global symbol table according to the local symbol table set.
19. The code detection device according to claim 18, characterized in that the first obtaining subunit comprises:
an analysis module, configured to perform lexical analysis on each subcode file in the code file to be detected, to obtain the lexical unit sequence table corresponding to each subcode file;
a processing module, configured to standardize each subcode file in the code file to be detected according to the lexical unit sequence table corresponding to each subcode file, to obtain a standard code file set composed of the standard subcode files; and
an acquisition module, configured to obtain the lexical unit sequence table corresponding to each standard subcode file in the standard code file set, to obtain the lexical unit sequence table set.
20. The code detection device according to any one of claims 17 to 19, characterized in that the updating unit comprises:
a second obtaining subunit, configured to obtain a subcode file from the code file to be detected as a current subcode file;
a third obtaining subunit, configured to obtain the lexical unit sequence table and the local symbol table corresponding to the current subcode file; and
an updating subunit, configured to update the lexical unit sequence table corresponding to the current subcode file according to the local symbol table and the global symbol table, to obtain an updated lexical unit sequence table;
wherein the determination unit is specifically configured to: determine a detection result of the current subcode file according to the local symbol table, the global symbol table and the updated lexical unit sequence table; and trigger the second obtaining subunit to perform the operation of obtaining a subcode file from the code file to be detected as the current subcode file, until all subcode files in the code file to be detected have been detected, to obtain the detection result of the code file to be detected.
21. The code detection device according to claim 20, characterized in that the updating subunit is specifically configured to:
obtain a type parameter in the lexical unit sequence table, the type parameter comprising a type name of the type parameter and a qualified name of the type parameter;
when the type name of the type parameter is not a system type name and the type name of the type parameter exists in the local symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the local symbol table; and, if the match succeeds, link the type parameter in the lexical unit sequence table to the type parameter in the local symbol table by pointer, to obtain the updated lexical unit sequence table;
when the type name of the type parameter is not a system type name and the type name of the type parameter does not exist in the local symbol table, search the global symbol table for the type name of the type parameter;
when the type name of the type parameter exists in the global symbol table, match the qualified name of the type parameter in the lexical unit sequence table against the qualified name of the type parameter in the global symbol table; and
if the match succeeds, link the type parameter in the lexical unit sequence table to the type parameter in the global symbol table by pointer, to obtain the updated lexical unit sequence table.
22. A storage medium, characterized in that the storage medium stores a plurality of instructions, the instructions being suitable for loading by a processor to perform the steps of the code detection method according to any one of claims 1 to 16.
23. A test terminal, characterized in that the test terminal comprises at least one memory and at least one processor; the memory stores a program, and the processor calls the program to perform the steps of the code detection method according to any one of claims 1 to 16.
CN201810321498.XA 2018-04-11 2018-04-11 Code detection method and device, storage medium and test terminal Active CN108549538B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810321498.XA CN108549538B (en) 2018-04-11 2018-04-11 Code detection method and device, storage medium and test terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810321498.XA CN108549538B (en) 2018-04-11 2018-04-11 Code detection method and device, storage medium and test terminal

Publications (2)

Publication Number Publication Date
CN108549538A true CN108549538A (en) 2018-09-18
CN108549538B CN108549538B (en) 2021-03-02

Family

ID=63514479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810321498.XA Active CN108549538B (en) 2018-04-11 2018-04-11 Code detection method and device, storage medium and test terminal

Country Status (1)

Country Link
CN (1) CN108549538B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03255533A (en) * 1990-03-06 1991-11-14 Fujitsu Ltd Symbol managing system in programming language processing system
CN103780263A (en) * 2012-10-22 2014-05-07 株式会社特博睿 Device and method of data compression and recording medium
CN105930267A (en) * 2016-04-15 2016-09-07 中国工商银行股份有限公司 Database dictionary based storage process static detection method and system
CN106227668A (en) * 2016-07-29 2016-12-14 腾讯科技(深圳)有限公司 Data processing method and device

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109582575B (en) * 2018-11-27 2022-03-22 网易(杭州)网络有限公司 Game testing method and device
CN109582575A (en) * 2018-11-27 2019-04-05 网易(杭州)网络有限公司 Game test method and device
CN110309050A (en) * 2019-05-22 2019-10-08 深圳壹账通智能科技有限公司 Detection method, device, server and the storage medium of code specification
CN110297639A (en) * 2019-07-01 2019-10-01 北京百度网讯科技有限公司 Method and apparatus for detecting code
CN110297639B (en) * 2019-07-01 2023-03-21 北京百度网讯科技有限公司 Method and apparatus for detecting code
CN110489973A (en) * 2019-08-06 2019-11-22 广州大学 A kind of intelligent contract leak detection method, device and storage medium based on Fuzz
CN110489127A (en) * 2019-08-12 2019-11-22 腾讯科技(深圳)有限公司 Error code determines method, apparatus, computer readable storage medium and equipment
CN110489127B (en) * 2019-08-12 2023-10-13 腾讯科技(深圳)有限公司 Error code determination method, apparatus, computer-readable storage medium and device
CN110879709A (en) * 2019-11-29 2020-03-13 五八有限公司 Detection method and device of useless codes, terminal equipment and storage medium
CN111651198A (en) * 2020-04-20 2020-09-11 北京大学 Automatic code abstract generation method and device
CN111651198B (en) * 2020-04-20 2021-04-13 北京大学 Automatic code abstract generation method and device
CN112276263A (en) * 2020-10-14 2021-01-29 宁波市博虹机械制造开发有限公司 G code-based special motion control method for electric spark forming machine
CN112651213A (en) * 2020-12-25 2021-04-13 军工保密资格审查认证中心 Safety examination method and device for numerical control program
CN113946347A (en) * 2021-09-29 2022-01-18 北京五八信息技术有限公司 Function call detection method and device, electronic equipment and readable medium
CN117149663A (en) * 2023-10-30 2023-12-01 合肥中科类脑智能技术有限公司 Multi-target detection algorithm deployment method and device, electronic equipment and medium
CN117149663B (en) * 2023-10-30 2024-02-02 合肥中科类脑智能技术有限公司 Multi-target detection algorithm deployment method and device, electronic equipment and medium

Also Published As

Publication number Publication date
CN108549538B (en) 2021-03-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant