CN106227668A - Data processing method and device - Google Patents


Publication number
CN106227668A
Authority
CN
China
Prior art keywords
lexical unit
code
unit sequence
type
code file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610613852.7A
Other languages
Chinese (zh)
Other versions
CN106227668B (en)
Inventor
邹越
严明
张蓓
黄斌
袁明凯
魏学峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/146 Coding or compression of tree-structured data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/315 Object-oriented languages
    • G06F8/70 Software maintenance or management
    • G06F8/73 Program documentation

Abstract

The invention discloses a data processing method and device. The data processing method includes: obtaining a code file, where the code file is source program text containing a character sequence; performing lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; parsing the code file to obtain preset objects; associating the lexical unit sequence with the preset objects to build a global symbol table, where the global symbol table records the data information of all preset objects in the code file; and performing a static code scan on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the types of the lexical unit sequence. The invention addresses the technical problem in the related art that code scanning has low accuracy.

Description

Data processing method and device
Technical Field
The present invention relates to the field of data processing, and in particular to a data processing method and device.
Background
Current data processing methods that generate symbolization offer no solution for global symbols. When upper-layer static code check items run their scans, results based on non-global symbols are not accurate enough.
In a traditional compilation process, an abstract syntax tree (AST) establishes the logical relationships between code expressions, for example the relationship between the if statement and the else statement in an if-else block; it does not build an abstract syntax structure for a single code expression. Such a tree requires code that compiles as its input: once the input code contains a syntax error, the overall abstract syntax tree that is built is wrong and has no reference value. Detecting and analyzing code that does not compile therefore disrupts the overall symbolization flow and its results. As a consequence, symbolization based on an abstract syntax tree has low accuracy, and scans of the symbolized code have low accuracy as well. This makes it hard for programmers to find the defects in the code, reduces usability and security, lowers the efficiency with which programmers handle code, and raises repair costs.
Current data processing also relies on simple string matching, which has low accuracy. In addition, the data structures are designed to be large, so the data cache cannot be fully used and data storage efficiency drops.
No effective solution has yet been proposed for the problem that code scanning in the related art has low accuracy.
Summary of the Invention
Embodiments of the present invention provide a data processing method and device, to at least solve the technical problem in the related art that code scanning has low accuracy.
According to one aspect of the embodiments of the present invention, a data processing method is provided. The data processing method includes: obtaining a code file, where the code file is source program text containing a character sequence; performing lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; parsing the code file to obtain preset objects; associating the lexical unit sequence with the preset objects to build a global symbol table, where the global symbol table records the data information of all preset objects in the code file; and performing a static code scan on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the types of the lexical unit sequence.
According to another aspect of the embodiments of the present invention, a data processing device is also provided. The data processing device includes: a first obtaining unit, configured to obtain a code file, where the code file is source program text containing a character sequence; an analysis unit, configured to perform lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; a parsing unit, configured to parse the code file to obtain preset objects; an association unit, configured to associate the lexical unit sequence with the preset objects to build a global symbol table, where the global symbol table records the data information of all preset objects in the code file; and a scanning unit, configured to perform a static code scan on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the types of the lexical unit sequence.
In the embodiments of the present invention, a code file is obtained, where the code file is source program text containing a character sequence; lexical analysis is performed on the character sequence in the code file to obtain a lexical unit sequence; the code file is parsed to obtain preset objects; the lexical unit sequence is associated with the preset objects to build a global symbol table, which records the data information of all preset objects in the code file; and a static code scan is performed on the code file according to the global symbol table to obtain a scan result that includes at least the result of looking up the types of the lexical unit sequence. Because the global symbol table is built by associating the lexical unit sequence with the preset objects, and the static code scan is driven by that table, the code file is symbolized. This achieves the technical effect of improving the accuracy of code scanning and thereby solves the technical problem in the related art that code scanning has low accuracy.
Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a hardware block diagram of a computer terminal for a data processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method of building a tree-shaped syntax structure from a lexical unit sequence according to an embodiment of the present invention;
Fig. 4 is a flowchart of a method of associating a lexical unit sequence with preset objects to build a global symbol table according to an embodiment of the present invention;
Fig. 5 is a flowchart of a method according to an embodiment of the present invention of performing, according to the processed code, type association or function association between the lexical unit sequence and the preset objects corresponding to the lexical unit sequence;
Fig. 6 is a flowchart of a method of performing a static code scan on a code file according to a global symbol table according to an embodiment of the present invention;
Fig. 7 is a flowchart of another method of performing a static code scan on a code file according to a global symbol table according to an embodiment of the present invention;
Fig. 8 is a flowchart of another method of performing lexical analysis on the character stream in a code file according to an embodiment of the present invention;
Fig. 9 is a flowchart of a method of building a data flow model according to a global symbol table according to an embodiment of the present invention;
Fig. 10 is a flowchart of another data processing method according to an embodiment of the present invention;
Fig. 11 is a schematic diagram of the code of a code file before preprocessing according to an embodiment of the present invention;
Fig. 12 is a schematic diagram of the code of a code file after preprocessing according to an embodiment of the present invention;
Fig. 13 is a schematic diagram of code that performs lexical analysis on preprocessed code to obtain a lexical unit sequence according to an embodiment of the present invention;
Fig. 14 is a schematic diagram of a lexical unit sequence according to an embodiment of the present invention;
Fig. 15 is a schematic diagram of simplifying a lexical unit sequence according to an embodiment of the present invention;
Fig. 16 is a schematic diagram of another way of simplifying a lexical unit sequence according to an embodiment of the present invention;
Fig. 17 is a schematic diagram of an abstract syntax tree structure according to an embodiment of the present invention;
Fig. 18 is a flowchart of a method of building a global symbol table according to an embodiment of the present invention;
Fig. 19 is a schematic diagram of the classes of objects according to an embodiment of the present invention;
Fig. 20 is a schematic diagram of a global type hash table according to an embodiment of the present invention;
Fig. 21 is a schematic diagram of code that processes an alias directive according to an embodiment of the present invention;
Fig. 22 is a schematic diagram of looking up the type of a lexical unit according to an embodiment of the present invention;
Fig. 23 is a flowchart of a method of performing a static code scan on a code file according to an embodiment of the present invention;
Fig. 24 is a schematic diagram of source code according to an embodiment of the present invention;
Fig. 25 is a schematic diagram of a debug file output by symbolization according to an embodiment of the present invention;
Fig. 26 is a schematic diagram of code that builds a data flow model according to an embodiment of the present invention;
Fig. 27 is a schematic diagram of a data processing device according to an embodiment of the present invention;
Fig. 28 is a schematic diagram of another data processing device according to an embodiment of the present invention; and
Fig. 29 is a structural block diagram of a terminal according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the present invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims, and the above drawings distinguish similar objects and do not describe a specific order or sequence. Data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed; it may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, an embodiment of a data processing method is provided.
Optionally, in this embodiment, the data processing method may be applied in the hardware environment shown in Fig. 1, which consists of a server 102 and a terminal 104. Fig. 1 is a hardware block diagram of a computer terminal for a data processing method according to an embodiment of the present invention. As shown in Fig. 1, the server 102 is connected to the terminal 104 through a network. The network includes but is not limited to a wide area network, a metropolitan area network, or a local area network, and the terminal 104 is not limited to a PC, a mobile phone, a tablet computer, and the like. The data processing method of the embodiment may be executed by the server 102, by the terminal 104, or jointly by the server 102 and the terminal 104. When the terminal 104 executes the method, it may do so through a client installed on it.
Fig. 2 is a flowchart of a data processing method according to an embodiment of the present invention. As shown in Fig. 2, the data processing method may include the following steps:
Step S202: obtain a code file.
In the technical solution provided by step S202 of this application, a code file is obtained. The code file is source program text containing a character sequence, and may be a code file of a C# program; C# is a programming language developed by Microsoft. The character sequence is also called a character stream. The code file is obtained by receiving it as input.
Step S204: perform lexical analysis on the character sequence in the code file to obtain a lexical unit sequence.
In the technical solution provided by step S204 of this application, lexical analysis is performed on the character sequence in the code file to obtain a lexical unit sequence; that is, the character sequence of the code file is converted into a sequence of words.
Lexical analysis is the process of converting a character sequence into a sequence of words (tokens). The program or function that performs it is called a lexical analyzer (lexer for short), or a scanner. A lexical analyzer usually takes the form of a function that the syntax analyzer calls. It scans the source program from left to right, recognizes each kind of word according to the lexical rules of the language, and produces the attributes of each word.
After the code file is obtained, its character sequence is read and grouped into lexemes, and a lexical unit sequence is generated and output. The lexical unit sequence is the set of all lexical units produced by lexical analysis, and it is the basic data on which subsequent processing and upper-layer check items rely when they traverse the code. In essence it is a doubly linked list that maintains all lexical units; for example, if and for are lexical units. A doubly linked list is a kind of linked list in which each data node has two pointers, one to its direct successor and one to its direct predecessor, so the predecessor and successor of any node can be reached easily.
Each lexical unit corresponds to one lexeme. It holds the string value of that lexeme together with other attributes associated with the lexical unit: a pointer to the next lexical unit; a pointer to the previous lexical unit; a pointer to the paired lexical unit, that is, the lexical unit that matches the current one (if the current lexical unit is "(", the paired lexical unit is ")"); the type of the lexical unit, such as number, string, variable, function, or keyword; a pointer from the lexical unit into the symbol table, so that a variable lexical unit points to the variable object in the symbol table and a function lexical unit points to the corresponding function object; the line number of the lexical unit; a syntax tree structure pointer, which maintains the abstract syntax tree structure of the lexical unit; and a data flow structure pointer.
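The lexical unit sequence described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Token`, `tokenize`, and `walk`, the regular expression, and the small keyword set are all assumptions made for the example. It shows a doubly linked list of lexical units, each carrying its text, type, line number, and a pointer to its paired bracket.

```python
import re
from dataclasses import dataclass, field
from typing import Iterator, Optional

# Hypothetical, deliberately small keyword set and bracket pairs.
KEYWORDS = {"if", "else", "for", "while", "return", "class"}
PAIRS = {"(": ")", "[": "]", "{": "}"}

TOKEN_RE = re.compile(
    r'(?P<number>\d+(?:\.\d+)?)'
    r'|(?P<string>"(?:\\.|[^"\\])*")'
    r'|(?P<name>[A-Za-z_]\w*)'
    r'|(?P<newline>\n)'
    r'|(?P<space>[ \t\r]+)'
    r'|(?P<symbol>.)'
)


@dataclass(eq=False)
class Token:
    text: str                      # string value of the lexeme
    kind: str                      # "number", "string", "keyword", "name", "symbol"
    line: int                      # line number in the source text
    prev: Optional["Token"] = field(default=None, repr=False)
    next: Optional["Token"] = field(default=None, repr=False)
    match: Optional["Token"] = field(default=None, repr=False)  # paired bracket
    symbol: object = None          # symbol-table entry, filled in by a later pass


def tokenize(source: str) -> Optional[Token]:
    """Return the head of a doubly linked list of tokens (None if empty)."""
    head = tail = None
    open_brackets = []
    line = 1
    for m in TOKEN_RE.finditer(source):
        kind, text = m.lastgroup, m.group()
        if kind == "newline":
            line += 1
            continue
        if kind == "space":
            continue
        if kind == "name" and text in KEYWORDS:
            kind = "keyword"
        tok = Token(text, kind, line)
        if tail is None:
            head = tok
        else:
            tail.next, tok.prev = tok, tail
        tail = tok
        # Link each closing bracket to the most recent matching opener.
        if kind == "symbol" and text in PAIRS:
            open_brackets.append(tok)
        elif kind == "symbol" and open_brackets and PAIRS[open_brackets[-1].text] == text:
            opener = open_brackets.pop()
            opener.match, tok.match = tok, opener
    return head


def walk(head: Optional[Token]) -> Iterator[Token]:
    """Iterate the list from head to tail via the next pointers."""
    tok = head
    while tok is not None:
        yield tok
        tok = tok.next
```

Unbalanced brackets simply keep `match` set to `None`, which fits the stated goal of tolerating code that does not compile.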
Step S206: parse the code file to obtain preset objects.
In the technical solution provided by step S206 of this application, the code file is parsed to obtain preset objects.
After the code file is obtained, it is parsed to generate the preset objects that correspond to it, such as classes, namespaces, methods, and fields, and the containment relationships between the preset objects are established. Optionally, the code file comprises multiple code files; all of them are parsed in turn, the corresponding preset objects (classes, namespaces, methods, fields, and so on) are generated for each, and the containment relationships between all preset objects are established. A preset object may be the base class of all objects, which corresponds to code that has logical meaning; it may be the base class of classes, methods, properties and the like, as distinct from the objects that correspond to segments of lexical units inside a method; it may also be a namespace, enumeration, method, field, delegate, event, property, indexer, and so on. Each preset object has a corresponding type, and the inheritance relationships between types are chosen mainly according to whether the types are structurally similar at the code level.
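The containment relationships between preset objects can be pictured with a small sketch. The `Scope` class and the `parse_scopes` function below are hypothetical names, and the "parser" only recognizes the words `namespace` and `class` followed by a braced body; a real parser for C# would be far more involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Scope:
    """A preset object: something with logical meaning that can contain others."""
    name: str
    kind: str                                   # "file", "namespace", "class", ...
    parent: Optional["Scope"] = field(default=None, repr=False)
    children: List["Scope"] = field(default_factory=list)

    def add(self, child: "Scope") -> "Scope":
        child.parent = self
        self.children.append(child)
        return child

    def full_name(self) -> str:
        """Dotted name built by walking up the containment chain."""
        parts = []
        node = self
        while node is not None:
            if node.name:
                parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts))


def parse_scopes(source: str) -> Scope:
    """Build the containment tree for a toy subset: namespace/class + braces."""
    root = Scope("", "file")
    stack = [root]
    pending = None                              # declaration waiting for its "{"
    words = source.replace("{", " { ").replace("}", " } ").split()
    i = 0
    while i < len(words):
        word = words[i]
        if word in ("namespace", "class") and i + 1 < len(words):
            pending = Scope(words[i + 1], word)
            i += 2
            continue
        if word == "{":
            if pending is not None:
                stack[-1].add(pending)
                stack.append(pending)
                pending = None
            else:
                stack.append(stack[-1])         # anonymous block: same scope
        elif word == "}" and len(stack) > 1:
            stack.pop()                         # stray "}" at top level is ignored
        i += 1
    return root
```

For example, `parse_scopes("namespace Game { class Player { } }")` yields a root whose only child is the `Game` namespace, which in turn contains `Player`, whose `full_name()` is `Game.Player`.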
Step S208: associate the lexical unit sequence with the preset objects to build a global symbol table.
In the technical solution provided by step S208 of this application, the lexical unit sequence is associated with the preset objects to build a global symbol table, which records the data information of all preset objects in the code file. Building the global symbol table is the process of symbolizing the code file.
Symbolization may mean that instructions, instruction addresses, constants, variables, registers and the like are all displayed as meaningful, highly readable symbols. A symbol table is a data structure used by language translators. In a symbol table, each identifier in the program source code is bound to its declaration or usage information, such as its data type, scope, and memory address. While a compiler works, it continually collects, records, and uses information such as the types and characteristics of the syntactic symbols in the source program. This information is usually stored in the system in tabular form, for example as a constant table, a variable name table, an array name table, a procedure name table, and a label table, which together are called the symbol table. How the symbol table is organized, structured, and managed directly affects the efficiency of the compilation system.
After lexical analysis of the character sequence has produced the lexical unit sequence and parsing of the code file has produced the preset objects, the definitions of the different parts of the same type in the code file need to be associated; that is, the different keyword fragments that belong to the same type are linked together, and the global symbol table is built from them. The global symbol table consists of keys and the values that correspond to them. It records the data information of all preset objects in the code file, that is, the formatted information of all logical objects in the code, and can be used to scan all the code in the code file, which improves the accuracy of code scanning.
Step S210: perform a static code scan on the code file according to the global symbol table to obtain a scan result.
In the technical solution provided by step S210 of this application, a static code scan is performed on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the types of the lexical unit sequence.
After the lexical unit sequence has been associated with the preset objects to build the global symbol table, a static code scan is performed on the code file according to that table. In software engineering, static code scanning means that a programmer who has finished writing source code scans it directly with scanning tools, without compiling it and without setting up a runtime environment for it. This saves considerable labor and time, improves development efficiency, and uncovers many security vulnerabilities in the source code that manual review alone cannot find, which improves the accuracy of code scanning, greatly reduces security risk in the project, and improves software quality. The method thus supplies accurate and efficient symbolization results to the upper-layer static code check items, so that the check items gain syntax-level scanning, cross-function scanning, semantic-level analysis, and logical analysis capabilities. The resulting scan output helps developers and testers quickly locate problems hidden in the code, which improves code quality and lowers the cost of repairing the code later. Because missing code files, missing type definitions, and syntax errors are all taken into account, the symbolization process of this embodiment does not need to compile the input code file, nor does it require that the code file can be compiled.
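As a rough illustration of a check item that uses the symbol table to look up types, the sketch below resolves a name by trying the innermost enclosing scope first and then moving outward. The flat dictionary layout, the dotted-key convention, and the names `resolve_type` and `scan_unresolved` are assumptions for the example only; the source does not specify how lookups are implemented.

```python
from typing import Dict, List, Optional, Tuple


def resolve_type(name: str, scope: str, table: Dict[str, str]) -> Optional[str]:
    """Look up `name` starting in `scope` (a dotted path) and moving outward.

    `table` maps fully qualified names to a type description; this flat
    layout is a simplification chosen for the sketch.
    """
    parts = scope.split(".") if scope else []
    while True:
        key = ".".join(parts + [name])
        if key in table:
            return table[key]
        if not parts:
            return None          # not found anywhere: type is unknown
        parts.pop()              # retry in the enclosing scope


def scan_unresolved(
    names: List[Tuple[str, int]], scope: str, table: Dict[str, str]
) -> List[Tuple[str, int]]:
    """A toy check item: report (name, line) pairs whose type cannot be found."""
    return [(n, line) for n, line in names if resolve_type(n, scope, table) is None]


# Hypothetical symbol table for a small project.
symbols = {
    "Game.Player": "class",
    "Game.Player.hp": "int",
    "Game.MaxHp": "int",
}
```

With this table, `resolve_type("MaxHp", "Game.Player", symbols)` falls back from `Game.Player.MaxHp` to `Game.MaxHp`. A name that resolves nowhere is reported rather than aborting the scan, which is in keeping with scanning code that has missing definitions.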
Optionally, performing a static code scan on the code file according to the global symbol table applies to static code checking of all C# projects.
Through steps S202 to S210 above, a code file is obtained, where the code file is source program text containing a character sequence; lexical analysis is performed on the character sequence to obtain a lexical unit sequence; the code file is parsed to obtain preset objects; the lexical unit sequence is associated with the preset objects to build a global symbol table that records the data information of all preset objects in the code file; and a static code scan is performed on the code file according to the global symbol table to obtain a scan result that includes at least the result of looking up the types of the lexical unit sequence. This solves the technical problem in the related art that code scanning has low accuracy, and achieves the technical effect of improving the accuracy of code scanning.
As an optional implementation, before the lexical unit sequence is associated with the preset objects to build the global symbol table, the lexical unit sequence is simplified without changing the logic of the code file, yielding a simplified lexical unit sequence. Associating the lexical unit sequence with the preset objects to build the global symbol table then includes: associating the simplified lexical unit sequence with the preset objects to build the global symbol table.
Code from different projects and from different programmers differs in style, so code files are diverse, which makes the global symbol table harder to build. Therefore, before the lexical unit sequence is associated with the preset objects, it is simplified without changing the logic of the code file; that is, logically equivalent replacements are applied to the lexical unit sequence to obtain a simplified lexical unit sequence. The simplification step unifies the style of the code files, which lowers the cost of global symbolization and improves the accuracy of upper-layer static scanning. After the simplified lexical unit sequence is obtained, it is associated with the preset objects to build the global symbol table.
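One possible logically equivalent replacement is sketched below: wrapping a brace-less `if` body in braces so that later passes see one uniform shape. The specific rule is an assumption chosen for illustration, since this passage does not list the actual simplification rules, and the function name `add_braces` is hypothetical. For brevity the sketch works on a plain list of token strings rather than a linked list, and it treats the body as everything up to the next `;`, so nested brace-less statements are not handled.

```python
from typing import List


def add_braces(tokens: List[str]) -> List[str]:
    """Rewrite `if (cond) stmt;` as `if (cond) { stmt; }` without changing logic."""
    out: List[str] = []
    i, n = 0, len(tokens)
    while i < n:
        out.append(tokens[i])
        if tokens[i] == "if" and i + 1 < n and tokens[i + 1] == "(":
            # Find the ")" that closes the condition.
            depth, j = 0, i + 1
            while j < n:
                if tokens[j] == "(":
                    depth += 1
                elif tokens[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            if j >= n:                      # unbalanced condition: leave as-is
                i += 1
                continue
            out.extend(tokens[i + 1 : j + 1])
            body = j + 1
            if body < n and tokens[body] != "{":
                end = body
                while end < n and tokens[end] != ";":
                    end += 1
                if end < n:                 # wrap the single statement
                    out.append("{")
                    out.extend(tokens[body : end + 1])
                    out.append("}")
                    i = end + 1
                    continue
            i = body
            continue
        i += 1
    return out
```

Input that is already braced, or whose condition is unbalanced, passes through unchanged, so the rewrite never makes broken code worse.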
As an optional implementation, after the lexical unit sequence has been simplified to obtain the simplified lexical unit sequence, a tree-shaped syntax structure is built from the simplified lexical unit sequence, where the tree-shaped syntax structure is a tree structure for storing data objects that have a preset syntax. Associating the simplified lexical unit sequence with the preset objects to build the global symbol table then includes: associating the simplified lexical unit sequence with the preset objects according to the tree-shaped syntax structure to build the global symbol table.
The tree-shaped syntax structure is the tree representation of the abstract syntax structure of the code file, that is, an abstract syntax tree (AST). It is a binary tree in which each non-leaf node represents an operator and its two child nodes represent the two operands of that operator. The tree-shaped syntax structure captures the logical structure of an expression and the precedence of its operators, which improves the accuracy of matching code scenarios and the efficiency of implementing the scenarios that correspond to the code file. After the simplified lexical unit sequence is obtained, the tree-shaped syntax structure is built from it, and the simplified lexical unit sequence is then associated with the preset objects according to the tree-shaped syntax structure to build the global symbol table.
As an optional implementation, the tree-shaped syntax structure is built from a single code expression in the lexical unit sequence.
Fig. 3 is a flowchart of a method of building a tree-shaped syntax structure from a lexical unit sequence according to an embodiment of the present invention. As shown in Fig. 3, the method includes the following steps:
Step S301: obtain a single code expression in the lexical unit sequence.
In the technical solution provided by step S301 of this application, building the tree-shaped syntax structure from the lexical unit sequence includes obtaining a single code expression in the lexical unit sequence. The lexical unit sequence is made up of multiple code expressions. After the lexical unit sequence has been simplified to obtain the simplified lexical unit sequence, a single code expression in it is obtained.
Step S302: build the tree-shaped syntax structure from the single code expression.
In the technical solution provided by step S302 of this application, the tree-shaped syntax structure is built from the single code expression. A traditional tree-shaped syntax structure, as produced during compilation, includes the logical relationships between code expressions, for example the relationship between the if statement and the else statement in an if-else block. The tree-shaped syntax structure of this embodiment differs from that of a traditional compilation process: it is built only for a single code expression and does not establish structural relationships between one code expression and another. This embodiment targets code files that are incomplete or cannot be compiled. If a whole-file tree were built, a single syntax error in the input code file would make the resulting tree wrong and meaningless. When a tree is built per expression, an error affects only that local part of the code file and does not affect the tree-shaped syntax structures of the other code expressions, which improves the reliability of building the tree-shaped syntax structure.
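The idea of one tree per expression, with errors kept local, can be sketched as follows. The parser uses precedence climbing over a small operator table; the operator set, the `Node` class, and the function names are assumptions for the example, and statements are simply split on `;`.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical operator table: higher number binds tighter.
PRECEDENCE = {"=": 1, "+": 2, "-": 2, "*": 3, "/": 3}
RIGHT_ASSOC = {"="}


@dataclass
class Node:
    """Non-leaf node: an operator with exactly two operands."""
    op: str
    left: "Tree"
    right: "Tree"


Tree = Union[Node, str]          # leaves are plain token strings


def parse_expression(tokens: List[str]) -> Tree:
    """Build a binary tree for one expression; raise SyntaxError if malformed."""
    pos = 0

    def primary() -> Tree:
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = climb(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("missing ')'")
            pos += 1
            return inner
        if tok in PRECEDENCE or tok == ")":
            raise SyntaxError(f"unexpected {tok!r}")
        return tok

    def climb(min_prec: int) -> Tree:
        nonlocal pos
        left = primary()
        while (pos < len(tokens) and tokens[pos] in PRECEDENCE
               and PRECEDENCE[tokens[pos]] >= min_prec):
            op = tokens[pos]
            pos += 1
            next_prec = PRECEDENCE[op] if op in RIGHT_ASSOC else PRECEDENCE[op] + 1
            left = Node(op, left, climb(next_prec))
        return left

    tree = climb(1)
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return tree


def build_trees(tokens: List[str]) -> List[Optional[Tree]]:
    """One tree per ';'-terminated expression; a bad expression yields None."""
    trees: List[Optional[Tree]] = []
    current: List[str] = []
    for tok in tokens:
        if tok == ";":
            try:
                trees.append(parse_expression(current))
            except SyntaxError:
                trees.append(None)   # the error stays local to this expression
            current = []
        else:
            current.append(tok)
    return trees
```

In `build_trees("a = b + c * 2 ; x = * ; y = ( a + 1 ) * 3 ;".split())`, the malformed middle expression produces `None` while the trees on either side are still built, which is the isolation property the paragraph above describes.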
This embodiment is by obtaining the single code expression in lexical unit sequence;Set up according to single code expression Tree-like grammatical structure, thus reached the purpose of the foundation to tree-like grammatical structure, improve tree-like grammatical structure and building Reliability in journey.
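The per-expression approach of steps S301 and S302 can be illustrated with a minimal Python sketch. It assumes a toy grammar in which every expression has the form `lhs op rhs` and ends with `;`. The names `ExprTree`, `split_expressions`, `build_tree`, and `build_forest` are hypothetical and are not part of the embodiment. The point shown is only that a malformed expression yields an error node for itself and leaves the neighbouring trees intact.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExprTree:
    """Tree for one expression: the operator is the root, operands are children."""
    root: str
    children: List[str] = field(default_factory=list)
    error: Optional[str] = None


def split_expressions(tokens: List[str]) -> List[List[str]]:
    """Cut the lexical unit sequence into single expressions at each ';'."""
    exprs, current = [], []
    for tok in tokens:
        if tok == ";":
            if current:
                exprs.append(current)
            current = []
        else:
            current.append(tok)
    if current:  # trailing expression without ';' (incomplete code)
        exprs.append(current)
    return exprs


def build_tree(expr: List[str]) -> ExprTree:
    """Build a tree for a binary 'lhs op rhs' expression (toy grammar, an assumption)."""
    if len(expr) == 3 and expr[1] in {"=", "+", "-", "*", "/"}:
        return ExprTree(root=expr[1], children=[expr[0], expr[2]])
    # A broken expression only produces a local error node.
    return ExprTree(root="<error>", error="unrecognized expression: " + " ".join(expr))


def build_forest(tokens: List[str]) -> List[ExprTree]:
    """One independent tree per expression; no links between expressions."""
    return [build_tree(e) for e in split_expressions(tokens)]
```

For example, the sequence `a = 1 ; b = ; c = a ;` produces three trees, of which only the middle one carries an error.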
As an optional embodiment, the preset object includes multiple preset objects. In step S210, associating the lexical unit sequence with the preset objects to build the global symbol table includes: building a preset global object list from the multiple preset objects; obtaining the alias directive in the code file and processing the code in the code file that uses the alias directive, to obtain processed code; and, in the preset global object list, associating the lexical unit sequence with the preset object corresponding to the lexical unit sequence according to the processed code, to obtain the global symbol table.
Fig. 4 is a flowchart of a method for associating a lexical unit sequence with preset objects to build a global symbol table according to an embodiment of the present invention. As shown in Fig. 4, the method includes the following steps:
Step S401, build the preset global object list from the multiple preset objects.
In the technical solution provided by step S401 of the present application, the preset global object list is built from the multiple preset objects. The multiple preset objects may be objects such as the classes, namespaces, methods, and fields corresponding to the code file. The preset global object list is used to associate different key fragments of the same type in the code file; that is, it associates the definitions of the different parts of the same type in the code file.
The code file includes multiple code files. All code files are parsed, multiple preset objects such as the classes, namespaces, methods, and fields corresponding to the code files are generated, and the containment relations among the preset objects are established. Optionally, each code file corresponds to one preset object list, which may include a Scope object, CType object, CSI object, CNamespace object, CNum enumeration object, CFunction object, CField object, CDelegate object, CEvent object, CProperty object, CIndexer object, and CSymbolFile object. The CSymbolFile objects of all code files are then merged. Here, the Scope object is the base class of all objects and corresponds to code that has logical meaning; the CType object is the base-class object for classes, methods, properties, and the like, which distinguishes them from the objects corresponding to lexical unit sections inside a method; the CSI object corresponds to a class (Class), struct (Struct), or interface (Interface) in the C# language; the CNamespace object is a namespace; the CNum enumeration object is an enumeration; the CFunction object is a method; the CField object is a field; the CDelegate object is a delegate; the CEvent object is an event; the CProperty object is a property; the CIndexer object is an indexer; and the CSymbolFile object corresponds to a physical code file, one CSymbolFile object per code file. All preset objects are merged and the global object list is built from keys and values. For example, all CSymbolFile objects are merged, the fully qualified name of each CSI object is used as the key and the corresponding CSI object as the value, and a global type hash table is built. A hash table is a data structure that is accessed directly by key.
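The global type hash table described above can be sketched in Python as follows. This is a simplified illustration and not the embodiment's data structure: the classes `CSI` and `CSymbolFile` reuse the object names from the text, but their fields (`kind`, `namespace`, `name`, `defined_in`, `path`, `types`) and the function `build_global_type_table` are assumptions. The sketch merges the per-file objects and records every file in which a type with the same fully qualified name is defined, which is one way of associating the different parts of the same type.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CSI:
    """A class, struct, or interface found in a code file (fields are assumed)."""
    kind: str                      # "class", "struct" or "interface"
    namespace: str                 # "" for the global namespace
    name: str
    defined_in: List[str] = field(default_factory=list)

    @property
    def full_name(self) -> str:
        return f"{self.namespace}.{self.name}" if self.namespace else self.name


@dataclass
class CSymbolFile:
    """One physical code file and the types declared in it."""
    path: str
    types: List[CSI] = field(default_factory=list)


def build_global_type_table(files: List[CSymbolFile]) -> Dict[str, CSI]:
    """Merge all files; key = fully qualified name, value = merged CSI object."""
    table: Dict[str, CSI] = {}
    for f in files:
        for csi in f.types:
            existing = table.get(csi.full_name)
            if existing is None:
                table[csi.full_name] = CSI(csi.kind, csi.namespace, csi.name, [f.path])
            else:
                # Same type declared in another file: associate the parts.
                existing.defined_in.append(f.path)
    return table
```

A Python `dict` is itself a hash table, so a lookup such as `table["Game.Core.Player"]` is the direct access by key that the text refers to.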
Step S402, obtain the alias directive in the code file and process the code in the code file that uses the alias directive, to obtain processed code.
In the technical solution provided by step S402 of the present application, the alias directive in the code file is obtained and the code in the code file that uses the alias directive is processed, to obtain processed code. An alias directive in a code file specifies an alias, that is, an identifier, for a namespace or a type. The alias directive is effective within the compilation unit or namespace that directly contains it. Because an alias directive can interfere with type lookup in the code file, every place in the code file that uses the alias is expanded at the lexical unit level, which yields the processed code.
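The expansion at the lexical unit level can be illustrated with a small Python sketch. It assumes the lexical units are plain strings and that alias directives have the C# form `using Alias = Some.Name;`. The function name `expand_aliases` is hypothetical. For brevity the sketch treats every alias as valid for the whole sequence, ignoring the scoping to the containing compilation unit or namespace described above.

```python
from typing import Dict, List


def expand_aliases(tokens: List[str]) -> List[str]:
    """Remove 'using A = X.Y;' directives and replace each use of A by X . Y."""
    aliases: Dict[str, List[str]] = {}
    body: List[str] = []
    i = 0
    while i < len(tokens):
        # Match: using <alias> = <name> ( . <name> )* ;
        if tokens[i] == "using" and i + 2 < len(tokens) and tokens[i + 2] == "=":
            # Tolerate incomplete code: a missing ';' ends the directive at EOF.
            end = tokens.index(";", i) if ";" in tokens[i:] else len(tokens)
            aliases[tokens[i + 1]] = tokens[i + 3:end]
            i = end + 1
        else:
            body.append(tokens[i])
            i += 1

    out: List[str] = []
    for tok in body:
        # Do not expand a member name that merely looks like the alias (x.IO).
        if tok in aliases and (not out or out[-1] != "."):
            out.extend(aliases[tok])
        else:
            out.append(tok)
    return out
```

After expansion, type lookup sees `System . IO . File` instead of `IO . File`, so the alias no longer interferes with it.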
Step S403, in the preset global object list, perform type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, to obtain the global symbol table.
In the technical solution provided by step S403 of the present application, in the preset global object list, type association or function association is performed between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, to obtain the global symbol table. Type association and function association bind a lexical unit to the variable, type, and method objects that correspond to it. As a result, when the lexical unit sequence is traversed, the type of the variable corresponding to the current lexical unit is known, and it can then be determined whether the variable is a value type or a reference type. Function association resolves function calls, so that it can be determined, for example, whether a variable of interest is referenced. This greatly strengthens the capability of upper-layer check items at the semantic level and in function tracking.
This embodiment builds the preset global object list from the multiple preset objects, where the preset global object list is used to associate different key fragments of the same type in the code file; obtains the alias directive in the code file and processes the code in the code file that uses the alias directive, to obtain processed code; and, in the preset global object list, performs type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, to obtain the global symbol table. The purpose of associating the lexical unit sequence with the preset objects to build the global symbol table is thereby achieved.
As an optional embodiment, performing type association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code includes at least one of the following: associating, according to the processed code, the lexical unit sequence used for a type declaration with the preset object corresponding to that lexical unit sequence; and associating, according to the processed code, the lexical unit sequence corresponding to a variable in the code file with the variable.
Associating the lexical unit sequence used for a type declaration with its corresponding preset object according to the processed code may mean associating each lexical unit in that sequence with the preset object corresponding to the lexical unit. Type association may also mean associating, according to the processed code, the lexical unit sequence corresponding to a variable in the code file with the variable, that is, associating the lexical unit corresponding to the variable with the variable definition. Type association during the building of the global symbol table is thereby achieved.
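The two kinds of type association above can be sketched in Python under strong simplifying assumptions: declarations have the form `Type name`, the set of known type names is given, and scoping is ignored. `Token`, `associate_types`, `is_value_type`, and the contents of `VALUE_TYPES` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Assumed for the example: built-in value types plus one user-defined struct.
VALUE_TYPES = {"int", "float", "bool", "Vec2"}


@dataclass
class Token:
    text: str
    bound_type: Optional[str] = None   # filled in by type association


def associate_types(tokens: List[Token], known_types: Set[str]) -> Dict[str, str]:
    """Bind type-declaration tokens to their type and variable tokens to
    the type recorded in the variable's definition."""
    variables: Dict[str, str] = {}
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok.text in known_types and nxt is not None and nxt.text.isidentifier():
            tok.bound_type = tok.text          # token used for a type declaration
            variables[nxt.text] = tok.text     # remember the variable definition
        if tok.text in variables:
            tok.bound_type = variables[tok.text]   # token that names a variable
    return variables


def is_value_type(tok: Token) -> bool:
    """Once a token is bound, value type versus reference type can be decided."""
    return tok.bound_type in VALUE_TYPES
```

With the sequence `int hp ; Player p ; hp = 3 ;`, the later use of `hp` is bound to `int`, so a traversal can tell that it is a value type, whereas `p` is bound to the reference type `Player`.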
As an optional embodiment, performing function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code includes at least: associating, according to the processed code, the lexical unit sequence corresponding to a preset call method with the preset call method.
Associating the lexical unit sequence corresponding to a preset call method with the preset call method according to the processed code may mean associating the lexical unit corresponding to the preset call method with that preset call method. Function association during the building of the global symbol table is thereby achieved.
As an optional embodiment, performing type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code is carried out as follows: when the type name found in the preset global object list meets a preset condition, type association or function association is performed between the lexical unit sequence and the preset object according to the processed code.
Fig. 5 is a flowchart of a method for performing type association or function association between a lexical unit sequence and the preset object corresponding to the lexical unit sequence according to processed code, according to an embodiment of the present invention. As shown in Fig. 5, the method includes the following steps:
Step S501, determine the type name of the type of the lexical unit sequence and search for the type name in the preset global object list.
In the technical solution provided by step S501 of the present application, the type name of the type of the lexical unit sequence is determined and searched for in the preset global object list. The type of the lexical unit sequence is obtained; the type of a lexical unit has a type name, and that type name is determined. The type name is then searched for in the preset global object list.
Step S502, judge whether the type name found meets the preset condition.
In the technical solution provided by step S502 of the present application, it is judged whether the type name found meets the preset condition. The type name of a type in the code file is affected by the surrounding code context, so after a type name is found it is judged whether that type name meets the preset condition. Optionally, the type name nearest to the object corresponding to the lexical unit is determined to meet the preset condition.
Step S503, perform type association or function association between the lexical unit sequence and the preset object according to the processed code.
In the technical solution provided by step S503 of the present application, after it is judged that the type name found meets the preset condition, type association or function association is performed between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, that is, the code in which the alias directives have already been handled.
This embodiment determines the type name of the type of the lexical unit sequence and searches for it in the preset global object list, and, when the type name found is judged to meet the preset condition, performs type association or function association between the lexical unit sequence and the preset object according to the processed code. The purpose of performing type association or function association between the lexical unit sequence and its corresponding preset object according to the processed code is thereby achieved, which greatly improves the performance of upper-layer check items at the semantic level and in function call tracking.
As an optional embodiment, in step S210, performing static code scanning on the code file according to the global symbol table to obtain the scanning result includes: determining the type name to be found; when the type name to be found is not a preset type name, searching for it in the base class object where the lexical unit section is located; when it is not found in the base class object where the lexical unit section is located, searching for it in the preset global object list; and, when it is found in the preset global object list and its type qualified name matches the preset qualified name, returning the type name to be found as the scanning result.
Fig. 6 is a flowchart of a method for performing static code scanning on a code file according to a global symbol table, according to an embodiment of the present invention. As shown in Fig. 6, the method includes the following steps:
Step S601, split the lexical unit section that includes one or more lexical units, to obtain a split result.
In the technical solution provided by step S601 of the present application, the lexical unit section that includes one or more lexical units is split to obtain a split result. A lexical unit section denotes a sequence of one or more lexical units. The lexical unit section corresponding to the type to be found is split by level to obtain the split result.
Step S602, push the split result onto a stack.
In the technical solution provided by step S602 of the present application, after the lexical unit section that includes one or more lexical units has been split and the split result obtained, the split result is pushed onto a stack S.
Step S603, determine the type name to be found of the lexical unit sequence from the top element of the stack.
In the technical solution provided by step S603 of the present application, the type name to be found of the lexical unit sequence is determined from the top element of the stack. After the split result has been pushed onto the stack, the top element of the stack is obtained. The top element is the type name of the type to be found and does not include the type qualified name of that type.
Step S604, judge whether the type name to be found is a preset type name.
In the technical solution provided by step S604 of the present application, it is judged whether the type name to be found is a preset type name. The preset type name may be a system type name in the language's own system library. After the type name to be found has been determined from the top element of the stack, it is judged whether it is a system type name.
Step S605, search for the type name to be found in the base class object where the lexical unit section is located.
In the technical solution provided by step S605 of the present application, if the type name to be found is judged not to be a preset type name, it is searched for in the base class object where the lexical unit section is located. Optionally, the search starts from the base class CScope of the object in the physical file where the lexical unit section TokenSection is located and proceeds from the bottom up.
Step S606, when the type name to be found is not found in the base class object where the lexical unit section is located, search for it in the global symbol table.
In the technical solution provided by step S606 of the present application, when the type name to be found is not found in the base class object where the lexical unit section is located, it is searched for in the global symbol table. The global symbol table may include a global type hash table, and the type name to be found is looked up in that hash table.
Step S607, when the type name to be found is found in the global symbol table, verify the type qualified name of the type name to be found.
In the technical solution provided by step S607 of the present application, when the type name to be found is found in the preset global object list, its type qualified name is verified. The type qualified name is the name that qualifies the type name to be found. Optionally, if the type name to be found is not found in the preset global object list, the process of performing static code scanning on the code file ends.
Step S608, judge whether the type qualified name matches the preset qualified name.
In the technical solution provided by step S608 of the present application, the preset qualified name is the qualified name against which the match is made, and it is judged whether the type qualified name in the stack matches the preset qualified name.
Step S609, return the type name to be found as the scanning result.
In the technical solution provided by step S609 of the present application, if the type qualified name is judged to match the preset qualified name, the type name to be found is returned. The returned type name is the type name that was finally matched.
This embodiment splits the lexical unit section that includes one or more lexical units to obtain a split result; pushes the split result onto a stack; determines the type name to be found of the lexical unit sequence from the top element of the stack; judges whether the type name to be found is a preset type name; if it is not, searches for it in the base class object where the lexical unit section is located; when it is not found there, searches for it in the preset global object list; when it is found in the preset global object list, verifies its type qualified name, which is the name that qualifies the type name to be found; judges whether the type qualified name matches the preset qualified name; and, if it does, returns the type name to be found as the scanning result. The purpose of performing static code scanning on the code file according to the global symbol table is thereby achieved.
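The lookup of steps S601 to S609 can be summarised in a short Python sketch. It rests on several assumptions that the text does not fix: the lexical unit section is a dotted name split at `.`, each scope maps simple type names to fully qualified names, and the global table is keyed by fully qualified name. `Scope`, `find_type`, `qualifier_matches`, and the contents of `SYSTEM_TYPES` are hypothetical.

```python
from typing import Dict, Optional

SYSTEM_TYPES = {"int", "string", "object"}   # assumed preset (system) type names


class Scope:
    """An enclosing object (method, class, namespace ...) with its local types."""
    def __init__(self, name: str, parent: Optional["Scope"] = None):
        self.name, self.parent = name, parent
        self.types: Dict[str, str] = {}      # simple name -> fully qualified name


def qualifier_matches(full_name: str, qualifier: str) -> bool:
    """Check the written qualifier against the candidate's qualified name."""
    prefix = full_name.rpartition(".")[0]
    return qualifier == "" or prefix == qualifier or prefix.endswith("." + qualifier)


def find_type(section: str, scope: Optional[Scope],
              global_table: Dict[str, object]) -> Optional[str]:
    stack = []
    for part in section.split("."):          # S601: split by level
        stack.append(part)                   # S602: push onto the stack
    name = stack.pop()                       # S603: top element = simple type name
    qualifier = ".".join(stack)              # what remains is the qualified name part
    if name in SYSTEM_TYPES:                 # S604: preset type name
        return name
    s = scope
    while s is not None:                     # S605: search enclosing objects bottom-up
        full = s.types.get(name)
        if full is not None and qualifier_matches(full, qualifier):
            return full
        s = s.parent                         # mismatch or miss: keep searching
    for full in global_table:                # S606: fall back to the global table
        if full.rpartition(".")[2] == name and qualifier_matches(full, qualifier):
            return full                      # S607-S609: qualifier verified
    return None                              # not found: lookup ends
```

Searching the enclosing objects before the global table is what makes the lookup context-sensitive: an unqualified `Player` written inside namespace `Game` resolves to `Game.Player` even if another `Player` exists elsewhere, while `Net.Player` is rejected locally and resolved through the global table.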
As an optional embodiment, in step S210 of performing static code scanning on the code file according to the global symbol table, after the type name to be found has been searched for in the base class object where the lexical unit section is located: when the type name to be found is found in that base class object, its type qualified name is verified; it is judged whether the type qualified name matches the preset qualified name; and, if the type qualified name is judged to match the preset qualified name, the type name to be found is returned as the scanning result.
Fig. 7 is a flowchart of another method for performing static code scanning on a code file according to a global symbol table, according to an embodiment of the present invention. As shown in Fig. 7, the method includes the following steps:
Step S701, when the type name to be found is found in the base class object where the lexical unit section is located, verify the type qualified name of the type name to be found.
In the technical solution provided by step S701 of the present application, when the type name to be found is found in the base class object where the lexical unit section is located, its type qualified name is verified. Optionally, when the type name to be found is found while searching from the bottom up, starting from the base class CScope of the object in the physical file where the lexical unit section TokenSection is located, its type qualified name is verified.
Step S702, judge whether the type qualified name matches the preset qualified name.
In the technical solution provided by step S702 of the present application, it is judged whether the type qualified name matches the preset qualified name. When the type qualified name of the type name to be found is verified, it is judged whether the type qualified name in the stack matches the preset qualified name.
Step S703, return the type name to be found as the scanning result.
In the technical solution provided by step S703 of the present application, if the type qualified name is judged to match the preset qualified name, the type name to be found is returned as the scanning result. If the type qualified name is judged not to match the preset qualified name, step S605 continues to be executed, and the type name to be found is searched for in the base class object where the lexical unit section is located.
In this embodiment, after the type name to be found has been searched for in the base class object where the lexical unit section is located, and when it is found there, its type qualified name is verified; it is judged whether the type qualified name matches the preset qualified name; and, if it does, the type name to be found is returned as the scanning result. The purpose of performing static code scanning on the code file according to the global symbol table is thereby achieved.
As an optional embodiment, before lexical analysis is performed on the character sequence in the code file to obtain the lexical unit sequence, the code file is preprocessed to obtain preprocessed code, where the preprocessed code is a character stream that conforms to preset rules. Performing lexical analysis on the character sequence in the code file to obtain the lexical unit sequence then includes: performing lexical analysis on the character sequence in the preprocessed code to obtain the lexical unit sequence.
Preprocessing is the first processing pass over the code file. It filters out everything unrelated to the effective code, that is, the content that is irrelevant to the subsequent global symbol table, and thereby provides a normalized character sequence for the lexical analysis of the code file. After the code file has been preprocessed and the preprocessed code obtained, lexical analysis is performed on the character sequence in the preprocessed code to obtain the lexical unit sequence.
As an optional embodiment, preprocessing the code file includes at least one of the following: filtering the whitespace of the code file; deleting the comments of the code file; processing the preprocessing directives of the code file according to a preset configuration; and encoding the code file uniformly and outputting a code character stream.
Before lexical analysis is performed on the character sequence in the code file to obtain the lexical unit sequence, the code file can be preprocessed in several ways. The whitespace of the code file is filtered, which removes redundant spaces. Comments have no effect on the execution of the code file, so the comments of the code file are deleted. The preprocessing directives of the code file are processed according to the preset configuration, which may be the project configuration; such directives include #define, #if, #error, #line, and the like. The code file may be encoded uniformly and output as a code character stream, for example an ASCII character stream. Preprocessing of the code file is thereby achieved.
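The preprocessing operations listed above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: comments are removed with regular expressions without protecting string literals, only `#define`, `#if SYMBOL`, `#else`, and `#endif` are interpreted, and all other directives (such as `#error` and `#line`) are simply dropped. The function name `preprocess` and the parameter `defined`, which stands in for the project configuration, are hypothetical.

```python
import re
from typing import Iterable


def preprocess(source: str, defined: Iterable[str]) -> str:
    """Return normalized code: no comments, no redundant whitespace,
    conditional sections resolved against the configured symbols."""
    symbols = set(defined)                      # copy: do not mutate the caller's set
    # Delete block comments first, then line comments.
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.S)
    source = re.sub(r"//[^\n]*", "", source)

    out, active = [], [True]                    # stack of "is this branch kept?"
    for line in source.splitlines():
        s = line.strip()
        parts = s.split()
        if s.startswith("#define"):
            if all(active) and len(parts) > 1:
                symbols.add(parts[1])
        elif s.startswith("#if"):
            active.append(len(parts) > 1 and parts[1] in symbols)
        elif s.startswith("#else"):
            active[-1] = not active[-1]
        elif s.startswith("#endif"):
            if len(active) > 1:                 # tolerate unbalanced directives
                active.pop()
        elif s.startswith("#"):
            continue                            # other directives are dropped here
        elif all(active) and s:
            out.append(re.sub(r"\s+", " ", s))  # collapse redundant whitespace
    return "\n".join(out)
```

The result is a plain Python `str`; writing it out with `.encode("ascii")` or another single encoding would correspond to the unified-encoding step.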
As an optional embodiment, step S204 of performing lexical analysis on the character stream in the code file to obtain the lexical unit sequence is realized by reading the character stream of the code file, forming morphemes from the character stream, and generating the lexical unit sequence from the morphemes.
Fig. 8 is a flowchart of another method for performing lexical analysis on the character stream in a code file according to an embodiment of the present invention. As shown in Fig. 8, the method includes the following steps:
Step S801, read the character stream of the code file.
In the technical solution provided by step S801 of the present application, the character stream of the code file is read. This may be the code character stream output by preprocessing.
Step S802, form morphemes from the character stream.
In the technical solution provided by step S802 of the present application, morphemes are formed from the character stream. The code character stream output when the code file is preprocessed may be grouped into morphemes. A morpheme is a unit combining sound and meaning that is identified from the standpoint of the immediate constituents of a word or stem. It is not necessarily the smallest such unit; only for morphemes inside a word is being the smallest sound-meaning unit the sole criterion.
Step S803, generate the lexical unit sequence from the morphemes.
In the technical solution provided by step S803 of the present application, after morphemes have been formed from the character stream, the lexical unit sequence is generated from the morphemes. Each lexical unit in the lexical unit sequence corresponds one-to-one to a morpheme.
This embodiment reads the character stream of the code file, forms morphemes from the character stream, and generates the lexical unit sequence from the morphemes, with each lexical unit in the lexical unit sequence corresponding one-to-one to a morpheme. The purpose of performing lexical analysis on the character stream in the code file to obtain the lexical unit sequence is thereby achieved.
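Steps S801 to S803 can be sketched with a small regular-expression lexer in Python. The lexical unit kinds, the keyword list, and the names `LexUnit`, `TOKEN_SPEC`, and `tokenize` are assumptions made for this illustration and cover only a tiny subset of C#. Each matched piece of text (a morpheme in the text's wording, usually called a lexeme in compiler literature) yields exactly one lexical unit, and an unrecognized character becomes an `UNKNOWN` unit rather than aborting the analysis.

```python
import re
from typing import List, NamedTuple


class LexUnit(NamedTuple):
    kind: str      # category of the lexical unit
    lexeme: str    # the matched text


TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|!=|<=|>=|[-+*/=<>.(){};,]"),
    ("SKIP", r"\s+"),
    ("UNKNOWN", r"."),          # keep going on characters the sketch does not know
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))
KEYWORDS = {"if", "else", "for", "class", "using", "return"}


def tokenize(chars: str) -> List[LexUnit]:
    """Read the character stream, group it into lexemes, emit one unit per lexeme."""
    units = []
    for m in MASTER.finditer(chars):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        units.append(LexUnit(kind, lexeme))
    return units
```

For instance, `tokenize("if (hp <= 10) hp = 0;")` yields ten lexical units, beginning with a `KEYWORD` unit for `if`.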
As an optional embodiment, after the lexical unit sequence has been associated with the preset objects to build the global symbol table, a data flow model that simulates the execution flow of the code file is built according to the global symbol table.
The basic idea of building the data flow model is to simulate the execution flow of the code according to the global symbol table. Lexical units in the code that carry a literal numeric value can be processed with the global symbol table, and the value information is recorded in the data flow field corresponding to the lexical unit. For simple functions, for example functions that return a constant or whose return value results from simple arithmetic, the return value can be computed by simulation according to the global symbol table. Bitwise operators can be processed according to the global symbol table. The standard pattern of a for loop can be simulated according to the global symbol table. When a variable is assigned a known value, the subsequent uses of the variable can be tracked according to the global symbol table and the data flow field of the variable updated, and so on.
As an optional embodiment, building the data flow model according to the global symbol table is done by simulating the execution flow of the code file according to the global symbol table to obtain a simulation result, and building the data flow model from the simulation result.
Fig. 9 is a flowchart of a method for building a data flow model according to a global symbol table, according to an embodiment of the present invention. As shown in Fig. 9, the method includes the following steps:
Step S901, simulate the execution flow of the code file according to the global symbol table, to obtain a simulation result.
In the technical solution provided by step S901 of the present application, the execution logic of the code is simulated according to the global symbol table, and the possible value range of each variable in the current context is inferred and recorded as far as possible, to obtain the simulation result.
Step S902, build the data flow model from the simulation result.
In the technical solution provided by step S902 of the present application, after the execution flow of the code file has been simulated according to the global symbol table and the simulation result obtained, the data flow model is built from the simulation result.
This embodiment simulates the execution flow of the code file according to the global symbol table to obtain a simulation result and builds the data flow model from the simulation result, thereby achieving the purpose of building the data flow model according to the global symbol table.
As an optional embodiment, simulating the execution flow of the code file according to the global symbol table to obtain the simulation result includes: when a variable of the code file is compared, determining and recording the value range of the variable according to the global symbol table.
When a variable of the code file is compared, its value range is determined and recorded according to the global symbol table. If an if conditional statement performs a size comparison or similar test on a variable, the possible values of that variable in the current context can be inferred. The uses of the variable are then resolved going forward from the if condition, and the data flow field is updated.
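The range inference described above can be illustrated with a minimal Python sketch. It assumes integer variables compared against integer constants, and represents the data flow field of a variable as a closed interval. `Range`, `narrow`, `enter_if`, and `assign_const` are hypothetical names, and the sketch handles only the then-branch of an if statement.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Range:
    """Closed interval of possible integer values; unbounded by default."""
    lo: float = -math.inf
    hi: float = math.inf


def narrow(r: Range, op: str, k: int) -> Range:
    """Value range of x inside 'if (x <op> k)', assuming x is an integer."""
    if op == "<":
        return Range(r.lo, min(r.hi, k - 1))
    if op == "<=":
        return Range(r.lo, min(r.hi, k))
    if op == ">":
        return Range(max(r.lo, k + 1), r.hi)
    if op == ">=":
        return Range(max(r.lo, k), r.hi)
    if op == "==":
        return Range(max(r.lo, k), min(r.hi, k))
    return r                                   # unknown comparison: learn nothing


def enter_if(env: Dict[str, Range], var: str, op: str, k: int) -> Dict[str, Range]:
    """Environment for the body of the if; the outer environment is unchanged."""
    branch = dict(env)
    branch[var] = narrow(env.get(var, Range()), op, k)
    return branch


def assign_const(env: Dict[str, Range], var: str, value: int) -> Dict[str, Range]:
    """A variable assigned a known value gets an exact one-point range."""
    updated = dict(env)
    updated[var] = Range(value, value)
    return updated
```

Nesting `if (x > 10)` and `if (x <= 20)` narrows `x` to the interval from 11 to 20 inside the inner body, which is the kind of information an upper-layer check item could use, for example to flag an index that may be out of bounds.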
The embodiment of the present invention obtains a code file, which is a source text that includes a character sequence; performs lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; parses the code file to obtain preset objects; associates the lexical unit sequence with the preset objects to build a global symbol table, which records the data information of all preset objects in the code file; and performs static code scanning on the code file according to the global symbol table to obtain a scanning result that includes at least the lookup result for the type of the lexical unit sequence. The result can be used to develop upper-layer check items that find possible defects, performance problems, and security problems in the code, so that programmers can fix these problems efficiently and at low cost, which improves code quality. The symbolization result is accurate and efficient. A context-sensitive type lookup algorithm can be used, which is more accurate than simple string matching. In addition, the data structures are designed to be as small as possible and to make full use of caching, so that memory use and efficiency are both good. Detection and analysis remain possible when the code does not compile, without affecting the overall symbolization flow or its results, which improves the accuracy of code scanning. The scheme can be implemented in the C++ language and can support Windows, Linux, and Mac systems.
Embodiment 2
The technical solution of the present invention is described below with reference to a preferred embodiment.
This embodiment is a non-compiler-based global symbolization scheme for the C# language. It provides accurate and efficient symbolization results for the scanning performed by upper-layer static code check items, so that the check items gain syntax-level scanning, cross-function scanning, semantic-level scanning, and a certain degree of logical analysis capability. The code scanning result that is finally output can help developers and testers quickly locate problems hidden in the code, improve code quality, and reduce the cost of later repairs. It is applicable to static code checking of all projects that use the C# language.
This embodiment fully considers missing code files, missing type definitions, and syntax errors. The symbolization process therefore does not need to compile the input C# code, nor does it require that the C# code can be compiled successfully.
This embodiment implements a non-compiler-based symbolization flow for the C# language; the data structure of the global symbol table and the logic for building it; and context-sensitive lookup algorithms for type lookup and function lookup.
The application scenario of this embodiment is as follows. The code files of a C# program are input. Lexical analysis is performed on the C# code, and a lexical unit linked list and an abstract syntax tree are built. Variable and function feature information is then extracted, the global symbol table is built, and variable usage rules and function call chains are established. Finally, the uses of variables in the code are tracked, the possible value ranges of the variables are inferred, and the variable data flow model is built, which provides upper-layer static code check items with the symbolization results and calling interfaces they need for checking.
Figure 10 is a flowchart of another data processing method according to an embodiment of the present invention. As shown in Figure 10, the data processing method includes the following steps:
Step S1001: perform preprocessing on the code file to obtain preprocessed code.
Preprocessing is the first processing applied to the input code. It filters out content that subsequent processing does not need and yields the preprocessed code.
Step S1002: perform lexical analysis on the preprocessed code to obtain a lexical unit sequence.
Lexical analysis is performed on the character string of the preprocessed code text, converting the character string into a lexical unit sequence.
Step S1003: simplify the lexical unit sequence to obtain a simplified lexical unit sequence.
Logically equivalent code replacement is applied to the lexical unit sequence to obtain the simplified lexical unit sequence, thereby normalizing the code format.
Step S1004: establish a tree-like syntax structure.
After the simplified lexical unit sequence is obtained, a tree-like syntax structure is established. Its construction is similar to the construction of an abstract syntax tree (AST) during compilation.
Step S1005: establish a global symbol table according to the tree-like syntax structure.
After the tree-like syntax structure is established, the global symbol table is built. The global symbol table records formatted information about all logical objects in the code.
Step S1006: establish a data flow model according to the global symbol table.
After the global symbol table is established, a data flow model is established according to it. The model simulates the execution logic of the code and records the possible value ranges of variables.
In this embodiment, preprocessing is performed on the code file to obtain preprocessed code, lexical analysis is performed on the preprocessed code to obtain a lexical unit sequence, the lexical unit sequence is simplified to obtain a simplified lexical unit sequence, a tree-like syntax structure is established, a global symbol table is established according to the tree-like syntax structure, and a data flow model is established according to the global symbol table, which improves the accuracy of code file scanning.
Step S1001, performing preprocessing on the code file to obtain preprocessed code, is described below.
Preprocessing is the first processing applied to the source text. Its purpose is to provide a normalized character stream for the subsequent lexical analysis by filtering out the parts unrelated to valid code. It mainly includes the following: filtering out redundant whitespace; removing comments; handling preprocessing directives (#define, #if, #error, #line, etc.) according to the project configuration; and handling the Unicode encoding of the file so that an ASCII character stream is output.
Figure 11 is a schematic diagram of the code of a code file before preprocessing according to an embodiment of the present invention. As shown in Figure 11, before preprocessing the code file contains parts unrelated to valid code, including: the comment "/* This is the entry point.*/", the preprocessing directive "#if TEST_CMD", the comment "//defind", the preprocessing directive "#else", the inactive statement "Console.WriteLine("TEST_CMD is not defind."); //not defind.", the preprocessing directive "#endif", and redundant whitespace.
Figure 12 is a schematic diagram of the code of the code file after preprocessing according to an embodiment of the present invention. As shown in Figure 12, the parts unrelated to valid code have been filtered out. After redundant whitespace is filtered, comments are removed, preprocessing directives are handled according to the project configuration, and the Unicode file encoding is handled, the preprocessed code no longer contains the items shown in Figure 11: the comment "/* This is the entry point.*/", the preprocessing directive "#if TEST_CMD", the comment "//defind", the preprocessing directive "#else", the inactive statement "Console.WriteLine("TEST_CMD is not defind."); //not defind.", the preprocessing directive "#endif", and the redundant whitespace. A normalized character stream is thereby provided for the subsequent lexical analysis.
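The preprocessing described above can be illustrated with a minimal Python sketch. It is not the implementation of this embodiment: the function name preprocess and the parameter defined (the set of symbols enabled by the project configuration) are hypothetical, only #if/#else/#endif with a single symbol are evaluated, other directives are simply dropped, and comment markers inside string literals are not handled.

```python
import re

def preprocess(source, defined):
    """Strip comments, resolve #if/#else/#endif and collapse whitespace.

    `defined` is the (assumed) set of symbols enabled by the project configuration.
    """
    # Remove /* ... */ block comments, then // line comments.
    # Simplification: comment markers inside string literals are not handled.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", "", source)

    kept = []
    active = [True]  # stack: is the current conditional region emitted?
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#if"):
            symbol = stripped[3:].strip()
            active.append(active[-1] and symbol in defined)
        elif stripped.startswith("#else"):
            taken = active.pop()
            active.append(active[-1] and not taken)
        elif stripped.startswith("#endif"):
            active.pop()
        elif stripped.startswith("#"):
            continue  # other directives are dropped in this sketch
        elif active[-1] and stripped:
            kept.append(re.sub(r"\s+", " ", stripped))  # collapse whitespace runs
    return "\n".join(kept)
```

Calling preprocess with a different set of defined symbols selects the other branch of the conditional region, which mirrors the statement that directives are handled according to the project configuration.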
Step S1002, performing lexical analysis on the preprocessed code to obtain a lexical unit sequence, is described below.
The main task of lexical analysis is to read the character stream output by preprocessing, group the characters into lexemes, and generate and output a sequence of lexical units (Tokens), each lexical unit corresponding to one lexeme. The whole lexical unit sequence is the Token list. The Token list is the basic data structure for subsequent processing and for upper-layer check items to traverse the code; it is the set of all Tokens generated from one code file by lexical analysis.
Figure 13 is a schematic diagram of performing lexical analysis on preprocessed code to obtain a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 13, lexical analysis of the preprocessed code "for (int index=0;index<42;++index)" yields the lexical unit sequence: "for", "(", "int", "index", "=", "0", ";", "index", "<", "42", ";", "++", "index", ")".
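As an illustration of this step, the following Python sketch splits the example of Figure 13 into lexemes with a regular expression. The names TOKEN_RE and tokenize are hypothetical, and the sketch recognizes only identifiers, integer literals and a few operators; string and character literals are deliberately left out.

```python
import re

# Assumed, simplified lexeme grammar: whitespace is skipped, everything else
# becomes one Token string.
TOKEN_RE = re.compile(r"""
    \s+                                   # whitespace (skipped)
  | (?P<tok>
        [A-Za-z_]\w*                      # identifiers and keywords
      | \d+                               # integer literals
      | \+\+|--|==|!=|<=|>=|=>|&&|\|\|    # two-character operators
      | .                                 # any other single character
    )
""", re.VERBOSE)

def tokenize(code):
    """Return the list of lexemes found in `code`, in source order."""
    return [m.group("tok") for m in TOKEN_RE.finditer(code) if m.group("tok")]
```

Two-character operators are listed before the single-character fallback so that "++" is emitted as one lexical unit rather than two.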
The lexical unit Token is the most basic unit for syntax analysis and rule scanning. Code scanning includes upper-layer scanning and bottom-layer scanning: bottom-layer scanning comprises the technical solution of the present invention, and upper-layer scanning is rule scanning, for example checking for division by zero, so that code defects can be found effectively. In addition to the string value of its lexeme, a lexical unit Token has other associated attributes, for example: a pointer to the next lexical unit Token; a pointer to the previous lexical unit Token; a pointer to the "paired" lexical unit Token (for a left parenthesis, the pointer to the matching right parenthesis); the type of the lexical unit Token (number, character string, variable, function, keyword, etc.); a pointer from the lexical unit Token into the symbol table (a variable Token points to its variable object in the symbol table, and a function Token points to its corresponding function object); the line number of the lexical unit Token; the syntax tree structure pointer of the lexical unit Token (which maintains the abstract syntax tree structure of the Token); and the data flow structure pointer of the lexical unit Token.
The lexical unit sequence TokenList is essentially a doubly linked list that maintains all lexical unit Tokens.
Figure 14 is a schematic diagram of a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 14, the lexical unit sequence TokenList of the code "if (i > 0)" is a doubly linked list that is traversed forward as "if" → "(" → "i" → ">" → "0" → ")" and backward as ")" → "0" → ">" → "i" → "(" → "if", where "(" and ")" are paired.
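The doubly linked TokenList with bracket pairing can be sketched as follows. This is a simplified Python illustration under assumed names (Token, build_token_list, the field name link); only the next, previous and pairing pointers and the line number are modeled, not the symbol table, syntax tree or data flow pointers.

```python
class Token:
    """One lexical unit with a subset of the attributes listed in the text."""
    def __init__(self, text, line=0):
        self.text = text
        self.line = line
        self.prev = None   # previous Token in the list
        self.next = None   # next Token in the list
        self.link = None   # paired bracket Token, if any

OPENERS = {"(": ")", "[": "]", "{": "}"}

def build_token_list(texts):
    """Chain lexemes into a doubly linked list and pair matching brackets."""
    head = tail = None
    open_stack = []
    for text in texts:
        tok = Token(text)
        if tail is None:
            head = tok
        else:
            tail.next, tok.prev = tok, tail
        tail = tok
        if text in OPENERS:
            open_stack.append(tok)
        elif open_stack and OPENERS[open_stack[-1].text] == text:
            opener = open_stack.pop()
            opener.link, tok.link = tok, opener  # pair the two brackets
    return head

def forward(head):
    """Collect Token texts by following the next pointers."""
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out
```

The pairing pointer lets a check item jump from "(" directly to the matching ")" without rescanning the Tokens in between.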
Step S1003, simplifying the lexical unit sequence to obtain a simplified lexical unit sequence, is described below.
Different projects and different programmers have different coding styles, which objectively makes code files diverse. This diversity complicates the establishment of the global symbol table. To reduce the cost of implementing global symbolization and to improve the accuracy of upper-layer static scanning, some simplification steps are needed after the lexical unit sequence (Token list) is established so as to unify the style of the code files.
None of the simplification steps in this embodiment may change the logic of the code; they may only replace code in the code file with logically equivalent code.
Figure 15 is a schematic diagram of simplifying a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 15, platform-based C# types are simplified: in the code "System.Int32 i1 = default(int);", the platform type and the default keyword are simplified, yielding the simplified lexical unit sequence "int i1 = 0;".
Figure 16 is a schematic diagram of another simplification of a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 16, the lambda expression "i => i+5" is simplified to the canonical form "(i) => { return i+5; }".
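The two simplifications of Figures 15 and 16 can be sketched on a list of lexemes. The following Python code is an illustration only: the tables PLATFORM_TYPES and DEFAULT_VALUES and the functions simplify and normalize_lambda are hypothetical names, the tables cover just two types, and the lambda rule handles only a single-parameter expression lambda that makes up the whole Token list.

```python
# Assumed lookup tables for this sketch; a real tool would cover all built-in types.
PLATFORM_TYPES = {("System", ".", "Int32"): "int", ("System", ".", "Boolean"): "bool"}
DEFAULT_VALUES = {"int": "0", "bool": "false"}

def simplify(tokens):
    """Replace platform type names and default(T) with logically equivalent Tokens."""
    out, i = [], 0
    while i < len(tokens):
        window = tokens[i:i + 4]
        if tuple(window[:3]) in PLATFORM_TYPES:
            out.append(PLATFORM_TYPES[tuple(window[:3])])   # System.Int32 -> int
            i += 3
        elif (len(window) == 4 and window[0] == "default" and window[1] == "("
              and window[2] in DEFAULT_VALUES and window[3] == ")"):
            out.append(DEFAULT_VALUES[window[2]])           # default(int) -> 0
            i += 4
        else:
            out.append(tokens[i])
            i += 1
    return out

def normalize_lambda(tokens):
    """Rewrite `x => expr` into the canonical form `(x) => { return expr; }`."""
    if len(tokens) > 2 and tokens[1] == "=>" and tokens[2] != "{":
        return ["(", tokens[0], ")", "=>", "{", "return", *tokens[2:], ";", "}"]
    return tokens
```

Both rewrites only substitute equivalent Tokens, in line with the requirement that simplification must not change the logic of the code.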
Step S1004, establishing the tree-like syntax structure, is described below.
The tree-like syntax structure, that is, the abstract syntax tree, is a tree representation of the abstract syntactic structure of C# code. The tree-like syntax structure is a binary tree: each non-leaf node represents an operator, and its two child nodes represent the two operands of that operator. The tree-like syntax structure captures the logical structure of an expression and the precedence relationships of its operators, which improves the accuracy of matching code scenarios and the efficiency of implementing such matching.
It should be noted that the abstract syntax tree structure of this embodiment differs from the abstract syntax tree structure built during conventional compilation. A conventional compiler's abstract syntax tree also establishes the logical relationships between code expressions, for example between the if statement and the else statement in an if-else code segment, whereas the abstract syntax tree structure in this embodiment is established only for a single code expression and does not establish structural relationships between one code expression and another. This embodiment supports C# code that is incomplete or cannot be compiled as input. If the input C# code contained a syntax error, an abstract syntax tree built for the whole file would be wrong and of no reference value, whereas with a per-expression abstract syntax tree an error remains local and does not affect the abstract syntax tree structures of other expressions.
Figure 17 is a schematic diagram of an abstract syntax tree structure according to an embodiment of the present invention. As shown in Figure 17, the abstract syntax tree structure is the tree structure of the code "String.Format("{0}{1}{2}{3}", Func(1,2), "tsc#", 1+2*3)".
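The idea of building one binary tree per expression, with errors staying local, can be sketched as follows. This Python example is a simplified illustration, not the structure of Figure 17: it uses precedence climbing for four arithmetic operators only, and PRECEDENCE, parse_expression and build_trees are hypothetical names.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def parse_expression(tokens, pos=0, min_prec=1):
    """Return (tree, next_pos); a tree is a leaf string or (op, left, right)."""
    left, pos = tokens[pos], pos + 1
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # The right operand may only absorb operators that bind tighter than op.
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos

def build_trees(expressions):
    """Build one tree per expression; a malformed expression yields None only for itself."""
    trees = []
    for tokens in expressions:
        try:
            tree, _ = parse_expression(tokens)
        except IndexError:          # operand missing, e.g. a dangling operator
            tree = None
        trees.append(tree)
    return trees
```

Because each expression is parsed independently, the malformed expression in the middle of a file does not prevent trees from being built for its neighbors.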
Step S1005, establishing the global symbol table according to the tree-like syntax structure, is described below.
Establishing the global symbol table is the basis for determining variable types and tracking function calls during code scanning, and is more powerful than other symbolization schemes based on text and regular-expression matching.
Figure 18 is a flowchart of a method for establishing a global symbol table according to an embodiment of the present invention. As shown in Figure 18, the method for establishing the global symbol table includes the following steps:
Step S1801: parse all code files in turn, generate the corresponding objects such as classes, namespaces, methods and fields, and establish the corresponding containment relationships; each code file corresponds to one CSymbolFile object.
According to the syntactic features of the C# language, the CSymbolFile object list shown in Table 1 is defined.
Table 1: CSymbolFile object list
Step S1802: merge all CSymbolFile objects and establish a global type hash table, with the fully qualified name of each CSI object as the key and the corresponding CSI object as the value.
This step mainly solves the problem of associating types declared with the partial keyword.
Step S1803: process the alias directives in the code.
The alias directives (using alias directives) in the code are processed in preparation for the subsequent type association.
Step S1804: associate type declaration Tokens with the corresponding type objects, and associate variable Tokens with variable definitions.
The lexical unit Tokens used for type declarations are associated with the corresponding type objects, and the lexical unit Tokens corresponding to variables in the code file are associated with those variables.
Step S1805: associate method call Tokens with the corresponding method objects.
Method call Tokens are associated with the corresponding method objects, that is, the lexical unit Tokens corresponding to a preset called method are associated with that preset called method, which completes the establishment of the global symbol table.
In this embodiment, all code files are parsed in turn to generate the corresponding objects such as classes, namespaces, methods and fields and to establish the corresponding containment relationships, each code file corresponding to one CSymbolFile object; all CSymbolFile objects are merged and a global type hash table is established with the fully qualified name of each CSI object as the key and the corresponding CSI object as the value; the alias directives in the code are processed; type declaration Tokens are associated with the corresponding type objects, variable Tokens are associated with variable definitions, and method call Tokens are associated with the corresponding method objects, thereby establishing the global symbol table.
Figure 19 is a schematic diagram of the classes of objects according to an embodiment of the present invention. As shown in Figure 19, these are the classes of the objects shown in Table 1, and the inheritance relationships between the types are based mainly on whether the corresponding code constructs are structurally similar. CSI represents a class (Class), structure (Struct) or interface (Interface) object; because these three kinds of type have a similar structure at the code level, they are combined. A CSymbolFile object represents one physical code file. It inherits from the namespace class CNamespace because the top level of a code file can be regarded as an implicit global namespace "namespace Global{ ... }", so the two have the same logical meaning.
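The inheritance relationships described for Figure 19 can be sketched in Python as follows. Only the names CSymbolFile, CNamespace and CSI come from the text; the base class CScope, the attributes and the method full_name are assumptions made for this illustration, since Table 1 and Figure 19 are not reproduced here.

```python
class CScope:
    """Hypothetical common base: a named object contained in a parent scope."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.members = {}
        if parent is not None:
            parent.members[name] = self   # record the containment relationship

    def full_name(self):
        """Fully qualified name, e.g. 'N1.N2.A'; the implicit global scope adds nothing."""
        parts, scope = [], self
        while scope is not None and scope.name:
            parts.append(scope.name)
            scope = scope.parent
        return ".".join(reversed(parts))

class CNamespace(CScope):
    """A namespace declared in the code."""

class CSymbolFile(CNamespace):
    """One physical code file, treated as an implicit global namespace."""
    def __init__(self, path):
        super().__init__("")
        self.path = path

class CSI(CScope):
    """A class, struct or interface; `kind` distinguishes the three."""
    def __init__(self, name, parent, kind):
        super().__init__(name, parent)
        self.kind = kind
```

Making CSymbolFile a subclass of CNamespace means that top-level declarations and namespace members can be handled by the same containment logic.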
Figure 20 is a schematic diagram of a global type hash table according to an embodiment of the present invention. As shown in Figure 20, the C# language supports the partial keyword, that is, the definition of the same class or method may be spread over multiple code files, so the different parts of the definition of the same type need to be associated. While the global type hash table is being established, the different partial fragments of the same type are linked together. The global type hash table includes keys (Key) and values (Value): the key Namespace.A is associated with class A, the key Namespace.B is associated with the fragments of partial class B, and the key Namespace.C is associated with class C.
Besides solving the problem of associating partial objects, the established global type hash table also solves the problems of type lookup and type traversal.
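Merging per-file declarations into a global type hash table can be sketched as follows. In this Python illustration the function name build_global_type_table and the input format (one list of (fully qualified name, fragment) pairs per code file) are assumptions; the fragments are plain strings standing in for CSI objects.

```python
from collections import defaultdict

def build_global_type_table(files):
    """Merge per-file type declarations into one table keyed by fully qualified name.

    `files` is an iterable with one list of (full_name, fragment) pairs per code file.
    Fragments of the same partial type end up chained under the same key.
    """
    table = defaultdict(list)
    for declarations in files:
        for full_name, fragment in declarations:
            table[full_name].append(fragment)
    return dict(table)
```

A lookup by fully qualified name then returns every fragment of a partial type in one step, and iterating over the keys traverses all known types.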
Alias directives in C# code can interfere with subsequent type lookup. The alias directives in the code therefore need to be parsed first, and every place that uses an alias is then expanded at the Token level.
Figure 21 is a schematic diagram of code in which alias directives are processed according to an embodiment of the present invention. As shown in Figure 21, the code that uses an alias directive is "using A=N1.N2.A; class B:A{}"; after the alias directive is processed, the code becomes "class B:N1.N2.A{}", which prevents the alias directive from interfering with subsequent type lookup.
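Token-level alias expansion can be sketched on the example of Figure 21. The following Python function expand_aliases is a hypothetical, simplified illustration: it treats every alias as file-wide, expects each alias directive to be terminated by ";", and does not distinguish an alias use from a declaration that happens to have the same name.

```python
def expand_aliases(tokens):
    """Remove `using X = Q.N;` directives and replace later uses of X with Q.N."""
    aliases, out, i = {}, [], 0
    while i < len(tokens):
        if tokens[i] == "using" and tokens[i + 2:i + 3] == ["="]:
            end = tokens.index(";", i)               # end of the alias directive
            aliases[tokens[i + 1]] = tokens[i + 3:end]
            i = end + 1
        elif tokens[i] in aliases and (not out or out[-1] != "."):
            out.extend(aliases[tokens[i]])           # expand the alias in place
            i += 1
        else:
            out.append(tokens[i])
            i += 1
    return out
```

The check on the preceding "." keeps a member access such as "x.A" from being mistaken for a use of the alias A.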
Type association and function association bind each lexical unit Token to its corresponding variable, type or method object. The point of doing so is that, when the lexical unit sequence TokenList is traversed, it is possible to know the variable type corresponding to the current lexical unit Token, whether the variable is a value type or a reference type, whether the implementation of a called function references a variable of interest, and so on. This significantly improves the capability of upper-layer check items at the semantic level and in function tracking.
The key to type association and function association is type lookup. In the C# language, the concrete type represented by a type Token depends on the current context. Figure 22 is a schematic diagram of the type lookup of a lexical unit according to an embodiment of the present invention. As shown in Figure 22, in theory the base class A of class B could refer to either N1.A or N2.A; in practice, the C# compiler applies a "nearest principle" in this situation, that is, it preferentially matches the "nearest" type, so in the above example the compiler resolves A to N2.A.
This embodiment follows the same approach for type lookup and function lookup. Based on the generated global symbol table, it implements a context-sensitive type lookup algorithm with which static code scanning is performed on the code file; this is an important guarantee of the correctness of the scanning result.
Figure 23 is a flowchart of a method for performing static code scanning on a code file according to an embodiment of the present invention. As shown in Figure 23, the method includes the following steps:
Step S2301: split the lexical unit segment corresponding to the type by level and push the parts onto a stack.
Step S2302: take out the element in the stack to determine the type name.
The element taken out of the stack is the type name N of the type, which does not include the qualified name of the type.
Step S2303: judge whether the type name is a system type.
It is judged whether the type name is a system type. If the type name is a system type, the static code scanning flow for the code file ends; if the type name is not a system type, step S2304 is performed.
Step S2304: starting from the base class sub-object in which the lexical unit segment is located, search for type name N from the bottom up.
If the type name is not a system type, type name N is searched for from the bottom up, starting from the base class sub-object in which the lexical unit segment is located.
Step S2305: judge whether type name N is found.
After the bottom-up search starting from the base class sub-object in which the lexical unit segment is located, it is judged whether type name N has been found. If type name N is found, step S2308 is performed; if type name N is not found, step S2306 is performed.
Step S2306: search for type name N in the global type hash table.
When type name N has not been found, type name N is searched for in the global type hash table.
Step S2307: judge whether type name N is found in the global type hash table.
If type name N is found in the global type hash table, step S2308 is performed; if type name N is not found in the global type hash table, the static code scanning flow for the code file ends.
Step S2308: verify the type qualified name in the stack.
When type name N has been found (in step S2305 or step S2307), the type qualified name in the stack is verified.
Step S2309: judge whether the type qualified name in the stack matches the preset type qualified name.
If the type qualified name in the stack matches the preset type qualified name, step S2310 is performed; if the type qualified name in the stack does not match the preset type qualified name, the static code scanning flow for the code file ends.
Step S2310: return the matched type.
The matched type is the result of the type lookup, that is, the result of the code scanning.
In this embodiment, the lexical unit segment corresponding to the type is split by level and pushed onto a stack; the element in the stack is taken out to determine the type name; it is judged whether the type name is a system type; if the type name is not a system type, type name N is searched for from the bottom up, starting from the base class sub-object in which the lexical unit segment is located; it is judged whether type name N is found; if type name N is not found, type name N is searched for in the global type hash table, and it is judged whether type name N is found there; if type name N is found, the type qualified name in the stack is verified, and it is judged whether the type qualified name in the stack matches the preset type qualified name; if they match, the matched type is returned. Static code scanning of the code file is thereby implemented, and the accuracy of code file scanning is improved.
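The lookup flow of steps S2301 to S2310 can be sketched as follows. This Python example is one possible reading of the flowchart under several assumptions: the enclosing context is given as a dotted namespace string, the global type hash table is reduced to a set of fully qualified names, the fallback in the hash table matches on the last name component, and the names SYSTEM_TYPES and find_type are hypothetical.

```python
SYSTEM_TYPES = {"int", "bool", "string", "object"}   # assumed subset of built-in types

def find_type(type_name, scope, global_types):
    """Resolve `type_name` as written in `scope`; return a fully qualified name or None."""
    stack = type_name.split(".")                 # S2301: split by level, push onto a stack
    name = stack.pop()                           # S2302: top element is the bare type name N
    if name in SYSTEM_TYPES:                     # S2303: system types end the flow
        return None
    candidates = []
    parts = scope.split(".") if scope else []
    while True:                                  # S2304: search outward from the innermost scope
        candidate = ".".join(parts + [name])
        if candidate in global_types:            # S2305: nearest match wins
            candidates = [candidate]
            break
        if not parts:
            break
        parts.pop()
    if not candidates:                           # S2306/S2307: fall back to the global table
        candidates = sorted(t for t in global_types if t.split(".")[-1] == name)
    for candidate in candidates:                 # S2308/S2309: verify the written qualifiers
        qualifiers = candidate.split(".")[:-1]
        if not stack or qualifiers[-len(stack):] == stack:
            return candidate                     # S2310: return the matched type
    return None
```

With the hypothetical types N1.A, N1.N2.A and N1.N2.B, resolving "A" from inside N1.N2 yields N1.N2.A, which corresponds to the nearest principle, while the same name resolved from inside N1 yields N1.A.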
Embodiment 3
The application environment of this embodiment of the present invention may be, but is not limited to, the application environment of the above embodiments, and is not described again here. This embodiment of the present invention provides an optional specific application for implementing the above data processing method.
Figure 24 is a schematic diagram of source code according to an embodiment of the present invention. As shown in Figure 24, a code file is obtained, the code file being source program text that includes a character string; lexical analysis is performed on the character string in the code file to obtain a lexical unit sequence; the code file is parsed to obtain preset objects; the lexical unit sequence and the preset objects are associated to establish a global symbol table, which records the data information of all preset objects in the code file; and static code scanning is performed on the code file according to the global symbol table to obtain a scanning result that at least includes a lookup result for the type of the lexical unit sequence. The output shown in Figure 25 is thereby obtained, where Figure 25 is a schematic diagram of a debugging file output by symbolization according to an embodiment of the present invention. In this output, identical variable IDs denote the same variable and can be associated with the definition of that variable, and the function Func can directly reference its definition.
After static code scanning is performed on the code file according to the global symbol table and the scanning result is obtained, a data flow model is established. The basic idea of establishing the data flow model is to simulate the execution flow of the code and, as far as possible, infer and record the values a variable may take in the current context.
Plain numeric literal Tokens in the code are processed, and the numeric information is recorded in the data flow section of the corresponding Token. For some simple functions (those that return a constant or whose return value results from simple arithmetic operations), the return value is computed by simulation. The bitwise operator & is processed: for the expression "i = j & 4;", the result of variable i can only be 4 or 0. For the standard pattern of a for loop, such as "for (int i = 0; i < 10; ++i)", the process of i going from 0 to 9 is simulated. When a variable is assigned a known value, the subsequent usage of the variable is tracked and the data flow section of the variable is updated. If an if condition compares the magnitude of a variable, the values the variable may take in the current context can be inferred; the usage of the variable is then resolved onward from the if condition, and the corresponding data flow section is updated.
Figure 26 is a schematic diagram of code for which a data flow model is established according to an embodiment of the present invention. As shown in Figure 26, from "int i=10; if (j > 0) i=42; j=1" it is derived that the value range of variable i in the current context is {10, 42}.
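The value-range reasoning described above can be sketched with sets of possible values. The following Python helpers are illustrations under assumed names (merge_branch, and_with_constant, for_range) and cover only three of the listed cases: an if statement without else, a bitwise AND with a non-negative constant, and the standard counting for loop.

```python
def merge_branch(before, inside_if):
    """Values after an `if` without `else`: the branch may or may not have run."""
    return {var: before.get(var, set()) | inside_if.get(var, set())
            for var in before.keys() | inside_if.keys()}

def and_with_constant(mask):
    """Possible results of `x & mask` for an unknown x and a non-negative constant mask."""
    values, sub = set(), mask
    while True:                      # enumerate every sub-mask of `mask`
        values.add(sub)
        if sub == 0:
            break
        sub = (sub - 1) & mask
    return values

def for_range(start, stop):
    """Values of i inside `for (int i = start; i < stop; ++i)`."""
    return set(range(start, stop))

# The example of Figure 26: int i = 10; if (j > 0) i = 42;
env = {"i": {10}}
branch = dict(env)
branch["i"] = {42}
env = merge_branch(env, branch)
```

After the merge, env["i"] holds both 10 and 42, matching the value range stated for Figure 26, and and_with_constant(4) yields the two values 0 and 4 mentioned for "i = j & 4;".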
The embodiment of the present invention finds potential problems in code, so that programmers can repair these problems efficiently and at low cost and improve code quality. The symbolization result is accurate and efficient: a context-sensitive type lookup algorithm is used, which is more accurate than simple string matching, and the data structures are designed to keep structure sizes as small as possible and to make full use of the cache, so that both memory usage and efficiency perform well. The embodiment of the present invention does not depend on compilation, so detection and analysis can be carried out even when the code does not compile, without affecting the overall symbolization flow and result. The scheme can be implemented in C++ and currently supports Windows, Linux and Mac systems.
It should be noted that the technical solution of the present invention supports extension to multiple languages. In addition to the C# language, it can in theory support other similar languages such as Java and C/C++, all of which are C-like languages. Java is very similar to C# in both syntax and coding style, so the cost of supporting Java with this solution is relatively small; C/C++ is based on header file inclusion, so details of the solution may need to change. The technical solution likewise supports intermediate languages. C# usually runs on a "managed virtual machine" (.NET or Mono), and C# can be compiled into intermediate language (IL). Therefore, based on the data structure and interfaces of the global symbol table in this solution, the IL corresponding to C# can be symbolized directly, and the user only needs to provide the compiled assembly files (dll files, exe files, etc.). Type lookup and function lookup then become easier to implement, and the precision of the data flow model and the AST also improves.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. In addition, those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods described in the embodiments of the present invention.
Embodiment 4
According to embodiments of the present invention, additionally provide a kind of for implementing above-mentioned data processing method.Figure 27 is according to this A kind of schematic diagram of the data processing equipment of inventive embodiments.As shown in figure 27, this data processing equipment includes: first obtains list Unit 10, analytic unit 20, resolution unit 30, associative cell 40 and scanning element 50.
First acquiring unit 10, is used for obtaining code file, and wherein, code file is to include the source program literary composition of character string This.
Analytic unit 20, for the character string in code file is carried out morphological analysis, obtains lexical unit sequence.
Resolution unit 30, is used for resolving code file, obtains presetting object.
Associative cell 40, for being associated setting up global symbol table to lexical unit sequence and default object, wherein, The data message of the global symbol table all default object in record code file.
Scanning element 50, for code file being performed static code scanning according to global symbol table, obtains scanning result, Wherein, scanning result at least includes the lookup result of the type to lexical unit sequence.
It should be noted that the first acquiring unit 10 in this embodiment may be used for performing in the embodiment of the present application 1 Step S202, the analytic unit 20 in this embodiment may be used for performing step S204 in the embodiment of the present application 1, this embodiment In resolution unit 30 may be used for performing step S206 in the embodiment of the present application 1, the associative cell 40 in this embodiment can For step S208 performed in the embodiment of the present application 1, the scanning element 50 in this embodiment is real for performing the application Execute step S210 in example 1.
Figure 28 is the schematic diagram of another kind of data processing equipment according to embodiments of the present invention.As shown in figure 28, these data Processing means includes: the first acquiring unit 10, analytic unit 20, resolution unit 30, associative cell 40 and scanning element 50.This number Also include according to processing means: second acquisition unit 60.
It should be noted that the first acquiring unit 10 of this embodiment, analytic unit 20, resolution unit 30, associative cell 40 is identical with the effect in the data processing equipment shown in Figure 27, and here is omitted.
Second acquisition unit 60, for being associated setting up global symbol table to lexical unit sequence and default object Before, in the case of the logic not changing code file, lexical unit sequence is simplified, be simplified lexical unit sequence Row.
Associative cell 40 is for being associated setting up global symbol table to simplification lexical unit sequence and default object.
Alternatively, this data processing equipment also includes: first sets up unit, for lexical unit sequence is being carried out letter Change, after being simplified lexical unit sequence, set up tree-like grammatical structure, wherein, tree-like language according to simplifying lexical unit sequence Method structure is the tree structure of the data object for storage with default grammer.Above-mentioned associative cell 40 is for according to tree-like language Simplification lexical unit sequence and default object are associated setting up global symbol table by method structure.
Alternatively, first set up unit and include: the first acquisition module and first sets up module.Wherein, the first acquisition module For obtaining the single code expression in lexical unit sequence;First sets up module for setting up according to single code expression Tree-like grammatical structure.
Alternatively, default object includes multiple default object, and associative cell 40 includes: second sets up module, the second acquisition Module and relating module.Wherein, second sets up module presets global object's list for setting up according to multiple default objects, its In, preset global object's list for being associated by the same type of different keyword fragments in code file;Second obtains Delivery block instructs for obtaining the another name in code file and the code using another name instruction in code file is performed process, To processing code;Relating module, in the list of default global object, according to processing code by lexical unit sequence and and word The default object that method unit sequence is corresponding performs type association or function association, obtains global symbol table.
Optionally, the association module is configured to perform at least one of the following: associating, according to the processed code, a lexical unit sequence used for a type declaration with the preset object corresponding to that lexical unit sequence; and associating, according to the processed code, a lexical unit sequence corresponding to a variable in the code file with that variable.
Optionally, the association module is further configured to associate, according to the processed code, a lexical unit sequence corresponding to a preset calling method with the preset calling method.
Optionally, the association module includes a determining submodule, a judging submodule, and an association submodule. The determining submodule is configured to determine the type name of the type of the lexical unit sequence and to look up the type name in the preset global object list. The judging submodule is configured to judge whether the type name found meets a preset condition. The association submodule is configured to, when the type name found is judged to meet the preset condition, perform type association or function association between the lexical unit sequence and the preset object according to the processed code.
Optionally, the scanning unit 50 includes a splitting module, a determining module, a first judging module, a first lookup module, a second lookup module, a first verification module, a second judging module, and a first return module. The splitting module is configured to push the split result onto a stack. The determining module is configured to determine the type name to be looked up for the lexical unit sequence according to the top element of the stack. The first judging module is configured to judge whether the type name to be looked up is a preset type name. The first lookup module is configured to, when the type name to be looked up is judged not to be a preset type name, look up the type name in the base class sub-object in which the lexical unit segment is located. The second lookup module is configured to, when the type name is not found in the base class sub-object in which the lexical unit segment is located, look up the type name in the global symbol table. The first verification module is configured to, when the type name is found in the global symbol table, verify the type qualified name of the type name to be looked up, where the type qualified name is a name used to qualify the type name to be looked up. The second judging module is configured to judge whether the type qualified name matches a preset qualified name. The first return module is configured to, when the type qualified name is judged to match the preset qualified name, return the type name to be looked up as the scanning result.
Optionally, the scanning unit 50 further includes a second verification module, a third judging module, and a second return module. The second verification module is configured to, after the type name to be looked up has been searched for in the base class sub-object in which the lexical unit segment is located and has been found there, verify the type qualified name of the type name to be looked up. The third judging module is configured to judge whether the type qualified name matches the preset qualified name. The second return module is configured to, when the type qualified name is judged to match the preset qualified name, return the type name to be looked up as the scanning result.
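The lookup order used by the scanning unit 50 (preset type names first, then the base class sub-object, then the global symbol table, each followed by a check of the qualified name) can be sketched as below. The sketch rests on several assumptions that are not stated in this embodiment: segments are split on `.`, the qualifier written in the segment plays the role of the preset qualified name, a preset type name is returned directly, and the table layouts and the name `find_type` are hypothetical.

```python
PRESET_TYPE_NAMES = {"int", "bool", "float", "string", "void"}  # assumed built-ins


def find_type(segment, enclosing_class, classes, global_symbols):
    """Return the looked-up type name, or None when it cannot be resolved.

    classes:        {class_name: {"base": base_name_or_None, "types": {name: qualifier}}}
    global_symbols: {type_name: qualifier}
    """
    # Split the lexical unit segment and push the parts onto a stack.
    stack = []
    for part in segment.split("."):
        stack.append(part)

    wanted = stack.pop()                # top of the stack: type name to look up
    wanted_qualifier = ".".join(stack)  # assumed to act as the preset qualified name

    if wanted in PRESET_TYPE_NAMES:
        return wanted

    # Search the enclosing class and then its chain of base classes.
    visited = set()
    cls = enclosing_class
    while cls in classes and cls not in visited:
        visited.add(cls)
        members = classes[cls]["types"]
        if wanted in members:
            return wanted if members[wanted] == wanted_qualifier else None
        cls = classes[cls]["base"]

    # Fall back to the global symbol table.
    if wanted in global_symbols:
        return wanted if global_symbols[wanted] == wanted_qualifier else None
    return None
```

Searching the class hierarchy before the global table mirrors the order described above, so that a type declared in a base class takes precedence over a global type with the same name.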
Optionally, the data processing apparatus further includes a preprocessing unit, configured to, before lexical analysis is performed on the character string in the code file to obtain the lexical unit sequence, preprocess the code file to obtain preprocessed code, where the preprocessed code is a character stream that meets preset rules. The analysis unit 20 is configured to perform lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
Optionally, the preprocessing unit is configured to perform at least one of the following: filtering out the whitespace of the code file; deleting the comments of the code file; processing the preprocessing directives of the code file according to a preset configuration; and converting the code file to a unified encoding and outputting the code character stream.
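Three of the preprocessing operations listed above (unified encoding, comment deletion, whitespace filtering) can be sketched in a few lines. This is a simplified illustration under assumptions: C-style `//` and `/* */` comments, UTF-8 as the default input encoding, and a hypothetical function name `preprocess`. It deliberately ignores preprocessing directives and would also strip comment-like text inside string literals, which a real implementation would need to handle.

```python
import re


def preprocess(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a code file and return a normalized character stream."""
    # Unified encoding: decode to str and normalize line endings.
    text = raw.decode(encoding, errors="replace")
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Delete comments (block comments first, then line comments).
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)
    text = re.sub(r"//[^\n]*", "", text)

    # Filter whitespace: collapse runs of blanks and drop empty lines.
    text = re.sub(r"[ \t]+", " ", text)
    lines = [line.strip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)
```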
Optionally, the analysis unit 20 includes a reading module, a composing module, and a generating module. The reading module is configured to read the character stream of the code file; the composing module is configured to compose the character stream into lexemes; and the generating module is configured to generate the lexical unit sequence according to the lexemes, where each lexical unit in the lexical unit sequence corresponds one-to-one to a lexeme.
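The read, compose, and generate stages of the analysis unit 20 can be illustrated with a small regular-expression tokenizer. It is a sketch only: the token categories, the `LexicalUnit` type, and the function name `tokenize` are assumptions chosen for the example and cover just a tiny C-like subset.

```python
import re
from typing import List, NamedTuple


class LexicalUnit(NamedTuple):
    kind: str    # category of the lexical unit
    lexeme: str  # the exact characters it was built from


# Assumed token categories for a tiny C-like subset.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*/<>!]=?|[(){};.,]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(text: str) -> List[LexicalUnit]:
    """Read the character stream, compose lexemes, and emit one unit per lexeme."""
    units = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace produces no lexical unit
            units.append(LexicalUnit(match.lastgroup, match.group()))
        pos = match.end()
    return units
```

Each emitted `LexicalUnit` carries exactly one lexeme, which reflects the one-to-one correspondence described above.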
Optionally, the data processing apparatus further includes a second establishing unit, configured to, after the lexical unit sequence and the preset object have been associated to establish the global symbol table, establish, according to the global symbol table, a data flow model for simulating the execution flow of the code file.
Optionally, the second establishing unit includes a simulation module and a third establishing module. The simulation module is configured to simulate the execution flow of the code file according to the global symbol table, obtaining a simulation result; the third establishing module is configured to establish the data flow model according to the simulation result.
Optionally, the simulation module is configured to determine and record the value range of a variable when the variable of the code file is compared according to the global symbol table.
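Recording the value range of a variable when it is compared can be sketched as interval narrowing. The sketch assumes integer variables and comparisons against constants; the function name `narrow` and the `(low, high)` tuple representation are hypothetical and are not taken from this embodiment.

```python
import math


def narrow(ranges, var, op, bound):
    """Return a copy of `ranges` in which `var` is restricted by `var op bound`.

    ranges maps a variable name to an inclusive (low, high) interval;
    integer-valued variables are assumed.
    """
    low, high = ranges.get(var, (-math.inf, math.inf))
    if op == "<":
        high = min(high, bound - 1)
    elif op == "<=":
        high = min(high, bound)
    elif op == ">":
        low = max(low, bound + 1)
    elif op == ">=":
        low = max(low, bound)
    elif op == "==":
        low, high = max(low, bound), min(high, bound)
    else:
        raise ValueError(f"unsupported comparison {op!r}")
    updated = dict(ranges)
    updated[var] = (low, high)
    return updated
```

For example, simulating the branch taken when `i >= 0` and then `i < 10` leaves `i` recorded with the range `(0, 9)`, which a later check (such as an array index check) could consult.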
The above units and modules solve the technical problem in the related art that code scanning has low accuracy, and thereby achieve the technical effect of improving the accuracy of code scanning.
It should be noted here that the above units and modules are implemented with the same examples and application scenarios as the corresponding steps, but are not limited to the content disclosed in Embodiment 1 above. It should also be noted that the above units and modules, as part of the apparatus, may run in the hardware environment shown in Figure 1 and may be implemented in software or in hardware, where the hardware environment includes a network environment.
Embodiment 5
According to an embodiment of the present invention, a server or terminal for implementing the above data processing method is further provided.
Figure 29 is a structural block diagram of a terminal according to an embodiment of the present invention. As shown in Figure 29, the terminal may include one or more processors 291 (only one is shown in the figure), a memory 293, and a transmission device 295 (the sending device in the foregoing embodiments). As shown in Figure 29, the terminal may further include an input/output device 297.
The memory 293 may be used to store software programs and modules, such as the program instructions/modules corresponding to the data processing method and apparatus in the embodiments of the present invention. The processor 291 runs the software programs and modules stored in the memory 293 to execute various functional applications and data processing, that is, to implement the data processing method described above. The memory 293 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 293 may further include memory located remotely from the processor 291, and such remote memory may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission device 295 is configured to receive or send data via a network, and may also be used for data transmission between the processor and the memory. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 295 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices and a router through a network cable so as to communicate with the Internet or a local area network. In another example, the transmission device 295 is a radio frequency (RF) module, which communicates with the Internet wirelessly.
Specifically, the memory 293 is configured to store an application program.
The processor 291 may call, via the transmission device 295, the application program stored in the memory 293 to perform the following steps:
obtaining a code file, where the code file is source text that includes a character string;
performing lexical analysis on the character string in the code file to obtain a lexical unit sequence;
parsing the code file to obtain a preset object;
associating the lexical unit sequence with the preset object to establish a global symbol table, where the global symbol table is used to record the data information of all preset objects in the code file; and
performing static code scanning on the code file according to the global symbol table to obtain a scanning result, where the scanning result includes at least a lookup result for the type of the lexical unit sequence.
The processor 291 is further configured to perform the following steps: before the lexical unit sequence and the preset object are associated to establish the global symbol table, simplifying the lexical unit sequence without changing the logic of the code file, obtaining a simplified lexical unit sequence. Associating the lexical unit sequence with the preset object to establish the global symbol table then includes: associating the simplified lexical unit sequence with the preset object to establish the global symbol table.
The processor 291 is further configured to perform the following steps: after the lexical unit sequence has been simplified to obtain the simplified lexical unit sequence, establishing a tree-like grammatical structure according to the simplified lexical unit sequence, where the tree-like grammatical structure is a tree structure for storing data objects that have a preset grammar. Associating the simplified lexical unit sequence with the preset object to establish the global symbol table then includes: associating the simplified lexical unit sequence with the preset object according to the tree-like grammatical structure to establish the global symbol table.
The processor 291 is further configured to perform the following steps: obtaining the single code expressions in the lexical unit sequence; and establishing the tree-like grammatical structure according to the single code expressions.
The processor 291 is further configured to perform the following steps: establishing a preset global object list according to a plurality of preset objects, where the preset global object list is used to associate different key fragments of the same type in the code file; obtaining the alias instructions in the code file and processing the code in the code file that uses the alias instructions, obtaining processed code; and, within the preset global object list and according to the processed code, performing type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence, obtaining the global symbol table.
The processor 291 is further configured to perform the following step: associating, according to the processed code, the lexical unit sequence corresponding to a preset calling method with the preset calling method.
The processor 291 is further configured to perform the following steps: determining the type name of the type of the lexical unit sequence and looking up the type name in the preset global object list; judging whether the type name found meets a preset condition; and, if the type name found is judged to meet the preset condition, performing type association or function association between the lexical unit sequence and the preset object according to the processed code.
The processor 291 is further configured to perform the following steps: splitting a lexical unit segment that includes one or more lexical units, obtaining a split result; pushing the split result onto a stack; determining the type name to be looked up for the lexical unit sequence according to the top element of the stack; judging whether the type name to be looked up is a preset type name; if the type name to be looked up is judged not to be a preset type name, looking up the type name in the base class sub-object in which the lexical unit segment is located; when the type name is not found in the base class sub-object in which the lexical unit segment is located, looking up the type name in the global symbol table; when the type name is found in the global symbol table, verifying the type qualified name of the type name to be looked up, where the type qualified name is a name used to qualify the type name to be looked up; judging whether the type qualified name matches a preset qualified name; and, if the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
The processor 291 is further configured to perform the following steps: after the type name to be looked up has been searched for in the base class sub-object in which the lexical unit segment is located, and when the type name has been found there, verifying the type qualified name of the type name to be looked up; judging whether the type qualified name matches the preset qualified name; and, if the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
The processor 291 is further configured to perform the following steps: before lexical analysis is performed on the character string in the code file to obtain the lexical unit sequence, preprocessing the code file to obtain preprocessed code, where the preprocessed code is a character stream that meets preset rules; and performing lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
The processor 291 is further configured to perform the following steps: reading the character stream of the code file; composing the character stream into lexemes; and generating the lexical unit sequence according to the lexemes, where each lexical unit in the lexical unit sequence corresponds one-to-one to a lexeme.
The processor 291 is further configured to perform the following step: after the lexical unit sequence and the preset object have been associated to establish the global symbol table, establishing, according to the global symbol table, a data flow model for simulating the execution flow of the code file.
The processor 291 is further configured to perform the following steps: simulating the execution flow of the code file according to the global symbol table, obtaining a simulation result; and establishing the data flow model according to the simulation result.
The processor 291 is further configured to perform the following step: when a variable of the code file is compared according to the global symbol table, determining and recording the value range of the variable.
The embodiments of the present invention provide a data processing scheme. A code file is obtained, where the code file is source text that includes a character string; lexical analysis is performed on the character string in the code file to obtain a lexical unit sequence; the code file is parsed to obtain a preset object; the lexical unit sequence and the preset object are associated to establish a global symbol table, where the global symbol table is used to record the data information of all preset objects in the code file; and static code scanning is performed on the code file according to the global symbol table to obtain a scanning result, where the scanning result includes at least a lookup result for the type of the lexical unit sequence. This achieves the purpose of symbolizing the code file, thereby improving the accuracy of code scanning and solving the technical problem in the related art that code scanning has low accuracy.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments; details are not repeated here.
A person of ordinary skill in the art will understand that the structure shown in Figure 29 is only illustrative. The terminal may be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet Devices, MID), or a PAD. Figure 29 does not limit the structure of the above electronic device. For example, the terminal may include more or fewer components than those shown in Figure 29 (such as a network interface or a display device), or may have a configuration different from that shown in Figure 29.
A person of ordinary skill in the art will understand that all or some of the steps in the methods of the foregoing embodiments may be completed by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash drive, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or the like.
Embodiment 6
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store program code for executing the data processing method.
Optionally, in this embodiment, the storage medium may be located on at least one of a plurality of network devices in the network shown in the foregoing embodiments.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
obtaining a code file, where the code file is source text that includes a character string;
performing lexical analysis on the character string in the code file to obtain a lexical unit sequence;
parsing the code file to obtain a preset object;
associating the lexical unit sequence with the preset object to establish a global symbol table, where the global symbol table is used to record the data information of all preset objects in the code file; and
performing static code scanning on the code file according to the global symbol table to obtain a scanning result, where the scanning result includes at least a lookup result for the type of the lexical unit sequence.
Optionally, the storage medium is further configured to store program code for performing the following steps: before the lexical unit sequence and the preset object are associated to establish the global symbol table, simplifying the lexical unit sequence without changing the logic of the code file, obtaining a simplified lexical unit sequence. Associating the lexical unit sequence with the preset object to establish the global symbol table then includes: associating the simplified lexical unit sequence with the preset object to establish the global symbol table.
Optionally, the storage medium is further configured to store program code for performing the following steps: after the lexical unit sequence has been simplified to obtain the simplified lexical unit sequence, establishing a tree-like grammatical structure according to the simplified lexical unit sequence, where the tree-like grammatical structure is a tree structure for storing data objects that have a preset grammar. Associating the simplified lexical unit sequence with the preset object to establish the global symbol table then includes: associating the simplified lexical unit sequence with the preset object according to the tree-like grammatical structure to establish the global symbol table.
Optionally, the storage medium is further configured to store program code for performing the following steps: obtaining the single code expressions in the lexical unit sequence; and establishing the tree-like grammatical structure according to the single code expressions.
Optionally, the storage medium is further configured to store program code for performing the following steps: establishing a preset global object list according to a plurality of preset objects, where the preset global object list is used to associate different key fragments of the same type in the code file; obtaining the alias instructions in the code file and processing the code in the code file that uses the alias instructions, obtaining processed code; and, within the preset global object list and according to the processed code, performing type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence, obtaining the global symbol table.
Optionally, the storage medium is further configured to store program code for performing the following step: associating, according to the processed code, the lexical unit sequence corresponding to a preset calling method with the preset calling method.
Optionally, the storage medium is further configured to store program code for performing the following steps: determining the type name of the type of the lexical unit sequence and looking up the type name in the preset global object list; judging whether the type name found meets a preset condition; and, if the type name found is judged to meet the preset condition, performing type association or function association between the lexical unit sequence and the preset object according to the processed code.
Optionally, the storage medium is further configured to store program code for performing the following steps: splitting a lexical unit segment that includes one or more lexical units, obtaining a split result; pushing the split result onto a stack; determining the type name to be looked up for the lexical unit sequence according to the top element of the stack; judging whether the type name to be looked up is a preset type name; if the type name to be looked up is judged not to be a preset type name, looking up the type name in the base class sub-object in which the lexical unit segment is located; when the type name is not found in the base class sub-object in which the lexical unit segment is located, looking up the type name in the global symbol table; when the type name is found in the global symbol table, verifying the type qualified name of the type name to be looked up, where the type qualified name is a name used to qualify the type name to be looked up; judging whether the type qualified name matches a preset qualified name; and, if the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
Optionally, the storage medium is further configured to store program code for performing the following steps: after the type name to be looked up has been searched for in the base class sub-object in which the lexical unit segment is located, and when the type name has been found there, verifying the type qualified name of the type name to be looked up; judging whether the type qualified name matches the preset qualified name; and, if the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
Optionally, the storage medium is further configured to store program code for performing the following steps: before lexical analysis is performed on the character string in the code file to obtain the lexical unit sequence, preprocessing the code file to obtain preprocessed code, where the preprocessed code is a character stream that meets preset rules; and performing lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
Optionally, the storage medium is further configured to store program code for performing the following steps: reading the character stream of the code file; composing the character stream into lexemes; and generating the lexical unit sequence according to the lexemes, where each lexical unit in the lexical unit sequence corresponds one-to-one to a lexeme.
Optionally, the storage medium is further configured to store program code for performing the following step: after the lexical unit sequence and the preset object have been associated to establish the global symbol table, establishing, according to the global symbol table, a data flow model for simulating the execution flow of the code file.
Optionally, the storage medium is further configured to store program code for performing the following steps: simulating the execution flow of the code file according to the global symbol table, obtaining a simulation result; and establishing the data flow model according to the simulation result.
Optionally, the storage medium is further configured to store program code for performing the following step: when a variable of the code file is compared according to the global symbol table, determining and recording the value range of the variable.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments; details are not repeated here.
Optionally, in this embodiment, the storage medium may include, but is not limited to, any medium that can store program code, such as a USB flash drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The sequence numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
If the integrated unit in the above embodiments is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause one or more computer devices (which may be personal computers, servers, network devices, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above is only a preferred implementation of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims (18)

1. a data processing method, it is characterised in that including:
Obtaining code file, wherein, described code file is to include the source text of character string;
Character string in described code file is carried out morphological analysis, obtains lexical unit sequence;
Resolve described code file, obtain presetting object;
It is associated setting up global symbol table to described lexical unit sequence and described default object, wherein, described overall situation symbol Number table is for recording the data message of all described default object in described code file;And
According to described global symbol table to described code file perform static code scanning, obtain scanning result, wherein, described in sweep Retouch result and at least include the lookup result of the type to described lexical unit sequence.
Method the most according to claim 1, it is characterised in that
Before being associated setting up described global symbol table to described lexical unit sequence and described default object, described side Method also includes: in the case of the logic not changing described code file, simplifies described lexical unit sequence, obtains letter Change lexical unit sequence,
Described lexical unit sequence and described default object are associated set up described global symbol table include: to described letter Change lexical unit sequence and described default object is associated setting up described global symbol table.
Method the most according to claim 2, it is characterised in that
Simplifying described lexical unit sequence, after obtaining described simplification lexical unit sequence, described method also includes: Setting up tree-like grammatical structure according to described simplification lexical unit sequence, wherein, described tree-like grammatical structure is for have for storage Preset the tree structure of the data object of grammer,
Described simplification lexical unit sequence and described default object are associated set up described global symbol table include: according to Described simplification lexical unit sequence and described default object are associated setting up described overall situation symbol by described tree-like grammatical structure Number table.
Method the most according to claim 3, it is characterised in that set up described tree-like grammer according to described lexical unit sequence Structure includes:
Obtain the single code expression in described lexical unit sequence;And
Described tree-like grammatical structure is set up according to described single code expression.
Method the most according to claim 1, it is characterised in that described default object includes multiple default object, to described Lexical unit sequence and described default object are associated setting up global symbol table and include:
Setting up according to the plurality of default object and preset global object's list, wherein, the list of described default global object is used for will Same type of different keyword fragments in described code file are associated;
Obtain the another name instruction in described code file and the code using described another name instruction in described code file is performed Process, obtain processing code;
In the list of described default global object, according to described process code by described lexical unit sequence and with described morphology list The default object that metasequence is corresponding performs type association or function association, obtains described global symbol table.
6. The method according to claim 5, characterized in that performing type association, according to the processed code, between the lexical unit sequence and the preset object corresponding to the lexical unit sequence comprises at least one of the following:
Associating, according to the processed code, the lexical unit sequence used for a type declaration with the preset object corresponding to that lexical unit sequence;
Associating, according to the processed code, the lexical unit sequence corresponding to a variable in the code file with that variable.
7. The method according to claim 5, characterized in that performing function association, according to the processed code, between the lexical unit sequence and the preset object corresponding to the lexical unit sequence comprises at least: associating, according to the processed code, the lexical unit sequence corresponding to a preset calling method with that preset calling method.
8. The method according to claim 5, characterized in that performing type association or function association, according to the processed code, between the lexical unit sequence and the preset object corresponding to the lexical unit sequence comprises:
Determining the type name of the type of the lexical unit sequence and looking up the type name in the preset global object list;
Judging whether the type name found meets a preset condition; and
If the type name found is judged to meet the preset condition, performing type association or function association between the lexical unit sequence and the preset object according to the processed code.
9. The method according to claim 8, characterized in that performing static code scanning on the code file according to the global symbol table to obtain the scanning result comprises:
Splitting a lexical unit segment comprising one or more lexical units, to obtain a split result;
Pushing the split result onto a stack;
Determining the type name to be looked up for the lexical unit sequence according to the top element of the stack;
Judging whether the type name to be looked up is a preset type name;
If the type name to be looked up is judged not to be a preset type name, looking up the type name in the base-class sub-object in which the lexical unit segment is located;
When the type name to be looked up is not found in the base-class sub-object in which the lexical unit segment is located, looking up the type name in the global symbol table;
When the type name to be looked up is found in the global symbol table, verifying the type qualified name of the type name, wherein the type qualified name is a name used to qualify the type name to be looked up;
Judging whether the type qualified name matches a preset qualified name; and
If the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
10. The method according to claim 9, characterized in that, after the type name to be looked up is searched for in the base-class sub-object in which the lexical unit segment is located, the method further comprises:
When the type name to be looked up is found in the base-class sub-object in which the lexical unit segment is located, verifying the type qualified name of the type name;
Judging whether the type qualified name matches the preset qualified name; and
If the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
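The lookup order described above (stack top, then the enclosing base class, then the global symbol table, with a qualified-name check before returning) can be sketched in Python as follows. Everything concrete here is an assumption for illustration: the function name `lookup_type`, the `::` separator, the contents of `PRESET_TYPES`, and the representation of both tables as dictionaries mapping a type name to its qualifying name. The source does not state what happens for a preset type name or for a failed qualifier check; this sketch returns the name in the first case and `None` in the second.

```python
from typing import Dict, List, Optional, Set

PRESET_TYPES: Set[str] = {"int", "char", "float"}  # assumed built-in type names

def lookup_type(segment: str,
                base_class_types: Dict[str, str],
                global_symbols: Dict[str, str],
                preset_qualifier: str) -> Optional[str]:
    """Resolve the type name for a token segment such as 'ns::Widget'.

    Both dictionaries map a type name to its qualifying (scope) name.
    """
    # Split the segment and push each part onto a stack.
    stack: List[str] = []
    for part in segment.split("::"):
        stack.append(part)
    # The top of the stack is the type name to look up.
    name = stack[-1]
    if name in PRESET_TYPES:
        return name  # assumption: preset types need no further lookup
    # Search the enclosing base class first, then the global symbol table.
    for table in (base_class_types, global_symbols):
        if name in table:
            # Verify the qualifying name before reporting the type.
            return name if table[name] == preset_qualifier else None
    return None
```

Searching the base class before the global table mirrors ordinary scope resolution, where a name declared in an enclosing class shadows a global declaration of the same name.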
11. The method according to claim 1, characterized in that:
Before lexical analysis is performed on the character string in the code file to obtain the lexical unit sequence, the method further comprises: preprocessing the code file to obtain preprocessed code, wherein the preprocessed code is a character stream that conforms to preset rules;
Performing lexical analysis on the character string in the code file to obtain the lexical unit sequence comprises: performing lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
12. The method according to claim 10, characterized in that preprocessing the code file comprises at least one of the following:
Filtering out the whitespace in the code file;
Deleting the comments in the code file;
Processing the preprocessing directives in the code file according to a preset configuration;
Unifying the encoding of the code file and outputting a code character stream.
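As an illustration of the preprocessing steps listed above, a minimal Python sketch might look like the following. The function name `preprocess`, the restriction to C-style comments, and the choice of UTF-8 as the unified encoding are assumptions made for the example, not details from the source. Preprocessing directives are left untouched, and comment markers inside string literals are not handled.

```python
import re

def preprocess(raw: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical preprocessing pass: unify encoding, strip comments, filter whitespace."""
    # Unify the encoding: decode the bytes into one character stream.
    text = raw.decode(encoding, errors="replace")
    # Delete block comments (/* ... */) and line comments (// ...).
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)
    text = re.sub(r"//[^\n]*", "", text)
    # Filter whitespace: collapse runs of spaces/tabs and drop blank lines.
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

Replacing a block comment with a single space rather than nothing keeps neighbouring tokens from being glued together, for example in `int/* c */x`.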
13. The method according to claim 1, characterized in that performing lexical analysis on the character stream in the code file to obtain the lexical unit sequence comprises:
Reading the character stream of the code file;
Grouping the character stream into lexemes; and
Generating the lexical unit sequence from the lexemes, wherein each lexical unit in the lexical unit sequence corresponds one-to-one to a lexeme.
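The three steps above (read the character stream, group it into lexemes, emit one lexical unit per lexeme) can be sketched with a small regular-expression tokenizer. The `Token` type, the `TOKEN_SPEC` patterns, and the function name `tokenize` are hypothetical and cover only a tiny C-like subset; the source does not specify the token categories.

```python
import re
from typing import List, NamedTuple

class Token(NamedTuple):
    kind: str    # token category, e.g. "IDENT" or "NUMBER"
    lexeme: str  # the exact characters this token was built from

# Hypothetical token patterns for a tiny C-like language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=<>!]=?|[;(){},.]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(stream: str) -> List[Token]:
    """Turn a character stream into a list of tokens, one per lexeme."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(stream):
        match = MASTER.match(stream, pos)
        if match is None:
            raise ValueError(f"unexpected character {stream[pos]!r} at {pos}")
        # Whitespace is consumed but produces no token.
        if match.lastgroup != "SKIP":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Each `Token` keeps its originating lexeme, which is one simple way to preserve the one-to-one correspondence between lexical units and lexemes.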
14. The method according to claim 1, characterized in that, after the lexical unit sequence and the preset object are associated to build the global symbol table, the method further comprises: building, according to the global symbol table, a data flow model for simulating the execution flow of the code file.
15. The method according to claim 14, characterized in that building the data flow model according to the global symbol table comprises:
Simulating the execution flow of the code file according to the global symbol table, to obtain a simulation result; and
Building the data flow model from the simulation result.
16. The method according to claim 15, characterized in that simulating the execution flow of the code file according to the global symbol table to obtain the simulation result comprises: when, according to the global symbol table, a variable of the code file is subjected to a comparison, determining and recording the value range of the variable.
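A minimal sketch of recording a variable's value range when a comparison is encountered is shown below. The interval representation, the function name `record_comparison`, and the supported operator set are assumptions for illustration. Strict and non-strict comparisons are treated alike here, so the recorded closed interval is a slight over-approximation.

```python
import math
from typing import Dict, Tuple

Range = Tuple[float, float]  # closed interval [low, high]

def record_comparison(ranges: Dict[str, Range], var: str, op: str, value: float) -> Range:
    """Narrow and record the range of `var`, assuming `var <op> value` holds."""
    # A variable with no recorded range starts unbounded.
    low, high = ranges.get(var, (-math.inf, math.inf))
    if op in ("<", "<="):
        high = min(high, value)
    elif op in (">", ">="):
        low = max(low, value)
    elif op == "==":
        low, high = max(low, value), min(high, value)
    else:
        raise ValueError(f"unsupported operator {op!r}")
    ranges[var] = (low, high)
    return ranges[var]
```

A scanner could consult such recorded ranges later, for instance to judge whether an array index guarded by `i >= 0` and `i < 10` can fall outside the array bounds.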
17. A data processing device, characterized in that it comprises:
A first acquisition unit, configured to obtain a code file, wherein the code file is source program text comprising a character string;
An analysis unit, configured to perform lexical analysis on the character string in the code file to obtain a lexical unit sequence;
A parsing unit, configured to parse the code file to obtain a preset object;
An association unit, configured to associate the lexical unit sequence with the preset object to build a global symbol table, wherein the global symbol table is used to record the data information of all the preset objects in the code file; and
A scanning unit, configured to perform static code scanning on the code file according to the global symbol table to obtain a scanning result, wherein the scanning result comprises at least a lookup result for the type of the lexical unit sequence.
18. The device according to claim 17, characterized in that:
The device further comprises: a second acquisition unit, configured to simplify the lexical unit sequence, without changing the logic of the code file, before the lexical unit sequence and the preset object are associated to build the global symbol table, to obtain a simplified lexical unit sequence;
The association unit is configured to associate the simplified lexical unit sequence with the preset object to build the global symbol table.
CN201610613852.7A 2016-07-29 2016-07-29 Data processing method and device Active CN106227668B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610613852.7A CN106227668B (en) 2016-07-29 2016-07-29 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610613852.7A CN106227668B (en) 2016-07-29 2016-07-29 Data processing method and device

Publications (2)

Publication Number Publication Date
CN106227668A true CN106227668A (en) 2016-12-14
CN106227668B CN106227668B (en) 2017-11-17

Family

ID=57535333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610613852.7A Active CN106227668B (en) 2016-07-29 2016-07-29 Data processing method and device

Country Status (1)

Country Link
CN (1) CN106227668B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286132A (en) * 2008-06-02 2008-10-15 北京邮电大学 Test method and system based on software defect mode
CN101482847A (en) * 2009-01-19 2009-07-15 北京邮电大学 Detection method based on safety bug defect mode
CN102799520A (en) * 2012-06-27 2012-11-28 清华大学 Static checking method and device for source code pairing
CN104915293A (en) * 2015-06-12 2015-09-16 北京邮电大学 Software testing method and system

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304369B (en) * 2017-05-03 2020-12-01 腾讯科技(深圳)有限公司 File type identification method and device
CN108304369A (en) * 2017-05-03 2018-07-20 腾讯科技(深圳)有限公司 A kind of recognition methods of file type and device
CN108874825A (en) * 2017-05-12 2018-11-23 北京京东尚科信息技术有限公司 A kind of method of calibration and device of abnormal data
CN107153564A (en) * 2017-06-22 2017-09-12 拜椰特(上海)软件技术有限公司 A kind of morphology analytical tool
CN107153564B (en) * 2017-06-22 2020-07-07 拜椰特(上海)软件技术有限公司 Lexical analysis tool
CN107608875A (en) * 2017-08-03 2018-01-19 北京奇安信科技有限公司 A kind of localization process method and device of static code
CN107608875B (en) * 2017-08-03 2020-11-06 奇安信科技集团股份有限公司 Localization processing method and device for static code
CN110069455A (en) * 2017-09-21 2019-07-30 北京华为数字技术有限公司 A kind of file mergences method and device
CN109814939A (en) * 2017-11-20 2019-05-28 华为技术有限公司 The production method and device of a kind of dynamic loading method, file destination
CN108170425B (en) * 2017-12-29 2021-03-19 东莞市高标软件科技有限公司 Program code modification method and device and terminal equipment
CN108170425A (en) * 2017-12-29 2018-06-15 东莞市高标软件科技有限公司 A kind of amending method of program code, modification device and terminal device
CN108537086A (en) * 2018-03-29 2018-09-14 广东欧珀移动通信有限公司 Method for information display, device, storage medium and mobile terminal
CN108549538A (en) * 2018-04-11 2018-09-18 深圳市腾讯网络信息技术有限公司 A kind of code detection method, device, storage medium and test terminal
CN110795069A (en) * 2018-08-02 2020-02-14 Tcl集团股份有限公司 Code analysis method, intelligent terminal and computer readable storage medium
CN109359188A (en) * 2018-09-30 2019-02-19 北京数聚鑫云信息技术有限公司 A kind of component method of combination and system
CN109359188B (en) * 2018-09-30 2020-01-14 北京数聚鑫云信息技术有限公司 Component arranging method and system
CN109542420A (en) * 2018-10-15 2019-03-29 张海光 A kind of Code Edit method based on label
CN109558119A (en) * 2018-11-09 2019-04-02 杭州安恒信息技术股份有限公司 A method of the Web frame based on Java traverses request address
CN109710218A (en) * 2018-11-26 2019-05-03 福建天泉教育科技有限公司 A kind of object automatic switching method and terminal
CN109710218B (en) * 2018-11-26 2022-02-11 福建天泉教育科技有限公司 Object automatic conversion method and terminal
CN109656567A (en) * 2018-12-20 2019-04-19 北京树根互联科技有限公司 The dynamic approach and system of heterogeneousization business data processing logic
CN109656567B (en) * 2018-12-20 2022-02-01 北京树根互联科技有限公司 Dynamic method and system for heterogeneous service data processing logic
CN111385249A (en) * 2018-12-28 2020-07-07 中国电力科学研究院有限公司 Vulnerability detection method
CN111385249B (en) * 2018-12-28 2023-07-18 中国电力科学研究院有限公司 Vulnerability detection method
CN110309050A (en) * 2019-05-22 2019-10-08 深圳壹账通智能科技有限公司 Detection method, device, server and the storage medium of code specification
CN110197181A (en) * 2019-05-31 2019-09-03 烽火通信科技股份有限公司 A kind of cable character detection method and system based on OCR
CN110197181B (en) * 2019-05-31 2021-04-30 烽火通信科技股份有限公司 Cable character detection method and system based on OCR
CN110489127A (en) * 2019-08-12 2019-11-22 腾讯科技(深圳)有限公司 Error code determines method, apparatus, computer readable storage medium and equipment
CN110489127B (en) * 2019-08-12 2023-10-13 腾讯科技(深圳)有限公司 Error code determination method, apparatus, computer-readable storage medium and device
CN113110947A (en) * 2021-04-16 2021-07-13 中国工商银行股份有限公司 Program call chain generation method, system, electronic device and medium
CN113110947B (en) * 2021-04-16 2024-04-02 中国工商银行股份有限公司 Program call chain generation method, system, electronic device and medium

Also Published As

Publication number Publication date
CN106227668B (en) 2017-11-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant