CN106227668B - Data processing method and device - Google Patents

Info

Publication number
CN106227668B
Authority
CN
China
Prior art keywords
code
lexical unit
type
unit sequence
code file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610613852.7A
Other languages
Chinese (zh)
Other versions
CN106227668A (en)
Inventor
邹越
严明
张蓓
黄斌
袁明凯
魏学峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610613852.7A
Publication of CN106227668A
Application granted
Publication of CN106227668B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/146 Coding or compression of tree-structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/315 Object-oriented languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/73 Program documentation

Abstract

The invention discloses a data processing method and device. The data processing method includes: obtaining a code file, where the code file is a source program text containing a character sequence; performing lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; parsing the code file to obtain preset objects; associating the lexical unit sequence with the preset objects to establish a global symbol table, where the global symbol table records the data information of all preset objects in the code file; and performing a static code scan on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the type of the lexical unit sequence. The invention solves the technical problem in the related art that code scanning has low accuracy.

Description

Data processing method and device
Technical field
The present invention relates to the field of data processing, and in particular to a data processing method and device.
Background
At present, data processing methods that symbolize code offer no solution for global symbolization. When upper-layer static code check items scan the code, results based on non-global symbolization are not accurate enough.
In a traditional compilation process, an abstract syntax tree (AST) establishes the logical relationships between code expressions, for example the relationship between the if clause and the else clause of an if-else statement; it is not built for a single code expression. Building it requires code that compiles as input: once the input code contains a syntax error, the resulting global abstract syntax tree is wrong and has no reference value. Inspecting and analyzing code that is not in a compilable state therefore affects the whole symbolization flow and its result. As a consequence, symbolization based on the abstract syntax tree has low accuracy, and so does scanning of the symbolized code. Programmers are then less likely to find the defects present in the code, which lowers performance and security, reduces the efficiency with which programmers handle code, and raises the cost of repair.
Current data processing relies on simple string matching, which has low accuracy. In addition, the data structures used are large and cannot make full use of the data cache, which lowers the efficiency of data storage.
No effective solution has yet been proposed for the low accuracy of code scanning in the related art.
Summary of the invention
Embodiments of the present invention provide a data processing method and device, to at least solve the technical problem in the related art that code scanning has low accuracy.
According to one aspect of the embodiments of the present invention, a data processing method is provided. The data processing method includes: obtaining a code file, where the code file is a source program text containing a character sequence; performing lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; parsing the code file to obtain preset objects; associating the lexical unit sequence with the preset objects to establish a global symbol table, where the global symbol table records the data information of all preset objects in the code file; and performing a static code scan on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the type of the lexical unit sequence.
According to another aspect of the embodiments of the present invention, a data processing device is also provided. The data processing device includes: a first acquisition unit, configured to obtain a code file, where the code file is a source program text containing a character sequence; an analysis unit, configured to perform lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; a parsing unit, configured to parse the code file to obtain preset objects; an association unit, configured to associate the lexical unit sequence with the preset objects to establish a global symbol table, where the global symbol table records the data information of all preset objects in the code file; and a scanning unit, configured to perform a static code scan on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the type of the lexical unit sequence.
In the embodiments of the present invention, a code file is obtained, where the code file is a source program text containing a character sequence; lexical analysis is performed on the character sequence in the code file to obtain a lexical unit sequence; the code file is parsed to obtain preset objects; the lexical unit sequence is associated with the preset objects to establish a global symbol table, where the global symbol table records the data information of all preset objects in the code file; and a static code scan is performed on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the type of the lexical unit sequence. Because the global symbol table is established by associating the lexical unit sequence with the preset objects, and the static code scan is performed according to that table, the code file is symbolized. This achieves the technical effect of improving the accuracy of code scanning, and thereby solves the technical problem in the related art that code scanning has low accuracy.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a hardware structure block diagram of a computer terminal for a data processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method of building a syntax tree structure from a lexical unit sequence according to an embodiment of the present invention;
Fig. 4 is a flowchart of a method of associating a lexical unit sequence with preset objects to establish a global symbol table according to an embodiment of the present invention;
Fig. 5 is a flowchart of a method of performing type association or function association between a lexical unit sequence and the corresponding preset objects according to processed code, according to an embodiment of the present invention;
Fig. 6 is a flowchart of a method of performing a static code scan on a code file according to a global symbol table according to an embodiment of the present invention;
Fig. 7 is a flowchart of another method of performing a static code scan on a code file according to a global symbol table according to an embodiment of the present invention;
Fig. 8 is a flowchart of another method of performing lexical analysis on the character stream in a code file according to an embodiment of the present invention;
Fig. 9 is a flowchart of a method of building a data flow model from a global symbol table according to an embodiment of the present invention;
Fig. 10 is a flowchart of another data processing method according to an embodiment of the present invention;
Fig. 11 is a schematic diagram of code before a code file is preprocessed according to an embodiment of the present invention;
Fig. 12 is a schematic diagram of code after a code file is preprocessed according to an embodiment of the present invention;
Fig. 13 is a schematic diagram of code in which lexical analysis is performed on preprocessed code to obtain a lexical unit sequence according to an embodiment of the present invention;
Fig. 14 is a schematic diagram of a lexical unit sequence according to an embodiment of the present invention;
Fig. 15 is a schematic diagram of simplifying a lexical unit sequence according to an embodiment of the present invention;
Fig. 16 is another schematic diagram of simplifying a lexical unit sequence according to an embodiment of the present invention;
Fig. 17 is a schematic diagram of an abstract syntax tree structure according to an embodiment of the present invention;
Fig. 18 is a flowchart of a method of establishing a global symbol table according to an embodiment of the present invention;
Fig. 19 is a schematic diagram of the classes of objects according to an embodiment of the present invention;
Fig. 20 is a schematic diagram of a global type hash table according to an embodiment of the present invention;
Fig. 21 is a schematic diagram of code in which alias directives are processed according to an embodiment of the present invention;
Fig. 22 is a schematic diagram of looking up the type of a lexical unit according to an embodiment of the present invention;
Fig. 23 is a flowchart of a method of performing a static code scan on a code file according to an embodiment of the present invention;
Fig. 24 is a schematic diagram of source code according to an embodiment of the present invention;
Fig. 25 is a schematic diagram of a debug file output by symbolization according to an embodiment of the present invention;
Fig. 26 is a schematic diagram of code for building a data flow model according to an embodiment of the present invention;
Fig. 27 is a schematic diagram of a data processing device according to an embodiment of the present invention;
Fig. 28 is a schematic diagram of another data processing device according to an embodiment of the present invention; and
Fig. 29 is a structural block diagram of a terminal according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not describe a specific order or sequence. Data used in this way may be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, an embodiment of a data processing method is provided.
Optionally, in this embodiment, the data processing method can be applied in the hardware environment formed by server 102 and terminal 104 shown in Fig. 1. Fig. 1 is a hardware structure block diagram of a computer terminal for a data processing method according to an embodiment of the present invention. As shown in Fig. 1, server 102 is connected to terminal 104 through a network. The network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network. Terminal 104 includes, but is not limited to, a PC, a mobile phone, a tablet computer, and the like. The data processing method of the embodiment may be performed by server 102, by terminal 104, or jointly by server 102 and terminal 104. When terminal 104 performs the method, it may do so through a client installed on it.
Fig. 2 is a flowchart of a data processing method according to an embodiment of the present invention. As shown in Fig. 2, the data processing method may include the following steps:
Step S202: obtain a code file.
In the technical solution provided by step S202 of this application, a code file is obtained. The code file is a source program text containing a character sequence, that is, a character stream. It may be a code file of a C# program; C# is a programming language developed by Microsoft. The code file is obtained by taking it as input.
Step S204: perform lexical analysis on the character sequence in the code file to obtain a lexical unit sequence.
In the technical solution provided by step S204 of this application, lexical analysis is performed on the character sequence in the code file to obtain a lexical unit sequence; that is, the character sequence of the code file is converted into a sequence of words.
Lexical analysis is the process of converting a character sequence into a sequence of words (tokens). The program or function that performs lexical analysis is called a lexical analyzer (lexer for short), or a scanner. A lexical analyzer generally exists as a function that is called by the syntax analyzer. It scans the source program from left to right, recognizes each kind of word according to the lexical rules of the language, and produces the attributes of the corresponding word.
After the code file is obtained, its character sequence is read, the characters are grouped into lexemes, and a lexical unit sequence is generated and output. The lexical unit sequence is the set of all lexical units generated by lexical analysis. It is the basic data structure for subsequent processing and for the upper-layer check items that traverse the code. In essence it is a doubly linked list that maintains all lexical units; for example, if and for are lexical units. A doubly linked list is a kind of linked list in which each data node holds two pointers, one to its direct successor and one to its direct predecessor, so that starting from any node in the list, both its predecessor and its successor can be accessed easily.
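The lexical unit sequence described above can be sketched as a doubly linked list built by a left-to-right scan. This is a minimal illustration rather than the patent's implementation: the `Token` class, the `tokenize` and `texts` functions, and the deliberately crude token pattern are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical token pattern: identifiers/keywords, integers, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


@dataclass(eq=False)
class Token:
    text: str
    prev: Optional["Token"] = None  # direct predecessor in the sequence
    next: Optional["Token"] = None  # direct successor in the sequence


def tokenize(source: str) -> Optional[Token]:
    """Scan the character stream left to right and return the head of the list."""
    head = tail = None
    for match in TOKEN_RE.finditer(source):
        tok = Token(match.group())
        if tail is None:
            head = tok
        else:
            tail.next, tok.prev = tok, tail
        tail = tok
    return head


def texts(head: Optional[Token]) -> List[str]:
    """Walk the list forward and collect the lexeme of every token."""
    out = []
    while head is not None:
        out.append(head.text)
        head = head.next
    return out
```

Calling `tokenize("if (a) b = 1;")` yields a list whose nodes can be walked in either direction, which is the property the check items rely on when they need the neighbors of a lexical unit.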
Each lexical unit corresponds to one lexeme. It holds the string value of that lexeme and other attributes associated with the lexical unit: a pointer to the next lexical unit; a pointer to the previous lexical unit; and a pointer to the paired lexical unit, which is the lexical unit that matches the original one (for example, if the original lexical unit is "(", the paired lexical unit is ")"). The attributes also include the type of the lexical unit, such as number, string, variable, function, or keyword. They further include a pointer from the lexical unit into the symbol table: a variable lexical unit points to the variable object in the symbol table, and a function lexical unit points to the corresponding function object. The attributes also include the line number of the lexical unit and its syntax tree structure pointer, which maintains the abstract syntax tree structure of the lexical unit, as well as the data flow structure pointer of the lexical unit.
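A record carrying these attributes, together with the step that fills in the pairing pointer, might look like the sketch below. The field names and the `link_pairs` helper are hypothetical; the source lists the attributes but does not give their names or layout.

```python
from dataclasses import dataclass
from typing import List, Optional

PAIRS = {")": "(", "]": "[", "}": "{"}  # closing bracket -> opening bracket
OPENERS = set(PAIRS.values())


@dataclass(eq=False)
class Token:
    text: str                         # string value of the lexeme
    line: int                         # line number in the code file
    kind: str = "unknown"             # number, string, variable, function, keyword, ...
    prev: Optional["Token"] = None    # previous lexical unit
    next: Optional["Token"] = None    # next lexical unit
    pair: Optional["Token"] = None    # matching bracket, if any
    symbol: Optional[object] = None   # entry in the symbol table
    syntax: Optional[object] = None   # syntax tree structure pointer
    flow: Optional[object] = None     # data flow structure pointer


def link_pairs(tokens: List[Token]) -> None:
    """Point every bracket at its partner; unmatched brackets stay unpaired."""
    stack: List[Token] = []
    for tok in tokens:
        if tok.text in OPENERS:
            stack.append(tok)
        elif tok.text in PAIRS and stack and stack[-1].text == PAIRS[tok.text]:
            opener = stack.pop()
            opener.pair, tok.pair = tok, opener
```

Leaving unmatched brackets unpaired, instead of raising an error, is a design choice made here to stay in line with the document's goal of tolerating code that does not compile.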
Step S206: parse the code file to obtain preset objects.
In the technical solution provided by step S206 of this application, the code file is parsed to obtain preset objects.
After the code file is obtained, it is parsed to generate the preset objects that correspond to it, such as classes, namespaces, methods, and fields, and the containment relationships between the preset objects are established. Optionally, the code file comprises multiple code files; all of them are parsed in turn, the corresponding preset objects (classes, namespaces, methods, fields, and so on) are generated for each, and the containment relationships between all preset objects are established. A preset object may be the base class of all objects, corresponding to code with logical meaning, such as a class, a method, or a property; it may be a base-class object used to distinguish the lexical unit segments within a method; or it may be a namespace, enumeration, method, field, delegate, event, property, indexer, and so on. Preset objects have corresponding types, and the inheritance relationships between those types are determined mainly by whether the structures are similar at the code level.
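The containment relationship between such objects can be pictured as a tree of scopes in which each node knows its parent. The sketch below is illustrative only: the `Scope` class name echoes the base class mentioned later in the text, but its fields and methods are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Scope:
    name: str
    kind: str  # "namespace", "class", "method", "field", ...
    parent: Optional["Scope"] = field(default=None, repr=False)
    children: List["Scope"] = field(default_factory=list)

    def add(self, child: "Scope") -> "Scope":
        """Record that this scope contains the given child scope."""
        child.parent = self
        self.children.append(child)
        return child

    def full_name(self) -> str:
        """Join the names from the outermost scope down to this one."""
        parts = []
        node: Optional[Scope] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts))
```

For example, adding a `Player` class to a `Game` namespace and a `Move` method to that class gives the method the fully qualified name `Game.Player.Move`.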
Step S208: associate the lexical unit sequence with the preset objects to establish a global symbol table.
In the technical solution provided by step S208 of this application, the lexical unit sequence is associated with the preset objects to establish a global symbol table. The global symbol table records the data information of all preset objects in the code file. Establishing the global symbol table is the process of symbolizing the code file.
Symbolization means that instructions, instruction addresses, constants, variables, registers, and the like are displayed using meaningful, highly readable symbols. A symbol table is a data structure used by language translators. In a symbol table, each identifier in the program source code is bound to its declaration or usage information, such as its data type, scope, and memory address. While it works, a compiler continually collects, records, and uses information such as the types and characteristics of the syntactic symbols in the source program. This information is usually stored in the system in tabular form, for example as a constant table, variable name table, array name table, procedure name table, and label table, which are collectively called the symbol table. The quality of the organization, construction, and management of the symbol table directly affects the running efficiency of the compilation system.
After lexical analysis has been performed on the character sequence in the code file to obtain the lexical unit sequence, and the code file has been parsed to obtain the preset objects, the definitions of the different parts of the same type in the code file need to be associated; that is, the different keyword fragments of the same type in the code file are linked together, thereby establishing the global symbol table. The global symbol table consists of keys and the values corresponding to those keys. It records the data information of all preset objects in the code file, that is, the formatted information of all logical objects in the code, and can be used to scan all of the code in the code file, which improves the accuracy of code scanning.
Step S210: perform a static code scan on the code file according to the global symbol table to obtain a scan result.
In the technical solution provided by step S210 of this application, a static code scan is performed on the code file according to the global symbol table to obtain a scan result, where the scan result includes at least the result of looking up the type of the lexical unit sequence.
After the lexical unit sequence has been associated with the preset objects to establish the global symbol table, a static code scan is performed on the code file according to the global symbol table. In software engineering, static code scanning means that after writing source code, the programmer scans it directly with scanning tools, without compiling it and without setting up its runtime environment. This saves considerable labor and time, improves development efficiency, and uncovers many security vulnerabilities in the source code that manual review alone cannot find, which improves the accuracy of code scanning, greatly reduces the security risk of the project, and improves software quality. The method thus provides accurate and efficient symbolization results to the upper-layer static code check items, so that the check items gain syntax-level, cross-function, semantic-level, and logic analysis capabilities. The code scan result that is finally output helps developers and testers quickly locate problems hidden in the code, which improves code quality and lowers the cost of repairing the code later. The symbolization process of this embodiment fully allows for missing code files, missing type definitions, and syntax errors: it does not need to compile the input code file, nor does it require the code file to compile successfully.
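As a rough illustration of a check item that works without compiling, the sketch below looks up every identifier of a source text in a symbol table and reports the type it finds. The `scan` function, the keyword set, the `<unresolved>` marker, and the table contents are all hypothetical; a missing definition is recorded in the result instead of aborting the scan.

```python
import re
from typing import Dict, List, Tuple

KEYWORDS = {"if", "else", "for", "while", "return"}  # deliberately incomplete
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def scan(source: str, symbols: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Return (line number, identifier, type looked up in the symbol table)."""
    results = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for name in IDENT_RE.findall(line):
            if name in KEYWORDS:
                continue
            # A missing definition is reported, not treated as a fatal error.
            results.append((line_no, name, symbols.get(name, "<unresolved>")))
    return results


# Hypothetical global symbol table: identifier -> fully qualified type name.
GLOBAL_SYMBOLS = {"player": "Game.Player", "hp": "System.Int32"}
```

Running `scan("if (hp) player = enemy;", GLOBAL_SYMBOLS)` resolves `hp` and `player` and flags `enemy` as unresolved, even though the statement was never compiled.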
Optionally, performing a static code scan on the code file according to the global symbol table is applicable to the static code inspection of all C# projects.
Through steps S202 to S210 above, a code file is obtained, the code file being a source program text containing a character sequence; lexical analysis is performed on the character sequence in the code file to obtain a lexical unit sequence; the code file is parsed to obtain preset objects; the lexical unit sequence is associated with the preset objects to establish a global symbol table, which records the data information of all preset objects in the code file; and a static code scan is performed on the code file according to the global symbol table to obtain a scan result that includes at least the result of looking up the type of the lexical unit sequence. This solves the technical problem in the related art that code scanning has low accuracy, and achieves the technical effect of improving the accuracy of code scanning.
As an optional implementation, before the lexical unit sequence is associated with the preset objects to establish the global symbol table, the lexical unit sequence is simplified without changing the logic of the code file, yielding a simplified lexical unit sequence. Associating the lexical unit sequence with the preset objects to establish the global symbol table then includes: associating the simplified lexical unit sequence with the preset objects to establish the global symbol table.
Code from different projects and different programmers follows different coding styles, so code files are diverse, which makes it harder to establish the global symbol table. Before the lexical unit sequence is associated with the preset objects to establish the global symbol table, the lexical unit sequence is simplified without changing the logic of the code file; that is, logically equivalent replacements are applied to the lexical unit sequence to obtain a simplified lexical unit sequence. Simplifying the lexical unit sequence unifies the style of the code files, which reduces the cost of global symbolization and improves the accuracy of upper-layer static scanning. After the lexical unit sequence has been simplified, the simplified lexical unit sequence is associated with the preset objects to establish the global symbol table.
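The source does not list the replacement rules it applies, so the sketch below uses one hypothetical rule purely for illustration: a pair of parentheses that directly wraps another pair is dropped, so that `(((a)))` and `(a)` end up as the same lexical unit sequence. A real simplifier would need grammar knowledge to confirm that each rule preserves the meaning of the code in every context.

```python
from typing import Dict, List


def simplify(tokens: List[str]) -> List[str]:
    """Drop every pair of parentheses that directly wraps another pair."""
    match: Dict[int, int] = {}  # index of "(" -> index of its matching ")"
    stack: List[int] = []
    for i, tok in enumerate(tokens):
        if tok == "(":
            stack.append(i)
        elif tok == ")" and stack:
            match[stack.pop()] = i

    drop = set()
    for start, end in match.items():
        # The pair is redundant when its content is exactly one inner pair.
        if match.get(start + 1) == end - 1:
            drop.update((start, end))
    return [tok for i, tok in enumerate(tokens) if i not in drop]
```

Applied to the tokens of `if ( ( ( a ) ) ) b ;`, the function returns the tokens of `if ( a ) b ;`, while `( a ) ( b )` is left untouched because neither pair wraps the other.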
As an optional implementation, after the lexical unit sequence has been simplified to obtain the simplified lexical unit sequence, a syntax tree structure is built from the simplified lexical unit sequence, where the syntax tree structure is a tree structure for storing data objects that have a preset syntax. Associating the simplified lexical unit sequence with the preset objects to establish the global symbol table then includes: associating the simplified lexical unit sequence with the preset objects according to the syntax tree structure to establish the global symbol table.
The syntax tree structure is the tree-shaped representation of the abstract syntax structure of the code file, that is, an abstract syntax tree (AST). The syntax tree structure is a binary tree: each non-leaf node represents an operator, and its two child nodes represent the two operands of that operator. The syntax tree structure captures the logical structure of an expression and the precedence of its operators, which improves the accuracy of code scenario matching and the efficiency of implementing the scenarios that correspond to the code file. After the lexical unit sequence has been simplified, the syntax tree structure, a tree structure for storing data objects with a preset syntax, is built from the simplified lexical unit sequence. The simplified lexical unit sequence is then associated with the preset objects according to the syntax tree structure to establish the global symbol table.
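A binary expression tree of this kind can be built with precedence climbing, a standard parsing technique that is used here as an assumption; the source does not say which algorithm it uses. The `Node` class, the `PRECEDENCE` table, and the restriction to four left-associative arithmetic operators are choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Union

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}  # higher binds tighter


@dataclass
class Node:
    op: str         # operator held by a non-leaf node
    left: "Tree"    # first operand
    right: "Tree"   # second operand


Tree = Union[Node, str]  # a leaf is just the operand text


def parse(tokens: List[str], min_prec: int = 1) -> Tree:
    """Consume tokens from the front of the list and return the expression tree."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Operators of the same precedence associate to the left.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = Node(op, left, right)
    return left


def parse_atom(tokens: List[str]) -> Tree:
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse(tokens)
        tokens.pop(0)  # discard the closing ")"
        return inner
    return tok
```

For `a + b * c`, the root is `+` with `a` on the left and the subtree for `b * c` on the right, so operator precedence is encoded in the shape of the tree.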
As an optional implementation, the syntax tree structure is built from a single code expression in the lexical unit sequence.
Fig. 3 is a flowchart of a method of building a syntax tree structure from a lexical unit sequence according to an embodiment of the present invention. As shown in Fig. 3, the method of building a syntax tree structure from a lexical unit sequence includes the following steps:
Step S301: obtain a single code expression in the lexical unit sequence.
In the technical solution provided by step S301 of this application, building the syntax tree structure from the lexical unit sequence includes obtaining a single code expression in the lexical unit sequence. The lexical unit sequence is composed of multiple code expressions. After the lexical unit sequence has been simplified to obtain the simplified lexical unit sequence, a single code expression in the lexical unit sequence is obtained.
Step S302: build the syntax tree structure from the single code expression.
In the technical solution provided by step S302 of this application, the syntax tree structure is built from the single code expression. In a traditional compilation process, the syntax tree includes the logical relationships between code expressions, such as the relationship between the if clause and the else clause of an if-else statement. The syntax tree structure in this embodiment differs from the one in a traditional compilation process: it is built only for a single code expression and does not establish structural relationships between code expressions. This embodiment targets code files that are incomplete or that cannot be compiled. For such input, a syntax tree spanning the whole file would be wrong and meaningless as soon as the code file contained a syntax error. By contrast, an error in the syntax tree built for a single expression is only a local error in the code file and does not affect the syntax trees of the other code expressions in the file, which improves the reliability of building the syntax tree structure.
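The isolation property described here can be demonstrated with a small sketch. As a stand-in for the C# expression parser, which the source does not show, it uses Python's own `ast.parse` on Python-style expressions; the `build_trees` function and the choice of `;` as the expression separator are assumptions.

```python
import ast
from typing import Dict, Tuple


def build_trees(source: str) -> Tuple[Dict[int, ast.Expression], Dict[int, str]]:
    """Build one tree per expression; a broken expression only affects itself."""
    trees: Dict[int, ast.Expression] = {}
    errors: Dict[int, str] = {}
    for index, expr in enumerate(source.split(";")):
        expr = expr.strip()
        if not expr:
            continue
        try:
            trees[index] = ast.parse(expr, mode="eval")
        except SyntaxError as exc:
            # Record the failure and carry on with the remaining expressions.
            errors[index] = exc.msg
    return trees, errors
```

With the input `a + b; c * (d; e - 1;`, the second expression has an unclosed parenthesis and lands in `errors`, while the first and third still get their own trees.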
The embodiment is by obtaining the single code expression in lexical unit sequence;Established according to single code expression Tree-like syntactic structure, so as to reach the purpose of the foundation to tree-like syntactic structure, improve tree-like syntactic structure and building Reliability in journey.
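The per-expression approach of steps S301 and S302 can be illustrated with a minimal sketch. It assumes, for illustration only, that lexical units are plain strings, that `;` separates expressions, and that the "tree" is a list nested by parentheses; the function names are hypothetical and are not taken from the embodiment.

```python
def split_expressions(tokens):
    """Split a flat token list into single expressions at ';' boundaries."""
    expr, out = [], []
    for tok in tokens:
        if tok == ";":
            if expr:
                out.append(expr)
            expr = []
        else:
            expr.append(tok)
    if expr:
        out.append(expr)
    return out


def build_tree(expr):
    """Nest tokens by parentheses; raise ValueError if they do not balance."""
    root = []
    stack = [root]
    for tok in expr:
        if tok == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unmatched ')'")
            stack.pop()
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unmatched '('")
    return root


def build_forest(tokens):
    """Build one tree per expression; a broken expression yields None only for itself."""
    forest = []
    for expr in split_expressions(tokens):
        try:
            forest.append(build_tree(expr))
        except ValueError:
            forest.append(None)
    return forest


# The second expression has an unclosed parenthesis; the others still get trees.
tokens = ["a", "=", "f", "(", "x", ")", ";",
          "b", "=", "(", "y", ";",
          "c", "=", "1", ";"]
forest = build_forest(tokens)
```

Because each expression is parsed in isolation, the syntax error in the second expression leaves the trees of the first and third expressions intact, which is the reliability property described above.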
As an optional implementation, the preset object includes multiple preset objects. In step S210, associating the lexical unit sequence with the preset objects to establish the global symbol table is performed by: establishing a preset global object list from the multiple preset objects; obtaining the alias directives in the code file and processing the code in the code file that uses the alias directives, to obtain processed code; and, in the preset global object list, associating the lexical unit sequence with the preset object corresponding to the lexical unit sequence according to the processed code, to obtain the global symbol table.
Fig. 4 is a flowchart of a method for associating a lexical unit sequence with preset objects to establish a global symbol table according to an embodiment of the present invention. As shown in Fig. 4, the method comprises the following steps:
Step S401: establish a preset global object list from the multiple preset objects.
In the technical solution provided by step S401 of the present application, a preset global object list is established from the multiple preset objects. The multiple preset objects may be objects such as the classes, namespaces, methods, and fields corresponding to the code file. The preset global object list is used to associate different keyword fragments of the same type in the code file; that is, it associates the definitions of one type that appear in different parts of the code files.
The code file includes multiple code files. All of the code files are parsed, multiple preset objects such as the classes, namespaces, methods, and fields corresponding to the code files are generated, and the containment relations among the preset objects are established. Optionally, each code file corresponds to one preset object list, which may include Scope objects, CType objects, CSI objects, CNamespace objects, CNum enumeration objects, CFunction objects, CField objects, CDelegate objects, CEvent objects, CProperty objects, CIndexer objects, and CSymbolFile objects. Here, the Scope object is the base class of all objects and corresponds to a piece of code that has logical meaning; a CType object represents a class, method, property, or the like, is a sub-object of the base class, and serves to distinguish the lexical unit section to which it corresponds; a CSI object corresponds to a class (Class), structure (Struct), or interface (Interface) in the C# language; a CNamespace object is a namespace; a CNum enumeration object is an enumeration; a CFunction object is a method; a CField object is a field; a CDelegate object is a delegate; a CEvent object is an event; a CProperty object is a property; a CIndexer object is an indexer; and a CSymbolFile object corresponds to a physical code file, with one CSymbolFile object per code file. All preset objects are then merged, and the global object list is established from keys and values. For example, the CSymbolFile objects of all code files are merged, the fully qualified name of each CSI object is used as the key and the corresponding CSI object as the value, and a global type hash table is established. A hash table is a data structure that is accessed directly by key value (Key Value).
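As a rough illustration of the merge described above, the following sketch builds a global type table keyed by fully qualified name. The class names `CSI` and `CSymbolFile` come from the text, but their fields (`full_name`, `kind`, `members`, `path`, `types`) and the function `build_global_type_table` are assumptions made for this example. Each key maps to a list so that definitions of the same type in different files stay associated.

```python
from dataclasses import dataclass, field


@dataclass
class CSI:
    """A class, struct, or interface definition found in one file (fields are illustrative)."""
    full_name: str
    kind: str
    members: list = field(default_factory=list)


@dataclass
class CSymbolFile:
    """One physical code file and the type definitions parsed from it."""
    path: str
    types: list = field(default_factory=list)


def build_global_type_table(symbol_files):
    """Merge all symbol files into a hash table keyed by fully qualified type name."""
    table = {}
    for symbol_file in symbol_files:
        for csi in symbol_file.types:
            # Definitions of one type spread over several files share a key.
            table.setdefault(csi.full_name, []).append((symbol_file.path, csi))
    return table


files = [
    CSymbolFile("Player.cs", [CSI("Game.Player", "class", ["Move"])]),
    CSymbolFile("Player.Combat.cs", [CSI("Game.Player", "class", ["Attack"])]),
    CSymbolFile("Vec.cs", [CSI("Game.Vec2", "struct", ["X", "Y"])]),
]
table = build_global_type_table(files)
```

Here `table["Game.Player"]` holds both partial definitions, one from each file, so a later lookup sees the members `Move` and `Attack` together.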
Step S402: obtain the alias directives in the code file and process the code in the code file that uses the alias directives, to obtain processed code.
In the technical solution provided by step S402 of the present application, the alias directives in the code file are obtained and the code that uses them is processed, yielding the processed code. An alias directive in a code file introduces an alias, that is, an identifier, for a namespace or a type. The alias is effective within the compilation unit or namespace that directly contains the directive. Because an alias directive interferes with type lookup in the code file, every place in the code file that uses the alias is expanded at the lexical unit level to obtain the processed code.
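The expansion at the lexical unit level can be sketched as follows. This is a simplified illustration, not the embodiment's implementation: lexical units are plain strings, the function name `expand_aliases` is hypothetical, and the scope of an alias (compilation unit or namespace) is ignored, so every alias applies to the whole token list.

```python
def expand_aliases(tokens):
    """Remove 'using Alias = Target;' directives and expand each use of Alias."""
    aliases = {}
    body = []
    i = 0
    while i < len(tokens):
        # Pattern: using <alias> = <qualified name> ;
        if tokens[i] == "using" and i + 2 < len(tokens) and tokens[i + 2] == "=":
            # Tolerate a missing ';' in incomplete code by running to the end.
            end = tokens.index(";", i) if ";" in tokens[i:] else len(tokens)
            aliases[tokens[i + 1]] = tokens[i + 3:end]
            i = end + 1
        else:
            body.append(tokens[i])
            i += 1
    out = []
    for tok in body:
        # A name that follows '.' is a member access, not a use of the alias.
        if tok in aliases and (not out or out[-1] != "."):
            out.extend(aliases[tok])
        else:
            out.append(tok)
    return out


tokens = ["using", "Col", "=", "System", ".", "Collections", ";",
          "Col", ".", "ArrayList", "list", ";"]
expanded = expand_aliases(tokens)
```

After expansion, the declaration reads `System . Collections . ArrayList list ;`, so the later type lookup no longer needs to know about the alias `Col`.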
Step S403: in the preset global object list, perform type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, to obtain the global symbol table.
In the technical solution provided by step S403 of the present application, type association or function association is performed, in the preset global object list and according to the processed code, between the lexical unit sequence and the preset object corresponding to the lexical unit sequence, and the global symbol table is obtained. Type association and function association bind lexical units to the corresponding variable, type, and method objects. As a result, when the lexical unit sequence is traversed, the type of the variable corresponding to the current lexical unit is known, and it can be determined whether the variable is a value type or a reference type. Function association resolves function calls and makes it possible to determine, for example, whether a variable of interest is referenced. This significantly strengthens the upper-layer check items at the semantic level and in tracing function calls.
This embodiment establishes a preset global object list from multiple preset objects, the preset global object list being used to associate different keyword fragments of the same type in the code file; obtains the alias directives in the code file and processes the code that uses them, to obtain processed code; and, in the preset global object list, performs type association or function association between the lexical unit sequence and the corresponding preset object according to the processed code, to obtain the global symbol table. The purpose of associating the lexical unit sequence with the preset objects to establish the global symbol table is thereby achieved.
As an optional implementation, performing type association between the lexical unit sequence and the corresponding preset object according to the processed code includes at least one of the following: associating, according to the processed code, a lexical unit sequence used for a type declaration with the preset object corresponding to that lexical unit sequence; and associating, according to the processed code, the lexical unit sequence corresponding to a variable in the code file with that variable.
Associating a lexical unit sequence used for a type declaration with its corresponding preset object may consist of associating a lexical unit in that sequence with the preset object corresponding to the lexical unit. Type association may also associate the lexical unit sequence corresponding to a variable in the code file with that variable, for example by associating the lexical unit of the variable with the variable's definition. Type association during establishment of the global symbol table is thereby realized.
As an optional implementation, performing function association between the lexical unit sequence and the corresponding preset object according to the processed code includes at least: associating, according to the processed code, the lexical unit sequence corresponding to a preset call method with that preset call method.
Associating the lexical unit sequence corresponding to a preset call method with that preset call method may consist of associating the lexical unit corresponding to the preset call method with the preset call method according to the processed code, thereby realizing function association during establishment of the global symbol table.
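Type association and function association can be pictured as attaching a symbol object to each lexical unit. The sketch below is only one possible reading: the classes `Token`, `Variable`, and `Method`, the `symbol` field, and the function `associate` are hypothetical, declarations are recognized by the simple pattern "type name followed by identifier", and the set of value types is an illustrative subset.

```python
from dataclasses import dataclass
from typing import Optional

VALUE_TYPES = {"int", "float", "double", "bool"}  # illustrative subset only


@dataclass
class Variable:
    name: str
    type_name: str

    @property
    def is_value_type(self):
        return self.type_name in VALUE_TYPES


@dataclass
class Method:
    name: str


@dataclass
class Token:
    text: str
    symbol: Optional[object] = None  # bound type, variable, or method object


def associate(tokens, known_types, methods):
    """Bind tokens to type, variable, and method objects; return declared variables."""
    variables = {}
    for i, tok in enumerate(tokens):
        prev_text = tokens[i - 1].text if i > 0 else None
        next_text = tokens[i + 1].text if i + 1 < len(tokens) else None
        if tok.text in known_types:
            tok.symbol = ("type", tok.text)            # type association: type use
        elif prev_text in known_types:
            variables[tok.text] = Variable(tok.text, prev_text)
            tok.symbol = variables[tok.text]           # type association: declaration
        elif tok.text in variables:
            tok.symbol = variables[tok.text]           # type association: variable use
        elif tok.text in methods and next_text == "(":
            tok.symbol = methods[tok.text]             # function association: call
    return variables


source = "int hp ; Player p ; Heal ( hp ) ;"
tokens = [Token(text) for text in source.split()]
variables = associate(tokens, {"int", "Player"}, {"Heal": Method("Heal")})
```

When a checker later reaches the `hp` token inside the call, `tokens[8].symbol` already tells it that `hp` is an `int`, hence a value type, and `tokens[6].symbol` identifies the called method.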
As an optional implementation, performing type association or function association between the lexical unit sequence and the corresponding preset object according to the processed code is carried out as follows: when the type name found in the preset global object list satisfies a preset condition, type association or function association is performed between the lexical unit sequence and the preset object according to the processed code.
Fig. 5 is a flowchart of a method for performing type association or function association between a lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, according to an embodiment of the present invention. As shown in Fig. 5, the method comprises the following steps:
Step S501: determine the type name of the type of the lexical unit sequence and search for the type name in the preset global object list.
In the technical solution provided by step S501 of the present application, the type name of the type of the lexical unit sequence is determined and searched for in the preset global object list. The type of the lexical unit sequence is obtained; the type of a lexical unit has a type name, so the type name of the lexical unit is determined and then searched for in the preset global object list.
Step S502: judge whether the type name found satisfies a preset condition.
In the technical solution provided by step S502 of the present application, it is judged whether the type name found satisfies the preset condition. The type name of a type in the code file is affected by the surrounding code context. After type names have been found, it is judged whether a found type name satisfies the preset condition; optionally, the type name closest to the object corresponding to the lexical unit is determined to satisfy the preset condition.
Step S503: perform type association or function association between the lexical unit sequence and the preset object according to the processed code.
In the technical solution provided by step S503 of the present application, after it is judged that the type name found satisfies the preset condition, type association or function association is performed between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, that is, the code in which the alias directives have been processed.
This embodiment determines the type name of the type of the lexical unit sequence and searches for it in the preset global object list and, when the type name found is judged to satisfy the preset condition, performs type association or function association between the lexical unit sequence and the preset object according to the processed code. The purpose of performing type association or function association between the lexical unit sequence and its corresponding preset object according to the processed code is thereby achieved, and the performance of the upper-layer check items at the semantic level and in tracing function calls is significantly improved.
As an optional implementation, in step S210, performing static code scanning on the code file according to the global symbol table to obtain the scanning result is carried out as follows: the type name to be found is determined; if the type name to be found is not a preset type name, it is searched for in the base-class sub-object in which the lexical unit section is located; if it is not found there, it is searched for in the preset global object list; and if it is found in the preset global object list and its type qualified name matches the preset qualified name, the type name to be found is returned as the scanning result.
Fig. 6 is a flowchart of a method for performing static code scanning on a code file according to a global symbol table according to an embodiment of the present invention. As shown in Fig. 6, the method comprises the following steps:
Step S601: split a lexical unit section that includes one or more lexical units, to obtain a split result.
In the technical solution provided by step S601 of the present application, a lexical unit section that includes one or more lexical units is split to obtain a split result. A lexical unit section denotes a sequence made up of one or more lexical units. The lexical unit section corresponding to the type to be found is split by level to obtain the split result.
Step S602: push the split result onto a stack.
In the technical solution provided by step S602 of the present application, after the lexical unit section that includes one or more lexical units has been split and the split result obtained, the split result is pushed onto a stack S.
Step S603: determine the type name to be found of the lexical unit sequence from the top element of the stack.
In the technical solution provided by step S603 of the present application, the type name to be found of the lexical unit sequence is determined from the top element of the stack. After the split result has been pushed onto the stack, the top element of the stack is obtained. The top element is the type name of the type to be found and does not include the type qualified name of that type.
Step S604: judge whether the type name to be found is a preset type name.
In the technical solution provided by step S604 of the present application, it is judged whether the type name to be found is a preset type name. The preset type name may be a system type name from the system library of the language itself. After the type name to be found has been determined from the top element of the stack, it is judged whether the type name to be found is a system type name.
Step S605: search for the type name to be found in the base-class sub-object in which the lexical unit section is located.
In the technical solution provided by step S605 of the present application, if the type name to be found is judged not to be a preset type name, it is searched for in the base-class sub-object in which the lexical unit section is located. Optionally, the search starts from the base class CScope of the object, within the physical file, in which the lexical unit section TokenSection is located, and proceeds upward level by level.
Step S606: when the type name to be found is not found in the base-class sub-object in which the lexical unit section is located, search for it in the global symbol table.
In the technical solution provided by step S606 of the present application, when the type name to be found is not found in the base-class sub-object in which the lexical unit section is located, it is searched for in the global symbol table. The global symbol table may include a global type hash list, and the type name to be found is searched for in that global type hash list.
Step S607: when the type name to be found is found in the global symbol table, verify the type qualified name of the type name to be found.
In the technical solution provided by step S607 of the present application, when the type name to be found is found in the preset global object list, its type qualified name is verified. The type qualified name is the name used to qualify the type name to be found. Optionally, if the type name to be found is not found in the preset global object list, the process of performing static code scanning on the code file ends.
Step S608: judge whether the type qualified name matches the preset qualified name.
In the technical solution provided by step S608 of the present application, the preset qualified name is the qualified name to be matched, and it is judged whether the type qualified name in the stack matches the preset qualified name.
Step S609: return the type name to be found as the scanning result.
In the technical solution provided by step S609 of the present application, if the type qualified name is judged to match the preset qualified name, the type name to be found is returned. The returned type name is the type name finally matched.
This embodiment splits a lexical unit section that includes one or more lexical units to obtain a split result; pushes the split result onto a stack; determines the type name to be found of the lexical unit sequence from the top element of the stack; judges whether the type name to be found is a preset type name; if it is not, searches for it in the base-class sub-object in which the lexical unit section is located; if it is not found there, searches for it in the preset global object list; when it is found in the preset global object list, verifies its type qualified name, the type qualified name being the name used to qualify the type name to be found; judges whether the type qualified name matches the preset qualified name; and, if they match, returns the type name to be found as the scanning result. The purpose of performing static code scanning on the code file according to the global symbol table is thereby achieved.
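The lookup order of steps S601 to S609 can be sketched as below. Several details are assumptions for illustration: the lexical unit section is given as a dotted string, `Scope` is a hypothetical stand-in for the CScope hierarchy, the global table is indexed by simple type name (an index assumed to be derived from the global type hash table), and a qualifier "matches" when the enclosing names of a candidate end with the qualifier written in the code.

```python
from dataclasses import dataclass, field
from typing import Optional

SYSTEM_TYPES = {"int", "string", "bool", "object"}  # illustrative preset type names


@dataclass
class Scope:
    """Hypothetical scope node: simple type name -> fully qualified name."""
    name: str
    parent: Optional["Scope"] = None
    types: dict = field(default_factory=dict)


def qualifier_matches(full_name, qualifier):
    """True if the candidate's enclosing names end with the written qualifier."""
    enclosing = full_name.split(".")[:-1]
    return not qualifier or enclosing[-len(qualifier):] == qualifier


def find_type(section, scope, global_types):
    stack = section.split(".")   # S601/S602: split by level and push in order
    name = stack.pop()           # S603: the stack top is the simple type name
    qualifier = stack            # the remaining elements qualify that name
    if name in SYSTEM_TYPES:     # S604: preset (system) type names need no lookup
        return name
    while scope is not None:     # S605: search the enclosing scopes, innermost first
        full = scope.types.get(name)
        if full is not None and qualifier_matches(full, qualifier):
            return full
        scope = scope.parent
    for full in global_types.get(name, []):       # S606: fall back to the global table
        if qualifier_matches(full, qualifier):    # S607/S608: verify the qualifier
            return full                           # S609: return the matched type
    return None


global_types = {"Item": ["Shop.Item", "Game.Inventory.Item"]}
file_scope = Scope("Player.cs")
class_scope = Scope("Player", parent=file_scope, types={"Item": "Game.Player.Item"})

nearest = find_type("Item", class_scope, global_types)
qualified = find_type("Inventory.Item", class_scope, global_types)
```

The unqualified name resolves to the nearest definition, `Game.Player.Item`, while `Inventory.Item` skips that candidate because its qualifier does not match and resolves to `Game.Inventory.Item` from the global table. Plain string matching on `Item` could not tell these apart.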
As an optional implementation, in step S210, performing static code scanning on the code file according to the global symbol table further includes, after the type name to be found is searched for in the base-class sub-object in which the lexical unit section is located: when the type name to be found is found in that base-class sub-object, verifying the type qualified name of the type name to be found; judging whether the type qualified name matches the preset qualified name; and, if they match, returning the type name to be found as the scanning result.
Fig. 7 is a flowchart of another method for performing static code scanning on a code file according to a global symbol table according to an embodiment of the present invention. As shown in Fig. 7, the method comprises the following steps:
Step S701: when the type name to be found is found in the base-class sub-object in which the lexical unit section is located, verify the type qualified name of the type name to be found.
In the technical solution provided by step S701 of the present application, when the type name to be found is found in the base-class sub-object in which the lexical unit section is located, its type qualified name is verified. Optionally, the search starts from the base class CScope of the object, within the physical file, in which the lexical unit section TokenSection is located, and proceeds upward level by level; when the type name to be found is found, its type qualified name is verified.
Step S702: judge whether the type qualified name matches the preset qualified name.
In the technical solution provided by step S702 of the present application, it is judged whether the type qualified name matches the preset qualified name. When the type qualified name of the type name to be found is verified, it is judged whether the type qualified name in the stack matches the preset qualified name.
Step S703: return the type name to be found as the scanning result.
In the technical solution provided by step S703 of the present application, if the type qualified name is judged to match the preset qualified name, the type name to be found is returned as the scanning result. If the type qualified name is judged not to match the preset qualified name, step S605 continues to be executed, and the type name to be found is searched for in the base-class sub-object in which the lexical unit section is located.
In this embodiment, after the type name to be found is searched for in the base-class sub-object in which the lexical unit section is located, and when it is found there, the type qualified name of the type name to be found is verified; it is judged whether the type qualified name matches the preset qualified name; and, if they match, the type name to be found is returned as the scanning result. The purpose of performing static code scanning on the code file according to the global symbol table is thereby achieved.
As an optional implementation, before lexical analysis is performed on the character sequence in the code file to obtain the lexical unit sequence, the code file is preprocessed to obtain preprocessed code, the preprocessed code being a character stream that conforms to preset rules. Performing lexical analysis on the character sequence in the code file to obtain the lexical unit sequence then includes performing lexical analysis on the character sequence in the preprocessed code to obtain the lexical unit sequence.
Preprocessing is the first processing pass over the code file. It filters out code unrelated to the effective code, that is, content irrelevant to the subsequent global symbol table, so that lexical analysis of the code file receives a normalized character sequence. After the code file has been preprocessed and the preprocessed code obtained, lexical analysis is performed on the character sequence in the preprocessed code to obtain the lexical unit sequence.
As an optional implementation, preprocessing the code file includes at least one of the following: filtering whitespace from the code file; deleting the comments of the code file; processing the preprocessing directives of the code file according to a preset configuration; and converting the code file to a unified encoding and outputting an encoded character stream.
Before lexical analysis is performed on the character sequence in the code file to obtain the lexical unit sequence, the code file may be preprocessed in several ways. Whitespace in the code file is filtered to remove unnecessary whitespace. Comments have no effect on the execution of the code file and are therefore deleted. Preprocessing directives of the code file, such as #define, #if, #error, and #line, are processed according to a preset configuration, for example the project configuration. The code file may also be converted to a unified encoding and output as an encoded character stream, for example an ASCII character stream. Preprocessing of the code file is thereby realized.
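A minimal sketch of such a preprocessing pass is given below. It is an illustration under stated assumptions rather than the embodiment's implementation: the function name `preprocess` and the `defined` parameter (standing in for the project configuration) are hypothetical, only `#if SYMBOL`, `#else`, and `#endif` are evaluated while other directives are simply dropped, comment markers inside string literals are not handled, and encoding conversion is omitted.

```python
import re


def preprocess(source, defined=frozenset()):
    """Strip comments, resolve simple #if/#else/#endif blocks, and normalize whitespace."""
    # Delete block comments first, then line comments.
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.S)
    source = re.sub(r"//[^\n]*", "", source)

    kept = []
    active = [True]  # stack of "is the current region compiled in?"
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#if"):
            symbol = stripped[3:].strip()
            active.append(active[-1] and symbol in defined)
        elif stripped.startswith("#else"):
            if len(active) > 1:
                taken = active.pop()
                active.append(active[-1] and not taken)
        elif stripped.startswith("#endif"):
            if len(active) > 1:
                active.pop()
        elif stripped.startswith("#"):
            continue  # other directives are dropped in this sketch
        elif active[-1] and stripped:
            kept.append(re.sub(r"\s+", " ", stripped))
    return "\n".join(kept)


src = """
using System;   // imports
#if DEBUG
int   level = 1;
#else
int level = 0;
#endif
/* block
   comment */ int hp;
"""
debug_build = preprocess(src, {"DEBUG"})
release_build = preprocess(src)
```

With `DEBUG` defined, the output keeps `int level = 1;`; without it, the `#else` branch is kept instead. In both cases comments and redundant whitespace are gone, so the lexer sees a normalized character stream.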
As an optional implementation, in step S204, performing lexical analysis on the character stream in the code file to obtain the lexical unit sequence is realized by reading the character stream of the code file, grouping the character stream into lexemes, and generating the lexical unit sequence from the lexemes.
Fig. 8 is a flowchart of another method for performing lexical analysis on the character stream in a code file according to an embodiment of the present invention. As shown in Fig. 8, the method comprises the following steps:
Step S801: read the character stream of the code file.
In the technical solution provided by step S801 of the present application, the character stream of the code file is read. This may be the code character stream output by preprocessing.
Step S802: group the character stream into lexemes.
In the technical solution provided by step S802 of the present application, the character stream is grouped into lexemes. The code character stream output when the code file was preprocessed may be grouped into lexemes. A lexeme is a unit combining form and meaning that is identified from the standpoint of the immediate constituents of a word or stem; it is not necessarily the smallest such unit, whereas a morpheme is identified solely by whether it is the smallest unit combining form and meaning.
Step S803: generate the lexical unit sequence from the lexemes.
In the technical solution provided by step S803 of the present application, after the character stream has been grouped into lexemes, the lexical unit sequence is generated from the lexemes, with the lexical units in the lexical unit sequence in one-to-one correspondence with the lexemes.
This embodiment reads the character stream of the code file, groups the character stream into lexemes, and generates the lexical unit sequence from the lexemes, with the lexical units in one-to-one correspondence with the lexemes. The purpose of performing lexical analysis on the character stream in the code file to obtain the lexical unit sequence is thereby achieved.
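Steps S801 to S803 correspond to a conventional lexer. The sketch below is a simplified illustration: the token kinds, the keyword subset, and the names `Token` and `tokenize` are assumptions for this example and cover only a small part of C# lexical syntax. Characters that match no rule become `unknown` tokens instead of aborting, in keeping with the goal of tolerating invalid input.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str
    lexeme: str


KEYWORDS = {"class", "int", "if", "else", "return", "using"}  # illustrative subset

TOKEN_RE = re.compile(r"""
    (?P<number>\d+(?:\.\d+)?)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<op>==|!=|<=|>=|&&|\|\||[-+*/%=<>!&|^~.,;:(){}\[\]])
  | (?P<ws>\s+)
""", re.X)


def tokenize(text):
    """Group characters into lexemes and emit one token per lexeme."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            # Keep going on unexpected characters instead of failing.
            tokens.append(Token("unknown", text[pos]))
            pos += 1
            continue
        pos = match.end()
        kind = match.lastgroup
        if kind == "ws":
            continue
        lexeme = match.group()
        if kind == "ident" and lexeme in KEYWORDS:
            kind = "keyword"
        tokens.append(Token(kind, lexeme))
    return tokens


tokens = tokenize("int hp = 10;")
```

Each token keeps its lexeme, so the one-to-one correspondence between lexical units and lexemes described in step S803 is preserved.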
As an optional implementation, after the lexical unit sequence and the preset objects have been associated to establish the global symbol table, a data flow model for simulating the execution flow of the code file is established according to the global symbol table.
The basic idea of building the data flow model is to simulate the execution flow of the code according to the global symbol table. Lexical units in the code that carry literal values can be processed with the global symbol table, and the numeric information is recorded in the data flow field of the corresponding lexical unit. For simple functions, for example functions that return a constant or whose return value is obtained by simple arithmetic, the return value can be computed by simulation according to the global symbol table. Bitwise operators can be processed according to the global symbol table. The standard patterns of for loops can be simulated according to the global symbol table. When a variable is assigned a known value, the subsequent uses of the variable can be tracked according to the global symbol table and the data flow field of the variable updated.
As an optional implementation, establishing the data flow model according to the global symbol table is realized by simulating the execution flow of the code file according to the global symbol table to obtain a simulation result, and establishing the data flow model from the simulation result.
Fig. 9 is a flowchart of a method for establishing a data flow model according to a global symbol table according to an embodiment of the present invention. As shown in Fig. 9, the method comprises the following steps:
Step S901: simulate the execution flow of the code file according to the global symbol table to obtain a simulation result.
In the technical solution provided by step S901 of the present application, the execution logic of the code is simulated according to the global symbol table, and the possible value range of each variable in the current context is inferred and recorded as far as possible, yielding the simulation result.
Step S902: establish the data flow model from the simulation result.
In the technical solution provided by step S902 of the present application, after the execution flow of the code file has been simulated according to the global symbol table and the simulation result obtained, the data flow model is established from the simulation result.
This embodiment simulates the execution flow of the code file according to the global symbol table to obtain a simulation result and establishes the data flow model from the simulation result, thereby achieving the purpose of establishing the data flow model according to the global symbol table.
As an optional implementation, simulating the execution flow of the code file according to the global symbol table to obtain the simulation result includes: when a variable of the code file is compared, determining and recording the value range of the variable according to the global symbol table.
When a variable of the code file is compared, the value range of the variable is determined and recorded according to the global symbol table. If an if conditional statement tests the variable, for example by comparing its magnitude, the possible values of the variable in the current context can be inferred; the uses of the variable are then parsed forward and backward from the if condition, and the data flow fields are updated.
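The inference of value ranges from an if condition can be sketched with simple interval narrowing. This is an illustrative model only: the `Range` class and the function `branch_ranges` are hypothetical, variables are assumed to hold integers, and only comparisons against a constant are handled.

```python
import math
from dataclasses import dataclass


@dataclass
class Range:
    """Closed interval of possible integer values for a variable."""
    lo: float = -math.inf
    hi: float = math.inf

    def narrow(self, op, value):
        """Return the range implied when `variable op value` holds."""
        if op == "==":
            return Range(max(self.lo, value), min(self.hi, value))
        if op == "<":
            return Range(self.lo, min(self.hi, value - 1))
        if op == "<=":
            return Range(self.lo, min(self.hi, value))
        if op == ">":
            return Range(max(self.lo, value + 1), self.hi)
        if op == ">=":
            return Range(max(self.lo, value), self.hi)
        return self  # unsupported operator: no new information


# Operator that holds in the else branch; '==' gives no single interval.
NEGATE = {"<": ">=", "<=": ">", ">": "<=", ">=": "<", "==": None}


def branch_ranges(current, op, value):
    """Ranges of the variable inside the then-branch and the else-branch."""
    then_range = current.narrow(op, value)
    negated = NEGATE.get(op)
    else_range = current.narrow(negated, value) if negated else current
    return then_range, else_range


# A variable known to lie in [0, 100] is tested by `if (x > 10) ... else ...`.
then_range, else_range = branch_ranges(Range(0, 100), ">", 10)
```

Inside the then-branch the recorded range becomes [11, 100] and inside the else-branch [0, 10]; a check item can then flag, for instance, an array access that is only safe for one of the two ranges.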
The embodiment of the present invention obtains a code file, the code file being source text that includes a character sequence; performs lexical analysis on the character sequence in the code file to obtain a lexical unit sequence; parses the code file to obtain preset objects; associates the lexical unit sequence with the preset objects to establish a global symbol table, the global symbol table recording the data information of all preset objects in the code file; and performs static code scanning on the code file according to the global symbol table to obtain a scanning result, the scanning result including at least the result of looking up the type of the lexical unit sequence. The result can be used to develop upper-layer check items that find potential defects, performance problems, and security problems in the code, so that programmers can fix these problems efficiently and at low cost, which improves code quality. The symbolization result is accurate and efficient: a context-sensitive type lookup algorithm is used, which is more accurate than simple string matching, and the data structures are designed to be as small as possible and to make full use of caching, which gives good memory use and efficiency. Code that cannot be compiled can still be detected and analyzed without affecting the overall symbolization flow and result, which improves the accuracy of code scanning. The scheme can be implemented in C++ and can support Windows, Linux, and Mac systems.
Embodiment 2
The technical solution of the present invention is described below with reference to a preferred embodiment.
This embodiment is a compiler-independent global symbolization scheme for the C# language. It provides accurate and efficient symbolization results for upper-layer static code check items, so that the check items gain syntax-level, cross-function, and semantic-level scanning as well as a degree of logic analysis capability. The code scanning results that are finally output help developers and testers quickly locate problems hidden in the code, improve code quality, and reduce later repair costs. The scheme is suitable for static code checking of all projects that use the C# language.
This embodiment takes full account of missing code files, missing type definitions, and syntax errors. The symbolization process therefore does not need to compile the input C# code, nor does it require that the C# code be able to pass compilation.
This embodiment implements a compiler-independent symbolization flow for the C# language; the data structure of the global symbol table and the logic for building the global symbol table; and a context-sensitive lookup algorithm for type lookup and function lookup.
The application scenario of this embodiment is as follows: the code files of a C# program are input; lexical analysis is performed on the C# code, and a lexical unit linked list and an abstract syntax tree are established; variable and function feature information is then extracted, the global symbol table is built, and variable usage rules and function call chains are established; finally, the use of variables in the code is tracked, the possible value ranges of the variables are inferred, and a variable data flow model is established. The symbolization results and call interfaces needed for checking are thereby provided to the upper-layer static code check items.
Figure 10 is a flow chart of another data processing method according to an embodiment of the present invention. As shown in Figure 10, the data processing method comprises the following steps:
Step S1001: preprocess the code file to obtain preprocessed code.
Preprocessing is the first processing step applied to the input code. It filters out content that subsequent symbolization does not need and yields the preprocessed code.
Step S1002: perform lexical analysis on the preprocessed code to obtain a lexical unit sequence.
Lexical analysis is performed on the character string of the preprocessed code text, converting the character string into a lexical unit sequence.
Step S1003: simplify the lexical unit sequence to obtain a simplified lexical unit sequence.
Logically equivalent code replacements are applied to the lexical unit sequence to obtain the simplified lexical unit sequence, which normalizes the code format.
Step S1004: establish a tree-like syntactic structure.
After the lexical unit sequence has been simplified, a tree-like syntactic structure is established. Its construction is similar to building an abstract syntax tree (AST) during compilation.
Step S1005: establish the global symbol table according to the tree-like syntactic structure.
After the tree-like syntactic structure has been established, the global symbol table is built. The global symbol table records the formatted information of all logical objects in the code.
Step S1006: establish a data flow model according to the global symbol table.
After the global symbol table has been established, a data flow model is built from it. The model simulates the execution logic of the code and records the possible value ranges of variables.
In this embodiment, the code file is preprocessed to obtain preprocessed code, lexical analysis is performed on the preprocessed code to obtain a lexical unit sequence, the lexical unit sequence is simplified to obtain a simplified lexical unit sequence, a tree-like syntactic structure is established, the global symbol table is established according to the tree-like syntactic structure, and a data flow model is established according to the global symbol table, which improves the accuracy of code file scanning.
Step S1001, preprocessing the code file to obtain preprocessed code, is described below.
Preprocessing is the first processing step applied to the source text. Its purpose is to provide a normalized character stream for the subsequent lexical analysis and to filter out the parts that are unrelated to valid code. It mainly comprises the following: filtering out redundant whitespace; removing comments; handling preprocessing directives (#define, #if, #error, #line, etc.) according to the project configuration; and unifying the file encoding and outputting an ASCII character stream.
Figure 11 is a schematic diagram of code before the code file is preprocessed according to an embodiment of the present invention. As shown in Figure 11, before preprocessing the code file contains parts that are unrelated to valid code, including: the comment "/* This is the entry point.*/", the preprocessing directive "#if TEST_CMD", the comment "//defind", the preprocessing directive "#else", the inactive statement "Console.WriteLine("TEST_CMD is not defind."); //not defind.", the preprocessing directive "#endif", and redundant whitespace.
Figure 12 is a schematic diagram of the code after the code file has been preprocessed according to an embodiment of the present invention. As shown in Figure 12, the parts unrelated to valid code have been filtered out. After redundant whitespace has been filtered, comments removed, preprocessing directives handled according to the project configuration, and the file encoding unified, the preprocessed code no longer contains the items shown in Figure 11: the comment "/* This is the entry point.*/", the preprocessing directive "#if TEST_CMD", the comment "//defind", the preprocessing directive "#else", the inactive statement "Console.WriteLine("TEST_CMD is not defind."); //not defind.", the preprocessing directive "#endif", or the redundant whitespace. A normalized character stream is thus provided for the subsequent lexical analysis.
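The preprocessing step can be illustrated with a minimal Python sketch. It is not the patented implementation (which the text says can be written in C++); the function name preprocess, the parameter defined (the set of symbols taken from the project configuration) and the handling of only #if/#else/#endif are assumptions made for illustration. Comment markers inside string literals are not handled.

```python
import re

def preprocess(source: str, defined: set) -> str:
    """Strip comments, resolve #if/#else/#endif, and collapse extra whitespace."""
    # Remove /* ... */ block comments and // line comments.
    # Simplification: comment markers inside string literals are not handled.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", "", source)

    kept = []
    active = []  # one flag per open #if: is the current branch active?
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#if"):
            active.append(stripped[3:].strip() in defined)
        elif stripped.startswith("#else"):
            active[-1] = not active[-1]
        elif stripped.startswith("#endif"):
            active.pop()
        elif stripped.startswith("#"):
            continue  # other directives are simply dropped in this sketch
        elif stripped and all(active):
            kept.append(re.sub(r"\s+", " ", stripped))
    return "\n".join(kept)
```

With a hypothetical input modeled on Figure 11 and TEST_CMD defined, only the statement in the #if branch survives; the comment, the directives and the #else branch are filtered out.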
Step S1002, performing lexical analysis on the preprocessed code to obtain a lexical unit sequence, is described below.
The main task of lexical analysis is to read the character stream output by preprocessing, group the characters into lexemes, and generate and output a sequence of lexical units (Tokens) from the lexemes, each lexical unit corresponding to one lexeme. The whole lexical unit sequence is the Token list. The Token list is the basic data structure used by subsequent processing and by the upper-layer check items to traverse the code; it is the set of all Tokens generated from one code file by lexical analysis.
Figure 13 is a schematic diagram of code on which lexical analysis is performed to obtain a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 13, lexical analysis of the preprocessed code "for (int index=0;index<42;++index)" yields the lexical unit sequence: "for", "(", "int", "index", "=", "0", ";", "index", "<", "42", ";", "++", "index", ")".
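The tokenization shown in Figure 13 can be reproduced with a small regular-expression lexer. This is a sketch only: the names TOKEN_RE and tokenize are hypothetical, and the pattern covers just integers, identifiers and a handful of operators rather than the full C# lexical grammar.

```python
import re

# Hypothetical, deliberately small token pattern (not the full C# grammar).
TOKEN_RE = re.compile(r"""
    \d+                         # integer literal
  | [A-Za-z_]\w*                # identifier or keyword
  | \+\+ | -- | <= | >= | == | != | =>   # two-character operators
  | \S                          # any other single non-space character
""", re.VERBOSE)

def tokenize(code: str) -> list:
    """Split a code string into a flat list of lexeme strings."""
    return TOKEN_RE.findall(code)

print(tokenize("for (int index=0;index<42;++index)"))
```

Longer operators are listed before the single-character fallback so that "++" is emitted as one lexical unit rather than two "+" units.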
The lexical unit (Token) is the most basic unit for syntactic analysis and rule scanning. Code scanning comprises upper-layer scanning and bottom-layer scanning: the bottom-layer scanning comprises the technical solution of the present invention, and the upper-layer scanning is rule scanning, for example detecting division by zero, so that code defects can be found effectively. Besides the string value of its lexeme, a lexical unit Token carries other associated attributes, for example: a pointer to the next Token; a pointer to the previous Token; a pointer to the "paired" Token (for a left bracket, the pointer to the matching right bracket); the Token type (number, string, variable, function, keyword, etc.); a pointer from the Token into the symbol table (a variable Token points to the variable object in the symbol table, and a function Token points to its corresponding function object); the line number of the Token; the syntax tree structure pointers of the Token (which maintain the abstract syntax tree structure of the Token); and the data flow structure pointer of the Token.
The lexical unit sequence TokenList is essentially a doubly linked list that maintains all lexical unit Tokens.
Figure 14 is a schematic diagram of a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 14, the lexical unit sequence TokenList of the code "if (i > 0)" is a doubly linked list. The forward links run "if" → "(" → "i" → ">" → "0" → ")", the backward links run ")" → "0" → ">" → "i" → "(" → "if", and "(" is paired with ")".
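A minimal sketch of such a doubly linked Token list, including the bracket pairing pointer, might look as follows. The class and field names (Token, prev, next, link) and the helper build_token_list are illustrative assumptions; the symbol table, syntax tree and data flow pointers listed above are omitted.

```python
class Token:
    """One lexical unit in a doubly linked list (illustrative field names)."""
    def __init__(self, text, line=0):
        self.text = text
        self.line = line
        self.prev = None   # previous Token
        self.next = None   # next Token
        self.link = None   # paired bracket Token, if any

def build_token_list(texts):
    """Link the lexemes into a doubly linked list and pair up brackets."""
    pairs = {")": "(", "]": "[", "}": "{"}
    head = tail = None
    open_stack = []
    for text in texts:
        tok = Token(text)
        if tail is None:
            head = tok
        else:
            tail.next, tok.prev = tok, tail
        tail = tok
        if text in pairs.values():
            open_stack.append(tok)
        elif text in pairs and open_stack and open_stack[-1].text == pairs[text]:
            opener = open_stack.pop()
            opener.link, tok.link = tok, opener
    return head

head = build_token_list(["if", "(", "i", ">", "0", ")"])
```

Here head.next is the "(" Token and head.next.link is the matching ")" Token, so a check item can jump over a bracketed region in one step instead of scanning it.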
Step S1003, simplifying the lexical unit sequence to obtain a simplified lexical unit sequence, is described below.
Different projects and different programmers have different coding styles, which in practice makes code files highly diverse. This diversity complicates the construction of the global symbol table. To reduce the cost of implementing global symbolization and to improve the accuracy of upper-layer static scanning, several simplification steps are needed after the lexical unit sequence (Token list) has been built, so that the code style of the code files is unified.
None of the simplification steps of this embodiment may change the logic of the code; they may only replace code in the code file with logically equivalent code.
Figure 15 is a schematic diagram of simplifying a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 15, C# platform types are simplified to basic types and the default keyword is simplified: the code "System.Int32 i1 = default(int);" is simplified to the lexical unit sequence "int i1 = 0;".
Figure 16 is a schematic diagram of another simplification of a lexical unit sequence according to an embodiment of the present invention. As shown in Figure 16, the lambda expression "i => i + 5" is simplified to the canonical form "(i) => { return i + 5; }".
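The Figure 15 simplification can be sketched as a token-level rewrite. The two lookup tables and the function name simplify are assumptions introduced for this example; only a few platform types are listed, and the lambda normalization of Figure 16 is not covered.

```python
# Hypothetical rewrite tables; a real implementation would cover more types.
PLATFORM_ALIASES = {
    ("System", ".", "Int32"): "int",
    ("System", ".", "Boolean"): "bool",
    ("System", ".", "String"): "string",
}
DEFAULT_VALUES = {"int": "0", "bool": "false", "string": "null"}

def simplify(tokens):
    """Replace platform type names and default(T) with equivalent tokens."""
    out = []
    i = 0
    while i < len(tokens):
        w3 = tuple(tokens[i:i + 3])
        w4 = tokens[i:i + 4]
        if w3 in PLATFORM_ALIASES:
            out.append(PLATFORM_ALIASES[w3])       # System.Int32 -> int
            i += 3
        elif (len(w4) == 4 and w4[0] == "default" and w4[1] == "("
              and w4[2] in DEFAULT_VALUES and w4[3] == ")"):
            out.append(DEFAULT_VALUES[w4[2]])      # default(int) -> 0
            i += 4
        else:
            out.append(tokens[i])
            i += 1
    return out

print(simplify(["System", ".", "Int32", "i1", "=", "default", "(", "int", ")", ";"]))
```

Both rewrites are logically equivalent replacements, which is the constraint the embodiment places on every simplification step.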
Step S1004, establishing the tree-like syntactic structure, is described below.
The tree-like syntactic structure, that is, the abstract syntax tree, is a tree representation of the abstract syntactic structure of the C# code. It is a binary tree in which each non-leaf node represents an operator and its two child nodes represent the two operands of that operator. The tree-like syntactic structure captures the logical structure of an expression and the precedence relations between its operators; this property improves the accuracy of code scene matching and the efficiency of implementing it.
It should be noted that the abstract syntax tree structure of this embodiment differs from the one built during conventional compilation. A conventional compiler's abstract syntax tree also establishes the logical relations between code expressions, for example the relation between the if branch and the else branch of an if-else statement, whereas the abstract syntax tree in this embodiment only captures the abstract syntactic structure within a single code expression and does not establish structural relations between expressions. This embodiment accepts incomplete or non-compilable C# code as input. If the input C# code contained a syntax error, a global abstract syntax tree built from it would be wrong and of no reference value; when an abstract syntax tree is built per expression, an error stays local and does not affect the abstract syntax trees of other expressions.
Figure 17 is a schematic diagram of an abstract syntax tree structure according to an embodiment of the present invention. As shown in Figure 17, the abstract syntax tree structure is the tree structure of the code "String.Format("{0}{1}{2}{3}", Func(1, 2), "tsc#", 1+2*3)".
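How a per-expression binary tree encodes operator precedence can be shown on the sub-expression 1+2*3 from Figure 17. The sketch below uses precedence climbing, which is one common technique and is not stated to be the method used in the patent; Node, PRECEDENCE, parse_expression and to_sexpr are hypothetical names, and only four binary arithmetic operators are supported.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

class Node:
    """Binary AST node: operators have two children, operands have none."""
    def __init__(self, text, left=None, right=None):
        self.text, self.left, self.right = text, left, right

def parse_expression(tokens, pos=0, min_prec=1):
    """Precedence climbing over a flat token list; returns (node, next_pos)."""
    left = Node(tokens[pos])
    pos += 1
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Parse the right operand with a higher minimum precedence so that
        # operators of equal precedence associate to the left.
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = Node(op, left, right)
    return left, pos

def to_sexpr(node):
    """Render the tree in prefix form for inspection."""
    if node.left is None:
        return node.text
    return f"({node.text} {to_sexpr(node.left)} {to_sexpr(node.right)})"

tree, _ = parse_expression(["1", "+", "2", "*", "3"])
print(to_sexpr(tree))  # (+ 1 (* 2 3))
```

The multiplication ends up as the right child of the addition, so a check item walking the tree sees the evaluation order directly. Because each expression is parsed on its own, a malformed expression elsewhere in the file does not affect this tree.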
Step S1005, establishing the global symbol table according to the tree-like syntactic structure, is described below.
The global symbol table is the basis for determining variable types and for tracking function calls during code scanning. It is more powerful than symbolization schemes based on text matching and regular expressions.
Figure 18 is a flow chart of a method for establishing the global symbol table according to an embodiment of the present invention. As shown in Figure 18, the method for establishing the global symbol table comprises the following steps:
Step S1801: parse all code files in turn, generate the corresponding objects such as classes, namespaces, methods and fields, and establish the corresponding containment relations. Each code file corresponds to one CSymbolFile object.
According to the syntax elements of the C# language, the CSymbolFile object list is defined as shown in Table 1.
Table 1: CSymbolFile object list
Step S1802: merge all CSymbolFile objects, and establish a global type hash table with the fully qualified name of each CSI object as the key and the corresponding CSI object as the value.
This step mainly solves the problem of associating types declared with the partial keyword.
Step S1803: handle the alias directives in the code.
The using alias directives in the code are handled in preparation for the subsequent type association.
Step S1804: associate type declaration Tokens with the corresponding type objects, and associate variable Tokens with the variable definitions.
A lexical unit Token used for a type declaration is associated with the corresponding type object, and a lexical unit Token corresponding to a variable in the code file is associated with that variable.
Step S1805: associate method call Tokens with the corresponding method objects.
A method call Token is associated with the corresponding method object, that is, a lexical unit Token corresponding to a preset call method is associated with that call method, which completes the establishment of the global symbol table.
In this embodiment, all code files are parsed in turn, the corresponding objects such as classes, namespaces, methods and fields are generated, and the corresponding containment relations are established, each code file corresponding to one CSymbolFile object; all CSymbolFile objects are merged, and a global type hash table is established with the fully qualified name of each CSI object as the key and the corresponding CSI object as the value; the alias directives in the code are handled; type declaration Tokens are associated with the corresponding type objects and variable Tokens with the variable definitions; and method call Tokens are associated with the corresponding method objects. The global symbol table is thereby established.
Figure 19 is a schematic diagram of the classes of the objects according to an embodiment of the present invention. As shown in Figure 19, these are the classes of the objects listed in Table 1. The inheritance relations between the classes are chosen mainly according to whether the corresponding code constructs have a similar structure. CSI represents a class (Class), struct (Struct) or interface (Interface) object; these three kinds of types are structurally similar at the code level and are therefore combined. A CSymbolFile object represents one physical code file. It inherits from the namespace class CNamespace because the top level of a code file can be regarded as an implicit global namespace "namespace Global { ... }", and the two have the same logical meaning.
Figure 20 is a schematic diagram of a global type hash table according to an embodiment of the present invention. The C# language supports the partial keyword, that is, the definition of one class or method may be spread over multiple code files, so the different parts of the definition of the same type need to be associated. While the global type hash table is being established, the different partial fragments of the same type are chained together. As shown in Figure 20, the global type hash table consists of keys (Key) and values (Value): the key Namespace.A is associated with class A, the key Namespace.B with the partial fragments of class B, and the key Namespace.C with class C.
Besides solving the association of partial objects, the global type hash table also solves the problems of type lookup and type traversal.
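The chaining of partial fragments under one fully qualified name can be sketched with a dictionary of lists. The function name build_global_type_table and the (qualified name, fragment) input shape are assumptions for illustration; in the patent the values are CSI objects rather than strings.

```python
from collections import defaultdict

def build_global_type_table(files):
    """Merge per-file declarations into one table keyed by fully qualified name.

    files: iterable of lists of (fully_qualified_name, fragment) pairs,
    one list per code file. Fragments of a partial type share the same key
    and are chained in the order in which the files are merged.
    """
    table = defaultdict(list)
    for declarations in files:
        for qualified_name, fragment in declarations:
            table[qualified_name].append(fragment)
    return dict(table)

# Two hypothetical files; Namespace.B is a partial class split across both.
file1 = [("Namespace.A", "class A"), ("Namespace.B", "partial class B (part 1)")]
file2 = [("Namespace.B", "partial class B (part 2)"), ("Namespace.C", "class C")]
table = build_global_type_table([file1, file2])
```

A single dictionary lookup on "Namespace.B" then returns every fragment of the type, and iterating over the table visits every type once, which corresponds to the type lookup and type traversal uses mentioned above.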
Alias directives in C# code interfere with the subsequent type lookup. The alias directives in the code therefore need to be parsed first, and every place that uses an alias is then expanded at the Token level.
Figure 21 is a schematic diagram of code in which alias directives are handled according to an embodiment of the present invention. As shown in Figure 21, the code using an alias directive is "using A = N1.N2.A; class B : A { }". After the alias directive has been handled, the code becomes "class B : N1.N2.A { }", so that the alias directive no longer interferes with the subsequent type lookup.
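The Token-level expansion of Figure 21 can be sketched as two passes over a flat token list: one that collects and removes the using alias directives, and one that substitutes each alias use. The function name expand_aliases is hypothetical, and the sketch assumes every directive is terminated by ";" and ignores scoping of aliases to a namespace.

```python
def expand_aliases(tokens):
    """Remove 'using X = A.B.C;' directives and expand uses of X."""
    aliases = {}
    body = []
    i = 0
    while i < len(tokens):
        # A using alias directive has the shape: using <name> = <target> ;
        if tokens[i] == "using" and i + 2 < len(tokens) and tokens[i + 2] == "=":
            end = tokens.index(";", i)          # assumes the directive is terminated
            aliases[tokens[i + 1]] = tokens[i + 3:end]
            i = end + 1
        else:
            body.append(tokens[i])
            i += 1
    out = []
    for k, tok in enumerate(body):
        # Do not expand a name that follows '.', it is a member, not the alias.
        if tok in aliases and (k == 0 or body[k - 1] != "."):
            out.extend(aliases[tok])
        else:
            out.append(tok)
    return out

tokens = ["using", "A", "=", "N1", ".", "N2", ".", "A", ";",
          "class", "B", ":", "A", "{", "}"]
print(" ".join(expand_aliases(tokens)))  # class B : N1 . N2 . A { }
```

After expansion the base type of B is written with its full qualification, so the later type lookup never has to consult the alias.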
Type association and function association bind each lexical unit Token to its corresponding variable, type or method object. The point of doing so is that, while traversing the lexical unit sequence TokenList, it is possible to know the type of the variable corresponding to the current lexical unit Token, whether that variable is a value type or a reference type, whether the implementation of a called function references a variable of interest, and so on. This significantly improves the semantic-level and function-tracking capabilities of the upper-layer check items.
The key to type association and function association is type lookup. In the C# language, the concrete type represented by a type Token depends on the current context. Figure 22 is a schematic diagram of the type lookup of a lexical unit according to an embodiment of the present invention. As shown in Figure 22, the base class A of class B could in theory refer to either N1.A or N2.A. In practice, the C# compiler applies a "nearest principle" in such cases, that is, it preferentially matches the "nearest" type; in the example above the compiler resolves A to N2.A.
This embodiment follows the same approach when performing type lookup and function lookup: based on the generated global symbol table, it implements a context-sensitive type lookup algorithm, that is, it performs static code scanning on the code file. This is an important guarantee of the correctness of the scanning result.
Figure 23 is a flow chart of a method for performing static code scanning on a code file according to an embodiment of the present invention. As shown in Figure 23, the method comprises the following steps:
Step S2301: split the lexical unit segment corresponding to the type by level and push the parts onto a stack.
Step S2302: take an element out of the stack to determine the type name.
The element taken out of the stack is the type name N of the type, without the qualified name of the type.
Step S2303: judge whether the type name is a system type.
If the type name is judged to be a system type, the static code scanning flow for the code file ends; if the type name is judged not to be a system type, step S2304 is performed.
Step S2304: starting from the base-class sub-object in which the lexical unit segment is located, search for the type name N from the bottom up.
If the type name is judged not to be a system type, the search for the type name N starts from the base-class sub-object in which the lexical unit segment is located and proceeds from the bottom up.
Step S2305: judge whether the type name N has been found.
After the bottom-up search starting from the base-class sub-object in which the lexical unit segment is located, it is judged whether the type name N has been found. If the type name N has been found, step S2308 is performed; if it has not been found, step S2306 is performed.
Step S2306: search for the type name N in the global type hash table.
When the type name N has not been found, it is searched for in the global type hash table.
Step S2307: judge whether the type name N has been found in the global type hash table.
If the type name N has been found in the global type hash table, step S2308 is performed; if it has not been found there, the static code scanning flow for the code file ends.
Step S2308: verify the type qualified name in the stack.
Once the type name N has been found, the type qualified name remaining in the stack is verified.
Step S2309: judge whether the type qualified name in the stack matches the preset type qualified name.
If the type qualified name in the stack matches the preset type qualified name, step S2310 is performed; if it does not match, the static code scanning flow for the code file ends.
Step S2310: return the matched type.
The matched type is the result of the type lookup, that is, the result of the code scanning.
In this embodiment, the lexical unit segment corresponding to the type is split by level and pushed onto a stack; an element is taken out of the stack to determine the type name; it is judged whether the type name is a system type; if not, the type name N is searched for from the bottom up, starting from the base-class sub-object in which the lexical unit segment is located; it is judged whether the type name N has been found; if not, the type name N is searched for in the global type hash table and it is judged whether it has been found there; once the type name N has been found, the type qualified name in the stack is verified and it is judged whether it matches the preset type qualified name; if it matches, the matched type is returned. Static code scanning of the code file is thereby performed, which improves the accuracy of code file scanning.
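The "nearest principle" lookup can be approximated with a short sketch. It condenses steps S2301 to S2310 into a single outward walk over the enclosing scopes and does not model the stack or the base-class sub-objects explicitly; resolve_type, SYSTEM_TYPES and the representation of the global type table as a set of fully qualified names are assumptions, and the layout assumed for Figure 22 (class A in both N1 and N2, class B in N2) is inferred from the text.

```python
# A few built-in type names standing in for the "system type" check.
SYSTEM_TYPES = {"int", "string", "bool", "object", "double"}

def resolve_type(reference, scope, global_types):
    """Resolve a type reference as seen from inside `scope`.

    reference:    the type as written, e.g. "A" or "N1.A"
    scope:        fully qualified name of the enclosing type, e.g. "N2.B"
    global_types: set of fully qualified type names (global type table keys)
    Returns the fully qualified match, or None if nothing matches.
    """
    parts = reference.split(".")            # qualifiers followed by the type name
    if len(parts) == 1 and parts[0] in SYSTEM_TYPES:
        return reference                    # system type: no lookup needed
    scope_parts = scope.split(".") if scope else []
    # Walk outward from the innermost enclosing scope; the first hit is the
    # "nearest" type. The qualifiers are verified because the whole
    # qualified candidate must exist in the table.
    for depth in range(len(scope_parts), -1, -1):
        candidate = ".".join(scope_parts[:depth] + parts)
        if candidate in global_types:
            return candidate
    return None

types = {"N1.A", "N2.A", "N2.B"}
print(resolve_type("A", "N2.B", types))     # N2.A
```

From inside N2.B the unqualified name A resolves to N2.A rather than N1.A, which matches the compiler behavior described above, while the qualified reference N1.A still resolves to N1.A.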
Embodiment 3
For the application environment of this embodiment of the present invention, reference may be made, without limitation, to the application environment in the above embodiments; details are not repeated here. This embodiment of the present invention provides an optional specific application for implementing the above data processing method.
Figure 24 is a schematic diagram of source code according to an embodiment of the present invention. For the source code shown in Figure 24, a code file is obtained, the code file being source program text that includes a character string; lexical analysis is performed on the character string in the code file to obtain a lexical unit sequence; the code file is parsed to obtain preset objects; the lexical unit sequence is associated with the preset objects to establish a global symbol table, which records the data information of all preset objects in the code file; and static code scanning is performed on the code file according to the global symbol table to obtain a scanning result that includes at least a lookup result for the type of the lexical unit sequence. The output shown in Figure 25 is obtained, where Figure 25 is a schematic diagram of the debugging file output by symbolization according to an embodiment of the present invention. In it, identical variable IDs denote the same variable and can be associated with the definition of that variable, and the definition can be referenced directly by the function Func.
After static code scanning has been performed on the code file according to the global symbol table and the scanning result has been obtained, a data flow model is established. The basic idea of the data flow model is to simulate the execution flow of the code and to infer and record, as far as possible, the possible value range of each variable in the current context.
Literal numeric Tokens in the code are handled by recording the numeric information in the data flow field of the Token. For simple functions (those that return a constant or whose return value involves only simple arithmetic operations), the return value is computed by simulation. The bit operator & is handled: for the expression "i = j & 4;", the result of variable i can only be 4 or 0. For the standard form of a for loop, for example "for (int i = 0; i < 10; ++i)", the progression of i from 0 to 9 is simulated. When a variable is assigned a known value, the subsequent use of the variable is tracked and its data flow field is updated. If an if conditional statement compares the magnitude of a variable, the possible values of the variable in the current context can be inferred; the use of the variable is then analyzed forward and backward from the if condition, and the corresponding data flow fields are updated.
Figure 26 is a schematic diagram of code for which a data flow model is established according to an embodiment of the present invention. As shown in Figure 26, from "int i = 10; if (j > 0) i = 42; j = 1" it follows that the value range of variable i in the current context is {10, 42}.
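Two of the data flow rules above can be sketched in a few lines. Both helpers are illustrative assumptions rather than the patented model: possible_values takes a pre-digested list of (variable, value, conditional) assignments instead of real Tokens, and and_results enumerates the values that "j & mask" can take for a non-negative constant mask when j is unknown.

```python
def possible_values(assignments):
    """Collect the set of values each variable may hold after the statements.

    assignments: ordered list of (variable, value, conditional) triples.
    An unconditional assignment replaces the set; a conditional one only
    adds to it, because the guarded branch may or may not execute.
    """
    values = {}
    for var, value, conditional in assignments:
        if conditional:
            values.setdefault(var, set()).add(value)
        else:
            values[var] = {value}
    return values

def and_results(mask):
    """All values that (j & mask) can take for unknown j and mask >= 0."""
    results = set()
    sub = mask
    while True:
        results.add(sub)              # every result is a sub-mask of mask
        if sub == 0:
            break
        sub = (sub - 1) & mask        # next smaller sub-mask
    return results

# int i = 10; if (j > 0) i = 42; j = 1
flow = possible_values([("i", 10, False), ("i", 42, True), ("j", 1, False)])
print(flow["i"])          # the set containing 10 and 42
print(and_results(4))     # the set containing 0 and 4
```

The first call reproduces the {10, 42} range from Figure 26, and the second reproduces the statement that "i = j & 4" can only yield 4 or 0.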
This embodiment of the present invention finds potential problems in code, so that programmers can fix them efficiently and at low cost and improve code quality. The symbolization result is accurate and efficient: a context-sensitive type lookup algorithm is used, which is more accurate than simple string matching, and the data structures are designed to keep structure sizes as small as possible and to make full use of caching, which gives good memory and time performance. This embodiment does not rely on compilation, so code that fails to compile can still be detected and analyzed without affecting the overall symbolization flow or its result. It can be implemented in C++ and currently supports Windows, Linux and Mac systems.
It should be noted that the technical solution of the present invention supports extension to multiple languages. Besides the C# language, it can in principle support other similar languages such as Java and C/C++, all of which are C-like languages. Java is very similar to C# in both syntax and coding style, so the cost of supporting Java with this solution would be relatively small; C/C++ relies on header file inclusion, so some details of the solution might need to change. The technical solution of the present invention can likewise support intermediate languages. C# usually runs on a managed virtual machine (.NET or Mono) and can be compiled into an intermediate language (IL). The data structures and interfaces of the global symbol table in this solution can therefore be used directly to perform global symbolization of the IL corresponding to C#, and the user only needs to provide the compiled assembly files (dll files, exe files, etc.). In that case type lookup and function lookup are easier to implement, and the precision of the data flow model and the AST can also be improved.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention.
Embodiment 4
According to an embodiment of the present invention, a data processing device for implementing the above data processing method is also provided. Figure 27 is a schematic diagram of a data processing device according to an embodiment of the present invention. As shown in Figure 27, the data processing device includes: a first acquisition unit 10, an analysis unit 20, a parsing unit 30, an association unit 40 and a scanning unit 50.
The first acquisition unit 10 is configured to obtain a code file, where the code file is source program text that includes a character string.
The analysis unit 20 is configured to perform lexical analysis on the character string in the code file to obtain a lexical unit sequence.
The parsing unit 30 is configured to parse the code file to obtain preset objects.
The association unit 40 is configured to associate the lexical unit sequence with the preset objects to establish a global symbol table, where the global symbol table records the data information of all preset objects in the code file.
The scanning unit 50 is configured to perform static code scanning on the code file according to the global symbol table to obtain a scanning result, where the scanning result includes at least a lookup result for the type of the lexical unit sequence.
It should be noted that the first acquisition unit 10 in this embodiment can be used to perform step S202 in Embodiment 1 of the present application, the analysis unit 20 can be used to perform step S204 in Embodiment 1, the parsing unit 30 can be used to perform step S206 in Embodiment 1, the association unit 40 can be used to perform step S208 in Embodiment 1, and the scanning unit 50 can be used to perform step S210 in Embodiment 1.
Figure 28 is a schematic diagram of another data processing device according to an embodiment of the present invention. As shown in Figure 28, the data processing device includes: a first acquisition unit 10, an analysis unit 20, a parsing unit 30, an association unit 40 and a scanning unit 50. The data processing device further includes a second acquisition unit 60.
It should be noted that the first acquisition unit 10, the analysis unit 20, the parsing unit 30 and the association unit 40 of this embodiment have the same functions as in the data processing device shown in Figure 27; details are not repeated here.
The second acquisition unit 60 is configured to simplify the lexical unit sequence, without changing the logic of the code file, before the lexical unit sequence is associated with the preset objects to establish the global symbol table, so as to obtain a simplified lexical unit sequence.
The association unit 40 is configured to associate the simplified lexical unit sequence with the preset objects to establish the global symbol table.
Optionally, the data processing device further includes a first establishing unit, configured to establish a tree-like syntactic structure according to the simplified lexical unit sequence after the lexical unit sequence has been simplified, where the tree-like syntactic structure is a tree structure for storing data objects having a preset syntax. The association unit 40 is then configured to associate the simplified lexical unit sequence with the preset objects according to the tree-like syntactic structure to establish the global symbol table.
Optionally, the first establishing unit includes a first acquisition module and a first establishing module. The first acquisition module is configured to obtain a single code expression in the lexical unit sequence; the first establishing module is configured to establish the tree-like syntactic structure according to the single code expression.
Optionally, the default object includes multiple default objects, and the association unit 40 includes a second establishing module, a second acquisition module, and an association module. The second establishing module is configured to establish a default global object list according to the multiple default objects, wherein the default global object list is used to associate different keyword fragments of the same type in the code file; the second acquisition module is configured to obtain the alias instructions in the code file and to process the code in the code file that uses the alias instructions, obtaining processed code; the association module is configured to perform, in the default global object list and according to the processed code, type association or function association between the lexical unit sequence and the default object corresponding to the lexical unit sequence, obtaining the global symbol table.
Optionally, the association module is configured to perform at least one of the following: associating, according to the processed code, a lexical unit sequence used for a type declaration with the default object corresponding to that lexical unit sequence; associating, according to the processed code, a lexical unit sequence corresponding to a variable in the code file with that variable.
Optionally, the association module is further configured to associate, according to the processed code, a lexical unit sequence corresponding to a default call method with that default call method.
Optionally, the association module includes a determination submodule, a judging submodule, and an association submodule. The determination submodule is configured to determine the type name of the type of the lexical unit sequence and to search for the type name in the default global object list; the judging submodule is configured to judge whether the found type name meets a preset condition; and the association submodule is configured to perform, when the found type name is judged to meet the preset condition, type association or function association between the lexical unit sequence and the default object according to the processed code.
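The association described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of this embodiment: aliases are modelled as a plain dictionary (as might be collected from typedef or using statements), the default global object list is modelled as a set of known type names, and the preset condition is reduced to simple membership in that set. The names resolve_alias and build_symbol_table are hypothetical.

```python
def resolve_alias(name, aliases):
    """Follow alias chains until a non-alias type name is reached."""
    seen = set()
    while name in aliases and name not in seen:  # 'seen' guards against alias cycles
        seen.add(name)
        name = aliases[name]
    return name


def build_symbol_table(declarations, aliases, known_types):
    """Associate each declared variable with its resolved type.

    declarations: iterable of (variable_name, declared_type_name) pairs.
    known_types:  stand-in for the default global object list (assumption).
    """
    table = {}
    for variable, type_name in declarations:
        resolved = resolve_alias(type_name, aliases)
        # Assumed "preset condition": the resolved type must be a known type.
        if resolved in known_types:
            table[variable] = resolved
    return table


aliases = {"Str": "String", "String": "std::string"}
known_types = {"std::string", "int"}
declarations = [("s", "Str"), ("n", "int"), ("u", "Unknown")]
symbol_table = build_symbol_table(declarations, aliases, known_types)
```

In this sketch the variable s is recorded with the underlying type std::string rather than with its alias, which is the effect the alias processing step is meant to achieve; the variable u is skipped because its type is not in the list.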
Optionally, the scanning unit 50 includes a splitting module, a determining module, a first judging module, a first searching module, a second searching module, a first verification module, a second judging module, and a first return module. The splitting module is configured to push a split result, obtained by splitting a lexical unit section that includes one or more lexical units, onto a stack; the determining module is configured to determine the type name to be found of the lexical unit sequence according to the top element of the stack; the first judging module is configured to judge whether the type name to be found is a preset type name; the first searching module is configured to search, when the type name to be found is judged not to be the preset type name, for the type name to be found in the base class sub-object where the lexical unit section is located; the second searching module is configured to search for the type name to be found in the global symbol table when it is not found in the base class sub-object where the lexical unit section is located; the first verification module is configured to verify, when the type name to be found is found in the global symbol table, the type qualified name of the type name to be found, wherein the type qualified name is a name used to qualify the type name to be found; the second judging module is configured to judge whether the type qualified name matches a default qualified name; and the first return module is configured to return, when the type qualified name is judged to match the default qualified name, the type name to be found as the scanning result.
Optionally, the scanning unit 50 further includes a second verification module, a third judging module, and a second return module. The second verification module is configured to verify, after the type name to be found has been searched for in the base class sub-object where the lexical unit section is located and has been found there, the type qualified name of the type name to be found; the third judging module is configured to judge whether the type qualified name matches the default qualified name; and the second return module is configured to return, when the type qualified name is judged to match the default qualified name, the type name to be found as the scanning result.
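The lookup order described above can be sketched as follows. This is one possible reading rather than a definitive implementation, and it rests on several assumptions that the embodiment does not state: a lexical unit section is written as a "::"-separated name, the preset type names are a small set of built-in types, and each table maps a type name to the qualifier (enclosing namespace or class) recorded for it, which plays the role of the default qualified name. The names BUILTIN_TYPES and lookup_type are hypothetical.

```python
BUILTIN_TYPES = {"int", "char", "bool", "float", "double", "void"}  # assumed preset type names


def lookup_type(section, base_class_scope, global_symbol_table):
    """Return the type name named by 'section', or None if it cannot be confirmed.

    Both scopes map a type name to the qualifier recorded for it.
    """
    stack = section.split("::")    # split result; the last part is the top of the stack
    name = stack.pop()             # type name to be found
    qualifier = "::".join(stack)   # qualifier written in the source ("" if none)

    if name in BUILTIN_TYPES:
        return name
    # Search the base class scope first, then fall back to the global symbol table.
    for scope in (base_class_scope, global_symbol_table):
        recorded = scope.get(name)
        if recorded is not None:
            # Verify that the written qualifier matches the recorded one.
            return name if recorded == qualifier else None
    return None


base_class_scope = {"Node": "Tree"}
global_symbol_table = {"string": "std", "Widget": ""}
result = lookup_type("std::string", base_class_scope, global_symbol_table)
```

A name found in the base class scope is verified there and the global symbol table is not consulted, mirroring the second verification module described above.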
Optionally, the data processing device further includes a preprocessing unit, configured to preprocess the code file, before lexical analysis is performed on the character string in the code file to obtain the lexical unit sequence, so as to obtain preprocessed code, wherein the preprocessed code is a character stream that meets preset rules. The analysis unit 20 is configured to perform lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
Optionally, the preprocessing unit is configured to perform at least one of the following: filtering whitespace out of the code file; deleting comments from the code file; processing the preprocessing instructions of the code file according to a preset configuration; and uniformly encoding the code file to output a code character stream.
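Three of the preprocessing operations listed above (unified encoding, comment deletion, and whitespace filtering) can be sketched as follows. The sketch assumes C-style comment syntax and UTF-8 input, neither of which is fixed by this embodiment, and the function name preprocess is hypothetical. It deliberately omits the handling of preprocessing instructions and does not protect comment markers that appear inside string literals.

```python
import re


def preprocess(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a code file, strip comments, and collapse whitespace."""
    source = raw.decode(encoding)  # unified encoding: work on one character stream
    # Delete /* ... */ block comments, then // line comments.
    without_block = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    without_line = re.sub(r"//[^\n]*", "", without_block)
    # Collapse runs of spaces and tabs, trim each line, and drop empty lines.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in without_line.splitlines()]
    return "\n".join(line for line in lines if line)


cleaned = preprocess(b"int a = 1; // counter\n/* unused */ int   b;")
```

The result is a character stream with one statement per line and single spaces between lexemes, which is a convenient input for the lexical analysis that follows.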
Optionally, the analysis unit 20 includes a reading module, a composing module, and a generating module. The reading module is configured to read the character stream of the code file; the composing module is configured to group the character stream into lexemes; and the generating module is configured to generate the lexical unit sequence according to the lexemes, wherein each lexical unit in the lexical unit sequence corresponds one-to-one with a lexeme.
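The three stages above (reading the character stream, grouping it into lexemes, and generating one lexical unit per lexeme) can be sketched with a small regular-expression tokenizer. The token kinds, the TOKEN_SPEC table, and the names Token and tokenize are assumptions made for illustration; a real lexer for a full programming language would need many more rules (string literals, multi-character operators, keywords).

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str    # category of the lexical unit
    lexeme: str  # the exact characters it was built from


# Hypothetical, deliberately small rule set. Earlier rules take priority.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"::|->|[{}()\[\];,=<>+\-*/&.:]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(text: str) -> list:
    """Turn a character stream into a list of Token objects, one per lexeme."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not kept
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("std::string s = x;")
```

Because every Token keeps its lexeme, the one-to-one correspondence between lexical units and lexemes is preserved in the output.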
Optionally, the data processing device further includes a second establishing unit, configured to establish, after the lexical unit sequence and the default object have been associated to establish the global symbol table, a data flow model for simulating the execution flow of the code file according to the global symbol table.
Optionally, the second establishing unit includes a simulation module and a third establishing module. The simulation module is configured to simulate the execution flow of the code file according to the global symbol table to obtain a simulation result; the third establishing module is configured to establish the data flow model according to the simulation result.
Optionally, the simulation module is configured to determine and record the value range of a variable of the code file when the variable is compared according to the global symbol table.
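Recording the value range of a variable at a comparison can be sketched as interval narrowing. This is a simplified illustration that assumes integer variables compared against integer constants; the function name narrow and the dictionary of (low, high) pairs are hypothetical and are not taken from this embodiment.

```python
import math


def narrow(ranges, variable, op, bound):
    """Narrow and record the value range of 'variable' given 'variable op bound'.

    ranges maps a variable name to an inclusive (low, high) pair.
    """
    low, high = ranges.get(variable, (-math.inf, math.inf))
    if op == "<":
        high = min(high, bound - 1)  # integers only: x < b means x <= b - 1
    elif op == "<=":
        high = min(high, bound)
    elif op == ">":
        low = max(low, bound + 1)
    elif op == ">=":
        low = max(low, bound)
    elif op == "==":
        low, high = max(low, bound), min(high, bound)
    else:
        raise ValueError(f"unsupported comparison operator: {op}")
    ranges[variable] = (low, high)
    return ranges[variable]


ranges = {}
narrow(ranges, "i", ">=", 0)              # e.g. inside "if (i >= 0)"
final_range = narrow(ranges, "i", "<", 10)  # e.g. inside "if (i < 10)"
```

A scanner could then compare the recorded range with, for example, the length of an array indexed by i to flag a possible out-of-bounds access.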
It should be noted here that the above units and modules implement the same examples and application scenarios as the corresponding steps, but are not limited to the content disclosed in Embodiment 1 above. It should also be noted that the above units and modules, as part of the device, may run in the hardware environment shown in Figure 1, and may be implemented in software or in hardware.
Through the above units and modules, the technical problem of low code scanning accuracy in the related art can be solved, thereby achieving the technical effect of improving the accuracy of code scanning.
It should further be noted that the hardware environment in which the above units and modules run includes a network environment.
Embodiment 5
According to an embodiment of the present invention, a server or terminal for implementing the above data processing method is further provided.
Figure 29 is a structural block diagram of a terminal according to an embodiment of the present invention. As shown in Figure 29, the terminal may include one or more processors 291 (only one is shown in the figure), a memory 293, and a transmission device 295 (such as the sending device in the above embodiments); as shown in Figure 29, the terminal may further include an input/output device 297.
The memory 293 may be used to store software programs and modules, such as the program instructions/modules corresponding to the data processing method and device in the embodiments of the present invention. The processor 291 runs the software programs and modules stored in the memory 293 to execute various functional applications and data processing, that is, to implement the above data processing method. The memory 293 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 293 may further include memory located remotely from the processor 291, and such remote memory may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission device 295 is used to receive or send data via a network, and may also be used for data transmission between the processor and the memory. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 295 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices and a router through a network cable so as to communicate with the Internet or a local area network. In one example, the transmission device 295 is a radio frequency (Radio Frequency, RF) module, which is used to communicate with the Internet wirelessly.
Specifically, the memory 293 is used to store an application program.
The processor 291 may call, through the transmission device 295, the application program stored in the memory 293 to perform the following steps:
obtaining a code file, wherein the code file is a source text including a character string;
performing lexical analysis on the character string in the code file to obtain a lexical unit sequence;
parsing the code file to obtain a default object;
associating the lexical unit sequence with the default object to establish a global symbol table, wherein the global symbol table is used to record the data information of all default objects in the code file;
performing static code scanning on the code file according to the global symbol table to obtain a scanning result, wherein the scanning result includes at least a lookup result for the type of the lexical unit sequence.
The processor 291 is further configured to perform the following steps: before associating the lexical unit sequence with the default object to establish the global symbol table, simplifying the lexical unit sequence without changing the logic of the code file, to obtain a simplified lexical unit sequence; associating the lexical unit sequence with the default object to establish the global symbol table includes: associating the simplified lexical unit sequence with the default object to establish the global symbol table.
The processor 291 is further configured to perform the following steps: after simplifying the lexical unit sequence to obtain the simplified lexical unit sequence, establishing a tree-like syntactic structure according to the simplified lexical unit sequence, wherein the tree-like syntactic structure is a tree structure for storing data objects having a default grammar; associating the simplified lexical unit sequence with the default object to establish the global symbol table includes: associating the simplified lexical unit sequence with the default object according to the tree-like syntactic structure to establish the global symbol table.
The processor 291 is further configured to perform the following steps: obtaining a single code expression in the lexical unit sequence; and establishing the tree-like syntactic structure according to the single code expression.
The processor 291 is further configured to perform the following steps: establishing a default global object list according to multiple default objects, wherein the default global object list is used to associate different keyword fragments of the same type in the code file; obtaining the alias instructions in the code file and processing the code in the code file that uses the alias instructions, to obtain processed code; and performing, in the default global object list and according to the processed code, type association or function association between the lexical unit sequence and the default object corresponding to the lexical unit sequence, to obtain the global symbol table.
The processor 291 is further configured to perform the following step: associating, according to the processed code, a lexical unit sequence corresponding to a default call method with that default call method.
The processor 291 is further configured to perform the following steps: determining the type name of the type of the lexical unit sequence and searching for the type name in the default global object list; judging whether the found type name meets a preset condition; and if the found type name is judged to meet the preset condition, performing type association or function association between the lexical unit sequence and the default object according to the processed code.
The processor 291 is further configured to perform the following steps: splitting a lexical unit section that includes one or more lexical units to obtain a split result; pushing the split result onto a stack; determining the type name to be found of the lexical unit sequence according to the top element of the stack; judging whether the type name to be found is a preset type name; if the type name to be found is judged not to be the preset type name, searching for the type name to be found in the base class sub-object where the lexical unit section is located; when the type name to be found is not found in the base class sub-object where the lexical unit section is located, searching for the type name to be found in the global symbol table; when the type name to be found is found in the global symbol table, verifying the type qualified name of the type name to be found, wherein the type qualified name is a name used to qualify the type name to be found; judging whether the type qualified name matches a default qualified name; and if the type qualified name is judged to match the default qualified name, returning the type name to be found as the scanning result.
The processor 291 is further configured to perform the following steps: after searching for the type name to be found in the base class sub-object where the lexical unit section is located, when the type name to be found is found in that base class sub-object, verifying the type qualified name of the type name to be found; judging whether the type qualified name matches the default qualified name; and if the type qualified name is judged to match the default qualified name, returning the type name to be found as the scanning result.
The processor 291 is further configured to perform the following steps: before performing lexical analysis on the character string in the code file to obtain the lexical unit sequence, preprocessing the code file to obtain preprocessed code, wherein the preprocessed code is a character stream that meets preset rules; and performing lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
The processor 291 is further configured to perform the following steps: reading the character stream of the code file; grouping the character stream into lexemes; and generating the lexical unit sequence according to the lexemes, wherein each lexical unit in the lexical unit sequence corresponds one-to-one with a lexeme.
The processor 291 is further configured to perform the following step: after associating the lexical unit sequence with the default object to establish the global symbol table, establishing a data flow model for simulating the execution flow of the code file according to the global symbol table.
The processor 291 is further configured to perform the following steps: simulating the execution flow of the code file according to the global symbol table to obtain a simulation result; and establishing the data flow model according to the simulation result.
The processor 291 is further configured to perform the following step: determining and recording the value range of a variable of the code file when the variable is compared according to the global symbol table.
With the embodiment of the present invention, a data processing scheme is provided. A code file is obtained, wherein the code file is a source text including a character string; lexical analysis is performed on the character string in the code file to obtain a lexical unit sequence; the code file is parsed to obtain a default object; the lexical unit sequence and the default object are associated to establish a global symbol table, wherein the global symbol table is used to record the data information of all default objects in the code file; and static code scanning is performed on the code file according to the global symbol table to obtain a scanning result, wherein the scanning result includes at least a lookup result for the type of the lexical unit sequence. This achieves the purpose of symbolizing the code file, thereby improving the accuracy of code scanning and solving the technical problem of low code scanning accuracy in the related art.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments; details are not repeated here.
Those of ordinary skill in the art will understand that the structure shown in Figure 29 is only illustrative. The terminal may be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (Mobile Internet Devices, MID), or a PAD. Figure 29 does not limit the structure of the above electronic device. For example, the terminal may include more or fewer components than shown in Figure 29 (such as a network interface or a display device), or have a configuration different from that shown in Figure 29.
Those of ordinary skill in the art will understand that all or part of the steps in the methods of the above embodiments may be completed by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, which may include a flash disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or the like.
Embodiment 6
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store program code for executing the data processing method.
Optionally, in this embodiment, the storage medium may be located on at least one of multiple network devices in the network shown in the above embodiments.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
obtaining a code file, wherein the code file is a source text including a character string;
performing lexical analysis on the character string in the code file to obtain a lexical unit sequence;
parsing the code file to obtain a default object;
associating the lexical unit sequence with the default object to establish a global symbol table, wherein the global symbol table is used to record the data information of all default objects in the code file;
performing static code scanning on the code file according to the global symbol table to obtain a scanning result, wherein the scanning result includes at least a lookup result for the type of the lexical unit sequence.
Optionally, the storage medium is further configured to store program code for performing the following steps: before associating the lexical unit sequence with the default object to establish the global symbol table, simplifying the lexical unit sequence without changing the logic of the code file, to obtain a simplified lexical unit sequence; associating the lexical unit sequence with the default object to establish the global symbol table includes: associating the simplified lexical unit sequence with the default object to establish the global symbol table.
Optionally, the storage medium is further configured to store program code for performing the following steps: after simplifying the lexical unit sequence to obtain the simplified lexical unit sequence, establishing a tree-like syntactic structure according to the simplified lexical unit sequence, wherein the tree-like syntactic structure is a tree structure for storing data objects having a default grammar; associating the simplified lexical unit sequence with the default object to establish the global symbol table includes: associating the simplified lexical unit sequence with the default object according to the tree-like syntactic structure to establish the global symbol table.
Optionally, the storage medium is further configured to store program code for performing the following steps: obtaining a single code expression in the lexical unit sequence; and establishing the tree-like syntactic structure according to the single code expression.
Optionally, the storage medium is further configured to store program code for performing the following steps: establishing a default global object list according to multiple default objects, wherein the default global object list is used to associate different keyword fragments of the same type in the code file; obtaining the alias instructions in the code file and processing the code in the code file that uses the alias instructions, to obtain processed code; and performing, in the default global object list and according to the processed code, type association or function association between the lexical unit sequence and the default object corresponding to the lexical unit sequence, to obtain the global symbol table.
Optionally, the storage medium is further configured to store program code for performing the following step: associating, according to the processed code, a lexical unit sequence corresponding to a default call method with that default call method.
Optionally, the storage medium is further configured to store program code for performing the following steps: determining the type name of the type of the lexical unit sequence and searching for the type name in the default global object list; judging whether the found type name meets a preset condition; and if the found type name is judged to meet the preset condition, performing type association or function association between the lexical unit sequence and the default object according to the processed code.
Optionally, the storage medium is further configured to store program code for performing the following steps: splitting a lexical unit section that includes one or more lexical units to obtain a split result; pushing the split result onto a stack; determining the type name to be found of the lexical unit sequence according to the top element of the stack; judging whether the type name to be found is a preset type name; if the type name to be found is judged not to be the preset type name, searching for the type name to be found in the base class sub-object where the lexical unit section is located; when the type name to be found is not found in the base class sub-object where the lexical unit section is located, searching for the type name to be found in the global symbol table; when the type name to be found is found in the global symbol table, verifying the type qualified name of the type name to be found, wherein the type qualified name is a name used to qualify the type name to be found; judging whether the type qualified name matches a default qualified name; and if the type qualified name is judged to match the default qualified name, returning the type name to be found as the scanning result.
Optionally, the storage medium is further configured to store program code for performing the following steps: after searching for the type name to be found in the base class sub-object where the lexical unit section is located, when the type name to be found is found in that base class sub-object, verifying the type qualified name of the type name to be found; judging whether the type qualified name matches the default qualified name; and if the type qualified name is judged to match the default qualified name, returning the type name to be found as the scanning result.
Optionally, the storage medium is further configured to store program code for performing the following steps: before performing lexical analysis on the character string in the code file to obtain the lexical unit sequence, preprocessing the code file to obtain preprocessed code, wherein the preprocessed code is a character stream that meets preset rules; and performing lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
Optionally, the storage medium is further configured to store program code for performing the following steps: reading the character stream of the code file; grouping the character stream into lexemes; and generating the lexical unit sequence according to the lexemes, wherein each lexical unit in the lexical unit sequence corresponds one-to-one with a lexeme.
Optionally, the storage medium is further configured to store program code for performing the following step: after associating the lexical unit sequence with the default object to establish the global symbol table, establishing a data flow model for simulating the execution flow of the code file according to the global symbol table.
Optionally, the storage medium is further configured to store program code for performing the following steps: simulating the execution flow of the code file according to the global symbol table to obtain a simulation result; and establishing the data flow model according to the simulation result.
Optionally, the storage medium is further configured to store program code for performing the following step: determining and recording the value range of a variable of the code file when the variable is compared according to the global symbol table.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments; details are not repeated here.
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The sequence numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
If the integrated unit in the above embodiments is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (17)

  1. A data processing method, characterized by comprising:
    obtaining a code file, wherein the code file is a source text including a character string;
    performing lexical analysis on the character string in the code file to obtain a lexical unit sequence;
    parsing the code file to obtain a default object, wherein the default object is an object having a type that corresponds to the code file;
    associating the lexical unit sequence with the default object to establish a global symbol table, wherein the global symbol table is used to record the data information of all default objects in the code file; and
    performing static code scanning on the code file according to the global symbol table to obtain a scanning result, wherein the scanning result includes at least a lookup result for the type of the lexical unit sequence;
    wherein the default object includes multiple default objects, and associating the lexical unit sequence with the default object to establish the global symbol table comprises: establishing a default global object list according to the multiple default objects, wherein the default global object list is used to associate different keyword fragments of the same type in the code file; obtaining the alias instructions in the code file and processing the code in the code file that uses the alias instructions, to obtain processed code; and performing, in the default global object list and according to the processed code, type association or function association between the lexical unit sequence and the default object corresponding to the lexical unit sequence, to obtain the global symbol table.
  2. The method according to claim 1, characterized in that
    before associating the lexical unit sequence with the default object to establish the global symbol table, the method further comprises: simplifying the lexical unit sequence without changing the logic of the code file, to obtain a simplified lexical unit sequence,
    and associating the lexical unit sequence with the default object to establish the global symbol table comprises: associating the simplified lexical unit sequence with the default object to establish the global symbol table.
  3. The method according to claim 2, characterized in that
    after simplifying the lexical unit sequence to obtain the simplified lexical unit sequence, the method further comprises: establishing a tree-like syntactic structure according to the simplified lexical unit sequence, wherein the tree-like syntactic structure is a tree structure for storing data objects having a default grammar,
    and associating the simplified lexical unit sequence with the default object to establish the global symbol table comprises: associating the simplified lexical unit sequence with the default object according to the tree-like syntactic structure to establish the global symbol table.
  4. The method according to claim 3, characterized in that establishing the tree-like syntactic structure according to the lexical unit sequence comprises:
    obtaining a single code expression in the lexical unit sequence; and
    establishing the tree-like syntactic structure according to the single code expression.
  5. The method according to claim 1, characterized in that performing type association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code comprises at least one of:
    associating, according to the processed code, a lexical unit sequence used for a type declaration with the preset object corresponding to that lexical unit sequence;
    associating, according to the processed code, a lexical unit sequence corresponding to a variable in the code file with the variable.
  6. The method according to claim 1, characterized in that performing function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code comprises at least: associating, according to the processed code, a lexical unit sequence corresponding to a preset call method with the preset call method.
  7. The method according to claim 1, characterized in that performing type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code comprises:
    determining the type name of the type of the lexical unit sequence and searching for the type name in the preset global object list;
    judging whether the type name found meets a preset condition; and
    if the type name found is judged to meet the preset condition, performing type association or function association between the lexical unit sequence and the preset object according to the processed code.
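The lookup, check and associate sequence of claim 7 might look like the sketch below. The claim does not say what the condition is, so it is passed in as a caller-supplied predicate; the dictionary layout of the global object list and of the symbol table, and the function name `associate`, are assumptions made only for this example.

```python
def associate(symbol_table, global_objects, token_seq, type_name, kind, condition):
    """Record a type or function association for one lexical unit sequence.

    kind is "type" or "function"; condition is a hypothetical predicate
    standing in for the unspecified condition of the claim.
    Returns True when the association was recorded."""
    # Step 1: search for the type name in the global object list.
    entry = global_objects.get(type_name)
    # Step 2: judge whether the type name found meets the condition.
    if entry is None or not condition(type_name, entry):
        return False
    # Step 3: record the association in the global symbol table.
    record = symbol_table.setdefault(
        type_name, {"object": entry, "type": [], "function": []}
    )
    record[kind].append(tuple(token_seq))
    return True
```

For instance, with `global_objects = {"Player": {"kind": "class"}}` and a predicate that accepts only classes, the sequence `["Player", "p", ";"]` is recorded under `Player`, while a sequence naming an unknown type is rejected.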
  8. The method according to claim 7, characterized in that performing static code scanning on the code file according to the global symbol table to obtain the scanning result comprises:
    splitting a lexical unit segment comprising one or more lexical units to obtain a split result;
    pushing the split result onto a stack;
    determining, according to the top element of the stack, the type name to be looked up for the lexical unit sequence;
    judging whether the type name to be looked up is a preset type name;
    if the type name to be looked up is judged not to be the preset type name, searching for the type name to be looked up in the base class object in which the lexical unit segment is located;
    when the type name to be looked up is not found in the base class object in which the lexical unit segment is located, searching for the type name to be looked up in the global symbol table;
    when the type name to be looked up is found in the global symbol table, verifying the type qualified name of the type name to be looked up, wherein the type qualified name is a name used to qualify the type name to be looked up;
    judging whether the type qualified name matches a preset qualified name; and
    if the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
  9. The method according to claim 8, wherein, after searching for the type name to be looked up in the base class object in which the lexical unit segment is located, the method further comprises:
    when the type name to be looked up is found in the base class object in which the lexical unit segment is located, verifying the type qualified name of the type name to be looked up;
    judging whether the type qualified name matches the preset qualified name; and
    if the type qualified name is judged to match the preset qualified name, returning the type name to be looked up as the scanning result.
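The lookup order described in claims 8 and 9 (stack, then base class object, then global symbol table, then qualified-name check) can be sketched as follows. Several details are assumptions rather than statements from the claims: the segment is split on `.`, each table entry stores its qualifier under a `"qualifier"` key, and a built-in type name is returned directly, since the claims do not say what happens in that branch. All names are hypothetical.

```python
# Assumed set of built-in type names; the claims do not enumerate them.
BUILTIN_TYPES = {"int", "string", "bool"}

def lookup_type(segment, base_class_members, global_symbols, expected_qualifier):
    """Return the resolved type name for a lexical unit segment, or None."""
    # Split the segment and push each part onto a stack.
    stack = []
    for part in segment.split("."):
        stack.append(part)
    # The top of the stack is the type name to look up.
    name = stack[-1]
    if name in BUILTIN_TYPES:
        return name  # assumption: built-in names need no further lookup
    # Search the enclosing base class object first, then the global table.
    entry = base_class_members.get(name)
    if entry is None:
        entry = global_symbols.get(name)
    if entry is None:
        return None
    # Verify the name that qualifies the type (for example a namespace).
    if entry["qualifier"] == expected_qualifier:
        return name
    return None
```

Searching the base class object before the global table mirrors the order in the claims: the nearest scope is consulted first, and the global table acts as the fallback.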
  10. The method according to claim 1, characterized in that
    before performing lexical analysis on the character string in the code file to obtain the lexical unit sequence, the method further comprises: preprocessing the code file to obtain preprocessed code, wherein the preprocessed code is a character stream that conforms to a preset rule,
    and performing lexical analysis on the character string in the code file to obtain the lexical unit sequence comprises: performing lexical analysis on the character string in the preprocessed code to obtain the lexical unit sequence.
  11. The method according to claim 10, characterized in that preprocessing the code file comprises at least one of:
    filtering out whitespace in the code file;
    deleting comments in the code file;
    handling preprocessing directives in the code file according to a preset configuration;
    converting the code file to a uniform encoding and outputting the encoded character stream.
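The four preprocessing operations listed in claim 11 could be combined roughly as in the sketch below. It assumes C-style comments and `#if SYMBOL` / `#else` / `#endif` directives, with the configuration reduced to a set of defined symbols; none of this syntax is specified by the claims, and the function name `preprocess` is hypothetical. The regular expressions are deliberately naive and would also strip comment markers that appear inside string literals.

```python
import re

def preprocess(raw, defines=frozenset(), encoding="utf-8"):
    """Return a normalized character stream for the lexer."""
    # Uniform encoding: decode the bytes and normalize line endings.
    text = raw.decode(encoding, errors="replace").replace("\r\n", "\n")
    # Delete comments (naive: ignores string literals).
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.S)
    text = re.sub(r"//[^\n]*", "", text)
    kept = []
    active = [True]  # stack of "is this branch enabled" flags
    for line in text.split("\n"):
        stripped = line.strip()
        if stripped.startswith("#if "):
            active.append(active[-1] and stripped[4:].strip() in defines)
        elif stripped == "#else" and len(active) > 1:
            active[-1] = active[-2] and not active[-1]
        elif stripped == "#endif" and len(active) > 1:
            active.pop()
        elif stripped.startswith("#"):
            continue  # other directives are simply dropped in this sketch
        elif active[-1] and stripped:
            # Filter whitespace: collapse runs and drop blank lines.
            kept.append(" ".join(stripped.split()))
    return "\n".join(kept)
```

Passing `defines={"DEBUG"}` keeps the body of an `#if DEBUG` block and drops the `#else` branch; with an empty set the choice is reversed.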
  12. The method according to claim 1, characterized in that performing lexical analysis on the character string in the code file to obtain the lexical unit sequence comprises:
    reading the character stream of the code file;
    grouping the character stream into lexemes; and
    generating the lexical unit sequence according to the lexemes, wherein each lexical unit in the lexical unit sequence corresponds one-to-one with a lexeme.
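The three steps of claim 12 correspond to a conventional regular-expression tokenizer. The sketch below assumes a small C-like token set; the token kinds, the keyword set and the names `Token` and `tokenize` are illustrative choices, not part of the patent.

```python
import re
from typing import NamedTuple

class Token(NamedTuple):
    """A lexical unit: a kind plus the lexeme it was built from."""
    kind: str
    lexeme: str

# Assumed token specification, tried in order.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|!=|<=|>=|[-+*/=<>]"),
    ("PUNCT", r"[(){};,.]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"class", "int", "if", "return", "using"}

def tokenize(chars):
    """Group the character stream into lexemes and emit one token per lexeme."""
    tokens = []
    for match in MASTER.finditer(chars):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise ValueError(f"unexpected character {lexeme!r}")
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append(Token(kind, lexeme))
    return tokens
```

Each `Token` keeps its originating lexeme, so the one-to-one correspondence between lexical units and lexemes holds by construction.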
  13. The method according to claim 1, characterized in that after associating the lexical unit sequence with the preset object to establish the global symbol table, the method further comprises: establishing, according to the global symbol table, a data flow model for simulating the execution flow of the code file.
  14. The method according to claim 13, characterized in that establishing the data flow model according to the global symbol table comprises:
    simulating the execution flow of the code file according to the global symbol table to obtain a simulation result; and
    establishing the data flow model according to the simulation result.
  15. The method according to claim 14, characterized in that simulating the execution flow of the code file according to the global symbol table to obtain the simulation result comprises: when a variable of the code file is compared according to the global symbol table, determining and recording the value range of the variable.
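Recording a value range at a comparison, as in claim 15, can be sketched as interval narrowing. The sketch assumes integer variables compared against integer constants and tracks only the branch on which the comparison holds; the representation of ranges as `(low, high)` tuples and the name `record_comparison` are assumptions for illustration.

```python
import math

def record_comparison(ranges, var, op, value):
    """Return a copy of ranges narrowed by 'var op value' being true.

    ranges maps variable names to inclusive (low, high) bounds;
    unknown variables start unbounded."""
    low, high = ranges.get(var, (-math.inf, math.inf))
    if op == "<":
        high = min(high, value - 1)  # integer assumption
    elif op == "<=":
        high = min(high, value)
    elif op == ">":
        low = max(low, value + 1)  # integer assumption
    elif op == ">=":
        low = max(low, value)
    elif op == "==":
        low, high = max(low, value), min(high, value)
    else:
        raise ValueError(f"unsupported operator {op!r}")
    updated = dict(ranges)
    updated[var] = (low, high)
    return updated
```

After simulating `i >= 0` followed by `i < 10`, the recorded range for `i` is `(0, 9)`. A range whose low bound exceeds its high bound would indicate a branch that can never be taken.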
  16. A data processing device, characterized by comprising:
    a first acquisition unit, configured to obtain a code file, wherein the code file is source program text comprising a character string;
    an analysis unit, configured to perform lexical analysis on the character string in the code file to obtain a lexical unit sequence;
    a parsing unit, configured to parse the code file to obtain a preset object, wherein the preset object is an object that has a type and corresponds to the code file;
    an association unit, configured to associate the lexical unit sequence with the preset object to establish a global symbol table, wherein the global symbol table is used to record the data information of all preset objects in the code file; and
    a scanning unit, configured to perform static code scanning on the code file according to the global symbol table to obtain a scanning result, wherein the scanning result comprises at least a lookup result for the type of the lexical unit sequence;
    wherein the preset object comprises a plurality of preset objects, and associating the lexical unit sequence with the preset object to establish the global symbol table comprises: establishing a preset global object list according to the plurality of preset objects, wherein the preset global object list is used to associate the different keyword fragments of the same type in the code file with one another; obtaining the alias directives in the code file and processing the code in the code file that uses the alias directives, to obtain processed code; and, in the preset global object list, performing type association or function association between the lexical unit sequence and the preset object corresponding to the lexical unit sequence according to the processed code, to obtain the global symbol table.
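The two preparation steps named in this claim, merging the fragments of one type into a global object list and resolving alias directives, might be sketched as follows. The sketch assumes that alias directives use a C#-style `using Name = Qualified.Type;` form and that fragments arrive as `(type_name, members)` pairs; the claim specifies neither, and the function names are hypothetical.

```python
import re

# Assumed alias syntax: "using Alias = Some.Qualified.Name;"
ALIAS_RE = re.compile(r"^\s*using\s+(\w+)\s*=\s*([\w.]+)\s*;\s*$")

def resolve_aliases(lines):
    """Strip alias directives and expand each alias where it is used.

    Returns (processed_lines, aliases)."""
    aliases, body = {}, []
    for line in lines:
        match = ALIAS_RE.match(line)
        if match:
            aliases[match.group(1)] = match.group(2)
        else:
            body.append(line)

    def expand(line):
        # Replace whole words that are known aliases with their targets.
        return re.sub(r"\b\w+\b", lambda m: aliases.get(m.group(), m.group()), line)

    return [expand(line) for line in body], aliases

def build_global_object_list(fragments):
    """Merge members of fragments that declare the same type name."""
    merged = {}
    for type_name, members in fragments:
        merged.setdefault(type_name, []).extend(members)
    return merged
```

Resolving aliases before association means later lookups only ever see the real type name, so one type cannot end up with two separate entries in the symbol table.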
  17. The device according to claim 16, characterized in that
    the device further comprises: a second acquisition unit, configured to simplify the lexical unit sequence without changing the logic of the code file, before the lexical unit sequence is associated with the preset object to establish the global symbol table, to obtain a simplified lexical unit sequence,
    and the association unit is configured to associate the simplified lexical unit sequence with the preset object to establish the global symbol table.
CN201610613852.7A 2016-07-29 2016-07-29 Data processing method and device Active CN106227668B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610613852.7A CN106227668B (en) 2016-07-29 2016-07-29 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610613852.7A CN106227668B (en) 2016-07-29 2016-07-29 Data processing method and device

Publications (2)

Publication Number Publication Date
CN106227668A CN106227668A (en) 2016-12-14
CN106227668B true CN106227668B (en) 2017-11-17

Family

ID=57535333

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610613852.7A Active CN106227668B (en) 2016-07-29 2016-07-29 Data processing method and device

Country Status (1)

Country Link
CN (1) CN106227668B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304369B (en) * 2017-05-03 2020-12-01 腾讯科技(深圳)有限公司 File type identification method and device
CN108874825B (en) * 2017-05-12 2021-11-02 北京京东尚科信息技术有限公司 Abnormal data verification method and device
CN107153564B (en) * 2017-06-22 2020-07-07 拜椰特(上海)软件技术有限公司 Lexical analysis tool
CN107608875B (en) * 2017-08-03 2020-11-06 奇安信科技集团股份有限公司 Localization processing method and device for static code
CN110069455B (en) * 2017-09-21 2021-12-14 北京华为数字技术有限公司 File merging method and device
CN109814939B (en) * 2017-11-20 2021-10-15 华为技术有限公司 Dynamic loading method, and target file manufacturing method and device
CN108170425B (en) * 2017-12-29 2021-03-19 东莞市高标软件科技有限公司 Program code modification method and device and terminal equipment
CN108537086A (en) * 2018-03-29 2018-09-14 广东欧珀移动通信有限公司 Method for information display, device, storage medium and mobile terminal
CN108549538B (en) * 2018-04-11 2021-03-02 深圳市腾讯网络信息技术有限公司 Code detection method and device, storage medium and test terminal
CN110795069A (en) * 2018-08-02 2020-02-14 Tcl集团股份有限公司 Code analysis method, intelligent terminal and computer readable storage medium
CN109359188B (en) * 2018-09-30 2020-01-14 北京数聚鑫云信息技术有限公司 Component arranging method and system
CN109542420A (en) * 2018-10-15 2019-03-29 张海光 A kind of Code Edit method based on label
CN109558119B (en) * 2018-11-09 2022-12-27 杭州安恒信息技术股份有限公司 Java-based Web framework traversal request address method
CN109710218B (en) * 2018-11-26 2022-02-11 福建天泉教育科技有限公司 Object automatic conversion method and terminal
CN109656567B (en) * 2018-12-20 2022-02-01 北京树根互联科技有限公司 Dynamic method and system for heterogeneous service data processing logic
CN111385249B (en) * 2018-12-28 2023-07-18 中国电力科学研究院有限公司 Vulnerability detection method
CN110197181B (en) * 2019-05-31 2021-04-30 烽火通信科技股份有限公司 Cable character detection method and system based on OCR
CN110489127B (en) * 2019-08-12 2023-10-13 腾讯科技(深圳)有限公司 Error code determination method, apparatus, computer-readable storage medium and device
CN113110947B (en) * 2021-04-16 2024-04-02 中国工商银行股份有限公司 Program call chain generation method, system, electronic device and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286132A (en) * 2008-06-02 2008-10-15 北京邮电大学 Test method and system based on software defect mode
CN101482847A (en) * 2009-01-19 2009-07-15 北京邮电大学 Detection method based on safety bug defect mode
CN102799520A (en) * 2012-06-27 2012-11-28 清华大学 Static checking method and device for source code pairing
CN104915293A (en) * 2015-06-12 2015-09-16 北京邮电大学 Software testing method and system

Also Published As

Publication number Publication date
CN106227668A (en) 2016-12-14

Similar Documents

Publication Publication Date Title
CN106227668B (en) Data processing method and device
CN108446540B (en) Program code plagiarism type detection method and system based on source code multi-label graph neural network
CN103547998B (en) For compiling the method and apparatus of regular expression
US9122540B2 (en) Transformation of computer programs and eliminating errors
US20060069545A1 (en) Method and apparatus for transducer-based text normalization and inverse text normalization
CN107608677A (en) A kind of process of compilation method, apparatus and electronic equipment
CN103473171A (en) Coverage rate dynamic tracking method and device based on function call paths
CN101650651A (en) Visualizing method of source code level program structure
CN110502227B (en) Code complement method and device, storage medium and electronic equipment
CN111931181B (en) Software logic vulnerability detection method based on graph mining
CN108563561B (en) Program implicit constraint extraction method and system
CN111813675A (en) SSA structure analysis method and device, electronic equipment and storage medium
US20210089284A1 (en) Method and system for using subroutine graphs for formal language processing
JP4951416B2 (en) Program verification method and program verification apparatus
US7409619B2 (en) System and methods for authoring domain specific rule-driven data generators
CN112988163B (en) Intelligent adaptation method, intelligent adaptation device, intelligent adaptation electronic equipment and intelligent adaptation medium for programming language
CN103235757B (en) Several apparatus and method that input domain tested object is tested are made based on robotization
US10642714B2 (en) Mapping dynamic analysis data to source code
CN109254774A (en) The management method and device of code in software development system
CN115221047A (en) Automatic test case generation method and electronic equipment
JP2023016738A (en) Method, computer program and computer for improving technological process of programming computer using dynamic programming language (type inference in dynamic languages)
CN114282227A (en) Safety analysis and detection method for intelligent contract of Fabric block chain system
CN1226692C (en) Machine translation system based on semanteme and its method
CN110244954A (en) A kind of Compilation Method and equipment of application program
CN109597624A (en) A kind of method that SQL is formatted

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant