CN107992324A - A code search method based on constraint solving - Google Patents
- Publication number: CN107992324A
- Application number: CN201711405834.0A
- Authority: CN (China)
- Prior art keywords: code, constraint, source, function, search method
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
Abstract
The invention discloses a code search method based on constraint solving, comprising the following steps. Step 1: obtain open source projects from open source communities. Step 2: analyze the source code using JPF and JDT and convert it into SSA form. Step 3: analyze the SSA-form code using JDT and convert it into constraints. Step 4: associate each piece of source code with the constraints generated from it to build a code-constraint library. Step 5: build a code search system that helps users search for code. The main feature of the method is that it solves the code search problem by constraint solving and handles loop statements and class member variables, which remedies a deficiency of earlier work and greatly improves the accuracy of code search, so that programmers can find the code they need during software development for reference or reuse, improving the efficiency and quality of software development.
Description
Technical field
The present invention relates to the field of computer software, and in particular to a code search method based on constraint solving.
Background art
With the spread of open source thinking, open source communities have developed rapidly. Many of them, such as GitHub, SourceForge, and BitBucket, host a massive amount of high-quality code that programmers can consult, reuse directly, or adapt. During software development, referring to this mature code can, to a certain extent, greatly improve the quality and efficiency of development. However, because the amount of open source code is enormous and most open source communities offer only simple keyword-based search over items such as project names, most of the code returned does not meet the programmer's needs and a great deal of manual screening is required, which is tedious and inefficient.
There is already much work on code search, which falls broadly into keyword-based, syntactic-structure-based, and semantics-based code search techniques. Keyword-based techniques match mainly on function names, class names, variable names, and the like, and lack any specific understanding of the code. Syntactic-structure-based techniques analyze the overall structure of the code, mainly by building a call graph, and are suitable only when the programmer is searching for code with a particular structure. Semantics-based techniques can make up for these shortcomings, but code analysis is itself challenging, and current work has significant limitations.
Summary of the invention
In view of the above problems, the purpose of the present invention is to provide a code search method based on constraint solving. The method improves on existing semantics-based code search techniques by handling Java loop statements, class member variables, and similar constructs, converts code into constraints automatically, and builds a code search system on this basis to help programmers find the code they need.
To achieve the above purpose, the present invention adopts the following scheme: a code search method based on constraint solving, comprising the following steps:
Step 1: Obtain open source projects from open source communities;
Step 2: Analyze the source code using JPF and JDT and convert it into SSA form;
Step 3: Analyze the SSA-form code using JDT and convert it into constraints;
Step 4: Associate the source code with the generated constraints and build the code-constraint library;
Step 5: Build the code search system to help users search for code.
The technical effects of the present invention are:
1. Source code is converted into constraints automatically, and an approach is provided for handling loop statements, class member variables, and similar constructs.
2. Existing semantics-based code search techniques are improved, so that programmers get more accurate search results and no longer have to screen for usable code by hand.
3. An accurate and efficient code search system can greatly improve the efficiency and quality of programmers' software development.
Brief description of the drawings
Fig. 1 is a flow chart of the code search method based on constraint solving according to an embodiment of the present invention.
Fig. 2 is an example of extracting the paths of an if statement according to an embodiment of the present invention.
Fig. 3 is an example of extracting the paths of a while statement according to an embodiment of the present invention.
Fig. 4 is an example of converting a reachable path into SSA form according to an embodiment of the present invention.
Fig. 5 is an example of converting the two paths of Fig. 2 into constraints.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.
The main feature of the code search method based on constraint solving of this embodiment is that it solves the code search problem by constraint solving and handles loop statements and class member variables, which remedies a deficiency of earlier work and greatly improves the accuracy of code search, so that programmers can find the code they need during software development for reference or reuse, improving the efficiency and quality of software development. Fig. 1 is a flow chart of the code search method based on constraint solving according to an embodiment of the present invention; the specific steps are:
Step 1: Obtain open source projects from open source communities;
Open source projects can be downloaded directly from open source communities such as GitHub and SourceForge, or obtained by writing a crawler. The present invention is mainly concerned with analyzing code written in Java.
Step 2: Analyze the source code using JPF and JDT and convert it into SSA form;
The open source projects obtained in Step 1 are imported into Eclipse and compiled to generate .class files.
The source code and its corresponding .class files are analyzed with the jpf-symbc module of JPF to obtain the reachable paths of all member functions in the project. Because JPF cannot tell, when analyzing an if statement, whether the executed path is the True branch or the False branch, the if statements of the source code must be instrumented. This processing has been covered by existing work, so the implementation details are not described here. Fig. 2 shows the two paths that JPF generates for the function func: P1 is the path under the True branch and P2 is the path under the False branch; assert(a<0) denotes the check a<0.
Loop statements are more complex to handle. In static analysis the number of iterations of a loop cannot be determined; it becomes known only while the program runs. Because the method works from the source code and the conversion into constraints is static, the uncertainty in the iteration count is resolved by stipulating that each loop statement iterates at most n times, so that n+1 paths are generated for it; the extra path is the one that exits without iterating at all. Fig. 3 is an example of analyzing a while statement with n set to 2, that is, at most two iterations and three generated paths: in the figure, P1 exits without iterating, P2 exits after one iteration, and P3 exits after two iterations. The choice of n involves a trade-off. If n is too small, paths that may actually be executed are not included: in the example of Fig. 3 the loop ends when a>=10 and exits after 7 iterations in total, so n = 2 cannot cover that path. If n is too large, say n = 50, two loops appearing one after another in the source code produce on the order of 50 × 50 paths and three produce on the order of 50 × 50 × 50, so the number of paths grows sharply and path explosion becomes the bottleneck.
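The bounded unrolling described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the loop `while (a < 10) { a = a + 1; }` is hypothetical (Fig. 3 is not reproduced in the text), and the helper names `unroll_loop` and `combine_sequential` are invented for the sketch rather than taken from JPF or JDT.

```python
from itertools import product


def unroll_loop(condition, body, n):
    """Return the n + 1 paths of `while (condition) { body }` unrolled at most n times.

    Each path is a list of statement strings. A path that iterates k times
    asserts the condition k times and then asserts its negation on exit.
    """
    paths = []
    for k in range(n + 1):
        path = []
        for _ in range(k):
            path.append(f"assert({condition});")
            path.extend(body)
        path.append(f"assert(!({condition}));")
        paths.append(path)
    return paths


def combine_sequential(*loops):
    """Paths of loops that appear one after another: one choice per loop."""
    return [sum(choice, []) for choice in product(*loops)]


# Hypothetical loop in the spirit of Fig. 3, unrolled with n = 2.
paths = unroll_loop("a < 10", ["a = a + 1;"], n=2)
for i, p in enumerate(paths, start=1):
    print(f"P{i}:", " ".join(p))

# Two loops one after another multiply the number of paths.
two_loops = combine_sequential(paths, paths)
print(len(paths), len(two_loops))
```

In this sketch a bound of n gives n + 1 paths per loop, so two sequential loops with n = 50 would give 51 × 51 combinations, the same order of magnitude as the figure quoted in the text.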
After the above processing, all reachable paths of a function are available. JDT is then used to build an AST for the code of each reachable path, which is analyzed and converted into SSA form. SSA (static single assignment) is an intermediate representation which guarantees that each variable is assigned only once in the program, that is, each variable is defined exactly once. This gives precise use-definition relations and makes data-flow analysis and optimization algorithms simpler. Converting the code of a reachable path into SSA form is relatively simple and is not described in detail here; Fig. 4 is an example of code converted into SSA form.
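Because a reachable path is straight-line code, the SSA conversion reduces to renaming. The following Python sketch shows one way this could look; the tuple encoding of statements, the `name_k` versioning scheme, and the function name `to_ssa` are assumptions made for illustration and are not taken from the patent or from JDT.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def to_ssa(path, variables):
    """Rename variables along one straight-line path so each is assigned once.

    `path` is a list of ("assign", target, expr) or ("assert", expr) tuples.
    Version 0 of a variable is its value on entry (for example a parameter).
    """
    version = {v: 0 for v in variables}

    def use(expr):
        # Replace every known variable by its current version.
        return IDENT.sub(
            lambda m: f"{m.group(0)}_{version[m.group(0)]}"
            if m.group(0) in version else m.group(0),
            expr,
        )

    out = []
    for stmt in path:
        if stmt[0] == "assign":
            _, target, expr = stmt
            rhs = use(expr)                                # read the old versions first
            version[target] = version.get(target, 0) + 1   # then define a fresh version
            out.append(f"{target}_{version[target]} = {rhs};")
        else:
            out.append(f"assert({use(stmt[1])});")
    return out


# Hypothetical path that iterates a loop twice and then exits.
example = [
    ("assert", "a < 10"),
    ("assign", "a", "a + 1"),
    ("assert", "a < 10"),
    ("assign", "a", "a + 1"),
    ("assert", "!(a < 10)"),
]
for line in to_ssa(example, ["a"]):
    print(line)
```

Since every name on the left-hand side is fresh, each SSA statement can later be read as an equation rather than as an update, which is what makes the translation into constraints direct.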
Step 3: Analyze the SSA-form code using JDT and convert it into constraints;
This step uses JDT to build an AST for the generated SSA-form code and processes the variable declarations, assignment statements, and conditional statements in the code to convert it into constraints. The types handled are int, boolean, and String, together with class types composed of these types. The conversions are demonstrated below; the constraints are written in the input format of the z3 solver.
Variable declarations:
1. int a; -> (declare-const a Int)
3. Student student; -> (declare-const student (Student Int Bool String))
The variable name and variable type are obtained from the AST, and the variable is declared in the constraints with the declare-const command. Variables of type int, boolean, and String have the corresponding z3 sorts Int, Bool, and String and can be declared directly with declare-const (1). A class type requires the names and types of all variables in the class to be collected; the type is defined with the declare-datatypes command (2), and the variable is then declared with declare-const when the object is instantiated (3).
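The declaration rules can be sketched as a pair of small Python helpers that emit the z3 commands as text. Several things here are assumptions: the helper names, the field names `passed` and `name` (only `node` appears in the text), the constructor name `mk-Student`, and the use of z3's older `declare-datatypes` syntax with a non-parametric sort, which differs from the `(Student Int Bool String)` form shown in item 3.

```python
Z3_SORTS = {"int": "Int", "boolean": "Bool", "String": "String"}


def declare_const(java_type, name):
    """Emit a declare-const; a class type uses the class name as its sort."""
    return f"(declare-const {name} {Z3_SORTS.get(java_type, java_type)})"


def declare_datatype(class_name, fields):
    """Emit a declare-datatypes command for a class.

    `fields` is a list of (java_type, field_name) pairs. Each field becomes an
    accessor, so that `student.node` can be written `(node student)`.
    """
    accessors = " ".join(
        f"({field} {Z3_SORTS.get(java_type, java_type)})"
        for java_type, field in fields
    )
    return f"(declare-datatypes () (({class_name} (mk-{class_name} {accessors}))))"


# Hypothetical Student class with one field of each supported primitive type.
print(declare_const("int", "a"))
print(declare_datatype("Student", [("int", "node"), ("boolean", "passed"), ("String", "name")]))
print(declare_const("Student", "student"))
```

The datatype has to be emitted before any constant of that sort is declared, which matches the order (2) then (3) in the description.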
Assignment statements:
1. a=1; -> (assert (= a 1))
2. student=student2; -> (assert (= student student2))
3. student.node=1; -> (assert (= (node student) 1))
An assignment statement is expressed with the assert command as an equality between its left-hand and right-hand sides; assert adds the formula to z3's internal stack. For variables of type int, boolean, String, or a class type, asserting that the two sides are equal suffices (1, 2). For an assignment involving a class member variable, the member is written in the form (member-variable-name class-variable-name) and the equality is then asserted (3).
Conditional statements:
assert(a<10); -> (assert (< a 10))
assert(student.node<10); -> (assert (< (node student) 10))
Conditional statements are also expressed with the assert command, following the same conversion rules as assignment statements; the details are not repeated here.
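The assignment and conditional rules above can be illustrated with a string-level Python sketch. The real method works on a JDT AST; the regular expressions and the function names `term`, `assignment_to_assert`, and `condition_to_assert` are stand-ins chosen for brevity, and only simple operands (names, member accesses, integer literals) are handled.

```python
import re

MEMBER = re.compile(r"([A-Za-z_]\w*)\.([A-Za-z_]\w*)")
COMPARISON = re.compile(r"^(.+?)\s*(<=|>=|==|!=|<|>)\s*(.+)$")
SMT_OPS = {"==": "=", "<": "<", ">": ">", "<=": "<=", ">=": ">="}


def term(operand):
    """Rewrite `obj.field` as the accessor application `(field obj)`."""
    return MEMBER.sub(r"(\2 \1)", operand.strip())


def assignment_to_assert(stmt):
    """`x = e;` becomes `(assert (= x e))`."""
    left, right = stmt.strip().rstrip(";").split("=", 1)
    return f"(assert (= {term(left)} {term(right)}))"


def condition_to_assert(stmt):
    """`assert(l op r);` becomes `(assert (op l r))` in prefix notation."""
    inner = stmt.strip().rstrip(";")[len("assert("):-1]
    left, op, right = COMPARISON.match(inner).groups()
    if op == "!=":
        # SMT-LIB has no != operator; negate the equality instead.
        return f"(assert (not (= {term(left)} {term(right)})))"
    return f"(assert ({SMT_OPS[op]} {term(left)} {term(right)}))"


for s in ["a=1;", "student=student2;", "student.node=1;"]:
    print(s, "->", assignment_to_assert(s))
for s in ["assert(a<10);", "assert(student.node<10);"]:
    print(s, "->", condition_to_assert(s))
```

Treating an assignment as an equality is sound only because the code is already in SSA form, where no variable is assigned twice along a path.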
After the above processing, SSA-form code has been converted into constraints in z3 format, which can be solved directly by the z3 solver.
Step 4: Associate the source code with the generated constraints and build the code-constraint library;
The preceding steps yield the source code of every function in the project and the constraints for each path of each function. These data are stored in a MySQL database to build the code-constraint library for users to query. The structure of the database table is as follows:
Field | Type | Key | Default |
---|---|---|---|
Id | int | PRI | NULL |
projectName | varchar(512) | | NULL |
methodName | varchar(512) | | NULL |
source | text | | NULL |
constraints | text | | NULL |
Here, Id is the unique identifier; projectName is the project name; methodName is the function name, which contains the absolute path of the function within the project and the function declaration; source is the source code of the function; and constraints holds the constraints corresponding to the source code.
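A minimal sketch of such a code-constraint library is shown below. To keep it runnable without a database server it uses Python's built-in sqlite3 module in place of MySQL. The table name `code_constraints`, the helper names, the sample row, and the choice to store all paths of a function in one row separated by blank lines are assumptions; the text specifies only the five fields.

```python
import sqlite3

# sqlite3 stands in for MySQL here so that the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE code_constraints (
        Id          INTEGER PRIMARY KEY,
        projectName VARCHAR(512),
        methodName  VARCHAR(512),
        source      TEXT,
        constraints TEXT
    )
    """
)


def store(project, method, source, path_constraints):
    """Store one function; the paths are joined by blank lines (an assumption)."""
    conn.execute(
        "INSERT INTO code_constraints (projectName, methodName, source, constraints)"
        " VALUES (?, ?, ?, ?)",
        (project, method, source, "\n\n".join(path_constraints)),
    )


def paths_of(method):
    """Return the per-path constraint texts stored for a function, if any."""
    row = conn.execute(
        "SELECT constraints FROM code_constraints WHERE methodName = ?", (method,)
    ).fetchone()
    return row[0].split("\n\n") if row else []


# Hypothetical entry for the func example; all values are illustrative only.
METHOD = "demo/Util.java: int func(int a, int b)"
store(
    "demo",
    METHOD,
    "int func(int a, int b) { if (a < 0) return a + b; return a - b; }",
    ["(assert (< a 0))\n(assert (= ret (+ a b)))",
     "(assert (not (< a 0)))\n(assert (= ret (- a b)))"],
)
print(len(paths_of(METHOD)))
```

Keeping the source and its constraints in the same row means that a satisfiable constraint set leads straight back to the code that should be returned to the user.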
Step 5: Build the code search system to help users search for code;
When a user queries for code, the code search system converts the input and output supplied by the user (the input stands for the function's parameters, the output for its return value) into constraints. It then finds the functions in the code-constraint library whose parameter and return types match, combines the constraints of each of these functions with the input/output constraints, and solves them with the z3 solver. If z3 returns sat, the constraints meet the user's requirement and the corresponding function is returned to the user; if it returns unsat, they do not.
The function of Fig. 2 serves as a detailed example. The function func in Fig. 2 returns the sum of a and b if the parameter a is less than 0, and otherwise returns the difference of a and b. Its input is two values of type int and its output is also of type int. Fig. 5 shows the constraints generated for the two paths of the function. To query for code, the user only needs to supply two int inputs and one int output, for example {2,1,1}. The code search tool converts these into the constraints (assert (= input1 2)), (assert (= input2 1)), (assert (= output 1)), combines them in turn with the constraints of each of the two paths of Fig. 5, and solves with z3. If the constraints of any path are sat, the corresponding function meets the user's requirement. In this example the constraints of path P2 are sat (a = 2 is not less than 0, and 2 - 1 = 1), so func is the function the user needs and its source code is returned to the user. If the user supplies the query {2,1,2}, the constraints of both paths are unsat, which means that the function does not meet the user's requirement; the code search system then searches the constraints of other functions in the database and solves them one by one until a suitable function is found.
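The query step can be sketched in Python as building one SMT-LIB script per path and handing it to z3. The path constraints in `FUNC_PATHS` are an assumed reconstruction (Fig. 5 is not reproduced in the text), naming the parameters `input1` and `input2` directly is a simplification, and the helper names are invented for the sketch. `solve` calls the z3 command-line tool only if it is installed and otherwise returns `None`.

```python
import shutil
import subprocess

DECLS = [
    "(declare-const input1 Int)",
    "(declare-const input2 Int)",
    "(declare-const output Int)",
]

# Assumed reconstruction of the two path constraints of func: the True branch
# returns the sum of the parameters, the False branch their difference.
FUNC_PATHS = {
    "P1": ["(assert (< input1 0))", "(assert (= output (+ input1 input2)))"],
    "P2": ["(assert (not (< input1 0)))", "(assert (= output (- input1 input2)))"],
}


def smt_int(value):
    """SMT-LIB writes a negative integer literal as (- n)."""
    return f"(- {-value})" if value < 0 else str(value)


def query_constraints(inputs, output):
    """Turn a query such as {2,1,1} into input/output constraints."""
    lines = [f"(assert (= input{i} {smt_int(v)}))" for i, v in enumerate(inputs, 1)]
    lines.append(f"(assert (= output {smt_int(output)}))")
    return lines


def build_script(path, inputs, output):
    """Combine one path's constraints with the query and ask for satisfiability."""
    return "\n".join(DECLS + path + query_constraints(inputs, output) + ["(check-sat)"])


def solve(script):
    """Run the z3 command-line tool if it is installed; otherwise return None."""
    if shutil.which("z3") is None:
        return None
    done = subprocess.run(["z3", "-in", "-smt2"], input=script,
                          capture_output=True, text=True)
    return done.stdout.strip()


if __name__ == "__main__":
    for name, path in FUNC_PATHS.items():
        print(name, solve(build_script(path, [2, 1], 1)))
```

Each path is solved separately, and a function counts as a match as soon as any one of its paths returns sat, which mirrors the one-by-one combination described in the text.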
The above is only a preferred embodiment of the present invention and does not limit the present invention in any form. Any simple modification or equivalent variation made to the above embodiment by a person skilled in the art according to the technical essence of the present invention, without departing from its scope, still falls within the scope of protection of the claims of the present invention.
Claims (10)
- 1. A code search method based on constraint solving, characterized by comprising the following steps: Step 1: obtaining open source projects from open source communities; Step 2: analyzing the source code using JPF and JDT and converting it into SSA form; Step 3: analyzing the SSA-form code using JDT and converting it into constraints; Step 4: associating the source code with the generated constraints and building a code-constraint library; Step 5: building a code search system to help users search for code.
- 2. The code search method based on constraint solving according to claim 1, characterized in that, in Step 1: the open source communities are open source code repositories such as SourceForge, GitHub, BitBucket, and Google Code.
- 3. The code search method based on constraint solving according to claim 2, characterized in that, in Step 1: open source projects are downloaded directly from open source communities such as SourceForge, GitHub, BitBucket, and Google Code, or obtained by writing a crawler.
- 4. The code search method based on constraint solving according to claim 1, characterized in that, in Step 2: Step 2.1: the .class files are analyzed using the jpf-symbc module of JPF (Java PathFinder) to obtain the reachable paths of all member functions; Step 2.2: JDT (Java Development Tools) is used to build an AST (Abstract Syntax Tree) for the reachable paths, which is analyzed and converted into SSA (Static Single Assignment) form.
- 5. The code search method based on constraint solving according to claim 4, characterized in that, in Step 2: the open source projects obtained in Step 1 are imported into Eclipse and compiled to generate .class files; the source code and its corresponding .class files are analyzed using the jpf-symbc module of JPF to obtain the reachable paths of all member functions in the open source project; the if statements of the source code are instrumented; after all reachable paths of a function have been obtained, JDT is used to build an AST for the code of all reachable paths, which is analyzed and converted into SSA form; SSA is an intermediate representation which guarantees that each variable is assigned only once in the program, that is, each variable is defined exactly once, ensuring precise use-definition relations.
- 6. The code search method based on constraint solving according to claim 1, characterized in that, in Step 3: JDT is used again to build an AST for the SSA-form paths, which is analyzed and converted into constraints, including the conversion of if statements, while statements, for statements, and class member variables.
- 7. The code search method based on constraint solving according to claim 6, characterized in that, in Step 3: JDT is used to build an AST for the generated SSA-form code, and the variable declarations, assignment statements, and conditional statements in the code are processed to convert the code into constraints; the types handled are int, boolean, and String, together with class types composed of these types.
- 8. The code search method based on constraint solving according to claim 1, characterized in that, in Step 4: the source code of all member functions and the corresponding constraints are stored in a MySQL database to build the code-constraint library.
- 9. The code search method based on constraint solving according to claim 8, characterized in that, in Step 4: the source code of all functions of the project and the constraints of each path of each function are obtained; these data are stored in a MySQL database to build the code-constraint library for users to query; the structure of the database table is as follows:
Field | Type | Key | Default |
---|---|---|---|
Id | int | PRI | NULL |
projectName | varchar(512) | | NULL |
methodName | varchar(512) | | NULL |
source | text | | NULL |
constraints | text | | NULL |
Here, Id is the unique identifier; projectName is the project name; methodName is the function name, which contains the absolute path of the function within the project and the function declaration; source is the source code of the function; and constraints holds the constraints corresponding to the source code.
- 10. The code search method based on constraint solving according to claim 1, characterized in that, in Step 5, a code search tool is built with which the user queries for the needed code by input and output: when a user queries for code, the code search system converts the input and output supplied by the user (the input stands for the function's parameters, the output for its return value) into constraints, then finds the functions in the code-constraint library whose parameter and return types match, combines the constraints of each of these functions with the input/output constraints, and solves them with the z3 solver; if z3 returns sat, the constraints meet the user's requirement and the corresponding function is returned to the user; if it returns unsat, they do not.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711405834.0A CN107992324A (en) | 2017-12-22 | 2017-12-22 | A kind of code search method based on constraint solving |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107992324A true CN107992324A (en) | 2018-05-04 |
Family
ID=62041477
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711405834.0A Pending CN107992324A (en) | 2017-12-22 | 2017-12-22 | A kind of code search method based on constraint solving |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107992324A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104077144A (en) * | 2014-07-07 | 2014-10-01 | 西安交通大学 | Data race detection and evidence generation method based on multithreaded program constraint building |
CN106610898A (en) * | 2016-12-28 | 2017-05-03 | 南京大学 | JPF-based Java code SSA single path generation method |
CN106649118A (en) * | 2016-12-28 | 2017-05-10 | 南京大学 | Generating method of SSA single path of Java code based on AST |
CN107247668A (en) * | 2017-06-07 | 2017-10-13 | 成都四象联创科技有限公司 | Code automatic detection and bearing calibration |
Non-Patent Citations (1)
Title |
---|
KATHRYN T. STOLEE,SEBASTIAN ELBAUM,DANIEL DOBOS: ""Solving the Search for Source Code"", 《ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112527388A (en) * | 2019-09-17 | 2021-03-19 | 中国科学院软件研究所 | GitHub large-scale open source code-oriented quick code file tracing method and device |
CN112527388B (en) * | 2019-09-17 | 2022-10-11 | 中国科学院软件研究所 | GitHub large-scale open source code-oriented quick code file tracing method and device |
CN111177312A (en) * | 2019-12-10 | 2020-05-19 | 同济大学 | Open source code searching method with grammar and semantics fused |
CN112948374A (en) * | 2021-01-29 | 2021-06-11 | 吉林大学 | Relational database searching method based on logic program |
CN117193750A (en) * | 2023-11-08 | 2023-12-08 | 深圳大数信科技术有限公司 | Full stack low code platform implementation method based on CraphQL |
CN117193750B (en) * | 2023-11-08 | 2024-03-15 | 深圳大数信科技术有限公司 | Full stack low code platform implementation method based on CraphQL |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180504 |