CN107992324A - Code search method based on constraint solving - Google Patents

Publication number: CN107992324A
Authority: CN (China)
Application number: CN201711405834.0A
Legal status: Pending (the listed status is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 张天, 吴少博, 潘敏学, 姜人和
Current assignee: Nanjing University
Original assignee: Nanjing University
Priority date: 2017-12-22
Filing date: 2017-12-22
Publication date: 2018-05-04
Prior art keywords: code, constraint, source, function, search method

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/70: Software maintenance or management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a code search method based on constraint solving, comprising the following steps. Step 1: obtain open source projects from an open source community. Step 2: analyze the source code using JPF and JDT and convert it into SSA form. Step 3: analyze the SSA-form code using JDT and convert it into constraints. Step 4: map each piece of source code to its generated constraints and build a code-constraint library. Step 5: build a code search system that helps users search for code. The method addresses code search through constraint solving and handles loop statements and class member variables, which remedies shortcomings of earlier work and substantially improves search accuracy, so that programmers can find the code they need during software development, use it as a reference or reuse it, and improve development efficiency and quality.

Description

A code search method based on constraint solving
Technical field
The present invention relates to the field of computer software, and in particular to a code search method based on constraint solving.
Background
With the spread of open source thinking, open source communities have grown rapidly. Communities such as GitHub, SourceForge, and BitBucket host a vast amount of high-quality code that programmers can consult, reuse directly, or adapt. During software development, referring to this mature code can considerably improve software quality and development efficiency. However, because the volume of open source code is enormous and most open source communities offer only simple keyword-based search (for example, by project name), most of the returned code does not meet the programmer's needs and a great deal of manual screening is required, which is tedious and inefficient.
Much work already exists on code search, which falls broadly into keyword-based, syntactic-structure-based, and semantics-based techniques. Keyword-based techniques mainly match on function names, class names, variable names, and the like, and lack any real understanding of the code. Syntactic-structure-based techniques mainly build a call graph to analyze the overall structure of the code, and are suitable only when the programmer is searching for code with a particular structure. Semantics-based techniques can make up for these shortcomings, but code analysis is itself challenging and current work has significant limitations.
Summary of the invention
In view of the above problems, the purpose of the present invention is to provide a code search method based on constraint solving. The method improves on existing semantics-based code search techniques, handles Java loop statements, class member variables, and similar issues, converts code to constraints automatically, and builds a code search system on that basis to help programmers find the code they need.
To achieve this purpose, the present invention adopts the following scheme: a code search method based on constraint solving, comprising the following steps:
Step 1: Obtain open source projects from an open source community;
Step 2: Analyze the source code using JPF and JDT and convert it into SSA form;
Step 3: Analyze the SSA-form code using JDT and convert it into constraints;
Step 4: Map each piece of source code to its generated constraints and build a code-constraint library;
Step 5: Build a code search system that helps users search for code.
The technical effects of the present invention are:
1. Source code is converted into constraints automatically, and an approach is provided for handling loop statements and class member variables.
2. Existing semantics-based code search techniques are improved, giving programmers more accurate search results and solving the problem of manually screening for usable code.
3. An accurate and efficient code search system can greatly improve the efficiency and quality of software development.
Brief description of the drawings
Fig. 1 is a flow chart of the code search method based on constraint solving according to an embodiment of the present invention.
Fig. 2 is an example of extracting the paths of an if statement according to an embodiment of the present invention.
Fig. 3 is an example of extracting the paths of a while statement according to an embodiment of the present invention.
Fig. 4 is an example of converting a reachable path into SSA form according to an embodiment of the present invention.
Fig. 5 is an example of converting the two paths of Fig. 2 into constraints.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.
This embodiment provides a code search method based on constraint solving. The method addresses code search through constraint solving and handles loop statements and class member variables, which remedies shortcomings of earlier work and substantially improves search accuracy, so that programmers can find the code they need during software development, use it as a reference or reuse it, and improve development efficiency and quality. Fig. 1 is a flow chart of the method; the specific steps are:
Step 1: Obtain open source projects from an open source community;
Open source projects can be downloaded directly from open source communities such as GitHub and SourceForge, or obtained by writing a crawler. The present invention mainly analyzes code written in Java.
Step 2: Analyze the source code using JPF and JDT and convert it into SSA form;
The open source projects obtained in step 1 are imported into Eclipse and compiled to generate .class files.
The source code and its corresponding .class files are analyzed with the jpf-symbc module of JPF to obtain the reachable paths of all member functions in the project. Because JPF cannot report whether the path it executes through an if statement takes the True branch or the False branch, the if statements in the source code must be instrumented. Existing work already covers this processing, so the implementation details are not described here. Fig. 2 shows the two paths that JPF generates for the function func: P1 is the path through the True branch and P2 is the path through the False branch; assert(a<0) denotes the branch condition a<0.
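The branch split described above can be pictured with a small sketch. It is not the JPF-based instrumentation itself, whose details the text leaves to existing work; it only shows how one if statement yields two straight-line paths, each starting with the branch condition it assumes. The function name `split_if_paths` and the negated form `assert(!(...))` for the False branch are illustrative assumptions.

```python
def split_if_paths(condition, then_stmts, else_stmts):
    """Return the two straight-line paths of `if (condition) {...} else {...}`.

    Each path begins with an assert recording which branch was taken.
    The negated form used for the False branch is an assumption; the
    source only shows assert(a<0) for the True branch.
    """
    true_path = [f"assert({condition});"] + list(then_stmts)
    false_path = [f"assert(!({condition}));"] + list(else_stmts)
    return {"P1": true_path, "P2": false_path}


if __name__ == "__main__":
    # func from Fig. 2: return a+b when a<0, otherwise a-b.
    for name, path in split_if_paths("a<0", ["return a+b;"], ["return a-b;"]).items():
        print(name, path)
```

Keeping the branch condition as the first statement of each path means later steps can treat every path as plain straight-line code.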
Loop statements are more complex to handle. During static analysis the number of iterations of a loop cannot be determined; it is known only when the program runs. Because the analysis here works from the source code and converts it into constraints statically, the uncertainty in the iteration count is resolved by stipulating that each loop iterates at most n times, so that n+1 paths are generated for each loop; the extra path is the one that exits immediately without iterating. Fig. 3 shows an analysis of a while statement with n set to 2, that is, at most two iterations and three generated paths: P1 exits without iterating, P2 exits after one iteration, and P3 exits after two iterations. The choice of n involves a trade-off. If n is too small, paths that may actually be executed are not covered: in the example of Fig. 3 the loop ends when a>=10, which takes 7 iterations in total, and a bound of 2 cannot cover that path. If n is too large, say 50, then two consecutive loops in the source code produce on the order of 50 × 50 paths and three consecutive loops on the order of 50 × 50 × 50 paths, so the number of paths grows sharply and path explosion becomes the bottleneck.
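The bounded treatment of loops can be sketched as follows. This is a minimal illustration assuming a loop is given as a condition string and a list of body statements; `unroll_while` and `path_count` are hypothetical helper names, and the exit assertion appended to every path is an assumption about how the "exit" step is recorded.

```python
from math import prod


def unroll_while(condition, body, n):
    """Enumerate the n + 1 bounded paths of `while (condition) { body }`.

    Path k (k = 0..n) runs the body k times and then leaves the loop.
    """
    paths = []
    for k in range(n + 1):
        path = []
        for _ in range(k):
            path.append(f"assert({condition});")   # condition held, iterate
            path.extend(body)
        path.append(f"assert(!({condition}));")    # condition failed, exit
        paths.append(path)
    return paths


def path_count(loop_bounds):
    """Number of paths through loops placed one after another (not nested)."""
    return prod(n + 1 for n in loop_bounds)


if __name__ == "__main__":
    for p in unroll_while("a<10", ["a=a+1;"], 2):
        print(p)
    print(path_count([50, 50]), path_count([50, 50, 50]))
```

With the zero-iteration path counted, two consecutive loops bounded at 50 give 51 × 51 paths, the same order of magnitude as the 50 × 50 quoted above, which is why a large n quickly becomes impractical.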
After the above processing, all reachable paths of a function are available. JDT is then used to build an AST for the code of each reachable path, which is analyzed and converted into SSA form. SSA (static single assignment) is an intermediate representation that guarantees each variable is assigned exactly once in the program, that is, each variable is defined only once, which gives precise use-definition relationships and makes data-flow analysis and optimization algorithms simpler. Converting the code of a reachable path into SSA is relatively straightforward and is not described in detail here; Fig. 4 shows an example of code converted into SSA form.
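Since a reachable path is straight-line code, SSA conversion reduces to renaming: every assignment creates a fresh version of its target, and every use refers to the latest version. The sketch below illustrates that idea on (target, expression) pairs; it does not use JDT, and the `name_k` versioning scheme and the function name `to_ssa` are assumptions, since Fig. 4 is not reproduced here.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def to_ssa(assignments, params):
    """Rename a straight-line path of (target, expression) pairs into SSA form.

    Parameters start at version 0; each assignment bumps the version of its
    target, and identifiers on the right-hand side are rewritten to the
    version that is current at that point.
    """
    version = {p: 0 for p in params}

    def current(match):
        name = match.group(0)
        return f"{name}_{version[name]}" if name in version else name

    ssa = []
    for target, expr in assignments:
        rhs = IDENT.sub(current, expr)              # uses see the latest version
        version[target] = version.get(target, -1) + 1
        ssa.append((f"{target}_{version[target]}", rhs))
    return ssa


if __name__ == "__main__":
    path = [("a", "a+1"), ("a", "a+1"), ("c", "a-b")]
    for lhs, rhs in to_ssa(path, ["a", "b"]):
        print(f"{lhs} = {rhs}")
```

Because each renamed variable is defined exactly once, every statement can later become an equality constraint without two constraints contradicting each other over the same name.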
Step 3: Analyze the SSA-form code using JDT and convert it into constraints;
In this step JDT is used to build an AST for the generated SSA-form code; variable declarations, assignment statements, and conditional statements in the code are processed, and the code is converted into constraints. The types handled here are int, boolean, and String, together with class types composed of these types. The conversions are demonstrated below; the constraints are written in the input format of the z3 solver.
Variable declarations:
1. int a; -> (declare-const a Int)
3. Student student; -> (declare-const student (Student Int Bool String))
The variable name and type are obtained from the AST and declared as a constraint with the declare-const command. Variables of type int, boolean, and String correspond to the z3 types Int, Bool, and String and are declared directly with declare-const (1). For a class type, the names and types of all variables in the class must be obtained and the type defined with the declare-datatypes command (2); when an object is instantiated, the variable is then declared with declare-const (3).
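The declaration rules can be sketched as a small text generator. This is only an illustration: the real method reads names and types from the JDT AST, whereas here they are passed in directly. The constructor name `mk-<Class>` and the non-parametric datatype form are assumptions (the source's own example declares the variable with the sort `(Student Int Bool String)`), and `declare_class` and `declare_variable` are hypothetical helper names.

```python
# Java primitive/String types and their z3 sorts, as listed in the text.
SORTS = {"int": "Int", "boolean": "Bool", "String": "String"}


def declare_class(class_name, fields):
    """Emit a declare-datatypes command for a class.

    `fields` is a list of (member_name, java_type) pairs. Each member becomes
    an accessor, so `student.node` can later be written `(node student)`.
    The record layout below is one common z3 form, assumed for illustration.
    """
    accessors = " ".join(f"({name} {SORTS[jtype]})" for name, jtype in fields)
    return f"(declare-datatypes () (({class_name} (mk-{class_name} {accessors}))))"


def declare_variable(name, java_type):
    """Emit declare-const; class types are passed through as their own sort."""
    return f"(declare-const {name} {SORTS.get(java_type, java_type)})"


if __name__ == "__main__":
    print(declare_variable("a", "int"))
    print(declare_class("Student", [("node", "int"), ("enrolled", "boolean"), ("name", "String")]))
    print(declare_variable("student", "Student"))
```

The member names `enrolled` and `name` in the usage lines are invented for the example; only `node` appears in the source.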
Assignment statements:
1. a=1; -> (assert (= a 1))
2. student=student2; -> (assert (= student student2))
3. student.node=1; -> (assert (= (node student) 1))
An assignment statement is expressed with the assert command as an equality between its left and right sides; assert adds the formula to z3's internal stack. Assignments to variables of type int, boolean, String, and class types directly assert that the two sides of the equals sign are equal (1, 2). An assignment involving a class member variable represents the member in the form (member-variable-name class-variable-name) and then asserts equality in the same way (3).
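The three assignment rules can be sketched as a string rewrite. This is a simplified illustration that works on statement text rather than on the JDT AST, and it assumes each side is a simple name, a single member access, or an integer literal; `term` and `assignment_to_constraint` are hypothetical names.

```python
def term(text):
    """Rewrite `student.node` as the accessor application `(node student)`.

    Only a single level of member access is handled; anything without a dot
    (a variable name or an integer literal) is returned unchanged.
    """
    if "." in text:
        obj, member = text.split(".", 1)
        return f"({member} {obj})"
    return text


def assignment_to_constraint(statement):
    """Turn `lhs=rhs;` into `(assert (= lhs rhs))`."""
    target, value = statement.rstrip(";").split("=", 1)
    return f"(assert (= {term(target.strip())} {term(value.strip())}))"


if __name__ == "__main__":
    for stmt in ("a=1;", "student=student2;", "student.node=1;"):
        print(stmt, "->", assignment_to_constraint(stmt))
```

Treating an assignment as an equality is sound here only because the code is already in SSA form, so no variable is equated with two different values along one path.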
Conditional statements:
assert(a<10); -> (assert (< a 10))
assert(student.node<10); -> (assert (< (node student) 10))
Conditional statements are also expressed with the assert command. The conversion follows the same rules as for assignment statements and is not described further here.
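For conditions, the main change is moving the comparison operator from infix to prefix position. The sketch below handles a single binary comparison written as `assert(left op right);`. The regular expression and helper names are assumptions for illustration, and mapping `==` and `!=` to `=` and `(not (= ...))` is this sketch's choice, since the source only shows `<`.

```python
import re

# One binary comparison inside assert(...); two-character operators come first
# so that `<=` is not read as `<`.
COMPARISON = re.compile(
    r"^\s*assert\s*\(\s*(.+?)\s*(<=|>=|==|!=|<|>)\s*(.+?)\s*\)\s*;?\s*$"
)


def term(text):
    """Rewrite `student.node` as `(node student)`; leave other operands as-is."""
    if "." in text:
        obj, member = text.split(".", 1)
        return f"({member} {obj})"
    return text


def condition_to_constraint(statement):
    """Turn `assert(a<10);` into the prefix form `(assert (< a 10))`."""
    match = COMPARISON.match(statement)
    if match is None:
        raise ValueError(f"unsupported condition: {statement!r}")
    left, op, right = match.groups()
    lhs, rhs = term(left), term(right)
    if op == "==":
        return f"(assert (= {lhs} {rhs}))"
    if op == "!=":
        return f"(assert (not (= {lhs} {rhs})))"
    return f"(assert ({op} {lhs} {rhs}))"


if __name__ == "__main__":
    print(condition_to_constraint("assert(a<10);"))
    print(condition_to_constraint("assert(student.node<10);"))
```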
After the above processing, SSA-form code has been converted into constraints in z3 format, which the z3 solver can solve directly.
Step 4: Map each piece of source code to its generated constraints and build a code-constraint library;
The preceding steps yield the source code of every function in the project, together with the constraints for each path of each function. These data are stored in a MySQL database to build the code-constraint library, which users can then query. The database table is structured as follows:
Field        Type          Key  Default
Id           int           PRI  NULL
projectName  varchar(512)       NULL
methodName   varchar(512)       NULL
source       text               NULL
constraints  text               NULL
Here, Id is a unique identifier; projectName is the project name; methodName is the function name, which includes the absolute path of the function within the project and the function declaration; source is the source code of the function; and constraints holds the constraints corresponding to the source code.
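The code-constraint library can be sketched with a few lines of database code. The source uses MySQL; the sketch below swaps in Python's built-in sqlite3 module so that it runs without a server, which is a substitution and not the patent's setup. The table name `code_constraint` and the helper names are hypothetical; the column names follow the fields described above.

```python
import sqlite3

# Column names follow the described table; the table name is an assumption.
SCHEMA = """
CREATE TABLE code_constraint (
    Id          INTEGER PRIMARY KEY,
    projectName VARCHAR(512),
    methodName  VARCHAR(512),
    source      TEXT,
    constraints TEXT
)
"""


def open_store(path=":memory:"):
    """Create the store (in memory by default) and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_path(conn, project, method, source, constraints):
    """Store one path's constraints next to the source of its function."""
    conn.execute(
        "INSERT INTO code_constraint (projectName, methodName, source, constraints) "
        "VALUES (?, ?, ?, ?)",
        (project, method, source, constraints),
    )


def constraints_for(conn, method):
    """Return the stored constraint text of every path of one function."""
    rows = conn.execute(
        "SELECT constraints FROM code_constraint WHERE methodName = ? ORDER BY Id",
        (method,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_store()
    add_path(conn, "demo", "func(int,int)", "int func(int a, int b) {...}", "(assert (< a 0))")
    print(constraints_for(conn, "func(int,int)"))
```

Storing one row per path, as done here, is one possible reading of "the constraints for each path of each function"; the source does not say whether paths share a row.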
Step 5: Build a code search system that helps users search for code;
When a user queries for code, the code search system converts the input and output supplied by the user (the input represents the function's parameters and the output its return value) into constraints, then finds in the code-constraint library the functions whose parameter and return types match. The constraints of each such function are combined one by one with the input-output constraints and solved with the z3 solver. If z3 returns sat, the constraints meet the user's requirement and the corresponding function is returned to the user; if it returns unsat, they do not.
The function in Fig. 2 serves as a detailed example. The function func returns the sum of a and b if parameter a is less than 0, and otherwise returns the difference of a and b. Its input is two values of type int and its output is also of type int. Fig. 5 shows the constraints generated for the two paths of the function. To query for code, the user only needs to supply two int inputs and one int output, for example {2,1,1}. The code search tool converts these into the constraints (assert (= input1 2)), (assert (= input2 1)), (assert (= output 1)), combines them in turn with the constraints of each of the two paths in Fig. 5, and solves with z3. If the constraints of any path are sat, the corresponding function meets the user's requirement. In this example the constraints of path P2 (the False branch) are sat, because a = 2 is not less than 0 and 2 - 1 = 1, so func is the function the user needs and its source code is returned to the user. If the user supplies the query {2,1,2}, the constraints of both paths are unsat, so the function does not meet the user's requirement; the code search system then goes on to the constraints of other functions in the database and solves them one by one until a suitable function is found.
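The query step can be sketched as assembling one solver script per stored path. The sketch below only builds the SMT-LIB text; running it through z3 is left out so the example stays dependency-free. The names `input1`, `input2`, and `output` come from the text, while the two path constraint lists are assumed stand-ins for Fig. 5, which is not reproduced here, and `lit`, `io_constraints`, and `build_query` are hypothetical helpers.

```python
def lit(value):
    """Integer literal in SMT-LIB; negative numbers are written (- n)."""
    return f"(- {-value})" if value < 0 else str(value)


def io_constraints(inputs, output):
    """Declare and pin the user's inputs and expected output (all Int here)."""
    lines = []
    for i, value in enumerate(inputs, start=1):
        lines.append(f"(declare-const input{i} Int)")
        lines.append(f"(assert (= input{i} {lit(value)}))")
    lines.append("(declare-const output Int)")
    lines.append(f"(assert (= output {lit(output)}))")
    return lines


def build_query(path_constraints, inputs, output):
    """Combine one path's constraints with the I/O constraints and ask for sat."""
    return "\n".join(io_constraints(inputs, output) + list(path_constraints) + ["(check-sat)"])


# Assumed constraints for the two paths of func (stand-ins for Fig. 5).
P1 = ["(assert (< input1 0))", "(assert (= output (+ input1 input2)))"]
P2 = ["(assert (not (< input1 0)))", "(assert (= output (- input1 input2)))"]

if __name__ == "__main__":
    print(build_query(P2, [2, 1], 1))
```

Under these assumed constraints, the script built from P2 for the query {2,1,1} is the satisfiable one, while both scripts for {2,1,2} are unsatisfiable, which matches the walkthrough above.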
The foregoing is only a preferred embodiment of the present invention and does not limit the invention in any form. Any simple modification or equivalent variation that a person skilled in the art makes to the above embodiment in accordance with the technical essence of the invention, without departing from its scope, still falls within the scope of protection of the claims.

Claims (10)

  1. A code search method based on constraint solving, characterized by comprising the following steps:
    Step 1: obtain open source projects from an open source community;
    Step 2: analyze the source code using JPF and JDT and convert it into SSA form;
    Step 3: analyze the SSA-form code using JDT and convert it into constraints;
    Step 4: map each piece of source code to its generated constraints and build a code-constraint library;
    Step 5: build a code search system that helps users search for code.
  2. The code search method based on constraint solving according to claim 1, characterized in that, in step 1:
    The open source community refers to open source code repositories such as SourceForge, GitHub, BitBucket, and Google Code.
  3. The code search method based on constraint solving according to claim 2, characterized in that, in step 1:
    Open source projects are downloaded directly from open source communities such as SourceForge, GitHub, BitBucket, and Google Code, or obtained by writing a crawler.
  4. The code search method based on constraint solving according to claim 1, characterized in that, in step 2:
    Step 2.1: the .class files are analyzed using the jpf-symbc module of JPF (Java PathFinder) to obtain the reachable paths of all member functions;
    Step 2.2: JDT (Java Development Tools) is used to build an AST (Abstract Syntax Tree) for each reachable path, which is analyzed and converted into SSA (Static Single Assignment) form.
  5. The code search method based on constraint solving according to claim 4, characterized in that, in step 2:
    The open source projects obtained in step 1 are imported into Eclipse and compiled to generate .class files;
    The source code and its corresponding .class files are analyzed using the jpf-symbc module of JPF to obtain the reachable paths of all member functions in the open source project; the if statements in the source code are instrumented;
    After all reachable paths of a function are obtained, JDT is used to build an AST for the code of each reachable path, which is analyzed and converted into SSA form; SSA is an intermediate representation that guarantees each variable is assigned exactly once in the program, that is, each variable is defined only once, which gives precise use-definition relationships.
  6. The code search method based on constraint solving according to claim 1, characterized in that, in step 3:
    JDT is used again to build an AST for each path in SSA form, which is analyzed and converted into constraints; this includes the conversion of if statements, while statements, for statements, and class member variables.
  7. The code search method based on constraint solving according to claim 6, characterized in that, in step 3:
    JDT is used to build an AST for the generated SSA-form code; variable declarations, assignment statements, and conditional statements in the code are processed, and the code is converted into constraints; the types handled are int, boolean, and String, together with class types composed of these types.
  8. The code search method based on constraint solving according to claim 1, characterized in that, in step 4:
    The source code of every member function and its constraints are stored in one-to-one correspondence in a MySQL database to build the code-constraint library.
  9. The code search method based on constraint solving according to claim 8, characterized in that, in step 4:
    The source code of every function in the project is obtained, together with the constraints for each path of each function; these data are stored in a MySQL database to build the code-constraint library, which users can then query; the database table is structured as follows:
    Field        Type          Key  Default
    Id           int           PRI  NULL
    projectName  varchar(512)       NULL
    methodName   varchar(512)       NULL
    source       text               NULL
    constraints  text               NULL
    Here, Id is a unique identifier; projectName is the project name; methodName is the function name, which includes the absolute path of the function within the project and the function declaration; source is the source code of the function; and constraints holds the constraints corresponding to the source code.
  10. The code search method based on constraint solving according to claim 1, characterized in that, in step 5, a code search tool is built with which the user queries for the needed code by input and output:
    When a user queries for code, the code search system converts the input and output supplied by the user (the input represents the function's parameters and the output its return value) into constraints, then finds in the code-constraint library the functions whose parameter and return types match; the constraints of each such function are combined one by one with the input-output constraints and solved with the z3 solver; if z3 returns sat, the constraints meet the user's requirement and the corresponding function is returned to the user; if it returns unsat, they do not.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711405834.0A 2017-12-22 2017-12-22 Code search method based on constraint solving

Publications (1)

Publication Number Publication Date
CN107992324A 2018-05-04

Family

ID=62041477

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
CN201711405834.0A Pending CN107992324A (en) 2017-12-22 2017-12-22 Code search method based on constraint solving

Country Status (1)

Country Link
CN (1) CN107992324A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077144A (en) * 2014-07-07 2014-10-01 西安交通大学 Data race detection and evidence generation method based on multithreaded program constraint building
CN106610898A (en) * 2016-12-28 2017-05-03 南京大学 JPF-based Java code SSA single path generation method
CN106649118A (en) * 2016-12-28 2017-05-10 南京大学 Generating method of SSA single path of Java code based on AST
CN107247668A (en) * 2017-06-07 2017-10-13 成都四象联创科技有限公司 Code automatic detection and bearing calibration

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KATHRYN T. STOLEE, SEBASTIAN ELBAUM, DANIEL DOBOS: "Solving the Search for Source Code", ACM Transactions on Software Engineering and Methodology *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112527388A (en) * 2019-09-17 2021-03-19 中国科学院软件研究所 GitHub large-scale open source code-oriented quick code file tracing method and device
CN112527388B (en) * 2019-09-17 2022-10-11 中国科学院软件研究所 GitHub large-scale open source code-oriented quick code file tracing method and device
CN111177312A (en) * 2019-12-10 2020-05-19 同济大学 Open source code searching method with grammar and semantics fused
CN112948374A (en) * 2021-01-29 2021-06-11 吉林大学 Relational database searching method based on logic program
CN117193750A (en) * 2023-11-08 2023-12-08 深圳大数信科技术有限公司 Full stack low code platform implementation method based on CraphQL
CN117193750B (en) * 2023-11-08 2024-03-15 深圳大数信科技术有限公司 Full stack low code platform implementation method based on CraphQL

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180504)