CN103559025A - Software refactoring method through clustering - Google Patents

Info

Publication number
CN103559025A
CN103559025A CN201310495785.XA CN201310495785A CN103559025A CN 103559025 A CN103559025 A CN 103559025A CN 201310495785 A CN201310495785 A CN 201310495785A CN 103559025 A CN103559025 A CN 103559025A
Authority
CN
China
Prior art keywords
source code
entity
program entity
correlation coefficient
software
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310495785.XA
Other languages
Chinese (zh)
Other versions
CN103559025B (en)
Inventor
曹阳
王永会
王守金
李孟歆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Jianzhu University
Original Assignee
Shenyang Jianzhu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Jianzhu University
Priority to CN201310495785.XA
Publication of CN103559025A
Application granted
Publication of CN103559025B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Stored Programmes (AREA)

Abstract

The invention discloses a software refactoring method through clustering and belongs to the technical field of software engineering. The method comprises the following steps: inputting source code information into a source code parser; parsing the source code information and extracting program entities and their related attributes; calling a filter to screen out redundant information in the source code information, and using a domain rule base to establish a system fact base; determining correlation coefficients among the entities through similarity calculation; automatically decomposing the system into function-based core business concern modules through directed graph cluster analysis; and verifying the correctness of the refactored system and adjusting the domain rule base according to the verification results. The method automatically decomposes a large, complex software system into smaller, more manageable subsystems, so that the system is easier to understand and maintain. In addition, by modifying the correlation coefficients of the attributes in the domain rule base, the method can be applied to different application domains and therefore has good generality.

Description

A method for software refactoring using clustering
Technical field
The present invention relates to a method for software refactoring using clustering, and belongs to the technical field of software engineering.
Background art
Refactoring is a process of program transformation that improves how software requirements are implemented while keeping program behavior unchanged. Software inevitably changes over its life cycle; the changes may stem from changing user requirements or from correcting defects in the software itself. To reduce maintenance cost and extend the service life of software, maintainers often face the problem of refactoring it. As the scale and complexity of software systems grow, however, the interactions among a system's functional modules become more complex. How to refactor legacy systems, particularly those lacking documentation, is a pressing problem in software maintenance.
To address this problem, Chinese invention patent No. 200810163396.6, "Code-level component assembly method based on grammar reconstruction", treats a component as a relatively independent, reusable software module and discloses a code-level component assembly method based on grammar reconstruction. That method separates the abstract syntax and the concrete syntax from the syntax specification of a programming language, builds code-level components that conform to the new syntax specification, and then assembles the components, which gives software reuse the advantage of language independence. This reconstruction works at the code level, however, and does not improve program structure. In particular, a system contains many function-based core business concerns whose logical relationships vary in strength, and these concerns may be spread across different modules. For example, suppose the inventory control function of a manufacturing program must be modified, and that function is scattered across several modules such as raw-material inventory management, finished-goods inventory management, and material supply. Under the above patented method, the code change touches all of these modules, which increases both the cost of the change and the likelihood of errors. Because core business concerns may span multiple modules yet must remain easy to modify, maintain, and upgrade, maintainers performing refactoring should analyze the dependencies and coupling among the program's core business concerns. Merely reconstructing code-level components based on grammar cannot substantially improve the refactored system.
Summary of the invention
The present invention is proposed in view of the above problems. Its object is to provide a method for software refactoring using clustering that automatically decomposes a large, complex software system into smaller, more manageable subsystems.
To achieve this object, the technical solution of the present invention is as follows:
(1) Input the source code files and parse the source code information so that the program's semantic information is fully represented according to the syntax rules of the programming language;
(2) Construct a filter to screen out redundant information; determine the system's program entities and their related attributes from the program's semantic information; and, according to the dependencies and coupling among program entity attributes, assign a correlation coefficient to each related attribute in the domain rule base and generate the fact base;
(3) Calculate similarity: program entities share multiple related attributes, and the correlation coefficient between program entities is determined by similarity calculation;
(4) Build clusters: using directed graph cluster analysis, group program entities that are similar or highly correlated into clusters, each of which forms a new module;
(5) Visualize the results: present the results of the cluster analysis to system maintainers in a form that is easy to understand and use, so that they can complete the refactoring.
Compared with the prior art, the present invention has the following advantages:
(1) Refactoring by clustering targets the function-based core business concerns; after the system's modules are rebuilt, the code has good reusability;
(2) After cluster analysis, the program entities within each module are highly similar or highly correlated, which resolves scattered and tangled code and makes the system easy to understand and maintain;
(3) In different application domains the same program entity attribute has different correlation characteristics. Modifying the correlation coefficients of attributes in the domain rule base changes the clustering result accordingly, which gives the invention good generality.
Brief description of the drawings
Fig. 1 is a schematic diagram of the software refactoring process using clustering.
Fig. 2 is a schematic diagram of the structure of the source code parser.
Fig. 3 is example diagram A of the directed graph cluster analysis method: the directed graph.
Fig. 4 is example diagram B of the directed graph cluster analysis method: the clustering diagram.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawings and specific embodiments. The scope of protection is not limited by the specific embodiments and is defined by the claims. In addition, any change or modification that a person of ordinary skill in the art could readily make without departing from the scheme of the present invention falls within the scope of the claims.
Referring to Fig. 1, the present invention comprises the following steps:
Step 1: referring to Fig. 2, call the source code parser to parse and filter the source code and build the fact base. The detailed process of this step is as follows:
(1) Referring to Fig. 2, scan the source code files and input the source code information into the source code parser;
(2) Referring to Fig. 2, parse the source code information and extract the program entities and their related attributes. In detail: perform syntactic analysis on the source code; extract the syntax tree; perform semantic analysis on the syntax tree; and obtain the program entities and their related attributes. Program entities include classes, functions, and business processes. Entity attributes include packages, files, functions, databases, test cases, and so on;
(3) Referring to Fig. 2, call the filter to screen out redundant information in the source code information and, using the correlation coefficient that the domain rule base assigns to each attribute, build the fact base.
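The parse-and-filter sub-steps above can be sketched as follows. This is a minimal illustration, not the patent's parser: it assumes the source code is Python and uses the standard `ast` module in place of the source code parser. The names `ProgramEntity`, `extract_entities`, `filter_entities`, and the sample source are hypothetical.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass, field


@dataclass
class ProgramEntity:
    name: str
    kind: str  # "class" or "function"
    attributes: dict = field(default_factory=dict)


def extract_entities(source: str, file_name: str) -> list[ProgramEntity]:
    """Parse source text and collect top-level classes and functions."""
    tree = ast.parse(source)  # syntactic analysis: source text -> syntax tree
    entities = []
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue
        # Semantic pass (simplified): names of functions called inside the entity.
        calls = sorted({
            n.func.id
            for n in ast.walk(node)
            if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
        })
        entities.append(
            ProgramEntity(node.name, kind, {"file": file_name, "function": calls})
        )
    return entities


def filter_entities(entities: list[ProgramEntity], keep: set) -> list[ProgramEntity]:
    """Filter step: keep only attributes that the domain rule base knows about."""
    return [
        ProgramEntity(e.name, e.kind,
                      {k: v for k, v in e.attributes.items() if k in keep})
        for e in entities
    ]


# Hypothetical sample input for illustration only.
SAMPLE = '''
def check_stock(item):
    return query_db(item)

class Inventory:
    def reserve(self, item):
        return check_stock(item)
'''

entities = filter_entities(extract_entities(SAMPLE, "inventory.py"),
                           {"file", "function"})
```

Here the "function" attribute of each entity lists the functions it calls, which is one plausible way to record the dependencies that later steps turn into a directed graph.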
Building the syntax rule base. The syntax rule base supplies syntax rules to the parser, which uses the context-free grammar of a given programming language to translate source code into a syntax tree for that language.
Building the domain rule base. In different application domains the same program entity attribute has different correlation characteristics. Taking domain factors into account, a correlation coefficient is assigned to each related attribute of the system according to the dependencies and coupling among the system's program entity attributes. Attributes associated with core business concerns receive higher coefficients, so that core business concerns are clustered more tightly. Domain knowledge is the set of descriptions of the domain's functions; the description of each function includes: program entity number, domain, version number, functional description, business object, backup, related attributes, and correlation coefficients.
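A domain rule base of this kind could be represented as a simple lookup table from attribute to correlation coefficient. The sketch below is illustrative only: the `DomainRule` type, the 0 to 1 coefficient scale, and all coefficient values are assumptions, since the patent does not give concrete numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainRule:
    attribute: str            # e.g. "database", "file"
    coefficient: float        # assumed scale: 0.0 (unrelated) to 1.0 (strongly related)
    core_concern: bool = False


# Hypothetical values for an inventory-management domain.
DOMAIN_RULES = {
    rule.attribute: rule
    for rule in [
        DomainRule("database", 0.9, core_concern=True),
        DomainRule("function", 0.7, core_concern=True),
        DomainRule("file", 0.4),
        DomainRule("package", 0.3),
        DomainRule("test_case", 0.2),
    ]
}


def coefficient_for(attribute: str, default: float = 0.0) -> float:
    """Look up the coefficient of an attribute; unknown attributes get the default."""
    rule = DOMAIN_RULES.get(attribute)
    return rule.coefficient if rule else default
```

Attributes tied to core business concerns carry the larger coefficients here, mirroring the statement that such attributes should score higher so that the concerns cluster together.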
Building the fact base. The fact base is obtained by filtering the source code information under the domain rules. It is the set of descriptions of the system's core business concerns; the description of each program entity includes: program entity number, interface name, core business concern, input parameters, output parameters, return value, program entity provider, version number, keywords, related attributes, and correlation coefficients.
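The fields listed for each fact base entry map naturally onto a record type. The following sketch assumes one possible layout; the `FactEntry` class, the helper `entries_for_concern`, and the sample entries are hypothetical and serve only to show the shape of the data.

```python
from dataclasses import dataclass, field


@dataclass
class FactEntry:
    """One program entity as described in the fact base."""
    entity_id: int
    interface_name: str
    core_concern: str
    input_params: list = field(default_factory=list)
    output_params: list = field(default_factory=list)
    return_value: str = ""
    provider: str = ""
    version: str = ""
    keywords: list = field(default_factory=list)
    # attribute name -> (attribute value, correlation coefficient)
    attributes: dict = field(default_factory=dict)


def entries_for_concern(fact_base: dict, concern: str) -> list:
    """Return the ids of all entities tagged with the given core business concern."""
    return sorted(e.entity_id for e in fact_base.values() if e.core_concern == concern)


# Hypothetical entries for illustration only.
fact_base = {
    e.entity_id: e
    for e in [
        FactEntry(1, "check_stock", "inventory control", ["item"], [], "int",
                  attributes={"database": ("stock_db", 0.9)}),
        FactEntry(2, "reserve_item", "inventory control", ["item", "qty"], [], "bool",
                  attributes={"database": ("stock_db", 0.9)}),
        FactEntry(3, "print_report", "reporting", [], [], "None",
                  attributes={"file": ("report.py", 0.4)}),
    ]
}
```

Keying the fact base by entity number keeps lookups cheap when later steps refer to entities by number, as the directed graph example does.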
Step 2: similarity calculation. Program entities share multiple related attributes; similarity is calculated according to formula (1) to determine the correlation coefficient between program entities;
[Formula (1): equation image, not reproduced in this text]
where x and y denote program entities; d denotes the number of attributes; s(x, y) denotes the correlation coefficient between program entities x and y; s(x_k, y_k) is the correlation coefficient of the k-th attribute of x and y; and w(x_k, y_k) takes the value 0 or 1, indicating whether the k-th attributes of x and y are related.
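Formula (1) is not legible in this text, so the sketch below assumes one common form that is consistent with the symbol definitions: a weighted average, s(x, y) = sum over k of w(x_k, y_k) * s(x_k, y_k), divided by the sum over k of w(x_k, y_k), with k running from 1 to d. This is an assumption, not a reproduction of the patent's formula, and the function name `entity_similarity` is hypothetical.

```python
def entity_similarity(s: list, w: list) -> float:
    """Assumed form of formula (1): average of s_k over the attributes with w_k = 1.

    s[k] is the correlation coefficient of the k-th attribute pair;
    w[k] is 1 if the k-th attributes of x and y are related, else 0.
    """
    if len(s) != len(w):
        raise ValueError("s and w must both have d entries")
    if any(wk not in (0, 1) for wk in w):
        raise ValueError("each w_k must be 0 or 1")
    related = sum(w)
    if related == 0:
        return 0.0  # no related attributes: treat the entities as unrelated
    return sum(wk * sk for wk, sk in zip(w, s)) / related


# Three attributes (d = 3); the third pair is unrelated and is ignored.
example = entity_similarity([1.0, 0.5, 0.25], [1, 1, 0])
```

With this form, an attribute pair that is flagged as unrelated contributes to neither the numerator nor the denominator, so it does not pull the score down.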
Step 3: cluster analysis. Build a directed graph from the program entity dependencies in the fact base, then perform cluster analysis according to the entities' similarity results. The detailed process of this step is as follows:
(1) Build the directed graph. Taking Fig. 3 as an example, suppose the fact base contains 10 program entities (numbered 1 to 10), from which a directed graph is built. A solid line with an arrow indicates that two entities have a dependency. As shown in Fig. 3, entity 2 depends on entity 1.
(2) Directed graph cluster analysis. Taking Fig. 4 as an example, cluster analysis according to entity similarity yields two clusters, {1, 2, 3} and {4, 6, 10}. In the figure, a solid line indicates high similarity or correlation (a high correlation coefficient) and a dashed line indicates low similarity or correlation (a low correlation coefficient). Entities 4 and 10 are highly similar or correlated because they both reference entity 7, whereas entities 5 and 6 have low similarity or correlation, possibly because their common child node 8 is not a core business concern.
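One possible reading of this clustering step is: treat every entity pair whose correlation coefficient reaches a threshold as a high-similarity (solid) link, then take the groups of entities joined by such links as clusters. The sketch below implements that reading with a union-find structure. The threshold, the pair scores, and the function name `cluster_entities` are all hypothetical; the scores are chosen only so that the outcome matches the two clusters named in the Fig. 4 example.

```python
def cluster_entities(nodes, similarities, threshold):
    """Group nodes joined by links whose similarity is at least the threshold."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the representative, halving the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), score in similarities.items():
        if score >= threshold:          # high-similarity ("solid") link
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Report only groups with more than one entity, ordered by smallest member.
    return sorted((g for g in groups.values() if len(g) > 1), key=min)


# Hypothetical scores for entities 1..10; not taken from the patent figures.
similarities = {
    (1, 2): 0.8,
    (2, 3): 0.75,
    (4, 10): 0.9,   # both reference entity 7
    (4, 6): 0.7,
    (5, 6): 0.2,    # low-similarity ("dashed") link, shared child 8
}
clusters = cluster_entities(range(1, 11), similarities, threshold=0.6)
```

Entities without any high-similarity link (5, 7, 8, and 9 in this example) remain on their own and are not reported as clusters.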
Step 4: rebuild the modules. Program entities that are similar or highly correlated are grouped into clusters, and each cluster forms a new module. These modules are presented to system maintainers in a form that is easy to understand and use, so that they can complete the refactoring;
Step 5: verify correctness. Submit the refactored system to users or domain experts, collect their opinions, and perform completeness, consistency, and non-redundancy checks.
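The patent does not define the three checks, so the sketch below assumes simple set-based meanings: completeness means every entity is assigned to some module, consistency means no module contains an unknown entity, and non-redundancy means no entity appears in more than one module. The function name `check_modules` and the sample module layout are hypothetical.

```python
def check_modules(entities: set, modules: list) -> dict:
    """Run the three assumed structural checks on a proposed module layout."""
    assigned = [e for module in modules for e in module]
    return {
        "complete": entities <= set(assigned),                 # nothing left out
        "consistent": set(assigned) <= entities,               # nothing unknown
        "non_redundant": len(assigned) == len(set(assigned)),  # no duplicates
    }


# Two clusters plus single-entity modules for the remaining entities.
entities = set(range(1, 11))
modules = [{1, 2, 3}, {4, 6, 10}, {5}, {7}, {8}, {9}]
report = check_modules(entities, modules)
```

Automated checks of this kind would complement, not replace, the review by users or domain experts described in this step.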
Step 6: adjust the domain rule base. According to the collected check results or suggestions, adjust the correlation coefficients of the related attributes of the domain's entities in the domain rule base, then refactor the system again.
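The adjustment step can be pictured as applying reviewer feedback to the coefficient table before the next refactoring pass. The sketch below assumes feedback arrives as a signed change per attribute and that coefficients stay within 0 to 1; both are assumptions, as is the function name `adjust_coefficients`.

```python
def adjust_coefficients(coefficients: dict, feedback: dict,
                        lo: float = 0.0, hi: float = 1.0) -> dict:
    """Return a new coefficient table with feedback deltas applied and clamped."""
    updated = dict(coefficients)  # leave the original rule base untouched
    for attribute, delta in feedback.items():
        if attribute not in updated:
            raise KeyError(f"unknown attribute: {attribute}")
        updated[attribute] = min(hi, max(lo, updated[attribute] + delta))
    return updated


# Hypothetical feedback: reviewers want "database" to count for more
# and "file" to stop influencing the clustering.
before = {"database": 0.5, "file": 0.5}
after = adjust_coefficients(before, {"database": 0.25, "file": -0.75})
```

Returning a new table rather than editing in place makes it easy to compare the clustering results of successive passes.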
Completing the above steps accomplishes the software refactoring: the large, complex software system is automatically decomposed into modules of function-based core business concerns, and the system structure becomes easier to manage.

Claims (4)

1. A method for software refactoring using clustering, characterized in that the method comprises the following steps in order:
Step 1: call the source code parser to parse and filter the source code and build the fact base;
The detailed process of this step is as follows:
(1) Scan the source code files and input the source code information into the source code parser;
(2) Parse the source code information and extract the program entities and their related attributes;
In detail: perform syntactic analysis on the source code; extract the syntax tree; perform semantic analysis on the syntax tree; obtain the program entities and their related attributes;
Program entities include classes, functions, and business processes; entity attributes include packages, files, functions, databases, test cases, and so on;
(3) Call the filter to screen out redundant information in the source code information and, using the correlation coefficient that the domain rule base assigns to each attribute, build the fact base;
Step 2: similarity calculation;
Perform similarity calculation on the related attributes shared by program entities to determine the correlation coefficient between program entities;
Step 3: cluster analysis;
Build a directed graph from the program entity dependencies in the fact base, then perform cluster analysis according to the entities' similarity results;
The detailed process of this step is as follows:
(1) Build the directed graph;
A solid line with an arrow in the graph indicates that two entities have a dependency;
(2) Directed graph cluster analysis;
Perform cluster analysis according to entity similarity; in the figure, a solid line indicates high similarity or correlation (a high correlation coefficient) and a dashed line indicates low similarity or correlation (a low correlation coefficient);
Nodes connected by solid lines are grouped into a cluster;
Step 4: rebuild the modules;
Program entities that are similar or highly correlated are grouped into clusters, and each cluster forms a new module;
These modules are presented to system maintainers in a form that is easy to understand and use, so that they can complete the refactoring;
Step 5: verify correctness;
Submit the refactored system to users or domain experts, collect their opinions, and perform completeness, consistency, and non-redundancy checks;
Step 6: adjust the domain rule base;
According to the collected check results or suggestions, adjust the correlation coefficients of the related attributes of the domain's entities in the domain rule base;
Then refactor the system again.
2. The method for software refactoring using clustering according to claim 1, characterized in that: a syntax rule base is built from knowledge of the grammar and supplies syntax rules to the parser, which uses the context-free grammar of a given programming language to translate source code into a syntax tree for that language.
3. The method for software refactoring using clustering according to claim 1, characterized in that: a domain rule base is built from domain knowledge; in different application domains the same program entity attribute has different correlation characteristics;
Taking domain factors into account, a correlation coefficient is assigned to each related attribute of the system according to the dependencies and coupling among the system's program entity attributes;
Domain knowledge is the set of descriptions of the domain's functions; the description of each function includes: program entity number, domain, version number, functional description, business object, backup, related attributes, and correlation coefficients.
4. The method for software refactoring using clustering according to claim 1, characterized in that: the fact base is obtained by filtering the source code information under the domain rules; it is the set of descriptions of the system's core business concerns, and the description of each program entity includes: program entity number, interface name, core business concern, input parameters, output parameters, return value, program entity provider, version number, keywords, related attributes, and correlation coefficients.
CN201310495785.XA 2013-10-21 2013-10-21 Software refactoring method through clustering Expired - Fee Related CN103559025B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310495785.XA CN103559025B (en) 2013-10-21 2013-10-21 Software refactoring method through clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310495785.XA CN103559025B (en) 2013-10-21 2013-10-21 Software refactoring method through clustering

Publications (2)

Publication Number Publication Date
CN103559025A (en) 2014-02-05
CN103559025B (en) 2017-01-25

Family

ID=50013281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310495785.XA Expired - Fee Related CN103559025B (en) 2013-10-21 2013-10-21 Software refactoring method through clustering

Country Status (1)

Country Link
CN (1) CN103559025B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090276757A1 (en) * 2008-04-30 2009-11-05 Fraunhofer Usa, Inc. Systems and methods for inference and management of software code architectures
CN103235877A (en) * 2013-04-12 2013-08-07 北京工业大学 Robot control software module partitioning method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
方晨 等: "主成分分析和聚类分析在软件重构中的应用", 《计算机工程与设计》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593182A (en) * 2013-10-27 2014-02-19 沈阳建筑大学 Method for reconfiguring software by using clustering mode
CN104391964A (en) * 2014-12-01 2015-03-04 南京大学 Method for storing source codes into graph database
CN107678968A (en) * 2017-10-18 2018-02-09 北京奇虎科技有限公司 Sample extraction method, apparatus, computing device and the storage medium of source code function
CN109165155A (en) * 2018-06-20 2019-01-08 扬州大学 A kind of software defect recovery template extracting method based on clustering
CN109165155B (en) * 2018-06-20 2021-06-22 扬州大学 Software defect repairing template extraction method based on cluster analysis
CN110659063A (en) * 2019-08-08 2020-01-07 平安科技(深圳)有限公司 Software project reconstruction method and device, computer device and storage medium
CN111475158A (en) * 2020-03-16 2020-07-31 咪咕文化科技有限公司 Sub-domain dividing method and device, electronic equipment and computer readable storage medium
US11269625B1 (en) 2020-10-20 2022-03-08 International Business Machines Corporation Method and system to identify and prioritize re-factoring to improve micro-service identification
CN113190269A (en) * 2021-04-16 2021-07-30 南京航空航天大学 Code reconstruction method based on programming context information
CN113238796A (en) * 2021-05-17 2021-08-10 北京京东振世信息技术有限公司 Code reconstruction method, device, equipment and storage medium
CN113504972A (en) * 2021-07-26 2021-10-15 京东科技控股股份有限公司 Service deployment method and device, electronic equipment and storage medium
CN114237774A (en) * 2022-02-14 2022-03-25 北京安盟信息技术股份有限公司 Internal calling method for removing dependence of functional module

Also Published As

Publication number Publication date
CN103559025B (en) 2017-01-25

Similar Documents

Publication Publication Date Title
CN103559025A (en) Software refactoring method through clustering
CN111712809A (en) Learning ETL rules by example
CN105912594B (en) SQL statement processing method and system
CN105373469A (en) Interface based software automation test method
CN105912595A (en) Data origin collection method of relational databases
US20110314060A1 (en) Markup language based query and file generation
CN106919612A (en) A kind of processing method and processing device of SQL script of reaching the standard grade
CN103902269B (en) System and method for generating MIB files through XML files
CN107291450A (en) A kind of quick code automatic generation method for programming friendly
CN109446221A (en) A kind of interactive data method for surveying based on semantic analysis
CN107491476B (en) Data model conversion and query analysis method suitable for various big data management systems
CN102591777A (en) Unit test code generation method and device
CN109902117A (en) Operation system analysis method and device
CN109992271B (en) Layered architecture recognition method based on code vocabulary and structure dependence
CN103593182A (en) Method for reconfiguring software by using clustering mode
CN102902818A (en) Method and device for upgrading database
US9652478B2 (en) Method and apparatus for generating an electronic document schema from a relational model
CN103020318A (en) Method for maintenance of database tables in database
CN108256820A (en) A kind of PBOM methods of adjustment under three-dimensional assembled view based on MBD
Sanchez et al. Bigraphical modelling of architectural patterns
CN111984826B (en) XML-based data automatic warehousing method, system, device and storage medium
CN112130849B (en) Code automatic generation method and device
CN103678349A (en) Method and device for filtering useless data
Lu et al. Zen-CC: An automated and incremental conformance checking solution to support interactive product configuration
CN111008011A (en) System builder for power platform application development

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170125

Termination date: 20171021
