CN108182071A - A configuration error detection method for software upgrades - Google Patents

Info

Publication number
CN108182071A
Authority
CN
China
Prior art keywords
code
variations
product
configuration
software
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711432725.8A
Other languages
Chinese (zh)
Inventor
周红卫
刘延新
周博
吴昊
张晓洲
王钟沛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Run He Software Inc Co
Original Assignee
Jiangsu Run He Software Inc Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Run He Software Inc Co
Priority to CN201711432725.8A
Publication of CN108182071A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/71 Version control; Configuration management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to a configuration error detection method for software upgrades. First, the code structure of Puppet artifacts is analyzed, and a code change description model is designed that describes code in terms of features such as type, position, content, and change operation. Next, a web crawler obtains a large number of historical commit records from the projects' code version control tools, and the code change description model is used to extract and quantify change features. Finally, a density-based hierarchical clustering method groups similar code changes, and the error causes that led to the code adjustments are identified by analyzing and summarizing the code changes in each cluster, yielding common error patterns.

Description

A configuration error detection method for software upgrades
Technical field
The present invention relates to a configuration error detection method for software upgrades, and belongs to the field of software technology.
Background art
Cloud computing has made application systems increasingly large-scale, diverse, personalized, and real-time, so cloud applications face demands for agile development, continuous delivery, rapid deployment, and efficient operations. In this context, automated operations techniques and tools are increasingly replacing traditional manual operations in order to shorten release cycles and reduce release errors. Automated operations means ensuring the efficient and stable running of software systems with minimal manual intervention, using third-party configuration management tools or scripts. Faced with application systems whose business logic and scale keep growing, automated operations work has evolved from maintaining hundreds of machines with simple scripts to managing much larger clusters efficiently through third-party configuration management tools. In the automation process, the use of configuration management tools is a key link. Following the infrastructure-as-code approach, a configuration management tool uses a domain-specific language to describe how the software system of the target environment is configured, forming a configuration artifact for a specific piece of software; executing the artifact achieves automatic deployment and configuration. A configuration artifact is therefore a reusable executable script that installs, configures, and manages a specific software system. Using configuration management tools has become the mainstream trend in operations management. Users who write configuration artifacts for automated management often upload them to a community for sharing, and other users can download and use them without writing configuration management code again. At present, however, large numbers of artifacts are stored in scattered locations without an effective classification and management mechanism, which makes it difficult to select and use a suitable artifact correctly.
The official communities of current mainstream configuration management tools provide only a simple resource list and keyword-matching search, so finding an artifact that meets a user's needs requires considerable time spent browsing and refining the search results. To improve retrieval efficiency, some communities have adopted tag systems. However, tags are created freely by artifact developers, so they lack standardization and depend on each developer's technical knowledge and tagging habits; an effective artifact classification system that gives users efficient retrieval is therefore still missing. In addition, most artifacts in a community are contributed by developers, and their code quality depends on the developers' technical skill. Community artifacts may therefore contain latent errors that lead to misconfiguration of the target system. Before a software system is placed under automated operations management, the correctness of its configuration artifacts must be ensured to avoid system configuration errors. Yet configuration artifacts are written in domain-specific languages that differ from mainstream programming languages, and research on code errors in configuration artifacts remains insufficient.
Using configuration management tools to automate the deployment and operation of large-scale systems lowers the technical threshold of operations work and improves operational efficiency in cloud environments. The open questions are how to use and manage software configuration artifacts effectively and how to improve the quality of artifact code. Configuration management tool communities contain a large number of artifacts, but these are scattered across multiple repositories, and retrieval efficiency is poor. Because developing an artifact has a high technical threshold, it is more convenient for users to search the community repositories and reuse existing artifacts. A user downloads the artifact that matches the deployment and configuration requirements from a repository and uses it to deploy the environment, which removes the artifact development step. How to describe the various artifacts uniformly, process large volumes of data automatically, and improve retrieval efficiency is the first problem to be solved. The quality of artifact code in existing repositories depends on the developers' technical skill, and errors in a configuration artifact will cause the target system to behave abnormally. Because configuration management tools have a short development history, mechanisms for checking the quality of artifact code are still immature; the code in community repositories relies mostly on the developers' own skill and on error reports from users, so its correctness cannot be guaranteed. If a user reuses and executes a faulty configuration artifact, the system will malfunction. Configuration errors have become one of the main causes of system failures. Automatically analyzing the error patterns present in configuration code and detecting errors is therefore of great significance for avoiding system anomalies. At present, automated operations with configuration management tools as a key means is becoming the main trend in operations management. When selecting a suitable artifact, however, retrieval efficiency is still low and code quality cannot be guaranteed, which severely limits the role that configuration management tools can play in automated operations.
Summary of the invention
The invention proposes an efficient method for classifying, retrieving, and locating software configuration artifacts and for detecting code errors, and designs and implements a unified management system for configuration artifacts. This has important research value and practical significance for using configuration artifacts more efficiently and for advancing automated operations.
The technical solution of the present invention is a configuration error detection method for software upgrades, characterized by the following steps:
Step 1: analyze the code structure of Puppet artifacts and design a code change description model that describes code in terms of features such as type, position, content, and change operation.
Configuration artifacts are mostly developed in declarative domain-specific languages, so program analysis tools for conventional programming languages cannot be applied to artifact code. Building on an in-depth understanding of the artifact language's features, the invention forms a code change description model from code changes and basic change types, which quantifies artifact code that exists only as text. In a Puppet artifact, manifests is the main directory; it holds the artifact's main executable configuration, including the startup file and the function class files. The files directory stores static files that must be loaded during component deployment. The libs directory contains the Ruby plug-ins needed to execute the artifact. The templates directory stores configuration file templates, from which the software component's configuration files are generated dynamically. The tests and spec directories hold the artifact's usage examples and test files, respectively. The Puppet code that the invention focuses on is the set of files with the "pp" suffix in the main directory manifests. Files of this type use a Ruby-like syntax and define the files, operations, users, software packages, services, and so on involved in deployment and configuration as resources. Puppet currently supports 48 resource types. Each resource has general attributes such as a name as well as its own specific attributes, and a series of resources arranged in a certain order forms a class or a definition. As in conventional programming languages, variables in the Puppet language include strings, numbers, and arrays, and support operations such as assignment, logical tests, and comparison; dependencies and compositions can also be defined. A code change is the difference in the content of a code block of a single file before and after one commit, where a code block is a code fragment within a single resource or logical process. A single file in one commit can therefore contain several code changes, and a code change can involve the modification of one or more lines of code. For example, changing the user attribute of a file resource to the owner attribute indicates that the original error was a confusion between the user and owner attributes of the file resource. Commits are not made only to repair code errors; they also add new functions, remove redundant resources, refactor code, and so on. Adding a new function or removing a redundant resource, however, usually adds or deletes a whole code block, and refactoring usually involves modifications in many places over many lines. Based on these characteristics of error repairs and the other operations, the invention keeps only changes inside a code block whose number of modified lines is below a certain threshold. A code change is in essence text in a certain format; to make it easier to quantify features and perform clustering, the invention defines basic change types.
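As a minimal sketch of how the files of interest could be picked out of a commit, the snippet below keeps only paths that end in ".pp" and sit under a manifests directory. The function name and the sample paths are hypothetical illustrations and are not taken from the patent.

```python
from pathlib import PurePosixPath


def is_manifest_file(path: str) -> bool:
    """Return True for '.pp' files located under a 'manifests' directory."""
    p = PurePosixPath(path)
    # Only directory components count, so a file merely named
    # 'manifests.pp' in another directory is not selected.
    return p.suffix == ".pp" and "manifests" in p.parts[:-1]


# Hypothetical list of files touched by one commit.
changed_files = [
    "manifests/init.pp",
    "manifests/server/config.pp",
    "templates/my.cnf.erb",
    "spec/classes/init_spec.rb",
    "files/manifests.pp",
]

selected = [f for f in changed_files if is_manifest_file(f)]
print(selected)  # ['manifests/init.pp', 'manifests/server/config.pp']
```

Filtering on the path alone is a simplification; a real crawler would also have to handle renamed files and nested module layouts.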
Step 2: use a web crawler to obtain a large number of historical commit records from the projects' code version control tools, then use the code change description model to extract and quantify change features.
A basic change type describes the features of the differing code within a code change from four aspects: type, position, content, and operation. Type is the kind of syntax block in the Puppet code in which the difference occurs, including method, resource, conditional statement, selection (case) statement, relationship, variable definition, and selector. Position is the specific location of the differing code within the code block, such as the method name of a method block, or the test statement and execution block of a conditional statement. Content is the specific identifier of the differing code, such as the name of the attribute that changed in a resource or a parameter of a method. Operation is the kind of change, which falls into three classes: add, delete, and modify.
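The four aspects can be pictured as a small record type. The sketch below is one possible encoding, not the patent's implementation: the class name, field names, and the underscore-joined label are assumptions, chosen so that the label resembles identifiers such as resource_file_mode_modify that appear later in the embodiment. In particular, treating "file" as the position field is a guess.

```python
from dataclasses import dataclass

# The three operation classes named in the text.
OPERATIONS = {"add", "delete", "modify"}


@dataclass(frozen=True)
class BasicChangeType:
    block_type: str  # kind of syntax block, e.g. "resource" or "method"
    position: str    # where in the block the difference sits (assumed: "file")
    content: str     # identifier of the differing code, e.g. attribute "mode"
    operation: str   # one of OPERATIONS

    def __post_init__(self):
        if self.operation not in OPERATIONS:
            raise ValueError(f"unknown operation: {self.operation!r}")

    def label(self) -> str:
        """Join the four aspects into one feature name (hypothetical scheme)."""
        return "_".join(
            (self.block_type, self.position, self.content, self.operation)
        )


change = BasicChangeType("resource", "file", "mode", "modify")
print(change.label())  # resource_file_mode_modify
```

Making the record frozen lets instances be used as dictionary keys or set members, which is convenient when counting how often each basic change type occurs.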
Because directly comparing code text makes it hard to identify the overall features of a code change, the invention works on parsed abstract syntax trees (ASTs) instead. It uses puppet-parser, an AST parsing tool for the Puppet language, and on that basis developed an AST comparison tool, ASTComparer. The tool first obtains the ASTs of the code before and after a commit. It then traverses the two trees simultaneously and compares the nodes at the same positions, determining whether a node was added or deleted or its content modified, and marks the nodes that differ. Finally, it extracts the features of all marked nodes and matches them to the corresponding basic change types. Because the code changed by an error repair makes up only a small part of a file, only the ASTs of the changed part of the code need to be compared, which avoids unnecessary work.
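The simultaneous traversal can be sketched as follows. This is not ASTComparer itself and does not use the output format of puppet-parser; the nested-dictionary node layout, the helper node(), and the path strings are all assumptions made for illustration.

```python
def node(kind, value=None, *children):
    """Build a toy AST node (hypothetical layout, not puppet-parser output)."""
    return {"kind": kind, "value": value, "children": list(children)}


def compare(old, new, path="root"):
    """Walk two trees in lockstep and list (path, operation) for differing nodes."""
    if old is None:
        return [(path, "add")]      # node exists only after the commit
    if new is None:
        return [(path, "delete")]   # node exists only before the commit
    diffs = []
    if old["kind"] != new["kind"] or old["value"] != new["value"]:
        diffs.append((path, "modify"))
    old_kids, new_kids = old["children"], new["children"]
    for i in range(max(len(old_kids), len(new_kids))):
        o = old_kids[i] if i < len(old_kids) else None
        n = new_kids[i] if i < len(new_kids) else None
        kind = (n or o)["kind"]
        diffs.extend(compare(o, n, f"{path}/{kind}[{i}]"))
    return diffs


# Before: file resource using 'user'; after: the attribute is renamed to 'owner'.
before = node("resource", "file",
              node("attribute", "user", node("string", "root")))
after = node("resource", "file",
             node("attribute", "owner", node("string", "root")))

print(compare(before, after))  # [('root/attribute[0]', 'modify')]
```

Comparing children by index is the simplest reading of "nodes at the same positions"; it will over-report when a node is inserted in the middle of a child list, which a production tool would need to handle with some form of alignment.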
Step 3: group similar code changes using a density-based hierarchical clustering method, then analyze and summarize the code changes in each cluster to identify the error causes that led to the code adjustments, yielding common error patterns.
Clustering code changes raises two questions: which clustering algorithm to choose and how to extract features. The invention uses the number of occurrences of each basic change type in a code change as its feature vector, which yields a feature matrix over all code changes. Each column of the matrix represents a basic change type, and each row represents a code change. The choice of distance function also has a major effect on the clustering result. When users commit code, the commit may contain other code operations besides the error repair, which introduces noise; in addition, different basic change types carry different weight in how strongly they characterize an individual code change. Common distance measures such as Euclidean distance and cosine similarity therefore perform poorly. The invention uses Manhattan distance as the measure of distance between sample points, because it captures the overall difference between the basic change types of two code changes.
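To make the feature matrix and the distance concrete, the sketch below counts basic change types per code change and computes the Manhattan distance between rows. The function names and the three sample code changes are hypothetical; the change type labels reuse identifiers that appear in the embodiment.

```python
from collections import Counter


def feature_matrix(changes):
    """Rows are code changes, columns are basic change types (counts)."""
    columns = sorted({label for change in changes for label in change})
    rows = []
    for change in changes:
        counts = Counter(change)
        rows.append([counts[label] for label in columns])
    return columns, rows


def manhattan(a, b):
    """Sum of absolute per-column differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Each code change is given as the list of basic change types it contains.
changes = [
    ["resource_file_path_add", "resource_file_name_delete"],
    ["resource_file_path_add", "resource_file_name_delete"],
    ["resource_file_mode_modify"],
]

columns, rows = feature_matrix(changes)
print(columns)
print(rows)                         # [[0, 1, 1], [0, 1, 1], [1, 0, 0]]
print(manhattan(rows[0], rows[1]))  # 0
print(manhattan(rows[0], rows[2]))  # 3
```

With count vectors, the Manhattan distance equals the number of basic change types that would have to be added or removed to turn one code change into the other, which matches the "overall difference" the text describes.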
Description of the drawings
Fig. 1 is the flow of the configuration error detection method.
Detailed description of the embodiments
The present invention is described in detail below with reference to a specific embodiment and the drawing. Fig. 1 shows the flow of the method in the embodiment:
The invention uses a crawler system to obtain the 14 most active, representative Puppet artifact projects from GitHub code repositories for the experiment. The 14 projects total 15,000 commits; Table 8 lists the projects that take part in the experiment. Each of these projects has more than 500 commits, and together they have been downloaded more than 14,000,000 times, so they are sufficiently representative. For each commit, files are first filtered by type, so that only the artifact manifests remain and all other non-Puppet files are removed. To exclude code changes that are not error repairs and to avoid unnecessary work, the invention then keeps only the changes inside a code block that modify no more than 10 lines. After these two steps, the invention obtains 1,980 valid code changes.
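The size filter can be sketched in a few lines. The record layout and the rule that the modified line count is the sum of added and deleted lines are assumptions made for illustration; the patent states only the limit of 10 lines.

```python
MAX_MODIFIED_LINES = 10  # limit stated in the embodiment


def modified_lines(change):
    """Assumption: a change's size is its added plus deleted line count."""
    return change["added"] + change["deleted"]


def keep_small_changes(changes, limit=MAX_MODIFIED_LINES):
    """Keep changes inside a code block that modify at most `limit` lines."""
    return [c for c in changes if modified_lines(c) <= limit]


# Hypothetical code changes extracted from manifest files.
changes = [
    {"file": "manifests/init.pp", "added": 1, "deleted": 1},     # attribute rename
    {"file": "manifests/config.pp", "added": 40, "deleted": 0},  # new function
    {"file": "manifests/params.pp", "added": 3, "deleted": 25},  # refactoring
    {"file": "manifests/service.pp", "added": 5, "deleted": 5},  # exactly 10 lines
]

kept = keep_small_changes(changes)
print([c["file"] for c in kept])
# ['manifests/init.pp', 'manifests/service.pp']
```

The threshold is a heuristic: it removes most feature additions and refactorings, but a small refactoring can still slip through, which is one source of the noise that the clustering step has to tolerate.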
The correspondence between artifact code changes and error patterns cannot be established automatically; it still has to be analyzed and identified manually by inspecting the source code and Puppet's issue list. The invention first randomly selects and labels 300 valid code changes, identifying the error pattern type of each, to serve as the evaluation standard. It then clusters all valid code changes as experimental data. Once the result clusters are obtained, it checks whether each labeled code change has the same error pattern as the other code changes in the same cluster. Finally, it computes the proportion of code changes in the result clusters that are consistent with the labeled code changes and takes this proportion as the accuracy with which the clustering method identifies error patterns. Because the invention is concerned with general error patterns that span projects, only result clusters that involve 3 or more projects are considered; for code changes from the same project within a result cluster, only one is chosen at random for manual identification, to reduce the workload. Because clustering produces multiple result clusters, the overall accuracy of all result clusters is evaluated by micro-averaging.
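Micro-averaging pools the counts of all clusters before dividing, rather than averaging per-cluster ratios. The sketch below shows this under an assumed data layout: each cluster records the set of projects it touches and, for each manually checked code change, whether its error pattern was consistent. The field names and sample data are hypothetical.

```python
MIN_PROJECTS = 3  # only clusters spanning at least 3 projects are evaluated


def micro_accuracy(clusters, min_projects=MIN_PROJECTS):
    """Pool consistent and checked counts over all qualifying clusters."""
    consistent = 0
    checked = 0
    for cluster in clusters:
        if len(cluster["projects"]) < min_projects:
            continue  # skip clusters that are not cross-project
        consistent += sum(cluster["consistent"])  # True counts as 1
        checked += len(cluster["consistent"])
    return consistent / checked if checked else 0.0


# Hypothetical evaluation data.
clusters = [
    {"projects": {"p1", "p2", "p3"}, "consistent": [True, True, False]},
    {"projects": {"p1", "p2"}, "consistent": [True]},  # excluded: 2 projects
    {"projects": {"p1", "p2", "p3", "p4"}, "consistent": [True]},
]

print(micro_accuracy(clusters))  # 0.75
```

For the sample data a macro-average would give (2/3 + 1/1) / 2, about 0.83; the micro-average of 0.75 weights every checked code change equally, so large clusters influence the result more than small ones.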
Using ASTComparer, the AST-based code change comparison tool, the invention extracted 642 kinds of basic change types, 3,100 occurrences in total, from the valid code changes. A single code change contains at most 10 and at least 1 basic change type, and 1.56 on average. On this basis, the invention ran the clustering experiment on the code changes. With the minimum number of instances per cluster set to 5, the density-based hierarchical clustering method produced 85 result clusters. Of these, 41 clusters contain labeled code changes and are interpretable, and these are treated as valid clusters. In the remaining result clusters the types of code changes are scattered and are difficult to summarize as error patterns. The main error patterns are as follows:
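The patent names a density-based hierarchical clustering method. As a simplified stand-in, the sketch below implements plain DBSCAN with Manhattan distance in pure Python; it is not hierarchical and is not the method used in the experiment. The neighborhood radius eps and the sample feature vectors are hypothetical, and min_samples = 5 only loosely mirrors the minimum cluster size of 5.

```python
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def dbscan(points, eps, min_samples):
    """Plain DBSCAN: returns one label per point, -1 meaning noise."""
    NOISE = -1
    labels = [None] * len(points)  # None means not yet visited

    def neighbors(i):
        # Includes the point itself, as in the usual definition.
        return [j for j, q in enumerate(points)
                if manhattan(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point joins the cluster
            if labels[j] is not None:
                continue               # already assigned
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
        cluster += 1
    return labels


# Hypothetical feature vectors: two groups of identical changes and one outlier.
points = [[0, 1, 1]] * 5 + [[1, 0, 0]] * 5 + [[4, 4, 4]]
labels = dbscan(points, eps=1, min_samples=5)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, -1]
```

A hierarchical variant avoids having to fix a single eps, which matters when clusters differ in density; this sketch only illustrates how density and a minimum size separate recurring change patterns from noise.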
(1) Variable reference errors: this type of error includes referencing an undefined variable, referencing a null value, and type or value errors. The Puppet language has two kinds of variables: environment information collected from the target computer (facts), and user-defined variables. The former can be referenced directly anywhere in the project code, whereas a user-defined variable must be defined before it is used. If an undefined variable is used, or if the type of a variable does not match what is expected, an error is raised and execution is interrupted. This error pattern is fixed by adding a new variable definition at the point of reference or by adding a check that the variable is defined before it is used; the corresponding basic change types are extracted from these fixes.
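A rough check for the first case, a reference to an undefined variable, could look like the sketch below. It scans Puppet-like text line by line with regular expressions rather than parsing an AST, so it ignores scoping, class parameters, and qualified names; the function name and the sample lines are hypothetical.

```python
import re

# '$name =' is an assignment unless it is really '==' or '=~'.
ASSIGN = re.compile(r"\$(\w+)\s*=(?![=~])")
# Matches both '$name' and the interpolated form '${name}'.
REFERENCE = re.compile(r"\$\{?(\w+)")


def undefined_references(lines, facts):
    """Return (line number, name) for variables used before being defined."""
    defined = set(facts)  # facts are available everywhere
    problems = []
    for number, line in enumerate(lines, start=1):
        assigned = ASSIGN.findall(line)
        for name in REFERENCE.findall(line):
            if name not in defined and name not in assigned:
                problems.append((number, name))
        defined.update(assigned)
    return problems


lines = [
    "$package_name = 'nginx'",
    "if $osfamily == 'Debian' {",
    '  $conf_dir = "/etc/${package_name}"',
    "}",
    "notice($log_dir)",
]

print(undefined_references(lines, facts={"osfamily", "operatingsystem"}))
# [(5, 'log_dir')]
```

Here $osfamily is accepted because it is listed as a fact, while $log_dir is reported because nothing defines it before line 5.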
(2) File resource attribute errors: this type covers the improper setting of several attributes of the File resource, which is the syntax block that performs file operations on the system. The error patterns at this position are: 1. The "path" and "name" attributes are confused. These two attributes represent the path and the name of the file, respectively, and the corresponding basic change types are "resource_file_path_add, resource_file_name_delete". 2. The "mode" attribute defines the file permissions incorrectly; the corresponding basic change type is "resource_file_mode_modify". The correct form of this attribute value is a four-digit string; if a number is used directly, it cannot be recognized. 3. The "ensure" attribute represents the state of the file, and it accepts only five values: "absent, present, file, directory, link". Any other value causes the file attribute setting to fail.
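The rules for "mode" and "ensure" translate directly into a small validator. The sketch below checks a dictionary of attribute values as they might be read from a File resource; the function name, the dictionary representation, and the messages are hypothetical, and the rules are taken from the text above rather than from Puppet's own validation logic.

```python
import re

# The five values named in the text for the "ensure" attribute.
ENSURE_VALUES = {"absent", "present", "file", "directory", "link"}
# The text requires a four-digit string for "mode".
MODE_PATTERN = re.compile(r"[0-9]{4}")


def check_file_resource(attrs):
    """Return a list of problems found in a File resource's attributes."""
    problems = []
    if "mode" in attrs:
        mode = attrs["mode"]
        if not (isinstance(mode, str) and MODE_PATTERN.fullmatch(mode)):
            problems.append("mode must be a four-digit string such as '0644'")
    if "ensure" in attrs and attrs["ensure"] not in ENSURE_VALUES:
        problems.append(
            "ensure must be one of: " + ", ".join(sorted(ENSURE_VALUES))
        )
    return problems


print(check_file_resource({"path": "/etc/motd", "mode": 644}))
# ["mode must be a four-digit string such as '0644'"]
print(check_file_resource({"path": "/etc/motd", "mode": "0644",
                           "ensure": "present"}))
# []
```

A check like this catches pattern 2 and pattern 3; the path versus name confusion in pattern 1 depends on what the author intended and is harder to detect from attribute values alone.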
(3) Operating system check errors: this type of error includes a missing operating system check and the confusion of "operatingsystem" with "osfamily". Puppet artifacts can run on different operating system platforms, and because the statements to execute differ between systems and system versions, the code contains checks for multiple operating systems. The code repair for this error pattern is to add or modify the operating system check logic.
(4) Function usage errors: this type covers incorrect function names and parameters for functions such as "include", "regsubst", "fail", "validate_bool", "validate_re", and "validate_string". The Puppet language can use functions by importing a function library, but the way a function is called can change for reasons such as version updates, which leads to incorrect use of function names and function parameters.

Claims (1)

1. A configuration error detection method for software upgrades, characterized by the following steps:
Step 1: analyze the code structure of Puppet artifacts and design a code change description model that describes code in terms of features such as type, position, content, and change operation;
Step 2: use a web crawler to obtain a large number of historical commit records from the projects' code version control tools, then use the code change description model to extract and quantify change features;
Step 3: group similar code changes using a density-based hierarchical clustering method, then analyze and summarize the code changes in each cluster to identify the error causes that led to the code adjustments, yielding common error patterns.
CN201711432725.8A, priority date 2017-12-26, filing date 2017-12-26, A configuration error detection method for software upgrades, Pending, CN108182071A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711432725.8A (CN108182071A, en) | 2017-12-26 | 2017-12-26 | A configuration error detection method for software upgrades

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201711432725.8A (CN108182071A, en) | 2017-12-26 | 2017-12-26 | A configuration error detection method for software upgrades

Publications (1)

Publication Number | Publication Date
CN108182071A | 2018-06-19

Family

ID=62547116

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201711432725.8A (Pending; CN108182071A, en) | A configuration error detection method for software upgrades | 2017-12-26 | 2017-12-26

Country Status (1)

Country | Link
CN (1) | CN108182071A (en)

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180619
