CN107329770A - Personalized recommendation method for repairing software security bugs - Google Patents

Personalized recommendation method for repairing software security bugs

Info

Publication number
CN107329770A
Authority
CN
China
Prior art keywords
bug
developer
security
software
recommendation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710554336.6A
Other languages
Chinese (zh)
Inventor
孙小兵
张诗渊
李斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou University
Original Assignee
Yangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangzhou University
Priority to CN201710554336.6A
Publication of CN107329770A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Stored Programmes (AREA)

Abstract

The present invention relates to a personalized recommendation method for repairing software security bugs. The method preprocesses bug data and extracts feature-vector keywords; builds a new security bug library; extracts and analyzes the comments and discussions of associated developers on security bugs to find the developers related to similar bugs; matches and screens bug tags against developer tags; analyzes the screened bugs and provides key recommendation explanations; and uses a machine-learning classification algorithm to extract modification patterns of repaired bugs from the created security bug library, from which the recommended developers can choose. The invention overcomes the defects of untimely and low-quality repair. It recommends suitable developers and repair patterns, and provides recommendation explanations, based on historical information (such as developers' historical comment information and historical repair quality) and development experience.

Description

Personalized recommendation method for repairing software security bugs
Technical field
The invention belongs to the field of software maintenance, and in particular relates to a personalized recommendation method for repairing software security bugs.
Background
Because software projects are complex and their developers are diverse, no software is perfect, and every project will contain at least some bugs. These bugs usually need to be repaired promptly, especially the security bugs in a project. According to related research, security bugs are often exploited by hackers to maliciously attack software systems, steal important data, and tamper with user data, causing huge economic losses to industry and seriously damaging the security reputation of the affected enterprises. Security bugs are a serious threat to every organization and can cause serious monetary or reputational damage, so repairing them as early as possible is critical for development teams. To ensure software information security, software developers and maintainers should therefore pay closer attention to the security bugs in a system. How to repair these security bugs promptly and effectively has become a shared concern of industry and academia. In current software projects, the importance assigned to repairing security bugs is usually higher than that assigned to other types of bugs.
Before the present invention, many recommendation techniques had been proposed to resolve software bugs faster and better. These techniques mainly use the historical data of software developers to recommend a suitable developer to fix a specific software bug. However, they do not consider the developers' historical comment information. Comments carry a large amount of information. For example, when key information cannot be quickly extracted from the bug reporter, other developers' opinions on the bug can often be found quickly in the historical comments, so that a solution can be provided sooner. Comments also give a subjective indication of whether a developer's repair quality is good and whether it satisfies most issue submitters. The historical comment information of repaired bugs is therefore used as one of the factors for recommending developers. Meanwhile, using a suitable repair pattern for each security bug can greatly improve the efficiency of software maintenance. In practice, repairing a security bug usually requires a specific pattern, but developers find it difficult to discover these patterns efficiently during the repair process. In addition, most existing developer recommendation methods are suited to non-security software bugs. Because security bugs inherently demand timely and high-quality repair, existing recommendation techniques often fail to recommend suitable developers for security bugs.
Summary of the invention
The purpose of the present invention is to overcome the above defects by developing a personalized recommendation method for repairing software security bugs.
The technical solution of the present invention is:
A personalized recommendation method for repairing software security bugs, characterized mainly by the following steps:
Step 1) Preprocess and extract feature-vector keywords: from open source software projects, select representative, large-scale software projects with a complete evolution history (such as Eclipse, Mozilla, Bugzilla) as research objects; preprocess the attribute content of security bugs (bug descriptions, commit files) using a vector space model; and use the existing TF-IDF technique to compute the most frequently occurring words in the bug descriptions as feature-vector keywords;
Step 2) Build a new security bug library: using the processed feature-vector keywords as keywords, establish a personalized security bug library with keywords;
Step 3) Extract and analyze the comments and discussions of associated developers on security bugs in, for example, Bugzilla; find the developers related to similar bugs and extract the developers' information and corpus; then define and manage tags for the fields each developer is involved in, building a developer network or community with keyword tags for subsequent recommendation;
Step 4) Match and screen bug tags against developer tags, and at the same time find commit history similar to the modification request; analyze the historical experience and repair quality of the related software developers and grade them; and select and recommend developers suited to carrying out the modification request;
Step 5) Analyze the bugs screened in step 4); assign weights to key factors such as the historical experience and repair quality of the related software developers and compute a score; rank the initially recommended developers; and provide key recommendation explanations;
Step 6) Use a machine-learning classification algorithm to extract modification patterns of repaired bugs from the security bug library created in step 2); analyze them and build a library of security bug modification patterns; match the patterns against the modification request by similarity; and, after feedback analysis, recommend the most suitable modification patterns for the recommended developers to choose from.
The advantage and effect of the present invention is that it recommends suitable developers and repair patterns, and provides recommendation explanations, based on historical information (such as developers' historical comment information and historical repair quality) and development experience. The method not only recommends suitable developers more accurately, but also, drawing on each developer's development history, recommends effective repair patterns so that the developer can repair bugs quickly and efficiently. It thereby realizes personalized recommendation and improves the efficiency and accuracy of software maintenance. The main advantages are as follows:
(1) Current developer recommendation methods do not consider the historical comment information of repaired bugs. This information not only allows historically related bugs and developers to be matched quickly, but also helps repair patterns to be identified quickly. The method combines the corpus of historical comment information with historical repair information to recommend suitable developers for security bugs, so that security bugs are repaired promptly and efficiently and personalized recommendation is realized.
(2) Repair patterns suited to the attributes of each bug are recommended to developers, so that software bugs are repaired efficiently.
Brief description of the drawings
Fig. 1: Flow diagram of the present invention.
Fig. 2: Generation process of the keyword vector.
Fig. 3: Construction flow of the software project security bug library.
Fig. 4: Construction flow of the software project security bug developer library.
Embodiments
The technical idea of the present invention is:
The invention targets software security bugs. It combines the corpus of developers' comment information with historical repair information, and also considers factors such as developers' historical development experience and bug repair quality, to recommend a network of developers suited to repairing a software security bug. It also recommends relevant modification patterns and bug-related auxiliary information to the developers, so that security bugs are repaired promptly and with high quality. Before developers are recommended, a software security bug rule library is created anew: known repaired security bugs and their developers are preprocessed with tag-based management, and the resulting security bug library is used to recommend developers and repair patterns for newly arising bugs. Tags are used to quickly screen out the developers associated with a new security bug. The screened developers are then ranked according to their commit history, experience, comment information, and so on, finally producing a personalized recommendation list of developers and repair patterns together with recommendation explanations.
The present invention is described in detail below.
As shown in Fig. 1:
First, "extract feature-vector keywords". Based on the extracted keywords, "build the security bug library" and "build the security bug developer library" are carried out separately. The two are combined for "tag matching and screening", which yields the "initial recommendation result". After "ranking", the "final recommendation result" is given together with the "repair pattern recommendation" and the "recommendation explanation".
The specific steps are as follows:
Step 1) Extract feature-vector keywords. Before software developers are recommended, the attribute content of security bugs (bug descriptions, commit files) is preprocessed using a vector space model. A bug's content attributes can be represented with the vector space model, which expresses a bug as a keyword vector. If a bug is described by entities such as a bug ID, these entities can be used directly as keywords. If the content is in text form, natural language processing techniques are needed to extract keywords; here the well-known TF-IDF technique from the field of information retrieval is used. Preprocessing repaired bugs improves the management of bugs and developers, and the feature-vector keywords then speed up matching and improve recommendation precision. Fig. 2 shows the process of generating a keyword vector from text: text, word segmentation, entity detection, keyword ranking, keyword vector. For a bug d_i, its content is represented as the following keyword vector:
d_i = \{(e_1, w_1), (e_2, w_2), \ldots\}
where e_i is a keyword and w_i is the weight of that keyword. If the description is in text form, the weight of each word can be calculated with the TF-IDF formula.
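The keyword-vector construction in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes one common TF-IDF variant, weight = tf(e, d) * log(N / df(e)), and the tokenizer, the function names and the three sample descriptions are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keyword_vector(doc, corpus, top_k=3):
    """Return the top_k (keyword, weight) pairs for doc, highest weight first."""
    tokens = tokenize(doc)
    term_counts = Counter(tokens)
    corpus_terms = [set(tokenize(d)) for d in corpus]
    n_docs = len(corpus)
    weights = {}
    for term, count in term_counts.items():
        tf = count / len(tokens)
        # Number of corpus documents that contain the term (at least 1 to avoid division by zero).
        df = max(sum(1 for terms in corpus_terms if term in terms), 1)
        weights[term] = tf * math.log(n_docs / df)
    # Sort by descending weight; break ties alphabetically for a stable result.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical bug descriptions, used only to exercise the function.
corpus = [
    "heap buffer overflow heap corruption",
    "crash when opening settings page",
    "buffer overflow while parsing font table",
]
vector = tfidf_keyword_vector(corpus[0], corpus)
for keyword, weight in vector:
    print(f"{keyword}: {weight:.4f}")
```

Words that appear in many descriptions ("buffer", "overflow") receive a lower weight than words specific to one description ("heap"), which is the property that makes the top-ranked words usable as tags.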
Step 2) Build the security bug library. The extracted keyword vectors are used as tags to manage and classify security bug attributes, building a new security bug attribute library. Fig. 3 shows the construction flow: from open source software projects, representative, large-scale software projects with a complete evolution history (such as Eclipse, Mozilla, Bugzilla) are selected, and their repaired security bugs are taken as research objects. The description, comment content, and related modification patterns of each security bug are analyzed, and keywords are extracted as the feature tags of each security bug, forming a new security bug attribute library.
For example, for bug 1291016 in the Bugzilla project, "Heap-buffer-overflow in nsCaseTransformTextRunFactory::TransformString", the tags extracted after preprocessing may be "buffer overflow", "check data", "arises", "csectype-bounds", "Web", and so on.
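One way to picture the tagged security bug library of step 2 is as a store of bugs plus an inverted index from tag to bug IDs. The sketch below is an assumption about a possible data layout, not the structure described in the patent; the class name, method names and the second bug (ID 1000001) are hypothetical.

```python
from collections import defaultdict


class SecurityBugLibrary:
    """Stores repaired security bugs and an inverted index from tag to bug IDs."""

    def __init__(self):
        self.bugs = {}                     # bug_id -> set of normalized tags
        self.tag_index = defaultdict(set)  # tag -> set of bug_ids

    def add_bug(self, bug_id, tags):
        """Register a bug under each of its tags (tags are compared case-insensitively)."""
        normalized = {t.lower() for t in tags}
        self.bugs[bug_id] = normalized
        for tag in normalized:
            self.tag_index[tag].add(bug_id)

    def find_by_tag(self, tag):
        """Return the sorted IDs of all bugs carrying the given tag."""
        return sorted(self.tag_index.get(tag.lower(), set()))


library = SecurityBugLibrary()
# Tags for bug 1291016 are taken from the example above.
library.add_bug(1291016, ["buffer overflow", "check data", "arises", "csectype-bounds", "Web"])
# Hypothetical second bug, added only to show a shared tag.
library.add_bug(1000001, ["use after free", "Web"])
print(library.find_by_tag("Web"))
```

Keeping the index keyed by tag means that a lookup for a new bug's tags touches only the bugs that share at least one tag with it.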
Step 3) Build the security bug developer library. The corpus of developers' historical comment information is mined and analyzed to extract keywords related to each developer's historical activity; the keywords are defined and managed as tags, and a developer information library is created. The developer information library has two parts. One is the developer behavior database, which stores developers' historical behavior information. The other is the developer attribute database, which stores basic attributes such as a developer's ability rating and field of specialization. Fig. 4 shows the construction flow of the software project security bug developer library: developers' historical behavior information (historical comment information and historical repair information) and attribute information are preprocessed to obtain developer feature-vector keywords, and the developers tagged with these keywords are used to build the new security bug developer library.
For example, analyzing the commenters on bug 1291016 in the Bugzilla project shows that the historical behavior tags of developer Jonathan Kew include "confirm bugs", "Web", and so on; the attribute tags mined from the author's information include "Irish", "Web", "Java", "Firefox", and so on. This information is used as developer Jonathan Kew's tags, and the tagged developers are used to create the new developer library.
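The two-part developer information library of step 3 (behavior data and attribute data) could be modeled as below. This is a sketch under assumptions: the record layout, class names and the `with_tags` query are hypothetical, and only the tags listed in the example above are used as data.

```python
from dataclasses import dataclass, field


@dataclass
class DeveloperRecord:
    name: str
    # Tags mined from historical comments and repairs (behavior database).
    behavior_tags: set = field(default_factory=set)
    # Basic attributes such as field of specialization (attribute database).
    attribute_tags: set = field(default_factory=set)

    def all_tags(self):
        """Union of behavior and attribute tags."""
        return self.behavior_tags | self.attribute_tags


class DeveloperLibrary:
    def __init__(self):
        self.records = {}  # developer name -> DeveloperRecord

    def add(self, record):
        self.records[record.name] = record

    def with_tags(self, required):
        """Names of developers who carry every tag in `required`."""
        required = set(required)
        return sorted(
            name for name, rec in self.records.items()
            if required <= rec.all_tags()
        )


developers = DeveloperLibrary()
developers.add(DeveloperRecord(
    name="Jonathan Kew",
    behavior_tags={"confirm bugs", "Web"},
    attribute_tags={"Irish", "Web", "Java", "Firefox"},
))
print(developers.with_tags(["Web", "Firefox"]))
```

Separating the two tag sets keeps slowly changing attributes apart from behavior tags that would be refreshed whenever new comments or repairs are mined.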
Step 4) Match and screen tags, and produce the initial recommendation result. Tag matching builds links between developers and bugs faster and more efficiently: the developers sought are filtered out using the bug's tags. Bug tags are matched against developer tags, and commit history similar to the modification request is found at the same time. The historical experience and repair quality of the related software developers are analyzed, bugs whose tags match or whose commit information is similar to the modification request are selected, and developers able to carry out the modification request are recommended. Bugs with the same tags can be considered similar, and the similarity between bugs is calculated from the number of matching tags. Similarity between bugs can be computed with an inverted list: a bug-developer inverted list is built (for each developer, a list of the bugs that developer has repaired), and then, for each developer, 1 is added to the co-occurrence matrix entry of every pair of bugs in that list. This step clusters and screens the developers related to repaired bugs, forming the primary recommendation list.
For example, in the Bugzilla project, the developers who took part in the discussion in the comments on bug 1291016 can be screened out, such as "Jonathan Kew", "Milan Sreckovic", "Cameron McCormack", and "Liz Henry". From the developer library created in step 3), the developers who have all of the tags "Buffer overflow", "Firefox", "csectype-bounds", and "confirm bugs", namely "AI Billings", "Wes Kocher", and "Daniel Veditz", are also screened out. Together they form the primary recommendation list.
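The inverted-list and co-occurrence computation in step 4 can be sketched as below. The sketch assumes the co-occurrence matrix is stored sparsely as a dictionary keyed by bug pairs; the function names, the `min_matches` threshold, and the developers `dev_a`, `dev_b`, `dev_c` with their bug IDs and tags are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(developer_bugs):
    """For every pair of bugs, count how many developers repaired both.

    developer_bugs is the inverted list: developer -> bugs that developer repaired.
    """
    counts = defaultdict(int)
    for bugs in developer_bugs.values():
        for a, b in combinations(sorted(set(bugs)), 2):
            counts[(a, b)] += 1  # add 1 to the matrix entry for this pair
    return dict(counts)


def tag_similarity(tags_a, tags_b):
    """Similarity measured as the number of tags two items share."""
    return len(set(tags_a) & set(tags_b))


def screen_developers(bug_tags, developer_tags, min_matches=1):
    """Developers sharing at least min_matches tags with the bug."""
    return sorted(
        name for name, tags in developer_tags.items()
        if tag_similarity(bug_tags, tags) >= min_matches
    )


# Hypothetical repair history and tags.
developer_bugs = {"dev_a": [1, 2, 3], "dev_b": [2, 3]}
matrix = cooccurrence(developer_bugs)

developer_tags = {
    "dev_a": {"buffer overflow", "firefox"},
    "dev_b": {"web"},
    "dev_c": {"buffer overflow", "csectype-bounds", "firefox"},
}
shortlist = screen_developers({"buffer overflow", "csectype-bounds"}, developer_tags, min_matches=2)
print(matrix, shortlist)
```

A pair of bugs repaired by many of the same developers ends up with a high count, so the counts can serve as a similarity signal alongside the tag overlap.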
Step 5) Rank the initially recommended developers and provide key recommendation explanations. To make developers more willing to accept the recommendation results, recommendation explanations are provided so that the results are more transparent; understanding the recommendation method naturally increases developers' trust in the invention. The ranking priority is determined by a final score computed from several factors, each with its own weight. The factors considered include the similarity between the bug to be repaired and the bugs already repaired by each initially recommended developer, the developer's historical repair quality, the developer's experience and seniority, and the developer's number of comments and activity level. (The comment information of historical bugs is also mined to evaluate each developer's experience and repair quality.) Similarity between bugs can be computed with an inverted list: a bug-developer inverted list is built (for each developer, a list of the bugs that developer has repaired), and then, for each developer, 1 is added to the co-occurrence matrix entry of every pair of bugs in that list. Based on the recommendation results, the following recommendation explanations can be given:
1. Possible acquaintance: a developer information network is built in step 3), and developers' social network information can also be seen in the comment information and historical activity information. If a recommended developer is someone the developer has communicated with before, the developer's trust in the recommendation increases.
2. Good repair quality: comment information also reveals how good a developer's bug repair quality is. Mining developer information in Bugzilla shows that each developer has a record of status changes; a developer's repair quality is judged from the counts of Bugs filed, Comments made, Assigned to, and Commented on. For example, for developers Jonathan Kew (Bugs filed: 804, Comments made: 17222, Assigned to: 1286) and Matt Wobensmith (Bugs filed: 155, Comments made: 1952, Assigned to: 10), Jonathan Kew's repair quality is judged to be higher than Matt Wobensmith's.
3. Most suitable developer: the developer ranked first after the factors in step 5) are combined. For example, for bug 1291016 in the Bugzilla project, combining factors such as tag matching degree, developer repair quality, and the similarity between the bugs the developer has repaired and this bug, Jonathan Kew is recommended as the most suitable developer.
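The weighted ranking with an attached explanation described in step 5 could look like the following. The patent does not give the weights or the scoring formula, so the weighted sum, the factor names, the weight values and the candidates `dev_a` and `dev_b` with their scores are all assumptions made for illustration.

```python
def rank_developers(candidates, weights):
    """Rank candidates by a weighted sum of factor scores.

    candidates: name -> {factor: score in [0, 1]}
    weights:    factor -> weight
    Returns a list of (name, total, explanation) tuples, best first.
    """
    ranked = []
    for name, factors in candidates.items():
        contributions = {f: w * factors.get(f, 0.0) for f, w in weights.items()}
        total = sum(contributions.values())
        # Use the largest contribution as a simple recommendation explanation.
        top_factor = max(contributions, key=contributions.get)
        ranked.append((name, total, f"strongest factor: {top_factor}"))
    # Sort by descending score; break ties by name.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical weights and normalized factor scores.
weights = {"bug_similarity": 0.4, "repair_quality": 0.3, "experience": 0.2, "activity": 0.1}
candidates = {
    "dev_a": {"bug_similarity": 0.9, "repair_quality": 0.8, "experience": 0.7, "activity": 0.6},
    "dev_b": {"bug_similarity": 0.5, "repair_quality": 0.9, "experience": 0.4, "activity": 0.9},
}
ranking = rank_developers(candidates, weights)
for name, total, explanation in ranking:
    print(f"{name}: {total:.2f} ({explanation})")
```

Reporting the factor that contributed most to each score is one simple way to make the ranking transparent; a fuller explanation could list every contribution.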
Step 6) Present the final recommendation result and recommend repair patterns. A machine-learning classification algorithm is used to extract modification patterns of repaired bugs from the security bug library created in step 2); these are analyzed to build a library of security bug modification patterns, which is matched against the modification request by similarity. After feedback analysis, the most suitable modification patterns are recommended for the developer to choose from. Repair patterns suited to the attributes of each bug are recommended so that software bugs are repaired efficiently. For example, for buffer overflow bugs among memory safety bugs, the recommended repair patterns include: add bounds checking, change buffer size, replace API, and so on.
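The similarity matching between a modification request and the pattern library in step 6 can be sketched as below. The patent names a machine-learning classification algorithm without specifying it; this sketch covers only the similarity-matching part and substitutes plain cosine similarity over keyword-weight vectors. The pattern names come from the example above, while the vectors attached to them and the request vector are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_patterns(request, pattern_library, top_k=2):
    """Return the top_k (pattern, similarity) pairs most similar to the request."""
    scored = [(name, cosine(request, vec)) for name, vec in pattern_library.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical keyword-weight vectors for each repair pattern.
pattern_library = {
    "add bounds checking": {"buffer overflow": 1.0, "csectype-bounds": 1.0},
    "change buffer size": {"buffer overflow": 1.0, "heap": 0.5},
    "replace API": {"deprecated call": 1.0},
}
# Hypothetical keyword vector of a new modification request.
request = {"buffer overflow": 1.0, "csectype-bounds": 0.8}
recommended = recommend_patterns(request, pattern_library)
for name, score in recommended:
    print(f"{name}: {score:.3f}")
```

Returning the top few patterns rather than a single one matches the text's intent that the developer makes the final selection.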

Claims (1)

1. A personalized recommendation method for repairing software security bugs, characterized by the following steps:
Step 1) Extract feature-vector keywords: before software developers are recommended, preprocess the attribute content of security bugs using a vector space model, and use the TF-IDF technique to compute the most frequently occurring words in the bug descriptions as feature-vector keywords;
Step 2) Using the extracted keyword vectors as tags, manage and classify security bug attributes by tag, and build a personalized security bug attribute library with keywords;
Step 3) Mine and analyze the corpus of developers' historical comment information and the historical repair information, extract keywords related to developers' historical activity, define and manage the keywords as tags, and create a developer information library;
Step 4) Match and screen bug tags against developer tags, and at the same time find commit history similar to the modification request; select the bugs whose tags match or whose commit information is similar to the modification request, and recommend developers able to carry out the modification request;
Step 5) Analyze the bugs screened in step 4); assign weights to the key factors, namely the historical experience and repair quality of the related software developers, and compute a score; rank the initially recommended developers; and provide key recommendation explanations;
Step 6) Use a machine-learning classification algorithm to extract modification patterns of repaired bugs from the security bug library created in step 2); analyze them and build a library of security bug modification patterns; match the patterns against the modification request by similarity; and, after feedback analysis, recommend the most suitable modification patterns for the recommended developers to choose from.
CN201710554336.6A 2017-07-04 2017-07-04 Personalized recommendation method for repairing software security bugs Pending CN107329770A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710554336.6A CN107329770A (en) 2017-07-04 2017-07-04 Personalized recommendation method for repairing software security bugs

Publications (1)

Publication Number Publication Date
CN107329770A (en) 2017-11-07

Family

ID=60197212

Country Status (1)

Country Link
CN (1) CN107329770A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408100A (en) * 2018-09-08 2019-03-01 扬州大学 A kind of software defect information fusion method based on multi-source data
CN110597490A (en) * 2019-08-26 2019-12-20 珠海格力电器股份有限公司 Software development demand distribution method and device
WO2019242108A1 (en) * 2018-06-20 2019-12-26 扬州大学 Software-bug repair template extraction method based on cluster analysis
CN110858369A (en) * 2018-08-24 2020-03-03 国信优易数据有限公司 Data value evaluation system and method and electronic equipment
US11321638B2 (en) 2020-03-16 2022-05-03 Kyndryl, Inc. Interoperable smart AI enabled evaluation of models

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110219360A1 (en) * 2010-03-05 2011-09-08 Microsoft Corporation Software debugging recommendations
CN102262663A (en) * 2011-07-25 2011-11-30 中国科学院软件研究所 Method for repairing software defect reports
CN103246603A (en) * 2013-03-21 2013-08-14 中国科学院软件研究所 Automatic distribution method for software bug reports of bug tracking system
CN105426514A (en) * 2015-11-30 2016-03-23 扬州大学 Personalized mobile APP recommendation method
CN105446734A (en) * 2015-10-14 2016-03-30 扬州大学 Software development history-based developer network relation construction method
CN106126736A (en) * 2016-06-30 2016-11-16 扬州大学 Software developer's personalized recommendation method that software-oriented safety bug repairs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171107