CN107329770A - Personalized recommendation method for software security bug repair - Google Patents
Personalized recommendation method for software security bug repair
- Publication number
- CN107329770A (application CN201710554336.6A)
- Authority
- CN
- China
- Prior art keywords
- bug
- developer
- security
- software
- recommendation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/65—Updates
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Security & Cryptography (AREA)
- Stored Programmes (AREA)
Abstract
The present invention relates to a personalized recommendation method for software security bug repair. The invention preprocesses bug content and extracts feature-vector keywords; builds a new security bug library; extracts and analyzes the comments and discussions of associated developers on security bugs to find the developers related to similar bugs; matches and screens bug tags against developer tags; analyzes the screened bugs and provides the key recommendation explanations; and uses a classification algorithm from machine learning to extract modification patterns of repaired bugs from the created security bug library, for the recommended developers to choose from. The invention overcomes the defects of untimely and low-quality repair. It recommends suitable developers and repair patterns, and provides recommendation explanations, from the perspective of historical information (such as developers' historical comment information and historical repair quality) and development experience.
Description
Technical field
The invention belongs to the field of software maintenance, and more particularly relates to a personalized recommendation method for software security bug repair.
Background
Because of the complexity of software projects and the diversity of developers, no software is perfect, and every piece of software contains more or fewer bugs. These bugs generally need to be repaired promptly, especially the security bugs in a project. According to related research, security bugs are often exploited by hackers to maliciously attack software systems, steal important data, and tamper with user data, which causes huge economic losses to industry and seriously damages the security reputation of the affected enterprises. Security bugs are a serious threat to every organization and can cause serious monetary or reputational damage, so repairing them as early as possible is crucial for development teams. To ensure software information security, developers and maintainers should therefore pay closer attention to the security bugs in a system. How to repair these bugs promptly and effectively has become a shared concern of industry and academia. In current software projects, the importance assigned to repairing security bugs is usually higher than that assigned to other types of bug.
Before the present invention, many recommendation techniques had already been proposed to resolve software bugs faster and better. These techniques mainly rely on the historical data of software developers to recommend a suitable developer to fix a specific software bug. However, they do not consider developers' historical comment information. Comments carry a great deal of information. For example, when key information cannot be extracted quickly from the bug reporter, other developers' views on the bug can still be found quickly in the historical comments, so a solution can be provided quickly. Comments also give a subjective indication of whether a developer's repair quality is good and whether it satisfies most problem submitters. We therefore use the historical comment information of repaired bugs as one of the factors in recommending developers. Meanwhile, applying a suitable repair pattern to each security bug can greatly improve the efficiency of software maintenance. In practice, however, repairing a security bug often requires a specific pattern, and developers find it difficult to discover these patterns efficiently during repair. In addition, most existing developer recommendation methods are suited to non-security software bugs. Because security bugs inherently demand prompt, high-quality repair, existing recommendation techniques often fail to recommend developers suited to security bugs.
Summary of the invention
The purpose of the present invention is to overcome the above defects by developing a personalized recommendation method for software security bug repair.
The technical solution of the present invention is:
A personalized recommendation method for software security bug repair, mainly characterized by the following steps:
Step 1) Preprocess and extract feature-vector keywords: from a number of open-source software projects, select representative, large-scale projects with a complete evolution history (such as Eclipse, Mozilla, and Bugzilla) as research objects; preprocess the attribute content of security bugs (bug descriptions, commit files) using the vector space model; and use the existing TF-IDF technique to compute the most frequently occurring words in the bug descriptions, which serve as feature-vector keywords;
Step 2) Build the new security bug library: using the processed feature-vector keywords as keywords, establish a personalized, keyword-tagged security bug library;
Step 3) Extract and analyze the comments and discussions that associated developers have made on security bugs (for example, in Bugzilla), find the developers related to similar bugs, and extract the developers' information and corpus; then define and manage tags for the fields in which each developer works, building a keyword-tagged developer network or community for subsequent recommendation;
Step 4) Match and screen bug tags against developer tags while also finding commit history similar to the modification request; analyze and grade the historical experience and repair quality of the related developers; and select and recommend the developers suited to carry out the modification request;
Step 5) Analyze the bugs screened in step 4), assign weights to the key factors (the historical experience and repair quality of the related developers) and compute a score, rank the initially recommended developers, and provide the key recommendation explanations;
Step 6) Use a classification algorithm from machine learning to extract modification patterns of repaired bugs from the security bug library created in step 2), analyze them, and build a library of security bug modification patterns; match it against the modification request by similarity, analyze the feedback, and recommend the most suitable modification patterns for the recommended developers to choose from.
The advantage and effect of the invention is that it recommends suitable developers and repair patterns, and provides recommendation explanations, from the perspective of historical information (such as developers' historical comment information and historical repair quality) and development experience. The method not only recommends suitable developers more accurately, but also draws on each developer's historical tasks to recommend effective repair patterns, so that the developer can repair the bug quickly and efficiently. It thereby realizes personalized recommendation and improves the efficiency and accuracy of software maintenance. The main advantages are as follows:
(1) Current developer recommendation methods do not consider the historical comment information of repaired bugs. This information not only lets us quickly match historically related bugs and developers, but also helps us quickly identify repair patterns. Our method combines the corpus of historical comment information with historical repair information to recommend suitable developers for security bugs, so that security bugs are repaired promptly and efficiently and the recommendation is personalized.
(2) According to the attributes of each bug, suitable repair patterns are recommended to the developer so that the software bug can be repaired efficiently.
Brief description of the drawings
Fig. 1: flow diagram of the present invention.
Fig. 2: diagram of the keyword-vector generation process.
Fig. 3: flow diagram of building the software project security bug library.
Fig. 4: flow diagram of building the software project security bug developer library.
Detailed description
The technical idea of the present invention is as follows:
The method targets software security bugs. It combines the corpus of developers' comment information with historical repair information, and also considers factors such as developers' historical development experience and bug repair quality, to recommend a network of developers suited to repairing a software security bug. It also recommends relevant modification patterns and bug-related auxiliary information to those developers, so that the security bug is repaired promptly and with high quality. Before developers are recommended, a software security bug rule base is created: the known, repaired security bugs and their developers are preprocessed and managed with tags. The resulting security bug library is then used to recommend developers and repair patterns for newly emerging bugs. Tags are used to quickly screen out the developers associated with a new security bug. The initially screened developers are then ranked according to their historical commit information, experience, comment information, and so on, finally producing a personalized recommendation list of developers and repair patterns together with recommendation explanations.
The present invention is described in detail below.
As shown in Fig. 1:
First, "extract feature-vector keywords". From the extracted keywords, separately "build the security bug library" and "build the security bug developer library". The two are combined in "tag matching and screening" to obtain the "initial recommendation result". After "ranking", the "final recommendation result" is given together with "repair pattern recommendation" and "recommendation explanation".
The specific steps are as follows:
Step 1) Extract feature-vector keywords. Before developers are recommended, the attribute content of security bugs (bug descriptions, commit files) is preprocessed using the vector space model, which represents a bug as a keyword vector. If the bug description consists of entities such as a bug ID, these entities can be used directly as keywords. If the content is text, however, a natural-language technique is needed to extract keywords; we consider using TF-IDF, a well-known technique from the field of information retrieval. Preprocessing repaired bugs improves the management of bugs and developers, and feature-vector keywords then speed up matching and improve the precision of recommendation. Fig. 2 shows how a keyword vector is generated from text: "text -> word segmentation -> entity detection -> keyword ranking -> keyword vector". For a bug d, its content is represented as a keyword vector as follows:
$$d = \{(e_1, w_1), (e_2, w_2), \ldots\}$$
where e_i is a keyword and w_i is the weight of that keyword. If the description is text, the weight of each word can be computed with the TF-IDF formula.
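The keyword-weighting step can be sketched in Python. This is a minimal sketch, not the invention's own formula: it assumes a common smoothed TF-IDF variant, tf(t, d) * (ln((1 + N) / (1 + df(t))) + 1), and the names `tokenize`, `tfidf_keyword_vector`, and `top_k` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hyphenated words stay whole)."""
    return re.findall(r"[a-z0-9]+(?:[-_][a-z0-9]+)*", text.lower())


def tfidf_keyword_vector(doc, corpus, top_k=5):
    """Return up to top_k (keyword, weight) pairs for `doc`, heaviest first.

    Assumed weighting: tf * (ln((1 + N) / (1 + df)) + 1), where N is the
    number of documents in `corpus` and df is the number of documents that
    contain the term.
    """
    token_sets = [set(tokenize(d)) for d in corpus]
    tokens = tokenize(doc)
    counts = Counter(tokens)
    n_docs = len(corpus)
    weights = {}
    for term, count in counts.items():
        tf = count / len(tokens)
        df = sum(1 for s in token_sets if term in s)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weights[term] = tf * idf
    # Sort by descending weight; break ties alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Calling `tfidf_keyword_vector(description, all_descriptions)` yields a list of `(e_i, w_i)` pairs in the same shape as the keyword vector above. Words that appear in every bug description receive the lowest weights.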
Step 2) Build the security bug library. The extracted keyword vectors are used as tags to manage and classify security bug attributes, building a new security bug attribute library. Fig. 3 shows the construction flow: from a number of open-source software projects, representative, large-scale projects with a complete evolution history (such as Eclipse, Mozilla, and Bugzilla) are selected, and their repaired security bugs are taken as research objects. The description, comment content, and related modification patterns of each security bug are analyzed, and keywords are extracted as the feature tags of that bug, forming a new security bug attribute library.
For example, for bug 1291016 in the Bugzilla project, "Heap-buffer-overflow in nsCaseTransformTextRunFactory::TransformString", the tags that can be extracted after preprocessing include "buffer overflow", "check data", "arises", "csectype-bounds", and "Web".
Step 3) Build the security bug developer library. The corpus of each developer's historical comment information is mined and analyzed, keywords related to the developer's historical activity are extracted, tags are defined and managed for these keywords, and a developer information library is created. The library has two parts. One is the developer behavior database, which stores developers' historical behavior information. The other is the developer attribute database, which stores basic attributes such as a developer's ability level and field of specialization. Fig. 4 shows the construction flow of the software project security bug developer library: developers' historical behavior information (historical comment information and historical repair information) and attribute information are preprocessed to obtain developer feature-vector keywords, and the developers tagged with these keywords form the new security bug developer library.
For example, analyzing the commenters on bug 1291016 in the Bugzilla project shows that the historical behavior tags of developer Jonathan Kew include "confirm bugs" and "Web", and the attribute tags mined from his author information include "Irish", "Web", "Java", and "Firefox". These are used as Jonathan Kew's tags, and the new developer library is built from developers tagged in this way.
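The two-part developer library (behavior data and attribute data) can be sketched as a small record type. This is an illustrative sketch only; `DeveloperRecord` and `DeveloperLibrary` are hypothetical names, and a real library would hold far more than tag sets.

```python
from dataclasses import dataclass, field


@dataclass
class DeveloperRecord:
    """Tags for one developer, split into the two parts of the library."""
    name: str
    behavior_tags: set = field(default_factory=set)   # from comment/repair history
    attribute_tags: set = field(default_factory=set)  # from profile information

    def all_tags(self):
        return self.behavior_tags | self.attribute_tags


class DeveloperLibrary:
    def __init__(self):
        self._records = {}

    def _record(self, name):
        return self._records.setdefault(name, DeveloperRecord(name))

    def add_behavior(self, name, tags):
        self._record(name).behavior_tags |= {t.strip().lower() for t in tags}

    def add_attributes(self, name, tags):
        self._record(name).attribute_tags |= {t.strip().lower() for t in tags}

    def tags_for(self, name):
        """Return every tag known for a developer (empty set if unknown)."""
        record = self._records.get(name)
        return record.all_tags() if record else set()


developers = DeveloperLibrary()
# Tags taken from the Jonathan Kew example above.
developers.add_behavior("Jonathan Kew", ["confirm bugs", "Web"])
developers.add_attributes("Jonathan Kew", ["Irish", "Web", "Java", "Firefox"])
```

Keeping behavior tags and attribute tags separate mirrors the two databases described in step 3), while `tags_for` gives the merged view needed for tag matching.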
Step 4) Match and screen by tag and give the initial recommendation result. Tag matching builds links between developers and bugs quickly and efficiently: bug tags are used to filter for the developers we need. Bug tags are matched against developer tags while commit history similar to the modification request is also located. The historical experience and repair quality of the related developers are analyzed; bugs that match by tag, or whose commit information is similar to the historical modification request, are selected; and the developers able to carry out the modification request are recommended. Bugs with the same tags can be considered similar, and the similarity between bugs is computed from the number of matching tags. Similarity between bugs can be computed with an inverted list: a bug-developer inverted list is built (for each developer, a list of the bugs he has fixed), and then, for each developer, the co-occurrence matrix entry for every pair of fixed bugs in his list is incremented by 1. This step clusters and screens the developers related to the bug to be repaired, forming the preliminary recommendation list.
For example, from the comments on bug 1291016 in the Bugzilla project we can screen out the developers who took part in the discussion of that bug, such as "Jonathan Kew", "Milan Sreckovic", "Cameron McCormack", and "Liz Henry". From the developer library created in step 3) we also screen out developers who have the tags "Buffer overflow", "Firefox", "csectype-bounds", and "confirm bugs", such as "AI Billings", "Wes Kocher", and "Daniel Veditz". Together they form the preliminary recommendation list.
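The co-occurrence counting and tag screening in step 4) can be sketched as two small functions. This is a minimal sketch under stated assumptions: the function names, the `min_shared` threshold, and the plain dictionary inputs are all hypothetical, and the commit-similarity part of the step is not covered.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_from_inverted_list(fixed_bugs_by_developer):
    """Count, for each pair of bugs, how many developers have fixed both.

    `fixed_bugs_by_developer` is the bug-developer inverted list: developer
    name -> bugs that developer has fixed. Every pair of bugs in one
    developer's list adds 1 to that pair's co-occurrence count.
    """
    counts = defaultdict(int)
    for bugs in fixed_bugs_by_developer.values():
        for a, b in combinations(sorted(set(bugs)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def screen_developers(bug_tags, developer_tags, min_shared=1):
    """Return (developer, shared tags) pairs, most shared tags first."""
    wanted = {t.strip().lower() for t in bug_tags}
    matches = []
    for developer, tags in developer_tags.items():
        shared = wanted & {t.strip().lower() for t in tags}
        if len(shared) >= min_shared:
            matches.append((developer, shared))
    matches.sort(key=lambda m: (-len(m[1]), m[0]))
    return matches
```

A higher co-occurrence count marks two bugs as more similar, and the output of `screen_developers` plays the role of the preliminary recommendation list that step 5) then ranks.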
Step 5) Rank the initially recommended developers and provide the key recommendation explanations. To help developers accept our recommendation results, we provide explanations that make the results more transparent; a developer who understands the recommendation method will naturally trust the invention more. The ranking priority is determined by a final score computed from several factors with different weights. The factors considered include the similarity between the bug to be repaired and the bugs the initially recommended developers have fixed, the developer's historical repair quality, the developer's experience and seniority, and the developer's number of comments and activity level. (The comment information of historical bugs is also mined to evaluate the developer's experience and repair quality.) The similarity between bugs can be computed with the inverted list described in step 4): a bug-developer inverted list is built (for each developer, a list of the bugs he has fixed), and for each developer the co-occurrence matrix entry for every pair of fixed bugs in his list is incremented by 1. Based on the recommendation results, the explanations we can provide are:
1. People you may know: a developer information network is built in step 3), and the developers' social network can also be seen from comment information and historical activity information. Recommending developers who have interacted with the developer before increases the developer's trust.
2. Good repair quality: from comment information we can also mine how good a developer's repair quality is. By mining developer profile information, we find that Bugzilla keeps a "Statuses changed" record for every developer, and we judge repair quality from the counts of "Bugs filed", "Comments made", "Assigned to", and "Commented". For example, comparing developer Jonathan Kew (Bugs filed: 804, Comments made: 17222, Assigned to: 1286) with Matt Wobensmith (Bugs filed: 155, Comments made: 1952, Assigned to: 10), we judge that Jonathan Kew's repair quality is higher than Matt Wobensmith's.
3. Most suitable developer: the developer ranked first after the factors in step 5) are combined. For example, for bug 1291016 in the Bugzilla project, combining tag matching degree, developer repair quality, and the similarity between that bug and the bugs the developer has fixed, we recommend Jonathan Kew as the most suitable developer.
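The weighted ranking with an attached explanation can be sketched as follows. This is a hedged sketch rather than the invention's scoring rule: it assumes every factor has already been normalized to the range 0 to 1, and the factor names, weights, and the function `rank_developers` are hypothetical.

```python
def rank_developers(candidates, weights):
    """Rank candidates by a weighted sum of factors and explain each score.

    `candidates` maps developer name -> {factor name: value in [0, 1]}.
    `weights` maps factor name -> weight. Missing factors count as 0.
    Returns (name, score, explanation) tuples, best score first.
    """
    ranked = []
    for name, factors in candidates.items():
        contributions = {f: w * factors.get(f, 0.0) for f, w in weights.items()}
        score = sum(contributions.values())
        # The explanation names the factor that contributed most to the score.
        strongest = max(contributions, key=contributions.get)
        ranked.append((name, score, f"strongest factor: {strongest}"))
    ranked.sort(key=lambda r: (-r[1], r[0]))
    return ranked


# Hypothetical weights and factor values, for illustration only.
example_weights = {"tag_match": 0.5, "repair_quality": 0.3, "experience": 0.2}
example_ranking = rank_developers(
    {
        "dev_a": {"tag_match": 1.0, "repair_quality": 0.4, "experience": 0.5},
        "dev_b": {"tag_match": 0.5, "repair_quality": 0.9, "experience": 0.9},
    },
    example_weights,
)
```

Reporting the strongest factor alongside the score is one simple way to produce explanations such as "good repair quality"; richer explanations like "people you may know" would need the social-network data from step 3).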
Step 6) Give the final recommendation result and recommend repair patterns. A classification algorithm from machine learning is used to extract modification patterns of repaired bugs from the security bug library created in step 2); these are analyzed to build a library of security bug modification patterns. The library is matched against the modification request by similarity, the feedback is analyzed, and the most suitable modification patterns are recommended for the developer to choose from. According to the attributes of each bug, suitable repair patterns are recommended so that the developer can repair the software bug efficiently. For example, for the buffer overflow type of memory safety bug, the recommended repair patterns include: add bounds checking, change buffer size, and replace API.
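The similarity-matching half of step 6) can be sketched with cosine similarity over keyword bags. This sketch does not implement the classification algorithm that extracts the patterns; it assumes a pattern library already exists, and the keyword lists attached to each pattern below are invented for illustration, as are the function names.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two keyword-count mappings."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_patterns(request_keywords, pattern_library, top_k=2):
    """Return the top_k (pattern name, similarity) pairs for a modification request."""
    request = Counter(k.lower() for k in request_keywords)
    scored = [
        (name, cosine(request, Counter(k.lower() for k in keywords)))
        for name, keywords in pattern_library.items()
    ]
    scored.sort(key=lambda s: (-s[1], s[0]))
    return scored[:top_k]


# Pattern names come from the example above; the keywords are hypothetical.
pattern_library = {
    "add bounds checking": ["index", "bounds", "overflow", "array"],
    "change buffer size": ["buffer", "size", "allocation", "overflow"],
    "replace API": ["strcpy", "unsafe", "api", "deprecated"],
}
```

Calling `recommend_patterns(keywords_of_request, pattern_library)` returns the patterns whose keywords overlap most with the request, which the recommended developer could then choose between.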
Claims (1)
1. A personalized recommendation method for software security bug repair, characterized by the following steps:
Step 1) Extract feature-vector keywords: before developers are recommended, preprocess the attribute content of security bugs using the vector space model, and use the TF-IDF technique to compute the most frequently occurring words in the bug descriptions as feature-vector keywords;
Step 2) Use the extracted keyword vectors as tags to manage and classify security bug attributes, building a personalized, keyword-tagged security bug attribute library;
Step 3) Mine and analyze the corpus of developers' historical comment information and the historical repair information, extract keywords related to developers' historical activity, define and manage tags for these keywords, and create a developer information library;
Step 4) Match and screen bug tags against developer tags while finding commit history similar to the modification request; select the bugs that match by tag or whose commit information is similar to the historical modification request; and recommend the developers able to carry out the modification request;
Step 5) Analyze the bugs screened in step 4), assign weights to the key factors (the historical experience and repair quality of the related developers) and compute a score, rank the initially recommended developers, and provide the key recommendation explanations;
Step 6) Use a classification algorithm from machine learning to extract modification patterns of repaired bugs from the security bug library created in step 2), analyze them, and build a library of security bug modification patterns; match it against the modification request by similarity, analyze the feedback, and recommend the most suitable modification patterns for the recommended developers to choose from.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710554336.6A CN107329770A (en) | 2017-07-04 | 2017-07-04 | Personalized recommendation method for software security bug repair |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710554336.6A CN107329770A (en) | 2017-07-04 | 2017-07-04 | Personalized recommendation method for software security bug repair |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107329770A (en) | 2017-11-07 |
Family
ID=60197212
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710554336.6A Pending CN107329770A (en) | Personalized recommendation method for software security bug repair | 2017-07-04 | 2017-07-04 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107329770A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109408100A (en) * | 2018-09-08 | 2019-03-01 | 扬州大学 | A kind of software defect information fusion method based on multi-source data |
CN110597490A (en) * | 2019-08-26 | 2019-12-20 | 珠海格力电器股份有限公司 | Software development demand distribution method and device |
WO2019242108A1 (en) * | 2018-06-20 | 2019-12-26 | 扬州大学 | Software-bug repair template extraction method based on cluster analysis |
CN110858369A (en) * | 2018-08-24 | 2020-03-03 | 国信优易数据有限公司 | Data value evaluation system and method and electronic equipment |
US11321638B2 (en) | 2020-03-16 | 2022-05-03 | Kyndryl, Inc. | Interoperable smart AI enabled evaluation of models |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110219360A1 (en) * | 2010-03-05 | 2011-09-08 | Microsoft Corporation | Software debugging recommendations |
CN102262663A (en) * | 2011-07-25 | 2011-11-30 | 中国科学院软件研究所 | Method for repairing software defect reports |
CN103246603A (en) * | 2013-03-21 | 2013-08-14 | 中国科学院软件研究所 | Automatic distribution method for software bug reports of bug tracking system |
CN105426514A (en) * | 2015-11-30 | 2016-03-23 | 扬州大学 | Personalized mobile APP recommendation method |
CN105446734A (en) * | 2015-10-14 | 2016-03-30 | 扬州大学 | Software development history-based developer network relation construction method |
CN106126736A (en) * | 2016-06-30 | 2016-11-16 | 扬州大学 | Software developer's personalized recommendation method that software-oriented safety bug repairs |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Arpteg et al. | Software engineering challenges of deep learning | |
CN107329770A (en) | The personalized recommendation method repaired for software security BUG | |
CN109389143A (en) | A kind of Data Analysis Services system and method for automatic modeling | |
CN106776538A (en) | The information extracting method of enterprise's noncanonical format document | |
CN110968695A (en) | Intelligent labeling method, device and platform based on active learning of weak supervision technology | |
US12001951B2 (en) | Automated contextual processing of unstructured data | |
CN110287329A (en) | A kind of electric business classification attribute excavation method based on commodity text classification | |
CN108549723A (en) | A kind of text concept sorting technique, device and server | |
Wibisono et al. | The use of big data analytics and artificial intelligence in central banking | |
Fazayeli et al. | Towards auto-labelling issue reports for pull-based software development using text mining approach | |
US20220382795A1 (en) | Method and system for detection of misinformation | |
Radygin et al. | Application of text mining technologies in Russian language for solving the problems of primary financial monitoring | |
CN110347806A (en) | Original text discriminating method, device, equipment and computer readable storage medium | |
CN111930944B (en) | File label classification method and device | |
TWI772023B (en) | Information processing device, information processing method and information processing program | |
CN110069686A (en) | User behavior analysis method, apparatus, computer installation and storage medium | |
Badrinath et al. | An overview of global research trends in BIM from analysis of BIM publications | |
CN116737111B (en) | Safety demand analysis method based on scenerization | |
CN112001484A (en) | Safety defect report prediction method based on multitask deep learning | |
CN116795978A (en) | Complaint information processing method and device, electronic equipment and medium | |
Mariano et al. | Improve Classification of Commits Maintenance Activities with Quantitative Changes in Source Code. | |
STRUCTURING | End2end unstructured data processing, confidential data structuring & storage using image processing, nlp, machine learning, and blockchain | |
Sönmez | Classifying common vulnerabilities and exposures database using text mining and graph theoretical analysis | |
Jin et al. | Diagnosis of corporate insolvency using massive news articles for credit management | |
Salamanos et al. | HyperGraphDis: Leveraging Hypergraphs for Contextual and Social-Based Disinformation Detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171107 |