CN108958710A - Method for extracting covariance correlation of project progress based on emotional factors - Google Patents


Publication number
CN108958710A
Authority
CN
China
Prior art keywords
comment
comment data
day
request
correlation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810728956.1A
Other languages
Chinese (zh)
Other versions
CN108958710B (en)
Inventor
杨波
卫新洁
刘超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China University of Technology
Original Assignee
North China University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China University of Technology
Priority to CN201810728956.1A
Publication of CN108958710A
Application granted
Publication of CN108958710B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/20 Software design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/10 Requirements analysis; Specification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management

Abstract

The invention discloses a method for extracting the covariance correlation of project progress based on emotional factors. The method comprises extracting the comment data of a project in GitHub: the GitHub API is used to build a URL by concatenation and send an HTTP request, the content of the HTTP response is returned, the JSON-formatted data in the text are parsed, and the results are displayed or stored locally.

Description

Method for extracting covariance correlation of project progress based on emotional factors
Technical field
The present invention relates to methods concerning influences on software development progress, and more particularly to a method for extracting the covariance correlation between emotional factors and project progress.
Background art
Open-source software is a form of software in which, under one of the open-source licenses listed by the OSI (Open Source Initiative), users may freely use and modify the software source code within the scope permitted by the license, and may combine the source code with other software code. GitHub (a hosting platform for open-source and private software projects) is a social development platform that hosts numerous open-source software projects.
Many factors influence the development process on the GitHub platform, including the ability level of the developers, the number of developers, the speed at which problems in the open-source software are resolved, the incentives for user participation, and emotional factors. Among these, the influence of emotional factors on the development of open-source software on GitHub has attracted the interest of researchers.
For the classification of emotional factors, reference is made to "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank", published in October 2013, which divides emotional factors into five classes: "the Recursive Neural Tensor Network accurately predicting 5 sentiment classes, very negative to very positive (– –, –, 0, +, + +), at every node of a parse tree and capturing the negation and its scope in this sentence." That is: "very negative, negative, neutral, positive, very positive".
Different emotional factors play a role in the software development process, but research on the influence of emotional factors in GitHub open-source software is lacking, especially with regard to their correlation with project progress. Moreover, there is no dedicated specification for extracting GitHub issue and comment data, which makes the comment data in GitHub difficult to analyze.
Summary of the invention
The present invention proposes a method for extracting the covariance correlation between emotional factors and project progress. The problems the method addresses are: how to extract the comment data of a project in GitHub; how to perform sentiment analysis on that comment data to obtain sentiment analysis data; and how to use correlation analysis to obtain the correlation between emotional factors and the speed of project progress.
The method of the invention for extracting the covariance correlation of project progress based on emotional factors is characterized by comprising the following processing steps:
Step 1: extract the merge-request comment data set PR from the Pull requests module of a GitHub project;
The merge-request comment data set is denoted PR = {r_1, r_2, …, r_a, …, r_A};
Step 2: obtain the comment content of the Pull requests module from the merge-request comment data set PR, and match a sentiment analysis value to each comment content;
Any comment content, after being processed by the sentiment matching process SST, yields one sentiment analysis value SE, so that:
The comment content belonging to r_1 is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_1;
The comment content belonging to r_2 is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_2;
The comment content belonging to r_a is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_a;
The comment content belonging to r_A is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_A;
The sentiment analysis value SE is the numerical value output by the sentiment matching process SST when the comment content of any comment data item is input to it;
Step 3: obtain the comment count of the Pull requests module from the merge-request comment data set PR;
Because every merge-request comment data item carries a comment time and a comment content, the merge-request comment data set PR = {r_1, r_2, …, r_a, …, r_A} is partitioned according to comment time, yielding respectively:
Day-level merge-request comment data, in units of days, with B ∈ A;
Week-level merge-request comment data, in units of weeks, with C ∈ A; or:
Month-level merge-request comment data, in units of months, with D ∈ A;
The number of day-level merge-request comment data items partitioned out of PR = {r_1, r_2, …, r_a, …, r_A} is denoted the day comment count num_day;
The number of week-level merge-request comment data items partitioned out of PR = {r_1, r_2, …, r_a, …, r_A} is denoted the week comment count num_week;
The number of month-level merge-request comment data items partitioned out of PR = {r_1, r_2, …, r_a, …, r_A} is denoted the month comment count num_month;
Step 4: obtain the correlation degree of the time-partitioned comment data according to the sentiment correlation degree IFC;
In the present invention, the sentiment correlation degree IFC uses the following symbols: n denotes the number of summed elements, i denotes the summation index, num_i denotes the comment count at index i, the mean of num denotes the average comment count, SE_i denotes the sentiment analysis value at index i, and the mean of SE denotes the average sentiment analysis value. IFC is calculated by the product-moment method: it is based on the deviations of the two factors from their respective means, and the two deviations are multiplied to reflect the degree of correlation between the two factors. The larger the value of IFC, the higher the degree of correlation between the two factors; conversely, the smaller the value, the lower the degree of correlation. IFC > 0 indicates a positive correlation between the two factors, IFC < 0 indicates a negative correlation, and IFC = 0 indicates that there is no linear correlation between the two factors;
If the sentiment correlation degree IFC is computed over the day-level merge-request comment data, the resulting comment data correlation degree for days is denoted IFC_day;
If the sentiment correlation degree IFC is computed over the week-level merge-request comment data, the resulting comment data correlation degree for weeks is denoted IFC_week;
If the sentiment correlation degree IFC is computed over the month-level merge-request comment data, the resulting comment data correlation degree for months is denoted IFC_month.
The advantage of the method of the invention for extracting the covariance correlation of project progress based on emotional factors is as follows: starting from an analysis of the data generated during open-source software development, it analyzes the emotional factors that influence the GitHub open-source software development process, proposes influencing factors such as the growth rate of merge requests, and analyzes the correlation between the emotional factors and the proposed quantities.
Detailed description
The present invention is described in further detail below with reference to embodiments.
In the present invention, the article "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank", published in October 2013, is cited. The five classes of emotional factors in that article, "very negative, negative, neutral, positive, very positive", are quantified as "0, 1, 2, 3 and 4" respectively and correspond to the sentiment analysis value of a comment content in the Pull requests module, denoted SE; thus SE = 0, SE = 1, SE = 2, SE = 3 or SE = 4. Each comment content corresponds to one sentiment analysis value SE. In the present invention, obtaining the sentiment analysis value SE by means of the method of that article is called the sentiment matching process SST.
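The quantization of the five sentiment classes into SE values can be illustrated with a minimal Python sketch. The names `SENTIMENT_VALUES`, `match_sentiment` and `toy_classifier` are hypothetical and do not come from the patent; in particular, `toy_classifier` is only a keyword-based stand-in so the example runs, and is not the recursive neural model of the cited article.

```python
from typing import Callable, Dict, List

# Five sentiment classes mapped to the SE values 0..4 described in the text.
SENTIMENT_VALUES: Dict[str, int] = {
    "very negative": 0,
    "negative": 1,
    "neutral": 2,
    "positive": 3,
    "very positive": 4,
}


def match_sentiment(contents: List[str], classify: Callable[[str], str]) -> List[int]:
    """Return one sentiment analysis value SE per comment content.

    `classify` is any function that maps a comment to one of the five
    class labels; which classifier is plugged in is left open here.
    """
    return [SENTIMENT_VALUES[classify(text)] for text in contents]


def toy_classifier(text: str) -> str:
    """Hypothetical keyword-based stand-in for a real sentiment model."""
    lowered = text.lower()
    if "great" in lowered:
        return "very positive"
    if "broken" in lowered:
        return "very negative"
    return "neutral"
```

Passing the classifier in as a parameter keeps the 0 to 4 mapping separate from the choice of sentiment model, so a real model can replace the stand-in without changing the mapping.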
In the present invention, the method is applied to the Pull requests module of the GitHub platform to extract and analyze comment data. The comment data include the comment content and the comment count (that is, the number of comment contents). To extract the comments, the source code of the project of interest on GitHub is cloned locally to create a local code repository, and the comment data of the Pull requests module are then extracted for the code in that repository.
The technical problem solved by the invention: for the comment data of the Pull requests module of the GitHub platform, extract the correlation between comment content, comment count and emotional factors, in order to study the influence of emotional factors on project progress.
In the present invention, the merge-request comment data set PR arises as follows: when using the GitHub platform, users feed back problems, defects and the like to the developers as comment content; the developers then screen and organize the users' comment data and merge the comments they consider suitable for improving the project. In set form, the merge-request comment data set is denoted PR = {r_1, r_2, …, r_a, …, r_A}, where r_1 denotes the first merge-request comment data item, r_2 the second, r_a any one item, and r_A the last item; A denotes the total number of merge-request comment data items. For convenience of explanation, r_a also denotes the a-th merge-request comment data item, and a denotes its identification number. Every merge-request comment data item r_a contains a time and a comment content, expressed in set form as r_a = {time, content}, where content is the comment content. The time comprises day, week and month, so time = {day, week, month}. The time is thus analyzed along three dimensions: for each time period, the correlation between the comment data, the corresponding merge-request speed and the emotional factors is analyzed.
The processing steps of the method of the invention for extracting the covariance correlation of project progress based on emotional factors are:
Step 1: extract the merge-request comment data set PR from the Pull requests module of a GitHub project;
In the present invention, the merge-request comment data set is denoted in set form as PR = {r_1, r_2, …, r_a, …, r_A};
r_1 denotes the first merge-request comment data item;
r_2 denotes the second merge-request comment data item;
r_a denotes the a-th merge-request comment data item; a denotes the identification number of a merge-request comment data item;
r_A denotes the last merge-request comment data item, and A denotes the total number of merge-request comment data items.
In the present invention, the merge-request comment data set PR arises when users of the GitHub platform feed back problems, defects and the like to the developers as comment content, and the developers then screen and organize the users' comments and merge those they consider suitable for improving the project. To extract the comments, the source code of the project of interest on GitHub is cloned locally to create a local code repository, and the comment data of the Pull requests module are then extracted for the code in that repository.
Step 2: obtain the comment content of the Pull requests module from the merge-request comment data set PR, and match a sentiment analysis value to each comment content;
In the present invention, any comment content, after being processed by the sentiment matching process SST, yields one sentiment analysis value SE, so that:
The comment content belonging to r_1 is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_1;
The comment content belonging to r_2 is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_2;
The comment content belonging to r_a is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_a;
The comment content belonging to r_A is processed by the sentiment matching process SST, which outputs the sentiment analysis value belonging to r_A.
In the present invention, the sentiment analysis value SE is the numerical value output by the sentiment matching process SST when the comment content of any comment data item is input to it.
Step 3: obtain the comment count of the Pull requests module from the merge-request comment data set PR;
Because every merge-request comment data item carries a comment time and a comment content, the merge-request comment data set PR = {r_1, r_2, …, r_a, …, r_A} is partitioned according to comment time, yielding respectively comment data in units of days (i.e. day-level merge-request comment data, B ∈ A), comment data in units of weeks (i.e. week-level merge-request comment data, C ∈ A), or comment data in units of months (i.e. month-level merge-request comment data, D ∈ A).
The number of day-level merge-request comment data items partitioned out of PR = {r_1, r_2, …, r_a, …, r_A} is denoted the day comment count num_day; for convenience of explanation, num_day is assigned the value 500, i.e. B = 500.
The number of week-level merge-request comment data items partitioned out of PR = {r_1, r_2, …, r_a, …, r_A} is denoted the week comment count num_week; for convenience of explanation, num_week is assigned the value 1500, i.e. C = 1500.
The number of month-level merge-request comment data items partitioned out of PR = {r_1, r_2, …, r_a, …, r_A} is denoted the month comment count num_month; for convenience of explanation, num_month is assigned the value 5000, i.e. D = 5000.
The day-level comment data run from the first item, through the second item and any b-th item, to the last item; b denotes the identification number of a comment data item whose comment time unit is the day.
The week-level comment data run from the first item, through the second item and any c-th item, to the last item; c denotes the identification number of a comment data item whose comment time unit is the week.
The month-level comment data run from the first item, through the second item and any d-th item, to the last item; d denotes the identification number of a comment data item whose comment time unit is the month.
In the present invention, the number of comment contents (i.e. the comment count) and the comment contents themselves are extracted separately from the Pull requests module, so that they can be used to quantify emotional-factor correlation in the sentiment analysis.
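Partitioning the comment data by day, week or month and counting the items in each period can be sketched in Python as follows. The sketch assumes ISO 8601 UTC timestamps of the form `2018-01-01T08:00:00Z` and ISO week numbering; the function names and the period-key formats are hypothetical and are not prescribed by the patent.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def period_key(timestamp: str, unit: str) -> str:
    """Map a comment time to the day, ISO week or month it falls in."""
    moment = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    if unit == "day":
        return moment.strftime("%Y-%m-%d")
    if unit == "week":
        iso = moment.isocalendar()  # (ISO year, ISO week number, weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if unit == "month":
        return moment.strftime("%Y-%m")
    raise ValueError(f"unknown time unit: {unit}")


def comment_counts(timestamps: Iterable[str], unit: str) -> Dict[str, int]:
    """Count the comment data items that fall in each period of the given unit."""
    return dict(Counter(period_key(t, unit) for t in timestamps))
```

For example, two comments on 2018-01-01 and one on 2018-01-02 give a day-level count of 2 and 1, and a week-level and month-level count of 3.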
Step 4: obtain the correlation degree of the time-partitioned comment data according to the sentiment correlation degree IFC;
In the present invention, the sentiment correlation degree IFC uses the following symbols: n denotes the number of summed elements, i denotes the summation index, num_i denotes the comment count at index i, the mean of num denotes the average comment count, SE_i denotes the sentiment analysis value at index i, and the mean of SE denotes the average sentiment analysis value. IFC is calculated by the product-moment method: it is based on the deviations of the two factors from their respective means, and the two deviations are multiplied to reflect the degree of correlation between the two factors. The larger the value of IFC, the higher the degree of correlation between the two factors; conversely, the smaller the value, the lower the degree of correlation. IFC > 0 indicates a positive correlation between the two factors, IFC < 0 indicates a negative correlation, and IFC = 0 indicates that there is no linear correlation between the two factors.
If the sentiment correlation degree IFC is computed over the day-level merge-request comment data, the resulting comment data correlation degree for days is denoted IFC_day;
If the sentiment correlation degree IFC is computed over the week-level merge-request comment data, the resulting comment data correlation degree for weeks is denoted IFC_week;
If the sentiment correlation degree IFC is computed over the month-level merge-request comment data, the resulting comment data correlation degree for months is denoted IFC_month.
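The formula for IFC is not legible in this text, so the following Python sketch rests on an assumption: that the "product-moment method" described above is the Pearson product-moment correlation coefficient between the per-period comment counts num_i and the per-period sentiment values SE_i. The function name `ifc` and the decision to return 0.0 when either series has no variation are illustrative choices, not taken from the patent.

```python
import math
from typing import Sequence


def ifc(num: Sequence[float], se: Sequence[float]) -> float:
    """Pearson product-moment correlation between comment counts and sentiment values.

    Assumption: this stands in for the patent's IFC, whose exact formula
    is not recoverable here.
    """
    if len(num) != len(se) or len(num) == 0:
        raise ValueError("num and se must be non-empty and of equal length")
    n = len(num)
    mean_num = sum(num) / n
    mean_se = sum(se) / n
    # Sum of products of the two deviations from their respective means.
    cross = sum((x - mean_num) * (y - mean_se) for x, y in zip(num, se))
    spread_num = sum((x - mean_num) ** 2 for x in num)
    spread_se = sum((y - mean_se) ** 2 for y in se)
    if spread_num == 0 or spread_se == 0:
        # No variation in one factor: treated here as "no linear correlation".
        return 0.0
    return cross / math.sqrt(spread_num * spread_se)
```

With hypothetical inputs such as comment counts `[3, 7, 7]` and average sentiment values `[2.0, 1.0, 1.0]`, the result is negative, which under the reading above corresponds to IFC < 0.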
Embodiment 1
Sentiment correlation analysis of the Pull requests module comment data with the day as the comment time unit:
Suppose the Pull requests module has 3 comment data items on January 1 of year ×, denoted the first, second and third comment data items respectively.
The comment count for January 1 of year × is then 3.
Each of the 3 comment data items is processed by the sentiment matching process SST to obtain the sentiment value belonging to that item.
The average sentiment value for January 1 of year × is then obtained.
The sentiment correlation degree for January 1 of year × is then computed. When this IFC is 0, there is no correlation between the comment count and the sentiment value for January 1 of year ×; that is, the comment sentiment value has no influence on project progress.
Suppose the Pull requests module has 7 comment data items on January 2 of year ×, denoted the first to seventh comment data items respectively.
The comment count for January 2 of year × is then 7.
Each of the 7 comment data items is processed by the sentiment matching process SST to obtain the sentiment value belonging to that item.
The average sentiment value for January 1 to 2 of year × is then obtained.
The sentiment correlation degree for January 1 to 2 of year × is then computed. When this IFC is less than 0, there is a negative correlation between the comment count and the sentiment value for January 1 to 2 of year ×; that is, the comment sentiment value may have a negative influence on project progress.
The comment data of the other days are processed in the same way as for January 2 of year ×, yielding the comment count, average sentiment value and sentiment correlation degree for each day.
When IFC is greater than 0, there is a positive correlation between the comment count and the sentiment value; that is, the comment sentiment value has a positive influence on project progress.
Suppose the Pull requests module has 7 comment data items on January 10 of year ×, denoted the first to seventh comment data items respectively.
The comment count for January 10 of year × is then 7.
Each of the 7 comment data items is processed by the sentiment matching process SST to obtain the sentiment value belonging to that item.
The average sentiment value for January 10 of year × is then obtained.
The sentiment correlation degree for January 1 to 10 of year × is then computed. When this IFC is greater than 0, there is a positive correlation between the comment count and the sentiment value for January 1 to 10 of year ×; that is, the comment sentiment value has a positive influence on project progress.
Embodiment 2
Processing of the merge-request comment data set in units of weeks: to extract the comments, the source code of the project of interest on GitHub is cloned locally to create a local week-level code repository, and the comment data of the Pull requests module are then extracted for the code in that repository.
Sentiment correlation analysis of the Pull requests module comment data with the week as the comment time unit:
Suppose the Pull requests module has 3 comment data items in the first week of year ×, denoted the first, second and third comment data items respectively.
The comment count for the first week of year × is then 3.
Each of the 3 comment data items is processed by the sentiment matching process SST to obtain the sentiment value belonging to that item.
The average sentiment value for the first week of year × is then obtained.
The sentiment correlation degree for the first week of year × is then computed. When this IFC is 0, there is no correlation between the comment count and the sentiment value for the first week of year ×; that is, the comment sentiment value has no influence on project progress.
Suppose the Pull requests module has 8 comment data items in the second week of year ×, denoted the first to eighth comment data items respectively.
The comment count for the second week of year × is then 8.
Each of the 8 comment data items is processed by the sentiment matching process SST to obtain the sentiment value belonging to that item.
The average sentiment value for the second week of year × is then obtained.
The sentiment correlation degree for weeks 1 to 2 of year × is then computed. When this IFC is less than 0, there is a negative correlation between the comment count and the sentiment value for weeks 1 to 2 of year ×; that is, the comment sentiment value may have a negative influence on project progress.
The comment data of the other weeks are processed in the same way as for the second week of year ×, yielding the comment count, average sentiment value and sentiment correlation degree for each week.
When IFC is greater than 0, there is a positive correlation between the comment count and the sentiment value; that is, the comment sentiment value has a positive influence on project progress.
The sentiment correlation degree for the first 10 weeks of year × is then computed.
When this IFC is greater than 0, there is a positive correlation between the comment count and the sentiment value for the first 10 weeks of year ×; that is, the comment sentiment value has a positive influence on project progress.
From PR = {r_1, r_2, …, r_a, …, r_A}, the week-level merge-request comment data, in units of weeks, are selected, with C ∈ A; at the same time the comment count WEEK_PR_num of the week-level merge-request comment data is obtained, with WEEK_PR_num = C;
The week-level merge-request comment data run from the first item, through the second item and any one item, to the last item; C denotes the total number of week-level merge-request comment data items, C ∈ A. For convenience of explanation, the c-th week-level merge-request comment data item is also referred to by its identification number c.
Embodiment 3
Processing of the merge-request comment data set in units of months: to extract the comments, the source code of the project of interest on GitHub is cloned locally to create a local month-level code repository, and the comment data of the Pull requests module are then extracted for the code in that repository.
Sentiment correlation analysis of the Pull requests module comment data with the month as the comment time unit:
Suppose the Pull requests module has 5 comment data items in January of year ×, denoted the first to fifth comment data items respectively.
The comment count for January of year × is then 5.
Each of the 5 comment data items is processed by the sentiment matching process SST to obtain the sentiment value belonging to that item.
The average sentiment value for January of year × is then obtained.
The sentiment correlation degree for January of year × is then computed. When this IFC is 0, there is no correlation between the comment count and the sentiment value for January of year ×; that is, the comment sentiment value has no influence on project progress.
Suppose the Pull requests module contains 10 comment data in February of year ×, denoted in order as the first through tenth comment data.
Then the comment amount for February of year × is denoted num_Feb (num_Feb = 10).
Each of the ten comment data is processed by the emotion matching treatment SST, which yields the emotional value belonging to that comment data.
The average emotional value for February of year × is the mean of these ten emotional values.
The emotion degree of correlation for months 1-2 of year × is denoted IFC_Feb.
When IFC_Feb is less than 0, the comment amount and the emotional value for months 1-2 of year × are negatively correlated; the comment emotional value may have a negative effect on project progress.
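The January and February cases can be checked numerically. The sketch below assumes that IFC is the population covariance between the per-period comment amount and the per-period average emotional value; this reading is consistent with the title and with IFC being 0 when only one month is available, but it is an assumption. The function name `ifc` and the average emotional values 0.5 and 0.25 are hypothetical and chosen only for illustration.

```python
def ifc(nums, avg_ses):
    """Population covariance between comment amounts and average emotional values."""
    n = len(nums)
    if n == 0 or n != len(avg_ses):
        raise ValueError("need two equally sized, non-empty sequences")
    mean_num = sum(nums) / n
    mean_se = sum(avg_ses) / n
    return sum((x - mean_num) * (s - mean_se) for x, s in zip(nums, avg_ses)) / n

# Comment amounts from the text: 5 in January, 10 in February.
# The average emotional values 0.5 and 0.25 are hypothetical.
ifc_jan = ifc([5], [0.5])             # single period: every deviation is zero
ifc_feb = ifc([5, 10], [0.5, 0.25])   # months 1-2: more comments, lower emotion
print(ifc_jan, ifc_feb)               # 0.0 -0.3125
```

With a single month both deviations vanish, so the value is 0; once February adds more comments with a lower average emotional value, the product of deviations turns negative.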
Comment data for different days are processed in the same way as for February of year ×, yielding the comment amount, average emotional value and emotion degree of correlation for each day:
When IFC is greater than 0, the comment amount and the emotional value are positively correlated; that is, the comment emotional value has a positive effect on project progress.
The emotion degree of correlation up to October of year ×:
When IFC_Oct is greater than 0, the comment amount and the emotional value for months 1-10 of year × are positively correlated; that is, the comment emotional value has a positive effect on project progress.
From PR={ r1,r2,…,ra,…,rAIn choose the moon-request as unit of moon month and merge comment dataD ∈ A, while obtaining as unit of moon month The moon-request merges the comment amount MONTH_PR_num of comment data, and MONTH_PR_num=D;It is describedInIndicate first request as unit of the moon Merge comment data,Indicate that second request as unit of the moon merges comment data,Indicate any one with The moon is that the request of unit merges comment data,Indicate the last one request merging comment data as unit of the moon, D table Show that the request as unit of the moon merges the total number of comment data, D ∈ A;For convenience of explanation,Also illustrate that d-th with the moon Merge comment data for the request of unit, d indicates that the request as unit of the moon merges the identification number of comment data.

Claims (6)

1. A method for extracting the covariance correlation of project progress based on emotional factors, characterized by comprising the following processing steps:
Step 1: extract the request-merge comment data set PR from the Pull requests module of a GitHub project;
The request-merge comment data set is denoted PR = {r_1, r_2, …, r_a, …, r_A};
r_1 denotes the first request-merge comment data;
r_2 denotes the second request-merge comment data;
r_a denotes the a-th request-merge comment data, where a is the identification number of the request-merge comment data;
r_A denotes the last request-merge comment data, where A is the total number of request-merge comment data.
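As a minimal sketch of how the set PR might be held in memory, the block below models each request-merge comment data r_a as a record with a comment time and comment content. The class name `MergeComment`, the helper `build_pr`, the sorting by time and the sample records are all assumptions made for illustration; the claim does not prescribe a data structure or an ordering.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class MergeComment:
    """One request-merge comment r_a: when it was posted and what it says."""
    time: datetime
    content: str

def build_pr(raw_records):
    """Build PR = {r_1, ..., r_A} from (ISO timestamp, text) pairs, ordered by time."""
    comments = [MergeComment(datetime.fromisoformat(ts), text) for ts, text in raw_records]
    return sorted(comments, key=lambda r: r.time)

# Hypothetical records; a real run would read them from the project's pull requests.
pr = build_pr([
    ("2018-02-03T10:15:00", "This breaks the build."),
    ("2018-01-12T09:00:00", "Nice fix, thanks!"),
])
A = len(pr)  # total number of request-merge comment data
```

Keeping the comment time on every record is what later allows the set to be partitioned by day, week or month.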
Step 2: obtain the comment content of the Pull requests module from the request-merge comment data set PR, and match each comment content with a sentiment analysis value;
Any one comment content yields one sentiment analysis value SE after processing by the emotion matching treatment SST, so that:
The comment content belonging to r_1 is processed by the emotion matching treatment SST, which outputs the sentiment analysis value belonging to r_1;
The comment content belonging to r_2 is processed by the emotion matching treatment SST, which outputs the sentiment analysis value belonging to r_2;
The comment content belonging to r_a is processed by the emotion matching treatment SST, which outputs the sentiment analysis value belonging to r_a;
The comment content belonging to r_A is processed by the emotion matching treatment SST, which outputs the sentiment analysis value belonging to r_A;
The sentiment analysis value SE is the numerical value output by the emotion matching treatment SST when the comment content of any one comment data is input to it.
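The claim treats SST as a black box that maps comment content to one number. The sketch below is only a stand-in for that step: it uses a tiny word lexicon, which is an assumption and not the SST procedure itself. The names `LEXICON` and `sst_stand_in` and the word weights are hypothetical.

```python
import re

# Tiny hypothetical lexicon; the actual SST resource is not specified in the text.
LEXICON = {"thanks": 1, "nice": 1, "great": 2, "breaks": -1, "bug": -1, "terrible": -2}

def sst_stand_in(content: str) -> int:
    """Return one sentiment analysis value SE for one comment's content."""
    words = re.findall(r"[a-z']+", content.lower())
    return sum(LEXICON.get(w, 0) for w in words)

se_values = [sst_stand_in(c) for c in ["Nice fix, thanks!", "This breaks the build."]]
print(se_values)  # [2, -1]
```

Any scorer with the same shape (text in, one number out) could be substituted without changing the later steps.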
Step 3: obtain the comment amount of the Pull requests module from the request-merge comment data set PR;
Since every request-merge comment data carries a comment time and comment content, the request-merge comment data set PR = {r_1, r_2, …, r_a, …, r_A} is partitioned according to comment time, yielding the day-request merge comment data in units of days (B ∈ A), the week-request merge comment data in units of weeks (C ∈ A), or the month-request merge comment data in units of months (D ∈ A).
The number of day-request merge comment data partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is recorded as the day-comment amount num_day;
The number of week-request merge comment data partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is recorded as the week-comment amount num_week;
The number of month-request merge comment data partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is recorded as the month-comment amount num_month;
The elements of the day-request merge comment data denote, in order: the first comment data whose comment time is taken in units of days; the second such comment data; any one such comment data, where b is the identification number of the comment data whose comment time is taken in units of days; and the last such comment data;
The elements of the week-request merge comment data denote, in order: the first comment data whose comment time is taken in units of weeks; the second such comment data; any one such comment data, where c is the identification number of the comment data whose comment time is taken in units of weeks; and the last such comment data;
The elements of the month-request merge comment data denote, in order: the first comment data whose comment time is taken in units of months; the second such comment data; any one such comment data, where d is the identification number of the comment data whose comment time is taken in units of months; and the last such comment data;
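The partition by comment time can be sketched as follows. This is one possible reading, in which comments are bucketed by calendar day, ISO week or calendar month and counted per bucket; the helper names `period_key` and `comment_amounts`, the ISO-week convention and the sample timestamps are assumptions, not part of the claim.

```python
from collections import Counter
from datetime import datetime

def period_key(t: datetime, unit: str) -> str:
    """Map a comment time to its day, ISO week, or month bucket."""
    if unit == "day":
        return t.date().isoformat()
    if unit == "week":
        iso = t.isocalendar()          # (ISO year, ISO week number, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if unit == "month":
        return f"{t.year}-{t.month:02d}"
    raise ValueError(f"unknown unit: {unit}")

def comment_amounts(times, unit):
    """Count comments per bucket, i.e. the comment amount for each period."""
    return Counter(period_key(t, unit) for t in times)

# Hypothetical comment times.
times = [datetime(2018, 1, 12, 9), datetime(2018, 1, 12, 17), datetime(2018, 2, 3, 10)]
num_day = comment_amounts(times, "day")
num_week = comment_amounts(times, "week")
num_month = comment_amounts(times, "month")
```

The same list of comment times produces three different sets of comment amounts, one per time unit, which is what allows IFC to be evaluated at day, week and month granularity.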
Step 4: obtain the correlation degree of the time-partitioned comment data according to the emotion degree of correlation IFC;
In the present invention, in the emotion degree of correlation IFC, n denotes the number of summed elements, i denotes the summation index, num_i denotes the comment amount at index i, SE_i denotes the sentiment analysis value at index i, and the average comment amount and the average sentiment analysis value are the means of num_i and SE_i respectively. IFC is calculated by the product-moment method: it is based on the deviations of the two factors from their respective average values, and the two deviations are multiplied to reflect the degree of correlation between the two factors. The larger the value of IFC, the higher the degree of correlation between the two factors; conversely, the lower the degree of correlation between the two factors. IFC > 0 indicates a positive correlation between the two factors, IFC < 0 indicates a negative correlation between the two factors, and IFC = 0 indicates that there is no linear correlation between the two factors.
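One form consistent with this verbal description is the covariance of the two factors over the n periods. The normalization by n is an assumption; the description fixes only the product of deviations from the respective means.

```latex
\mathrm{IFC} = \frac{1}{n}\sum_{i=1}^{n}\left(num_i - \overline{num}\right)\left(SE_i - \overline{SE}\right),
\qquad
\overline{num} = \frac{1}{n}\sum_{i=1}^{n} num_i,
\qquad
\overline{SE} = \frac{1}{n}\sum_{i=1}^{n} SE_i
```

Under this form, n = 1 gives num_1 equal to its own mean, so both deviations are zero and IFC = 0, which matches the single-month case in the description.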
If the emotion degree of correlation IFC is computed over the day-request merge comment data, the resulting correlation degree of the comment data belonging to the day is denoted IFC_day;
If the emotion degree of correlation IFC is computed over the week-request merge comment data, the resulting correlation degree of the comment data belonging to the week is denoted IFC_week;
If the emotion degree of correlation IFC is computed over the month-request merge comment data, the resulting correlation degree of the comment data belonging to the month is denoted IFC_month.
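Steps 3 and 4 can be combined in a short sketch: group scored comments by period, take the comment amount and average sentiment analysis value of each period, and evaluate IFC over the periods. The covariance form with division by n, the function name `ifc_by_period` and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

def ifc_by_period(scored_comments, key):
    """scored_comments: iterable of (date, SE). key: maps a date to its period label."""
    buckets = defaultdict(list)
    for day, se in scored_comments:
        buckets[key(day)].append(se)
    nums = [len(v) for v in buckets.values()]           # comment amount per period
    ses = [sum(v) / len(v) for v in buckets.values()]   # average SE per period
    n = len(nums)
    if n == 0:
        raise ValueError("no comments to analyse")
    mean_num, mean_se = sum(nums) / n, sum(ses) / n
    return sum((a - mean_num) * (b - mean_se) for a, b in zip(nums, ses)) / n

# Hypothetical scored comments: (comment date, sentiment analysis value).
data = [(date(2018, 1, 5), 1.0), (date(2018, 1, 20), 0.0),
        (date(2018, 2, 2), 1.0), (date(2018, 2, 9), 1.0),
        (date(2018, 2, 15), 1.0), (date(2018, 2, 28), 1.0)]
ifc_month = ifc_by_period(data, key=lambda d: (d.year, d.month))  # 0.25
ifc_day = ifc_by_period(data, key=lambda d: d)                    # 0.0
```

In this sample the month with more comments also has the higher average value, so IFC_month is positive, while at day granularity every day holds exactly one comment and the comment amount does not vary, so IFC_day is 0.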
2. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: num_day is assigned a value of 500, i.e. B = 500.
3. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: num_week is assigned a value of 1500, i.e. C = 1500.
4. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: num_month is assigned a value of 5000, i.e. D = 5000.
5. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: the method is applied to the Pull requests module of the GitHub platform to extract and analyze comment data. The comment data include the comment content and the comment amount. To extract the comments, the source code of the project of interest on GitHub is cloned locally, a local code repository is created, and the Pull requests module comment data are then extracted for the code in the repository.
6. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: software development progress is improved by means of the quantified emotion degree of correlation IFC.
CN201810728956.1A 2018-07-05 2018-07-05 Method for extracting covariance correlation of project progress based on emotional factors Active CN108958710B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810728956.1A CN108958710B (en) 2018-07-05 2018-07-05 Method for extracting covariance correlation of project progress based on emotional factors

Publications (2)

Publication Number Publication Date
CN108958710A true CN108958710A (en) 2018-12-07
CN108958710B CN108958710B (en) 2021-07-16

Family

ID=64485789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810728956.1A Active CN108958710B (en) 2018-07-05 2018-07-05 Method for extracting covariance correlation of project progress based on emotional factors

Country Status (1)

Country Link
CN (1) CN108958710B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101782898A (en) * 2010-03-25 2010-07-21 中国科学院计算技术研究所 Method for analyzing tendentiousness of affective words
CN101901230A (en) * 2009-05-31 2010-12-01 国际商业机器公司 Information retrieval method, user comment processing method and system thereof
CN103064971A (en) * 2013-01-05 2013-04-24 南京邮电大学 Scoring and Chinese sentiment analysis based review spam detection method
US20160092793A1 (en) * 2014-09-26 2016-03-31 Thomson Reuters Global Resources Pharmacovigilance systems and methods utilizing cascading filters and machine learning models to classify and discern pharmaceutical trends from social media posts
US20160277424A1 (en) * 2015-03-20 2016-09-22 Ashif Mawji Systems and Methods for Calculating a Trust Score

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨波: ""Sentiments Analysis in GitHub Repositories:"", 《2017 24TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS》 *

Also Published As

Publication number Publication date
CN108958710B (en) 2021-07-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant