CN108958710A - Method for extracting covariance correlation of project progress based on emotional factors - Google Patents
- Publication number
- CN108958710A (application number CN201810728956.1A)
- Authority
- CN
- China
- Prior art keywords
- comment
- comment data
- day
- request
- correlation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/20—Software design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/10—Requirements analysis; Specification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
Abstract
The invention discloses a method for extracting the covariance correlation between emotional factors and project progress. The method extracts the comment data of projects on GitHub: it uses the GitHub API to send HTTP requests built by splicing URLs, receives the HTTP response content, parses the JSON-format data in the response text, and either displays the results or stores them locally.
Description
Technical field
The present invention relates to a method concerning influences on software development progress and, more particularly, to a method for extracting the covariance correlation between emotional factors and project progress.
Background art
Open-source software is a form of software that allows users, under an open-source license listed by the OSI (Open Source Initiative), to freely use and modify the source code within the scope of the license, and to combine that source code with other software code. GitHub, a hosting platform for open-source and private software projects, is one of many social development platforms for open-source software.
Many factors influence the development process on the GitHub platform, including the ability level of the developers, the number of developers, the speed at which problems in the open-source software are resolved, incentives for user participation, and emotional factors. Among these, the influence of emotional factors on the GitHub open-source software development process has attracted the attention of researchers.
For the classification of emotional factors, reference is made to "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank", published in October 2013, which divides emotional factors into five classes: "the Recursive Neural Tensor Network accurately predicting 5 sentiment classes, very negative to very positive (––, –, 0, +, ++), at every node of a parse tree and capturing the negation and its scope in this sentence." The five classes are: very negative, negative, neutral, positive, and very positive.
Different emotional factors play a role in the software development process, but the influence of emotional factors on GitHub open-source software has not been studied in terms of correlation, especially with respect to project progress. Moreover, there is no dedicated extraction specification for GitHub comment data, which makes the data difficult to analyze.
Summary of the invention
The present invention proposes a method for extracting the covariance correlation between emotional factors and project progress. The method addresses how to extract the comment data of a project on GitHub, how to run sentiment analysis on that comment data to obtain sentiment-related data, and how to use correlation analysis to relate emotional factors to the speed of project progress.
The method of the invention for extracting the covariance correlation between emotional factors and project progress is characterized by comprising the following processing steps:
Step 1: extract the pull-request comment data set PR from the Pull requests module of a GitHub project;
The pull-request comment data set is denoted PR = {r_1, r_2, …, r_a, …, r_A};
Step 2: obtain the comment content of the Pull requests module from the pull-request comment data set PR, and match a sentiment analysis value to each comment content;
Each comment content is processed by the sentiment matching process SST, which outputs one sentiment analysis value SE. That is:
The comment content belonging to r_1 is processed by SST, which outputs the sentiment analysis value of r_1;
The comment content belonging to r_2 is processed by SST, which outputs the sentiment analysis value of r_2;
The comment content belonging to r_a is processed by SST, which outputs the sentiment analysis value of r_a;
The comment content belonging to r_A is processed by SST, which outputs the sentiment analysis value of r_A;
The sentiment analysis value SE is the numerical value that SST outputs when the comment content of any one comment data item is input to it;
Step 3: obtain the comment count of the Pull requests module from the pull-request comment data set PR;
Because every pull-request comment data item carries a comment time and a comment content, the set PR = {r_1, r_2, …, r_a, …, r_A} is partitioned by comment time into:
Daily pull-request comment data, in units of days, containing B items, B ≤ A;
Weekly pull-request comment data, in units of weeks, containing C items, C ≤ A; or:
Monthly pull-request comment data, in units of months, containing D items, D ≤ A;
The number of daily pull-request comment data items partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is denoted the daily comment count num_day;
The number of weekly pull-request comment data items partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is denoted the weekly comment count num_week;
The number of monthly pull-request comment data items partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is denoted the monthly comment count num_month;
Step 4: obtain the degree of correlation of the time-partitioned comment data according to the sentiment correlation degree IFC;
In the present invention, in the sentiment correlation degree IFC, n denotes the number of terms summed, i denotes the summation index, num_i denotes the comment count for index i, the average comment count is the mean of the num_i, SE_i denotes the sentiment analysis value for index i, and the average sentiment analysis value is the mean of the SE_i. IFC is calculated by the product-moment method: it is based on the deviations of the two factors from their respective means, and the two deviations are multiplied to reflect the degree of correlation between the two factors. The larger the value of IFC, the higher the degree of correlation between the two factors; the smaller the value, the lower the degree of correlation. IFC > 0 indicates a positive correlation between the two factors, IFC < 0 indicates a negative correlation, and IFC = 0 indicates that there is no linear correlation between the two factors;
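The description above (deviations from the respective means, multiplied and summed over n terms) matches the form of a covariance. As a hedged sketch only, assuming IFC is the covariance of comment count and sentiment value with a normalizing factor of 1/n (the normalization and the overline notation for the means are assumptions, not confirmed by the text):

```latex
\mathrm{IFC} = \frac{1}{n}\sum_{i=1}^{n}\bigl(num_i - \overline{num}\bigr)\bigl(SE_i - \overline{SE}\bigr),
\qquad
\overline{num} = \frac{1}{n}\sum_{i=1}^{n} num_i,
\qquad
\overline{SE} = \frac{1}{n}\sum_{i=1}^{n} SE_i
```

Under this reading the sign of IFC carries the direction of the relationship, which agrees with the interpretation of IFC > 0, IFC < 0 and IFC = 0 given in the text.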
If the daily pull-request comment data is processed with the sentiment correlation degree IFC, the degree of correlation of the daily comment data is obtained, denoted IFC_day;
If the weekly pull-request comment data is processed with the sentiment correlation degree IFC, the degree of correlation of the weekly comment data is obtained, denoted IFC_week;
If the monthly pull-request comment data is processed with the sentiment correlation degree IFC, the degree of correlation of the monthly comment data is obtained, denoted IFC_month.
The advantage of the method of the invention for extracting the covariance correlation between emotional factors and project progress is as follows: starting from an analysis of the data generated during open-source software development, it analyzes the emotional factors that influence the GitHub open-source software development process, proposes influence factors such as the growth rate of pull requests, and analyzes the correlation between the emotional factors and the proposed quantities.
Detailed description
The present invention is described in further detail below with reference to embodiments.
The present invention draws on "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank", published in October 2013. The five classes of emotional factors in that article, "very negative, negative, neutral, positive, very positive", are quantized as 0, 1, 2, 3 and 4 respectively and used as the sentiment analysis value of a comment content in the Pull requests module, denoted SE; thus SE = 0, SE = 1, SE = 2, SE = 3 or SE = 4. Each comment content corresponds to one sentiment analysis value SE. In the present invention, obtaining the sentiment analysis value SE with the method of "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank" is called the sentiment matching process SST.
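The five-class quantization can be written down as a small lookup table. The sketch below is illustrative only: the label strings and the helper name `to_se` are assumptions, and only the values 0 through 4 come from the text.

```python
# Five sentiment classes mapped to the integer values 0..4 described above.
# The label strings are illustrative assumptions, not taken from the patent.
SENTIMENT_TO_SE = {
    "very negative": 0,
    "negative": 1,
    "neutral": 2,
    "positive": 3,
    "very positive": 4,
}


def to_se(label: str) -> int:
    """Return the sentiment analysis value SE for a class label."""
    return SENTIMENT_TO_SE[label.strip().lower()]
```

Keeping the mapping in one place makes it easy to check that every classifier output is converted to exactly one SE value.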
The present invention is applied to the Pull requests module of the GitHub platform to extract and analyze comment data. The comment data includes the comment content and the comment count (the number of comment contents). To extract the comments, the source code of a project of interest on GitHub is cloned locally to create a local code repository, and the Pull requests module comment data is then extracted for the code in that repository.
The technical problem solved by the invention is: for the comment data of the Pull requests module on the GitHub platform, to extract the correlation between comment content, comment count and emotional factors, in order to study the influence of emotional factors on project progress.
In the present invention, the pull-request comment data set PR arises as follows: when using the GitHub platform, users feed back problems, defects and similar comment content to the developers; the developers then screen and sort the users' comment data and merge the comments they consider suitable for improving the project. In set form, the pull-request comment data set is denoted PR = {r_1, r_2, …, r_a, …, r_A}, where r_1 denotes the first pull-request comment data item, r_2 the second, r_a any one item, and r_A the last; A denotes the total number of pull-request comment data items. For convenience, r_a also denotes the a-th pull-request comment data item, with a being its identification number. Each item r_a contains a time and a comment content, expressed in set form as r_a = {time, content}, where content is the comment content. The time covers day, week and month, so time = {day, week, month}. The time is thus analyzed along three dimensions: for each time period, the correlation between the comment data, the corresponding pull-request speed, and the emotional factors is analyzed.
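The record r_a = {time, content} with its day, week and month views could be modelled as below. This is a sketch under assumptions: the class name `PRComment`, the key formats, and the use of ISO week numbering are illustrative choices, since the text does not say how weeks are delimited.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PRComment:
    """One pull-request comment data item r_a = {time, content}."""
    time: datetime
    content: str

    def day_key(self) -> str:
        # Day dimension, e.g. "2018-01-01".
        return self.time.strftime("%Y-%m-%d")

    def week_key(self) -> str:
        # Week dimension, using ISO week numbering (an assumption).
        iso = self.time.isocalendar()
        return f"{iso[0]}-W{iso[1]:02d}"

    def month_key(self) -> str:
        # Month dimension, e.g. "2018-01".
        return self.time.strftime("%Y-%m")
```

For example, `PRComment(datetime(2018, 1, 1, 9, 30), "Looks good")` has the keys `2018-01-01`, `2018-W01` and `2018-01`.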
The processing steps of the method of the invention for extracting the covariance correlation between emotional factors and project progress are as follows:
Step 1: extract the pull-request comment data set PR from the Pull requests module of a GitHub project;
In the present invention, the pull-request comment data set is denoted in set form as PR = {r_1, r_2, …, r_a, …, r_A};
r_1 denotes the first pull-request comment data item;
r_2 denotes the second pull-request comment data item;
r_a denotes the a-th pull-request comment data item; a is the identification number of the pull-request comment data item;
r_A denotes the last pull-request comment data item; A denotes the total number of pull-request comment data items.
In the present invention, the pull-request comment data set PR arises as follows: when using the GitHub platform, users feed back problems, defects and similar comment content to the developers; the developers then screen and sort the users' comments and merge the comments they consider suitable for improving the project. To extract the comments, the source code of a project of interest on GitHub is cloned locally to create a local code repository, and the Pull requests module comment data is then extracted for the code in that repository.
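The abstract describes the extraction as splicing a URL, sending an HTTP request through the GitHub API, and parsing the JSON response. A minimal sketch of that flow follows. It assumes the GitHub REST endpoint that lists review comments for a repository's pull requests and keeps only the `created_at` and `body` fields; the function names and the choice of endpoint are assumptions, not the patent's stated implementation.

```python
import json
from typing import Dict, List
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def build_comments_url(owner: str, repo: str, page: int = 1, per_page: int = 100) -> str:
    """Splice the URL for review comments on a repository's pull requests."""
    return f"{API_ROOT}/repos/{owner}/{repo}/pulls/comments?per_page={per_page}&page={page}"


def parse_comments(json_text: str) -> List[Dict[str, str]]:
    """Keep only the comment time and comment content from the JSON response."""
    return [
        {"time": item["created_at"], "content": item["body"]}
        for item in json.loads(json_text)
    ]


def fetch_comments(owner: str, repo: str, page: int = 1) -> List[Dict[str, str]]:
    """Send the HTTP request and parse the response (requires network access)."""
    request = Request(
        build_comments_url(owner, repo, page),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urlopen(request) as response:
        return parse_comments(response.read().decode("utf-8"))
```

The parsed records can then be displayed or written to a local file, as the abstract describes; unauthenticated requests are rate limited, so a real run would likely add a token header.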
Step 2: obtain the comment content of the Pull requests module from the pull-request comment data set PR, and match a sentiment analysis value to each comment content;
In the present invention, each comment content is processed by the sentiment matching process SST, which outputs one sentiment analysis value SE. That is:
The comment content belonging to r_1 is processed by SST, which outputs the sentiment analysis value of r_1;
The comment content belonging to r_2 is processed by SST, which outputs the sentiment analysis value of r_2;
The comment content belonging to r_a is processed by SST, which outputs the sentiment analysis value of r_a;
The comment content belonging to r_A is processed by SST, which outputs the sentiment analysis value of r_A.
In the present invention, the sentiment analysis value SE is the numerical value that the sentiment matching process SST outputs when the comment content of any one comment data item is input to it.
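Step 2 amounts to mapping each comment content through SST and collecting one SE per comment. The sketch below shows that loop with a pluggable scoring function. `toy_sst` is a keyword-based placeholder invented for this example and is not the treebank model; a real run would substitute an actual five-class sentiment classifier.

```python
from typing import Callable, List


def score_comments(contents: List[str], sst: Callable[[str], int]) -> List[int]:
    """Apply the sentiment matching step to every comment; one SE per comment."""
    values = []
    for text in contents:
        se = sst(text)
        if se not in (0, 1, 2, 3, 4):
            raise ValueError(f"SE must be in 0..4, got {se!r}")
        values.append(se)
    return values


def toy_sst(text: str) -> int:
    """Placeholder classifier for demonstration only; NOT the treebank model."""
    lowered = text.lower()
    if "great" in lowered:
        return 4
    if "broken" in lowered:
        return 0
    return 2
```

Passing the classifier as an argument keeps the scoring loop independent of whichever sentiment model is actually used.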
Step 3: obtain the comment count of the Pull requests module from the pull-request comment data set PR;
Because every pull-request comment data item carries a comment time and a comment content, the set PR = {r_1, r_2, …, r_a, …, r_A} is partitioned by comment time into comment data in units of days (daily pull-request comment data, containing B items, B ≤ A), comment data in units of weeks (weekly pull-request comment data, containing C items, C ≤ A), or comment data in units of months (monthly pull-request comment data, containing D items, D ≤ A).
The number of daily pull-request comment data items partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is denoted the daily comment count num_day. For convenience of illustration, num_day is set to 500, i.e. B = 500.
The number of weekly pull-request comment data items partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is denoted the weekly comment count num_week. For convenience of illustration, num_week is set to 1500, i.e. C = 1500.
The number of monthly pull-request comment data items partitioned from PR = {r_1, r_2, …, r_a, …, r_A} is denoted the monthly comment count num_month. For convenience of illustration, num_month is set to 5000, i.e. D = 5000.
Within the daily pull-request comment data, the items run from the first, through the second and any one (b-th) item, to the last (B-th) comment data item whose comment time is in units of days; b denotes the identification number of a comment data item whose comment time is in units of days.
Within the weekly pull-request comment data, the items run from the first, through the second and any one (c-th) item, to the last (C-th) comment data item whose comment time is in units of weeks; c denotes the identification number of a comment data item whose comment time is in units of weeks.
Within the monthly pull-request comment data, the items run from the first, through the second and any one (d-th) item, to the last (D-th) comment data item whose comment time is in units of months; d denotes the identification number of a comment data item whose comment time is in units of months.
In the present invention, the number of comment contents (the comment count) and the comment contents themselves are extracted separately from the Pull requests module, so that they can be quantized for the emotional-factor correlation in the sentiment analysis.
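The partition in Step 3 can be sketched as a single pass that counts comments per day, per week and per month. The function name `comment_counts`, the key formats, and the use of ISO week numbering are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def comment_counts(times: Iterable[datetime]) -> Dict[str, Counter]:
    """Count comments per day, per ISO week and per month from comment times."""
    counts = {"day": Counter(), "week": Counter(), "month": Counter()}
    for t in times:
        iso = t.isocalendar()
        counts["day"][t.strftime("%Y-%m-%d")] += 1
        counts["week"][f"{iso[0]}-W{iso[1]:02d}"] += 1
        counts["month"][t.strftime("%Y-%m")] += 1
    return counts
```

Each counter plays the role of the comment count (num_day, num_week or num_month) for the corresponding period.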
Step 4: obtain the degree of correlation of the time-partitioned comment data according to the sentiment correlation degree IFC;
In the present invention, in the sentiment correlation degree IFC, n denotes the number of terms summed, i denotes the summation index, num_i denotes the comment count for index i, the average comment count is the mean of the num_i, SE_i denotes the sentiment analysis value for index i, and the average sentiment analysis value is the mean of the SE_i. IFC is calculated by the product-moment method: it is based on the deviations of the two factors from their respective means, and the two deviations are multiplied to reflect the degree of correlation between the two factors. The larger the value of IFC, the higher the degree of correlation between the two factors; the smaller the value, the lower the degree of correlation. IFC > 0 indicates a positive correlation between the two factors, IFC < 0 indicates a negative correlation, and IFC = 0 indicates that there is no linear correlation between the two factors.
If the daily pull-request comment data is processed with the sentiment correlation degree IFC, the degree of correlation of the daily comment data is obtained, denoted IFC_day.
If the weekly pull-request comment data is processed with the sentiment correlation degree IFC, the degree of correlation of the weekly comment data is obtained, denoted IFC_week.
If the monthly pull-request comment data is processed with the sentiment correlation degree IFC, the degree of correlation of the monthly comment data is obtained, denoted IFC_month.
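The product-moment calculation in Step 4 can be sketched as below. This is a hedged example: it assumes IFC is the covariance of the per-period comment counts and sentiment values, normalized by n. The normalization and the function name `ifc` are assumptions, since the text only describes multiplying the deviations from the two means.

```python
from typing import Sequence


def ifc(num: Sequence[float], se: Sequence[float]) -> float:
    """Covariance-style sentiment correlation degree (assumed 1/n normalization)."""
    if len(num) != len(se) or not num:
        raise ValueError("num and se must be non-empty and of equal length")
    n = len(num)
    mean_num = sum(num) / n
    mean_se = sum(se) / n
    # Multiply the two deviations from the means and average the products.
    return sum((x - mean_num) * (y - mean_se) for x, y in zip(num, se)) / n
```

The same function would serve for IFC_day, IFC_week and IFC_month; only the period used to aggregate `num` and `se` changes.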
Embodiment 1
Sentiment correlation analysis of the Pull requests module comment data with the day as the comment time unit:
Suppose there are 3 comment data items in the Pull requests module on January 1 of year ×, denoted the first, second and third comment data items.
The comment count for January 1 of year × is then denoted num_{1/1/×} (num_{1/1/×} = 3);
Each of the three comment data items is processed by the sentiment matching process SST to obtain its sentiment value;
The average sentiment value for January 1 of year × is computed from these three sentiment values;
The sentiment correlation degree for January 1 of year × is denoted IFC_{1/1/×}.
When IFC_{1/1/×} is 0, the comment count and the sentiment value for January 1 of year × are uncorrelated; that is, the comment sentiment value has no influence on project progress.
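A zero value on the first day is what a covariance-style calculation gives for a single period. As a hedged illustration, assuming IFC is the covariance of comment count and average sentiment value with a 1/n normalization (an assumption about the formula, not a statement from the text), n = 1 means each quantity equals its own mean:

```latex
n = 1:\quad \overline{num} = num_1,\ \ \overline{SE} = SE_1
\;\Longrightarrow\;
\mathrm{IFC} = \frac{1}{1}\,(num_1 - \overline{num})(SE_1 - \overline{SE}) = 0
```

Under that assumption the correlation only becomes informative once at least two periods are included.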
Suppose there are 7 comment data items in the Pull requests module on January 2 of year ×, denoted the first through seventh comment data items.
The comment count for January 2 of year × is then denoted num_{2/1/×} (num_{2/1/×} = 7);
Each of the seven comment data items is processed by the sentiment matching process SST to obtain its sentiment value;
The average sentiment value for January 1–2 of year × is computed from the sentiment values;
The sentiment correlation degree for January 1–2 of year × is denoted IFC_{2/1/×}.
When IFC_{2/1/×} is less than 0, the comment count and the sentiment value for January 1–2 of year × are negatively correlated; that is, the comment sentiment value may have a negative influence on project progress.
Comment data for other days is processed in the same way as for January 2 of year ×, giving the comment count, average sentiment value and sentiment correlation degree for each day:
When IFC is greater than 0, the comment count and the sentiment value are positively correlated; that is, the comment sentiment value has a positive influence on project progress.
Suppose there are 7 comment data items in the Pull requests module on January 10 of year ×, denoted the first through seventh comment data items.
The comment count for January 10 of year × is then denoted num_{10/1/×} (num_{10/1/×} = 7);
Each of the seven comment data items is processed by the sentiment matching process SST to obtain its sentiment value;
The average sentiment value for January 10 of year × is computed from these seven sentiment values;
The sentiment correlation degree for January 1–10 of year × is denoted IFC_{10/1/×}.
When IFC_{10/1/×} is greater than 0, the comment count and the sentiment value for January 1–10 of year × are positively correlated; that is, the comment sentiment value has a positive influence on project progress.
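The embodiment reports a correlation degree for growing windows (January 1, January 1–2, and so on up to January 1–10). The sketch below computes such a running value over the first k days. It assumes a covariance with 1/n normalization, and the sentiment averages used in the usage note are invented for illustration; only the comment counts 3 and 7 come from the text.

```python
from typing import List, Sequence


def running_ifc(num: Sequence[float], mean_se: Sequence[float]) -> List[float]:
    """IFC over days 1..k for each k, assuming a covariance with 1/n normalization."""
    if len(num) != len(mean_se):
        raise ValueError("num and mean_se must have equal length")
    results = []
    for k in range(1, len(num) + 1):
        xs, ys = num[:k], mean_se[:k]
        mx, my = sum(xs) / k, sum(ys) / k
        results.append(sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / k)
    return results
```

With daily counts `[3, 7]` and made-up daily sentiment averages `[3.0, 2.0]`, the first entry is 0 and the second is negative, matching the qualitative pattern described for January 1 and January 1–2.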
Embodiment 2
The pull-request comment data set is processed in units of weeks. To extract the comments, the source code of a project of interest on GitHub is cloned locally to create a local weekly code repository, and the Pull requests module comment data is then extracted for the code in that weekly code repository.
Sentiment correlation analysis of the Pull requests module comment data with the week as the comment time unit:
Suppose there are 3 comment data items in the Pull requests module in the first week of year ×, denoted the first, second and third comment data items.
The comment count for the first week of year × is then denoted num_{1/year} (num_{1/year} = 3);
Each of the three comment data items is processed by the sentiment matching process SST to obtain its sentiment value;
The average sentiment value for the first week of year × is computed from these three sentiment values;
The sentiment correlation degree for the first week of year × is denoted IFC_{1/year}.
When IFC_{1/year} is 0, the comment count and the sentiment value for the first week of year × are uncorrelated; that is, the comment sentiment value has no influence on project progress.
Suppose there are 8 comment data items in the Pull requests module in the second week of year ×, denoted the first through eighth comment data items.
The comment count for the second week of year × is then denoted num_{2/year} (num_{2/year} = 8);
Each of the eight comment data items is processed by the sentiment matching process SST to obtain its sentiment value;
The average sentiment value for the second week of year × is computed from these eight sentiment values;
The sentiment correlation degree for weeks 1–2 of year × is denoted IFC_{2/year}.
When IFC_{2/year} is less than 0, the comment count and the sentiment value for weeks 1–2 of year × are negatively correlated; that is, the comment sentiment value may have a negative influence on project progress.
Comment data for other weeks is processed in the same way as for the second week of year ×, giving the comment count, average sentiment value and sentiment correlation degree for each week:
When IFC is greater than 0, the comment count and the sentiment value are positively correlated; that is, the comment sentiment value has a positive influence on project progress.
The sentiment correlation degree for the first 10 weeks of year × is denoted IFC_{10/year}.
When IFC_{10/year} is greater than 0, the comment count and the sentiment value for the first 10 weeks of year × are positively correlated; that is, the comment sentiment value has a positive influence on project progress.
From PR = {r_1, r_2, …, r_a, …, r_A}, the weekly pull-request comment data, in units of weeks and containing C items (C ≤ A), is selected, and at the same time the comment count of the weekly pull-request comment data is obtained, denoted WEEK_PR_num, with WEEK_PR_num = C;
Within the weekly pull-request comment data, the items run from the first, through the second and any one item, to the last pull-request comment data item in units of weeks; C denotes the total number of pull-request comment data items in units of weeks, C ≤ A. For convenience, the c-th pull-request comment data item in units of weeks is also referred to, with c being the identification number of a pull-request comment data item in units of weeks.
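Selecting the weekly subset and reading off its size can be sketched as a filter followed by a count. The function name `select_week` and the use of ISO week numbering are assumptions; the text defines only that the weekly subset has C items and that WEEK_PR_num = C.

```python
from datetime import date
from typing import List, Sequence


def select_week(dates: Sequence[date], year: int, week: int) -> List[date]:
    """Return the comment dates that fall in the given ISO year and week."""
    return [d for d in dates if tuple(d.isocalendar())[:2] == (year, week)]
```

For a list of comment dates, `len(select_week(dates, 2018, 1))` then plays the role of WEEK_PR_num for that week.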
Embodiment 3
The pull-request comment data set is processed in units of months. To extract the comments, the source code of a project of interest on GitHub is cloned locally to create a local monthly code repository, and the Pull requests module comment data is then extracted for the code in that monthly code repository.
Sentiment correlation analysis of the Pull requests module comment data with the month as the comment time unit:
Suppose there are 5 comment data items in the Pull requests module in January of year ×, denoted the first through fifth comment data items.
The comment count for January of year × is then denoted num_{Jan/×} (num_{Jan/×} = 5);
Each of the five comment data items is processed by the sentiment matching process SST to obtain its sentiment value;
The average sentiment value for January of year × is computed from these five sentiment values;
The sentiment correlation degree for January of year × is denoted IFC_{Jan/×}.
When IFC_{Jan/×} is 0, the comment count and the sentiment value for January of year × are uncorrelated; that is, the comment sentiment value has no influence on project progress.
Suppose the Pull requests module contains 10 comment data in February of year ×; they are denoted, in order, as the first through tenth comment data.
The comment amount belonging to February of year × is then denoted $num_{Feb}$ ($num_{Feb}=10$);
Each of the ten comment data is processed in turn by the sentiment matching process SST, giving the sentiment value belonging to the first through tenth comment data respectively.
The ten sentiment values are averaged to give the average sentiment value belonging to February of year ×.
The sentiment degree of correlation belonging to January to February of year × is denoted $IFC_{Feb}$.
When $IFC_{Feb}$ is less than 0, the comment amount and the sentiment value belonging to January to February of year × are negatively correlated; the comment sentiment value may have a negative influence on project progress.
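The two monthly outcomes above follow from the structure of a product-of-deviations measure. The derivation below is a sketch under an assumption: that IFC equals the sum of products of deviations described in claim 1, up to a positive normalizing constant that this excerpt does not state. Here $SE_{Jan}$ and $SE_{Feb}$ stand for the monthly average sentiment values; these symbols are introduced only for illustration.

```latex
% One period (January only, n = 1): the mean equals the single observation,
% so both deviations vanish.
\overline{num} = num_{Jan}, \quad \overline{SE} = SE_{Jan}
\;\Rightarrow\;
(num_{Jan}-\overline{num})(SE_{Jan}-\overline{SE}) = 0 .

% Two periods (January and February, n = 2), with num_{Jan}=5, num_{Feb}=10:
\sum_{i=1}^{2}(num_i-\overline{num})(SE_i-\overline{SE})
  = \tfrac{1}{2}\,(num_{Feb}-num_{Jan})(SE_{Feb}-SE_{Jan})
  = 2.5\,(SE_{Feb}-SE_{Jan}) .
```

Under this assumption, a single period always gives a zero sum, and $IFC_{Feb}<0$ holds exactly when the average sentiment value fell from January to February while the comment amount rose from 5 to 10.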
Comment data for the other periods are processed by the same method as for February of year ×, giving the comment amount, average sentiment value, and sentiment degree of correlation of each period.
When IFC is greater than 0, the comment amount and the sentiment value are positively correlated; that is, the comment sentiment value can have a positive influence on project progress.
The sentiment degree of correlation belonging to the months up to October of year × is denoted $IFC_{Oct}$.
When $IFC_{Oct}$ is greater than 0, the comment amount and the sentiment value belonging to January to October of year × are positively correlated; the comment sentiment value can have a positive influence on project progress.
From PR={ r1,r2,…,ra,…,rAIn choose the moon-request as unit of moon month and merge comment dataD ∈ A, while obtaining as unit of moon month
The moon-request merges the comment amount MONTH_PR_num of comment data, and MONTH_PR_num=D;It is describedInIndicate first request as unit of the moon
Merge comment data,Indicate that second request as unit of the moon merges comment data,Indicate any one with
The moon is that the request of unit merges comment data,Indicate the last one request merging comment data as unit of the moon, D table
Show that the request as unit of the moon merges the total number of comment data, D ∈ A;For convenience of explanation,Also illustrate that d-th with the moon
Merge comment data for the request of unit, d indicates that the request as unit of the moon merges the identification number of comment data.
Claims (6)
1. A method for extracting the covariance correlation of project progress based on emotional factors, characterized by comprising the following processing steps:
Step 1: extract the pull-request comment data set PR from the Pull requests module of a GitHub project;
The pull-request comment data set is denoted $PR=\{r_1,r_2,\ldots,r_a,\ldots,r_A\}$;
$r_1$ denotes the first pull-request comment data;
$r_2$ denotes the second pull-request comment data;
$r_a$ denotes the a-th pull-request comment data, where a is the identification number of the pull-request comment data;
$r_A$ denotes the last pull-request comment data, where A is the total number of pull-request comment data.
Step 2: obtain the comment content of the Pull requests module from the pull-request comment data set PR, and match each comment content with a sentiment analysis value;
Any comment content, after processing by the sentiment matching process SST, outputs one sentiment analysis value SE, so that:
The comment content belonging to $r_1$ is processed by the sentiment matching process SST, and the output is recorded as the sentiment analysis value belonging to $r_1$;
The comment content belonging to $r_2$ is processed by the sentiment matching process SST, and the output is recorded as the sentiment analysis value belonging to $r_2$;
The comment content belonging to $r_a$ is processed by the sentiment matching process SST, and the output is recorded as the sentiment analysis value belonging to $r_a$;
The comment content belonging to $r_A$ is processed by the sentiment matching process SST, and the output is recorded as the sentiment analysis value belonging to $r_A$;
The sentiment analysis value SE is the single numerical value output by the sentiment matching process SST when the comment content of any one comment data is input to it.
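Step 2 maps each comment content to one number. The sketch below shows only that interface (text in, one value out). It does not implement SST, whose internals are not described in this excerpt; it substitutes a toy word-weight lookup. The lexicon, its weights, and the function name `sentiment_value` are all hypothetical.

```python
import re

# Hypothetical mini-lexicon (word -> weight). This is NOT the SST word list;
# it only stands in for a real sentiment matcher.
LEXICON = {"good": 1, "great": 2, "thanks": 1, "bad": -1, "broken": -2, "wrong": -1}


def sentiment_value(comment: str) -> int:
    """Return one number SE for one comment by summing lexicon weights."""
    words = re.findall(r"[a-z']+", comment.lower())
    return sum(LEXICON.get(word, 0) for word in words)


# One sentiment analysis value per comment content, in the same order as PR
contents = ["Great fix, thanks!", "This is broken and wrong.", "Rebased on master."]
se_values = [sentiment_value(c) for c in contents]
```

Any scorer with the same shape (one comment in, one numerical value out) could be swapped in without changing the later steps.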
Step 3: obtain the comment amount of the Pull requests module from the pull-request comment data set PR;
Since every pull-request comment data carries a comment time and a comment content, the pull-request comment data set $PR=\{r_1,r_2,\ldots,r_a,\ldots,r_A\}$ is partitioned according to the comment time, giving respectively the day-request comment data taken by day (B ∈ A), the week-request comment data taken by week (C ∈ A), or the month-request comment data taken by month (D ∈ A).
The number of day-request comment data partitioned out of $PR=\{r_1,r_2,\ldots,r_a,\ldots,r_A\}$ is recorded as the day comment amount $num_{day}$;
The number of week-request comment data partitioned out of $PR=\{r_1,r_2,\ldots,r_a,\ldots,r_A\}$ is recorded as the week comment amount $num_{week}$;
The number of month-request comment data partitioned out of $PR=\{r_1,r_2,\ldots,r_a,\ldots,r_A\}$ is recorded as the month comment amount $num_{month}$;
In the day-request comment data, the first, second, b-th, and B-th elements denote respectively the first, the second, any one, and the last comment data whose comment time is taken by day; b denotes the identification number of the comment data whose comment time is taken by day;
In the week-request comment data, the first, second, c-th, and C-th elements denote respectively the first, the second, any one, and the last comment data whose comment time is taken by week; c denotes the identification number of the comment data whose comment time is taken by week;
In the month-request comment data, the first, second, d-th, and D-th elements denote respectively the first, the second, any one, and the last comment data whose comment time is taken by month; d denotes the identification number of the comment data whose comment time is taken by month;
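The partition in Step 3 can be sketched as a group-by on the comment time. This is a minimal illustration: the key formats, the use of ISO week numbers for the week unit, and the names `period_key` and `partition` are assumptions, not details given in the source.

```python
from collections import defaultdict
from datetime import datetime


def period_key(when: datetime, unit: str) -> str:
    """Map a comment time to the label of the day, week, or month it falls in."""
    if unit == "day":
        return when.date().isoformat()      # e.g. '2018-01-15'
    if unit == "week":
        iso = when.isocalendar()            # (ISO year, ISO week, weekday)
        return f"{iso[0]}-W{iso[1]:02d}"    # e.g. '2018-W03'
    if unit == "month":
        return f"{when.year}-{when.month:02d}"
    raise ValueError(f"unknown unit: {unit}")


def partition(comments, unit):
    """Group (time, content) pairs by period; each bucket size is a comment amount."""
    buckets = defaultdict(list)
    for when, content in comments:
        buckets[period_key(when, unit)].append(content)
    return dict(buckets)


# Hypothetical comment data, for illustration only
pr = [
    (datetime(2018, 1, 15, 9, 30), "first"),
    (datetime(2018, 1, 15, 17, 0), "second"),
    (datetime(2018, 1, 20, 8, 0), "third"),
    (datetime(2018, 2, 3, 10, 0), "fourth"),
]
by_month = partition(pr, "month")
num_month = {period: len(items) for period, items in by_month.items()}
by_week = partition(pr, "week")
```

The same comment set yields different comment amounts depending on the chosen unit, which is why the method keeps the day, week, and month partitions separate.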
Step 4: obtain the degree of correlation of the time-partitioned comment data according to the sentiment degree of correlation IFC;
In the present invention, the sentiment degree of correlation IFC is computed from the products of deviations $(num_i-\overline{num})(SE_i-\overline{SE})$ summed over $i=1,\ldots,n$, where n denotes the number of summation elements, i denotes the summation index, $num_i$ denotes the comment amount at index i, $\overline{num}$ denotes the average comment amount, $SE_i$ denotes the sentiment analysis value at index i, and $\overline{SE}$ denotes the average sentiment analysis value. IFC is calculated by the product-moment method: it is based on the deviation of each of the two factors from its own average, and multiplies the two deviations to reflect the degree of correlation between the two factors. The larger the value of IFC, the higher the degree of correlation between the two factors; conversely, the smaller the value, the lower the degree of correlation. IFC > 0 indicates a positive correlation between the two factors, IFC < 0 indicates a negative correlation, and IFC = 0 indicates that there is no linear correlation between the two factors.
If the day-request comment data are processed with the sentiment degree of correlation IFC, the degree of correlation of the comment data belonging to the day is obtained and denoted $IFC_{day}$.
If the week-request comment data are processed with the sentiment degree of correlation IFC, the degree of correlation of the comment data belonging to the week is obtained and denoted $IFC_{week}$.
If the month-request comment data are processed with the sentiment degree of correlation IFC, the degree of correlation of the comment data belonging to the month is obtained and denoted $IFC_{month}$.
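Step 4 can be sketched as a covariance-style computation over per-period comment amounts and sentiment values. The normalization is an assumption: the excerpt describes the product-of-deviations sum but not its divisor, so the sketch divides by n (population covariance). The function names `ifc` and `interpret` and the sample numbers are hypothetical.

```python
def ifc(nums, ses):
    """Covariance-style sentiment degree of correlation.

    ASSUMPTION: the sum of products of deviations is divided by n;
    the source excerpt does not state the normalization.
    """
    if len(nums) != len(ses) or not nums:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(nums)
    num_bar = sum(nums) / n   # average comment amount
    se_bar = sum(ses) / n     # average sentiment analysis value
    return sum((x - num_bar) * (y - se_bar) for x, y in zip(nums, ses)) / n


def interpret(value, tol=1e-12):
    """Read the sign of IFC as described in Step 4."""
    if value > tol:
        return "positive correlation"
    if value < -tol:
        return "negative correlation"
    return "no linear correlation"


# Hypothetical per-period comment amounts and average sentiment values
nums = [5, 10, 8]
ses = [2.0, 0.5, 1.0]
result = interpret(ifc(nums, ses))
```

With these illustrative inputs the periods with more comments have lower sentiment values, so the sign comes out negative; a single period always yields zero because both deviations vanish.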
2. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: $num_{day}$ is assigned the value 500, i.e. B=500.
3. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: $num_{week}$ is assigned the value 1500, i.e. C=1500.
4. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: $num_{month}$ is assigned the value 5000, i.e. D=5000.
5. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: the method is applied to the Pull requests module of the GitHub platform to extract and analyze comment data. The comment data include the comment content and the comment amount. To extract the comments, the source code of the project of interest is cloned from GitHub to the local machine, a local code library is created, and the Pull requests module comment data are then extracted from the code in the code library.
6. The method for extracting the covariance correlation of project progress based on emotional factors according to claim 1, characterized in that: software development progress is improved by means of the quantified sentiment degree of correlation IFC.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810728956.1A CN108958710B (en) | 2018-07-05 | 2018-07-05 | Method for extracting covariance correlation of project progress based on emotional factors |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108958710A true CN108958710A (en) | 2018-12-07 |
CN108958710B CN108958710B (en) | 2021-07-16 |
Family
ID=64485789
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810728956.1A Active CN108958710B (en) | 2018-07-05 | 2018-07-05 | Method for extracting covariance correlation of project progress based on emotional factors |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108958710B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101782898A (en) * | 2010-03-25 | 2010-07-21 | 中国科学院计算技术研究所 | Method for analyzing tendentiousness of affective words |
CN101901230A (en) * | 2009-05-31 | 2010-12-01 | 国际商业机器公司 | Information retrieval method, user comment processing method and system thereof |
CN103064971A (en) * | 2013-01-05 | 2013-04-24 | 南京邮电大学 | Scoring and Chinese sentiment analysis based review spam detection method |
US20160092793A1 (en) * | 2014-09-26 | 2016-03-31 | Thomson Reuters Global Resources | Pharmacovigilance systems and methods utilizing cascading filters and machine learning models to classify and discern pharmaceutical trends from social media posts |
US20160277424A1 (en) * | 2015-03-20 | 2016-09-22 | Ashif Mawji | Systems and Methods for Calculating a Trust Score |
Non-Patent Citations (1)
Title |
---|
杨波: ""Sentiments Analysis in GitHub Repositories:"", 《2017 24TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS》 * |
Also Published As
Publication number | Publication date |
---|---|
CN108958710B (en) | 2021-07-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||