Detailed Description
To make the objects, technical solutions and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the application and are not intended to limit it.
The data analysis method provided by the embodiments of the present application can be applied in the application environment shown in Fig. 1, in which the terminal 102 of each user communicates with the server 104 over a network. While the terminal 102 of each user operates a web page, the server 104 may: obtain the attribute tags corresponding to each user in an original user sample set, where the original user sample set includes a target user sample set, the attribute tags are generated while the user operates the web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag; perform first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and count a first quantity of users included in each category obtained by the first classification processing; perform second classification processing on the users in the target user sample set according to the third attribute tag, and count a second quantity of users included in each category obtained by the second classification processing; obtain first analysis data according to the counted first quantity, and obtain second analysis data according to the obtained second quantity and the number of users included in the original user sample set; and generate a data analysis result according to the first analysis data and the second analysis data. The terminal 102 may be, but is not limited to, a personal computer, a laptop, a smartphone, a tablet computer or a portable wearable device, and the server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a data analysis method is provided. The method is described by taking its application to the terminal in Fig. 1 as an example, and includes the following steps:
Step 202: obtain the attribute tags corresponding to each user in the original user sample set, where the original user sample set includes a target user sample set, the attribute tags are generated while the user operates the web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag.
The original user sample set refers to the set of all user samples on which data analysis is performed. An attribute tag is a label indicating an attribute of the user while operating the web page, for example "participated", "not participated", "long dwell time", "short dwell time", "shared", "opened", and so on.
Step 204: perform first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and count the first quantity of users included in each category obtained by the first classification processing.
The target user sample set refers to the set of user samples used for classification processing. Specifically, raw data of the second attribute tag corresponding to each user in the original user sample set is obtained, target users are obtained from the original user sample set according to the raw data, and the target user sample set is generated from the obtained target users.
Specifically, a user in the target user sample set may have both a first attribute tag and a second attribute tag. The users in the target user sample set are classified according to the first attribute tag and the second attribute tag, and the number of users in each category is counted.
For example, the first attribute tag may be "participated" or "not participated", and the second attribute tag may be "long dwell time" or "short dwell time". Performing the first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag yields four categories: "participated, long dwell time", "participated, short dwell time", "not participated, long dwell time" and "not participated, short dwell time".
When a user operating the web page clicks a button in the web page, such as "top up" or "pay", the first attribute tag of the user is "participated"; when the user does not click any button in the web page, the first attribute tag of the user is "not participated". Over the whole process from starting to operate the web page to exiting it, when the user's dwell time on the web page exceeds a duration threshold, the second attribute tag of the user is "long dwell time"; when the dwell time is less than or equal to the duration threshold, the second attribute tag of the user is "short dwell time".
"Participated, long dwell time" indicates that the user clicked a button in the web page and stayed on the web page for a long time. "Participated, short dwell time" indicates that the user clicked a button in the web page and stayed on the web page for a short time. "Not participated, long dwell time" indicates that the user did not click any button in the web page and stayed on the web page for a long time. "Not participated, short dwell time" indicates that the user did not click any button in the web page and stayed on the web page for a short time.
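The tagging and counting described above can be sketched as follows. This is a minimal illustration only: the function names, the record layout (clicked flag, dwell seconds) and the 30-second duration threshold are assumptions made for the example and are not specified by the application.

```python
from collections import Counter

# Hypothetical duration threshold; the application does not fix a value.
DURATION_THRESHOLD_S = 30.0


def first_tag(clicked_any_button: bool) -> str:
    """First attribute tag: did the user click any button in the web page?"""
    return "participated" if clicked_any_button else "not participated"


def second_tag(dwell_seconds: float, threshold: float = DURATION_THRESHOLD_S) -> str:
    """Second attribute tag: long only if the dwell time exceeds the threshold."""
    return "long dwell time" if dwell_seconds > threshold else "short dwell time"


def first_classification(users):
    """Count users per (first tag, second tag) category.

    `users` is an iterable of (clicked_any_button, dwell_seconds) pairs.
    """
    return Counter((first_tag(c), second_tag(d)) for c, d in users)


# Illustrative sample, not data from the application.
sample = [(True, 45.0), (True, 12.0), (False, 30.0), (False, 90.0), (False, 5.0)]
counts = first_classification(sample)
```

A dwell time exactly equal to the threshold is tagged "short dwell time", matching the "less than or equal to" rule stated above.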
Step 206: perform second classification processing on the users in the target user sample set according to the third attribute tag, and count the second quantity of users included in each category obtained by the second classification processing.
For example, the third attribute tag may be "opened", "completed" or "recommended". According to the third attribute tag, the users in the target user sample set can be divided into three categories, "opened", "completed" and "recommended", and the number of users included in each category is counted.
"Opened" indicates that the user opened the web page. "Completed" indicates that the user clicked a button in the web page and completed the steps of the activity in the web page. "Recommended" indicates that the user shared the web page.
Step 208: obtain the first analysis data according to the counted first quantity, and obtain the second analysis data according to the obtained second quantity and the number of users included in the original user sample set.
The first analysis data refers to data obtained by analyzing the first quantity. The second analysis data refers to data obtained by analyzing the second quantity and the number of users included in the original user sample set.
For example, after the first classification processing is performed on the users in the target user sample set, the first quantities of the four categories "participated, long dwell time", "participated, short dwell time", "not participated, long dwell time" and "not participated, short dwell time" are obtained, and the first analysis data, namely clarity and attraction degree, can be obtained from these four first quantities.
After the second classification processing is performed on the users in the target user sample set, the second quantities of the three categories "opened", "completed" and "recommended" are obtained, and the second analysis data, namely visibility, completion degree and recommendation degree, can be obtained from these three second quantities and the number of users included in the original user sample set.
Step 210: generate the data analysis result according to the first analysis data and the second analysis data.
For example, the data analysis result may be "visibility is too low, possibly because the link entry is not prominent and users hardly notice it", or "attraction degree is too low, possibly because the activity bonus is too small to attract users".
Further, improvement measures can be derived from the data analysis result, such as "since visibility is too low, advertisements can be placed appropriately to give users an entry point" or "since attraction degree is too low, the bonus amount or the number of prizes can be increased appropriately to attract more users".
In the above data analysis method, the attribute tags corresponding to each user in the original user sample set are obtained, where the original user sample set includes a target user sample set, the attribute tags are generated while the user operates the web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag; first classification processing is performed on the users in the target user sample set according to the first attribute tag and the second attribute tag, and the first quantity of users included in each resulting category is counted; second classification processing is performed on the users in the target user sample set according to the third attribute tag, and the second quantity of users included in each resulting category is counted; the first analysis data is obtained according to the counted first quantity, and the second analysis data is obtained according to the obtained second quantity and the number of users included in the original user sample set; and the data analysis result is generated according to the first analysis data and the second analysis data. By classifying users according to their attribute tags, counting the number of users in each category, deriving the first analysis data and the second analysis data from those counts, and obtaining the data analysis result from the two, the accuracy of data analysis can be improved.
In one embodiment, after obtaining the attribute tags corresponding to each user in the original user sample set, the method further includes: obtaining raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data; and obtaining the target users other than the invalid users and generating the target user sample set.
The second attribute tag may be "long dwell time" or "short dwell time". An invalid user is a user whose data is invalid, that is, a user of no value to data analysis. For example, for a user whose dwell time while operating the web page is less than 1 s, the visit may be considered the result of a mis-operation or a malicious operation, which indicates that the data generated by the user has no analytical value, so the user is an invalid user.
The original user sample set contains some data that is of no value to data analysis; if every user in the original user sample set were analyzed, the accuracy of data analysis would be reduced. For example, some users stay on the web page too long because of network problems and never manage to exit, or identical data appears repeatedly within a short time because of the server. Therefore, the data of no value to data analysis needs to be removed: invalid users are obtained from the original user sample set, the target users are then obtained by excluding the invalid users, and the target user sample set is generated from the target users.
In this embodiment, invalid users are obtained from the original user sample set according to the raw data of the second attribute tag corresponding to each user in the original user sample set, and the data of the target users other than the invalid users is then obtained and analyzed, which can improve the accuracy of data analysis.
In one embodiment, the raw data includes first raw data and second raw data, where the first raw data indicates the moment at which the user starts operating the web page, and the second raw data indicates the duration for which the user operates the web page.
Obtaining the raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data, includes at least one of the following manners:
Step 302: obtain the second raw data of the second attribute tag corresponding to each user in the original user sample set, and take the users whose second raw data is less than a first data threshold as invalid users.
The second raw data may be, for example, "10 s" or "2 s". The first data threshold may be preset or set in real time, and is not limited thereto.
It can be understood that, while operating the web page, a user may start operating the web page and exit it within a very short time because of network problems or a mis-operation. The data of such a user is invalid data, and the user is an invalid user.
For example, the first data threshold may be set to 1 s. The second raw data of the second attribute tag corresponding to each user in the original user sample set is obtained. When the second raw data is less than 1 s, the second raw data is of no value to data analysis, so the second raw data less than 1 s is removed and the user corresponding to that second raw data is taken as an invalid user.
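Step 302 can be sketched as a simple threshold filter. The sketch below assumes, for illustration only, that the second raw data is available as a mapping from a user identifier to a duration in seconds; the function name and the identifiers are hypothetical, and the 1 s threshold is the example value given above.

```python
# Example value from the text; the threshold may also be set in real time.
FIRST_DATA_THRESHOLD_S = 1.0


def split_by_duration(durations, threshold=FIRST_DATA_THRESHOLD_S):
    """Split users into (invalid, remaining) by operation duration.

    `durations` maps a user id to the second raw data (seconds on the page).
    Users strictly below the threshold are treated as invalid users.
    """
    invalid = {user for user, seconds in durations.items() if seconds < threshold}
    remaining = set(durations) - invalid
    return invalid, remaining


# Illustrative data, not taken from the application.
durations = {"u1": 10.0, "u2": 0.4, "u3": 2.0, "u4": 1.0}
invalid_users, remaining_users = split_by_duration(durations)
```

The remaining users are the candidates from which the target user sample set is generated.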
Step 304: obtain the first raw data and the second raw data of the second attribute tag corresponding to each user in the original user sample set, cluster the users in the original user sample set according to the first raw data, count the number of users in each category obtained by the clustering, and take a category whose number of users is greater than a quantity threshold as a target category; when the second raw data corresponding to all users in the target category is within a data range, take the users in the target category as invalid users.
The first raw data may be, for example, "15:54 on November 14, 2018".
Specifically, when the users in the original user sample set are clustered according to the first raw data, users who start operating the web page within the same second may be clustered into the same category; alternatively, users whose start moment and exit moment both fall within a target period may be clustered into the same category. For example, for the period from 15:54 to 15:55 on November 14, 2018, users who both start operating the web page and exit it within that period are clustered into one category.
The number of users in each category is then counted, and when the number of users in a category is greater than the quantity threshold, the category is taken as a target category. The quantity threshold may be preset or set in real time, and is not limited thereto.
After the target category is obtained, when the second raw data corresponding to all users in the target category, that is, the durations for which those users operated the web page, is within the data range, the users in the target category are taken as invalid users.
It can be understood that, in some cases, an electronic device may continuously change its IP (Internet Protocol) address and click a web page link frequently within a short time in order to inflate the click count and read count of the link. However, click counts and read counts inflated within a short time by changing IP addresses do not reflect a real process of users operating the web page, that is, such data is invalid data. If the invalid data were analyzed together with the data of the target users, the accuracy of data analysis would be reduced.
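Step 304 can be sketched as grouping users by the second in which they started operating the web page and flagging unusually large groups whose durations all fall inside a given range. This is only one possible reading: the record layout, the function name, the quantity threshold of 2 and the data range of 0.5 s to 2.0 s are assumptions chosen for the example, since the application leaves these values open.

```python
from collections import defaultdict
from datetime import datetime


def burst_invalid_users(records, quantity_threshold, duration_range):
    """Return the ids of users in suspicious same-second clusters.

    `records` is a list of (user_id, start_time, duration_seconds) tuples.
    A cluster is a target category if it has more members than
    `quantity_threshold`; its members are invalid users only if every
    duration in the cluster lies within `duration_range` (inclusive).
    """
    low, high = duration_range
    clusters = defaultdict(list)
    for user_id, start, duration in records:
        # Cluster key: the start moment truncated to the second.
        clusters[start.replace(microsecond=0)].append((user_id, duration))

    invalid = set()
    for members in clusters.values():
        is_target = len(members) > quantity_threshold
        if is_target and all(low <= d <= high for _, d in members):
            invalid.update(user_id for user_id, _ in members)
    return invalid


# Illustrative records, not data from the application.
t = datetime(2018, 11, 14, 15, 54, 0)
records = [
    ("a", t, 1.0),
    ("b", t.replace(microsecond=300000), 1.2),
    ("c", t.replace(microsecond=900000), 1.1),
    ("d", datetime(2018, 11, 14, 15, 58, 7), 40.0),
]
flagged = burst_invalid_users(records, quantity_threshold=2, duration_range=(0.5, 2.0))
```

Clustering by a target period instead of by second would only change how the cluster key is computed.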
In another embodiment, the users in the original user sample set are divided into two categories according to the first attribute tags "participated" and "not participated". In the category "participated", the maximum value of the second raw data of the second attribute tag is obtained and taken as the longest duration. In the category "not participated", when the second raw data of a user is greater than the longest duration, the user is taken as an invalid user.
It can be understood that, while operating the web page, a user may remain on the web page without exiting because of other matters, without actually operating the web page. That is, if the user did not participate and the user's dwell time is greater than the longest duration in the participated category, the user is an invalid user.
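The longest-duration rule of this embodiment can be sketched as below. The tuple layout (user id, participated flag, duration in seconds) and the function name are assumptions for illustration; the sketch also assumes that when nobody participated there is no longest duration to compare against, so no user is flagged.

```python
def overlong_nonparticipants(users):
    """Flag non-participating users who stayed longer than any participant.

    `users` is a list of (user_id, participated, duration_seconds) tuples.
    """
    participant_durations = [d for _, participated, d in users if participated]
    if not participant_durations:
        # Assumption: without participants the rule cannot be applied.
        return set()
    longest = max(participant_durations)
    return {uid for uid, participated, d in users if not participated and d > longest}


# Illustrative data, not taken from the application.
users = [
    ("p1", True, 120.0),
    ("p2", True, 300.0),
    ("n1", False, 45.0),
    ("n2", False, 3600.0),
]
invalid = overlong_nonparticipants(users)
```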
In this embodiment, invalid users are obtained from the original user sample set and the data of the invalid users is removed, which can improve the accuracy of data analysis.
In one embodiment, performing the first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag includes:
Step 402: perform classification processing on the users in the target user sample set according to the first attribute tag to obtain a first classification processing result.
Step 404: perform classification processing on the first classification processing result according to the second attribute tag to obtain a second classification processing result.
For example, the first attribute tag may be "participated" or "not participated"; classifying the users in the target user sample set then yields a first classification processing result with two categories, "participated users" and "not participated users". The second attribute tag may be "long dwell time" or "short dwell time". The participated users and the not participated users in the first classification processing result are then each classified again, yielding a second classification processing result with four categories: "participated, long dwell time", "participated, short dwell time", "not participated, long dwell time" and "not participated, short dwell time".
Counting the first quantity of users included in each category obtained by the first classification processing includes:
Step 406: count the first quantity of users included in each category of the second classification processing result.
For example, the second classification processing result may be the four categories "participated, long dwell time", "participated, short dwell time", "not participated, long dwell time" and "not participated, short dwell time", and the number of users in each category of the second classification processing result is counted.
In this embodiment, the users in the target user sample set are first classified according to the first attribute tag to obtain the first classification processing result, the first classification processing result is then classified according to the second attribute tag to obtain the second classification processing result, and the first quantity is obtained from the second classification processing result, so that data analysis can be performed more accurately.
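Steps 402 to 406 amount to grouping by the first attribute tag, grouping each group again by the second attribute tag, and counting the leaves. A minimal sketch follows, assuming each user is represented as a dictionary with hypothetical keys "first_tag" and "second_tag"; neither the keys nor the helper names come from the application.

```python
from collections import defaultdict


def classify(users, key):
    """Group user records by the value stored under `key`."""
    groups = defaultdict(list)
    for user in users:
        groups[user[key]].append(user)
    return dict(groups)


def two_stage_counts(users):
    """Return {(first tag, second tag): first quantity}."""
    first_result = classify(users, "first_tag")                  # step 402
    counts = {}
    for first_value, members in first_result.items():
        second_result = classify(members, "second_tag")          # step 404
        for second_value, sub_members in second_result.items():
            counts[(first_value, second_value)] = len(sub_members)  # step 406
    return counts


# Illustrative records, not data from the application.
users = [
    {"id": 1, "first_tag": "participated", "second_tag": "long dwell time"},
    {"id": 2, "first_tag": "participated", "second_tag": "short dwell time"},
    {"id": 3, "first_tag": "not participated", "second_tag": "short dwell time"},
    {"id": 4, "first_tag": "not participated", "second_tag": "short dwell time"},
]
first_quantities = two_stage_counts(users)
```

Categories with no users simply do not appear in the result; a caller that needs all four categories can read missing ones with `first_quantities.get(key, 0)`.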
In one embodiment, obtaining the first analysis data according to the counted first quantity includes: calculating the first analysis data according to a first calculation formula from the first quantity of users included in each category.
As shown in Fig. 5, the second classification processing result may be four categories: category 501 is "participated, long dwell time", category 502 is "participated, short dwell time", category 503 is "not participated, long dwell time" and category 504 is "not participated, short dwell time". Correspondingly, the first analysis data may be clarity and attraction degree, and the first calculation formula may include a calculation formula for clarity and a calculation formula for attraction degree.
Specifically, "participated" indicates that the user clicked a button in the web page and took part in the activity in the web page, which shows that the user was attracted by the content of the web page. "Not participated" indicates that the user did not click any button in the web page, which shows that the user was not attracted by the content of the web page. "Short dwell time" indicates that the user stayed on the web page for a short time, which shows that the content of the web page is relatively clear. "Long dwell time" indicates that the user stayed on the web page for a long time, which shows that the content of the web page is unclear.
Further, as shown in Fig. 5, the users in category 504, "not participated, short dwell time", can be divided into "not participated, short dwell time, understood" and "not participated, short dwell time, not understood". "Not participated, short dwell time, understood" refers to users who did not participate and exited the web page after understanding its content within a short time. "Not participated, short dwell time, not understood" refers to users who did not participate and exited the web page within a short time without understanding its content.
It can be understood that the proportion of "not participated, short dwell time, understood" users among the "not participated, short dwell time" users is taken to be the same as the proportion of "participated, short dwell time" users among the "participated" users. That is, the proportion of users who understand the content of the web page is the same in the participated category as in the category of users who did not participate and had a short dwell time.
Specifically, users who participated and had a short dwell time while operating the web page, that is, users in the category "participated, short dwell time", are users to whom the content of the web page was clear. In addition, the users to whom the content of the web page was clear also include users who did not participate, had a short dwell time and understood the content, that is, users in the category "not participated, short dwell time, understood".
Therefore, clarity can be obtained by the following calculation formula: clarity = (j + i*j/(J+j)) / (i + I + j + J), where i is the number of users in the category "not participated, short dwell time", I is the number of users in the category "not participated, long dwell time", j is the number of users in the category "participated, short dwell time", J is the number of users in the category "participated, long dwell time", and i*j/(J+j) is the number of users in the category "not participated, short dwell time, understood".
As shown in Fig. 6, the number of users in the category "not participated, short dwell time" is i = 9188, the number of users in the category "not participated, long dwell time" is I = 2162, the number of users in the category "participated, short dwell time" is j = 1173, and the number of users in the category "participated, long dwell time" is J = 667. The number of users in the category "not participated, short dwell time, understood" is then i*j/(J+j) = 9188*1173/(667+1173) = 5857.35, and clarity = (j + i*j/(J+j)) / (i + I + j + J) = (1173 + 5857.35) / (9188 + 2162 + 1173 + 667) = 53.30%.
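The clarity calculation above can be reproduced with a few lines of code. The function and parameter names below are illustrative only; the formula and the four counts are the ones given in the text, and the sketch assumes at least one participating user so that the division is defined.

```python
def clarity(short_no, long_no, short_yes, long_yes):
    """Clarity = (j + i*j/(J+j)) / (i + I + j + J).

    short_no  = i: not participated, short dwell time
    long_no   = I: not participated, long dwell time
    short_yes = j: participated, short dwell time
    long_yes  = J: participated, long dwell time
    """
    # Estimated "not participated, short dwell time, understood" users.
    understood = short_no * short_yes / (long_yes + short_yes)
    total = short_no + long_no + short_yes + long_yes
    return (short_yes + understood) / total


# Counts from Fig. 6 as quoted in the text.
value = clarity(short_no=9188, long_no=2162, short_yes=1173, long_yes=667)
print(f"clarity = {value:.2%}")  # prints "clarity = 53.30%"
```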
Further, as shown in Fig. 5, the users in category 503, "not participated, long dwell time", can be divided into "not participated, long dwell time, attracted" and "not participated, long dwell time, not attracted". "Not participated, long dwell time, attracted" refers to users who did not participate, stayed for a long time while operating the web page and were attracted. "Not participated, long dwell time, not attracted" refers to users who did not participate, stayed for a long time while operating the web page and were not attracted.
It can be understood that the proportion of "not participated, long dwell time, attracted" users among the "not participated, long dwell time" users is taken to be the same as the proportion of "participated, short dwell time" users among the "short dwell time" users.
Specifically, users in the category "participated" are users attracted by the content of the web page while operating it. In addition, the users attracted by the content of the web page also include users who did not participate, stayed for a long time and were attracted, that is, users in the category "not participated, long dwell time, attracted".
Therefore, attraction degree can be obtained by the following calculation formula: attraction degree = (j + J + I*j/(i+j)) / (i + I + j + J), where I*j/(i+j) is the number of users in the category "not participated, long dwell time, attracted".
As shown in Fig. 6, the number of users in the category "not participated, short dwell time" is i = 9188, the number of users in the category "not participated, long dwell time" is I = 2162, the number of users in the category "participated, short dwell time" is j = 1173, and the number of users in the category "participated, long dwell time" is J = 667. The number of users in the category "not participated, long dwell time, attracted" is then I*j/(i+j) = 2162*1173/(9188+1173) ≈ 244.77, and attraction degree = (j + J + I*j/(i+j)) / (i + I + j + J) = (1173 + 667 + 244.77) / (9188 + 2162 + 1173 + 667) = 15.81%.
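The attraction degree calculation can be checked in the same way. The function and parameter names are illustrative assumptions; the formula and counts are those stated in the text, and the sketch assumes at least one short-dwell-time user so that the division is defined.

```python
def attraction_degree(short_no, long_no, short_yes, long_yes):
    """Attraction degree = (j + J + I*j/(i+j)) / (i + I + j + J).

    short_no  = i: not participated, short dwell time
    long_no   = I: not participated, long dwell time
    short_yes = j: participated, short dwell time
    long_yes  = J: participated, long dwell time
    """
    # Estimated "not participated, long dwell time, attracted" users.
    attracted = long_no * short_yes / (short_no + short_yes)
    total = short_no + long_no + short_yes + long_yes
    return (short_yes + long_yes + attracted) / total


# Counts from Fig. 6 as quoted in the text.
value = attraction_degree(short_no=9188, long_no=2162, short_yes=1173, long_yes=667)
print(f"attraction degree = {value:.2%}")  # prints "attraction degree = 15.81%"
```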
In this embodiment, the first analysis data, namely clarity and attraction degree, is calculated according to the first calculation formula from the first quantity of users included in each category, so that the first analysis data can be obtained more accurately.
In one embodiment, obtaining the second analysis data according to the obtained second quantity and the number of users included in the original user sample set includes: calculating the second analysis data according to a second calculation formula from the obtained second quantity and the number of users in the original user sample set.
For example, the second quantity may be the number of users included in each of the three categories "opened", "completed" and "recommended". Correspondingly, the second analysis data may be visibility, completion degree and recommendation degree. The number of users in the category "opened" is the number of users in the target user sample set, that is, the number of users reached by the web page link.
Visibility can be calculated by the following formula: visibility = number of users who opened / number of users in the original user sample set. Completion degree can be calculated by the following formula: completion degree = number of users who completed / number of users who opened. Recommendation degree can be calculated by the following formula: recommendation degree = number of users who shared / number of users who opened.
As shown in Fig. 6, the number of users in the original user sample set is 150000, the number of users who opened, that is, the number of users in the target user sample set, is 13190, the number of users who completed is 1000, and the number of users who shared is 65. Then visibility = number of users who opened / number of users in the original user sample set = 13190/150000 = 8.79%, completion degree = number of users who completed / number of users who opened = 1000/13190 = 7.58%, and recommendation degree = number of users who shared / number of users who opened = 65/13190 = 0.49%.
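The three ratios can be computed together, as in the minimal sketch below. The function name and dictionary keys are illustrative assumptions; the formulas and the counts are those given in the text.

```python
def second_analysis_data(original_total, opened, completed, shared):
    """Compute visibility, completion degree and recommendation degree."""
    return {
        "visibility": opened / original_total,
        "completion": completed / opened,
        "recommendation": shared / opened,
    }


# Counts from Fig. 6 as quoted in the text.
metrics = second_analysis_data(original_total=150000, opened=13190, completed=1000, shared=65)
for name, value in metrics.items():
    print(f"{name} = {value:.2%}")
```

With the quoted counts the three values format as 8.79%, 7.58% and 0.49%.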
In this embodiment, the second analysis data, namely visibility, completion degree and recommendation degree, is calculated according to the second calculation formula from the second quantity and the number of users in the original user sample set, so that the second analysis data can be obtained more accurately.
In one embodiment, the data analysis result includes a first data analysis result and a second data analysis result, and generating the data analysis result according to the first analysis data and the second analysis data includes: obtaining a first data threshold corresponding to the first analysis data, and obtaining a second data threshold corresponding to the second analysis data; generating the first data analysis result when the first analysis data is less than the first data threshold; and generating the second data analysis result when the second analysis data is less than the second data threshold.
The first data threshold and the second data threshold may be preset or set in real time, and are not limited thereto. The first data threshold and the second data threshold may be obtained from big data, or according to the professional standards of the field.
As shown in Fig. 6, the first analysis data may include clarity and attraction degree, where clarity is 53.30% and attraction degree is 15.81%. The first data threshold includes a clarity threshold and an attraction degree threshold; the clarity threshold may be 50% and the attraction degree threshold may be 20%. The second analysis data may include visibility, completion degree and recommendation degree, where visibility is 8.79%, completion degree is 7.58% and recommendation degree is 0.49%. The second data threshold includes a visibility threshold, a completion degree threshold and a recommendation degree threshold; the visibility threshold may be 30%, the completion degree threshold 5% and the recommendation degree threshold 0.4%.
Then clarity (53.30%) is greater than the clarity threshold (50%), attraction degree (15.81%) is less than the attraction degree threshold (20%), visibility (8.79%) is less than the visibility threshold (30%), completion degree (7.58%) is greater than the completion degree threshold (5%), and recommendation degree (0.49%) is greater than the recommendation degree threshold (0.4%).
Therefore, the first data analysis result may be "attraction degree is too low, possibly because the activity bonus is too small to attract users". The second data analysis result may be "visibility is too low, possibly because the link entry is not prominent and users hardly notice it".
In this embodiment, the first analysis data is compared with the first data threshold and, at the same time, the second analysis data is compared with the second data threshold, so that the first data analysis result and the second data analysis result are generated respectively and the result of the data analysis can be obtained more accurately.
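The threshold comparison in the Fig. 6 example can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function name find_low_metrics and the dictionary keys are hypothetical, and only the percentages are taken from the example above.

```python
from typing import Dict, List


def find_low_metrics(data: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the names of the metrics whose value is below its threshold."""
    return [name for name, value in data.items() if value < thresholds[name]]


# Values and thresholds from the Fig. 6 example, expressed in percent.
first_analysis_data = {"clarity": 53.30, "attraction": 15.81}
first_thresholds = {"clarity": 50.0, "attraction": 20.0}
second_analysis_data = {"visibility": 8.79, "completeness": 7.58, "recommendation": 0.49}
second_thresholds = {"visibility": 30.0, "completeness": 5.0, "recommendation": 0.4}

# Each list names the metrics for which an analysis result would be generated.
first_result = find_low_metrics(first_analysis_data, first_thresholds)
second_result = find_low_metrics(second_analysis_data, second_thresholds)

print(first_result)   # ['attraction']
print(second_result)  # ['visibility']
```

Only the attraction degree and the visibility fall below their thresholds, which matches the two data analysis results given in the example.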
It should be understood that although the steps in the flowcharts of Figs. 2-4 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are executed, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 2-4 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 7, a data analysis apparatus is provided, including: an attribute tag acquisition module 702, a first classification processing module 704, a second classification processing module 706, an analysis data acquisition module 708 and a data analysis result generation module 710, wherein:
The attribute tag acquisition module 702 is configured to acquire the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag.
The first classification processing module 704 is configured to perform first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and to count a first quantity of users included in each category obtained by the first classification processing.
The second classification processing module 706 is configured to perform second classification processing on the users in the target user sample set according to the third attribute tag, and to count a second quantity of users included in each category obtained by the second classification processing.
The analysis data acquisition module 708 is configured to obtain first analysis data according to the counted first quantity, and to obtain second analysis data according to the obtained second quantity and the number of users included in the original user sample set.
The data analysis result generation module 710 is configured to generate a data analysis result according to the first analysis data and the second analysis data.
The above data analysis apparatus acquires the attribute tags corresponding to each user in the original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag; performs first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and counts the first quantity of users included in each category obtained by the first classification processing; performs second classification processing on the users in the target user sample set according to the third attribute tag, and counts the second quantity of users included in each category obtained by the second classification processing; obtains first analysis data according to the counted first quantity, and obtains second analysis data according to the obtained second quantity and the number of users included in the original user sample set; and generates a data analysis result according to the first analysis data and the second analysis data. By classifying the attribute tags corresponding to each user in the original user sample set and counting the number of users in each category, the first analysis data and the second analysis data are obtained, and the data analysis result is obtained from them, which can improve the accuracy of the data analysis.
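The flow through the classification and analysis data modules can be sketched as follows. This is an illustration under stated assumptions, not the apparatus itself: the User class, the tag values and the function names are hypothetical, and because the first and second calculation formulas are not restated in this passage, simple ratios stand in for them (per-category count over the target set size for the first analysis data, per-category count over the original set size for the second).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Hashable, List


@dataclass(frozen=True)
class User:
    user_id: str
    first_tag: str   # hypothetical tag values, for illustration only
    second_tag: str
    third_tag: str


def count_first_classification(target_users: List[User]) -> Counter:
    """First quantities: users per (first tag, second tag) category."""
    return Counter((u.first_tag, u.second_tag) for u in target_users)


def count_second_classification(target_users: List[User]) -> Counter:
    """Second quantities: users per third-tag category."""
    return Counter(u.third_tag for u in target_users)


def to_ratios(counts: Counter, denominator: int) -> Dict[Hashable, float]:
    """Assumed stand-in for the calculation formulas: count / denominator."""
    return {category: n / denominator for category, n in counts.items()}


original_users = [
    User("u1", "viewed", "joined", "shared"),
    User("u2", "viewed", "joined", "not_shared"),
    User("u3", "viewed", "not_joined", "not_shared"),
    User("u4", "not_viewed", "not_joined", "not_shared"),
]
# Assume u4 was excluded, so the target set is a subset of the original set.
target_users = original_users[:3]

first_counts = count_first_classification(target_users)
second_counts = count_second_classification(target_users)
first_analysis = to_ratios(first_counts, len(target_users))
second_analysis = to_ratios(second_counts, len(original_users))

print(first_counts[("viewed", "joined")])  # 2
print(second_analysis["shared"])           # 0.25
```

Note that the second analysis data is normalized by the size of the original user sample set, not the target set, as the description of module 708 requires.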
In one embodiment, as shown in Fig. 8, a data analysis apparatus is provided, including: an attribute tag acquisition module 802, a target user sample set generation module 804, a first classification processing module 806, a second classification processing module 808, an analysis data acquisition module 810 and a data analysis result generation module 812, wherein:
The attribute tag acquisition module 802 is configured to acquire the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag.
The target user sample set generation module 804 is configured to acquire the raw data of the second attribute tag corresponding to each user in the original user sample set, to obtain invalid users from the original user sample set according to the raw data, and to obtain the target users other than the invalid users to generate the target user sample set.
The first classification processing module 806 is configured to perform first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and to count a first quantity of users included in each category obtained by the first classification processing.
The second classification processing module 808 is configured to perform second classification processing on the users in the target user sample set according to the third attribute tag, and to count a second quantity of users included in each category obtained by the second classification processing.
The analysis data acquisition module 810 is configured to obtain first analysis data according to the counted first quantity, and to obtain second analysis data according to the obtained second quantity and the number of users included in the original user sample set.
The data analysis result generation module 812 is configured to generate a data analysis result according to the first analysis data and the second analysis data.
In this embodiment, the target user sample set is obtained from the original user sample set, the attribute tags corresponding to each user in the target user sample set are classified, and the number of users in each category is counted to obtain the first analysis data and the second analysis data. Obtaining the data analysis result from the first analysis data and the second analysis data can further improve the accuracy of the data analysis.
In one embodiment, the target user sample set generation module 804 is further configured to acquire the raw data of the second attribute tag corresponding to each user in the original user sample set and to obtain invalid users from the original user sample set according to the raw data in at least one of the following manners: acquiring second raw data of the second attribute tag corresponding to each user in the original user sample set, and taking users whose second raw data is less than a raw data threshold as invalid users; or acquiring first raw data and second raw data of the second attribute tag corresponding to each user in the original user sample set, clustering the users in the original user sample set according to the first raw data, counting the number of users in each category obtained by the clustering, taking a category whose number of users is greater than a quantity threshold as a target category, and, when the second raw data corresponding to all users in the target category is within a data range, taking the users in the target category as invalid users.
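The two ways of identifying invalid users can be sketched as follows. This is a hedged illustration only: the Record fields, the sample values and the function names are hypothetical, this passage does not say what the first and second raw data actually are, and grouping users by identical first raw data is used here as a simple stand-in for whatever clustering method is actually applied.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Record(NamedTuple):
    user_id: str
    first_raw: str     # hypothetical example: a source address
    second_raw: float  # hypothetical example: a duration in seconds


def invalid_by_threshold(records: List[Record], raw_threshold: float) -> Set[str]:
    """Manner 1: users whose second raw data is below the raw data threshold."""
    return {r.user_id for r in records if r.second_raw < raw_threshold}


def invalid_by_cluster(records: List[Record], quantity_threshold: int,
                       data_range: Tuple[float, float]) -> Set[str]:
    """Manner 2: cluster on first raw data, then check the second raw data."""
    low, high = data_range
    clusters: Dict[str, List[Record]] = defaultdict(list)
    for r in records:
        clusters[r.first_raw].append(r)  # assumption: exact-match grouping as clustering
    invalid: Set[str] = set()
    for members in clusters.values():
        is_target_category = len(members) > quantity_threshold
        if is_target_category and all(low <= m.second_raw <= high for m in members):
            invalid.update(m.user_id for m in members)
    return invalid


records = [
    Record("u1", "10.0.0.1", 0.40),
    Record("u2", "10.0.0.1", 0.50),
    Record("u3", "10.0.0.1", 0.45),
    Record("u4", "10.0.0.2", 12.0),
    Record("u5", "10.0.0.3", 0.20),
]

invalid = invalid_by_threshold(records, raw_threshold=0.3)
invalid |= invalid_by_cluster(records, quantity_threshold=2, data_range=(0.0, 1.0))

# The target user sample set is what remains after removing invalid users.
target_users = [r.user_id for r in records if r.user_id not in invalid]
print(sorted(invalid))  # ['u1', 'u2', 'u3', 'u5']
print(target_users)     # ['u4']
```

Either manner may be used alone; the sketch combines them only to show that the resulting sets of invalid users can be merged before the target user sample set is generated.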
In one embodiment, the first classification processing module 806 is further configured to classify the users in the target user sample set according to the first attribute tag to obtain a first classification processing result, and to classify the first classification processing result according to the second attribute tag to obtain a second classification processing result. Counting the first quantity of users included in each category obtained by the first classification processing includes: counting the first quantity of users included in each category of the second classification processing result.
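The two-stage classification can be sketched as follows: users are first grouped by the first attribute tag, each resulting group is then grouped again by the second attribute tag, and the size of every final category is the first quantity. The dictionary keys, tag values and function names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

UserTags = Dict[str, str]


def classify_by(users: List[UserTags], tag: str) -> Dict[str, List[UserTags]]:
    """Group users by the value of one attribute tag."""
    groups: Dict[str, List[UserTags]] = defaultdict(list)
    for user in users:
        groups[user[tag]].append(user)
    return dict(groups)


def two_stage_counts(users: List[UserTags], first_tag: str,
                     second_tag: str) -> Dict[Tuple[str, str], int]:
    """Classify by first_tag, then by second_tag, and count each final category."""
    counts: Dict[Tuple[str, str], int] = {}
    for first_value, members in classify_by(users, first_tag).items():
        for second_value, sub_members in classify_by(members, second_tag).items():
            counts[(first_value, second_value)] = len(sub_members)
    return counts


target_users = [
    {"id": "u1", "first_tag": "viewed", "second_tag": "joined"},
    {"id": "u2", "first_tag": "viewed", "second_tag": "joined"},
    {"id": "u3", "first_tag": "viewed", "second_tag": "not_joined"},
    {"id": "u4", "first_tag": "not_viewed", "second_tag": "not_joined"},
]

first_quantities = two_stage_counts(target_users, "first_tag", "second_tag")
print(first_quantities[("viewed", "joined")])  # 2
```

Categories that contain no users simply do not appear in the result, so a caller that needs explicit zero counts would have to add them.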
In one embodiment, the analysis data acquisition module 810 is further configured to calculate the first analysis data according to a first calculation formula and the first quantity of users included in each category.
In one embodiment, the analysis data acquisition module 810 is further configured to calculate the second analysis data according to a second calculation formula, the obtained second quantity and the number of users in the original user sample set.
In one embodiment, the data analysis result generation module 812 is further configured to obtain a first data threshold corresponding to the first analysis data and a second data threshold corresponding to the second analysis data; to generate a first data analysis result when the first analysis data is less than the first data threshold; and to generate a second data analysis result when the second analysis data is less than the second data threshold.
For specific limitations on the data analysis apparatus, reference may be made to the limitations on the data analysis method above, which are not repeated here. Each module in the data analysis apparatus may be implemented in whole or in part by software, hardware or a combination thereof. Each module may be embedded in or independent of a processor in a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call it and execute the operations corresponding to the module.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in Fig. 9. The computer device includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a data analysis method. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input device of the computer device may be a touch layer covering the display screen, a key, trackball or touchpad provided on the housing of the computer device, or an external keyboard, touchpad or mouse.
Those skilled in the art will understand that the structure shown in Fig. 9 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, and the processor implements the following steps when executing the computer program: acquiring the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag; performing first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and counting a first quantity of users included in each category obtained by the first classification processing; performing second classification processing on the users in the target user sample set according to the third attribute tag, and counting a second quantity of users included in each category obtained by the second classification processing; obtaining first analysis data according to the counted first quantity, and obtaining second analysis data according to the obtained second quantity and the number of users included in the original user sample set; and generating a data analysis result according to the first analysis data and the second analysis data.
In one embodiment, the processor further implements the following steps when executing the computer program: acquiring the raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data; and obtaining the target users other than the invalid users to generate the target user sample set.
In one embodiment, the processor further implements the following steps when executing the computer program: acquiring second raw data of the second attribute tag corresponding to each user in the original user sample set, and taking users whose second raw data is less than a raw data threshold as invalid users; acquiring first raw data and second raw data of the second attribute tag corresponding to each user in the original user sample set, clustering the users in the original user sample set according to the first raw data, counting the number of users in each category obtained by the clustering, taking a category whose number of users is greater than a quantity threshold as a target category, and, when the second raw data corresponding to all users in the target category is within a data range, taking the users in the target category as invalid users.
In one embodiment, the processor further implements the following steps when executing the computer program: classifying the users in the target user sample set according to the first attribute tag to obtain a first classification processing result; and classifying the first classification processing result according to the second attribute tag to obtain a second classification processing result. Counting the first quantity of users included in each category obtained by the first classification processing includes: counting the first quantity of users included in each category of the second classification processing result.
In one embodiment, the processor further implements the following step when executing the computer program: calculating the first analysis data according to a first calculation formula and the first quantity of users included in each category.
In one embodiment, the processor further implements the following step when executing the computer program: calculating the second analysis data according to a second calculation formula, the obtained second quantity and the number of users in the original user sample set.
In one embodiment, the processor further implements the following steps when executing the computer program: obtaining a first data threshold corresponding to the first analysis data, and obtaining a second data threshold corresponding to the second analysis data; when the first analysis data is less than the first data threshold, generating a first data analysis result; and when the second analysis data is less than the second data threshold, generating a second data analysis result.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program implements the following steps when executed by a processor: acquiring the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a web page, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag; performing first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and counting a first quantity of users included in each category obtained by the first classification processing; performing second classification processing on the users in the target user sample set according to the third attribute tag, and counting a second quantity of users included in each category obtained by the second classification processing; obtaining first analysis data according to the counted first quantity, and obtaining second analysis data according to the obtained second quantity and the number of users included in the original user sample set; and generating a data analysis result according to the first analysis data and the second analysis data.
In one embodiment, the computer program further implements the following steps when executed by the processor: acquiring the raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data; and obtaining the target users other than the invalid users to generate the target user sample set.
In one embodiment, the computer program further implements the following steps when executed by the processor: acquiring second raw data of the second attribute tag corresponding to each user in the original user sample set, and taking users whose second raw data is less than a raw data threshold as invalid users; acquiring first raw data and second raw data of the second attribute tag corresponding to each user in the original user sample set, clustering the users in the original user sample set according to the first raw data, counting the number of users in each category obtained by the clustering, taking a category whose number of users is greater than a quantity threshold as a target category, and, when the second raw data corresponding to all users in the target category is within a data range, taking the users in the target category as invalid users.
In one embodiment, the computer program further implements the following steps when executed by the processor: classifying the users in the target user sample set according to the first attribute tag to obtain a first classification processing result; and classifying the first classification processing result according to the second attribute tag to obtain a second classification processing result. Counting the first quantity of users included in each category obtained by the first classification processing includes: counting the first quantity of users included in each category of the second classification processing result.
In one embodiment, the computer program further implements the following step when executed by the processor: calculating the first analysis data according to a first calculation formula and the first quantity of users included in each category.
In one embodiment, the computer program further implements the following step when executed by the processor: calculating the second analysis data according to a second calculation formula, the obtained second quantity and the number of users in the original user sample set.
In one embodiment, the computer program further implements the following steps when executed by the processor: obtaining a first data threshold corresponding to the first analysis data, and obtaining a second data threshold corresponding to the second analysis data; when the first analysis data is less than the first data threshold, generating a first data analysis result; and when the second analysis data is less than the second data threshold, generating a second data analysis result.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this specification.
The above embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.