CN109522495A - Data analysis method, apparatus, computer device and storage medium

Info

Publication number: CN109522495A
Application number: CN201811389637.9A
Authority: CN
Inventors: 宋延锋; 林东宇; 魏睿; 蔡剑洪; 舒栋; 蔡立勋; 黄慧
Original assignee: CENTURY DRAGON INFORMATION NETWORK Co Ltd
Current assignee: Tianyi Shilian Technology Co ltd
Priority date: 2018-11-21
Filing date: 2018-11-21
Publication date: 2019-03-26
Anticipated expiration: 2038-11-21
Also published as: CN109522495B

Abstract

This application relates to a data analysis method, an apparatus, a computer device and a storage medium. The method includes: obtaining the attribute tags corresponding to each user in an original user sample set; performing first classification processing on the users in a target user sample set according to the first attribute tags and the second attribute tags, and counting a first quantity of users in each category obtained by the first classification processing; performing second classification processing on the users in the target user sample set according to the third attribute tags, and counting a second quantity of users in each category obtained by the second classification processing; obtaining first analysis data from the counted first quantities, and obtaining second analysis data from the obtained second quantities and the number of users in the original user sample set; and generating a data analysis result from the first analysis data and the second analysis data. The method can improve the accuracy of data analysis.

Description

Data analysis method, apparatus, computer device and storage medium

Technical field

This application relates to the field of computer technology, and in particular to a data analysis method, an apparatus, a computer device and a storage medium.

Background art

With the development of computer technology and the Internet, more and more activities are held online. Some activities perform well, while others perform poorly. For an activity that performs poorly, the cause can only be sought by analyzing the content of the activity. However, analyzing the activity content alone cannot accurately identify the cause, so the accuracy of data analysis is low.

Summary of the invention

In view of the above technical problem, it is necessary to provide a data analysis method, an apparatus, a computer device and a storage medium that can improve the accuracy of data analysis.

A data analysis method, the method comprising:

obtaining the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include first attribute tags, second attribute tags and third attribute tags;

performing first classification processing on the users in the target user sample set according to the first attribute tags and the second attribute tags, and counting a first quantity of users in each category obtained by the first classification processing;

performing second classification processing on the users in the target user sample set according to the third attribute tags, and counting a second quantity of users in each category obtained by the second classification processing;

obtaining first analysis data from the counted first quantities, and obtaining second analysis data from the obtained second quantities and the number of users in the original user sample set;

generating a data analysis result from the first analysis data and the second analysis data.

A data analysis apparatus, the apparatus comprising:

an attribute tag obtaining module, configured to obtain the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include first attribute tags, second attribute tags and third attribute tags;

a first classification processing module, configured to perform first classification processing on the users in the target user sample set according to the first attribute tags and the second attribute tags, and to count a first quantity of users in each category obtained by the first classification processing;

a second classification processing module, configured to perform second classification processing on the users in the target user sample set according to the third attribute tags, and to count a second quantity of users in each category obtained by the second classification processing;

an analysis data obtaining module, configured to obtain first analysis data from the counted first quantities, and to obtain second analysis data from the obtained second quantities and the number of users in the original user sample set;

a data analysis result generation module, configured to generate a data analysis result from the first analysis data and the second analysis data.

A computer device, including a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the above data analysis method when executing the computer program.

A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above data analysis method when executed by a processor.

The above data analysis method, apparatus, computer device and storage medium obtain the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include first attribute tags, second attribute tags and third attribute tags; perform first classification processing on the users in the target user sample set according to the first and second attribute tags, and count a first quantity of users in each category obtained by the first classification processing; perform second classification processing on the users in the target user sample set according to the third attribute tags, and count a second quantity of users in each category obtained by the second classification processing; obtain first analysis data from the counted first quantities, and obtain second analysis data from the obtained second quantities and the number of users in the original user sample set; and generate a data analysis result from the first analysis data and the second analysis data. By classifying users according to their attribute tags and counting the users in each category, the first and second analysis data are obtained, and the data analysis result derived from them improves the accuracy of data analysis.

Brief description of the drawings

Fig. 1 is a diagram of the application environment of the data analysis method in one embodiment;

Fig. 2 is a flow diagram of the data analysis method in one embodiment;

Fig. 3 is a flow diagram of the step of obtaining invalid users in one embodiment;

Fig. 4 is a flow diagram of the data analysis method in another embodiment;

Fig. 5 is a schematic diagram of the categories of the second classification processing result in one embodiment;

Fig. 6 is a schematic diagram of input data and analysis data in one embodiment;

Fig. 7 is a structural block diagram of the data analysis apparatus in one embodiment;

Fig. 8 is a structural block diagram of the data analysis apparatus in one embodiment;

Fig. 9 is an internal structure diagram of the computer device in one embodiment.

Detailed description of the embodiments

To make the objectives, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the application, not to limit it.

The data analysis method provided by the embodiments of this application can be applied in the application environment shown in Fig. 1. The terminal 102 of each user communicates with the server 104 over a network. While the terminals 102 operate webpages, the server 104 can obtain the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include first attribute tags, second attribute tags and third attribute tags; perform first classification processing on the users in the target user sample set according to the first and second attribute tags, and count a first quantity of users in each category obtained by the first classification processing; perform second classification processing on the users in the target user sample set according to the third attribute tags, and count a second quantity of users in each category obtained by the second classification processing; obtain first analysis data from the counted first quantities, and obtain second analysis data from the obtained second quantities and the number of users in the original user sample set; and generate a data analysis result from the first analysis data and the second analysis data. The terminal 102 may be, but is not limited to, a personal computer, laptop, smartphone, tablet computer or portable wearable device, and the server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.

In one embodiment, as shown in Fig. 2, a data analysis method is provided. The method is described as applied to the terminal in Fig. 1 and includes the following steps:

Step 202: obtain the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include first attribute tags, second attribute tags and third attribute tags.

The original user sample set is the set of all user samples on which data analysis is performed. An attribute tag is a label that indicates an attribute of a user while operating a webpage, for example "participated", "not participated", "long stay", "short stay", "shared", "opened" and so on.

Step 204: perform first classification processing on the users in the target user sample set according to the first attribute tags and the second attribute tags, and count a first quantity of users in each category obtained by the first classification processing.

The target user sample set is the set of user samples on which classification processing is performed. Specifically, the raw data of the second attribute tag corresponding to each user in the original user sample set is obtained, target users are obtained from the original user sample set according to the raw data, and the target user sample set is generated from the obtained target users.

Specifically, a user in the target user sample set can carry a first attribute tag and a second attribute tag at the same time. The users in the target user sample set are classified according to the first and second attribute tags, and the number of users in each category is counted.

For example, the first attribute tags can be "participated" and "not participated", and the second attribute tags can be "long stay" and "short stay". Performing the first classification processing on the users in the target user sample set according to the first and second attribute tags yields four categories: "participated, long stay", "participated, short stay", "not participated, long stay" and "not participated, short stay".

When a user clicks a button in the webpage while operating it, such as "recharge" or "pay", the first attribute tag of that user is "participated"; when the user clicks no button in the webpage, the first attribute tag is "not participated". Over the whole process from starting to operate the webpage to exiting it, when the user's stay time on the webpage exceeds a duration threshold, the second attribute tag of that user is "long stay"; when the stay time is less than or equal to the duration threshold, the second attribute tag is "short stay".

"Participated, long stay" means the user clicked a button in the webpage and stayed on the webpage for a long time. "Participated, short stay" means the user clicked a button in the webpage and stayed on the webpage for a short time. "Not participated, long stay" means the user clicked no button in the webpage and stayed on the webpage for a long time. "Not participated, short stay" means the user clicked no button in the webpage and stayed on the webpage for a short time.

Step 206: perform second classification processing on the users in the target user sample set according to the third attribute tags, and count a second quantity of users in each category obtained by the second classification processing.

For example, the third attribute tags can be "opened", "completed" and "recommended". According to these third attribute tags, the users in the target user sample set can be divided into three categories, "opened", "completed" and "recommended", and the number of users in each category is counted.

"Opened" means the user opened the webpage. "Completed" means the user clicked a button in the webpage and completed the steps of the activity in the webpage. "Recommended" means the user shared the webpage.

Step 208: obtain first analysis data from the counted first quantities, and obtain second analysis data from the obtained second quantities and the number of users in the original user sample set.

The first analysis data is the data obtained by analyzing the first quantities. The second analysis data is the data obtained by analyzing the second quantities together with the number of users in the original user sample set.

For example, after the first classification processing is performed on the users in the target user sample set, the first quantities of the four categories "participated, long stay", "participated, short stay", "not participated, long stay" and "not participated, short stay" are obtained. From these four first quantities, the first analysis data, namely clarity and participation, can be obtained.

After the second classification processing is performed on the users in the target user sample set, the second quantities of the three categories "opened", "completed" and "recommended" are obtained. From these three second quantities and the number of users in the original user sample set, the second analysis data, namely visibility, completeness and recommendation, can be obtained.

Step 210: generate a data analysis result from the first analysis data and the second analysis data.

For example, the data analysis result may be "visibility is too low, possibly because the link entry is not prominent and users rarely notice it", or "attraction is too low, possibly because the prize for the activity is too small to appeal to users".

Further, improvement measures can be derived from the data analysis result, such as "because visibility is too low, advertising can be placed where appropriate to draw users to the entry" or "because attraction is too low, the prize amount or the number of prizes can be increased to attract more users".

In the above data analysis method, the attribute tags corresponding to each user in an original user sample set are obtained, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include first attribute tags, second attribute tags and third attribute tags; first classification processing is performed on the users in the target user sample set according to the first and second attribute tags, and a first quantity of users in each resulting category is counted; second classification processing is performed on the users in the target user sample set according to the third attribute tags, and a second quantity of users in each resulting category is counted; first analysis data is obtained from the counted first quantities, and second analysis data is obtained from the second quantities and the number of users in the original user sample set; and a data analysis result is generated from the first and second analysis data. By classifying users according to their attribute tags and counting the users in each category, the first and second analysis data are obtained, and the data analysis result derived from them improves the accuracy of data analysis.

In one embodiment, after obtaining the attribute tags corresponding to each user in the original user sample set, the method further includes: obtaining the raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data; and obtaining the target users, that is, the users other than the invalid users, to generate the target user sample set.

The second attribute tags can be "long stay" and "short stay". An invalid user is a user whose data is invalid, that is, of no value to the data analysis. For example, a user whose stay time while operating the webpage is less than 1 s is likely to have arrived by accidental or malicious operation, so the data generated by that user has no analytical value and the user is an invalid user.

The original user sample set contains some data that is of no value to the data analysis, and analyzing every user in the original user sample set would reduce the accuracy of the analysis. For example, some users stay on the webpage for too long because of network problems and cannot exit. Or, because of server problems, identical data appears several times within a short period. Such data therefore needs to be removed: invalid users are obtained from the original user sample set, the target users are then determined by excluding the invalid users, and the target user sample set is generated from the target users.

In this embodiment, invalid users are obtained from the original user sample set according to the raw data of the second attribute tag corresponding to each user, and only the data of the target users other than the invalid users is analyzed, which improves the accuracy of data analysis.

In one embodiment, the raw data includes first raw data and second raw data, wherein the first raw data indicates the moment at which the user starts operating the webpage, and the second raw data indicates how long the user operates the webpage.

Obtaining the raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data, includes at least one of the following:

Step 302: obtain the second raw data of the second attribute tag corresponding to each user in the original user sample set, and take the users whose second raw data is less than a first data threshold as invalid users.

The second raw data can be, for example, "10s" or "2s". The first data threshold can be preset or set in real time, without limitation.

It can be understood that, while operating a webpage, a user may start operating and then exit the webpage within a very short time because of network problems or an accidental operation. The data of such a user is invalid data, and the user is an invalid user.

For example, the first data threshold can be set to 1 s. The second raw data of the second attribute tag corresponding to each user in the original user sample set is obtained. When the second raw data is less than 1 s, it is of no value to the data analysis: the second raw data is removed and the corresponding user is taken as an invalid user.
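The duration filter of step 302 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the record layout (`user_id`, `duration_s`) and the function name `split_by_min_duration` are hypothetical, and the 1 s default only mirrors the example threshold above.

```python
def split_by_min_duration(users, first_data_threshold=1.0):
    """Split users into (invalid, target) by operation duration in seconds."""
    invalid, target = [], []
    for user in users:
        # A duration below the threshold is treated as an accidental visit.
        if user["duration_s"] < first_data_threshold:
            invalid.append(user)
        else:
            target.append(user)
    return invalid, target


# Hypothetical records: duration_s plays the role of the second raw data.
records = [
    {"user_id": "a", "duration_s": 10.0},
    {"user_id": "b", "duration_s": 2.0},
    {"user_id": "c", "duration_s": 0.4},
]
invalid_users, target_users = split_by_min_duration(records)
```

Only user `c` falls below the threshold here, so it is the only record excluded from the target user sample set.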

Step 304: obtain the first raw data and the second raw data of the second attribute tag corresponding to each user in the original user sample set, cluster the users in the original user sample set according to the first raw data, count the number of users in each category obtained by clustering, and take each category whose number of users is greater than a quantity threshold as a target category. When the second raw data of all users in a target category falls within a data range, the users in the target category are taken as invalid users.

The first raw data can be, for example, "15:54 on November 14, 2018".

Specifically, when the users in the original user sample set are clustered according to the first raw data, users who started operating the webpage within the same second can be clustered into the same category. Alternatively, users whose start and exit moments both fall within a target period can be clustered into the same category. For example, for the period from 15:54 to 15:55 on November 14, 2018, the users who both started operating the webpage and exited it within that period are clustered into one category.

The number of users in each category is then counted, and when the number of users is greater than the quantity threshold, the category is taken as a target category. The quantity threshold can be preset or set in real time, without limitation.

After a target category is obtained, when the second raw data of all users in the target category, that is, the time each user spent operating the webpage, falls within the data range, the users in the target category are taken as invalid users.

It can be understood that, in some cases, an electronic device may keep changing its IP (Internet Protocol) address and repeatedly click a webpage link within a short time in order to inflate the click count and read count of the link. Clicks and reads inflated by changing the IP address in a short time do not reflect real users operating the webpage, so this data is invalid. Analyzing such invalid data together with the data of the target users would reduce the accuracy of data analysis.
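The burst detection of step 304 can be sketched as follows, under stated assumptions: clustering is done by truncating the start moment to whole seconds (one of the two options described), the data range is taken as a closed interval of durations, and the field names and the function name `find_burst_invalid_users` are hypothetical. The source does not give concrete values for the quantity threshold or the data range.

```python
from collections import defaultdict
from datetime import datetime


def find_burst_invalid_users(users, quantity_threshold, duration_range):
    """Return the users in crowded same-second clusters whose durations all lie in duration_range."""
    low, high = duration_range
    clusters = defaultdict(list)
    for user in users:
        # Cluster key: the start moment (first raw data) truncated to whole seconds.
        clusters[user["start"].replace(microsecond=0)].append(user)

    invalid = []
    for members in clusters.values():
        is_target_category = len(members) > quantity_threshold
        # Every member's duration (second raw data) must fall inside the range.
        if is_target_category and all(low <= m["duration_s"] <= high for m in members):
            invalid.extend(members)
    return invalid


burst = datetime(2018, 11, 14, 15, 54, 0)
records = [
    {"user_id": "a", "start": burst, "duration_s": 0.5},
    {"user_id": "b", "start": burst.replace(microsecond=300000), "duration_s": 0.6},
    {"user_id": "c", "start": burst.replace(microsecond=900000), "duration_s": 0.7},
    {"user_id": "d", "start": datetime(2018, 11, 14, 15, 54, 30), "duration_s": 12.0},
]
invalid_users = find_burst_invalid_users(
    records, quantity_threshold=2, duration_range=(0.0, 1.0)
)
```

With these sample values, the three users who started within the same second and stayed under a second are flagged together, while user `d` is kept.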

In another embodiment, the users in the original user sample set are divided into two categories according to the first attribute tags "participated" and "not participated". In the "participated" category, the maximum value of the second raw data of the second attribute tag is obtained and taken as the longest duration. In the "not participated" category, when the second raw data of a user is greater than the longest duration, that user is taken as an invalid user.

It can be understood that, while operating a webpage, a user may be distracted by something else and leave the webpage open without exiting it, without actually operating the webpage at all. If such a user did not participate and stayed longer than the longest duration in the "participated" category, the user is an invalid user.
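The longest-duration rule can be sketched as follows. The field names (`participated`, `duration_s`) and the function name `find_idle_invalid_users` are hypothetical, and returning no invalid users when nobody participated is an assumption of this sketch, since the source does not cover that case.

```python
def find_idle_invalid_users(users):
    """Flag non-participants who stayed longer than any participant did."""
    participant_durations = [u["duration_s"] for u in users if u["participated"]]
    if not participant_durations:
        # Assumption: with no participants there is no reference duration.
        return []
    longest_duration = max(participant_durations)
    return [
        u for u in users
        if not u["participated"] and u["duration_s"] > longest_duration
    ]


records = [
    {"user_id": "a", "participated": True, "duration_s": 30.0},
    {"user_id": "b", "participated": True, "duration_s": 120.0},
    {"user_id": "c", "participated": False, "duration_s": 45.0},
    {"user_id": "d", "participated": False, "duration_s": 600.0},
]
idle_users = find_idle_invalid_users(records)
```

Here the longest participating stay is 120 s, so only user `d`, who stayed 600 s without participating, is flagged.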

In this embodiment, invalid users are obtained from the original user sample set and their data is removed, which improves the accuracy of data analysis.

In one embodiment, performing the first classification processing on the users in the target user sample set according to the first attribute tags and the second attribute tags includes:

Step 402: perform classification processing on the users in the target user sample set according to the first attribute tags to obtain a first classification processing result.

Step 404: perform classification processing on the first classification processing result according to the second attribute tags to obtain a second classification processing result.

For example, the first attribute tags can be "participated" and "not participated". Classifying the users in the target user sample set then yields a first classification processing result with two categories, "participated users" and "not participated users". The second attribute tags can be "long stay" and "short stay". The participated users and the not participated users in the first classification processing result are then each classified again, yielding a second classification processing result with four categories: "participated, long stay", "participated, short stay", "not participated, long stay" and "not participated, short stay".

Counting the first quantity of users in each category obtained by the first classification processing includes:

Step 406: count the first quantity of users in each category of the second classification processing result.

For example, the second classification processing result can consist of the four categories "participated, long stay", "participated, short stay", "not participated, long stay" and "not participated, short stay", and the number of users in each of these categories is counted.

In this embodiment, the users in the target user sample set are first classified according to the first attribute tags to obtain a first classification processing result, and the first classification processing result is then classified according to the second attribute tags to obtain a second classification processing result. Obtaining the first quantities from the second classification processing result allows the data to be analyzed more accurately.
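Steps 402 to 406 can be sketched as a two-stage grouping. The sketch assumes, following the description of the tags, that a user counts as having participated once any button was clicked and as a long stay when the stay time exceeds the duration threshold. The field names, the English tag strings and the function name `two_stage_classification` are hypothetical.

```python
from collections import Counter, defaultdict


def two_stage_classification(users, duration_threshold):
    """Return the first quantities keyed by (first tag, second tag)."""
    # Step 402: split the target users by the first attribute tag.
    first_result = defaultdict(list)
    for user in users:
        first_tag = "participated" if user["clicked_button"] else "not participated"
        first_result[first_tag].append(user)

    # Steps 404 and 406: split each group by the second attribute tag and count.
    first_quantities = Counter()
    for first_tag, members in first_result.items():
        for user in members:
            long_stay = user["stay_s"] > duration_threshold
            second_tag = "long stay" if long_stay else "short stay"
            first_quantities[(first_tag, second_tag)] += 1
    return first_quantities


sample_users = [
    {"clicked_button": True, "stay_s": 40},
    {"clicked_button": True, "stay_s": 5},
    {"clicked_button": False, "stay_s": 40},
    {"clicked_button": False, "stay_s": 5},
    {"clicked_button": False, "stay_s": 10},
]
first_quantities = two_stage_classification(sample_users, duration_threshold=10)
```

A stay time equal to the threshold counts as a short stay, matching the "less than or equal to" rule in the tag definition.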

In one embodiment, obtaining the first analysis data from the counted first quantities includes: calculating the first analysis data with a first calculation formula from the first quantity of users in each category.

As shown in Fig. 5, the second classification processing result can consist of four categories: category 501 is "participated, long stay", category 502 is "participated, short stay", category 503 is "not participated, long stay" and category 504 is "not participated, short stay". Correspondingly, the first analysis data can be clarity. The first calculation formula may include a calculation formula for clarity and a calculation formula for attraction.

Specifically, participating means the user clicked a button in the webpage and took part in the activity in the webpage, which shows that the user was attracted by the content of the webpage. Not participating means the user clicked no button in the webpage, which shows that the user was not attracted by the content. A short stay means the user stayed on the webpage briefly, which shows that the content of the webpage is relatively clear. A long stay means the user stayed on the webpage for a long time, which shows that the content of the webpage is unclear.

Further, as shown in Fig. 5, the users in category 504, "not participated, short stay", can be divided into "not participated, short stay, understood" and "not participated, short stay, not understood". "Not participated, short stay, understood" refers to users who did not participate and who exited the webpage after understanding its content within a short time. "Not participated, short stay, not understood" refers to users who did not participate and who exited the webpage within a short time without understanding its content.

It can be understood that the proportion of "not participated, short stay, understood" users within the category "not participated, short stay" is taken to be the same as the proportion of "participated, short stay" users within the category "participated". In other words, the proportion of users who understood the webpage content is assumed to be the same among participating users as among users who did not participate and stayed only briefly.

Specifically, users who stayed briefly and participated while operating the webpage, that is, users in the category "participated, short stay", are users for whom the content of the webpage is clear. The users for whom the content is clear also include those who did not participate, stayed briefly and understood the content, that is, users in the category "not participated, short stay, understood".

Therefore, the clarity can be obtained by the following calculation formula: clarity = (j + i*j/(J+j)) / (i+I+j+J). Here, i denotes the number of users in the category "not participated, short residence time", I denotes the number of users in the category "not participated, long residence time", j denotes the number of users in the category "participated, short residence time", J denotes the number of users in the category "participated, long residence time", and i*j/(J+j) denotes the number of users in the category "not participated, short residence time, understood".

As shown in FIG. 6, the number of users in the category "not participated, short residence time" is i = 9188, the number of users in the category "not participated, long residence time" is I = 2162, the number of users in the category "participated, short residence time" is j = 1173, and the number of users in the category "participated, long residence time" is J = 667. The number of users in the category "not participated, short residence time, understood" is then i*j/(J+j) = 9188*1173/(667+1173) = 5857.35, and clarity = (j + i*j/(J+j)) / (i+I+j+J) = (1173 + 5857.35) / (9188 + 2162 + 1173 + 667) = 53.30%.
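The clarity calculation above can be checked with a short script. This is only an illustrative sketch: the function name `clarity` and the zero-division guard are assumptions added for the example, while the formula and the FIG. 6 counts are taken from the text.

```python
def clarity(i: int, I: int, j: int, J: int) -> float:
    """Clarity = (j + i*j/(J+j)) / (i+I+j+J).

    i: not participated, short residence time
    I: not participated, long residence time
    j: participated, short residence time
    J: participated, long residence time
    """
    participated = j + J
    total = i + I + j + J
    if participated == 0 or total == 0:
        # Guard added for the sketch; the text does not discuss this case.
        raise ValueError("at least one participating user is required")
    # Estimated users in "not participated, short residence time, understood".
    understood_not_participated = i * j / participated
    return (j + understood_not_participated) / total


fig6_clarity = clarity(i=9188, I=2162, j=1173, J=667)
print(f"{fig6_clarity:.2%}")  # 53.30%
```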

Further, as shown in FIG. 5, the users in category 503, "not participated, long residence time", can be divided into "not participated, long residence time, attracted" and "not participated, long residence time, not attracted". "Not participated, long residence time, attracted" refers to users who did not participate, stayed for a long time while operating the webpage, and were attracted. "Not participated, long residence time, not attracted" refers to users who did not participate, stayed for a long time while operating the webpage, and were not attracted.

It can be understood that the proportion of users in the category "not participated, long residence time, attracted" among the users in the category "not participated, long residence time" is the same as the proportion of users in the category "participated, short residence time" among all users with a short residence time.

Specifically, while operating the webpage, the users in the category "participated" are users who were attracted by the content of the webpage. In addition, the users attracted by the content of the webpage also include users who did not participate, had a long residence time and were attracted, i.e. users in the category "not participated, long residence time, attracted".

Therefore, the attraction degree can be obtained by the following calculation formula: attraction degree = (j + J + I*j/(i+j)) / (i+I+j+J). Here, I*j/(i+j) denotes the number of users in the category "not participated, long residence time, attracted".

As shown in FIG. 6, the number of users in the category "not participated, short residence time" is i = 9188, the number of users in the category "not participated, long residence time" is I = 2162, the number of users in the category "participated, short residence time" is j = 1173, and the number of users in the category "participated, long residence time" is J = 667. The number of users in the category "not participated, long residence time, attracted" is then I*j/(i+j) = 2162*1173/(9188+1173) = 244.77, and attraction degree = (j + J + I*j/(i+j)) / (i+I+j+J) = (1173 + 667 + 244.77) / (9188 + 2162 + 1173 + 667) = 15.81%.
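As a cross-check of the attraction degree arithmetic, the formula can be written as a small function. This is a sketch only: the name `attraction_degree` and the input guard are illustrative assumptions, while the formula and the FIG. 6 counts come from the text.

```python
def attraction_degree(i: int, I: int, j: int, J: int) -> float:
    """Attraction degree = (j + J + I*j/(i+j)) / (i+I+j+J).

    i: not participated, short residence time
    I: not participated, long residence time
    j: participated, short residence time
    J: participated, long residence time
    """
    short_stay = i + j
    total = i + I + j + J
    if short_stay == 0 or total == 0:
        # Guard added for the sketch; the text does not discuss this case.
        raise ValueError("at least one short-residence user is required")
    # Estimated users in "not participated, long residence time, attracted".
    attracted_not_participated = I * j / short_stay
    return (j + J + attracted_not_participated) / total


fig6_attraction = attraction_degree(i=9188, I=2162, j=1173, J=667)
print(f"{fig6_attraction:.2%}")  # 15.81%
```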

In this embodiment, the first analysis data, namely the clarity and the attraction degree, are calculated with the first calculation formula from the first quantity of users contained in each category, so that the first analysis data can be obtained more accurately.

In one embodiment, obtaining the second analysis data according to the obtained second quantity and the number of users contained in the original user sample set comprises: calculating the second analysis data with a second calculation formula according to the obtained second quantity and the number of users in the original user sample set.

For example, the second quantity may be the number of users contained in each of the three categories "opened", "completed" and "recommended". Correspondingly, the second analysis data may be the visibility, the completeness and the recommendation degree. The number of users in the category "opened" is the number of users in the target user sample set, i.e. the number of users who arrived through the webpage link.

The visibility can be calculated by the following formula: visibility = number of users who opened / number of users in the original user sample set. The completeness can be calculated by the following formula: completeness = number of users who completed / number of users who opened. The recommendation degree can be calculated by the following formula: recommendation degree = number of users who shared / number of users who opened.

As shown in FIG. 6, the number of users in the original user sample set is 150000, the number of users who opened, i.e. the number of users in the target user sample set, is 13190, the number of users who completed is 1000, and the number of users who shared is 65. Then visibility = number of users who opened / number of users in the original user sample set = 13190/150000 = 8.79%, completeness = number of users who completed / number of users who opened = 1000/13190 = 7.58%, and recommendation degree = number of users who shared / number of users who opened = 65/13190 = 0.49%.
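The three ratios above can be reproduced as follows. This is a minimal sketch: the function name `second_analysis_data` and the dictionary keys are illustrative choices, and only the formulas and the FIG. 6 counts come from the text.

```python
def second_analysis_data(original_total: int, opened: int,
                         completed: int, shared: int) -> dict[str, float]:
    """Compute visibility, completeness and recommendation degree."""
    if original_total <= 0 or opened <= 0:
        # Guard added for the sketch; the text does not discuss this case.
        raise ValueError("user counts used as denominators must be positive")
    return {
        "visibility": opened / original_total,
        "completeness": completed / opened,
        "recommendation": shared / opened,
    }


fig6_second = second_analysis_data(
    original_total=150000, opened=13190, completed=1000, shared=65
)
for name, value in fig6_second.items():
    print(f"{name}: {value:.2%}")
```

With the FIG. 6 counts this prints 8.79%, 7.58% and 0.49%, matching the values given in the text.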

In this embodiment, the second analysis data, namely the visibility, the completeness and the recommendation degree, are calculated with the second calculation formula according to the second quantity and the number of users in the original user sample set, so that the second analysis data can be obtained more accurately.

In one embodiment, the data analysis result includes a first data analysis result and a second data analysis result, and generating the data analysis result according to the first analysis data and the second analysis data comprises: obtaining a first data threshold corresponding to the first analysis data, and obtaining a second data threshold corresponding to the second analysis data; generating the first data analysis result when the first analysis data is less than the first data threshold; and generating the second data analysis result when the second analysis data is less than the second data threshold.

The first data threshold and the second data threshold may be preset or may be set in real time, without being limited thereto. The first data threshold and the second data threshold may be derived from big data, or may be obtained according to industry standards in this field.

As shown in FIG. 6, the first analysis data may include the clarity and the attraction degree, where the clarity is 53.30% and the attraction degree is 15.81%. The first data threshold includes a clarity threshold and an attraction degree threshold; the clarity threshold may be 50% and the attraction degree threshold may be 20%. The second analysis data may include the visibility, the completeness and the recommendation degree, where the visibility is 8.79%, the completeness is 7.58% and the recommendation degree is 0.49%. The second data threshold includes a visibility threshold, a completeness threshold and a recommendation degree threshold; the visibility threshold may be 30%, the completeness threshold may be 5% and the recommendation degree threshold may be 0.4%.

Then the clarity of 53.30% is greater than the clarity threshold of 50%, the attraction degree of 15.81% is less than the attraction degree threshold of 20%, the visibility of 8.79% is less than the visibility threshold of 30%, the completeness of 7.58% is greater than the completeness threshold of 5%, and the recommendation degree of 0.49% is greater than the recommendation degree threshold of 0.4%.

Therefore, the first data analysis result may be "the attraction degree is too low, possibly because the activity offers too few prizes and is not attractive to users". The second data analysis result may be "the visibility is too low, possibly because the link entrance is not obvious and users hardly notice it".

In this embodiment, the first analysis data is compared with the first data threshold and the second analysis data is compared with the second data threshold, and the first data analysis result and the second data analysis result are generated respectively, so that the result of the data analysis can be obtained more accurately.
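The threshold comparison of this embodiment can be sketched as below. The helper name `below_threshold` and the dictionary layout are assumptions made for illustration; the metric values and example thresholds are the ones given for FIG. 6, and a real implementation would map each flagged metric to a diagnostic message such as those quoted above.

```python
def below_threshold(analysis_data: dict[str, float],
                    thresholds: dict[str, float]) -> list[str]:
    """Return the names of the metrics that fall below their thresholds."""
    return [name for name, value in analysis_data.items()
            if value < thresholds[name]]


# First analysis data and first data thresholds (FIG. 6 example values).
first_data = {"clarity": 0.5330, "attraction": 0.1581}
first_thresholds = {"clarity": 0.50, "attraction": 0.20}

# Second analysis data and second data thresholds (FIG. 6 example values).
second_data = {"visibility": 0.0879, "completeness": 0.0758,
               "recommendation": 0.0049}
second_thresholds = {"visibility": 0.30, "completeness": 0.05,
                     "recommendation": 0.004}

first_flags = below_threshold(first_data, first_thresholds)
second_flags = below_threshold(second_data, second_thresholds)
print(first_flags)   # ['attraction']
print(second_flags)  # ['visibility']
```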

It should be understood that although the steps in the flowcharts of FIGS. 2-4 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless expressly stated otherwise herein, there is no strict restriction on the order in which these steps are executed, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-4 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different moments, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.

In one embodiment, as shown in FIG. 7, a data analysis device is provided, comprising: an attribute tag obtaining module 702, a first classification processing module 704, a second classification processing module 706, an analysis data obtaining module 708 and a data analysis result generating module 710, wherein:

The attribute tag obtaining module 702 is configured to obtain the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag.

The first classification processing module 704 is configured to perform first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and to count a first quantity of users contained in each category obtained by the first classification processing.

The second classification processing module 706 is configured to perform second classification processing on the users in the target user sample set according to the third attribute tag, and to count a second quantity of users contained in each category obtained by the second classification processing.

The analysis data obtaining module 708 is configured to obtain first analysis data according to the counted first quantity, and to obtain second analysis data according to the obtained second quantity and the number of users contained in the original user sample set.

The data analysis result generating module 710 is configured to generate a data analysis result according to the first analysis data and the second analysis data.

The above data analysis device obtains the attribute tags corresponding to each user in the original user sample set, wherein the original user sample set includes the target user sample set, the attribute tags are generated while the user operates the webpage, and the attribute tags include the first attribute tag, the second attribute tag and the third attribute tag; performs the first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and counts the first quantity of users contained in each category obtained by the first classification processing; performs the second classification processing on the users in the target user sample set according to the third attribute tag, and counts the second quantity of users contained in each category obtained by the second classification processing; obtains the first analysis data according to the counted first quantity, and obtains the second analysis data according to the obtained second quantity and the number of users contained in the original user sample set; and generates the data analysis result according to the first analysis data and the second analysis data. By classifying users according to the attribute tags corresponding to each user in the original user sample set and counting the number of users in each category, the first analysis data and the second analysis data are obtained, and the data analysis result is obtained from the first analysis data and the second analysis data, which can improve the accuracy of the data analysis.

In one embodiment, as shown in FIG. 8, a data analysis device is provided, comprising: an attribute tag obtaining module 802, a target user sample set generating module 804, a first classification processing module 806, a second classification processing module 808, an analysis data obtaining module 810 and a data analysis result generating module 812, wherein:

The attribute tag obtaining module 802 is configured to obtain the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag.

The target user sample set generating module 804 is configured to obtain the raw data of the second attribute tag corresponding to each user in the original user sample set, to obtain invalid users from the original user sample set according to the raw data, and to obtain the target users other than the invalid users to generate the target user sample set.

The first classification processing module 806 is configured to perform first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and to count a first quantity of users contained in each category obtained by the first classification processing.

The second classification processing module 808 is configured to perform second classification processing on the users in the target user sample set according to the third attribute tag, and to count a second quantity of users contained in each category obtained by the second classification processing.

The analysis data obtaining module 810 is configured to obtain first analysis data according to the counted first quantity, and to obtain second analysis data according to the obtained second quantity and the number of users contained in the original user sample set.

The data analysis result generating module 812 is configured to generate a data analysis result according to the first analysis data and the second analysis data.

In this embodiment, the target user sample set is obtained from the original user sample set, the users are then classified according to the attribute tags corresponding to each user in the target user sample set, and the number of users in each category is counted, so that the first analysis data and the second analysis data are obtained. The data analysis result is obtained from the first analysis data and the second analysis data, which can further improve the accuracy of the data analysis.

In one embodiment, the target user sample set generating module 804 is further configured to obtain the raw data of the second attribute tag corresponding to each user in the original user sample set and to obtain invalid users from the original user sample set according to the raw data in at least one of the following ways: obtaining the second raw data of the second attribute tag corresponding to each user in the original user sample set, and taking the users whose second raw data is less than a raw data threshold as invalid users; or obtaining the first raw data and the second raw data of the second attribute tag corresponding to each user in the original user sample set, clustering the users in the original user sample set according to the first raw data, counting the number of users in each category obtained by the clustering, taking a category whose number of users is greater than a quantity threshold as a target category, and, when the second raw data corresponding to all users in the target category falls within a data range, taking the users in the target category as invalid users.
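The two ways of identifying invalid users can be sketched as follows. Several points are assumptions made purely for illustration: the first raw data is treated as a start timestamp in seconds and the second raw data as an operation duration in seconds (as in claim 3), the clustering step is replaced by simple fixed-width time bucketing because the text does not name a clustering algorithm, and the names `find_invalid_users` and `bucket_seconds` are hypothetical.

```python
from collections import defaultdict


def find_invalid_users(users: dict[str, tuple[int, int]],
                       min_duration: int,
                       bucket_seconds: int,
                       count_threshold: int,
                       duration_range: tuple[int, int]) -> set[str]:
    """users maps user id -> (start_time, duration), both in seconds."""
    # Way 1: the duration (second raw data) is below the raw data threshold.
    invalid = {uid for uid, (_, duration) in users.items()
               if duration < min_duration}

    # Way 2: group users by start time (first raw data). Fixed-width
    # bucketing is a stand-in for the unspecified clustering step.
    clusters: dict[int, list[str]] = defaultdict(list)
    for uid, (start, _) in users.items():
        clusters[start // bucket_seconds].append(uid)

    low, high = duration_range
    for members in clusters.values():
        is_target_category = len(members) > count_threshold
        if is_target_category and all(
                low <= users[uid][1] <= high for uid in members):
            invalid.update(members)
    return invalid


sample = {"a": (1000, 1), "b": (2000, 30), "c": (3000, 5),
          "d": (3000, 5), "e": (3000, 6), "f": (4000, 45)}
invalid = find_invalid_users(sample, min_duration=2, bucket_seconds=60,
                             count_threshold=2, duration_range=(4, 7))
target_users = sorted(set(sample) - invalid)
print(sorted(invalid))  # ['a', 'c', 'd', 'e']
print(target_users)     # ['b', 'f']
```

In the sample data, user "a" is removed by the duration threshold, and users "c", "d" and "e" are removed because they start in the same time bucket with near-identical durations; the remaining users form the target user sample set.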

In one embodiment, the first classification processing module 806 is further configured to classify the users in the target user sample set according to the first attribute tag to obtain a first classification processing result, and to classify the first classification processing result according to the second attribute tag to obtain a second classification processing result. Counting the first quantity of users contained in each category obtained by the first classification processing comprises: counting the first quantity of users contained in each category of the second classification processing result.
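The two-stage classification and counting can be sketched as below. This is an illustration under stated assumptions: the first attribute tag is modelled as a boolean "participated" flag, the second attribute tag as a residence time in seconds, and the cut-off `short_stay_seconds` separating short from long residence times is a hypothetical parameter that this passage does not specify.

```python
from collections import Counter


def first_classification(users: list[tuple[bool, float]],
                         short_stay_seconds: float) -> Counter:
    """Count users per (participation, residence time) category.

    Each user is a (participated, residence_seconds) pair.
    """
    counts: Counter = Counter()
    for participated, residence_seconds in users:
        # Stage 1: classify by the first attribute tag.
        stage1 = "participated" if participated else "not participated"
        # Stage 2: sub-classify by the second attribute tag.
        stage2 = ("short residence time"
                  if residence_seconds < short_stay_seconds
                  else "long residence time")
        counts[(stage1, stage2)] += 1
    return counts


sample_users = [(True, 3), (True, 40), (False, 2), (False, 2), (False, 90)]
first_quantities = first_classification(sample_users, short_stay_seconds=10)
for category, quantity in sorted(first_quantities.items()):
    print(category, quantity)
```

The four resulting counts correspond to the quantities j, J, i and I used in the clarity and attraction degree formulas.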

In one embodiment, the analysis data obtaining module 810 is further configured to calculate the first analysis data with the first calculation formula according to the first quantity of users contained in each category.

In one embodiment, the analysis data obtaining module 810 is further configured to calculate the second analysis data with the second calculation formula according to the obtained second quantity and the number of users in the original user sample set.

In one embodiment, the data analysis result generating module 812 is further configured to obtain the first data threshold corresponding to the first analysis data and the second data threshold corresponding to the second analysis data; to generate the first data analysis result when the first analysis data is less than the first data threshold; and to generate the second data analysis result when the second analysis data is less than the second data threshold.

For the specific limitations of the data analysis device, reference may be made to the limitations of the data analysis method above, which are not repeated here. Each module in the above data analysis device may be implemented in whole or in part by software, hardware, or a combination thereof. Each of the above modules may be embedded in or independent of a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.

In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 9. The computer device includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer program implements a data analysis method. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input device of the computer device may be a touch layer covering the display screen, may be a key, a trackball or a touchpad provided on the housing of the computer device, or may be an external keyboard, touchpad or mouse.

Those skilled in the art will understand that the structure shown in FIG. 9 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the computer device to which the solution of this application is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program. When executing the computer program, the processor implements the following steps: obtaining the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag; performing first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and counting a first quantity of users contained in each category obtained by the first classification processing; performing second classification processing on the users in the target user sample set according to the third attribute tag, and counting a second quantity of users contained in each category obtained by the second classification processing; obtaining first analysis data according to the counted first quantity, and obtaining second analysis data according to the obtained second quantity and the number of users contained in the original user sample set; and generating a data analysis result according to the first analysis data and the second analysis data.

In one embodiment, when executing the computer program, the processor further implements the following steps: obtaining the raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data; and obtaining the target users other than the invalid users to generate the target user sample set.

In one embodiment, when executing the computer program, the processor further implements the following steps: obtaining the second raw data of the second attribute tag corresponding to each user in the original user sample set, and taking the users whose second raw data is less than a raw data threshold as invalid users; or obtaining the first raw data and the second raw data of the second attribute tag corresponding to each user in the original user sample set, clustering the users in the original user sample set according to the first raw data, counting the number of users in each category obtained by the clustering, taking a category whose number of users is greater than a quantity threshold as a target category, and, when the second raw data corresponding to all users in the target category falls within a data range, taking the users in the target category as invalid users.

In one embodiment, when executing the computer program, the processor further implements the following steps: classifying the users in the target user sample set according to the first attribute tag to obtain a first classification processing result; and classifying the first classification processing result according to the second attribute tag to obtain a second classification processing result. Counting the first quantity of users contained in each category obtained by the first classification processing comprises: counting the first quantity of users contained in each category of the second classification processing result.

In one embodiment, when executing the computer program, the processor further implements the following step: calculating the first analysis data with the first calculation formula according to the first quantity of users contained in each category.

In one embodiment, when executing the computer program, the processor further implements the following step: calculating the second analysis data with the second calculation formula according to the obtained second quantity and the number of users in the original user sample set.

In one embodiment, when executing the computer program, the processor further implements the following steps: obtaining the first data threshold corresponding to the first analysis data and the second data threshold corresponding to the second analysis data; generating the first data analysis result when the first analysis data is less than the first data threshold; and generating the second data analysis result when the second analysis data is less than the second data threshold.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps: obtaining the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set includes a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag; performing first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and counting a first quantity of users contained in each category obtained by the first classification processing; performing second classification processing on the users in the target user sample set according to the third attribute tag, and counting a second quantity of users contained in each category obtained by the second classification processing; obtaining first analysis data according to the counted first quantity, and obtaining second analysis data according to the obtained second quantity and the number of users contained in the original user sample set; and generating a data analysis result according to the first analysis data and the second analysis data.

In one embodiment, when executed by the processor, the computer program further implements the following steps: obtaining the raw data of the second attribute tag corresponding to each user in the original user sample set, and obtaining invalid users from the original user sample set according to the raw data; and obtaining the target users other than the invalid users to generate the target user sample set.

In one embodiment, when executed by the processor, the computer program further implements the following steps: obtaining the second raw data of the second attribute tag corresponding to each user in the original user sample set, and taking the users whose second raw data is less than a raw data threshold as invalid users; or obtaining the first raw data and the second raw data of the second attribute tag corresponding to each user in the original user sample set, clustering the users in the original user sample set according to the first raw data, counting the number of users in each category obtained by the clustering, taking a category whose number of users is greater than a quantity threshold as a target category, and, when the second raw data corresponding to all users in the target category falls within a data range, taking the users in the target category as invalid users.

In one embodiment, when executed by the processor, the computer program further implements the following steps: classifying the users in the target user sample set according to the first attribute tag to obtain a first classification processing result; and classifying the first classification processing result according to the second attribute tag to obtain a second classification processing result. Counting the first quantity of users contained in each category obtained by the first classification processing comprises: counting the first quantity of users contained in each category of the second classification processing result.

In one embodiment, when executed by the processor, the computer program further implements the following step: calculating the first analysis data with the first calculation formula according to the first quantity of users contained in each category.

In one embodiment, when executed by the processor, the computer program further implements the following step: calculating the second analysis data with the second calculation formula according to the obtained second quantity and the number of users in the original user sample set.

In one embodiment, when executed by the processor, the computer program further implements the following steps: obtaining the first data threshold corresponding to the first analysis data and the second data threshold corresponding to the second analysis data; generating the first data analysis result when the first analysis data is less than the first data threshold; and generating the second data analysis result when the second analysis data is less than the second data threshold.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, the computer program may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.

The above embodiments express only several implementations of this application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims

1. A data analysis method, characterized in that the method comprises:

obtaining the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set contains a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag;

performing first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and counting a first quantity of users included in each category obtained by the first classification processing;

performing second classification processing on the users in the target user sample set according to the third attribute tag, and counting a second quantity of users included in each category obtained by the second classification processing;

obtaining first analysis data according to the counted first quantity, and obtaining second analysis data according to the obtained second quantity and the quantity of users included in the original user sample set; and

generating a data analysis result according to the first analysis data and the second analysis data.
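The counting steps of claim 1 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the field names `source`, `period` and `region` stand in for the first, second and third attribute tags, the `active` flag marks membership in the target user sample set, and the two ratios are only one plausible choice of formula, since the claim does not fix the calculation.

```python
from collections import Counter

# Hypothetical records; "source", "period" and "region" stand in for the
# first, second and third attribute tags (names are assumptions).
original_users = [
    {"id": 1, "source": "search", "period": "morning", "region": "north", "active": True},
    {"id": 2, "source": "search", "period": "morning", "region": "south", "active": True},
    {"id": 3, "source": "ad", "period": "evening", "region": "north", "active": True},
    {"id": 4, "source": "ad", "period": "evening", "region": "north", "active": False},
]

# The target user sample set is contained in the original user sample set.
target_users = [u for u in original_users if u["active"]]


def count_by(users, *tags):
    """Count users per category; a category is the tuple of the given tag values."""
    return Counter(tuple(u[tag] for tag in tags) for u in users)


# First classification: by the first and second attribute tags.
first_quantity = count_by(target_users, "source", "period")
# Second classification: by the third attribute tag.
second_quantity = count_by(target_users, "region")

# Assumed formulas: each category's share of the target set, and each
# category's share of the whole original sample set.
first_analysis = {k: n / len(target_users) for k, n in first_quantity.items()}
second_analysis = {k: n / len(original_users) for k, n in second_quantity.items()}
```

Keying each category by a tuple of tag values lets the same helper serve both the two-tag and the one-tag classification.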

2. The method according to claim 1, characterized in that, after obtaining the attribute tags corresponding to each user in the original user sample set, the method further comprises:

obtaining the initial data of the second attribute tag corresponding to each user in the original user sample set, and obtaining inactive users from the original user sample set according to the initial data; and

obtaining the target users other than the inactive users to generate the target user sample set.

3. The method according to claim 2, characterized in that the initial data includes first initial data and second initial data, wherein the first initial data indicates the moment at which the user starts operating the webpage, and the second initial data indicates the duration for which the user operates the webpage;

obtaining the initial data of the second attribute tag corresponding to each user in the original user sample set and obtaining inactive users from the original user sample set according to the initial data comprises at least one of the following manners:

obtaining the second initial data of the second attribute tag corresponding to each user in the original user sample set, and taking the users whose second initial data is less than an initial data threshold as inactive users;

obtaining the first initial data and the second initial data of the second attribute tag corresponding to each user in the original user sample set, clustering the users in the original user sample set according to the first initial data, counting the number of users in each category obtained by the clustering, taking each category whose number of users is greater than a quantity threshold as a target category, and, when the second initial data corresponding to all users in the target category fall within a data range, taking the users in the target category as inactive users.
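The two screening manners of claim 3 can be sketched as follows. This is a sketch under stated assumptions, not the applicant's implementation: the record fields, all four constants and the sample data are hypothetical, and fixed time windows are used as a simple stand-in for the clustering step, which the claim does not tie to any particular algorithm.

```python
from collections import defaultdict

# Hypothetical records: start moment and operating duration, both in seconds.
users = [
    {"id": 1, "start": 200, "duration": 2},
    {"id": 2, "start": 1001, "duration": 31},
    {"id": 3, "start": 1002, "duration": 30},
    {"id": 4, "start": 1003, "duration": 32},
    {"id": 7, "start": 1004, "duration": 30},
    {"id": 5, "start": 5000, "duration": 400},
    {"id": 6, "start": 9000, "duration": 120},
]

DURATION_THRESHOLD = 5     # assumed initial data threshold
WINDOW_SECONDS = 60        # assumed: fixed windows stand in for clustering
COUNT_THRESHOLD = 3        # assumed quantity threshold
DURATION_RANGE = (25, 35)  # assumed data range for near-identical durations


def inactive_by_duration(users):
    """Manner 1: users whose operating duration is below the threshold."""
    return {u["id"] for u in users if u["duration"] < DURATION_THRESHOLD}


def inactive_by_cluster(users):
    """Manner 2: large start-time clusters whose durations all fall in one range."""
    clusters = defaultdict(list)
    for u in users:
        clusters[u["start"] // WINDOW_SECONDS].append(u)
    low, high = DURATION_RANGE
    inactive = set()
    for members in clusters.values():
        is_target_category = len(members) > COUNT_THRESHOLD
        if is_target_category and all(low <= m["duration"] <= high for m in members):
            inactive.update(m["id"] for m in members)
    return inactive


inactive_ids = inactive_by_duration(users) | inactive_by_cluster(users)
# Claim 2: the target user sample set is the original set minus inactive users.
target_users = [u for u in users if u["id"] not in inactive_ids]
```

In this data, user 1 is screened out by the duration threshold, and users 2, 3, 4 and 7 are screened out because they start within the same window and operate the page for nearly the same length of time.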

4. The method according to claim 1, characterized in that performing first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag comprises:

performing classification processing on the users in the target user sample set according to the first attribute tag to obtain a first classification processing result;

performing classification processing on the first classification processing result according to the second attribute tag to obtain a second classification processing result;

counting the first quantity of users included in each category obtained by the first classification processing comprises:

counting the first quantity of users included in each category of the second classification processing result.
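The two-stage classification of claim 4 can be sketched as nested grouping: group by the first attribute tag, then split each group by the second. The field names `source` and `period` and the sample records are hypothetical.

```python
from collections import defaultdict


def classify(users, tag):
    """Group users by the value of one attribute tag."""
    groups = defaultdict(list)
    for u in users:
        groups[u[tag]].append(u)
    return dict(groups)


# Hypothetical target user sample set.
target_users = [
    {"id": 1, "source": "search", "period": "morning"},
    {"id": 2, "source": "search", "period": "morning"},
    {"id": 3, "source": "search", "period": "evening"},
    {"id": 4, "source": "ad", "period": "evening"},
]

# Stage 1: classify by the first attribute tag.
first_result = classify(target_users, "source")
# Stage 2: classify each stage-1 category by the second attribute tag.
second_result = {
    source: classify(members, "period") for source, members in first_result.items()
}
# First quantity: number of users in each final category.
first_quantity = {
    (source, period): len(members)
    for source, by_period in second_result.items()
    for period, members in by_period.items()
}
```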

5. The method according to claim 1, characterized in that obtaining the first analysis data according to the counted first quantity comprises:

calculating the first analysis data by a first calculation formula according to the first quantity of users included in each category.

6. The method according to claim 1, characterized in that obtaining the second analysis data according to the obtained second quantity and the quantity of users included in the original user sample set comprises:

calculating the second analysis data by a second calculation formula according to the obtained second quantity and the quantity of users in the original user sample set.

7. The method according to any one of claims 1 to 6, characterized in that the data analysis result includes a first data analysis result and a second data analysis result, and generating the data analysis result according to the first analysis data and the second analysis data comprises:

obtaining a first data threshold corresponding to the first analysis data, and obtaining a second data threshold corresponding to the second analysis data;

generating the first data analysis result when the first analysis data is less than the first data threshold;

generating the second data analysis result when the second analysis data is less than the second data threshold.
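The threshold comparison of claim 7 reduces to two independent checks. In this minimal sketch the function name, the scalar inputs and the result strings are all hypothetical; the claim does not state what a data analysis result contains.

```python
def generate_results(first_analysis, second_analysis, first_threshold, second_threshold):
    """Return the analysis results triggered by values below their thresholds."""
    results = []
    if first_analysis < first_threshold:
        results.append("first data analysis result")
    if second_analysis < second_threshold:
        results.append("second data analysis result")
    return results


# Only the first value is below its threshold here.
example = generate_results(0.1, 0.9, first_threshold=0.3, second_threshold=0.5)
```

Because the two checks are independent, either result, both, or neither may be generated for a given pair of values.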

8. A data analysis device, characterized in that the device comprises:

an attribute tag obtaining module, configured to obtain the attribute tags corresponding to each user in an original user sample set, wherein the original user sample set contains a target user sample set, the attribute tags are generated while the user operates a webpage, and the attribute tags include a first attribute tag, a second attribute tag and a third attribute tag;

a first classification processing module, configured to perform first classification processing on the users in the target user sample set according to the first attribute tag and the second attribute tag, and to count a first quantity of users included in each category obtained by the first classification processing;

a second classification processing module, configured to perform second classification processing on the users in the target user sample set according to the third attribute tag, and to count a second quantity of users included in each category obtained by the second classification processing;

an analysis data obtaining module, configured to obtain first analysis data according to the counted first quantity, and to obtain second analysis data according to the obtained second quantity and the quantity of users included in the original user sample set;

a data analysis result generation module, configured to generate a data analysis result according to the first analysis data and the second analysis data.

9. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.

10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.