CN109190035A - ID data network data analysis method, device and calculating equipment - Google Patents
Publication number: CN109190035A
Application number: CN201810973801.4A
Authority: CN (China)
Prior art keywords: data, relation, subnet, relationship, network
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention discloses an ID data network data analysis method and device, a computing device, and a computer storage medium. The ID data network data analysis method comprises: obtaining an ID data network containing ID data and the association relationships between the ID data, where the ID data comprise user ID data and/or device ID data; constructing ID relation data from the ID data and the association relationships contained in the ID data network, where the ID relation data comprise several ID relationship pairs; and comparing and combining the ID relation data to obtain several ID data subnets. This technical solution effectively improves the efficiency of ID data network data analysis and obtains the ID data subnets accurately and quickly, thereby dividing the ID data network effectively. Compared with the ID data network as a whole, the ID data contained in an ID data subnet have stronger and more reliable association relationships and can be identified as the ID data of the same user, which helps to build a complete and effective user profile.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to an ID data network data analysis method and device, a computing device, and a computer storage medium.
Background
To meet users' varied needs, many services have been developed for users to choose from, such as web browsing, shopping, food ordering, train ticket booking, and payment. A service can assign ID data to a user, based on the user's account in the service or the device the user uses, in order to identify the user. An ID data network can be constructed from the ID data of multiple services. Based on the ID data network, user features such as gender, age, browsing preferences, click preferences, activity level, purchase preferences, purchase potential, and game preferences can be analyzed to build a complete and effective user profile, which in turn enables accurate recommendation of news, games, advertisements, and the like. However, the ID data of multiple services are numerous, the association relationships between ID data are complex, the volume of data to process is large, and different services set ID data according to different rules. As a result, the ID data that correspond to the same user cannot be identified accurately and quickly from the large amount of ID data contained in the ID data network.
Summary of the invention
In view of the above problems, the present invention is proposed to provide an ID data network data analysis method and device, a computing device, and a computer storage medium that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, an ID data network data analysis method is provided. The method comprises: obtaining an ID data network containing ID data and the association relationships between the ID data, where the ID data comprise user ID data and/or device ID data; constructing ID relation data from the ID data and the association relationships contained in the ID data network, where the ID relation data comprise several ID relationship pairs; and comparing and combining the ID relation data to obtain several ID data subnets.
Further, comparing and combining the ID relation data to obtain several ID data subnets comprises: copying the ID relation data in full into memory; comparing and combining the ID relation data with the full copy of the ID relation data in memory; and integrating the data according to the comparison and combination results to obtain several ID data subnets.
Further, comparing and combining the ID relation data with the full copy in memory and integrating the data according to the comparison and combination results comprises: dividing the ID relation data into multiple fragments; comparing and combining the multiple fragments concurrently with the full copy of the ID relation data in memory to obtain the comparison and combination results of all fragments; and integrating the comparison and combination results of all fragments to obtain several ID data subnets.
Further, comparing and combining the multiple fragments concurrently with the full copy in memory to obtain the comparison and combination results of all fragments comprises: for each fragment, comparing and combining the fragment with the full copy of the ID relation data in memory to obtain an intermediate comparison and combination result for that fragment; iteratively executing the following step until a preset iteration condition is met: dividing the intermediate results of all fragments into multiple intermediate sub-fragments, and comparing and combining the intermediate sub-fragments concurrently with the full copy of the ID relation data in memory to obtain the intermediate results of all fragments for the next iteration; and, after the iteration ends, obtaining the comparison and combination results of all fragments.
Further, the preset iteration condition comprises: the number of iterations reaches a preset number of iterations.
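The summary leaves the comparison and combination step abstract at this point. Purely as an illustration, the sketch below reads it as min-label propagation over ID relationship pairs: the full set of pairs is held in memory as a lookup table, the IDs are split into fragments, each fragment is compared against the in-memory copy, and the process repeats up to a preset number of iterations. The function names, the label rule, the early stop, and the sequential loop standing in for concurrent workers are all assumptions and are not taken from the patent text.

```python
from collections import defaultdict

def build_full_copy(pairs):
    """Full in-memory copy of the ID relation data: ID -> directly related IDs."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def split_into_fragments(items, num_fragments):
    """Divide a list into num_fragments roughly equal fragments."""
    return [items[i::num_fragments] for i in range(num_fragments)]

def compare_and_combine(pairs, num_fragments=2, max_iterations=10):
    full_copy = build_full_copy(pairs)            # shared, read-only lookup table
    labels = {node: node for node in full_copy}   # every ID starts in its own group
    for _ in range(max_iterations):               # preset number of iterations
        new_labels = dict(labels)
        # Each fragment could go to a separate worker; here they run in turn.
        for fragment in split_into_fragments(sorted(labels), num_fragments):
            for node in fragment:
                candidates = [labels[n] for n in full_copy[node]]
                candidates.append(labels[node])
                new_labels[node] = min(candidates)
        if new_labels == labels:                  # assumed shortcut: nothing changed
            break
        labels = new_labels
    # Data integration: IDs that ended with the same label form one subnet.
    subnets = defaultdict(set)
    for node, label in labels.items():
        subnets[label].add(node)
    return sorted(sorted(members) for members in subnets.values())

pairs = [("a1", "b1"), ("a2", "b2"), ("a2", "c2")]
subnets = compare_and_combine(pairs)
```

With a fixed iteration budget, a long chain of associations may not be fully merged, which is one reason the preset number of iterations would need tuning against the data.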
According to another aspect of the present invention, an ID data network data analysis device is provided. The device comprises: an obtaining module, adapted to obtain an ID data network containing ID data and the association relationships between the ID data, where the ID data comprise user ID data and/or device ID data; a first construction module, adapted to construct ID relation data from the ID data and the association relationships contained in the ID data network, where the ID relation data comprise several ID relationship pairs; and a comparison and combination module, adapted to compare and combine the ID relation data to obtain several ID data subnets.
Further, the comparison and combination module is further adapted to: copy the ID relation data in full into memory; compare and combine the ID relation data with the full copy of the ID relation data in memory; and integrate the data according to the comparison and combination results to obtain several ID data subnets.
Further, the comparison and combination module is further adapted to: divide the ID relation data into multiple fragments; compare and combine the multiple fragments concurrently with the full copy of the ID relation data in memory to obtain the comparison and combination results of all fragments; and integrate the comparison and combination results of all fragments to obtain several ID data subnets.
Further, the comparison and combination module is further adapted to: for each fragment, compare and combine the fragment with the full copy of the ID relation data in memory to obtain an intermediate comparison and combination result for that fragment; iteratively execute the following step until a preset iteration condition is met: divide the intermediate results of all fragments into multiple intermediate sub-fragments, and compare and combine the intermediate sub-fragments concurrently with the full copy of the ID relation data in memory to obtain the intermediate results of all fragments for the next iteration; and, after the iteration ends, obtain the comparison and combination results of all fragments.
Further, the preset iteration condition comprises: the number of iterations reaches a preset number of iterations.
According to another aspect of the present invention, a computing device is provided, comprising a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another through the communication bus.
The memory stores at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the above ID data network data analysis method.
According to a further aspect of the present invention, a computer storage medium is provided. The storage medium stores at least one executable instruction, and the executable instruction causes a processor to perform the operations corresponding to the above ID data network data analysis method.
According to the technical solution provided by the present invention, ID relation data can be constructed from the ID data and the association relationships contained in the ID data network, and the ID relation data can be compared and combined to obtain several ID data subnets accurately and quickly. Compared with the ID data network as a whole, the ID data contained in an ID data subnet have stronger and more reliable association relationships and can be identified as the ID data of the same user. In addition, the data volume of an ID data subnet is far smaller than that of the ID data network, so user features can be analyzed accurately and quickly on the basis of the ID data subnets to build a complete and effective user profile and to enable accurate recommendation of news, games, advertisements, and the like.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the present invention are easier to understand, specific embodiments of the present invention are set out below.
Description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow diagram of an ID data network processing method according to one embodiment of the present invention;
Fig. 2a shows a flow diagram of an ID data network processing method according to another embodiment of the present invention;
Fig. 2b shows a schematic diagram of an ID data network;
Fig. 3 shows a flow diagram of an ID data network pruning preprocessing method according to one embodiment of the present invention;
Fig. 4 shows a flow diagram of an ID data network data analysis method according to one embodiment of the present invention;
Fig. 5a shows a flow diagram of an ID data network data analysis method according to another embodiment of the present invention;
Fig. 5b shows a schematic diagram of processing ID relationship pairs into directed forward order and directed reverse order;
Fig. 6 shows a flow diagram of an ID data subnet processing method according to one embodiment of the present invention;
Fig. 7 shows a structural block diagram of an ID data network processing device according to one embodiment of the present invention;
Fig. 8 shows a structural block diagram of an ID data network pruning preprocessing device according to one embodiment of the present invention;
Fig. 9 shows a structural block diagram of an ID data network data analysis device according to one embodiment of the present invention;
Fig. 10 shows a structural block diagram of an ID data network data analysis device according to another embodiment of the present invention;
Fig. 11 shows a structural block diagram of an ID data subnet processing device according to one embodiment of the present invention;
Fig. 12 shows a structural schematic diagram of a computing device according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a flow diagram of an ID data network processing method according to one embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S100: obtain an ID data network containing ID data and the association relationships between the ID data.
A pre-built ID data network can be obtained from a data analysis system or similar source. The ID data network can be constructed from the log data of multiple services and contains ID data and the association relationships between the ID data. ID data are data used to identify a user, and can comprise user ID data and/or device ID data. Association relationships exist between ID data, and they include direct association relationships and indirect association relationships.
Specifically, user ID data are a user's account data in a service, such as a mobile phone number, WeChat ID, QQ number, or browser ID. For example, suppose a user has logged in to the WeChat application and the QQ application with mobile phone number "189****2677", the user's WeChat ID in the WeChat application is "wxid_1", and the user's QQ number in the QQ application is "12345". Then mobile phone number "189****2677" has a direct association relationship with WeChat ID "wxid_1", and it also has a direct association relationship with QQ number "12345".
Device ID data are data that identify the device a user uses when accessing a service, such as the MD5 value of a mobile device's device number, the MD5 value of a mobile device's device number + system program version number + handset serial number, 32 digits of the MD5 value of a mobile device's MAC address, or 44 digits of the MD5 value of a mobile device's MAC address. Different services set device ID data according to different rules. If multiple user ID data have used the same service on the same device, the device ID data that the service assigns to that device has an association relationship with each of those user ID data. For example, if WeChat ID "wxid_1" and WeChat ID "wxid_2" have both logged in to the WeChat application on a certain mobile phone, and the WeChat application marks the device ID data of that phone as "m1", then device ID data "m1" has a direct association relationship with WeChat ID "wxid_1", and it also has a direct association relationship with WeChat ID "wxid_2".
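As a small illustration of the MD5-based device IDs mentioned above, the sketch below hashes a device number on its own and in combination with a system version and a serial number. The raw values and the plain string concatenation are hypothetical; each service would define its own fields and formatting.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 digest of a string as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical raw values, for illustration only.
device_number = "860000000000001"
system_version = "8.1.0"
serial_number = "SN123456"

# Device ID from the device number alone.
id_from_device_number = md5_hex(device_number)

# Device ID from device number + system version + serial number.
id_from_combined = md5_hex(device_number + system_version + serial_number)
```

Because two services may hash different inputs for the same physical device, the resulting device IDs differ, which is why the association relationships between IDs are needed to link them.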
Step S101: perform data analysis on the ID data network to obtain several ID data subnets.
By analyzing the ID data and the association relationships contained in the ID data network, the ID data network is divided into several ID data subnets. The ID data subnets can then be assigned to n ID data subnet sets according to the number of ID data each subnet contains, where n is a natural number greater than 0. Subnets in different ID data subnet sets contain different numbers of ID data. For example, suppose the ID data subnets comprise 200 subnets that each contain 2 ID data, 300 subnets that each contain 3 ID data, and 100 subnets that each contain 4 ID data. These subnets can be assigned to 3 ID data subnet sets by size: the 200 subnets containing 2 ID data go into the first set, the 300 subnets containing 3 ID data go into the second set, and the 100 subnets containing 4 ID data go into the third set.
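The grouping of subnets into subnet sets by size can be sketched as follows. The data is synthetic and mirrors the counts in the example above; the function name and the representation of a subnet as a Python set are assumptions made for illustration.

```python
from collections import defaultdict

def group_subnets_by_size(subnets):
    """Map each subnet size to the list of subnets containing that many IDs."""
    subnet_sets = defaultdict(list)
    for subnet in subnets:
        subnet_sets[len(subnet)].append(subnet)
    return dict(subnet_sets)

# Synthetic data: 200 subnets of 2 IDs, 300 subnets of 3 IDs, 100 subnets of 4 IDs.
subnets = []
for size, count in [(2, 200), (3, 300), (4, 100)]:
    for i in range(count):
        subnets.append({f"s{size}_{i}_{k}" for k in range(size)})

subnet_sets = group_subnets_by_size(subnets)

# Number of subnets in each set, keyed by subnet size.
set_sizes = {size: len(members) for size, members in subnet_sets.items()}
```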
Compared with the ID data network as a whole, the ID data contained in an ID data subnet have stronger and more reliable association relationships, so the ID data in one subnet can be identified as the ID data of the same user. The number of ID data in a subnet is also far smaller than the number in the ID data network, and the data volume of a subnet is far smaller than that of the network. Based on the ID data subnets, user features such as gender, age, browsing preferences, click preferences, activity level, purchase preferences, purchase potential, and game preferences can be analyzed accurately and quickly to build a complete and effective user profile.
With the ID data network processing method provided in this embodiment, data analysis of the ID data and the association relationships contained in the ID data network quickly divides the network into several ID data subnets. Compared with the ID data network as a whole, the ID data contained in an ID data subnet have stronger and more reliable association relationships and can be identified as the ID data of the same user. The data volume of an ID data subnet is also far smaller than that of the ID data network, so user features can be analyzed accurately and quickly on the basis of the subnets to build a complete and effective user profile and to enable accurate recommendation of news, games, advertisements, and the like.
Fig. 2a shows a flow diagram of an ID data network processing method according to another embodiment of the present invention. As shown in Fig. 2a, the method comprises the following steps:
Step S200: perform data analysis on the log data of multiple services to determine ID data and the association relationships between the ID data.
The log data of multiple services are obtained first. The log data may be uploaded actively by the services or obtained by sending requests to them. The log data of a service can record an ID data that used the service together with other ID data, which indicates that an association relationship exists between that ID data and the other ID data. By analyzing the log data of multiple services, the ID data and the association relationships between them can be determined. Specifically, the ID data can comprise user ID data and/or device ID data.
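A minimal sketch of this step, assuming a hypothetical log format in which each record lists the IDs observed together in one event. Real service logs will look different; the field names `business` and `ids` are invented for illustration.

```python
from itertools import combinations

# Hypothetical log records: each lists the IDs seen together in one event.
log_records = [
    {"business": "wechat", "ids": ["189****2677", "wxid_1"]},
    {"business": "qq", "ids": ["189****2677", "12345"]},
    {"business": "wechat", "ids": ["m1", "wxid_1"]},
    {"business": "wechat", "ids": ["m1", "wxid_2"]},
    {"business": "wechat", "ids": ["m1", "wxid_1"]},  # repeated event
]

def extract_associations(records):
    """Return the set of unordered ID pairs that co-occur in any record."""
    pairs = set()
    for record in records:
        # Sorting gives each pair one canonical order, so (a, b) == (b, a).
        for a, b in combinations(sorted(set(record["ids"])), 2):
            pairs.add((a, b))
    return pairs

associations = extract_associations(log_records)
```

Storing each pair in a canonical order keeps repeated or reversed observations from creating duplicate association relationships.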
Step S201: take the ID data as nodes, determine the connections between nodes according to the association relationships between the ID data, and construct the ID data network.
Once the ID data and the association relationships between them have been determined, the ID data network can be constructed from them. Specifically, each ID data is taken as a node, and the connections between nodes are determined by the association relationships between the ID data. The resulting ID data network contains the ID data and the association relationships between them, and shows those relationships clearly.
Suppose the determined ID data include "a1", "b1", "a2", "b2", "c2", "a3", "b3", "c3", "d3", "a4", "b4", "c4", "d4", "e4", "f4", "g4", and "h4", and that the following pairs of ID data have direct association relationships: "a1" and "b1"; "a2" and "b2"; "a2" and "c2"; "a3" and "b3"; "a3" and "c3"; "c3" and "d3"; "a4" and "b4"; "a4" and "c4"; "a4" and "f4"; "b4" and "d4"; "b4" and "e4"; "b4" and "h4"; and "e4" and "g4". Then indirect association relationships exist between "b2" and "c2", between "b3" and "c3", between "a3" and "d3", and so on. ID data "a1" through "h4" are taken as nodes a1 through h4 of the ID data network, and the nodes are connected according to the association relationships between the ID data: node a1 is connected to node b1; node a2 is connected to nodes b2 and c2; node a3 is connected to nodes b3 and c3; node c3 is connected to node d3; node a4 is connected to nodes b4, c4, and f4; node b4 is connected to nodes d4, e4, and h4; and node e4 is connected to node g4. The resulting ID data network 210 can be as shown in Fig. 2b.
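The network in this example can be written down as an adjacency map, which also makes the distinction between direct and indirect association relationships concrete. The helper names are illustrative; the edge list is taken from the example above.

```python
from collections import defaultdict

# Direct association relationships listed in the example.
edges = [
    ("a1", "b1"),
    ("a2", "b2"), ("a2", "c2"),
    ("a3", "b3"), ("a3", "c3"), ("c3", "d3"),
    ("a4", "b4"), ("a4", "c4"), ("a4", "f4"),
    ("b4", "d4"), ("b4", "e4"), ("b4", "h4"), ("e4", "g4"),
]

def build_network(edges):
    """Adjacency map: node -> set of directly associated nodes."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return network

def reachable(network, start):
    """All nodes connected to start through a chain of associations."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in network[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def relation(network, a, b):
    """Classify the relationship between two nodes already in the network."""
    if b in network[a]:
        return "direct"
    if b in reachable(network, a):
        return "indirect"
    return "none"

network = build_network(edges)
```

For instance, `relation(network, "a2", "b2")` is direct, while `relation(network, "b2", "c2")` is indirect because the two nodes are linked only through a2.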
Step S202: obtain the ID data network containing ID data and the association relationships between the ID data.
After the ID data network has been constructed, it is obtained so that pruning preprocessing, data analysis, and other processing can be performed on it.
Step S203: perform pruning preprocessing on the ID data network to obtain the pruned ID data network.
The ID data network can be pruned according to, for example, the association frequency between ID data and the number of other ID data directly associated with a given ID data. Specifically, the association relationships between some ID data and the other ID data directly associated with them can be removed. This pruning effectively eliminates unreliable association relationships between ID data in the network, which both improves the accuracy of ID data network processing and reduces the amount of data in the subsequent data analysis.
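One way to picture such pruning is sketched below. It assumes two hypothetical rules: an association is dropped if its association frequency is below `min_frequency`, or if either endpoint is directly associated with more than `max_degree` other IDs. The rule directions and threshold values are assumptions for illustration, not values taken from the patent.

```python
from collections import Counter

def prune(edge_frequency, min_frequency, max_degree):
    """Keep only associations that pass both hypothetical reliability checks.

    edge_frequency maps an (id_a, id_b) pair to its association frequency.
    """
    # Number of directly associated IDs for every ID.
    degree = Counter()
    for a, b in edge_frequency:
        degree[a] += 1
        degree[b] += 1

    kept = {}
    for (a, b), freq in edge_frequency.items():
        if freq < min_frequency:
            continue  # association observed too rarely
        if degree[a] > max_degree or degree[b] > max_degree:
            continue  # an endpoint is linked to too many other IDs
        kept[(a, b)] = freq
    return kept

edge_frequency = {
    ("m1", "wxid_1"): 40,
    ("m1", "wxid_2"): 2,
    ("hub", "u1"): 30, ("hub", "u2"): 30, ("hub", "u3"): 30, ("hub", "u4"): 30,
}
pruned = prune(edge_frequency, min_frequency=5, max_degree=3)
```

In this toy input, the rarely seen `("m1", "wxid_2")` association and all associations of the heavily connected `hub` ID are removed.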
Step S204: perform data analysis on the pruned ID data network to obtain several ID data subnets.
After the pruned ID data network has been obtained, the ID data and the association relationships it contains can be analyzed to divide it into several ID data subnets. The subnets can be assigned to n ID data subnet sets according to the number of ID data each subnet contains, where n is a natural number greater than 0. Subnets in different ID data subnet sets contain different numbers of ID data. Compared with the ID data network as a whole, the ID data contained in an ID data subnet have stronger and more reliable association relationships.
Step S205: for any ID data subnet in which the number of ID data exceeds a first preset quantity threshold, cluster and divide the ID data in that subnet to obtain several third ID data subnets corresponding to it.
After the data analysis of step S204, the resulting ID data subnets may still include subnets that contain a large number of ID data. Although the ID data in such subnets are strongly associated, they may not all belong to the same user. If they were identified as the ID data of one user, the user features derived from these subnets would not reflect the user's actual situation effectively or truthfully. To further improve the reliability of these ID data subnets, they need further processing, for example clustering and dividing.
Specifically, a first preset quantity threshold and a second preset quantity threshold can be set in advance. For any one of the ID data subnets in which the number of ID data exceeds the first preset quantity threshold, the ID data in that subnet are clustered and divided to obtain several third ID data subnets corresponding to it, so that ID data in the subnet with stronger and more reliable association relationships are clustered together and placed in the same third ID data subnet. The number of ID data in a third ID data subnet is less than or equal to the second preset quantity threshold. Compared with a subnet whose number of ID data exceeds the first preset quantity threshold, the ID data in a third ID data subnet have stronger and more reliable association relationships and can be identified as the ID data of the same user, so user features can be analyzed accurately and efficiently on the basis of the third ID data subnets to build a complete and effective user profile. The data volume of a third ID data subnet is also far smaller than that of a subnet exceeding the first preset quantity threshold, which makes user feature analysis easier and improves analysis efficiency.
Those skilled in the art can set the first preset quantity threshold and the second preset quantity threshold according to actual needs; no limitation is imposed here. For example, the first preset quantity threshold can be set to 50 and the second to 10. Then every ID data subnet that contains more than 50 ID data must have its ID data clustered and divided, so that the subnet is split into several third ID data subnets that each contain at most 10 ID data.
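The two thresholds can be illustrated with the sketch below. The text at this point does not specify the clustering algorithm, so the code substitutes a simple stand-in: it walks a large subnet in breadth-first order and cuts it into groups of at most `second_threshold` IDs. This stand-in, the function names, and the adjacency-map representation are assumptions; a real implementation would use an actual clustering method.

```python
from collections import deque

def split_large_subnet(adjacency, max_size):
    """Split one subnet into groups of at most max_size IDs, in BFS order.

    A stand-in for clustering: IDs that are close in the graph tend to land
    in the same group, but no similarity measure is used.
    """
    groups, current, seen = [], [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            current.append(node)
            if len(current) == max_size:
                groups.append(current)
                current = []
            for nxt in sorted(adjacency[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    if current:
        groups.append(current)
    return groups

def refine(subnets, first_threshold=50, second_threshold=10):
    """Split only the subnets that contain more than first_threshold IDs."""
    result = []
    for adjacency in subnets:
        if len(adjacency) > first_threshold:
            result.extend(split_large_subnet(adjacency, second_threshold))
        else:
            result.append(sorted(adjacency))
    return result

# A chain of 55 IDs (too large) and a subnet of 3 IDs (left unchanged).
chain = {f"n{i:02d}": set() for i in range(55)}
for i in range(54):
    chain[f"n{i:02d}"].add(f"n{i + 1:02d}")
    chain[f"n{i + 1:02d}"].add(f"n{i:02d}")
small = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}

third_subnets = refine([chain, small])
```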
With the ID data network processing method provided in this embodiment, the ID data network can be constructed quickly by analyzing the log data of multiple services. Pruning preprocessing of the ID data network effectively and quickly eliminates unreliable association relationships between ID data, which both improves the accuracy of ID data network processing and reduces the amount of data to analyze. In addition, data analysis of the ID data and the association relationships contained in the ID data network quickly divides the network into several ID data subnets. The ID data contained in a subnet have stronger and more reliable association relationships and can be identified as the ID data of the same user, so user features can be analyzed accurately and quickly on the basis of the subnets to build a complete and effective user profile.
The present invention also provides an ID data network pruning preprocessing method, comprising: obtaining an ID data network containing ID data and the association relationships between the ID data; and performing pruning preprocessing on the ID data network to obtain the pruned ID data network. The ID data comprise user ID data and/or device ID data. The pruning preprocessing method is described below through the specific embodiment shown in Fig. 3.
Fig. 3 shows a flow diagram of an ID data network pruning preprocessing method according to one embodiment of the present invention. As shown in Fig. 3, the method comprises the following steps:
Step S300: obtain an ID data network containing ID data and the association relationships between the ID data.
For this step, refer to the description of step S100 in the embodiment shown in Fig. 1; the details are not repeated here.
Step S301: perform data analysis on the log data of multiple services to obtain the association frequency between ID data.
The log data of a service can record an ID data that used the service together with other ID data, which indicates that an association relationship exists between that ID data and the other ID data. By analyzing the log data of multiple services, not only the ID data and the association relationships between them but also the association frequency between ID data can be determined.
Specifically, data analysis is performed on the log data of multiple services to calculate the actual association frequency between ID data. In practice, the actual association frequency can be calculated per preset unit of time. Taking a day as the preset unit of time, if analysis of the log data shows that one ID data and another ID data had an association relationship on 50 days, the actual association frequency between the two ID data is recorded as 50. The actual association frequency between each ID data and the other ID data is calculated in this way.
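Counting the actual association frequency per day can be sketched as follows, assuming a hypothetical log format in which each entry records a date and two IDs observed together on that date.

```python
from collections import defaultdict
from datetime import date

# Hypothetical log entries: (day, id_a, id_b) means the two IDs were seen together.
log_entries = [
    (date(2018, 8, 1), "m1", "wxid_1"),
    (date(2018, 8, 1), "m1", "wxid_1"),   # same day again: still counts once
    (date(2018, 8, 2), "m1", "wxid_1"),
    (date(2018, 8, 2), "m1", "wxid_2"),
]

def actual_association_frequency(entries):
    """For each ID pair, count the distinct days on which an association occurred."""
    days_by_pair = defaultdict(set)
    for day, a, b in entries:
        days_by_pair[tuple(sorted((a, b)))].add(day)
    return {pair: len(days) for pair, days in days_by_pair.items()}

frequency = actual_association_frequency(log_entries)
```

Using a set of days per pair is what makes the unit of counting a day rather than a log line.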
In practice, it also happens that multiple users use the same service on the same device at different times. The user ID data of these users all have association relationships with the device ID data of that device, but the actual association frequencies cannot truthfully indicate which user actually corresponds to the device in the current period. For example, suppose two users have used the 360 Security Guard application on the same mobile phone, so that the 360 accounts of both users have association relationships with the device ID data of the phone. Suppose the log data of the 360 Security Guard application show that the first 360 account frequently logged in to the application on this phone a year ago, giving an actual association frequency of 100 between the first 360 account and the device ID data of the phone, but that the first account stopped logging in on this phone half a year ago. Instead, the second 360 account has frequently logged in to the application on this phone since half a year ago, giving an actual association frequency of 50 between the second 360 account and the device ID data of the phone. Although the actual association frequency of the first 360 account is higher than that of the second, the log data of the first account date from a year ago and are far from the current time. The user of the second 360 account is clearly the one who actually corresponds to the phone in the current period, so the actual association frequency alone cannot truthfully indicate the user who currently corresponds to the phone.
To solve the above problem, the present invention introduces a corresponding time weight for the log data corresponding to the ID data, and calculates the association frequency between ID data according to the actual association frequency between the ID data, the time information of the log data corresponding to the ID data, and the time weight. The size of the time weight corresponding to the log data depends on how far the log data is from the current time: the closer the time information of the log data is to the current time, the larger the time weight; the farther it is from the current time, the smaller the time weight. The actual association frequency between ID data is attenuated by the time weight, and the value obtained after attenuation is taken as the association frequency between the ID data. The association frequency obtained in this way accurately reflects the true degree of association between ID data in the current period, has high reference value, and helps to perform pruning preprocessing on the ID data network precisely.
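The attenuation described above can be illustrated with a minimal Python sketch. The text does not specify the form of the time weight, so the exponential half-life decay, the function names `time_weight` and `decayed_association_frequency`, the `half_life_days` parameter, and the sample dates are all assumptions made only for illustration.

```python
from datetime import datetime, timedelta

def time_weight(log_time, now, half_life_days=180.0):
    """Assumed weight in (0, 1]: 1.0 for a log entry at `now`, halving every half_life_days."""
    age_days = max((now - log_time).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)

def decayed_association_frequency(log_buckets, now, half_life_days=180.0):
    """log_buckets: iterable of (actual_frequency, log_time) for one pair of ID data.

    Each actual frequency is attenuated by the time weight of its log data,
    and the attenuated values are summed into the association frequency.
    """
    return sum(count * time_weight(t, now, half_life_days) for count, t in log_buckets)

# Hypothetical data modelled on the two 360 accounts in the example above.
now = datetime(2018, 8, 24)
first_account = decayed_association_frequency([(100, now - timedelta(days=365))], now)
second_account = decayed_association_frequency([(50, now - timedelta(days=30))], now)
print(first_account < second_account)  # True
```

Under these assumed weights, the older log data of the first account is attenuated more strongly, so the second account ends up with the higher association frequency even though its actual association frequency is lower.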
Step S302, for any ID data in the ID data network, perform pruning preprocessing on the association relationships between the ID data and other ID data according to the number of other ID data directly associated with the ID data and/or the association frequency between the ID data and the other ID data.
Through repeated data analysis, the present invention sets a pruning rule and the thresholds defined in the pruning rule. The pruning rule includes the following. For any ID data in the ID data network, if the number of other ID data directly associated with the ID data is greater than a first threshold and the association frequency between the ID data and any other ID data is less than or equal to a second threshold, the association relationship between the ID data and that other ID data is removed. If the number of other ID data directly associated with the ID data is greater than a third threshold and the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to a fourth threshold, the association relationships between the ID data and each of the other ID data are removed. If the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to a fifth threshold, the association relationships between the ID data and each of the other ID data are removed. In all cases other than these three, the association relationships between the ID data and each of the other ID data are retained and not removed. The present invention specifies that the corresponding association relationship is removed as long as any one of the above three removal cases is met.
To make it easy to judge whether any ID data in the ID data network meets the above pruning rule, an intermediate subnet centered on each ID data in the ID data network can first be constructed. Specifically, ID relation data is constructed according to the ID data contained in the ID data network and the association relationships between the ID data. The ID relation data includes several ID relationship pairs, and each ID relationship pair contains two IDs and the relationship between the two IDs. For example, if ID data "a1" and ID data "b1" have a direct association relationship, the corresponding ID relationship pair is (a1, b1), where a1 and b1 are the two IDs contained in the pair and the parentheses () indicate that a relationship exists between the two IDs. All ID relationship pairs are then grouped using the group-by-primary-key-ID method, and the intermediate subnets are obtained from the grouping result, where the group-by-primary-key-ID method refers to grouping according to a set primary key ID. For example, taking the left-hand ID of every ID relationship pair as the primary key ID, all ID relationship pairs are grouped by the groupByKey method, and the grouping result yields the intermediate subnets centered on each left-hand ID, that is, the intermediate subnets centered on any ID data in the ID data network. Once the intermediate subnets have been obtained, it is easy to judge whether an ID data meets the above pruning rule.
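The grouping step above can be sketched in plain Python as follows. This is only an illustration: the function name `build_intermediate_subnets` is hypothetical, a dictionary stands in for the groupByKey method, and the `include_reverse` option (which also adds the reversed pair so that right-hand IDs become centers too) is an assumption not stated in the text.

```python
from collections import defaultdict

def build_intermediate_subnets(relation_pairs, include_reverse=False):
    """Group ID relationship pairs by their left-hand ID (the primary key ID).

    Returns {center_id: [directly associated IDs]}, one entry per intermediate subnet.
    include_reverse is an assumed convenience that also indexes each pair by its right-hand ID.
    """
    subnets = defaultdict(list)
    for left, right in relation_pairs:
        subnets[left].append(right)
        if include_reverse:
            subnets[right].append(left)
    return dict(subnets)

# Hypothetical ID relationship pairs.
pairs = [("a4", "b4"), ("a4", "c4"), ("a4", "f4")]
print(build_intermediate_subnets(pairs))
# {'a4': ['b4', 'c4', 'f4']}
```

With the intermediate subnet of an ID data in hand, the number of directly associated ID data is simply the length of its list.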
In practical applications, after judging whether the pruning rule is met, a pruning marker bit can be set for each ID relationship pair to mark whether the relationship between the two IDs in the pair is an association relationship that needs to be removed. If the relationship between the two IDs in an ID relationship pair is an association relationship that needs to be removed, the pruning marker bit of the pair is set to 1; if it is not, the pruning marker bit of the pair is set to 0. The pruning marker bit makes it clear whether the relationship between the two IDs in an ID relationship pair needs to be removed.
Specifically, for any ID data in the ID data network, it can be judged, according to the intermediate subnet centered on the ID data, whether the number of other ID data directly associated with the ID data is greater than the first threshold and the association frequency between the ID data and any other ID data is less than or equal to the second threshold; if so, the association relationship between the ID data and that other ID data is removed. The first threshold may be 2 and the second threshold may be 5. In that case, it is judged whether the number of other ID data directly associated with the ID data is greater than 2 and the association frequency between the ID data and any other ID data is less than or equal to 5; if so, the association relationship between the ID data and that other ID data is unreliable and is removed. Suppose that, for ID data "a4" in the ID data network, the intermediate subnet centered on ID data "a4" shows that the ID data directly associated with ID data "a4" are ID data "b4", ID data "c4" and ID data "f4", where the association frequency between ID data "a4" and ID data "b4" is 20, the association frequency between ID data "a4" and ID data "c4" is 30, and the association frequency between ID data "a4" and ID data "f4" is 3. The number of other ID data directly associated with ID data "a4" is then 3, which is greater than 2, and the association frequency between ID data "a4" and ID data "f4" is less than 5, so the association relationship between ID data "a4" and ID data "f4" is removed.
For any ID data in the ID data network, it is also judged, according to the intermediate subnet centered on the ID data, whether the number of other ID data directly associated with the ID data is greater than the third threshold and the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to the fourth threshold; if so, the association relationships between the ID data and each of the other ID data are removed. The third threshold may be 299 and the fourth threshold may be 100. In that case, it is judged whether the number of other ID data directly associated with the ID data is greater than 299 and the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to 100; if so, the association relationships between the ID data and each of the other ID data are unreliable and are removed. In addition, it can also be judged whether the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to the fifth threshold; if so, the association relationships between the ID data and each of the other ID data are removed. The fifth threshold may be 1000. In that case, it is judged whether the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to 1000; if so, the association relationships between the ID data and each of the other ID data are unreliable and are removed.
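The three removal cases and the pruning marker bit can be sketched as follows, using the example thresholds 2, 5, 299, 100 and 1000 given above. The function name `prune_markers` and the dictionary layout of the intermediate subnet are hypothetical and chosen only for illustration.

```python
def prune_markers(center, neighbors, t1=2, t2=5, t3=299, t4=100, t5=1000):
    """Return {(center, other): marker}, where marker 1 means "remove" and 0 means "keep".

    neighbors: {other_id: association frequency with center}, i.e. the
    intermediate subnet centered on `center`.
    """
    count = len(neighbors)            # number of directly associated ID data
    total = sum(neighbors.values())   # sum of association frequencies
    # Cases 2 and 3 remove every association relationship of the center.
    remove_all = (count > t3 and total >= t4) or total >= t5
    markers = {}
    for other, freq in neighbors.items():
        # Case 1 removes only the weak association relationships of a well-connected center.
        remove_one = count > t1 and freq <= t2
        markers[(center, other)] = 1 if (remove_all or remove_one) else 0
    return markers

# The "a4" example from the text: only (a4, f4) is marked for removal.
print(prune_markers("a4", {"b4": 20, "c4": 30, "f4": 3}))
# {('a4', 'b4'): 0, ('a4', 'c4'): 0, ('a4', 'f4'): 1}
```

Returning marker bits instead of deleting pairs directly keeps the judgment step separate from the removal step, matching the marker-bit approach described earlier.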
Step S303, obtain the ID data network after pruning preprocessing.
After the judgment of whether the pruning rule is met has been completed for every ID data in the ID data network, and pruning preprocessing has been performed on the association relationships between the ID data and other ID data according to the judgment result, the ID data network after pruning preprocessing is obtained. Unreliable association relationships between ID data are thus effectively removed from the ID data network, so that the association relationships between ID data in the pruned ID data network are strong and reliable. This not only helps to improve the accuracy of ID data network processing, but also reduces the amount of data for subsequent data analysis.
According to the ID data network pruning preprocessing method provided in this embodiment, data analysis is performed on the log data of multiple services to quickly obtain the association frequency between ID data. For any ID data in the ID data network, pruning preprocessing is performed on the association relationships between the ID data and other ID data according to the number of other ID data directly associated with the ID data and/or the association frequency between the ID data and the other ID data. Unreliable association relationships between ID data in the ID data network are thereby removed effectively and quickly, so that the association relationships between ID data in the pruned ID data network are strong and reliable, which not only helps to improve the accuracy of ID data network processing but also reduces the amount of data to be analyzed. Optionally, a corresponding time weight is also introduced for the log data; the actual association frequency between ID data is attenuated by the time weight, and the value obtained after attenuation is taken as the association frequency between the ID data, so as to accurately reflect the true degree of association between ID data in the current period. This value has high reference value and helps to perform pruning preprocessing on the ID data network precisely.
The present invention also provides an ID data network data analysis method, the method comprising: obtaining an ID data network containing ID data and the association relationships between ID data; constructing ID relation data according to the ID data contained in the ID data network and the association relationships between the ID data, the ID relation data including several ID relationship pairs; and comparing and combining the ID relation data to obtain several ID data subnets. The ID data includes user ID data and/or device ID data. The ID data network data analysis method is described below through the specific embodiment shown in Fig. 4.
Fig. 4 shows a flow diagram of an ID data network data analysis method according to an embodiment of the present invention. As shown in Fig. 4, the method comprises the following steps:
Step S400, obtain an ID data network containing ID data and the association relationships between ID data.
For this step, refer to the description of step S100 in the embodiment shown in Fig. 1; details are not repeated here.
Step S401, construct ID relation data according to the ID data contained in the ID data network and the association relationships between the ID data.
After the ID data network is obtained, ID relation data can be constructed according to the ID data contained in the ID data network and the association relationships between the ID data. The constructed ID relation data includes several ID relationship pairs, and each ID relationship pair contains two IDs and the relationship between the two IDs. For example, if ID data "a1" and ID data "b1" have a direct association relationship, ID data "a2" and ID data "b2" have a direct association relationship, and ID data "a2" and ID data "c2" have a direct association relationship, the corresponding ID relationship pairs are (a1, b1), (a2, b2) and (a2, c2). Each of these ID relationship pairs contains two IDs, and the parentheses () indicate that a relationship exists between the two IDs. Taking the ID relationship pair (a1, b1) as an example, the two IDs it contains are a1 and b1, and enclosing the two IDs together in () indicates that a relationship exists between them. For all ID data contained in the ID data network and the association relationships between all ID data, several ID relationship pairs are constructed in the above manner, which completes the construction of the ID relation data.
Step S402, copy the ID relation data in full into memory.
Before comparison and combination, the ID relation data needs to be copied in full into memory, so that memory holds the complete ID relation data and the ID relation data can be compared and combined quickly and conveniently.
Step S403, compare and combine the ID relation data with the full copy of the ID relation data in memory, and perform data integration according to the comparison combination result to obtain several ID data subnets.
After the ID relation data has been copied in full into memory, each ID relationship pair in the ID relation data can be compared and combined with the full copy of the ID relation data in memory, and data integration is then performed according to the comparison combination result to obtain several ID data subnets. For each ID relationship pair in the ID relation data, the ID relationship pairs that share at least one identical ID with it are found by comparison in the ID relation data in memory; then, according to the relationship between the two IDs contained in each ID relationship pair, the IDs of the ID relationship pair and the IDs of the ID relationship pairs found are combined to obtain the comparison combination intermediate result of the ID relationship pair. For example, for the ID relationship pair (a2, b2), the ID relationship pairs found by comparison in the ID relation data in memory that share at least one identical ID with (a2, b2) are the ID relationship pair (a2, b2) and the ID relationship pair (a2, c2). The IDs of the ID relationship pair and the IDs of the ID relationship pairs found are then combined, and the resulting comparison combination intermediate result of the ID relationship pair (a2, b2) is "c2-a2-b2", where the "-" between two IDs indicates that a relationship exists between them.
Considering that the comparison combination intermediate results obtained may still be incompletely combined, the comparison combination intermediate results of all ID relationship pairs are again compared and combined with the full copy of the ID relation data in memory to obtain the comparison combination intermediate results for the next iteration, and this step is executed iteratively until a preset iteration condition is met. When the iterative process ends, the comparison combination result is obtained. The comparison combination result records multiple groups of IDs and the relationships between the IDs in each group, and each group contains one or more IDs. Data integration is performed according to the multiple groups of IDs in the comparison combination result and the relationships between the IDs in each group to obtain several ID data subnets. Specifically, for any group of IDs in the comparison combination result, data integration is performed according to the relationships between the IDs in the group, and the group is integrated into one ID data subnet.
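The iterative comparison and combination can be sketched as follows. This is a minimal single-machine illustration, not the definitive procedure: the function name `compare_and_combine` and the `max_iterations` parameter are hypothetical, each group records only its IDs rather than the individual relationships, and the final step that merges overlapping or duplicate groups is an assumed reading of "data integration".

```python
def compare_and_combine(relation_pairs, max_iterations=3):
    """Expand each ID relationship pair against a full in-memory copy of all pairs."""
    full_copy = [frozenset(p) for p in relation_pairs]   # full copy held in memory
    groups = [set(p) for p in relation_pairs]            # one intermediate result per pair
    for _ in range(max_iterations):
        next_groups = []
        for group in groups:
            merged = set(group)
            for pair in full_copy:
                if group & pair:          # shares at least one identical ID
                    merged |= pair
            next_groups.append(merged)
        if next_groups == groups:         # nothing left to combine
            break
        groups = next_groups
    # Assumed data integration: merge groups that share an ID and drop duplicates.
    subnets = []
    for group in groups:
        overlapping = [s for s in subnets if s & group]
        for s in overlapping:
            group = group | s
            subnets.remove(s)
        subnets.append(set(group))
    return subnets

pairs = [("a1", "b1"), ("a2", "b2"), ("a2", "c2")]
print(sorted(sorted(s) for s in compare_and_combine(pairs)))
# [['a1', 'b1'], ['a2', 'b2', 'c2']]
```

For the pair (a2, b2), the first pass finds (a2, b2) and (a2, c2) in the full copy and produces the group {c2, a2, b2}, which corresponds to the intermediate result "c2-a2-b2" in the example above.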
Optionally, the ID relation data can be divided into multiple shards, and comparison and combination is performed shard by shard in parallel to further improve the efficiency of ID data network data analysis. The multiple shards are compared and combined in parallel with the full copy of the ID relation data in memory to obtain the comparison combination results of all shards, and data integration is then performed on the comparison combination results of all shards to obtain several ID data subnets. The comparison combination results of all shards record multiple groups of IDs and the relationships between the IDs in each group, and data integration is performed according to these groups and relationships to obtain several ID data subnets. For any shard, the shard is compared and combined with the full copy of the ID relation data in memory to obtain the comparison combination intermediate result of the shard. Specifically, for each ID relationship pair in the shard, the ID relationship pairs that share at least one identical ID with it are found by comparison in the ID relation data in memory; according to the relationship between the two IDs contained in each ID relationship pair, the IDs of the ID relationship pair and the IDs of the ID relationship pairs found are combined to obtain the comparison combination intermediate result of the ID relationship pair. This continues until all ID relationship pairs in the shard have been compared and combined with the ID relation data in memory, which yields the comparison combination intermediate result of the shard; this result comprises the comparison combination intermediate results of all ID relationship pairs in the shard.
Considering that the comparison combination intermediate results of all shards may still be incompletely combined, after the comparison combination intermediate results of all shards are obtained, the present invention iteratively executes the following intermediate comparison step until a preset iteration condition is met. The intermediate comparison step is: divide the comparison combination intermediate results of all shards into multiple intermediate sub-shards, and compare and combine the multiple intermediate sub-shards in parallel with the full copy of the ID relation data in memory to obtain the comparison combination intermediate results of all shards for the next iteration. When the iterative process ends, the comparison combination results of all shards are obtained. Through this iterative execution, the comparison combination intermediate results of the shards can be fully combined, ready for data integration. Those skilled in the art can set the preset iteration condition according to actual needs, which is not limited here. For example, the preset iteration condition may be that the number of iterations reaches a preset number of iterations, where those skilled in the art can set the preset number of iterations according to actual needs, for example to 3.
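The sharded variant can be sketched as follows, under several assumptions: a thread pool merely stands in for whatever parallel or distributed workers are actually used, the round-robin split in `split_into_shards` is arbitrary, groups record only IDs, and the final de-duplication is a simplified stand-in for data integration. All function and parameter names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_shards(items, num_shards):
    """Round-robin split; the real sharding strategy is not specified in the text."""
    return [items[i::num_shards] for i in range(num_shards)]

def combine_shard(shard, full_copy):
    """Compare every group in one shard with the full in-memory copy of the pairs."""
    result = []
    for group in shard:
        merged = set(group)
        for pair in full_copy:
            if group & pair:              # shares at least one identical ID
                merged |= pair
        result.append(frozenset(merged))
    return result

def sharded_compare_and_combine(relation_pairs, num_shards=2, preset_iterations=3):
    full_copy = [frozenset(p) for p in relation_pairs]
    groups = list(full_copy)
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        # One initial pass plus `preset_iterations` intermediate comparison steps.
        for _ in range(1 + preset_iterations):
            shards = split_into_shards(groups, num_shards)
            results = pool.map(combine_shard, shards, [full_copy] * len(shards))
            groups = [g for shard_result in results for g in shard_result]
    return set(groups)                    # simplified integration: drop duplicate groups

pairs = [("a1", "b1"), ("a2", "b2"), ("a2", "c2")]
print(sorted(sorted(g) for g in sharded_compare_and_combine(pairs)))
# [['a1', 'b1'], ['a2', 'b2', 'c2']]
```

With a fixed number of iterations, long chains of relationships may not be fully combined within the preset count, which is why the preset iteration condition is left to be chosen according to actual needs.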
According to the ID data network data analysis method provided in this embodiment, ID relation data can be constructed based on the ID data contained in the ID data network and the association relationships between the ID data; the ID relation data is then compared and combined with the full copy of the ID relation data in memory, and data integration is performed according to the comparison combination result, so that several ID data subnets are obtained accurately and quickly and the ID data network is effectively divided. Optionally, the ID relation data can also be divided into multiple shards, which are compared and combined in parallel with the full copy of the ID relation data in memory, further improving the efficiency of ID data network data analysis. Compared with the ID data network, the ID data contained in an ID data subnet have stronger and more reliable association relationships and can be identified as the ID data of the same user, so that user characteristics can be analyzed accurately and quickly based on the ID data subnets in order to build complete and effective user portraits.
The present invention also provides another ID data network data analysis method, the method comprising: obtaining an ID data network containing ID data and the association relationships between ID data; constructing ID relation data according to the ID data contained in the ID data network and the association relationships between the ID data, the ID relation data including several ID relationship pairs, each ID relationship pair containing two IDs and the relationship between the two IDs; and grouping the ID relation data to obtain several ID data subnets. The ID data includes user ID data and/or device ID data. The ID data network data analysis method is described below through the specific embodiment shown in Fig. 5.
Fig. 5a shows a flow diagram of an ID data network data analysis method according to another embodiment of the present invention. As shown in Fig. 5a, the method comprises the following steps:
Step S500, obtain an ID data network containing ID data and the association relationships between ID data.
For this step, refer to the description of step S100 in the embodiment shown in Fig. 1; details are not repeated here.
Step S501, construct ID relation data according to the ID data contained in the ID data network and the association relationships between the ID data.
The ID relation data includes several ID relationship pairs, and each ID relationship pair contains two IDs and the relationship between the two IDs. For this step, refer to the description of step S401 in the embodiment shown in Fig. 4; details are not repeated here.
Step S502, perform directed forward-order processing and directed reverse-order processing on each ID relationship pair to obtain the two ID directed relationship pairs corresponding to each ID relationship pair.
To facilitate grouping, the present invention provides a directed forward-order processing method and a directed reverse-order processing method. Specifically, the order from the left-hand ID to the right-hand ID of an ID relationship pair is defined as forward order, and the order from the right-hand ID to the left-hand ID is defined as reverse order. Arranging the two IDs of an ID relationship pair in forward order is called directed forward-order processing, and arranging them in reverse order is called directed reverse-order processing. After directed forward-order and directed reverse-order processing is performed on each ID relationship pair, the two ID directed relationship pairs corresponding to each ID relationship pair are obtained. To make it easy to tell whether ID directed relationship pairs correspond to the same ID relationship pair, a relation bit can be set for each ID directed relationship pair: the two ID directed relationship pairs corresponding to the same ID relationship pair have the same relation bit, and ID directed relationship pairs corresponding to different ID relationship pairs have different relation bits.
The directed forward-order and directed reverse-order processing of ID relationship pairs is illustrated in Fig. 5b. The left part of Fig. 5b shows the ID relationship pairs included in the ID relation data: (a1, b1), (a2, b2), (a2, c2), (a3, b3), (a3, c3) and (c3, d3). For the ID relationship pair (a1, b1), directed forward-order processing of (a1, b1) yields the ID directed relationship pair (a1-b1-01), and directed reverse-order processing of (a1, b1) yields the ID directed relationship pair (b1-a1-01). The ID directed relationship pairs (a1-b1-01) and (b1-a1-01) are thus the two ID directed relationship pairs corresponding to the ID relationship pair (a1, b1), and their relation bits are identical, both being 01. In the same manner, directed forward-order and directed reverse-order processing is performed on (a2, b2), (a2, c2), (a3, b3), (a3, c3) and (c3, d3) respectively, which yields the ID directed relationship pairs shown in the right part of Fig. 5b. In any ID directed relationship pair, the primary key ID is determined according to a preset rule. Those skilled in the art can set the preset rule according to actual needs, which is not limited here. For example, the preset rule may be that the left-hand ID of an ID directed relationship pair serves as the primary key ID.
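The directed forward-order and reverse-order processing can be sketched as follows. The function name `to_directed_pairs` and the tuple layout `(primary_key_id, other_id, relation_bit)` are hypothetical; assigning relation bits as two-digit serial numbers in input order is an assumption that happens to reproduce the bits in the Fig. 5b example.

```python
def to_directed_pairs(relation_pairs):
    """Turn each ID relationship pair into two ID directed relationship pairs.

    Both directed pairs derived from the same ID relationship pair share one relation bit.
    """
    directed = []
    for index, (left, right) in enumerate(relation_pairs, start=1):
        relation_bit = f"{index:02d}"                  # assumed serial-number scheme
        directed.append((left, right, relation_bit))   # directed forward order
        directed.append((right, left, relation_bit))   # directed reverse order
    return directed

# The ID relationship pairs from the left part of Fig. 5b.
pairs = [("a1", "b1"), ("a2", "b2"), ("a2", "c2"), ("a3", "b3"), ("a3", "c3"), ("c3", "d3")]
for pk, other, bit in to_directed_pairs(pairs)[:2]:
    print(f"({pk}-{other}-{bit})")
# (a1-b1-01)
# (b1-a1-01)
```

Placing the primary key ID first in each tuple follows the example preset rule that the left-hand ID of an ID directed relationship pair is the primary key ID.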
Step S503, group all ID directed relationship pairs using the group-by-primary-key-ID method, and obtain several ID data subnets according to the grouping result.
Using the group-by-primary-key-ID method, all ID directed relationship pairs are grouped to obtain several first groupings. For any first grouping, the count bit of the first grouping is determined according to the number of ID directed relationship pairs the first grouping contains. At least one first grouping whose count bit equals a first count value is extracted, and the ID directed relationship pairs contained in the extracted first groupings are combined according to their relation bits to obtain at least one first ID data subnet. The number of ID data contained in a first ID data subnet is 2, and the first count value is 1.
Take as an example the case where all ID directed relationship pairs are those shown in the right part of Fig. 5b. With the left-hand ID of each ID directed relationship pair as the primary key ID, all ID directed relationship pairs are grouped by the groupByKey method, that is, ID directed relationship pairs with the same primary key ID are placed in one first grouping, which yields several first groupings. These are: first grouping 1, containing the ID directed relationship pair (a1-b1-01); first grouping 2, containing (a2-b2-02) and (a2-c2-03); first grouping 3, containing (a3-b3-04) and (a3-c3-05); first grouping 4, containing (b1-a1-01); first grouping 5, containing (b2-a2-02); first grouping 6, containing (b3-a3-04); first grouping 7, containing (c2-a2-03); first grouping 8, containing (c3-a3-05) and (c3-d3-06); and first grouping 9, containing (d3-c3-06). Then, for each first grouping, the count bit of the first grouping is determined according to the number of ID directed relationship pairs it contains: the count bits of first grouping 1, first grouping 4, first grouping 5, first grouping 6, first grouping 7 and first grouping 9 are 1, and the count bits of first grouping 2, first grouping 3 and first grouping 8 are 2.
The first groupings whose count bit is 1 are extracted from first grouping 1 through first grouping 9; the extracted first groupings are first grouping 1, first grouping 4, first grouping 5, first grouping 6, first grouping 7 and first grouping 9. The ID directed relationship pairs contained in these extracted first groupings are then combined according to their relation bits, that is, the ID directed relationship pairs in these extracted first groupings that have the same relation bit are combined into one first ID data subnet, and the number of ID data contained in a first ID data subnet is 2. Among the ID directed relationship pairs contained in the extracted first groupings, only (a1-b1-01) and (b1-a1-01) have the same relation bit, so these two ID directed relationship pairs are combined into one first ID data subnet. Specifically, a1 and b1 are taken as nodes, and the connection between the two nodes is determined according to the association relationship between a1 and b1, which yields the first ID data subnet.
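The extraction of first ID data subnets can be sketched as follows. The function name `first_id_data_subnets` is hypothetical, a dictionary stands in for the groupByKey method, the input uses an assumed tuple layout `(primary_key_id, other_id, relation_bit)`, and each subnet is returned only as a sorted tuple of its two node IDs.

```python
from collections import defaultdict

def first_id_data_subnets(directed_pairs):
    """directed_pairs: list of (primary_key_id, other_id, relation_bit) tuples."""
    # Group by primary key ID to obtain the first groupings.
    first_groupings = defaultdict(list)
    for pk, other, bit in directed_pairs:
        first_groupings[pk].append((pk, other, bit))
    # Keep only first groupings whose count bit is 1, indexed by relation bit.
    by_relation_bit = defaultdict(list)
    for members in first_groupings.values():
        if len(members) == 1:                      # count bit == first count value (1)
            by_relation_bit[members[0][2]].append(members[0])
    # Two directed pairs with the same relation bit form one 2-node subnet.
    subnets = []
    for bit in sorted(by_relation_bit):
        matched = by_relation_bit[bit]
        if len(matched) == 2:
            subnets.append(tuple(sorted((matched[0][0], matched[0][1]))))
    return subnets

# The ID directed relationship pairs from the right part of Fig. 5b.
directed = [
    ("a1", "b1", "01"), ("b1", "a1", "01"), ("a2", "b2", "02"), ("b2", "a2", "02"),
    ("a2", "c2", "03"), ("c2", "a2", "03"), ("a3", "b3", "04"), ("b3", "a3", "04"),
    ("a3", "c3", "05"), ("c3", "a3", "05"), ("c3", "d3", "06"), ("d3", "c3", "06"),
]
print(first_id_data_subnets(directed))
# [('a1', 'b1')]
```

Requiring both directed pairs to survive the count-bit filter is what ensures that neither ID has any other directly associated ID, so the resulting subnet contains exactly two ID data.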
Through the above grouping process, first ID data subnets containing 2 ID data can be obtained quickly and conveniently. In addition, the present invention can also quickly and conveniently obtain second ID data subnets containing 3 ID data, as follows:
During the above grouping process, after the count bits of all first groupings have been determined, at least one first grouping whose count bit equals a second count value is extracted. For each extracted first grouping, the ID directed relationship groups corresponding to the first grouping are obtained according to the ID directed relationship pairs the first grouping contains. Each ID directed relationship group contains three IDs and the relationships between the three IDs. In any ID directed relationship group, the primary key ID is determined according to a preset rule, and a relation bit is set for each ID directed relationship group: the ID directed relationship groups corresponding to the same first grouping have the same relation bit, and the ID directed relationship groups corresponding to different first groupings have different relation bits. All ID directed relationship groups are then grouped using the group-by-primary-key-ID method to obtain several second groupings. For any second grouping, the count bit of the second grouping is determined according to the number of ID directed relationship groups it contains. At least one second grouping whose count bit equals a third count value is then extracted, and the ID directed relationship groups contained in the extracted second groupings are combined according to their relation bits to obtain at least one second ID data subnet. The number of ID data contained in a second ID data subnet is 3. The second count value is 2, and the third count value is 1.
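The procedure for second ID data subnets can be sketched as follows. Several details are assumptions rather than statements of the text: the function name `second_id_data_subnets`, the tuple layouts, the order in which the three IDs are arranged inside each ID directed relationship group, the three-digit serial relation bits assigned in sorted order of the primary key ID, and the criterion that a subnet is formed only when all three relationship groups with the same relation bit survive the count-bit filter.

```python
from collections import defaultdict

def second_id_data_subnets(directed_pairs):
    """directed_pairs: list of (primary_key_id, other_id, relation_bit) tuples."""
    first_groupings = defaultdict(list)
    for pk, other, _bit in directed_pairs:
        first_groupings[pk].append(other)

    # Build three ID directed relationship groups per first grouping with count bit 2.
    relation_groups = []
    serial = 0
    for center in sorted(first_groupings):
        others = first_groupings[center]
        if len(others) != 2:                       # second count value is 2
            continue
        serial += 1
        bit = f"{serial:03d}"                      # assumed serial-number scheme
        x, y = others
        relation_groups += [(center, x, y, bit), (x, center, y, bit), (y, center, x, bit)]

    # Group by primary key ID (the first ID of each group) to obtain the second groupings.
    second_groupings = defaultdict(list)
    for group in relation_groups:
        second_groupings[group[0]].append(group)

    # Keep second groupings with count bit 1 and combine them by relation bit.
    by_relation_bit = defaultdict(list)
    for members in second_groupings.values():
        if len(members) == 1:                      # third count value is 1
            by_relation_bit[members[0][3]].append(members[0])
    return [tuple(sorted(groups[0][:3]))
            for _bit, groups in sorted(by_relation_bit.items())
            if len(groups) == 3]                   # assumed completeness criterion

directed = [
    ("a1", "b1", "01"), ("b1", "a1", "01"), ("a2", "b2", "02"), ("b2", "a2", "02"),
    ("a2", "c2", "03"), ("c2", "a2", "03"), ("a3", "b3", "04"), ("b3", "a3", "04"),
    ("a3", "c3", "05"), ("c3", "a3", "05"), ("c3", "d3", "06"), ("d3", "c3", "06"),
]
print(second_id_data_subnets(directed))
# [('a2', 'b2', 'c2')]
```

Under these assumptions, the Fig. 5b data yields one three-node subnet, while a3, b3, c3 and d3 are left out because a3 and c3 each appear as the primary key ID of two relationship groups.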
In the above example, the count bit of first grouping 1, first grouping 4, first grouping 5, first grouping 6, first grouping 7 and first grouping 9 is 1, and the count bit of first grouping 2, first grouping 3 and first grouping 8 is 2. The first groupings whose count bit is 2 are extracted from first grouping 1 to first grouping 9, namely first grouping 2, first grouping 3 and first grouping 8. First grouping 2 contains the directed ID relationship pairs (a2-b2-02) and (a2-c2-03), from which the directed ID relationship groups corresponding to first grouping 2 are obtained. Specifically, there are 3 such groups: (a2-b2-c2-001), (b2-a2-c2-001) and (c2-a2-b2-001). All three carry the same relationship bit, 001. In the same way, the directed ID relationship groups corresponding to first grouping 3 and first grouping 8 are obtained: those of first grouping 3 are (a3-b3-c3-002), (b3-a3-c3-002) and (c3-a3-b3-002), and those of first grouping 8 are (c3-a3-d3-003), (a3-c3-d3-003) and (d3-c3-a3-003). Taking the leftmost ID of each directed ID relationship group as the primary key ID, all directed ID relationship groups are grouped with the GroupByKey method, that is, the directed ID relationship groups with the same primary key ID are placed in one second grouping. This yields second grouping 1, containing (a2-b2-c2-001); second grouping 2, containing (a3-b3-c3-002) and (a3-c3-d3-003); second grouping 3, containing (b2-a2-c2-001); second grouping 4, containing (b3-a3-c3-002); second grouping 5, containing (c2-a2-b2-001); second grouping 6, containing (c3-a3-b3-002) and (c3-a3-d3-003); and second grouping 7, containing (d3-c3-a3-003). For any second grouping, its count bit is then determined by the number of directed ID relationship groups it contains: the count bit of second grouping 1, second grouping 3, second grouping 4, second grouping 5 and second grouping 7 is 1, and the count bit of second grouping 2 and second grouping 6 is 2.
From second grouping 1 to second grouping 7, the second groupings whose count bit is 1 are extracted, namely second grouping 1, second grouping 3, second grouping 4, second grouping 5 and second grouping 7. The directed ID relationship groups contained in these extracted second groupings are then combined according to their relationship bits, that is, directed ID relationship groups with the same relationship bit are combined into one second ID data subnet containing 3 ID data. Among the directed ID relationship groups in the extracted second groupings, only (a2-b2-c2-001), (b2-a2-c2-001) and (c2-a2-b2-001) share a relationship bit, so these three are combined into one second ID data subnet. Specifically, a2, b2 and c2 are taken as nodes, and the connections among the three nodes are determined from the associations among a2, b2 and c2. These associations can be read from the directed ID relationship pairs (a2-b2-02) and (a2-c2-03) that correspond to the three directed ID relationship groups: node a2 is connected to node b2, and node a2 is connected to node c2, which gives the second ID data subnet.
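The second-grouping walk-through above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `three_node_subnets` and the tuple layout are hypothetical, a plain dictionary stands in for the GroupByKey step, and the rule that a subnet is formed only when all three directed ID relationship groups with the same relationship bit survive the extraction is inferred from the example rather than stated in the text.

```python
from collections import defaultdict

# Directed ID relationship groups from the example:
# (primary key ID, second ID, third ID, relationship bit).
relation_groups = [
    ("a2", "b2", "c2", "001"), ("b2", "a2", "c2", "001"), ("c2", "a2", "b2", "001"),
    ("a3", "b3", "c3", "002"), ("b3", "a3", "c3", "002"), ("c3", "a3", "b3", "002"),
    ("c3", "a3", "d3", "003"), ("a3", "c3", "d3", "003"), ("d3", "c3", "a3", "003"),
]

def three_node_subnets(groups, third_count_value=1):
    # Group by primary key ID (the leftmost ID); a dict stands in for GroupByKey.
    second_groupings = defaultdict(list)
    for group in groups:
        second_groupings[group[0]].append(group)
    # Keep only second groupings whose count bit equals the third count value.
    extracted = [group
                 for members in second_groupings.values()
                 if len(members) == third_count_value
                 for group in members]
    # Combine the extracted relationship groups by relationship bit.
    by_bit = defaultdict(list)
    for group in extracted:
        by_bit[group[3]].append(group)
    # Assumption: a subnet is formed only if all three rotations were extracted.
    return {bit: set(members[0][:3])
            for bit, members in by_bit.items() if len(members) == 3}

subnets = three_node_subnets(relation_groups)
print(subnets)
```

With the example data, only relationship bit 001 keeps all three of its directed ID relationship groups, so the single resulting subnet contains a2, b2 and c2.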
With the grouping approach described above, the first ID data subnets containing 2 ID data and the second ID data subnets containing 3 ID data can be obtained quickly and conveniently. Those skilled in the art can also extend the same grouping approach by analogy to obtain other ID data subnets containing 4, 5, 6 or more ID data, which is not described again here.
According to the ID data network data analysis method provided in this embodiment, ID relationship data is constructed from the ID data contained in the ID data network and the associations between the ID data. Each ID relationship pair in the ID relationship data is then processed in directed forward order and directed reverse order to obtain its two corresponding directed ID relationship pairs, and all directed ID relationship pairs are grouped by primary key ID. This effectively improves the efficiency of ID data network data analysis, and several ID data subnets can be obtained accurately and quickly, so that the ID data network is divided effectively. Optionally, by using the count bits of the resulting groupings together with the relationship bits set for the directed ID relationship pairs and the directed ID relationship groups, the first ID data subnets and the second ID data subnets can be obtained quickly and conveniently.
Those skilled in the art can also combine the ID data network data analysis method shown in Fig. 5a with the ID data network data analysis method shown in Fig. 4 to further improve analysis efficiency. For example, the method shown in Fig. 5a is first used to group the ID relationship data, yielding the first ID data subnets containing 2 ID data and the second ID data subnets containing 3 ID data. The remaining ID relationship pairs in the ID relationship data, that is, all pairs other than those corresponding to the first ID data subnets and the second ID data subnets, are then divided into multiple shards. The shards are compared and combined in parallel with the ID relationship data that has been fully copied into memory, producing the comparison-combination results of all shards, and these results are integrated to obtain other ID data subnets containing 4, 5, 6 or more ID data. This approach not only obtains the first and second ID data subnets quickly and conveniently, but also reduces the amount of data that has to go through comparison and combination, which improves ID data network data analysis efficiency.
The present invention also provides an ID data subnet processing method, comprising: calculating the number of ID data contained in each of several ID data subnets; extracting the ID data subnets whose number of ID data exceeds a first preset quantity threshold; and, for any ID data subnet whose number of ID data exceeds the first preset quantity threshold, clustering and segmenting the ID data in that subnet to obtain several third ID data subnets corresponding to it, where the number of ID data contained in each third ID data subnet is less than or equal to a second preset quantity threshold. The method is described below through the specific embodiment shown in Fig. 6.
Fig. 6 shows a flow diagram of an ID data subnet processing method according to an embodiment of the present invention. As shown in Fig. 6, the method comprises the following steps:
Step S600: calculate the number of ID data contained in each of the several ID data subnets.
The several ID data subnets are obtained by performing data analysis on the ID data network. Each ID data subnet contains ID data and the associations between them, and contains far fewer ID data than the ID data network. Some of these subnets, however, may still contain a large number of ID data. Although the ID data in such a subnet are relatively strongly associated, they may not all belong to the same user. If they were identified as the ID data of one user, the user characteristics derived from the subnet could not effectively and truthfully reflect the user's actual situation. To further improve the reliability of these ID data subnets, they need additional processing. To make it easy to find the subnets that need processing, the number of ID data contained in each of the several ID data subnets is calculated first.
Step S601: extract the ID data subnets whose number of ID data exceeds the first preset quantity threshold.
After the number of ID data contained in each ID data subnet has been calculated, the ID data subnets whose number of ID data exceeds the first preset quantity threshold are extracted from the several ID data subnets. Those skilled in the art can set the first preset quantity threshold according to actual needs, and it is not limited here. For example, if the first preset quantity threshold is set to 50, the ID data subnets containing more than 50 ID data are extracted from the several ID data subnets.
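Steps S600 and S601 amount to a size count followed by a filter. A minimal sketch, assuming each ID data subnet is represented as a set of ID strings keyed by a hypothetical subnet name (the representation and the function name `extract_oversized_subnets` are illustrative, not taken from the text):

```python
def extract_oversized_subnets(subnets, first_threshold=50):
    """Return the names of subnets holding more ID data than the threshold."""
    # Step S600: count the ID data contained in each subnet.
    sizes = {name: len(id_data) for name, id_data in subnets.items()}
    # Step S601: keep the subnets whose count exceeds the first preset threshold.
    return [name for name, size in sizes.items() if size > first_threshold]

example_subnets = {
    "subnet_a": {f"id{i}" for i in range(3)},
    "subnet_b": {f"id{i}" for i in range(51)},
    "subnet_c": {f"id{i}" for i in range(50)},
}
oversized = extract_oversized_subnets(example_subnets)
print(oversized)
```

A subnet with exactly 50 ID data is not extracted, because the text requires the number to exceed the threshold.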
Step S602: from the extracted ID data subnets whose number of ID data exceeds the first preset quantity threshold, select one ID data subnet that has not yet been selected.
After the ID data subnets whose number of ID data exceeds the first preset quantity threshold have been extracted, the ID data in each of them are clustered and segmented so that the third ID data subnets corresponding to it can be obtained effectively. Specifically, in step S602 one not-yet-selected ID data subnet is chosen from the extracted ID data subnets.
Step S603: perform data analysis on the log data of multiple services corresponding to the ID data subnet, and determine the association frequency between the ID data in the subnet.
The log data corresponding to the ID data subnet can be looked up in the log data of multiple services. Specifically, the log data of a service records an ID data together with the other ID data that used the service, which indicates that this ID data is associated with those other ID data. The log data corresponding to the ID data in the subnet can therefore be looked up in the log data of multiple services, and analyzing that log data determines the association frequency between the ID data in the subnet.
Specifically, data analysis is performed on the log data of the multiple services corresponding to the ID data subnet to calculate the actual association frequency between the ID data in the subnet. In practice, the actual association frequency can be counted per preset unit of time. Taking a day as the preset unit of time, if analysis of the log data shows that one ID data in the subnet was associated with another ID data in the subnet on 50 days, the actual association frequency between the two is recorded as 50. The actual association frequency between every ID data in the subnet and the other ID data in the subnet is calculated in the same way.
In practice, several users may use the same service through the same device at different times. The user ID data of all these users are associated with the device ID data of the device, but the actual association frequency cannot truthfully reflect which user the device corresponds to in the current period. The present invention therefore introduces a time weight for the log data corresponding to the ID data, and calculates the association frequency between ID data from the actual association frequency between them, the time information of the corresponding log data, and the time weight. The size of the time weight depends on how far the corresponding log data is from the current time: the closer the time information of the log data is to the current time, the larger the weight; the farther it is from the current time, the smaller the weight. The time weight thus attenuates the actual association frequency between ID data, and the attenuated value is used as the association frequency between them. The association frequency obtained in this way accurately reflects the real degree of association between ID data in the current period, has high reference value, and helps cluster the ID data in the subnet accurately.
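The text does not specify the form of the time weight, only that it shrinks as the log data gets older. The sketch below therefore assumes an exponential decay with a hypothetical half-life parameter; both the decay form and the names `decayed_association_frequency` and `half_life_days` are assumptions made for illustration.

```python
def decayed_association_frequency(ages_in_days, half_life_days=30.0):
    """Sum one time-weighted contribution per day on which two IDs were associated.

    ages_in_days: for each associated day, how many days ago it occurred.
    Assumption: the weight halves every `half_life_days`; the source only
    requires that older log data receive a smaller weight.
    """
    return sum(0.5 ** (age / half_life_days) for age in ages_in_days)

# Three associated days: today, 30 days ago and 60 days ago.
frequency = decayed_association_frequency([0, 30, 60])
print(frequency)
```

Without attenuation the actual association frequency of this pair would be 3; with the assumed weights, recent days dominate and the value is lower.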
Step S604: for any ID data in the ID data subnet, calculate the distance between that ID data and the other ID data according to the association frequency between them.
The larger the association frequency between an ID data and another ID data, the smaller the distance between them. Those skilled in the art can choose the specific calculation according to actual needs, and it is not limited here. For example, a preset value can be divided by the association frequency between the ID data and the other ID data, and the result used as the distance between them. Suppose the preset value is 1 and step S603 determined that the association frequency between ID data "d5" and ID data "e5" in the subnet is 50. Dividing 1 by the association frequency gives 0.02, which is used as the distance between "d5" and "e5". Once the distances between every ID data in the subnet and the other ID data have been calculated, the distances between the ID data in the subnet are known.
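The example calculation in step S604 can be written directly. In this sketch the function name `association_distance` is hypothetical, and treating a non-positive association frequency as an infinite distance is an added assumption, since the text does not cover that case.

```python
def association_distance(association_frequency, preset_value=1.0):
    """Distance = preset value / association frequency (larger frequency, smaller distance)."""
    if association_frequency <= 0:
        # Assumption: IDs with no recorded association are infinitely far apart.
        return float("inf")
    return preset_value / association_frequency

distance_d5_e5 = association_distance(50)
print(distance_d5_e5)
```

For the association frequency of 50 between "d5" and "e5", this reproduces the distance of 0.02 given in the example.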
Step S605: cluster the ID data in the ID data subnet according to the distances between them and a preset clustering rule, obtaining several cluster sets.
Those skilled in the art can set the preset clustering rule according to actual needs, and it is not limited here. For example, the preset clustering rule defines a preset neighborhood radius, a preset minimum value and the second preset quantity threshold. Specifically, several core ID data are determined from the ID data in the subnet according to the distances between the ID data and the preset neighborhood radius. Then, for any core ID data, the other ID data in the subnet that lie within its preset neighborhood radius are looked up, and the core ID data and the ID data found are clustered subject to the second preset quantity threshold to obtain a cluster set. In this way the ID data in the subnet that have stronger and more reliable associations are clustered into cluster sets.
For any ID data in the subnet, the number of other ID data within its preset neighborhood radius is calculated from the distances between it and the other ID data, and an ID data for which this number exceeds the preset minimum value is determined to be a core ID data. For example, suppose the preset neighborhood radius is 1, the preset minimum value is 3, and the subnet contains ID data "d5", "e5", "f5", "g5", "h5" and so on. For ID data "d5", the distances show that "d5" is at a distance less than or equal to 1 from each of "e5", "f5", "g5" and "h5", while its distance to every other ID data is greater than 1. The other ID data within the preset neighborhood radius of "d5" are therefore "e5", "f5", "g5" and "h5", 4 in total. This number exceeds the preset minimum value, so "d5" is determined to be a core ID data. All core ID data in the subnet are determined in the same way.
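The core ID determination resembles the core-point test used in density-based clustering. A minimal sketch under stated assumptions: distances are held in a nested dictionary, the function name `find_core_ids` is hypothetical, and the sample distances other than the 0.02 between "d5" and "e5" are invented for illustration.

```python
def find_core_ids(distances, radius=1.0, preset_minimum=3):
    """Return IDs with more than `preset_minimum` other IDs within `radius`.

    distances: dict mapping each ID to {other ID: distance}.
    """
    core_ids = []
    for id_data, others in distances.items():
        # Count the other ID data lying within the preset neighborhood radius.
        neighbor_count = sum(1 for distance in others.values() if distance <= radius)
        if neighbor_count > preset_minimum:
            core_ids.append(id_data)
    return core_ids

# Illustrative distances: "d5" has four neighbors within radius 1.
example_distances = {
    "d5": {"e5": 0.02, "f5": 0.5, "g5": 1.0, "h5": 0.8, "i5": 2.0},
    "e5": {"d5": 0.02, "f5": 3.0, "g5": 3.0, "h5": 3.0, "i5": 3.0},
    "i5": {"d5": 2.0, "e5": 3.0},
}
core_ids = find_core_ids(example_distances)
print(core_ids)
```

As in the example, "d5" qualifies because its 4 neighbors exceed the preset minimum value of 3, while "e5" and "i5" do not.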
After all core ID data in the subnet have been determined, for any core ID data the other ID data in the subnet within its preset neighborhood radius are looked up, and the core ID data and the ID data found are clustered according to the second preset quantity threshold to obtain a cluster set. Specifically, based on the distances between the core ID data and the ID data found, a number of ID data smaller than the second preset quantity threshold is selected from the ID data found, and the core ID data is clustered with the selected ID data to obtain one cluster set. For example, suppose the second preset quantity threshold is 10 and 15 other ID data are found within the preset neighborhood radius of the core ID data. Because 15 exceeds the second preset quantity threshold, the 9 ID data nearest to the core ID data are selected from the 15, and the core ID data is clustered with these 9 ID data to obtain one cluster set. As another example, if only 8 other ID data are found within the preset neighborhood radius of the core ID data, which is fewer than the second preset quantity threshold, no selection is needed, and the core ID data is clustered directly with these 8 ID data to obtain one cluster set.
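The size cap on a cluster set can be sketched as a nearest-first selection. The function name `build_cluster_set` is hypothetical, and always keeping at most (second preset quantity threshold minus 1) neighbors is an assumption: it matches both examples above and keeps every cluster set within the threshold, but the text does not spell out the boundary case where the number of neighbors equals the threshold.

```python
def build_cluster_set(core_id, neighbor_distances, second_threshold=10):
    """Cluster a core ID with its nearest neighbors, capped by the threshold.

    neighbor_distances: {neighbor ID: distance to the core ID}, all within
    the preset neighborhood radius.
    """
    # Sort the neighbors from nearest to farthest.
    nearest_first = sorted(neighbor_distances, key=neighbor_distances.get)
    # Assumption: keep at most threshold - 1 neighbors so the set, including
    # the core ID itself, never exceeds the second preset quantity threshold.
    selected = nearest_first[:second_threshold - 1]
    return {core_id, *selected}

fifteen_neighbors = {f"n{i}": i * 0.05 for i in range(1, 16)}
eight_neighbors = {f"n{i}": i * 0.05 for i in range(1, 9)}
large_case = build_cluster_set("d5", fifteen_neighbors)
small_case = build_cluster_set("d5", eight_neighbors)
print(len(large_case), len(small_case))
```

With 15 neighbors only the 9 nearest are kept, and with 8 neighbors all are kept, mirroring the two examples.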
Step S606: segment the ID data subnet according to the several cluster sets to obtain the several third ID data subnets corresponding to it.
After the cluster sets have been obtained, the ID data subnet is segmented according to them. For any cluster set, the associations between the ID data in the cluster set and the ID data outside the cluster set are removed from the subnet. This segments the subnet effectively and yields the several third ID data subnets corresponding to it. Specifically, two kinds of association are removed: associations between ID data in the cluster set and ID data in other cluster sets, and associations between ID data in the cluster set and ID data in the subnet that were not clustered into any cluster set. For example, if ID data "d5" in a cluster set is associated with ID data "a5" in another cluster set, and is also associated with ID data "b5", which was not clustered into any cluster set, then the association between "d5" and "a5" and the association between "d5" and "b5" are both removed.
Compared with an ID data subnet whose number of ID data exceeds the first preset quantity threshold, the ID data in a third ID data subnet have stronger and more reliable associations and can be identified as the ID data of the same user. User characteristics can be analyzed accurately and effectively from the third ID data subnets, so that complete and effective user portraits can be built. In addition, a third ID data subnet holds far less data than an ID data subnet whose number of ID data exceeds the first preset quantity threshold, which makes user characteristic analysis more convenient and helps improve analysis efficiency. In practice, a segmentation flag bit can be set for the ID relationship pairs corresponding to the ID data whose associations need to be removed. The flag marks whether the relationship between the two IDs of an ID relationship pair is an association that must be removed during segmentation. If it is, the segmentation flag bit of that ID relationship pair is set to 1; if it is not, the segmentation flag bit is set to 0. The segmentation flag bit makes it clear whether the relationship between the two IDs of a pair is to be removed during segmentation.
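The segmentation flag bit can be illustrated as follows. This is a sketch with hypothetical names (`segmentation_flags`, `cluster_of`); leaving a pair of two unclustered ID data with flag 0 is an assumption, because the text only describes removing associations that involve an ID data belonging to a cluster set.

```python
def segmentation_flags(relation_pairs, cluster_sets):
    """Flag each ID relationship pair: 1 = remove during segmentation, 0 = keep."""
    # Map every clustered ID to the index of its cluster set.
    cluster_of = {id_data: index
                  for index, members in enumerate(cluster_sets)
                  for id_data in members}
    flags = {}
    for left, right in relation_pairs:
        # Unclustered IDs map to None, so a pair is kept only when both ends
        # fall in the same cluster set (or, by assumption, both are unclustered).
        flags[(left, right)] = 0 if cluster_of.get(left) == cluster_of.get(right) else 1
    return flags

cluster_sets = [{"d5", "e5"}, {"a5"}]
pairs = [("d5", "e5"), ("d5", "a5"), ("d5", "b5")]
flags = segmentation_flags(pairs, cluster_sets)
print(flags)
```

Here the association between "d5" and "a5" (another cluster set) and the one between "d5" and "b5" (not clustered) are flagged for removal, while the association inside the cluster set is kept.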
Step S607: judge whether all of the extracted ID data subnets have been selected; if so, the method ends; if not, step S602 is executed.
If it is determined that all of the extracted ID data subnets whose number of ID data exceeds the first preset quantity threshold have been selected, the clustering and segmentation of the ID data has been completed for every extracted ID data subnet, and the method ends. If it is determined that not all of them have been selected, step S602 is executed again.
According to the ID data subnet processing method provided in this embodiment, for any ID data subnet whose number of ID data exceeds the first preset quantity threshold, the ID data in the subnet that have stronger and more reliable associations can be clustered into one class according to the association frequency between ID data and the preset clustering rule, and assigned to the same third ID data subnet. The corresponding third ID data subnets are thereby obtained, and the ID data subnet is processed effectively. Compared with the ID data subnet before processing, the ID data in a third ID data subnet have stronger and more reliable associations and can be identified as the ID data of the same user, so user characteristics can be analyzed accurately and effectively on the basis of the third ID data subnets and complete, effective user portraits can be built. A third ID data subnet also holds far less data than the ID data subnet before processing, which makes user characteristic analysis more convenient and helps improve analysis efficiency.
Fig. 7 shows a structural block diagram of an ID data network processing device according to an embodiment of the present invention. As shown in Fig. 7, the device comprises an acquisition module 710 and an ID data network analysis module 720.
The acquisition module 710 is adapted to: obtain an ID data network containing ID data and the associations between the ID data, where the ID data include user ID data and/or device ID data.
The ID data network analysis module 720 is adapted to: perform data analysis on the ID data network to obtain several ID data subnets. According to the number of ID data that each ID data subnet contains, the several ID data subnets are divided into n ID data subnet sets, where n is a natural number greater than 0; ID data subnets in different ID data subnet sets contain different numbers of ID data.
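Dividing the subnets into n ID data subnet sets by size is a simple bucketing step. A minimal sketch, assuming each subnet is a set of ID strings; the function name `group_subnets_by_size` is hypothetical.

```python
from collections import defaultdict

def group_subnets_by_size(subnets):
    """Bucket subnets so that each subnet set holds subnets of one size only."""
    subnet_sets = defaultdict(list)
    for subnet in subnets:
        # The number of ID data in the subnet decides which set it joins.
        subnet_sets[len(subnet)].append(subnet)
    return dict(subnet_sets)

subnet_sets = group_subnets_by_size([
    {"a1", "b1"},
    {"a2", "b2", "c2"},
    {"x1", "y1"},
])
print(sorted(subnet_sets))
```

In this example n is 2: one set holds the two subnets with 2 ID data and the other holds the subnet with 3 ID data.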
Optionally, the device further comprises: a log data analysis module 730, adapted to perform data analysis on the log data of multiple services and determine the ID data and the associations between the ID data; and a construction module 740, adapted to take the ID data as nodes, determine the connections between the nodes according to the associations between the ID data, and construct the ID data network.
Optionally, the device further comprises: a pruning preprocessing module 750, adapted to perform pruning preprocessing on the ID data network to obtain a pruned ID data network. The ID data network analysis module 720 is further adapted to: perform data analysis on the pruned ID data network to obtain several ID data subnets.
Optionally, the pruning preprocessing module 750 is further adapted to: perform data analysis on the log data of multiple services to obtain the association frequency between ID data; for any ID data in the ID data network, perform pruning preprocessing on the associations between that ID data and other ID data according to the number of other ID data directly associated with it and/or the association frequency between it and the other ID data; and obtain the pruned ID data network.
Optionally, the pruning preprocessing module 750 is further adapted to: perform data analysis on the log data of multiple services to calculate the actual association frequency between ID data; and calculate the association frequency between ID data according to the actual association frequency between them, the time information of the log data corresponding to the ID data, and the time weight.
Optionally, the pruning preprocessing module 750 is further adapted to: judge whether the number of other ID data directly associated with the ID data is greater than a first threshold and the association frequency between the ID data and any one of the other ID data is less than or equal to a second threshold; if so, remove the association between the ID data and that other ID data. The pruning preprocessing module 750 is further adapted to: judge whether the number of other ID data directly associated with the ID data is greater than a third threshold and the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to a fourth threshold; if so, remove the associations between the ID data and each of the other ID data. The pruning preprocessing module 750 is further adapted to: judge whether the sum of the association frequencies between the ID data and each of the other ID data is greater than or equal to a fifth threshold; if so, remove the associations between the ID data and each of the other ID data.
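The first of the three pruning rules can be sketched as below; the other two rules follow the same pattern with sums of association frequencies. The representation of associations as a dictionary keyed by ID pairs and the function name `prune_weak_links` are assumptions made for illustration.

```python
from collections import defaultdict

def prune_weak_links(frequencies, first_threshold, second_threshold):
    """Drop weak associations that touch an ID with too many direct associations.

    frequencies: {(id_a, id_b): association frequency}, one entry per association.
    """
    # Count the other ID data directly associated with each ID.
    degree = defaultdict(int)
    for left, right in frequencies:
        degree[left] += 1
        degree[right] += 1
    kept = {}
    for (left, right), frequency in frequencies.items():
        crowded = degree[left] > first_threshold or degree[right] > first_threshold
        if crowded and frequency <= second_threshold:
            continue  # remove this association
        kept[(left, right)] = frequency
    return kept

links = {("dev", "u1"): 1, ("dev", "u2"): 9, ("dev", "u3"): 1, ("x", "y"): 1}
pruned = prune_weak_links(links, first_threshold=2, second_threshold=2)
print(pruned)
```

The device "dev" has 3 direct associations, which is more than the first threshold of 2, so its two weak associations are removed. The association between "x" and "y" is equally weak but is kept, because neither ID has more than 2 direct associations.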
Optionally, the ID data network analysis module 720 is further adapted to: construct ID relationship data from the ID data contained in the ID data network and the associations between the ID data, the ID relationship data comprising several ID relationship pairs; copy the full ID relationship data into memory; compare and combine the ID relationship data with the full copy in memory; and perform data integration on the comparison-combination results to obtain several ID data subnets.
Optionally, the ID data network analysis module 720 is further adapted to: divide the ID relationship data into multiple shards; compare and combine the shards in parallel with the ID relationship data fully copied into memory to obtain the comparison-combination results of all shards; and perform data integration on the comparison-combination results of all shards to obtain several ID data subnets. The ID data network analysis module 720 is further adapted to: for any shard, compare and combine the shard with the ID relationship data fully copied into memory to obtain the intermediate comparison-combination result of the shard; iteratively execute the following step until a preset iteration condition is met: divide the intermediate comparison-combination results of all shards into multiple intermediate sub-shards, and compare and combine the intermediate sub-shards in parallel with the ID relationship data fully copied into memory to obtain the intermediate comparison-combination results of all shards for the next iteration; and, after the iteration ends, obtain the comparison-combination results of all shards. The preset iteration condition includes: the number of iterations reaching a preset number of iterations.
Optionally, the ID data network analysis module 720 is further adapted to: construct ID relationship data from the ID data contained in the ID data network and the associations between the ID data, the ID relationship data comprising several ID relationship pairs, each containing two IDs and the relationship between them; process each ID relationship pair in directed forward order and directed reverse order to obtain the two directed ID relationship pairs corresponding to it, where the primary key ID in any directed ID relationship pair is determined according to a preset rule; and group all directed ID relationship pairs by primary key ID and obtain several ID data subnets from the grouping result. The ID data network analysis module 720 is further adapted to: set a relationship bit for each directed ID relationship pair, where the two directed ID relationship pairs corresponding to the same ID relationship pair have the same relationship bit and directed ID relationship pairs corresponding to different ID relationship pairs have different relationship bits; group all directed ID relationship pairs by primary key ID to obtain several first groupings; for any first grouping, determine its count bit from the number of directed ID relationship pairs it contains; and extract at least one first grouping whose count bit equals a first count value, and combine the directed ID relationship pairs contained in the extracted first groupings according to their relationship bits to obtain at least one first ID data subnet, each containing 2 ID data.
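The first-grouping procedure can be sketched end to end as follows. Several points are assumptions rather than statements from this passage: the first count value is taken to be 1, the relationship bit is a running integer, the leftmost ID is used as the primary key ID, and a subnet is formed only when both directed ID relationship pairs of a relationship bit are extracted. The function name `two_node_subnets` is hypothetical.

```python
from collections import defaultdict

def two_node_subnets(relation_pairs, first_count_value=1):
    """Find ID relationship pairs whose two IDs are associated only with each other."""
    # Forward and reverse directed pairs share the relationship bit of their source pair.
    directed_pairs = []
    for bit, (left, right) in enumerate(relation_pairs, start=1):
        directed_pairs.append((left, right, bit))   # directed forward order
        directed_pairs.append((right, left, bit))   # directed reverse order
    # Group by primary key ID (assumed to be the leftmost ID).
    first_groupings = defaultdict(list)
    for pair in directed_pairs:
        first_groupings[pair[0]].append(pair)
    # Extract the groupings whose count bit equals the first count value.
    extracted = [pair
                 for members in first_groupings.values()
                 if len(members) == first_count_value
                 for pair in members]
    # Combine by relationship bit; both directions must have been extracted.
    by_bit = defaultdict(list)
    for pair in extracted:
        by_bit[pair[2]].append(pair)
    return [{members[0][0], members[0][1]}
            for members in by_bit.values() if len(members) == 2]

result = two_node_subnets([("a1", "b1"), ("a2", "b2"), ("a2", "c2")])
print(result)
```

Only a1 and b1 form a first ID data subnet here: a2 appears as primary key ID twice, so its grouping has count bit 2 and is left for the three-ID processing.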
Optionally, the ID data network analysis module 720 is further adapted to: extract at least one first
grouping whose count bit equals a second count value; for any extracted first grouping, obtain the
directed ID relationship groups corresponding to that first grouping from the directed ID relationship
pairs it contains, where each directed ID relationship group includes three IDs and the relationships
among the three IDs, and the primary key ID of any directed ID relationship group is determined
according to a preset rule; set a relationship bit for each directed ID relationship group, where the
directed ID relationship groups corresponding to the same first grouping have the same relationship
bit, and the directed ID relationship groups corresponding to different first groupings have different
relationship bits; group all directed ID relationship groups by primary key ID to obtain several second
groupings; for any second grouping, determine the count bit of that second grouping according to the
number of directed ID relationship groups it contains; extract at least one second grouping whose count
bit equals a third count value, and combine the directed ID relationship groups contained in the
extracted second groupings according to their relationship bits to obtain at least one second ID data
subnet; the number of ID data contained in a second ID data subnet is 3.
Optionally, the device further includes a cluster segmentation module 760, adapted to: for any ID data
subnet that contains more ID data than a first preset quantity threshold, cluster and segment the ID
data in that ID data subnet to obtain several third ID data subnets corresponding to it; the number of
ID data contained in a third ID data subnet is less than or equal to a second preset quantity
threshold.
Optionally, the cluster segmentation module 760 is further adapted to: for any ID data in the ID data
subnet, calculate the distance between that ID data and other ID data according to the association
frequency between them; cluster the ID data in the ID data subnet according to the distances between
the ID data in the ID data subnet and a preset clustering rule to obtain several cluster sets; and
segment the ID data subnet according to the cluster sets to obtain several third ID data subnets
corresponding to the ID data subnet. The cluster segmentation module 760 is further adapted to:
determine several core ID data from the ID data in the ID data subnet according to the distances
between the ID data in the ID data subnet and a preset neighborhood radius; for any core ID data,
search the ID data subnet for other ID data within the preset neighborhood radius of the core ID data,
and cluster the core ID data and the other ID data found according to the second preset quantity
threshold to obtain a cluster set. The cluster segmentation module 760 is further adapted to: in the
ID data subnet, for any cluster set, remove the association relationships between the ID data in that
cluster set and the ID data in other cluster sets, thereby obtaining several third ID data subnets
corresponding to the ID data subnet.
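The clustering step can be illustrated with a minimal Python sketch. It is not the patented procedure: the conversion `distance = 1 / frequency`, the `min_neighbors` criterion for choosing core IDs, the visiting order, and all function and parameter names are assumptions made for illustration. Only the neighborhood radius and the size cap (the second preset quantity threshold) come from the text.

```python
from collections import defaultdict


def frequency_to_distance(freq):
    """Turn association frequencies into distances (assumed rule: 1 / frequency)."""
    return {pair: 1.0 / f for pair, f in freq.items() if f > 0}


def cluster_ids(ids, dist, radius, min_neighbors, max_size):
    """Group IDs around core IDs; every returned cluster has at most max_size IDs."""
    # Neighbours of each ID that lie within the preset neighbourhood radius.
    near = defaultdict(list)
    for (a, b), d in dist.items():
        if d <= radius:
            near[a].append((d, b))
            near[b].append((d, a))

    assigned = set()
    clusters = []
    # Visit candidate core IDs, densest first; ties are broken by ID for determinism.
    for core in sorted(ids, key=lambda i: (-len(near[i]), i)):
        if core in assigned or len(near[core]) < min_neighbors:
            continue
        members = [core]
        for _, other in sorted(near[core]):      # closest neighbours first
            if len(members) == max_size:         # second preset quantity threshold
                break
            if other not in assigned:
                members.append(other)
        assigned.update(members)
        clusters.append(set(members))

    # IDs that joined no cluster are kept as single-member clusters.
    clusters.extend({i} for i in sorted(ids) if i not in assigned)
    return clusters


# Hypothetical association frequencies between user IDs (u*) and device IDs (d*).
freq = {("u1", "d1"): 10, ("u1", "d2"): 8, ("d1", "d2"): 5,
        ("d2", "u2"): 1, ("u2", "d3"): 9, ("u2", "d4"): 7}
ids = {i for pair in freq for i in pair}
clusters = cluster_ids(ids, frequency_to_distance(freq),
                       radius=0.25, min_neighbors=2, max_size=3)
```

In this example the weak link between `d2` and `u2` (frequency 1) falls outside the radius, so the six IDs separate into two clusters of three.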
According to the ID data network processing device provided in this embodiment, an ID data network can
be constructed quickly by analyzing the log data of multiple services. Pruning preprocessing of the ID
data network effectively and quickly removes unreliable association relationships between ID data,
which both improves the accuracy of ID data network processing and reduces the amount of data to be
analyzed. In addition, by analyzing the ID data contained in the ID data network and the association
relationships between them, the ID data network can be quickly divided into several ID data subnets.
The ID data contained in an ID data subnet have strong, reliable association relationships and can be
regarded as ID data of the same user, so user characteristics can be analyzed accurately and quickly on
the basis of the ID data subnets in order to construct complete and effective user profiles.
Fig. 8 shows a structural block diagram of an ID data network pruning preprocessing device according to
an embodiment of the present invention. As shown in Fig. 8, the device includes an obtaining module 810
and a pruning preprocessing module 820.
The obtaining module 810 is adapted to: obtain an ID data network comprising ID data and the
association relationships between the ID data; the ID data include user ID data and/or device ID data.
The pruning preprocessing module 820 is adapted to: perform pruning preprocessing on the ID data
network to obtain a pruning-preprocessed ID data network.
Optionally, the pruning preprocessing module 820 is further adapted to: analyze the log data of
multiple services to obtain the association frequencies between ID data; for any ID data in the ID
data network, perform pruning preprocessing on the association relationships between that ID data and
other ID data according to the number of other ID data directly associated with it and/or the
association frequencies between it and the other ID data; and obtain the pruning-preprocessed ID data
network.
Optionally, the pruning preprocessing module 820 is further adapted to: analyze the log data of
multiple services to calculate the actual association frequencies between ID data; and calculate the
association frequencies between ID data according to the actual association frequencies between the ID
data, the time information of the log data corresponding to the ID data, and a time weight. The pruning
preprocessing module 820 is further adapted to: judge whether the number of other ID data directly
associated with the ID data is greater than a first threshold and the association frequency between
the ID data and any other ID data is less than or equal to a second threshold; if so, remove the
association relationship between the ID data and that other ID data. The pruning preprocessing module
820 is further adapted to: judge whether the number of other ID data directly associated with the ID
data is greater than a third threshold and the sum of the association frequencies between the ID data
and each of the other ID data is greater than or equal to a fourth threshold; if so, remove the
association relationships between the ID data and each of the other ID data. The pruning preprocessing
module 820 is further adapted to: judge whether the sum of the association frequencies between the ID
data and each of the other ID data is greater than or equal to a fifth threshold; if so, remove the
association relationships between the ID data and each of the other ID data.
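The three threshold tests can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say whether the rules are applied together or as alternatives, so the sketch evaluates all of them against the original network and removes the union of the selected relationships. The graph representation and the names `edges_to_prune`, `prune`, and `t1` to `t5` are hypothetical.

```python
def edges_to_prune(graph, t1, t2, t3, t4, t5):
    """graph maps each ID to {directly associated ID: association frequency}."""
    doomed = set()
    for node, neighbours in graph.items():
        degree = len(neighbours)             # number of directly associated IDs
        total = sum(neighbours.values())     # sum of association frequencies
        if (degree > t3 and total >= t4) or total >= t5:
            # Third/fourth-threshold rule or fifth-threshold rule: drop every relationship.
            doomed.update(frozenset((node, n)) for n in neighbours)
        elif degree > t1:
            # First/second-threshold rule: drop only the low-frequency relationships.
            doomed.update(frozenset((node, n))
                          for n, f in neighbours.items() if f <= t2)
    return doomed


def prune(graph, **thresholds):
    """Return a copy of the network without the relationships selected for pruning."""
    doomed = edges_to_prune(graph, **thresholds)
    return {node: {n: f for n, f in neighbours.items()
                   if frozenset((node, n)) not in doomed}
            for node, neighbours in graph.items()}


# Hypothetical network: a shared device "wifi" weakly linked to four users.
graph = {
    "wifi": {"u1": 1, "u2": 1, "u3": 1, "u4": 1},
    "u1": {"wifi": 1, "phone1": 20},
    "u2": {"wifi": 1},
    "u3": {"wifi": 1},
    "u4": {"wifi": 1},
    "phone1": {"u1": 20},
}
pruned = prune(graph, t1=3, t2=1, t3=100, t4=1000, t5=1000)
```

Here the shared device has more than three direct associations, each with a frequency no greater than 1, so all of its relationships are removed while the strong relationship between `u1` and `phone1` is kept.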
According to the ID data network pruning preprocessing device provided in this embodiment, the log data
of multiple services are analyzed to quickly obtain the association frequencies between ID data. For
any ID data in the ID data network, pruning preprocessing is performed on its association
relationships with other ID data according to the number of other ID data directly associated with it
and/or the association frequencies between it and the other ID data. This effectively and quickly
removes unreliable association relationships between ID data in the ID data network, so that the
association relationships remaining in the pruning-preprocessed ID data network are strong and
reliable, which both improves the accuracy of ID data network processing and reduces the amount of
data to be analyzed. Optionally, a corresponding time weight is also introduced for the log data: the
actual association frequency between ID data is attenuated by the time weight, and the value obtained
after attenuation is used as the association frequency between the ID data. This value accurately
reflects the true degree of association between ID data in the current period and has high reference
value, which helps to prune the ID data network accurately.
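The attenuation of the actual association frequency can be illustrated with a short Python sketch. The text does not give the weighting formula, so the exponential rule `daily_weight ** age_in_days`, the record layout, and the name `association_frequency` are assumptions.

```python
from collections import defaultdict
from datetime import date


def association_frequency(log_records, today, daily_weight):
    """Sum time-weighted co-occurrences of two IDs found in log records.

    log_records: iterable of (id_a, id_b, day) tuples, one per logged co-occurrence.
    daily_weight: assumed decay factor in (0, 1]; an occurrence that is n days old
    contributes daily_weight ** n instead of 1.
    """
    freq = defaultdict(float)
    for id_a, id_b, day in log_records:
        age_in_days = (today - day).days
        freq[frozenset((id_a, id_b))] += daily_weight ** age_in_days
    return dict(freq)


# Hypothetical log records for one user ID and two device IDs.
records = [
    ("u1", "d1", date(2018, 8, 24)),
    ("d1", "u1", date(2018, 8, 23)),
    ("u1", "d2", date(2018, 8, 22)),
]
freq = association_frequency(records, today=date(2018, 8, 24), daily_weight=0.5)
```

With a weight of 0.5 per day, the two occurrences of `u1` and `d1` contribute 1 and 0.5, while the two-day-old occurrence of `u1` and `d2` contributes 0.25, so recent activity dominates the association frequency.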
Fig. 9 shows a structural block diagram of an ID data network data analysis device according to an
embodiment of the present invention. As shown in Fig. 9, the device includes an obtaining module 910, a
first construction module 920 and a comparison and combination module 930.
The obtaining module 910 is adapted to: obtain an ID data network comprising ID data and the
association relationships between the ID data; the ID data include user ID data and/or device ID data.
The first construction module 920 is adapted to: construct ID relation data according to the ID data
and the association relationships between the ID data contained in the ID data network; the ID relation
data include several ID relationship pairs.
The comparison and combination module 930 is adapted to: compare and combine the ID relation data to
obtain several ID data subnets.
Optionally, the comparison and combination module 930 is further adapted to: copy the full ID relation
data into memory; compare and combine the ID relation data with the ID relation data fully copied into
memory, and integrate the data according to the comparison and combination results to obtain several
ID data subnets. The comparison and combination module 930 is further adapted to: divide the ID
relation data into multiple shards; compare and combine the multiple shards, in parallel, with the ID
relation data fully copied into memory to obtain the comparison and combination results of all shards;
and integrate the comparison and combination results of all shards to obtain several ID data subnets.
The comparison and combination module 930 is further adapted to: for any shard, compare and combine the
shard with the ID relation data fully copied into memory to obtain an intermediate comparison and
combination result for the shard; iteratively perform the following step until a preset iteration
condition is met: divide the intermediate comparison and combination results of all shards into
multiple intermediate sub-shards, and compare and combine the intermediate sub-shards, in parallel,
with the ID relation data fully copied into memory to obtain the intermediate comparison and
combination results of all shards for the next iteration; and, after the iteration ends, obtain the
comparison and combination results of all shards. The preset iteration condition includes: the number
of iterations reaches a preset number of iterations.
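A minimal Python sketch of sharded comparison and combination against a full in-memory copy is given below. The text does not define the comparison and combination rule, so this sketch assumes that a group of IDs is extended by every ID the full copy relates to one of its members, and that data integration discards duplicate and contained groups. Thread-based parallelism and all names are likewise assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def build_full_copy(pairs):
    """In-memory full copy of the ID relation data: ID -> directly related IDs."""
    full = {}
    for a, b in pairs:
        full.setdefault(a, set()).add(b)
        full.setdefault(b, set()).add(a)
    return full


def compare_combine(shard, full):
    """Extend each group in a shard with the IDs the full copy relates to it."""
    return [group.union(*(full[i] for i in group)) for group in shard]


def analyse(pairs, shards=2, iterations=3):
    full = build_full_copy(pairs)
    groups = [frozenset(p) for p in pairs]
    with ThreadPoolExecutor(max_workers=shards) as pool:
        for _ in range(iterations):              # preset number of iterations
            # Re-shard the intermediate results, then process the shards in parallel.
            parts = [groups[i::shards] for i in range(shards)]
            results = pool.map(lambda s: compare_combine(s, full), parts)
            groups = list({g for part in results for g in part})
    # Data integration: keep only groups not strictly contained in another group.
    return sorted((set(g) for g in groups if not any(g < h for h in groups)),
                  key=sorted)


# Hypothetical ID relationship pairs.
subnets = analyse([("a", "b"), ("b", "c"), ("d", "e")])
```

Because the number of iterations is fixed in advance, a long chain of relationships may not be merged completely; in that case this sketch returns overlapping groups rather than one subnet, which is one reason a later segmentation or a larger iteration count may be needed.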
According to the ID data network data analysis device provided in this embodiment, ID relation data can
be constructed from the ID data contained in the ID data network and the association relationships
between the ID data; the ID relation data are then compared and combined with the ID relation data
fully copied into memory, and the data are integrated according to the comparison and combination
results, so that several ID data subnets are obtained accurately and quickly and the ID data network
is effectively divided. Optionally, the ID relation data can also be divided into multiple shards, and
the shards are compared and combined in parallel with the ID relation data fully copied into memory,
which further improves the efficiency of ID data network data analysis. Compared with the ID data
network, the ID data contained in an ID data subnet have stronger, more reliable association
relationships and can be regarded as ID data of the same user, so user characteristics can be analyzed
accurately and quickly on the basis of the ID data subnets in order to construct complete and
effective user profiles.
Fig. 10 shows a structural block diagram of an ID data network data analysis device according to
another embodiment of the present invention. As shown in Fig. 10, the device includes an obtaining
module 1010, a second construction module 1020 and a grouping module 1030.
The obtaining module 1010 is adapted to: obtain an ID data network comprising ID data and the
association relationships between the ID data; the ID data include user ID data and/or device ID data.
The second construction module 1020 is adapted to: construct ID relation data according to the ID data
and the association relationships between the ID data contained in the ID data network; the ID relation
data include several ID relationship pairs, and each ID relationship pair includes two IDs and the
relationship between the two IDs.
The grouping module 1030 is adapted to: group the ID relation data to obtain several ID data subnets.
Optionally, the grouping module 1030 is further adapted to: perform directed forward-order and directed
reverse-order processing on each ID relationship pair to obtain the two directed ID relationship pairs
corresponding to each ID relationship pair, where the primary key ID of any directed ID relationship
pair is determined according to a preset rule; and group all directed ID relationship pairs by primary
key ID, obtaining several ID data subnets from the grouping result. The grouping module 1030 is further
adapted to: set a relationship bit for each directed ID relationship pair, where the two directed ID
relationship pairs corresponding to the same ID relationship pair have the same relationship bit, and
directed ID relationship pairs corresponding to different ID relationship pairs have different
relationship bits; group all directed ID relationship pairs by primary key ID to obtain several first
groupings; for any first grouping, determine the count bit of that first grouping according to the
number of directed ID relationship pairs it contains; extract at least one first grouping whose count
bit equals a first count value, and combine the directed ID relationship pairs contained in the
extracted first groupings according to their relationship bits to obtain at least one first ID data
subnet; the number of ID data contained in a first ID data subnet is 2.
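The grouping by primary key ID can be sketched in Python as follows. The sketch assumes that the preset rule takes the first ID of each directed pair as the primary key ID, that the relationship bit is simply the index of the ID relationship pair, and that the first count value is 1; none of these choices is stated in the text, and the function name is hypothetical.

```python
from collections import defaultdict

FIRST_COUNT_VALUE = 1  # assumed: a primary key ID that occurs in exactly one pair


def two_id_subnets(pairs):
    """pairs: de-duplicated ID relationship pairs (a, b) with a != b."""
    first_groupings = defaultdict(list)      # primary key ID -> relationship bits
    for bit, (a, b) in enumerate(pairs):     # one relationship bit per pair
        first_groupings[a].append(bit)       # forward order, primary key a
        first_groupings[b].append(bit)       # reverse order, primary key b

    by_bit = defaultdict(set)
    for key, bits in first_groupings.items():
        if len(bits) == FIRST_COUNT_VALUE:   # count bit of this first grouping
            by_bit[bits[0]].add(key)

    # A pair is an isolated 2-ID subnet only if both of its directed pairs survived.
    return [ids for _, ids in sorted(by_bit.items()) if len(ids) == 2]


# Hypothetical pairs: u1-d1 is isolated, while d2 links u2 and d3 into a chain.
subnets = two_id_subnets([("u1", "d1"), ("u2", "d2"), ("d2", "d3")])
```

The relationship bit is what allows the two halves of one pair to be matched again after grouping: `u2` and `d3` each have a count of 1, but their partners do not, so neither forms a two-ID subnet.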
Optionally, the grouping module 1030 is further adapted to: extract at least one first grouping whose
count bit equals a second count value; for any extracted first grouping, obtain the directed ID
relationship groups corresponding to that first grouping from the directed ID relationship pairs it
contains, where each directed ID relationship group includes three IDs and the relationships among the
three IDs, and the primary key ID of any directed ID relationship group is determined according to a
preset rule; set a relationship bit for each directed ID relationship group, where the directed ID
relationship groups corresponding to the same first grouping have the same relationship bit, and the
directed ID relationship groups corresponding to different first groupings have different relationship
bits; group all directed ID relationship groups by primary key ID to obtain several second groupings;
for any second grouping, determine the count bit of that second grouping according to the number of
directed ID relationship groups it contains; extract at least one second grouping whose count bit
equals a third count value, and combine the directed ID relationship groups contained in the extracted
second groupings according to their relationship bits to obtain at least one second ID data subnet; the
number of ID data contained in a second ID data subnet is 3.
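The extraction of three-ID subnets can be sketched in the same style. The concrete count values (2 for the second count value, 1 for the third), the use of each of the three IDs in turn as primary key ID, and the function name are assumptions; the text leaves them open.

```python
from collections import defaultdict

SECOND_COUNT_VALUE = 2  # assumed: a primary key ID that occurs in exactly two pairs
THIRD_COUNT_VALUE = 1   # assumed: an ID that occurs in exactly one relationship group


def three_id_subnets(pairs):
    """pairs: de-duplicated ID relationship pairs (a, b) with a != b."""
    first_groupings = defaultdict(set)       # primary key ID -> IDs paired with it
    for a, b in pairs:
        first_groupings[a].add(b)
        first_groupings[b].add(a)

    second_groupings = defaultdict(list)     # primary key ID -> relationship bits
    bit = 0
    for key in sorted(first_groupings):
        others = first_groupings[key]
        if len(others) == SECOND_COUNT_VALUE:
            # One directed relationship group per member acting as primary key ID;
            # all groups built from this first grouping share one relationship bit.
            for member in (key, *others):
                second_groupings[member].append(bit)
            bit += 1

    by_bit = defaultdict(set)
    for key, bits in second_groupings.items():
        if len(bits) == THIRD_COUNT_VALUE:
            by_bit[bits[0]].add(key)

    # Keep a relationship bit only if all three of its IDs were extracted.
    return [ids for _, ids in sorted(by_bit.items()) if len(ids) == 3]


# Hypothetical pairs: a-b-c is an isolated chain of three, p-q-r-s is a longer chain.
subnets = three_id_subnets([("a", "b"), ("b", "c"), ("x", "y"),
                            ("p", "q"), ("q", "r"), ("r", "s")])
```

With these assumed count values the sketch finds isolated chains of three IDs. A fully connected triangle produces a count of 3 for each ID and would need a different third count value.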
According to the ID data network data analysis device provided in this embodiment, ID relation data can
be constructed from the ID data contained in the ID data network and the association relationships
between the ID data; directed forward-order and directed reverse-order processing then yields the two
directed ID relationship pairs corresponding to each ID relationship pair in the ID relation data, and
all directed ID relationship pairs are grouped by primary key ID. This effectively improves the
efficiency of ID data network data analysis, so that several ID data subnets are obtained accurately
and quickly and the ID data network is effectively divided. Optionally, the count bits of the resulting
groupings, together with the relationship bits set for the directed ID relationship pairs and the
directed ID relationship groups, make it quick and convenient to obtain the first ID data subnets and
the second ID data subnets.
Fig. 11 shows a structural block diagram of an ID data subnet processing device according to an
embodiment of the present invention. As shown in Fig. 11, the device includes a computing module 1110,
an extraction module 1120 and a cluster segmentation module 1130.
The computing module 1110 is adapted to: calculate the number of ID data contained in each ID data
subnet among several ID data subnets.
The extraction module 1120 is adapted to: extract the ID data subnets that contain more ID data than a
first preset quantity threshold.
The cluster segmentation module 1130 is adapted to: for any ID data subnet that contains more ID data
than the first preset quantity threshold, cluster and segment the ID data in that ID data subnet to
obtain several third ID data subnets corresponding to it; the number of ID data contained in a third ID
data subnet is less than or equal to a second preset quantity threshold.
Optionally, the cluster segmentation module 1130 is further adapted to: for any ID data in the ID data
subnet, calculate the distance between that ID data and other ID data according to the association
frequency between them; cluster the ID data in the ID data subnet according to the distances between
the ID data in the ID data subnet and a preset clustering rule to obtain several cluster sets; and
segment the ID data subnet according to the cluster sets to obtain several third ID data subnets
corresponding to the ID data subnet.
Optionally, the device further includes an association frequency determining module 1140, adapted to:
for any ID data subnet that contains more ID data than the first preset quantity threshold, analyze the
log data of the multiple services corresponding to that ID data subnet to determine the association
frequencies between the ID data in the ID data subnet. The association frequency determining module
1140 is further adapted to: analyze the log data of the multiple services corresponding to the ID data
subnet to calculate the actual association frequencies between the ID data in the ID data subnet; and
calculate the association frequencies between the ID data according to the actual association
frequencies between the ID data, the time information of the log data corresponding to the ID data,
and a time weight.
Optionally, the cluster segmentation module 1130 is further adapted to: determine several core ID data
from the ID data in the ID data subnet according to the distances between the ID data in the ID data
subnet and a preset neighborhood radius; for any core ID data, search the ID data subnet for other ID
data within the preset neighborhood radius of the core ID data, and cluster the core ID data and the
other ID data found according to the second preset quantity threshold to obtain a cluster set. The
cluster segmentation module 1130 is further adapted to: in the ID data subnet, for any cluster set,
remove the association relationships between the ID data in that cluster set and the ID data outside
that cluster set, thereby obtaining several third ID data subnets corresponding to the ID data subnet.
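Once the cluster sets are known, the segmentation itself amounts to dropping every association relationship that crosses a cluster boundary. A minimal Python sketch, with a hypothetical edge-list representation and function name, could look like this:

```python
def split_subnet(edges, clusters):
    """Split one ID data subnet into smaller subnets, one per cluster set.

    edges: association relationships (id_a, id_b) of the original subnet.
    clusters: disjoint sets of IDs that together cover every ID in edges.
    Returns a list of (ids, kept_edges) tuples in the order of clusters.
    """
    owner = {i: n for n, cluster in enumerate(clusters) for i in cluster}
    kept = [[] for _ in clusters]
    for a, b in edges:
        if owner[a] == owner[b]:             # both IDs are in the same cluster set
            kept[owner[a]].append((a, b))
        # Otherwise the relationship crosses cluster sets and is removed.
    return [(set(cluster), cluster_edges)
            for cluster, cluster_edges in zip(clusters, kept)]


# Hypothetical subnet of five IDs that clustering divided into two cluster sets.
edges = [("u1", "d1"), ("u1", "d2"), ("d2", "u2"), ("u2", "d3")]
third_subnets = split_subnet(edges, [{"u1", "d1", "d2"}, {"u2", "d3"}])
```

In the example only the relationship between `d2` and `u2` is removed, which leaves two independent subnets.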
According to the ID data subnet processing device provided in this embodiment, for any ID data subnet
that contains more ID data than the first preset quantity threshold, the ID data in that subnet that
have stronger, more reliable association relationships can be clustered into one class according to
the association frequencies between the ID data and the preset clustering rule, and placed in the same
third ID data subnet. Several corresponding third ID data subnets are thus obtained, and the ID data
subnet is processed effectively. Compared with the ID data subnet before processing, the ID data in a
third ID data subnet have stronger, more reliable association relationships and can be regarded as ID
data of the same user, so user characteristics can be analyzed accurately and efficiently on the basis
of the third ID data subnets in order to construct complete and effective user profiles. Moreover, the
data volume of a third ID data subnet is far smaller than that of the ID data subnet before
processing, which makes user characteristic analysis more convenient and improves analysis efficiency.
The present invention also provides a non-volatile computer storage medium. The computer storage
medium stores at least one executable instruction, and the executable instruction can perform the ID
data network data analysis method in any of the above method embodiments.
Fig. 12 shows a schematic structural diagram of a computing device according to an embodiment of the
present invention; the specific embodiments of the present invention do not limit the specific
implementation of the computing device. As shown in Fig. 12, the computing device may include: a
processor 1202, a communications interface 1204, a memory 1206, and a communication bus 1208. The
processor 1202, the communications interface 1204 and the memory 1206 communicate with one another via
the communication bus 1208. The communications interface 1204 is used to communicate with network
elements of other devices, such as clients or other servers. The processor 1202 is used to execute a
program 1210, and may specifically perform the relevant steps in the above embodiments of the ID data
network data analysis method. Specifically, the program 1210 may include program code, and the program
code includes computer operation instructions.
The processor 1202 may be a central processing unit (CPU), an application-specific integrated circuit
(ASIC), or one or more integrated circuits configured to implement the embodiments of the present
invention. The one or more processors included in the computing device may be processors of the same
type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or
more ASICs. The memory 1206 is used to store the program 1210. The memory 1206 may include high-speed
RAM, and may also include non-volatile memory, for example at least one disk memory. The program 1210
may specifically be used to cause the processor 1202 to perform the ID data network data analysis
method in any of the above method embodiments. For the specific implementation of each step in the
program 1210, reference may be made to the corresponding descriptions of the corresponding steps and
units in the above ID data network data analysis embodiments, which are not repeated here. Those
skilled in the art will clearly understand that, for convenience and brevity of description, the
specific working processes of the devices and modules described above may refer to the descriptions of
the corresponding processes in the foregoing method embodiments, which are not repeated here.
The algorithms and displays provided herein are not inherently related to any particular computer,
virtual system, or other device. Various general-purpose systems may also be used with the teachings
herein. From the description above, the structure required to construct such systems is apparent.
Moreover, the present invention is not directed to any particular programming language. It should be
understood that the content of the invention described herein may be implemented in various
programming languages, and that the above description of a specific language is given in order to
disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided here. It should be understood,
however, that embodiments of the invention can be practiced without these specific details. In some
instances, well-known methods, structures and techniques are not shown in detail so as not to obscure
the understanding of this specification. Similarly, it should be understood that, in order to
streamline the disclosure and aid understanding of one or more of the various inventive aspects, the
features of the invention are sometimes grouped together into a single embodiment, figure, or
description thereof in the above description of exemplary embodiments of the invention. However, this
method of disclosure should not be interpreted as reflecting an intention that the claimed invention
requires more features than are expressly recited in each claim. Rather, as the following claims
reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above.
Therefore, the claims following the detailed description are hereby expressly incorporated into the
detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the device of an embodiment can be
adaptively changed and arranged in one or more devices different from that embodiment. The modules,
units or components in the embodiments can be combined into one module, unit or component, and they can
furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least
some of such features and/or processes or units are mutually exclusive, all features disclosed in this
specification (including the accompanying claims, abstract and drawings) and all processes or units of
any method or device so disclosed may be combined in any combination. Unless expressly stated
otherwise, each feature disclosed in this specification (including the accompanying claims, abstract
and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar
purpose.
In addition, those skilled in the art will appreciate that, although some embodiments described herein
include certain features that are included in other embodiments and not others, combinations of
features of different embodiments are meant to be within the scope of the invention and to form
different embodiments. For example, in the claims, any one of the claimed embodiments can be used in
any combination.
The various component embodiments of the invention can be implemented in hardware, in software modules
running on one or more processors, or in a combination thereof. Those skilled in the art will
understand that a microprocessor or a digital signal processor (DSP) may be used in practice to
implement some or all of the functions of some or all of the components according to embodiments of
the invention. The invention may also be implemented as device or apparatus programs (for example,
computer programs and computer program products) for performing part or all of the methods described
herein. Such a program implementing the invention may be stored on a computer-readable medium, or may
take the form of one or more signals. Such signals may be downloaded from an Internet website,
provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that
those skilled in the art can design alternative embodiments without departing from the scope of the
appended claims. In the claims, any reference signs placed between parentheses shall not be construed
as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not
listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a
plurality of such elements. The invention can be implemented by means of hardware comprising several
distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several
devices, several of these devices can be embodied by one and the same item of hardware. The use of the
words first, second, and third does not indicate any order. These words may be interpreted as names.
Claims (10)
1. An ID data network data analysis method, the method comprising:
obtaining an ID data network comprising ID data and the association relationships between the ID data,
wherein the ID data comprise user ID data and/or device ID data;
constructing ID relation data according to the ID data and the association relationships between the
ID data contained in the ID data network, wherein the ID relation data comprise several ID relationship
pairs; and
comparing and combining the ID relation data to obtain several ID data subnets.
2. The method according to claim 1, wherein comparing and combining the ID relation data to obtain
several ID data subnets further comprises:
copying the full ID relation data into memory; and
comparing and combining the ID relation data with the ID relation data fully copied into memory, and
integrating the data according to the comparison and combination results to obtain several ID data
subnets.
3. The method according to claim 2, wherein comparing and combining the ID relation data with the ID
relation data fully copied into memory, and integrating the data according to the comparison and
combination results to obtain several ID data subnets, further comprises:
dividing the ID relation data into multiple shards;
comparing and combining the multiple shards, in parallel, with the ID relation data fully copied into
memory to obtain the comparison and combination results of all shards; and
integrating the comparison and combination results of all shards to obtain several ID data subnets.
4. The method according to claim 3, wherein comparing and combining the multiple shards, in parallel,
with the ID relation data fully copied into memory to obtain the comparison and combination results of
all shards further comprises:
for any shard, comparing and combining the shard with the ID relation data fully copied into memory to
obtain an intermediate comparison and combination result for the shard;
iteratively performing the following step until a preset iteration condition is met: dividing the
intermediate comparison and combination results of all shards into multiple intermediate sub-shards,
and comparing and combining the intermediate sub-shards, in parallel, with the ID relation data fully
copied into memory to obtain the intermediate comparison and combination results of all shards for the
next iteration; and
after the iteration ends, obtaining the comparison and combination results of all shards.
5. The method according to claim 4, wherein the preset iteration condition comprises: the number of
iterations reaches a preset number of iterations.
6. An ID data network data analysis device, the device comprising:
an obtaining module, adapted to obtain an ID data network comprising ID data and the association
relationships between the ID data, wherein the ID data comprise user ID data and/or device ID data;
a first construction module, adapted to construct ID relation data according to the ID data and the
association relationships between the ID data contained in the ID data network, wherein the ID relation
data comprise several ID relationship pairs; and
a comparison and combination module, adapted to compare and combine the ID relation data to obtain
several ID data subnets.
7. The device according to claim 6, wherein the comparison and combination module is further adapted
to:
copy the full ID relation data into memory; and
compare and combine the ID relation data with the ID relation data fully copied into memory, and
integrate the data according to the comparison and combination results to obtain several ID data
subnets.
8. The device according to claim 7, wherein the comparison-combination module is further adapted to:
divide the ID relation data into multiple fragments;
compare and combine the multiple fragments in parallel with the ID relation data fully copied into memory, to obtain the comparison-combination results of all fragments;
perform data integration on the comparison-combination results of all fragments, to obtain several ID data subnets.
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another via the communication bus;
the memory is used to store at least one executable instruction, the executable instruction causing the processor to perform operations corresponding to the ID data network data analysis method according to any one of claims 1-5.
10. A computer storage medium, wherein at least one executable instruction is stored in the storage medium, the executable instruction causing a processor to perform operations corresponding to the ID data network data analysis method according to any one of claims 1-5.
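The flow recited in claims 4, 5, 7 and 8 (full in-memory copy, fragmentation, parallel comparison and combination, a preset number of iterations, then data integration) can be illustrated with a minimal Python sketch. It is not the patent's implementation. It assumes that "comparing and combining" means extending a group of IDs with every ID relationship pair that shares an ID with it, and it uses a thread pool as a stand-in for whatever parallel runtime is actually used. The names `compare_and_combine`, `split`, `integrate`, `analyze`, `n_fragments` and `max_iterations` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import FrozenSet, Iterable, List, Set, Tuple

Pair = Tuple[str, str]    # one ID relationship pair
Group = FrozenSet[str]    # a set of IDs combined so far


def compare_and_combine(fragment: List[Group], full_copy: List[Pair]) -> List[Group]:
    """Assumed semantics: extend each group with every pair that shares an ID with it."""
    result = []
    for group in fragment:
        merged = set(group)
        for a, b in full_copy:
            if a in merged or b in merged:
                merged.update((a, b))
        result.append(frozenset(merged))
    return result


def split(items: List[Group], n_fragments: int) -> List[List[Group]]:
    """Divide the items into n_fragments round-robin fragments."""
    return [items[i::n_fragments] for i in range(n_fragments)]


def integrate(groups: Iterable[Group]) -> List[Set[str]]:
    """Data integration: merge overlapping groups into disjoint ID data subnets."""
    subnets: List[Set[str]] = []
    for group in groups:
        merged = set(group)
        remaining = []
        for subnet in subnets:
            if subnet & merged:
                merged |= subnet          # absorb every subnet that overlaps
            else:
                remaining.append(subnet)
        remaining.append(merged)
        subnets = remaining
    return subnets


def analyze(pairs: List[Pair], n_fragments: int = 2, max_iterations: int = 2) -> List[Set[str]]:
    full_copy = list(pairs)                       # full copy held in memory
    groups = [frozenset(p) for p in full_copy]
    with ThreadPoolExecutor(max_workers=n_fragments) as pool:

        def parallel_pass(items: List[Group]) -> List[Group]:
            fragments = split(items, n_fragments)
            results = pool.map(lambda f: compare_and_combine(f, full_copy), fragments)
            # Drop duplicate groups before the next pass.
            return list({g for r in results for g in r})

        intermediate = parallel_pass(groups)      # first pass over the fragments
        for _ in range(max_iterations):           # preset iteration count (claim 5)
            intermediate = parallel_pass(intermediate)
    return integrate(intermediate)
```

For example, `analyze([("u1", "d1"), ("d1", "u2"), ("u3", "d2")])` groups `u1`, `d1` and `u2` into one subnet and `u3` and `d2` into another. In this sketch the final overlap merge alone would already produce the same subnets; the iterations only reduce the number of distinct groups left for that step.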
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810973801.4A CN109190035A (en) | 2018-08-24 | 2018-08-24 | ID data network data analysis method, device and calculating equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810973801.4A CN109190035A (en) | 2018-08-24 | 2018-08-24 | ID data network data analysis method, device and calculating equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109190035A (en) | 2019-01-11 |
Family ID: 64919746
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810973801.4A Pending CN109190035A (en) | 2018-08-24 | 2018-08-24 | ID data network data analysis method, device and calculating equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109190035A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105224606A (en) * | 2015-09-02 | 2016-01-06 | 新浪网技术(中国)有限公司 | A kind of disposal route of user ID and device |
CN105227352A (en) * | 2015-09-02 | 2016-01-06 | 新浪网技术(中国)有限公司 | A kind of update method of user ID collection and device |
CN105391594A (en) * | 2014-09-03 | 2016-03-09 | 阿里巴巴集团控股有限公司 | Method and device for recognizing characteristic account number |
CN106850346A (en) * | 2017-01-23 | 2017-06-13 | 北京京东金融科技控股有限公司 | Change and assist in identifying method, device and the electronic equipment of blacklist for monitor node |
CN106897273A (en) * | 2017-04-12 | 2017-06-27 | 福州大学 | A kind of network security dynamic early-warning method of knowledge based collection of illustrative plates |
CN107193894A (en) * | 2017-05-05 | 2017-09-22 | 北京小度信息科技有限公司 | Data processing method, individual discrimination method and relevant apparatus |
CN107248929A (en) * | 2017-05-27 | 2017-10-13 | 北京知道未来信息技术有限公司 | A kind of strong associated data generation method of multidimensional associated data |
US20170323005A1 (en) * | 2007-07-25 | 2017-11-09 | Schmidt J. Raymond | Relevant Relationships Based Networking Environment |
CN108197129A (en) * | 2016-12-08 | 2018-06-22 | 中国电信股份有限公司 | Data digging method and device |
- 2018-08-24: CN application CN201810973801.4A, published as CN109190035A (en); status: active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110210227B (en) | Risk detection method, device, equipment and storage medium | |
JP6371870B2 (en) | Machine learning service | |
CN106663224B (en) | Interactive interface for machine learning model assessment | |
US10452992B2 (en) | Interactive interfaces for machine learning model evaluations | |
CN107704485A (en) | A kind of position recommends method and computing device | |
Arrigo et al. | Non-backtracking walk centrality for directed networks | |
CN109033408A (en) | Information-pushing method and device, computer readable storage medium, electronic equipment | |
CN111325619A (en) | Credit card fraud detection model updating method and device based on joint learning | |
CN108052670A (en) | A kind of recommendation method and device of camera special effect | |
CN109214692B (en) | E-book methods of marking and electronic equipment based on user's timing behavior | |
CN111444438A (en) | Method, device, equipment and storage medium for determining recall permission rate of recall strategy | |
CN107871055A (en) | A kind of data analysing method and device | |
CN105426392A (en) | Collaborative filtering recommendation method and system | |
CN109241421A (en) | ID data network processing method, calculates equipment and computer storage medium at device | |
CN109522275A (en) | Label method for digging, electronic equipment and the storage medium of content are produced based on user | |
CN108876644A (en) | A kind of similar account calculation method and device based on social networks | |
CN107608965A (en) | Extracting method, electronic equipment and the storage medium of books the names of protagonists | |
CN109190035A (en) | ID data network data analysis method, device and calculating equipment | |
CN109241419A (en) | ID data network data analysis method, device and calculating equipment | |
CN109086452A (en) | ID data network beta pruning preprocess method, device and calculating equipment | |
CN109829099A (en) | ID data subnet processing method, calculates equipment and computer storage medium at device | |
CN110215703A (en) | The selection method of game application, apparatus and system | |
Eden et al. | Max-min greedy matching | |
CN114581177A (en) | Product recommendation method, device, equipment and storage medium | |
CN108694171A (en) | The method and device of information push |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190111 |