CN106570014A - Method and device for determining home attribute information of user - Google Patents

Publication number
CN106570014A
Authority
CN
China
Prior art keywords
information
user
equipment
family
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510649771.8A
Other languages
Chinese (zh)
Other versions
CN106570014B (en)
Inventor
吴保华
付登坡
甘云锋
黄耐寒
吕秀泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510649771.8A
Publication of CN106570014A
Application granted
Publication of CN106570014B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science
  • Databases & Information Systems
  • Theoretical Computer Science
  • Data Mining & Analysis
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Telephonic Communication Services
  • Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

The objective of the invention is to provide a method and device for determining home attribute information of a user. The method includes the following steps: acquiring sample data, wherein the sample data includes association information between a sample user and a sample network device, such as communication time, communication frequency, and communication date; performing machine learning on the sample data to determine corresponding association decision model information; and applying usage record information of a user regarding a network device to the association decision model information to obtain family association information between the user and the household corresponding to the network device. Compared with the prior art, determining the association decision model information by machine learning can effectively improve the recognition rate of user family association relationships.

Description

Method and device for determining home attribute information of a user
Technical field
The present invention relates to the field of computers, and in particular to a technique for determining home attribute information of a user.
Background
With the rapid development of home Internet technology, more and more services are carried out on a per-household basis. Identifying which users come from the same household is therefore crucial to fine-grained, data-driven operation of the home Internet.
In the prior art, household identification of users is mainly inferred from the communication data relationship between landline telephones and mobile phone numbers. This approach has several defects: for example, a model based on small-sample data overfits easily, the cost of data acquisition keeps rising, the communication devices of a user cannot be identified in a unified way, the approach is difficult to extend with Internet behavior features, and the coverage and recognition rate for household users are not high. As home Internet technology develops, these problems will become increasingly prominent.
Summary of the invention
The purpose of the present application is to provide a method and device for determining home attribute information of a user, so as to solve the problem of determining whether a user has a family association relationship with the household corresponding to a network device.
According to one aspect of the present application, a method for determining home attribute information of a user is provided, wherein the method includes:
obtaining sample data, wherein the sample data includes association information between a sample user and a sample network device;
determining corresponding association decision model information by performing machine learning on the sample data;
applying usage record information of a user regarding a network device to the association decision model information, to obtain family association information between the user and the household corresponding to the network device.
According to another aspect of the present application, a device for determining home attribute information of a user is also provided, wherein the device includes:
a sample acquiring apparatus, configured to obtain sample data, wherein the sample data includes association information between a sample user and a sample network device;
a model determining apparatus, configured to determine corresponding association decision model information by performing machine learning on the sample data;
a model application apparatus, configured to apply usage record information of a user regarding a network device to the association decision model information, to obtain family association information between the user and the household corresponding to the network device.
Compared with the prior art, the present application obtains sample data, wherein the sample data includes association information between a sample user and a sample network device, such as communication time, communication frequency, and communication date; performs machine learning on the sample data to determine corresponding association decision model information; and applies usage record information of a user regarding a network device to the association decision model information to obtain family association information between the user and the household corresponding to the network device. Determining the association decision model information by machine learning can effectively improve the recognition rate of user family association relationships.
Moreover, according to the mapping relationships between user identification information, the present application can merge the communication record information corresponding to different user identification information of the same user into the communication record information of that one user, and can establish multiple user and communication device association groups from the merged communication record information of multiple users using different network devices. For example, the communication devices used by a household user are mapped in a unified way, that is, normalized to the same user, which makes it easier to extend the approach with Internet behavior features.
In addition, the present application can determine that two users belong to the same family when the family association information between each of the two users and the household corresponding to the same network device is "associated". It can also determine the multiple target users included in a target household from the target network device corresponding to that household, and determine family portrait information of the target household from the portrait information of the target users. Recommendation information, such as promotion information or advertising information, can then be provided to the target household according to the family portrait information, which benefits the development of the many services carried out on a per-household basis.
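The merging of identifiers described above can be illustrated with a small union-find sketch. The source does not say how the mapping relationships are stored or resolved, so the `(identifier, identifier)` link pairs, the `(identifier, device)` record tuples, and the function name `merge_user_records` are all hypothetical choices made for illustration only.

```python
def merge_user_records(id_links, records):
    """Merge communication records of identifiers that map to the same user.

    id_links: iterable of (identifier_a, identifier_b) pairs known to belong
              to one user (hypothetical representation of the mapping relations).
    records:  iterable of (identifier, device) communication records.
    Returns {canonical_identifier: [device, ...]}.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in id_links:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Use the lexicographically smaller root as the canonical identifier.
            parent[max(ra, rb)] = min(ra, rb)

    merged = {}
    for identifier, device in records:
        merged.setdefault(find(identifier), []).append(device)
    return merged
```

After merging, every record is keyed by a single canonical identifier per user, so association groups can be built per user rather than per account, phone number, or cookie.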
Brief description of the drawings
Other features, objects, and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
Fig. 1 shows a flow chart of a method for determining home attribute information of a user according to one aspect of the present application;
Fig. 2 shows a flow chart of a method for determining home attribute information of a user according to a preferred embodiment of the present application;
Fig. 3 shows a schematic diagram of a device for determining home attribute information of a user according to another aspect of the present application;
Fig. 4 shows a schematic diagram of a device for determining home attribute information of a user according to a preferred embodiment of the present application.
The same or similar reference numerals in the drawings denote the same or similar components.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings.
In a typical configuration of the present application, the terminal, the device of the service network, and the trusted party each include one or more processors (CPU), input/output interfaces, network interfaces, and memory.
Memory may include non-persistent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
To further explain the technical means adopted by the present application and the effects achieved, the technical solutions of the present application are described clearly and completely below with reference to the accompanying drawings and preferred embodiments.
Referring to Fig. 1, a method for determining home attribute information of a user according to one aspect of the present application is shown, wherein the method includes:
S1: obtaining sample data, wherein the sample data includes association information between a sample user and a sample network device;
S2: determining corresponding association decision model information by performing machine learning on the sample data;
S3: applying usage record information of a user regarding a network device to the association decision model information, to obtain family association information between the user and the household corresponding to the network device.
In this embodiment, in step S1, sample data is obtained, wherein the sample data includes association information between a sample user and a sample network device. Specifically, a network device here is a device through which a user accesses the Internet; for example, it may be a router or a device that sets up a wireless access point. A sample network device is a network device used as a sample for obtaining the association decision model described below. The association information between a sample user and a sample network device includes all information associating the two, that is, information about the sample user's access to the sample network device, for example the time distribution of accesses over a short period (such as one day), the time distribution over a long period (such as one month), and the access frequency.
Specifically, the sample data may be obtained directly from sample data already present on a local device, or may be extracted from collected communication data between network devices and users whose association relationships have already been determined.
Those skilled in the art should understand that the ways of obtaining sample data in step S1 above are only examples. Other ways of obtaining sample data, whether existing or arising in the future, that are applicable to the present application also fall within the protection scope of the present application and are incorporated herein by reference.
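As a rough illustration of the association information named in step S1, the sketch below summarises one user's accesses to one network device as a 24-bin hourly distribution, a count of active days, and a total access count. The function name, the use of `datetime` objects as input, and the exact choice of features are assumptions for illustration; the source only names time distribution and frequency as examples.

```python
from collections import Counter
from datetime import datetime


def association_features(access_times):
    """Summarise one user's accesses to one network device.

    access_times: list of datetime objects, one per access
                  (hypothetical input format).
    """
    total = len(access_times)
    per_hour = Counter(t.hour for t in access_times)
    # Share of accesses falling into each hour of the day (short-period distribution).
    hourly_share = [per_hour.get(h, 0) / total if total else 0.0 for h in range(24)]
    # Number of distinct calendar days with at least one access (long-period activity).
    active_days = len({t.date() for t in access_times})
    return {
        "hourly_share": hourly_share,
        "active_days": active_days,
        "access_count": total,
    }
```

A household router would typically show many active days and a concentration of accesses in evening hours, whereas a device visited occasionally would not; whether these particular features are the ones used is not stated in the source.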
Continuing with this embodiment, in step S2, corresponding association decision model information is determined by performing machine learning on the sample data. Specifically, the association decision model is used to determine whether a user has an association relationship with a network device. The association decision model may be implemented as an artificial intelligence model, for example using the GBDT (gradient boosting decision tree) algorithm. This algorithm consists of many decision trees, and the final classification result is obtained by accumulating the results of all the trees. For example, the GBDT algorithm is trained repeatedly on the sample data until the association relationships it outputs between users and network devices reach a certain accuracy, thereby determining the corresponding association decision model information.
Continuing with this embodiment, in step S3, usage record information of a user regarding a network device is applied to the association decision model information, to obtain family association information between the user and the household corresponding to the network device. The usage record information includes, for example, communication information between the user and multiple network devices. The family association information indicates whether the user has an association relationship with the household corresponding to the network device. Specifically, features are extracted from the usage record information, and the extracted usage record information is applied to the association decision model information determined in step S2 to obtain whether the user has an association relationship with the household corresponding to the network device, thereby determining the family association information between the user and that household.
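To make the "many trees, accumulated result" idea in step S2 concrete, here is a deliberately simplified gradient boosting sketch in pure Python. It uses depth-1 trees (stumps) fitted to log-loss residuals, which is one basic form of gradient boosting and not necessarily the GBDT variant the source has in mind; the feature layout, function names, and hyperparameters are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            err = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, threshold, lmean, rmean)
    return best[1:] if best else None


def train_gbdt(X, y, rounds=50, learning_rate=0.5):
    """Train boosted stumps on labels y in {0, 1} (1 = associated)."""
    positive_rate = sum(y) / len(y)
    if not 0.0 < positive_rate < 1.0:
        raise ValueError("both positive and negative samples are required")
    base = math.log(positive_rate / (1.0 - positive_rate))
    scores = [base] * len(X)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the log-loss with respect to the current score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, left_value, right_value = stump
        stumps.append(stump)
        scores = [s + learning_rate * (left_value if row[j] <= t else right_value)
                  for row, s in zip(X, scores)]
    return base, learning_rate, stumps


def predict_proba(model, row):
    """Accumulate the outputs of all stumps into an association probability."""
    base, learning_rate, stumps = model
    score = base + sum(learning_rate * (lv if row[j] <= t else rv)
                       for j, t, lv, rv in stumps)
    return sigmoid(score)
```

Here each row might be something like `[active_days, evening_share]` for a (user, device) pair, with positive and negative samples supplying the labels. A production system would more likely rely on an established GBDT library than on a hand-written loop like this one.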
Preferably, step S3 includes:
S31 (not shown): applying usage record information of a user regarding a network device to the association decision model information, to obtain device association information between the user and the network device;
S32 (not shown): when the device association information exceeds predetermined association threshold information, determining that the family association information between the user and the household corresponding to the network device is "associated".
Specifically, in step S31, the device association information between the user and the network device may be represented by the association probability between the user and the network device. The communication information of the user regarding the network device is applied to the association decision model information to obtain the association probability between the user and the network device, and the device association information is determined from the magnitude of that probability. For example, with user U and a network device that is router R, the communication information between user U and router R is input into the association decision model information to obtain the association probability of user U and router R, and the magnitude of this probability determines whether user U and router R are associated.
Specifically, in step S32, the association threshold information is a preset threshold on the association probability between the user and the network device. The obtained association probability is compared with the predetermined association probability threshold; when the association probability is greater than the threshold, the family association information between the user and the household corresponding to the network device is determined to be "associated". For example, the association probability threshold may be set to 80%. With router R as the network device and user U as the user, if the association probability of router R and user U is greater than 80%, the family association information between user U and the household corresponding to router R is determined to be "associated".
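Steps S31 and S32 reduce to a comparison of a model output against a preset threshold. A minimal sketch, assuming the association probability has already been produced by some model and using the 80% figure from the example as the default (the function name is hypothetical):

```python
def is_family_associated(association_probability, threshold=0.8):
    """Return True when the user is treated as associated with the household
    behind the network device, i.e. the probability strictly exceeds the
    preset association threshold (0.8 mirrors the 80% example)."""
    return association_probability > threshold
```

The strict "greater than" comparison follows the wording of the example; a probability exactly equal to the threshold is treated as not associated.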
Preferably, the method also includes:
S4: when the family association information between each of two users and the household corresponding to the same network device is "associated", determining that the two users belong to the same family.
Specifically, in step S4, the association probabilities between each of the two users and the same network device are obtained through step S3. When both association probabilities are greater than the association probability threshold, that is, when the family association information between each of the two users and the household corresponding to the same network device is "associated", the two users are determined to belong to the same family. For example, with router R as the network device and an association probability threshold of 80%, when the association probabilities of user U1 and of user U2 with router R are both greater than 80%, user U1 and user U2 are each determined to be associated with the household in which router R is located, and therefore user U1 and user U2 are determined to belong to the same family.
Preferably, the method also includes:
S5 (not shown): determining the multiple target users included in a target household according to the target network device corresponding to the target household, wherein the family association information between each target user and the household corresponding to the target network device is "associated".
Specifically, in step S5, each target household has multiple target users. The multiple target users included in the target household are determined as those users whose family association information with the household corresponding to the target network device is "associated". For example, if the target network device corresponding to the target household is router R, and the association probabilities of users U1, U2, U3, and U4 with router R are all greater than the association threshold information described in step S32, then users U1, U2, U3, and U4 are determined to be associated with the household in which router R is located, and are therefore the multiple target users included in the target household.
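Steps S4 and S5 can be sketched together: group users by the device whose household they are associated with, then treat any two users in the same group as belonging to the same family. The dictionary keyed by `(user, device)` and both function names are assumptions for illustration.

```python
def household_members(association_probability, threshold=0.8):
    """Map each network device to the users associated with its household.

    association_probability: {(user, device): probability}
                             (hypothetical container for model outputs).
    """
    members = {}
    for (user, device), probability in association_probability.items():
        if probability > threshold:
            members.setdefault(device, []).append(user)
    return {device: sorted(users) for device, users in members.items()}


def same_family(members, user_a, user_b):
    """Two users belong to the same family if both are associated with
    the household of at least one common network device."""
    return any(user_a in users and user_b in users for users in members.values())
```

With router R as the target network device, `household_members(...)["R"]` plays the role of the target users of the target household in step S5.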
More preferably, the method also includes:
S6 (not shown): determining family portrait information of the target household according to the user portrait information of the target users;
S7 (not shown): providing recommendation information to the target household according to the family portrait information.
Specifically, in step S6, the family portrait information of the target household is determined according to the user portrait information of the target users. User portrait information is a collection of information describing the characteristics of a user, including but not limited to the user's gender, age, occupation, educational background, skills, and hobbies. Family portrait information is a collection of information describing the characteristics of a family, including but not limited to family background, family hobbies, family income, and the family's attitude to life. The family portrait information of the target household may be determined by analyzing the user characteristic information of the target users to derive the characteristic information of the family to which they belong. For example, if analysis shows that the users of the target household like sports, the family hobbies of the target household can be determined to include sports.
Continuing with this embodiment, in step S7, recommendation information is provided to the target household according to the family portrait information. The recommendation information includes but is not limited to promotion information, advertising information, and financial management information. Matching recommendation information may be provided to the target household according to certain family characteristics in the family portrait, for example family hobbies or family income. For example, if the family hobbies in the family portrait information of the target household include fine food, matching food-related information can be recommended to that household.
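Steps S6 and S7 might look like the following sketch, which derives family hobbies from member hobbies and then matches them against a small catalogue. The aggregation rule (a hobby counts as a family hobby when at least half of the members share it), the dictionary layouts, and the function names are assumptions; the source does not specify how user portraits are combined.

```python
from collections import Counter


def family_portrait(user_portraits, min_share=0.5):
    """Derive a family portrait from member portraits.

    user_portraits: list of dicts with an optional "hobbies" list
                    (hypothetical portrait format).
    min_share:      assumed fraction of members that must share a hobby.
    """
    member_count = len(user_portraits)
    counts = Counter(h for p in user_portraits for h in set(p.get("hobbies", [])))
    hobbies = sorted(h for h, c in counts.items()
                     if member_count and c / member_count >= min_share)
    return {"member_count": member_count, "hobbies": hobbies}


def recommend(portrait, catalogue):
    """Return catalogue items whose tag matches one of the family hobbies.

    catalogue: {hobby_tag: [recommendation, ...]} (hypothetical).
    """
    return [item
            for tag, items in catalogue.items() if tag in portrait["hobbies"]
            for item in items]
```

Other portrait fields such as family income could be aggregated and matched in the same way.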
Preferably, the sample data includes positive sample data, wherein the positive sample data includes association information of a sample user and a sample network device that are associated with each other;
wherein, referring to Fig. 2, step S1 includes:
S11: establishing multiple user and communication device association groups according to the communication record information of multiple users using different network devices;
S12: screening the multiple user and communication device association groups based on predetermined rules to determine the preferred network device corresponding to the same user, and recording them in the positive sample data as an associated sample user and sample network device.
Specifically, in step S11, multiple user and communication device association groups are established according to the communication record information of multiple users using different network devices. The communication record information includes but is not limited to the communication information between the multiple users and the different network devices. The association groups may be established by combining the multiple users with the multiple network devices. For example, when the network devices are routers and the routers communicating with users U1 and U2 are R1 and R2, the following association groups can be established: (U1_R1, x1, x2, x3...), (U1_R2, x1, x2, x3...), (U2_R1, x1, x2, x3...), (U2_R2, x1, x2, x3...), where x1, x2, x3... represent the communication information between a user and a router, and their number can be set according to specific requirements.
Specifically, in step S12, the multiple user and communication device association groups are screened based on predetermined rules to determine the preferred network device corresponding to the same user, and the result is recorded in the positive sample data as an associated sample user and sample network device. The preferred network device is the network device most relevant to the household of the sample user. When the network devices are routers, the router most relevant to the household of a given user is determined, based on the predetermined rules, from the established association groups of users and routers. For example, if the routers communicating with users U1 and U2 are R1, R2, and R3, the following association groups can be established: (U1_R1, x1, x2, x3...), (U1_R2, x1, x2, x3...), (U1_R3, x1, x2, x3...), (U2_R1, x1, x2, x3...), (U2_R2, x1, x2, x3...), (U2_R3, x1, x2, x3...). If screening based on the predetermined rules finds that the router most relevant to user U1 is R1 and the router most relevant to user U2 is R3, then (U1_R1, x1, x2, x3...) and (U2_R3, x1, x2, x3...) are recorded in the positive sample data.
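A minimal sketch of steps S11 and S12 follows. It builds one association group per (user, device) pair and keeps the groups whose device is the user's preferred device. The record format `(user, device, date_string)`, the choice of two features (record count and distinct days) for x1 and x2, and the externally supplied `preferred_device` mapping are all assumptions; the source leaves the features open and defines the screening rules separately.

```python
def build_association_groups(records):
    """Build {(user, device): (x1, x2)} from communication records.

    records: iterable of (user, device, date_string) tuples (hypothetical schema).
    x1 is the number of records and x2 the number of distinct days.
    """
    counts = {}
    days = {}
    for user, device, day in records:
        key = (user, device)
        counts[key] = counts.get(key, 0) + 1
        days.setdefault(key, set()).add(day)
    return {key: (counts[key], len(days[key])) for key in counts}


def positive_samples(groups, preferred_device):
    """Keep only the groups whose device is the user's preferred device.

    preferred_device: {user: device}, assumed to come from the predetermined rules.
    """
    return {key: features for key, features in groups.items()
            if preferred_device.get(key[0]) == key[1]}
```

In the U1/U2 example, passing `{"U1": "R1", "U2": "R3"}` as `preferred_device` keeps exactly the U1_R1 and U2_R3 groups as positive samples.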
More preferably, the predetermined rules include at least any one of the following:
the distance information between the device location information of the preferred network device and the home location information of the same user is less than or equal to predetermined related-distance threshold information;
the distance information between the device location information of the other network devices used by the same user and the home location information is equal to or greater than predetermined unrelated-distance threshold information;
the distance information between the device location information of the preferred network device and the home location information of the same user is less than the distance information between the device location information of the other network devices used by the same user and the home location information.
The rules for screening the preferred network device may include at least any one of the following:
(1) The distance information between the device location information of the preferred network device and the home location information of the same user is less than or equal to predetermined related-distance threshold information. The home location information may be determined in ways including but not limited to: from payment relationship data, for example from the user's frequently used delivery address in the payment relationship data; or from location relationship data, for example from the locations of wireless hotspots where the user is active and the times of that activity. The related-distance threshold information is a preset threshold for treating the device location information of the preferred network device as related to the home location information of the same user. Specifically, when the network device is a router and the distance between the location of the router and the home location of the same user is less than or equal to the preset threshold, the router is determined to be the router most relevant to the household of that user, that is, the preferred network device corresponding to that user. For example, if the related-distance threshold information is set to 0.2 km and the distance between router R and the home location of user U is less than or equal to 0.2 km, router R is determined to be the preferred network device of user U.
(2) The distance information between the device location information of the other network devices used by the same user and the home location information is equal to or greater than predetermined unrelated-distance threshold information. The unrelated-distance threshold information is a preset threshold for treating the device location information of a network device as unrelated to the home location information of the same user. Specifically, when the network device is a router and the distance between the location of the router and the home location of the same user is equal to or greater than the preset threshold, the router is determined to be unrelated to the household of that user. For example, if the unrelated-distance threshold information is set to 3 km and the distances between routers R1 and R2 and the home location of the same user U are greater than or equal to 3 km, routers R1 and R2 are determined to be non-preferred network devices for the household of user U.
(3) The distance information between the device location information of the preferred network device and the home location information of the same user is less than the distance information between the device location information of the other network devices used by the same user and the home location information. Specifically, when the network device is a router, the distance between the router most relevant to the household of the same user and the home location of that user is less than the distance between each other router used by that user and the home location. For example, if the preferred network device of user U is router R1 and the non-preferred network devices are routers R2 and R3, then the distance between router R1 and the home location of user U is less than the distances between routers R2 and R3 and the home location of user U.
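The three distance rules can be sketched as follows. The haversine great-circle formula is a standard way to get a distance from two latitude/longitude pairs, but the source does not say how distance is computed, so it is a stand-in. Combining rules (1) and (3) to pick the preferred device and using rule (2) to flag non-preferred devices is one possible reading of "at least any one of"; the thresholds default to the 0.2 km and 3 km values from the examples.

```python
import math


def distance_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points
    given in degrees, using the haversine formula (assumed distance measure)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def screen_devices(distance_by_device, related_km=0.2, unrelated_km=3.0):
    """Screen one user's devices by distance to the user's home location.

    distance_by_device: {device: distance_km_to_home} (hypothetical input).
    Returns (preferred_device_or_None, sorted_non_preferred_devices).
    """
    preferred = None
    if distance_by_device:
        # Rule (3): the preferred device is closer to home than any other device.
        nearest = min(distance_by_device, key=distance_by_device.get)
        # Rule (1): and it lies within the related-distance threshold.
        if distance_by_device[nearest] <= related_km:
            preferred = nearest
    # Rule (2): devices at or beyond the unrelated-distance threshold.
    non_preferred = sorted(device for device, km in distance_by_device.items()
                           if km >= unrelated_km)
    return preferred, non_preferred
```

Devices that fall between the two thresholds are neither preferred nor flagged as unrelated in this sketch.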
More preferably, the sample data also includes negative sample data, wherein the negative sample data includes association information of a sample user and a sample network device that are not associated with each other;
referring to Fig. 2, step S1 also includes:
S13: selecting the unrelated network devices corresponding to the same user according to the cumulative communication information between the same user and the other communication devices used, and recording them in the negative sample data as an unassociated sample user and sample network devices.
Those skilled in the art will understand that once the positive sample data for a user and the corresponding preferred network device has been determined, the user is not associated with the other communication devices apart from the preferred network device. Some of these other communication devices can therefore be selected, according to the user's cumulative communication with each of them, as the unrelated network devices corresponding to the user, and negative sample data can then be built for use in machine learning. Specifically, in step S13, after the positive sample data for the user and the preferred network device has been determined, several of the other communication devices apart from the preferred network device are selected according to the cumulative communication information between the user and each communication device. Examples are the other communication devices whose cumulative communication with the user exceeds a predetermined number of communication days or a predetermined number of communications, or the top N communication devices by cumulative communication with the user. These are taken as the unrelated network devices corresponding to the user, that is, the user has no association with each unrelated network device. The user and each selected unrelated network device are then recorded in the negative sample data as unassociated sample data. For example, when the other network device is router R, the cumulative communication information is the number of communication days, and the preset threshold on communication days is 10, then if the number of days on which user U communicated with router R is greater than or equal to 10, router R is determined to be an unrelated network device of user U and is recorded in the negative sample data as an unassociated sample user and sample network device.
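Step S13 can be sketched as below, covering both selection variants mentioned above: a threshold on cumulative communication days, or the top N devices by cumulative communication. The input dictionaries and the function name are hypothetical, and the default threshold of 10 days is taken from the example.

```python
def negative_samples(cumulative_days, preferred_device, min_days=10, top_n=None):
    """Select unrelated (user, device) pairs to serve as negative samples.

    cumulative_days:  {(user, device): number_of_communication_days}
    preferred_device: {user: device} already recorded as positive samples
    min_days:         threshold on communication days
    top_n:            if given, take each user's top-N devices instead of
                      applying the min_days threshold
    """
    per_user = {}
    for (user, device), days in cumulative_days.items():
        if preferred_device.get(user) == device:
            continue  # the preferred device is a positive sample, never a negative
        per_user.setdefault(user, []).append((days, device))

    negatives = []
    for user, items in per_user.items():
        if top_n is not None:
            # Most communication first; ties broken by device name.
            chosen = sorted(items, key=lambda item: (-item[0], item[1]))[:top_n]
        else:
            chosen = [item for item in items if item[0] >= min_days]
        negatives.extend((user, device) for _, device in chosen)
    return sorted(negatives)
```

Choosing heavily used non-preferred devices, rather than random ones, gives the model negative samples that superficially resemble household devices.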
More preferably, the number of communicating users of the communication device in each user and communication device association group is less than or equal to a household user count threshold.
Those skilled in the art will understand that, in practical application scenarios, a communication device used in a home usually serves only a small number of users, whereas a communication device outside the home, such as one in a coffee shop or a library, is usually used by a large number of users. In this embodiment, therefore, a household user count threshold can also be used to filter out in advance the communication devices that clearly are not used in a home; that is, the number of communicating users of the communication device in each user and communication device association group is less than or equal to the household user count threshold. Here, the household user count threshold includes the average household user count or a certain multiple of that average. Specifically, for example, assume the household user count threshold is 5. When routers R1, R2 and R3 have established association groups with 5, 2 and 10 different users respectively, the association groups corresponding to router R3 are deleted, and only the association groups corresponding to routers R1 and R2 are retained.
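The pre-filtering described above can be sketched as follows, reproducing the R1, R2, R3 example. The function name `filter_household_devices` and the representation of an association group as a (user, device) pair are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def filter_household_devices(
    groups: List[Tuple[str, str]], max_users: int
) -> List[Tuple[str, str]]:
    """Keep only association groups whose device serves at most max_users users."""
    users_per_device: Dict[str, Set[str]] = {}
    for user, device in groups:
        users_per_device.setdefault(device, set()).add(user)
    return [(u, d) for u, d in groups if len(users_per_device[d]) <= max_users]


groups = (
    [(f"U{i}", "R1") for i in range(1, 6)]      # R1 serves 5 users
    + [(f"U{i}", "R2") for i in range(1, 3)]    # R2 serves 2 users
    + [(f"U{i}", "R3") for i in range(1, 11)]   # R3 serves 10 users
)
kept = filter_household_devices(groups, max_users=5)
print(sorted({device for _, device in kept}))
```

With a threshold of 5, the groups for R3 are dropped and those for R1 and R2 remain.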
More preferably, the method further comprises:
S15 (not shown): determining the household user count threshold according to household census information for the region where the communication device in the user and communication device association group is located.
Specifically, in step S15, the average household user count differs between the regions where the communication devices in the user and communication device association groups are located. The average household user count of a region can be determined from household census information, and that average, or a certain multiple of it, is used as the household user count threshold. For example, if census information for the Shanghai area gives an average household user count of 3, the household user count threshold for the Shanghai area can be set to 3, 6, 9 or 12.
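The arithmetic of step S15 is simple enough to state directly. In the sketch below, the function name `household_user_thresholds` and the default set of multiples are assumptions chosen to match the Shanghai example.

```python
from typing import List, Sequence


def household_user_thresholds(
    avg_household_users: int, multiples: Sequence[int] = (1, 2, 3, 4)
) -> List[int]:
    """Candidate thresholds: the regional average or a multiple of it."""
    return [avg_household_users * m for m in multiples]


# Average household user count of 3, as in the Shanghai example.
print(household_user_thresholds(3))  # → [3, 6, 9, 12]
```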
More preferably, step S11 comprises:
S111 (not shown): merging, according to the mapping relationships between user identification information, the communication record information corresponding to the different user identification information of the same user into the communication record information of that same user.
S112 (not shown): establishing multiple user and communication device association groups according to the merged communication record information of multiple users using different network devices.
Specifically, in step S111, the user identification information includes, but is not limited to, the MAC address, IMEI number and IMSI number of a communication device used by the same user, the ID with which the user logs in to an application, and so on. The mapping relationships are provided by the UUIC service. Specifically, through the mapping relationships provided by the UUIC service, the MAC addresses, IMEI numbers and IMSI numbers of the different communication devices used by the same user, together with the user's application login IDs and the like, are mapped to that same user, so that the communication record information corresponding to the different communication devices is merged into the communication record information of the same user. For example, the mobile devices used by user U are mobile phones P1 and P2, whose IMEI numbers are imei1 and imei2 respectively. The communication record (imei1, R) between mobile phone P1 and router R and the communication record (imei2, R) between mobile phone P2 and router R are both mapped by the UUIC service to the communication record (U, R) between the user and the router.
Specifically, in step S112, multiple user and communication device association groups are established according to the merged communication record information of multiple users using different network devices. For example, user U1 has mobile devices imei1 and imei2. Mobile device imei1 has the communication records (imei1, R1) and (imei1, R2) with routers R1 and R2, and mobile device imei2 has the communication records (imei2, R1) and (imei2, R2) with routers R1 and R2. The UUIC service maps (imei1, R1) and (imei2, R1) to (U1, R1), and maps (imei1, R2) and (imei2, R2) to (U1, R2).
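Steps S111 and S112 can be sketched as follows. The interface of the UUIC service is not described in this document, so a plain dictionary `id_to_user` stands in for it here; that dictionary, the function name `merge_records`, and the choice to skip unmapped identifiers are all assumptions of the example.

```python
from typing import Dict, Iterable, List, Set, Tuple


def merge_records(
    records: Iterable[Tuple[str, str]], id_to_user: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Map (identifier, device) records to deduplicated (user, device) groups."""
    merged: Set[Tuple[str, str]] = set()
    for identifier, device in records:
        user = id_to_user.get(identifier)
        if user is None:
            continue  # assumption: identifiers without a known mapping are skipped
        merged.add((user, device))
    return sorted(merged)


# Stand-in for the identifier mapping; both IMEIs belong to user U1.
id_to_user = {"imei1": "U1", "imei2": "U1"}
records = [("imei1", "R1"), ("imei1", "R2"), ("imei2", "R1"), ("imei2", "R2")]
print(merge_records(records, id_to_user))
```

The four device-level records collapse into the two user-level groups (U1, R1) and (U1, R2), matching the example above.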
More preferably, referring to Fig. 2, step S1 further comprises:
S14: extracting the sample feature information from the sample data;
Wherein, step S2 includes:
determining the corresponding association decision model information by performing machine learning on the sample data and the sample feature information therein.
Specifically, in step S14, the sample feature information in the sample data is extracted, wherein the sample feature information includes feature information of the network device itself, feature information of the user, and feature information of the communication between the user and the network device. For example, when the network device is a router, the sample feature information includes feature information of the router itself, feature information of the user, and feature information of the communication between the user and the router. The feature information of the router itself includes, but is not limited to: the average daily number of communicating users of the router, the total number of users who have communicated with the router, the ratio of the router's weekday to weekend communicating user counts, the ratio of the router's communicating user counts in different time periods, and so on. The feature information of the user includes, but is not limited to: the number of days on which the user communicated with all routers on weekends or on weekdays, the number of communications between the user and all routers in different periods of the same day, and so on. The feature information of the communication between the user and the router includes, but is not limited to: the number of days on which the user communicated with the router, the most recent date on which the user communicated with the router, the number of days on which the user communicated with the router on weekdays or on weekends, the number of days on which the user communicated with the router in different periods of the day, the number of days on which the user communicated with the router in each week, and so on.
Specifically, in step S2, the corresponding association decision model information is determined by performing machine learning on the sample data and the sample feature information therein. Specifically, the sample data and its sample feature information form a training set (R_U, x1, x2, x3, x4, ..., label), wherein R_U denotes the association group of user U and network device R; x1, x2, x3, x4, ... denote the sample feature information, the number of which is set according to specific requirements; and label takes the value 1 or 0, namely 1 for a positive sample and 0 for a negative sample. The corresponding association decision model information is determined by performing machine learning on the training set (R_U, x1, x2, x3, x4, ..., label).
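The training-set rows (R_U, x1, x2, x3, ..., label) can be assembled as in the sketch below. Only three of the user-router communication features listed above are computed (total communication days, weekday days, weekend days); the function names, the input layout and the sample dates are assumptions made for illustration.

```python
from datetime import date
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (user, device)


def extract_features(days: Iterable[date]) -> Tuple[int, int, int]:
    """x1 = communication days, x2 = weekday days, x3 = weekend days."""
    unique_days = set(days)
    weekend = sum(1 for d in unique_days if d.weekday() >= 5)  # 5 = Sat, 6 = Sun
    return len(unique_days), len(unique_days) - weekend, weekend


def build_training_set(
    comm_days: Dict[Pair, List[date]],
    positives: Iterable[Pair],
    negatives: Iterable[Pair],
) -> List[tuple]:
    """Emit one (R_U, x1, x2, x3, label) row per sample pair."""
    rows = []
    for label, pairs in ((1, positives), (0, negatives)):
        for user, device in sorted(pairs):
            features = extract_features(comm_days[(user, device)])
            rows.append((f"{device}_{user}", *features, label))
    return rows


comm_days = {
    ("U1", "R1"): [date(2015, 10, 3), date(2015, 10, 4), date(2015, 10, 5)],
    ("U1", "R2"): [date(2015, 10, 5), date(2015, 10, 6)],
}
for row in build_training_set(comm_days, [("U1", "R1")], [("U1", "R2")]):
    print(row)
```

Here 3 and 4 October 2015 fall on a weekend, so the positive row for R1 carries two weekend days while the negative row for R2 carries none.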
Still more preferably, step S3 comprises:
S31 (not shown): extracting prediction feature information from the user's usage record information about the network device according to the sample feature information;
S32 (not shown): applying the prediction feature information to the association decision model information to obtain the family association information between the user and the family corresponding to the network device.
Specifically, in step S31, prediction feature information is extracted from the user's usage record information about the network device according to the sample feature information. The prediction feature information consists of the association group of the user and the communication device together with its feature information, and can be expressed as (R_U, x1, x2, x3, x4, ..., -1), wherein R_U denotes the user and network device association group and x1, x2, x3, x4, ... denote the content of the prediction feature information. Specifically, the content of the prediction feature information is the same as that of the sample feature information, which has been listed in the foregoing embodiment and is not repeated here.
Specifically, in step S32, the prediction feature information is applied to the association decision model information to obtain the family association information between the user and the family corresponding to the network device. Specifically, multiple items of prediction feature information (R_U, x1, x2, x3, x4, ..., -1) are input into the association decision model information to obtain the association probability of network device R and user U, from which the family association information between the user and the family corresponding to the network device is obtained.
Compared with the prior art, the present application obtains sample data, wherein the sample data includes association information between sample users and sample network devices, such as communication time, communication frequency and communication date; performs machine learning on the sample data to determine the corresponding association decision model information; and applies the user's usage record information about a network device to the association decision model information to obtain the family association information between the user and the family corresponding to the network device. Determining the corresponding association decision model information by machine learning can effectively improve the recognition rate of user-family association relationships.
Moreover, the present application can also merge, according to the mapping relationships between user identification information, the communication record information corresponding to the different user identification information of the same user into the communication record information of that same user, and establish multiple user and communication device association groups according to the merged communication record information of multiple users using different network devices. For example, the communication devices used by a family user are uniformly mapped, that is, the communication devices are normalized to the same user, which facilitates extension using Internet behavior features.
In addition, the present application can also determine that two users belong to the same family when the family association information between each of the two users and the family corresponding to the same network device is "associated". It can further determine, according to the target network device corresponding to a target family, the multiple target users included in the target family, and determine family profile information of the target family according to the profile information of the target users, so that recommendation information, such as promotion information and advertising information, can be provided to the target family according to the family profile information. This is conducive to the development of the many businesses that take the family as their unit.
Referring to Fig. 3, a device 1 for determining family attribute information of a user, provided according to a further aspect of the present application, is shown, wherein the device comprises:
a sample acquisition apparatus 11, which obtains sample data, wherein the sample data includes association information between sample users and sample network devices;
a model determination apparatus 12, which determines corresponding association decision model information by performing machine learning on the sample data;
a model application apparatus 13, which applies a user's usage record information about a network device to the association decision model information to obtain family association information between the user and the family corresponding to the network device.
In this embodiment, the sample acquisition apparatus 11 obtains sample data, wherein the sample data includes association information between sample users and sample network devices. Specifically, a network device here can be a device through which a user accesses the Internet; for example, it may include a router, a device that sets up a wireless access point, and so on. A sample network device is thus a network device used as a sample in order to obtain the association decision model described below. The association information between a sample user and a sample network device includes all information associated with the sample user and the sample network device, namely the information about the sample user's access to the sample network device, for example the time distribution of the sample user's access to the sample network device over a short period (such as one day), the time distribution over a long period (such as one month), the access frequency, and similar information.
Specifically, the ways of obtaining sample data may include directly obtaining existing sample data from a local device, and may also include extracting sample data from collected communication data between network devices and users whose association relationships have already been determined, and so on.
Those skilled in the art should understand that the above ways in which the sample acquisition apparatus 11 obtains sample data are merely examples; other existing or future ways of obtaining sample data, if applicable to the present application, should also fall within the scope of protection of the present application and are incorporated herein by reference.
Continuing in this embodiment, the model determination apparatus 12 determines the corresponding association decision model information by performing machine learning on the sample data. Specifically, the association decision model is used to determine whether a user has an association relationship with a network device. Further, the association decision model can be realized by building an artificial intelligence model. For example, the GBDT algorithm can be used; this algorithm consists of multiple decision trees, and the final classification result is obtained by accumulating the results of all the trees. For example, machine learning training is carried out repeatedly on the sample data with the GBDT algorithm until the association relationships between users and network devices that it outputs reach a certain accuracy, thereby determining the corresponding association decision model information.
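The idea of accumulating the results of many trees can be illustrated with a deliberately simplified sketch. It is not the GBDT configuration of this application: it assumes squared-error loss, depth-1 trees (stumps), a fixed learning rate and two hypothetical features, and its output is an uncalibrated score rather than a true probability. All function names are invented for the example.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, float, float]  # (feature index, split value, left output, right output)


def fit_stump(X: Sequence[Sequence[float]], residuals: Sequence[float]) -> Stump:
    """Find the single split that best fits the residuals (least squares)."""
    best: Stump = (0, 0.0, 0.0, 0.0)
    best_err = float("inf")
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv = sum(left) / len(left) if left else 0.0
            rv = sum(right) / len(right) if right else 0.0
            err = sum((r - (lv if row[j] <= t else rv)) ** 2
                      for row, r in zip(X, residuals))
            if err < best_err:
                best_err, best = err, (j, t, lv, rv)
    return best


def stump_output(stump: Stump, row: Sequence[float]) -> float:
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def fit_boosted(X, y, rounds: int = 20, lr: float = 0.5):
    """Each round fits a stump to the current residuals and adds it to the ensemble."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + lr * stump_output(stump, row) for p, row in zip(preds, X)]
    return base, lr, stumps


def score(model, row: Sequence[float]) -> float:
    """Accumulate the outputs of all trees; higher means more likely associated."""
    base, lr, stumps = model
    return base + lr * sum(stump_output(s, row) for s in stumps)


# Hypothetical features: (communication days, weekend communication days).
X = [[20, 8], [18, 6], [2, 0], [3, 1]]
y = [1, 1, 0, 0]  # 1 = positive sample, 0 = negative sample
model = fit_boosted(X, y)
print(score(model, [19, 7]) > 0.9, score(model, [1, 0]) < 0.1)
```

In practice an established gradient boosting library would normally replace this hand-written loop; the sketch only shows how per-tree outputs add up to a final score.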
Continuing in this embodiment, the model application apparatus 13 applies the user's usage record information about the network device to the association decision model information to obtain the family association information between the user and the family corresponding to the network device. The usage record information includes information about the user's communication with multiple network devices, and so on. The family association information indicates whether the user has an association relationship with the family corresponding to the network device. Specifically, features are extracted from the usage record information, and the extracted usage record information is applied to the association decision model information determined by the model determination apparatus 12; this yields whether the user has an association relationship with the family corresponding to the network device, thereby determining the family association information between the user and that family.
Preferably, the model application apparatus 13 comprises:
a device association information acquisition unit (not shown), which applies the user's usage record information about the network device to the association decision model information to obtain device association information between the user and the network device;
a family association information determination unit (not shown), which, when the device association information exceeds predetermined association threshold information, determines that the family association information between the user and the family corresponding to the network device is "associated".
Specifically, the device association information acquisition unit applies the user's usage record information about the network device to the association decision model information to obtain the device association information between the user and the network device, wherein the device association information can be represented by the association probability between the user and the network device. Specifically, the information about the user's communication with the network device is applied to the association decision model information to obtain the association probability between the user and the network device, and the device association information is then determined according to the size of the association probability. For example, when the user is U and the network device is router R, the information about the communication between user U and router R is input into the association decision model information to obtain the association probability of user U and router R, and whether user U and router R are associated is determined from the size of that probability.
Specifically, when the device association information exceeds the predetermined association threshold information, the family association information determination unit determines that the family association information between the user and the family corresponding to the network device is "associated", wherein the association threshold information is a threshold, set in advance, on the association probability between the user and the network device. Specifically, the obtained association probability between the user and the network device is compared with the predetermined association probability threshold, and when the association probability is greater than the threshold, the family association information between the user and the family corresponding to the network device is determined to be "associated". For example, the threshold on the association probability between the user and the network device can be set to 80%; when the network device is router R and the user is U, if the association probability of router R and user U is greater than 80%, the family association information between user U and the family corresponding to router R is determined to be "associated".
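The threshold comparison can be sketched as follows. The function name `is_family_associated` and the example probabilities are assumptions; the 80% threshold and the strict "greater than" comparison follow the example above.

```python
def is_family_associated(association_probability: float, threshold: float = 0.8) -> bool:
    """Family association holds only when the probability exceeds the threshold."""
    return association_probability > threshold


# Hypothetical model outputs for user U against two routers.
probabilities = {"R": 0.93, "R_cafe": 0.41}
for device, p in probabilities.items():
    print(device, is_family_associated(p))
```

Because the comparison is strict, a probability of exactly 80% does not count as associated under this reading.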
Preferably, the device further comprises:
a same-family determination apparatus (not shown), which determines that two users belong to the same family when the family association information between each of the two users and the family corresponding to the same network device is "associated".
In this embodiment, when the family association information between each of two users and the family corresponding to the same network device is "associated", the same-family determination apparatus determines that the two users belong to the same family. Specifically, the association probabilities of the two users with the same network device are obtained respectively through the model application apparatus 13. When both association probabilities are greater than the threshold on the association probability between a user and that network device, that is, when the family association information between each of the two users and the family corresponding to the same network device is "associated", the two users are determined to belong to the same family. For example, when the network device is router R and the association probability threshold is 80%, and the association probabilities of user U1 and user U2 with router R are each greater than 80%, user U1 and user U2 are determined to be associated with the family where router R is located, and user U1 and user U2 are thereby determined to belong to the same family.
Preferably, the device further comprises:
a family user determination apparatus (not shown), which determines, according to the target network device corresponding to a target family, the multiple target users included in the target family, wherein the family association information between each target user and the family corresponding to the target network device is "associated".
In this embodiment, the family user determination apparatus determines, according to the target network device corresponding to the target family, the multiple target users included in the target family, wherein the family association information between each target user and the family corresponding to the target network device is "associated". Each target family has multiple target users, and because the family association information between each of these target users and the family corresponding to the target network device is "associated", the multiple target users included in the target family can be determined. For example, the target network device corresponding to the target family is router R, and the association probabilities of users U1, U2, U3 and U4 with router R are all greater than the aforementioned association threshold information; users U1, U2, U3 and U4 are then determined to be associated with the family where router R is located, and are thereby determined to be the multiple target users included in the target family.
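Grouping the users whose association probability with a target network device exceeds the threshold yields the members of the target family, and any two users in the same group belong to the same family. The sketch below assumes the probabilities are already available as a dictionary keyed by (user, device); the function name `family_members` and the sample values are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def family_members(
    probabilities: Dict[Tuple[str, str], float], threshold: float = 0.8
) -> Dict[str, List[str]]:
    """Group users by the device whose family they are associated with."""
    families: Dict[str, List[str]] = defaultdict(list)
    for (user, device), p in sorted(probabilities.items()):
        if p > threshold:
            families[device].append(user)
    return dict(families)


probabilities = {
    ("U1", "R"): 0.95, ("U2", "R"): 0.90, ("U3", "R"): 0.85,
    ("U4", "R"): 0.82, ("U5", "R"): 0.30,
}
print(family_members(probabilities))
```

Users U1 to U4 are returned as the target users of the family behind router R, while U5 falls below the threshold and is left out.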
More preferably, the device further comprises:
a family profile determination apparatus (not shown), which determines the family profile information of the target family according to the user profile information of the target users;
a recommendation information provision apparatus (not shown), which provides recommendation information to the target family according to the family profile information.
Specifically, the family profile determination apparatus determines the family profile information of the target family according to the user profile information of the target users. User profile information denotes a collection of information describing the characteristics of a user, including but not limited to the user's gender, age, occupation, educational background, skills, hobbies, and so on. Family profile information denotes a collection of information describing the characteristics of a family, including but not limited to family background, family hobbies, family income, the family's attitude to life, and so on. The family profile information of the target family can be determined from the user profile information of the target users by analyzing the user feature information of the target users to determine the family feature information of the family to which they belong. For example, if analysis shows that the users in the target family like sports, it can be determined that the family hobbies of the target family include sports.
Continuing in this embodiment, the recommendation information provision apparatus provides recommendation information to the target family according to the family profile information, wherein the recommendation information includes, but is not limited to, promotion information, advertising information, financial management information, and so on. Recommendation information can be provided to the target family by matching it against certain family feature information in the family profile, such as family hobbies or family income. Specifically, for example, if the family hobbies in the family profile information of the target family include fine food, matching food-related information can be recommended to the target family.
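The family profile and recommendation steps can be sketched as follows. The text does not specify how user features are aggregated, so this example assumes the family hobbies are simply the union of the members' hobbies and that each recommendation item carries a single matching tag; the function names, profiles and catalog entries are all hypothetical.

```python
from typing import Dict, List, Set


def family_hobbies(user_hobbies: Dict[str, Set[str]], members: List[str]) -> Set[str]:
    """Assumed aggregation rule: union of the hobbies of all family members."""
    hobbies: Set[str] = set()
    for member in members:
        hobbies |= user_hobbies.get(member, set())
    return hobbies


def recommend(hobbies: Set[str], catalog: Dict[str, str]) -> List[str]:
    """Return the catalog items whose tag matches one of the family hobbies."""
    return sorted(item for item, tag in catalog.items() if tag in hobbies)


user_hobbies = {"U1": {"sports"}, "U2": {"food"}}
catalog = {
    "restaurant coupon": "food",
    "running shoes promotion": "sports",
    "piano lessons": "music",
}
profile = family_hobbies(user_hobbies, ["U1", "U2"])
print(recommend(profile, catalog))
```

Other aggregation rules, such as keeping only hobbies shared by a majority of members, would fit the description equally well.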
Referring to Fig. 4, preferably, the sample data includes positive sample data, wherein the positive sample data includes association information of sample users that are associated with sample network devices;
wherein the sample acquisition apparatus 11 comprises:
an association group establishment unit 111, which establishes multiple user and communication device association groups according to the communication record information of multiple users using different network devices;
a positive sample acquisition unit 112, which screens the multiple user and communication device association groups based on a predetermined rule to determine the preferred network device corresponding to the same user, and records them in the positive sample data as an associated sample user and sample network device.
Specifically, the association group establishment unit 111 establishes multiple user and communication device association groups according to the communication record information of multiple users using different network devices, wherein the communication record information includes, but is not limited to, information about the communication between multiple users and different network devices. Specifically, the association groups can be established by combining the multiple users with the multiple network devices. For example, when the network devices are routers and the routers that communicate with users U1 and U2 are R1 and R2, the following association groups can be established: (U1_R1, x1, x2, x3, ...), (U1_R2, x1, x2, x3, ...), (U2_R1, x1, x2, x3, ...), (U2_R2, x1, x2, x3, ...), wherein x1, x2, x3, ... denote the information about the communication between the user and the different routers, the number of which can be set according to specific requirements.
Specifically, the positive sample acquisition unit 112 screens the multiple user and communication device association groups based on a predetermined rule to determine the preferred network device corresponding to the same user, and records them in the positive sample data as an associated sample user and sample network device, wherein the preferred network device is the network device most related to the family of the sample user. Specifically, when the network devices are routers, the router most related to the family of the same user is determined, based on the predetermined rule, from the established association groups of multiple users and routers. For example, if the routers that communicate with users U1 and U2 are R1, R2 and R3, the following association groups can be established: (U1_R1, x1, x2, x3, ...), (U1_R2, x1, x2, x3, ...), (U1_R3, x1, x2, x3, ...), (U2_R1, x1, x2, x3, ...), (U2_R2, x1, x2, x3, ...), (U2_R3, x1, x2, x3, ...). If screening based on the predetermined rule finds that the router most related to user U1 is R1 and the router most related to user U2 is R3, then (U1_R1, x1, x2, x3, ...) and (U2_R3, x1, x2, x3, ...) are recorded in the positive sample data.
More preferably, the predetermined rule includes at least any one of the following:
the distance information between the device location information of the preferred network device and the home location information of the same user is less than or equal to predetermined related distance threshold information;
the distance information between the device location information of the other network devices used by the same user and the home location information is equal to or greater than predetermined unrelated distance threshold information;
the distance information between the device location information of the preferred network device and the home location information of the same user is less than the distance information between the device location information of the other network devices used by the same user and the home location information.
Wherein, the rules for screening the preferred network device may include at least any one of the following:
(1) The distance information between the device location information of the preferred network device and the home location information of the same user is less than or equal to the predetermined related distance threshold information, wherein the ways of determining the home location information include, but are not limited to: determination from payment relationship data, for example from the user's frequently used shipping address in the payment relationship data; and determination from location relationship data, for example from the wireless activity hotspot locations and wireless activity times in the location relationship data.
Wherein, the related distance threshold information is a threshold, set in advance, on the distance within which the device location information of the preferred network device is regarded as related to the home location information of the same user. Specifically, when the network device is a router and the distance information between the location information of the router and the home location information of the same user is less than or equal to the preset threshold, the router is determined to be the router most related to the family of the same user, namely the preferred network device corresponding to the same user. For example, if the related distance threshold information is set to 0.2 kilometers and the distance between router R and the home location information of user U is less than or equal to 0.2 kilometers, router R is determined to be the preferred network device of user U.
(2) The distance information between the device location information of the other network devices used by the same user and the home location information is equal to or greater than the predetermined unrelated distance threshold information, wherein the unrelated distance threshold information is a threshold, set in advance, on the distance beyond which the device location information of a network device is regarded as unrelated to the home location information of the same user. Specifically, when the network device is a router and the distance information between the location information of the router and the home location information of the same user is equal to or greater than the preset threshold, the router is determined to be a router unrelated to the family of the same user. For example, if the unrelated distance threshold information is set to 3 kilometers and the distance information between each of routers R1 and R2 and the home location information of the same user U is greater than or equal to 3 kilometers, routers R1 and R2 are determined to be non-preferred network devices for the family of user U.
(3) The distance information between the device location information of the preferred network device and the home location information of the same user is less than the distance information between the device location information of the other network devices used by the same user and the home location information. Specifically, when the network device is a router, the distance information between the router most related to the family of the same user and the home location information of the same user is less than the distance information between the device location information of the other routers used by the same user and the home location information. For example, if the preferred network device of user U is router R1 and the non-preferred network devices are routers R2 and R3, then the distance between router R1 and the home location information of user U is less than the distances between routers R2 and R3 and the home location information of user U.
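The three distance rules can be sketched together as follows. The text allows any one of the rules to be used; this example assumes rules (1) and (3) are combined to pick the preferred device, with rule (2) applied separately. Locations are given as planar coordinates in kilometers for simplicity, which is also an assumption, as are the function names and sample positions.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # hypothetical planar coordinates, in kilometers


def pick_preferred(
    home: Point, devices: Dict[str, Point], related_km: float = 0.2
) -> Optional[str]:
    """Rules (1) and (3): the nearest device, accepted only within related_km."""
    distances = {name: math.dist(home, pos) for name, pos in devices.items()}
    nearest = min(distances, key=lambda name: distances[name])
    return nearest if distances[nearest] <= related_km else None


def unrelated_by_distance(
    home: Point, devices: Dict[str, Point], unrelated_km: float = 3.0
) -> List[str]:
    """Rule (2): devices at or beyond unrelated_km from the home location."""
    return sorted(n for n, pos in devices.items() if math.dist(home, pos) >= unrelated_km)


home = (0.0, 0.0)
devices = {"R1": (0.1, 0.1), "R2": (3.0, 4.0), "R3": (0.0, 2.0)}
print(pick_preferred(home, devices))
print(unrelated_by_distance(home, devices))
```

R1 lies about 0.14 km from the home location and is selected as the preferred device, R2 at 5 km is flagged as unrelated, and R3 at 2 km satisfies neither threshold.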
Referring to Fig. 4, more preferably, the sample data further includes negative sample data, wherein the negative sample data includes association information indicating that a sample user and a sample network device are not associated;
Wherein, the sample acquiring device 11 further includes:
A negative sample acquiring unit 113, which selects the unrelated network devices corresponding to the same user according to accumulated communication volume information between that user and the other communication devices the user has used, and records them into the negative sample data as an unassociated sample user and sample network device.
Those skilled in the art will understand that, once the positive sample data linking a user to the preferred network device has been determined, that user is unassociated with every other communication device apart from the preferred network device. Some of those other communication devices can therefore be selected, according to the user's accumulated communication volume with each of them, as the unrelated network devices corresponding to the user, and negative sample data can then be built for use in machine learning. Specifically, after the positive sample data for the user and the preferred network device has been determined, several communication devices are selected from the communication devices other than the preferred network device according to the user's accumulated communication volume information with each device, for example those whose accumulated communication volume with the user exceeds a predetermined number of communication days or a predetermined number of communications, or the top N communication devices with the largest accumulated communication volume with the user. These devices are taken as the unrelated network devices corresponding to the user, that is, the user is unassociated with each of them. The user and each selected unrelated network device are then recorded into the negative sample data as unassociated sample data. For example, when the other network device is router R, the accumulated communication volume information is the number of communication days, and the preset threshold on communication days is 10, then when user U has communicated with router R on 10 or more days, router R is determined to be an unrelated network device of user U and is recorded into the negative sample data as an unassociated sample user and sample network device.
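The negative-sample selection described above can be sketched as follows. The function name and the dictionary layout are hypothetical; the threshold of 10 communication days comes from the example above, and the top-N branch mirrors the alternative the text mentions.

```python
def pick_negative_devices(comm_days, preferred, min_days=10, top_n=None):
    """Select unrelated devices for one user from devices other than `preferred`.

    comm_days: dict of device id -> accumulated communication days with the user.
    If top_n is given, return the top_n devices by communication days;
    otherwise return every device with at least min_days communication days.
    """
    candidates = {dev: n for dev, n in comm_days.items() if dev != preferred}
    if top_n is not None:
        # Sort by descending days, breaking ties by device id for determinism.
        ranked = sorted(candidates, key=lambda dev: (-candidates[dev], dev))
        return ranked[:top_n]
    return sorted(dev for dev, n in candidates.items() if n >= min_days)

days_for_u = {"R1": 40, "R2": 12, "R3": 3, "R4": 10}
negatives = pick_negative_devices(days_for_u, preferred="R1")
negative_samples = [("U", dev, 0) for dev in negatives]  # label 0 = unassociated
print(negative_samples)
```

Choosing frequently used but non-home devices as negatives gives the model harder counterexamples than rarely seen devices would.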
More preferably, the number of communicating users of the communication device in each user and communication device association group is less than or equal to a household user-count threshold.
Those skilled in the art will understand that, in practical scenarios, a communication device used in a home is generally used by only a few users, whereas a communication device outside the home, such as one in a coffee shop or a library, is generally used by a large number of users. In this embodiment, a household user-count threshold can therefore be used to filter out in advance the communication devices that clearly are not used in a home, that is, the number of communicating users of the communication device in each user and communication device association group is less than or equal to the household user-count threshold. Here, the household user-count threshold includes the average number of users per household or a multiple of that average. For example, assume the household user-count threshold is 5. If routers R1, R2 and R3 have established association groups with 5, 2 and 10 different users respectively, the association groups of router R3 are deleted and only the association groups of routers R1 and R2 are retained.
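A minimal sketch of this filter, assuming the association groups are held as (user, device) pairs; the function name is hypothetical, and the threshold of 5 and the user counts 5, 2 and 10 follow the example above.

```python
from collections import defaultdict

def filter_household_devices(pairs, max_users=5):
    """Keep only (user, device) pairs whose device has at most max_users distinct users."""
    users_by_device = defaultdict(set)
    for user, device in pairs:
        users_by_device[device].add(user)
    return [(u, d) for u, d in pairs if len(users_by_device[d]) <= max_users]

pairs = (
    [(f"A{i}", "R1") for i in range(5)]     # R1: 5 users
    + [(f"B{i}", "R2") for i in range(2)]   # R2: 2 users
    + [(f"C{i}", "R3") for i in range(10)]  # R3: 10 users, likely a public device
)
kept = filter_household_devices(pairs, max_users=5)
print(sorted({device for _, device in kept}))
```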
More preferably, the equipment further includes:
A user-count threshold determining device (not shown), which determines the household user-count threshold according to household census information of the region where the communication device in the user and communication device association group is located.
Specifically, the user-count threshold determining device determines the household user-count threshold according to household census information of the region where the communication device in the user and communication device association group is located. The average number of users per household differs between the regions where communication devices are located; the average for a region can be determined from its household census information, and either that average or a multiple of it is used as the household user-count threshold. For example, if the census information for the Shanghai area gives an average of 3 users per household, the household user-count threshold for the Shanghai area can be set to 3, 6, 9 or 12.
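The threshold derivation reduces to simple arithmetic, sketched here with a hypothetical lookup table of per-region averages; only the Shanghai figure of 3 and the multiples 1 to 4 come from the example above.

```python
def household_user_threshold(region, avg_household_size, multiple=1):
    """Return avg household size of the region times a positive integer multiple."""
    if multiple < 1:
        raise ValueError("multiple must be at least 1")
    return avg_household_size[region] * multiple

census_avg = {"Shanghai": 3}  # average users per household, from census data
print([household_user_threshold("Shanghai", census_avg, m) for m in (1, 2, 3, 4)])
```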
More preferably, the association group establishing unit 111 includes:
A communication record information merging subunit (not shown), which merges the communication record information corresponding to the different user identification information of the same user into the communication record information of that one user, according to mapping relations between user identification information;
An association group establishing subunit (not shown), which establishes multiple user and communication device association groups according to the merged communication record information of multiple users using different network devices.
Specifically, the communication record information merging subunit merges the communication record information corresponding to the different user identification information of the same user into the communication record information of that one user, according to mapping relations between user identification information. The user identification information includes, but is not limited to, the MAC address, IMEI and IMSI of the communication devices used by the same user and the user's registered application ID. The mapping relations are provided by the UUIC service. Specifically, through the mapping relations provided by the UUIC service, the MAC addresses, IMEIs, IMSIs, registered application IDs and so on of the different communication devices used by the same user are mapped to that user, so that the communication record information corresponding to the different communication devices is merged into the communication record information of the same user. For example, user U uses mobile phones P1 and P2, whose IMEIs are imei1 and imei2 respectively. The communication record (imei1, R) between phone P1 and router R and the communication record (imei2, R) between phone P2 and router R are both mapped by the UUIC service to the communication record (U, R) between the user and the router.
Specifically, the association group establishing subunit (not shown) establishes multiple user and communication device association groups according to the merged communication record information of multiple users using different network devices. For example, user U1 has mobile devices imei1 and imei2. Mobile device imei1 has communication records (imei1, R1) and (imei1, R2) with routers R1 and R2, and mobile device imei2 has communication records (imei2, R1) and (imei2, R2) with routers R1 and R2. Through the UUIC service, (imei1, R1) and (imei2, R1) are mapped to (U1, R1), and (imei1, R2) and (imei2, R2) are mapped to (U1, R2).
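The merging step can be sketched as below. The UUIC service is not described in detail in the text, so a plain dictionary stands in for its identifier-to-user mapping; that dictionary and the function name are assumptions made purely for illustration.

```python
def merge_records(records, id_to_user):
    """Map (identifier, device) records to deduplicated (user, device) pairs.

    id_to_user is a hypothetical stand-in for the mapping service: it maps a
    device-level identifier (IMEI, MAC, ...) to a single user id. Records with
    an unknown identifier are skipped.
    """
    merged = set()
    for identifier, device in records:
        user = id_to_user.get(identifier)
        if user is not None:
            merged.add((user, device))
    return sorted(merged)

records = [("imei1", "R1"), ("imei1", "R2"), ("imei2", "R1"), ("imei2", "R2")]
id_to_user = {"imei1": "U1", "imei2": "U1"}
print(merge_records(records, id_to_user))
```

Deduplicating into a set means a user who reaches the same router from two phones still produces one association group for that router.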
Referring to Fig. 4, more preferably, the sample acquiring device 11 further includes:
A feature information extraction unit 114, which extracts sample feature information from the sample data;
Wherein, the model determining device 12:
Determines the corresponding association decision model information by performing machine learning on the sample data and the sample feature information therein.
Specifically, the feature information extraction unit 114 extracts the sample feature information from the sample data, where the sample feature information includes feature information of the network device itself, feature information of the user, and feature information of the communication between the user and the network device. For example, when the network device is a router, the sample feature information includes router feature information, user feature information, and user-router communication feature information. The router feature information includes, but is not limited to: the router's average daily number of communicating users and the total number of users who have communicated with it, the ratio of its communicating users on working days to those on weekends, and the ratio of its communicating users across different time periods. The user feature information includes, but is not limited to: the number of days on which the user communicated with any router on weekends or on working days, and the number of times the user communicated with any router in different periods of the same day. The user-router communication feature information includes, but is not limited to: the number of days on which the user communicated with the router, the most recent date on which the user communicated with the router, the number of days on which the user communicated with the router on working days or on weekends, the number of days on which the user communicated with the router in different periods of the day, and the number of days per week on which the user communicated with the router.
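A few of the listed features can be computed from a raw communication log, as in this sketch. The log layout of (user, router, date) tuples, the function name and the feature keys are assumptions; only a small subset of the features named above is shown.

```python
from datetime import date

def pair_features(log, user, router):
    """Compute a few user-router and router-level features from a log.

    log: iterable of (user, router, date) tuples, one per observed communication.
    """
    days = sorted({d for u, r, d in log if u == user and r == router})
    weekend_days = sum(1 for d in days if d.weekday() >= 5)  # 5 = Sat, 6 = Sun
    router_users = {u for u, r, _ in log if r == router}
    return {
        "comm_days": len(days),                      # days user talked to router
        "weekday_days": len(days) - weekend_days,    # of which on working days
        "weekend_days": weekend_days,                # of which on weekends
        "last_date": days[-1] if days else None,     # most recent communication
        "router_total_users": len(router_users),     # router-level feature
    }

log = [
    ("U", "R", date(2015, 10, 5)),   # Monday
    ("U", "R", date(2015, 10, 5)),   # same day again, counted once
    ("U", "R", date(2015, 10, 10)),  # Saturday
    ("V", "R", date(2015, 10, 11)),  # another user on the same router
    ("U", "R2", date(2015, 10, 6)),  # a different router
]
print(pair_features(log, "U", "R"))
```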
Continuing with this embodiment, the model determining device 12 determines the corresponding association decision model information by performing machine learning on the sample data and the sample feature information therein. Specifically, the sample data and its sample feature information form a training set (R_U, x1, x2, x3, x4, ..., label), where R_U denotes the association group of user U and network device R; x1, x2, x3, x4, ... denote the sample feature information, the number of which is set according to specific requirements; and label takes the value 1 or 0, being 1 for a positive sample and 0 for a negative sample. The corresponding association decision model information is determined by performing machine learning on the training set (R_U, x1, x2, x3, x4, ..., label).
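The text does not name a learning algorithm, so the following sketch uses logistic regression trained by stochastic gradient descent purely as an assumed stand-in. Rows follow the (R_U, x1, x2, ..., label) layout described above; the feature values, which are assumed to be scaled to the range 0 to 1, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(rows, epochs=2000, lr=0.5):
    """Fit weights and bias on rows shaped (pair_id, x1, ..., xn, label)."""
    n_features = len(rows[0]) - 2
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for _pair, *xs, label in rows:
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, xs)))
            err = p - label  # gradient of the log loss with respect to z
            weights = [w - lr * err * x for w, x in zip(weights, xs)]
            bias -= lr * err
    return weights, bias

def predict_proba(model, xs):
    """Association probability for one feature vector."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, xs)))

training_set = [
    ("R1_U1", 0.9, 0.8, 1),  # positive sample: label 1
    ("R2_U2", 0.8, 0.9, 1),
    ("R3_U1", 0.2, 0.1, 0),  # negative sample: label 0
    ("R4_U2", 0.1, 0.3, 0),
]
model = train_logistic(training_set)
print(predict_proba(model, [0.85, 0.85]) > 0.5)
```

Any classifier that outputs a probability could take the place of this model; the row layout is the only part taken from the text.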
Still more preferably, the model application apparatus 13:
Extracts prediction feature information from the usage record information of the user with respect to the network device, according to the sample feature information;
Applies the prediction feature information to the association decision model information, to obtain the family association information between the user and the family corresponding to the network device.
In this embodiment, the model application apparatus 13 extracts prediction feature information from the usage record information of the user with respect to the network device, according to the sample feature information. The prediction feature information consists of the association group of the user and the communication device together with feature information, and can be expressed as (R_U, x1, x2, x3, x4, ..., -1), where R_U denotes the user and network device association group and x1, x2, x3, x4, ... denote the content of the prediction feature information. The content of the prediction feature information is the same as that of the sample feature information, which has been listed in the foregoing embodiment and is not repeated here.
Continuing with this embodiment, the model application apparatus 13 applies the prediction feature information to the association decision model information, to obtain the family association information between the user and the family corresponding to the network device. Specifically, multiple items of prediction feature information (R_U, x1, x2, x3, x4, ..., -1) are input into the association decision model information to obtain the association probability between network device R and user U, from which the family association information between the user and the family corresponding to the network device is obtained.
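The prediction step can be sketched as follows, assuming a linear model with a logistic output. The weights, the bias and the 0.5 correlation threshold are hypothetical values chosen for illustration; the -1 in the last position is the placeholder label from the row format above, and comparing the probability against a predetermined correlation threshold follows the method claims.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def associate(pred_rows, weights, bias, threshold=0.5):
    """Score rows shaped (pair_id, x1, ..., xn, -1).

    Returns pair_id -> (association probability, associated flag), where the
    flag is True when the probability exceeds the correlation threshold.
    """
    result = {}
    for pair, *xs, _placeholder in pred_rows:
        p = sigmoid(bias + sum(w * x for w, x in zip(weights, xs)))
        result[pair] = (p, p > threshold)
    return result

rows = [("R1_U", 0.9, 0.8, -1), ("R2_U", 0.1, 0.2, -1)]
for pair, (prob, flag) in associate(rows, weights=[4.0, 4.0], bias=-4.0).items():
    print(pair, round(prob, 3), flag)
```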
Compared with the prior art, the present application obtains sample data, where the sample data includes association information between sample users and sample network devices, such as communication time, communication frequency and communication date; performs machine learning on the sample data to determine the corresponding association decision model information; and applies the usage record information of a user with respect to a network device to the association decision model information, to obtain the family association information between the user and the family corresponding to the network device. Determining the association decision model information through machine learning can effectively improve the recognition rate of user-family association relations.
In addition, the present application can merge the communication record information corresponding to the different user identification information of the same user into the communication record information of that one user, according to mapping relations between user identification information, and establish multiple user and communication device association groups according to the merged communication record information of multiple users using different network devices. For example, the communication devices used by ordinary users are given a unified mapping, that is, the communication devices are normalized to the same user, which facilitates extension using Internet behavior features.
Furthermore, the present application can determine that two users belong to the same family when the family association information between each of the two users and the family corresponding to the same network device is "associated". It can also determine the multiple target users included in a target family according to the target network device corresponding to that family, and determine the family portrait information of the target family according to the portrait information of the target users, so that recommendation information such as promotional information or advertising information can be provided to the target family according to the family portrait information. This benefits the development of the many services that take the family as a unit.
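Grouping users into families by their associated network device can be sketched as below. The tuple layout and the function name are assumptions; the rule that users associated with the family of the same device belong to the same family is taken from the paragraph above.

```python
from collections import defaultdict

def households(associations):
    """Group users by the device whose family they are associated with.

    associations: iterable of (user, device, associated) where `associated`
    is the boolean family association result for that pair.
    """
    members = defaultdict(set)
    for user, device, associated in associations:
        if associated:
            members[device].add(user)
    return {device: sorted(users) for device, users in members.items()}

results = [
    ("U1", "R1", True),
    ("U2", "R1", True),   # U1 and U2 share R1, so they form one family
    ("U3", "R1", False),  # U3 used R1 but is not associated with its family
    ("U3", "R2", True),
]
print(households(results))
```

Each resulting group is the set of target users of one target family, which is the input a family portrait would be built from.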
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be embodied in other specific forms without departing from its spirit or essential characteristics. The embodiments are therefore to be regarded in all respects as exemplary and not restrictive. The scope of the invention is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced in the invention. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a device claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.

Claims (26)

1. A method for determining home attribute information of a user, wherein the method comprises:
obtaining sample data, wherein the sample data includes association information between a sample user and a sample network device;
determining corresponding association decision model information by performing machine learning on the sample data;
applying usage record information of a user with respect to a network device to the association decision model information, to obtain family association information between the user and the family corresponding to the network device.
2. The method according to claim 1, wherein applying the usage record information of the user with respect to the network device to the association decision model information, to obtain the family association information between the user and the family corresponding to the network device, comprises:
applying the usage record information of the user with respect to the network device to the association decision model information, to obtain device association information between the user and the network device;
when the device association information exceeds predetermined correlation threshold information, determining that the family association information between the user and the family corresponding to the network device is "associated".
3. The method according to claim 1 or 2, wherein the method further comprises:
when the family association information between each of two users and the family corresponding to the same network device is "associated", determining that the two users belong to the same family.
4. The method according to any one of claims 1 to 3, wherein the method further comprises:
determining multiple target users included in a target family according to the target network device corresponding to the target family, wherein the family association information between each target user and the family corresponding to the target network device is "associated".
5. The method according to claim 4, wherein the method further comprises:
determining family portrait information of the target family according to user portrait information of the target users;
providing recommendation information to the target family according to the family portrait information.
6. The method according to any one of claims 1 to 5, wherein the sample data includes positive sample data, wherein the positive sample data includes association information indicating that a sample user and a sample network device are associated;
wherein obtaining the sample data, the sample data including association information between a sample user and a sample network device, comprises:
establishing multiple user and communication device association groups according to communication record information of multiple users using different network devices;
screening, based on a predetermined rule, the multiple user and communication device association groups to determine the preferred network device corresponding to the same user, and recording them into the positive sample data as an associated sample user and sample network device.
7. The method according to claim 6, wherein the predetermined rule includes at least any one of the following:
the distance between the device location information of the preferred network device and the home location information of the same user is less than or equal to predetermined related distance threshold information;
the distance between the device location information of each other network device used by the same user and the home location information is equal to or greater than predetermined unrelated distance threshold information;
the distance between the device location information of the preferred network device and the home location information of the same user is less than the distance between the device location information of each other network device used by the same user and the home location information.
8. The method according to claim 6 or 7, wherein the sample data further includes negative sample data, wherein the negative sample data includes association information indicating that a sample user and a sample network device are not associated;
wherein obtaining the sample data, the sample data including association information between a sample user and a sample network device, further comprises:
selecting the unrelated network devices corresponding to the same user according to accumulated communication volume information between that user and the other communication devices the user has used, and recording them into the negative sample data as an unassociated sample user and sample network device.
9. The method according to any one of claims 6 to 8, wherein the number of communicating users of the communication device in each user and communication device association group is less than or equal to a household user-count threshold.
10. The method according to claim 9, wherein the method further comprises:
determining the household user-count threshold according to household census information of the region where the communication device in the user and communication device association group is located.
11. The method according to any one of claims 6 to 10, wherein establishing multiple user and communication device association groups according to communication record information of multiple users using different network devices comprises:
merging, according to mapping relations between user identification information, the communication record information corresponding to the different user identification information of the same user into the communication record information of that one user;
establishing multiple user and communication device association groups according to the merged communication record information of multiple users using different network devices.
12. The method according to any one of claims 6 to 11, wherein obtaining the sample data, the sample data including association information between a sample user and a sample network device, further comprises:
extracting sample feature information from the sample data;
wherein determining the corresponding association decision model information by performing machine learning on the sample data comprises:
determining the corresponding association decision model information by performing machine learning on the sample data and the sample feature information therein.
13. The method according to claim 12, wherein applying the usage record information of the user with respect to the network device to the association decision model information, to obtain the family association information between the user and the family corresponding to the network device, comprises:
extracting prediction feature information from the usage record information of the user with respect to the network device, according to the sample feature information;
applying the prediction feature information to the association decision model information, to obtain the family association information between the user and the family corresponding to the network device.
14. Equipment for determining home attribute information of a user, wherein the equipment comprises:
a sample acquiring device, for obtaining sample data, wherein the sample data includes association information between a sample user and a sample network device;
a model determining device, for determining corresponding association decision model information by performing machine learning on the sample data;
a model application apparatus, for applying usage record information of a user with respect to a network device to the association decision model information, to obtain family association information between the user and the family corresponding to the network device.
15. The equipment according to claim 14, wherein the model application apparatus is used for:
applying the usage record information of the user with respect to the network device to the association decision model information, to obtain device association information between the user and the network device;
when the device association information exceeds predetermined correlation threshold information, determining that the family association information between the user and the family corresponding to the network device is "associated".
16. The equipment according to claim 14 or 15, wherein the equipment further comprises:
a same-family determining device, for determining that two users belong to the same family when the family association information between each of the two users and the family corresponding to the same network device is "associated".
17. The equipment according to any one of claims 14 to 16, wherein the equipment further comprises:
a family user determining device, for determining multiple target users included in a target family according to the target network device corresponding to the target family, wherein the family association information between each target user and the family corresponding to the target network device is "associated".
18. The equipment according to claim 17, wherein the equipment further comprises:
a family portrait determining device, for determining family portrait information of the target family according to user portrait information of the target users;
a recommendation information providing device, for providing recommendation information to the target family according to the family portrait information.
19. The equipment according to any one of claims 14 to 18, wherein the sample data includes positive sample data, wherein the positive sample data includes association information indicating that a sample user and a sample network device are associated;
wherein the sample acquiring device comprises:
an association group establishing unit, for establishing multiple user and communication device association groups according to communication record information of multiple users using different network devices;
a positive sample acquiring unit, for screening, based on a predetermined rule, the multiple user and communication device association groups to determine the preferred network device corresponding to the same user, and recording them into the positive sample data as an associated sample user and sample network device.
20. The equipment according to claim 19, wherein the predetermined rule includes at least any one of the following:
the distance between the device location information of the preferred network device and the home location information of the same user is less than or equal to predetermined related distance threshold information;
the distance between the device location information of each other network device used by the same user and the home location information is equal to or greater than predetermined unrelated distance threshold information;
the distance between the device location information of the preferred network device and the home location information of the same user is less than the distance between the device location information of each other network device used by the same user and the home location information.
21. The equipment according to claim 19 or 20, wherein the sample data further includes negative sample data, wherein the negative sample data includes association information indicating that a sample user and a sample network device are not associated;
wherein the sample acquiring device further comprises:
a negative sample acquiring unit, for selecting the unrelated network devices corresponding to the same user according to accumulated communication volume information between that user and the other communication devices the user has used, and recording them into the negative sample data as an unassociated sample user and sample network device.
22. The equipment according to any one of claims 19 to 21, wherein the number of communicating users of the communication device in each user and communication device association group is less than or equal to a household user-count threshold.
23. The equipment according to claim 22, wherein the equipment further comprises:
a user-count threshold determining device, for determining the household user-count threshold according to household census information of the region where the communication device in the user and communication device association group is located.
24. The equipment according to any one of claims 19 to 23, wherein the association group establishing unit is used for:
merging, according to mapping relations between user identification information, the communication record information corresponding to the different user identification information of the same user into the communication record information of that one user;
establishing multiple user and communication device association groups according to the merged communication record information of multiple users using different network devices.
25. The equipment according to any one of claims 19 to 24, wherein the sample acquiring device further comprises:
a feature information extraction unit, for extracting sample feature information from the sample data;
wherein the model determining device is used for:
determining the corresponding association decision model information by performing machine learning on the sample data and the sample feature information therein.
26. The equipment according to claim 25, wherein the model application apparatus is used for:
extracting prediction feature information from the usage record information of the user with respect to the network device, according to the sample feature information;
applying the prediction feature information to the association decision model information, to obtain the family association information between the user and the family corresponding to the network device.
CN201510649771.8A 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user Active CN106570014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510649771.8A CN106570014B (en) 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510649771.8A CN106570014B (en) 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user

Publications (2)

Publication Number Publication Date
CN106570014A true CN106570014A (en) 2017-04-19
CN106570014B CN106570014B (en) 2020-09-25

Family

ID=58507703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510649771.8A Active CN106570014B (en) 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user

Country Status (1)

Country Link
CN (1) CN106570014B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109717879A (en) * 2017-10-31 2019-05-07 丰田自动车株式会社 Condition estimating system
CN109717879B (en) * 2017-10-31 2021-09-24 丰田自动车株式会社 State estimation system
CN110019996A (en) * 2017-12-11 2019-07-16 中国移动通信集团广东有限公司 A kind of family relationship recognition methods and system
CN108769809B (en) * 2018-05-28 2021-06-29 成都极米科技股份有限公司 Smart television-based home user behavior data acquisition method and device and computer-readable storage medium
CN108769809A (en) * 2018-05-28 2018-11-06 成都市极米科技有限公司 Domestic consumer's behavioral data acquisition method, device and computer readable storage medium based on smart television
CN111510368A (en) * 2019-01-31 2020-08-07 中国移动通信有限公司研究院 Family group identification method, device, equipment and computer readable storage medium
CN111510368B (en) * 2019-01-31 2023-01-03 中国移动通信有限公司研究院 Family group identification method, device, equipment and computer readable storage medium
CN110163686A (en) * 2019-05-27 2019-08-23 成都魔方城科技有限公司 Desired consumption portrait method and system based on consumer behaviour
CN110324418B (en) * 2019-07-01 2022-09-20 创新先进技术有限公司 Method and device for pushing service based on user relationship
CN110324418A (en) * 2019-07-01 2019-10-11 阿里巴巴集团控股有限公司 Method and apparatus based on customer relationship transmission service
CN110769457A (en) * 2019-10-09 2020-02-07 深圳市酷开网络科技有限公司 Family relation discovery method, server and computer readable storage medium
CN110769457B (en) * 2019-10-09 2022-10-28 深圳市酷开网络科技股份有限公司 Family relation discovery method, server and computer readable storage medium
CN113780605A (en) * 2020-06-28 2021-12-10 京东城市(北京)数字科技有限公司 Method and apparatus for predicting information
CN113098741A (en) * 2021-04-16 2021-07-09 深圳市炆石数据有限公司 Family portrait construction method, system, storage medium and advertisement cross-screen delivery method
CN113098741B (en) * 2021-04-16 2022-07-12 深圳市炆石数据有限公司 Family portrait construction method, system, storage medium and advertisement cross-screen delivery method
CN113836361A (en) * 2021-09-29 2021-12-24 平安科技(深圳)有限公司 Family relation network generation method, device, equipment and storage medium
CN113836361B (en) * 2021-09-29 2024-02-23 平安科技(深圳)有限公司 Home relationship network generation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN106570014B (en) 2020-09-25

Similar Documents

Publication Publication Date Title
CN106570014A (en) Method and device for determining home attribute information of user
Vanhoof et al. Assessing the quality of home detection from mobile phone data for official statistics
Wang et al. Understanding travellers’ preferences for different types of trip destination based on mobile internet usage data
US10013494B2 (en) Interest profile of a user of a mobile application
CN105824813B (en) A kind of method and device for excavating core customer
CN109640312B (en) 'Black card' identification method, electronic equipment and computer readable storage medium
CN109063966A (en) The recognition methods of adventure account and device
US8255392B2 (en) Real time data collection system and method
CN105160173B (en) Safety evaluation method and device
CN105306495B (en) user identification method and device
CN105976216A (en) Advertising effect evaluation method, advertisement injecting method and device
CN103189885B (en) Server and approaches to IM
CN111148018B (en) Method and device for identifying and positioning regional value based on communication data
CN107592296A (en) The recognition methods of rubbish account and device
CN109408522A (en) A kind of update method and device of user characteristic data
CN105045911B (en) Label generating method and equipment for user to mark
CN109145050B (en) Computing device
CN107527240A (en) A kind of operator's industry product Praise effect identification system and method
CN110796269A (en) Method and device for generating model, and method and device for processing information
Manley et al. New forms of data for understanding urban activity in developing countries
CN112925899B (en) Ordering model establishment method, case clue recommendation method, device and medium
CN109451334A (en) User, which draws a portrait, generates processing method, device and electronic equipment
CN107025246A (en) A kind of recognition methods of target geographical area and device
Sumathi et al. Crowd estimation at a social event using call data records
CN110569418A (en) Method and device for verifying academic calendar information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant