CN106570014B - Method and apparatus for determining home attribute information of user - Google Patents

Publication number
CN106570014B
Authority
CN
China
Prior art keywords
information
user
family
association
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510649771.8A
Other languages
Chinese (zh)
Other versions
CN106570014A (en)
Inventor
吴保华
付登坡
甘云锋
黄耐寒
吕秀泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510649771.8A
Publication of CN106570014A
Application granted
Publication of CN106570014B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method and a device for determining family attribute information of a user. Sample data is obtained, the sample data comprising association information of sample users and sample network devices, such as communication time, communication frequency and communication date; corresponding association decision model information is determined by performing machine learning on the sample data; and usage record information of a user about a network device is applied to the association decision model information to obtain family association information of the family corresponding to the user and the network device. Because the association decision model information is determined through machine learning, the identification rate of the family association relationship of users can be effectively improved.

Description

Method and apparatus for determining home attribute information of user
Technical Field
The present invention relates to the field of computers, and in particular, to a technique for determining family attribute information of a user.
Background
With the rapid development of home internet technology, more and more services are developed with the family as the unit, so identifying which users come from the same family is important for fine-grained data operations on the home internet.
In the prior art, a user's family is mainly inferred from the communication data relationship between fixed-line telephone numbers and mobile phone numbers. This approach has several defects: models built on small samples are prone to overfitting, the cost of data acquisition keeps rising, a user's communication devices cannot be identified uniformly, extension with internet behavior features is inconvenient, and the coverage rate and identification rate of family users are low. With the development of home internet technology, these problems are becoming more and more prominent.
Disclosure of Invention
The application aims to provide a method and a device for determining family attribute information of a user, so as to solve the problem of determining whether a user has a family association relationship with the family in which a corresponding network device is located.
According to an aspect of the present application, there is provided a method for determining family attribute information of a user, wherein the method includes:
acquiring sample data, wherein the sample data comprises associated information of a sample user and sample network equipment;
determining corresponding associated decision model information by performing machine learning on the sample data;
and applying the usage record information of the user about the network equipment to the association decision model information to obtain family association information of the family corresponding to the user and the network equipment.
According to another aspect of the present application, there is also provided an apparatus for determining family attribute information of a user, wherein the apparatus includes:
the sample acquisition device is used for acquiring sample data, wherein the sample data comprises the associated information of a sample user and sample network equipment;
the model determining device is used for determining corresponding associated decision model information by performing machine learning on the sample data;
and the model application device is used for applying the usage record information of the user about the network equipment to the association decision model information so as to obtain family association information of the family corresponding to the user and the network equipment.
Compared with the prior art, the present application obtains sample data comprising association information of sample users and sample network devices, such as communication time, communication frequency and communication date, determines corresponding association decision model information by performing machine learning on the sample data, and applies usage record information of a user about a network device to the association decision model information, so as to obtain family association information of the family corresponding to the user and the network device. Because the association decision model information is determined through machine learning, the identification rate of the family association relationship of users can be effectively improved.
Moreover, communication record information corresponding to different user identification information of the same user can be merged into the communication record information of one user according to the mapping relation among the user identification information, and a plurality of user and communication device association groups are established according to the merged communication record information of a plurality of users using different network devices. Uniformly mapping the communication devices used by family users in this way, i.e., normalizing them to the same user, makes it easier to extend the method with internet behavior features.
In addition, the present application can determine that two users belong to the same family when the family association information of each of the two users with the family corresponding to the same network device is associated. It can also determine a plurality of target users included in a target family according to the target network device corresponding to the target family, and determine family portrait information of the target family according to the portrait information of the target users. Recommendation information, such as promotion information and advertisement information, can then be provided for the target family according to the family portrait information, which is beneficial to the development of services that take the family as the unit.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 illustrates a flow chart of a method for determining family attribute information of a user in accordance with an aspect of the subject application;
FIG. 2 illustrates a flow chart of a method for determining family attribute information of a user in accordance with a preferred embodiment of the present application;
FIG. 3 illustrates a schematic diagram of an apparatus for determining family attribute information of a user, according to another aspect of the subject application;
FIG. 4 illustrates a schematic diagram of an apparatus for determining family attribute information of a user according to a preferred embodiment of the present application.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
To further illustrate the technical means and effects adopted by the present application, the following description clearly and completely describes the technical solution of the present application with reference to the accompanying drawings and preferred embodiments.
Referring to fig. 1, a method for determining family attribute information of a user according to an aspect of the present application is illustrated, wherein the method includes:
S1: acquiring sample data, wherein the sample data comprises association information of a sample user and a sample network device;
S2: determining corresponding association decision model information by performing machine learning on the sample data;
S3: applying usage record information of the user about the network device to the association decision model information to obtain family association information of the family corresponding to the user and the network device.
In this embodiment, in step S1, sample data is obtained, where the sample data includes association information of a sample user and a sample network device. Specifically, the network device may be a device that enables a user to access the internet, for example, a router or a device that establishes a wireless access point, and the sample network device is a network device used as a sample to obtain the association decision model described below. The association information of the sample user and the sample network device includes all information relating the sample user to the sample network device, that is, information about the sample user accessing the sample network device, such as the distribution of access times over a short period (e.g., one day), the distribution of access times over a long period (e.g., one month), and the access frequency.
Specifically, the manner of obtaining the sample data may include directly obtaining existing sample data from the local device, or may also include extracting the sample data from the collected communication data between the user and the network device for which the association relationship is determined.
It should be understood by those skilled in the art that the manner of obtaining the sample data in step S1 is only an example, and other manners of obtaining the sample data that may occur now or hereafter are also included in the scope of the present application, and are also included herein by reference.
Continuing in this embodiment, in step S2, corresponding association decision model information is determined by performing machine learning on the sample data. Specifically, the association decision model is used to determine whether a user and a network device have an association relationship. The association decision model may be implemented by establishing an artificial intelligence model; for example, a GBDT (Gradient Boosting Decision Tree) algorithm may be adopted, which is composed of a plurality of decision trees and obtains the final classification result by accumulating the results of all the trees. For example, machine learning training is performed repeatedly on the sample data by applying the GBDT algorithm until the output association relationship between the user and the network device reaches a certain accuracy, and the corresponding association decision model information is thereby determined.
Continuing in this embodiment, in step S3, usage record information of the user about the network device is applied to the association decision model information to obtain family association information of the family of the user corresponding to the network device. Wherein the usage record information includes communication information of the user with the plurality of network devices, and the like. The family association information is whether the user has an association relationship with a family corresponding to the network device. Specifically, by extracting the usage record information and applying the extracted usage record information to the association decision model information determined in step S2, it is possible to obtain whether the user has an association relationship with the family corresponding to the network device, so as to determine family association information of the family corresponding to the user and the network device.
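Steps S2 and S3 can be illustrated with a minimal sketch, assuming a simplified gradient-boosting classifier built from one-split decision stumps in plain Python. The feature choice (days connected per month, fraction of night-time connections), the sample values, the learning rate, and all function names are hypothetical and are not taken from the application; a practical system would more likely rely on an established GBDT library.

```python
import math

LEARNING_RATE = 0.5  # assumed shrinkage factor, not specified in the application


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]


def train_gbdt(X, y, n_trees=20):
    """Each stump is fit to the negative gradient of the log loss (y - p)."""
    trees, scores = [], [0.0] * len(X)
    for _ in range(n_trees):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        trees.append(stump)
        j, t, lv, rv = stump
        scores = [s + LEARNING_RATE * (lv if row[j] <= t else rv)
                  for s, row in zip(scores, X)]
    return trees


def predict_proba(trees, row):
    """Accumulate the outputs of all stumps and map the sum to a probability."""
    score = sum(LEARNING_RATE * (lv if row[j] <= t else rv) for j, t, lv, rv in trees)
    return sigmoid(score)


# Hypothetical samples: [days connected in a month, fraction of night-time sessions]
X = [[28, 0.9], [25, 0.8], [30, 0.95], [3, 0.1], [5, 0.0], [2, 0.2]]
y = [1, 1, 1, 0, 0, 0]  # 1 = user associated with the router, 0 = not associated
model = train_gbdt(X, y)
```

Calling `predict_proba(model, [27, 0.85])` then yields the association probability for a user whose usage record resembles the positive samples.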
Preferably, wherein the step S3 includes:
S31 (not shown): applying usage record information of a user about a network device to the association decision model information to obtain device association information of the user with the network device;
S32 (not shown): when the device association information exceeds predetermined association threshold information, determining that the family association information of the family corresponding to the user and the network device is associated.
Specifically, in the step S31, the device association information of the user and the network device may be represented by an association probability of the user and the network device. Specifically, communication information of a user about a network device is applied to the association decision model information to obtain an association probability between the user and the network device, so that device association information between the user and the network device is determined according to the magnitude of the association probability, for example, the user is U, the network device is a router R, communication information between the user U and the router R is input into the association decision model information to obtain an association probability between the user U and the router R, and whether the user U is associated with the router R is determined according to the magnitude of the association probability.
Specifically, in step S32, the association threshold information is a threshold of the association probability between the user and the network device, and the threshold is set in advance. Specifically, by comparing the obtained association probability between the user and the network device with a predetermined association probability threshold, when the association probability is greater than the predetermined association probability threshold, it is determined that the family association information of the family corresponding to the user and the network device is associated, for example, the threshold of the association probability between the user and the network device may be set to 80%, when the network device is a router R and the user is U, and when the association probability between the router R and the user U is greater than 80%, it is determined that the family association information of the family corresponding to the user U and the router R is associated.
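The comparison in step S32 can be sketched as follows, using the 80% threshold from the example above. The function and constant names are hypothetical.

```python
ASSOCIATION_THRESHOLD = 0.80  # example value from the description


def is_family_associated(association_probability, threshold=ASSOCIATION_THRESHOLD):
    """Return True when the probability strictly exceeds the threshold."""
    return association_probability > threshold
```

Under this sketch, a probability exactly equal to the threshold is treated as not associated, matching the "greater than" wording of the description.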
Preferably, the method further comprises:
S4: when the family association information of each of two users with the family corresponding to the same network device is associated, determining that the two users belong to the same family.
Specifically, in the step S4, the association probabilities of the two users and the same network device are respectively obtained through the step S3, and when the association probabilities of the two users and the same network device are both greater than the threshold of the association probabilities of the two users and the same network device, that is, when the home association information of the two users and the home corresponding to the same network device are both associated, it is determined that the two users belong to the same home, for example, when the network device is the router R, the threshold of the association probability is 80%, and the association probabilities of the user U1 and the user U2 and the router R are both greater than 80%, it is determined that the user U1 and the user U2 are associated with the home where the router R is located, so as to determine that the user U1 and the user U2 belong to the same home.
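A minimal sketch of step S4, assuming the association probabilities from step S3 are held in a dictionary keyed by (user, device). The dictionary layout, the sample probabilities, and the function name are assumptions made for illustration.

```python
def belong_to_same_family(probabilities, user_a, user_b, device, threshold=0.80):
    """Two users share a family if both exceed the threshold for the same device."""
    return all(probabilities.get((user, device), 0.0) > threshold
               for user in (user_a, user_b))


# Hypothetical association probabilities produced by the decision model
probabilities = {("U1", "R"): 0.92, ("U2", "R"): 0.87, ("U3", "R"): 0.40}
```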
Preferably, the method further comprises:
S5 (not shown): determining a plurality of target users included in a target family according to a target network device corresponding to the target family, wherein the family association information of each target user with the family corresponding to the target network device is associated.
Specifically, in the step S5, each target household has a plurality of target users, and according to the household association information of the plurality of target users and the household corresponding to the target network device being associated, the plurality of target users included in the target household are determined, for example, the target network device corresponding to the target household is the router R, the association probabilities of the users U1, U2, U3, U4 and the router R are all greater than the association threshold information in the step S32, the users U1, U2, U3, U4 and the household where the router R is located are determined to be associated, and thus the users U1, U2, U3, U4 are determined to be the plurality of target users included in the target household.
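Step S5 can be sketched by grouping users under each network device for which their association probability exceeds the threshold. The data layout and names below are hypothetical, and the probabilities are invented for the example.

```python
from collections import defaultdict


def households_by_device(probabilities, threshold=0.80):
    """Map each device to the sorted list of users associated with its household."""
    households = defaultdict(list)
    for (user, device), probability in sorted(probabilities.items()):
        if probability > threshold:
            households[device].append(user)
    return dict(households)


# Hypothetical probabilities for five users and one router R
target_probabilities = {
    ("U1", "R"): 0.95, ("U2", "R"): 0.91, ("U3", "R"): 0.88,
    ("U4", "R"): 0.83, ("U5", "R"): 0.30,
}
```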
More preferably, the method further comprises:
S6 (not shown): determining family portrait information of the target family according to the user portrait information of the target users;
S7 (not shown): providing recommendation information for the target family according to the family portrait information.
Specifically, in step S6, family portrait information of the target family is determined according to the user portrait information of the target users. The user portrait information is a set of information describing the characteristics of a user, including but not limited to the user's gender, age, occupation, educational background, skills, and hobbies. The family portrait information is a set of information describing the characteristics of a family, including but not limited to family background, family hobbies, family income, and family attitudes to life. The family portrait information of the target family is determined by analyzing the user characteristic information of the target users to derive the characteristic information of the family in which they are located. For example, by analyzing the target users' shared liking for sports, it may be determined that the family hobbies of the target family include sports.
Continuing in this embodiment, in step S7, recommendation information is provided for the target family according to the family portrait information. The recommendation information includes but is not limited to promotion information, advertisement information, and financial information. Matching recommendation information can be provided for the target family according to several items of family characteristic information in the family portrait, such as family hobbies and family income. For example, if the family hobbies in the family portrait information of the target family include food, matching food information may be recommended to the target family.
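One possible reading of steps S6 and S7 is sketched below: a hobby becomes a family hobby when at least a given number of family members share it, and recommendations are looked up per family hobby. The aggregation rule, the `min_members` parameter, the profile layout, and the catalog contents are all assumptions; the application does not prescribe how user portraits are combined.

```python
from collections import Counter


def family_hobbies(user_profiles, min_members=2):
    """Hobbies shared by at least `min_members` users form the family hobbies."""
    counts = Counter(h for profile in user_profiles for h in set(profile["hobbies"]))
    return sorted(h for h, count in counts.items() if count >= min_members)


def recommend(hobbies, catalog):
    """Collect catalog items that match each family hobby."""
    return [item for hobby in hobbies for item in catalog.get(hobby, [])]


# Hypothetical user portraits and recommendation catalog
profiles = [
    {"user": "U1", "hobbies": ["food", "sports"]},
    {"user": "U2", "hobbies": ["food", "reading"]},
    {"user": "U3", "hobbies": ["sports", "food"]},
]
catalog = {"food": ["restaurant coupon"], "sports": ["running shoes promotion"]}
```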
Preferably, the sample data comprises positive sample data, wherein the positive sample data comprises association information of a sample user associated with the sample network device;
as shown in fig. 2, the step S1 includes:
S11: establishing a plurality of user and communication device association groups according to communication record information of a plurality of users using different network devices;
S12: screening, based on a predetermined rule, the plurality of user and communication device association groups to determine a preferred network device corresponding to the same user, and entering the pair into the positive sample data as an associated sample user and sample network device.
Specifically, in step S11, a plurality of user and communication device association groups are established according to the communication record information of the plurality of users using different network devices, wherein the communication record information includes, but is not limited to, communication information of the plurality of users with different network devices. The association groups may be established by combining each user with each network device that the user has communicated with. For example, when the network devices are routers and the routers communicating with users U1 and U2 are R1 and R2, the following association groups may be established: (U1_R1, x1, x2, x3, ...), (U1_R2, x1, x2, x3, ...), (U2_R1, x1, x2, x3, ...), (U2_R2, x1, x2, x3, ...), wherein x1, x2, x3, ... represent items of communication information between the user and the router, and their number may be set according to specific requirements.
Specifically, in step S12, a preferred network device corresponding to the same user is determined by screening the plurality of user and communication device association groups based on a predetermined rule, and the pair is entered into the positive sample data as an associated sample user and sample network device, wherein the preferred network device is the network device most relevant to the family in which the sample user is located. Specifically, when the network device is a router, the router most relevant to the family in which the same user is located is determined from the established user and router association groups based on the predetermined rule. For example, if the routers communicating with users U1 and U2 are R1 and R2, the association groups (U1_R1, x1, x2, x3, ...), (U1_R2, x1, x2, x3, ...), (U2_R1, x1, x2, x3, ...), (U2_R2, x1, x2, x3, ...) may be established; if, based on the predetermined rule, the most relevant router of user U1 is determined to be R1 and the most relevant router of user U2 is determined to be R2, then (U1_R1, x1, x2, x3, ...) and (U2_R2, x1, x2, x3, ...) are entered into the positive sample data.
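The construction of association groups in step S11 can be sketched as below. The record layout (user, device, date, hour) and the three derived features standing in for x1, x2, x3 (distinct communication days, number of sessions, fraction of night-time sessions) are hypothetical choices; the application leaves the number and meaning of the features open.

```python
from collections import defaultdict


def build_association_groups(records):
    """Group raw communication records by (user, device) and derive features."""
    grouped = defaultdict(list)
    for user, device, day, hour in records:
        grouped[(user, device)].append((day, hour))
    groups = {}
    for key, events in grouped.items():
        days = len({day for day, _ in events})               # x1: distinct days
        sessions = len(events)                                # x2: session count
        night = sum(1 for _, hour in events if hour >= 20 or hour < 7) / sessions  # x3
        groups[key] = (days, sessions, night)
    return groups


# Hypothetical communication records: (user, router, date, hour of day)
records = [
    ("U1", "R1", "2015-10-01", 21),
    ("U1", "R1", "2015-10-02", 22),
    ("U1", "R2", "2015-10-01", 10),
    ("U2", "R2", "2015-10-01", 23),
]
groups = build_association_groups(records)
```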
More preferably, the predetermined rule includes at least any one of:
the distance information between the equipment position information of the preferred network equipment and the home position information of the same user is less than or equal to the preset associated distance threshold information;
the distance information between the device position information of other network devices used by the same user and the home position information is equal to or larger than the preset irrelevant distance threshold information;
the distance information between the device location information of the preferred network device and the home location information of the same user is smaller than the distance information between the device location information of other network devices used by the same user and the home location information.
Wherein the rule for screening the preferred network device may include at least any one of the following:
(1) the distance information between the device location information of the preferred network device and the home location information of the same user is less than or equal to the predetermined associated distance threshold information, wherein the home location information is determined in a manner including, but not limited to: determining according to the payment relation data, for example, according to the user common receiving address in the payment relation data; and determining according to the position relation data, for example, determining according to the position of the wireless activity hotspot in the position relation data, the wireless activity time and the like. Wherein the associated distance threshold information is a preset threshold of information related to the device location information of the preferred network device and the home location information of the same user. Specifically, when the network device is a router, and when distance information between the location information of the router and the home location information of the same user is less than or equal to the preset threshold, it is determined that the router is the most relevant router for the home where the same user is located, and the router is used as the preferred network device corresponding to the same user. For example, when the associated distance threshold information is set to 0.2 km and the distance between the router R and the home location information of the user U is less than or equal to 0.2 km, it is determined that the router R is the preferred network device of the user U.
(2) The distance information between the device location information of other network devices used by the same user and the home location information is equal to or greater than the predetermined irrelevant distance threshold information, wherein the irrelevant distance threshold information is a preset distance threshold at or beyond which the device location information of a network device is considered irrelevant to the home location information of the same user. Specifically, when the network device is a router and the distance information between the location information of the router and the home location information of the same user is equal to or greater than the preset threshold, it is determined that the router is an unrelated router for the family in which the same user is located. For example, if the irrelevant distance threshold information is set to 3 kilometers and the distances between routers R1 and R2 and the home location information of the same user U are both greater than or equal to 3 kilometers, routers R1 and R2 are determined to be non-preferred network devices for the family in which user U is located.
(3) The distance information between the device location information of the preferred network device and the home location information of the same user is smaller than the distance information between the device location information of other network devices used by the same user and the home location information. Specifically, when the network device is a router, the distance information between the router most related to the home where the same user is located and the home location information of the same user is smaller than the distance information between the location information of other router devices used by the same user and the home location information. For example, if the preferred network device of the user U is the router R1 and the non-preferred network devices are the routers R2 and R3, the distance between the router R1 and the home location information of the user U is smaller than the distances between the routers R2 and R3 and the home location information of the user U.
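The three distance rules can be illustrated together with a sketch that uses the haversine formula for distances and the example thresholds of 0.2 km and 3 km. Requiring all three rules at once is a conservative assumption made here for illustration, since the description only requires at least one of them; the coordinates and function names are also hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def pick_preferred_device(home, device_locations, near_km=0.2, far_km=3.0):
    """Return the nearest device if it satisfies rules (1)-(3), else None."""
    distances = {d: haversine_km(home, loc) for d, loc in device_locations.items()}
    nearest = min(distances, key=distances.get)           # rule (3): closest device
    others_far = all(dist >= far_km                        # rule (2): others are far
                     for d, dist in distances.items() if d != nearest)
    if distances[nearest] <= near_km and others_far:       # rule (1): within 0.2 km
        return nearest
    return None


# Hypothetical home location and router locations
home = (31.2304, 121.4737)
routers = {"R1": (31.2310, 121.4740), "R2": (31.3000, 121.5000), "R3": (31.2000, 121.4200)}
```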
More preferably, the sample data further comprises negative sample data, wherein the negative sample data comprises association information that the sample user is not associated with the sample network device;
referring to fig. 2, the step S1 further includes:
S13: selecting, according to accumulated traffic information between the same user and the other communication devices used by the same user, unrelated network devices corresponding to the same user, and entering each pair into the negative sample data as an unassociated sample user and sample network device.
Those skilled in the art will understand that, once the positive sample data corresponding to the preferred network device has been determined, the user is not associated with the communication devices other than the preferred network device, so several of those other communication devices can be selected as unrelated network devices corresponding to the user according to the accumulated traffic between the user and each of them, and negative sample data can thereby be constructed for machine learning. Specifically, in step S13, after the positive sample data corresponding to the preferred network device has been determined, several of the communication devices other than the preferred network device are selected according to the accumulated traffic information between the user and each communication device, for example, the communication devices whose accumulated traffic information exceeds a predetermined number of communication days or a predetermined communication duration, or the N communication devices with the most accumulated traffic for the user, as unrelated network devices corresponding to the user, that is, devices with which the user is not associated. The user and each selected unrelated network device are then entered into the negative sample data as unassociated sample data. For example, when the other network device is a router R, the accumulated traffic information is the number of communication days, and the preset threshold of the number of communication days is 10, then when the number of communication days between user U and router R is greater than or equal to 10, router R is determined to be an unrelated network device of user U, and the pair is entered into the negative sample data as an unassociated sample user and sample network device.
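A minimal sketch of the negative-sample selection in step S13, assuming the accumulated traffic information is the number of communication days and using the example threshold of 10 days. The function name and the input layout are hypothetical.

```python
def select_negative_samples(user, preferred_device, comm_days_by_device, min_days=10):
    """Pair the user with non-preferred devices used on at least `min_days` days."""
    return [(user, device)
            for device, days in sorted(comm_days_by_device.items())
            if device != preferred_device and days >= min_days]


# Hypothetical communication-day counts for user U
comm_days = {"R_home": 28, "R_office": 20, "R_cafe": 3, "R_gym": 10}
negatives = select_negative_samples("U", "R_home", comm_days)
```

Choosing frequently used non-home devices, such as an office router in this invented data, presumably gives the model harder negative examples than devices the user touched only once.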
More preferably, the number of communication users of the communication device in the user and communication device association set is less than or equal to the home user number threshold.
It will be appreciated by those skilled in the art that, in practice, a communication device used in a household is usually used by only a small number of users, while a communication device outside the household, such as one in a coffee shop or a library, is usually used by a large number of users. In this embodiment, communication devices that are obviously not household devices may therefore be filtered out in advance by a home user number threshold, that is, the number of communication users of the communication device in each user and communication device association group is less than or equal to the home user number threshold. Here, the home user number threshold may be the average number of home users or a multiple of that average. For example, assuming that the home user number threshold is 5, when the routers R1, R2, and R3 have association groups with 5, 2, and 10 different users, respectively, the association groups corresponding to the router R3 are deleted, and only the association groups corresponding to the routers R1 and R2 are retained.
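A minimal sketch of this pre-filtering step, assuming the association groups are held as (user, device) pairs; the function name and the data layout are illustrative, and the user counts mirror the R1, R2, R3 example.

```python
from collections import defaultdict


def filter_by_home_user_threshold(association_groups, threshold):
    """Keep only groups whose device has at most `threshold` distinct users."""
    users_per_device = defaultdict(set)
    for user, device in association_groups:
        users_per_device[device].add(user)
    return [(u, d) for u, d in association_groups
            if len(users_per_device[d]) <= threshold]


# Routers R1, R2 and R3 are used by 5, 2 and 10 different users.
groups = ([(f"A{i}", "R1") for i in range(5)]
          + [(f"B{i}", "R2") for i in range(2)]
          + [(f"C{i}", "R3") for i in range(10)])
kept = filter_by_home_user_threshold(groups, threshold=5)
kept_devices = sorted({d for _, d in kept})
```

With a threshold of 5, the groups for R3 are dropped and those for R1 and R2 remain.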
More preferably, the method further comprises:
S15 (not shown) determines the home user number threshold according to household census information for the area in which the communication device in the user and communication device association group is located.
Specifically, in step S15, because the average number of home users differs between the areas in which the communication devices in the user and communication device association groups are located, the average number of home users in an area may be determined from household census information, and that average or a multiple of it is used as the home user number threshold. For example, if the average number of home users in the Shanghai region is 3 according to census information, the home user number threshold for the Shanghai region may be set to 3, 6, 9, or 12.
More preferably, the step S11 includes:
S111 (not shown) merges the communication record information corresponding to different user identification information of the same user into the communication record information of the same user according to the mapping relationship between the user identification information.
S112 (not shown) establishes a plurality of user-to-communication device association groups according to the merged communication record information of the plurality of users using different network devices.
Specifically, in step S111, the user identification information includes, but is not limited to, the MAC address, IMEI number, and IMSI number of a communication device used by the user, an application id registered by the user, and the like. The mapping is implemented through UUIC services. Specifically, the MAC addresses, IMEI numbers, IMSI numbers, registered application ids, and the like of the different communication devices used by the same user are mapped to that user through the mapping relationship provided by the UUIC service, so that the communication record information corresponding to the different communication devices is merged into the communication record information of the same user. For example, the mobile devices used by the user U are handsets P1 and P2, whose IMEI numbers are imei1 and imei2, respectively; the communication record (imei1, R) of the handset P1 with the router R and the communication record (imei2, R) of the handset P2 with the router R are mapped through the UUIC service into a single communication record (U, R) of the user with the router.
Specifically, in step S112, a plurality of user and communication device association groups are established according to the merged communication record information of a plurality of users using different network devices. For example, the user U1 has mobile devices imei1 and imei2, where the mobile device imei1 has communication records (imei1, R1) and (imei1, R2) with the routers R1 and R2, and the mobile device imei2 has communication records (imei2, R1) and (imei2, R2) with the routers R1 and R2. Through the UUIC service, (imei1, R1) and (imei2, R1) are mapped to (U1, R1), and (imei1, R2) and (imei2, R2) are mapped to (U1, R2).
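The merging in steps S111 and S112 can be illustrated with the sketch below. The dictionary id_to_user merely stands in for the mapping relationship provided by the UUIC service, whose actual interface is not described in the text; the function name and the data layout are assumptions.

```python
def merge_records(records, id_to_user):
    """Map (identifier, device) records to de-duplicated (user, device) groups."""
    merged = set()
    for identifier, device in records:
        user = id_to_user.get(identifier)
        if user is not None:  # skip identifiers with no known user
            merged.add((user, device))
    return sorted(merged)


# Hypothetical stand-in for the identifier-to-user mapping.
id_to_user = {"imei1": "U1", "imei2": "U1"}
records = [("imei1", "R1"), ("imei1", "R2"),
           ("imei2", "R1"), ("imei2", "R2")]
association_groups = merge_records(records, id_to_user)
```

The four device-level records collapse into two user-level association groups, one per router.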
More preferably, referring to fig. 2, the step S1 further includes:
S14, extracting sample characteristic information in the sample data;
wherein the step S2 includes:
and determining corresponding associated decision model information by performing machine learning on the sample data and sample characteristic information in the sample data.
Specifically, in step S14, sample feature information is extracted from the sample data. The sample feature information comprises feature information of the network device itself, feature information of the user, and communication feature information between the user and the network device. For example, when the network device is a router, the sample feature information includes router feature information, user feature information, and user and router communication feature information. The router feature information includes, but is not limited to: the average number of users communicating with the router per day, the total number of users communicating with the router, the ratio of the number of users communicating with the router on working days to that on weekends, the proportion of users communicating with the router in different time periods, and the like. The user feature information includes, but is not limited to: the number of days on which the user communicated with any router on weekends or weekdays, the number of times the user communicated with any router in different time periods of the same day, and the like. The user and router communication feature information includes, but is not limited to: the number of days on which the user communicated with the router, the last date on which the user communicated with the router, the number of days on which the user communicated with the router on weekdays or weekends, the number of days on which the user communicated with the router in different time periods of the day, the number of days on which the user communicated with the router in each week, and so forth.
Specifically, in step S2, the corresponding associated decision model information is determined by performing machine learning on the sample data and the sample feature information therein. Specifically, the sample data and the sample feature information therein are combined into a training set (R_U, x1, x2, x3, x4, ..., label), where R_U represents an association group of a user U and a network device R; x1, x2, x3, x4, ... represent the sample feature information; and label takes the value 1 or 0, with 1 denoting a positive sample and 0 a negative sample. The corresponding associated decision model information is determined by performing machine learning on the training set (R_U, x1, x2, x3, x4, ..., label).
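As an illustration of the user and router communication features listed above, the following sketch derives a few of them from the dates on which a user communicated with one router. The function name, the dictionary keys, and the sample dates are assumptions made for this example.

```python
from datetime import date


def communication_features(dates):
    """Derive simple user-router features from communication dates."""
    days = set(dates)  # several sessions on one day count once
    weekend = sum(1 for d in days if d.weekday() >= 5)  # Saturday=5, Sunday=6
    return {
        "comm_days": len(days),
        "weekday_days": len(days) - weekend,
        "weekend_days": weekend,
        "last_date": max(days) if days else None,
    }


# 3 and 4 October 2015 fall on a weekend; 5 October 2015 is a Monday.
features = communication_features(
    [date(2015, 10, 3), date(2015, 10, 4), date(2015, 10, 5), date(2015, 10, 5)])
```

Time-of-day and per-week counts could be derived in the same way from timestamps rather than dates.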
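One possible way to assemble rows of the form (R_U, x1, x2, ..., label) is sketched below. The two feature columns (communication days and weekend days) and all values are hypothetical; only the row layout and the 1/0 labels come from the text.

```python
def build_training_set(positive_pairs, negative_pairs, features):
    """Build (group_id, x1, x2, ..., label) rows from labelled pairs.

    features maps (user, device) to a tuple of feature values.
    """
    rows = []
    for label, pairs in ((1, positive_pairs), (0, negative_pairs)):
        for user, device in pairs:
            group_id = f"{device}_{user}"  # e.g. "R1_U" for router R1, user U
            rows.append((group_id, *features[(user, device)], label))
    return rows


# Hypothetical features: (communication days, weekend days).
features = {("U", "R1"): (25, 8), ("U", "R2"): (12, 0)}
training_set = build_training_set([("U", "R1")], [("U", "R2")], features)
```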
More preferably, the step S3 includes:
S31 (not shown) extracts predicted feature information from the usage record information of the user about the network device based on the sample feature information;
S32 (not shown) applies the predicted feature information to the association decision model information to obtain family association information of the family of the user corresponding to the network device.
Specifically, in step S31, predicted feature information is extracted from the usage record information of the user about the network device based on the sample feature information. The predicted feature information consists of an association group of a user and a communication device together with feature information, and may be represented as (R_U, x1, x2, x3, x4, ..., -1), where R_U represents an association group of a user and a network device, and x1, x2, x3, x4, ... represent the contents of the predicted feature information. These contents are the same as those of the sample feature information listed in the foregoing embodiments and are not described again here.
Specifically, in step S32, the predicted feature information is applied to the association decision model information to obtain family association information of the family corresponding to the network device. Specifically, by inputting a plurality of pieces of predicted feature information (R_U, x1, x2, x3, x4, ..., -1) into the association decision model information, the association probability of the network device R and the user U is obtained, and thus the family association information of the user with the family corresponding to the network device is obtained.
Compared with the prior art, the method and the device have the advantages that the sample data is obtained, wherein the sample data comprises the associated information of the sample user and the sample network equipment, such as communication time, communication frequency, communication date and the like, the corresponding associated decision model information is determined by performing machine learning on the sample data, and the use record information of the user about the network equipment is applied to the associated decision model information, so that the family associated information of the family corresponding to the user and the network equipment is obtained. The corresponding association decision model information is determined through machine learning, so that the identification rate of the family association relationship of the user can be effectively improved.
Moreover, the communication record information corresponding to different user identification information of the same user can be merged into the communication record information of one user according to the mapping relation among the user identification information, and a plurality of user and communication equipment association groups are established according to the merged communication record information of a plurality of users using different network equipment. For example, by uniformly mapping the communication devices used by the home users, i.e., normalizing the communication devices to the same user, it is beneficial to expand the application of the internet behavior characteristics.
In addition, the method and the device can determine that two users belong to the same family by judging whether the family associated information of the two users and the family corresponding to the same network device is associated, determine a plurality of target users included in the target family according to the target network device corresponding to the target family, and determine the family portrait information of the target family according to the portrait information of the target users, so that recommendation information, such as promotion information, advertisement information and the like, can be provided for the target family according to the family portrait information, and are beneficial to the development of services taking the family as a unit.
Referring to fig. 3, there is illustrated an apparatus 1 for determining family property information of a user according to another aspect of the present application, wherein the apparatus comprises:
the sample acquisition device 11 is used for acquiring sample data, wherein the sample data comprises the associated information of a sample user and sample network equipment;
the model determining device 12 is used for determining corresponding associated decision model information by performing machine learning on the sample data;
and the model application device 13 is used for applying the usage record information of the user about the network equipment to the correlation decision model information so as to obtain family correlation information of the family corresponding to the user and the network equipment.
In this embodiment, the sample acquiring apparatus 11 acquires sample data, where the sample data includes association information of a sample user and a sample network device. Specifically, the network device may be a device that enables a user to access the internet, for example a router or a device that establishes a wireless access point, and the sample network device is a network device used as a sample to obtain the association decision model described below. The association information of the sample user and the sample network device includes all information relating the sample user to the sample network device, that is, information about the sample user accessing the sample network device, such as the time distribution of access over a short period (e.g., one day), the time distribution of access over a long period (e.g., one month), the access frequency, and the like.
Specifically, the manner of obtaining the sample data may include directly obtaining existing sample data from the local device, or may also include extracting the sample data from the collected communication data between the user and the network device for which the association relationship is determined.
It should be understood by those skilled in the art that the above-mentioned manner of acquiring sample data by the sample acquiring device 11 is only an example, and other manners of acquiring sample data that may occur now or hereafter may be applicable to the present application, and should be included within the scope of the present application, and is herein incorporated by reference.
Continuing in this embodiment, the model determining means 12 determines the corresponding associated decision model information by performing machine learning on the sample data. Specifically, the association decision model is used to determine whether the user and the network device have an association relationship. The association decision model may be implemented by establishing an artificial intelligence model; for example, a GBDT algorithm may be adopted, which is composed of a plurality of decision trees whose results are accumulated to give the final classification result. For example, machine learning training is repeatedly performed on the sample data by applying the GBDT algorithm until the output association relationship between the user and the network device reaches a certain accuracy, whereby the corresponding association decision model information is determined.
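To make the idea of accumulating the results of many trees concrete, the following is a toy gradient-boosting classifier in plain Python. It uses depth-one trees (stumps) fitted to the log-loss residuals, which is a deliberate simplification of a full GBDT library and is not the implementation described in this application. The two feature columns, the toy data, the tree count, and the learning rate are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature split that best fits the residuals (squared error)."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]


def fit_gbdt(X, y, n_trees=20, lr=0.5):
    """Fit boosted stumps; assumes both classes are present in y."""
    p0 = sum(y) / len(y)
    base = math.log(p0 / (1 - p0))  # initial log-odds
    scores = [base] * len(X)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the log loss: label minus predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        scores = [s + lr * (lm if row[j] <= t else rm)
                  for row, s in zip(X, scores)]
    return base, lr, stumps


def predict_proba(model, row):
    """Accumulate all stump outputs and convert the score to a probability."""
    base, lr, stumps = model
    score = base + sum(lr * (lm if row[j] <= t else rm)
                       for j, t, lm, rm in stumps)
    return sigmoid(score)


# Hypothetical features: (communication days, night-time communication days).
X = [[20, 15], [18, 12], [25, 20], [3, 0], [2, 1], [5, 0]]
y = [1, 1, 1, 0, 0, 0]
model = fit_gbdt(X, y)
p_home = predict_proba(model, [22, 18])
p_other = predict_proba(model, [1, 0])
```

In this toy data the frequently used router receives a probability above 0.5 and the rarely used one a probability below 0.5; a production system would typically rely on an established GBDT library with deeper trees and held-out validation.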
Continuing in this embodiment, the model application means 13 applies the usage record information of the user about the network device to the association decision model information to obtain the family association information of the family to which the user corresponds to the network device. Wherein the usage record information includes communication information of the user with the plurality of network devices, and the like. The family association information is whether the user has an association relationship with a family corresponding to the network device. Specifically, by extracting the usage record information and applying the extracted usage record information to the association decision model information determined by the model determining device 12, it is possible to obtain whether the user has an association relationship with the family corresponding to the network device, so as to determine family association information of the user with the family corresponding to the network device.
Preferably, the model application means 13 comprise:
a device-associated information obtaining unit (not shown) that applies usage record information of a user about a network device to the association decision model information to obtain device-associated information of the user and the network device;
a family related information determining unit (not shown) that determines the family related information of the family corresponding to the network device of the user as being related when the device related information exceeds predetermined related threshold information.
Specifically, the device association information obtaining unit applies usage record information of a user about a network device to the association decision model information to obtain device association information of the user and the network device, wherein the device association information of the user and the network device can be represented by an association probability of the user and the network device. Specifically, the communication information of the user about the network equipment is applied to the association decision model information, the association probability of the user and the network equipment is obtained, and therefore the equipment association information of the user and the network equipment is determined according to the magnitude of the association probability. For example, the user is U, the network device is router R, the communication information of the user U and the router R is input into the association decision model information, the association probability of the user U and the router R is obtained, and whether the user U is associated with the router R is determined according to the magnitude of the association probability.
Specifically, when the device association information exceeds predetermined association threshold information, the family association information determining unit determines that the family association information of the family corresponding to the user and the network device is associated, where the association threshold information is a threshold of an association probability between the user and the network device, and the threshold is set in advance. Specifically, by comparing the obtained association probability of the user and the network device with a predetermined association probability threshold, when the association probability is greater than the predetermined association probability threshold, it is determined that the family association information of the family corresponding to the user and the network device is associated. For example, the threshold of the association probability between the user and the network device may be set to 80%, and when the network device is a router R and the user is a U, if the association probability between the router R and the user U is greater than 80%, it is determined that the family association information of the family corresponding to the router R is associated with the user U.
Preferably, the apparatus further comprises:
and the same family determining means (not shown) determines that two users belong to the same family when the family associated information of the two users and the family corresponding to the same network device are both associated.
In this embodiment, when the family related information of two users and the family corresponding to the same network device are both related, the same family determining device determines that the two users belong to the same family, specifically, the model application device 13 obtains the association probabilities of the two users and the same network device, respectively, and when the association probabilities of the two users and the same network device are both greater than the threshold of the association probabilities of the two users and the same network device, that is, the family related information of the two users and the family corresponding to the same network device are both related, it determines that the two users belong to the same family. For example, when the network device is a router R, the threshold of the association probability is 80%, and the association probabilities of the user U1 and the user U2 with the router R are both greater than 80%, it is determined that the user U1 and the user U2 are associated with the family where the router R is located, so that it is determined that the user U1 and the user U2 belong to the same family.
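The threshold comparison and the same-family decision can be sketched as follows. The function names are illustrative; the 80% threshold and the strict "greater than" comparison follow the examples in the text.

```python
def is_family_associated(probability, threshold=0.8):
    """A user is associated with the router's family above the threshold."""
    return probability > threshold


def same_family(prob_user1, prob_user2, threshold=0.8):
    """Two users belong to the same family when both are associated
    with the family of the same network device."""
    return (is_family_associated(prob_user1, threshold)
            and is_family_associated(prob_user2, threshold))


# Hypothetical association probabilities of U1 and U2 with router R.
u1_u2_same_family = same_family(0.92, 0.85)
```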
Preferably, the apparatus further comprises:
a home user determining device (not shown) that determines a plurality of target users included in a target home according to target network devices corresponding to the target home, where the target users are associated with home associated information of a home corresponding to the target network devices.
In this embodiment, the home user determining apparatus determines, according to a target network device corresponding to a target home, a plurality of target users included in the target home, where the target users are associated with home association information of a home corresponding to the target network device, where each target home has a plurality of target users, and determines, according to the home association information of the plurality of target users and the home corresponding to the target network device, the plurality of target users included in the target home. For example, the target network device corresponding to the target home is the router R, the association probabilities of the users U1, U2, U3, U4 and the router R are all greater than the association threshold information, and it is determined that the users U1, U2, U3, U4 are associated with the home where the router R is located, so that it is determined that the users U1, U2, U3, U4 are multiple target users included in the target home.
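A short sketch of collecting the target users of each target home from the association probabilities. The data layout, the function name, and the probability values are assumptions; the 0.8 threshold reuses the earlier example.

```python
from collections import defaultdict


def household_members(probabilities, threshold=0.8):
    """Group users by the network device whose family they are associated with.

    probabilities maps (user, device) to an association probability.
    """
    households = defaultdict(list)
    for (user, device), p in sorted(probabilities.items()):
        if p > threshold:
            households[device].append(user)
    return dict(households)


probs = {("U1", "R"): 0.95, ("U2", "R"): 0.90, ("U3", "R"): 0.85,
         ("U4", "R"): 0.81, ("U5", "R"): 0.30}
members = household_members(probs)
```

Users U1 to U4 are returned as the target users of the home of router R, while U5 falls below the threshold.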
More preferably, the apparatus further comprises:
a family representation determining means (not shown) for determining family representation information of the target family from user representation information of the target user;
recommendation information providing means (not shown) for providing recommendation information to the target family based on the family representation information.
Specifically, the family portrait determination device determines the family portrait information of the target family according to the user portrait information of the target users. The user portrait information represents a set of information characterizing the user, including but not limited to the user's gender, age, occupation, educational background, skills, hobbies, and the like. The family portrait information represents a set of information characterizing the family, including but not limited to family background, family hobbies, family income, family attitudes to life, and the like. Determining the family portrait information of the target family according to the user portrait information of the target users means determining the family feature information of the family in which the target users are located by analyzing the user feature information of the target users. For example, by analyzing the favorite sports of the users of the target family, it may be determined that the family hobbies of the target family include sports.
Continuing in this embodiment, the recommendation information providing device provides recommendation information for the target family based on the family portrait information. The recommendation information includes, but is not limited to, promotion information, advertisement information, financial information, and the like. Matched recommendation information can be provided for the target family according to the family feature information in the family portrait, such as family hobbies, family income, and the like. For example, if the family hobbies in the family portrait information of the target family include food, matching food information may be recommended to the target family.
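The aggregation of user portraits into a family portrait and the matching of recommendations can be illustrated as follows. The portrait fields, the catalog, and the ranking by frequency are all assumptions made for this sketch; the text does not specify how portraits are combined.

```python
from collections import Counter


def family_portrait(user_portraits):
    """Aggregate user hobbies into family hobbies, most common first."""
    hobbies = Counter(h for p in user_portraits for h in p.get("hobbies", []))
    return {"hobbies": [h for h, _ in hobbies.most_common()]}


def recommend(portrait, catalog):
    """Return catalog items matching the family hobbies, in hobby order."""
    return [item for hobby in portrait["hobbies"]
            for item in catalog.get(hobby, [])]


# Hypothetical user portraits and recommendation catalog.
users = [{"hobbies": ["food", "sports"]}, {"hobbies": ["food"]}]
catalog = {"food": ["restaurant coupon"], "sports": ["running shoes"]}
portrait = family_portrait(users)
recommendations = recommend(portrait, catalog)
```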
Referring to fig. 4, preferably, the sample data includes positive sample data, wherein the positive sample data includes association information of a sample user associated with a sample network device;
wherein the sample acquiring device 11 comprises:
an association group establishing unit 111 that establishes association groups of a plurality of users and communication devices according to communication record information of the plurality of users using different network devices;
the positive sample obtaining unit 112 filters and determines a preferred network device corresponding to the same user from the multiple user and communication device association groups based on a predetermined rule, and enters the positive sample data as the associated sample user and sample network device.
Specifically, the association group establishing unit 111 establishes a plurality of user and communication device association groups according to communication record information of a plurality of users using different network devices, where the communication record information includes, but is not limited to, communication information of the plurality of users with the different network devices. Specifically, the association groups may be established by combining the plurality of users with the plurality of network devices. For example, when the network devices are routers and the routers communicating with the users U1 and U2 are R1 and R2, the following association groups may be established: (U1_R1, x1, x2, x3, ...), (U1_R2, x1, x2, x3, ...), (U2_R1, x1, x2, x3, ...), (U2_R2, x1, x2, x3, ...), where x1, x2, x3, ... represent communication information of the users with the different routers, and their number may be set according to specific requirements.
Specifically, the positive sample obtaining unit 112 screens the plurality of user and communication device association groups based on a predetermined rule to determine a preferred network device corresponding to the same user, and enters the pair into the positive sample data as an associated sample user and sample network device, where the preferred network device is the network device most relevant to the home in which the sample user is located. Specifically, when the network device is a router, the router most relevant to the home of the same user is determined from the established association groups of the plurality of users and routers based on the predetermined rule. For example, if the routers communicating with the users U1 and U2 are R1 and R2, the association groups (U1_R1, x1, x2, x3, ...), (U1_R2, x1, x2, x3, ...), (U2_R1, x1, x2, x3, ...), (U2_R2, x1, x2, x3, ...) may be established; if, based on the predetermined rule, the most relevant router of the user U1 is selected as R1 and the most relevant router of the user U2 is selected as R2, then (U1_R1, x1, x2, x3, ...) and (U2_R2, x1, x2, x3, ...) are entered into the positive sample data.
More preferably, the predetermined rule includes at least any one of:
the distance information between the equipment position information of the preferred network equipment and the home position information of the same user is less than or equal to the preset associated distance threshold information;
the distance information between the device position information of other network devices used by the same user and the home position information is equal to or larger than the preset irrelevant distance threshold information;
the distance information between the device location information of the preferred network device and the home location information of the same user is smaller than the distance information between the device location information of other network devices used by the same user and the home location information.
Wherein the rule for screening the preferred network device may include at least any one of the following:
(1) the distance information between the device location information of the preferred network device and the home location information of the same user is less than or equal to the predetermined associated distance threshold information, wherein the home location information is determined in a manner including, but not limited to: determining according to the payment relation data, for example, according to the user common receiving address in the payment relation data; and determining according to the position relation data, for example, determining according to the position of the wireless activity hotspot in the position relation data, the wireless activity time and the like.
Here, the associated distance threshold information is a preset distance threshold within which the device location information of a network device is considered related to the home location information of the same user. Specifically, when the network device is a router and the distance information between the location information of the router and the home location information of the same user is less than or equal to the preset threshold, the router is determined to be the router most relevant to the home of that user and is used as the preferred network device corresponding to that user. For example, when the associated distance threshold information is set to 0.2 km and the distance between the router R and the home location information of the user U is less than or equal to 0.2 km, the router R is determined to be the preferred network device of the user U.
(2) The distance information between the device location information of other network devices used by the same user and the home location information is equal to or greater than the preset irrelevant distance threshold information, where the irrelevant distance threshold information is a preset distance threshold at or beyond which the device location information of a network device is considered unrelated to the home location information of the same user. Specifically, when the network device is a router and the distance information between the location information of the router and the home location information of the same user is equal to or greater than the preset threshold, the router is determined to be unrelated to the home of that user. For example, if the irrelevant distance threshold information is set to 3 km and the distances between the routers R1 and R2 and the home location information of the user U are both greater than or equal to 3 km, the routers R1 and R2 are determined to be non-preferred network devices of the home of the user U.
(3) The distance information between the device location information of the preferred network device and the home location information of the same user is smaller than the distance information between the device location information of other network devices used by the same user and the home location information. Specifically, when the network device is a router, the distance information between the router most related to the home where the same user is located and the home location information of the same user is smaller than the distance information between the location information of other router devices used by the same user and the home location information. For example, if the preferred network device of the user U is the router R1 and the non-preferred network devices are the routers R2 and R3, the distance between the router R1 and the home location information of the user U is smaller than the distances between the routers R2 and R3 and the home location information of the user U.
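The three distance rules can be sketched in Python as below. The text allows the predetermined rule to include at least any one of the three conditions; this sketch conservatively requires all three, which is an assumption, as are the function name and the input layout (precomputed distances in kilometres from each device to the user's home location). The 0.2 km and 3 km defaults come from the examples above.

```python
def pick_preferred_device(distances_km, assoc_threshold=0.2,
                          irrelevant_threshold=3.0):
    """Return the preferred device, or None if the rules are not all met.

    distances_km maps each device used by the user to its distance (km)
    from the user's home location; it must not be empty.
    """
    nearest = min(distances_km, key=distances_km.get)
    others = [d for d in distances_km if d != nearest]
    # Rule 1: the candidate lies within the associated distance threshold.
    rule1 = distances_km[nearest] <= assoc_threshold
    # Rule 2: every other device lies at or beyond the irrelevant threshold.
    rule2 = all(distances_km[d] >= irrelevant_threshold for d in others)
    # Rule 3: the candidate is strictly closer than every other device.
    rule3 = all(distances_km[nearest] < distances_km[d] for d in others)
    return nearest if (rule1 and rule2 and rule3) else None


preferred = pick_preferred_device({"R1": 0.1, "R2": 3.5, "R3": 8.0})
ambiguous = pick_preferred_device({"R1": 0.1, "R2": 1.0})
```

In the second call no device is returned, because R2 is neither within 0.2 km nor at least 3 km away, so the user would contribute no positive sample under this strict combination of the rules.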
More preferably, as shown in fig. 4, the sample data further includes negative sample data, where the negative sample data includes association information that the sample user is not associated with the sample network device;
wherein the sample acquiring device 11 further comprises:
the negative sample acquiring unit 113 preferentially selects an unrelated network device corresponding to the same user according to the accumulated traffic information between the same user and the other used communication devices, and records the negative sample data as an unrelated sample user and sample network device.
Those skilled in the art will understand that, once the positive sample data corresponding to the preferred network device has been determined, the user is treated as not associated with any communication device other than the preferred network device. Several of the other communication devices can therefore be selected as irrelevant network devices corresponding to the user according to the accumulated traffic between the user and those devices, and negative sample data can then be constructed for machine learning. Specifically, after the positive sample data corresponding to the preferred network device is determined, several communication devices are selected from the communication devices other than the preferred network device according to the accumulated traffic information between the user and each communication device. For example, the communication devices whose accumulated traffic information with the user exceeds a predetermined number of communication days or a predetermined communication duration, or the top N communication devices with the most accumulated traffic with the user, are taken as the irrelevant network devices corresponding to the user; that is, the user is not associated with any of these irrelevant network devices. The user and each selected irrelevant network device are then recorded in the negative sample data as unassociated sample data. For example, when the other network device is the router R, the accumulated traffic information is the number of communication days, and the preset threshold for the number of communication days is 10, the router R is determined to be an irrelevant network device of the user U when the number of communication days between the user U and the router R is greater than or equal to 10, and the pair is recorded in the negative sample data as an unassociated sample user and sample network device.
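The negative sample selection described above can be sketched as follows. This is a hedged illustration only: the function name, the dictionary input, and the label value 0 are assumptions, while the 10-day threshold and the top-N alternative follow the text.

```python
def select_negative_samples(user, comm_days, preferred, min_days=10, top_n=None):
    """Build negative samples (user, device, 0) from non-preferred devices.

    comm_days: dict mapping device id -> number of days the user communicated with it
    preferred: the device already recorded as the positive sample
    min_days:  communication-day threshold (10 in the example in the text)
    top_n:     if given, take the N most-used other devices instead of thresholding
    """
    candidates = {d: n for d, n in comm_days.items() if d != preferred}
    if top_n is not None:
        # Most communication days first; ties broken by device id for determinism.
        chosen = sorted(candidates, key=lambda d: (-candidates[d], d))[:top_n]
    else:
        chosen = sorted(d for d, n in candidates.items() if n >= min_days)
    return [(user, d, 0) for d in chosen]

days = {"R1": 40, "R2": 12, "R3": 3, "R4": 10}
negatives = select_negative_samples("U", days, preferred="R1")
```

Frequently used but non-home devices (for example an office router) make informative negatives, which is presumably why heavily used devices are preferred over rarely seen ones.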
More preferably, the number of communication users of the communication device in the user and communication device association group is less than or equal to the home user number threshold.
Those skilled in the art will appreciate that, in practice, a communication device used in a household usually serves only a small number of users, whereas a communication device outside the household, such as one in a coffee shop or a library, usually serves a large number of users. In this embodiment, communication devices that are obviously not household devices can therefore be filtered out in advance using a home user number threshold; that is, only association groups whose communication device has a number of communication users less than or equal to the home user number threshold are kept. Here, the home user number threshold includes the average number of home users or a certain multiple of that average. For example, assuming the home user number threshold is 5, when the routers R1, R2, and R3 have established association groups with 5, 2, and 10 different users respectively, the association groups corresponding to the router R3 are deleted and only the association groups corresponding to the routers R1 and R2 are retained.
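The filtering step can be sketched in a few lines of Python. The function name and the representation of association groups as (user, device) pairs are assumptions for illustration; the threshold of 5 and the user counts reproduce the example above.

```python
from collections import defaultdict

def filter_by_user_count(pairs, max_users=5):
    """Keep only (user, device) pairs whose device serves at most max_users users."""
    users_per_device = defaultdict(set)
    for user, device in pairs:
        users_per_device[device].add(user)
    return [(u, d) for u, d in pairs if len(users_per_device[d]) <= max_users]

# R1, R2 and R3 are used by 5, 2 and 10 distinct users, as in the example.
pairs = (
    [(f"a{i}", "R1") for i in range(5)]
    + [(f"b{i}", "R2") for i in range(2)]
    + [(f"c{i}", "R3") for i in range(10)]
)
kept = filter_by_user_count(pairs, max_users=5)
kept_devices = {d for _, d in kept}
```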
More preferably, the apparatus further comprises:
a user number threshold determining device (not shown) configured to determine the home user number threshold according to household census information of the region where the communication device in the user and communication device association group is located.
Specifically, the user number threshold determining device determines the home user number threshold according to household census information of the region where the communication device in the user and communication device association group is located. Because the average number of home users differs between regions, the average number of home users of a region can be determined from its household census information, and that average, or a multiple of it, is used as the home user number threshold. For example, if the average number of home users in the Shanghai region is 3 according to census information, the home user number threshold for the Shanghai region may be set to 3, 6, 9, or 12.
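As a small worked example of the threshold rule, the sketch below multiplies a regional average household size by an integer multiple. The function name and the rounding up of non-integer averages are assumptions; the Shanghai average of 3 and the resulting thresholds 3, 6, 9, and 12 come from the text.

```python
import math

def household_user_threshold(avg_household_size, multiple=1):
    """Return the home user number threshold for a region.

    avg_household_size: average number of users per household from census data
    multiple:           integer multiple of the average (1 uses the average itself)
    Rounding up is an assumption for averages that are not whole numbers.
    """
    if avg_household_size <= 0 or multiple < 1:
        raise ValueError("average size must be positive and multiple at least 1")
    return math.ceil(avg_household_size * multiple)

shanghai_avg = 3
thresholds = [household_user_threshold(shanghai_avg, m) for m in (1, 2, 3, 4)]
```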
More preferably, the association group establishing unit 111 includes:
a communication record information merging subunit (not shown) that merges communication record information corresponding to different user identification information of the same user into communication record information of one user according to a mapping relationship between the user identification information;
and an association group establishing subunit (not shown) for establishing a plurality of user and communication device association groups according to the merged communication record information of the plurality of users using different network devices.
Specifically, the communication record information merging subunit merges the communication record information corresponding to different user identification information of the same user into the communication record information of that user according to the mapping relationship between the user identification information, where the user identification information includes, but is not limited to, the MAC address, IMEI number, and IMSI number of a communication device used by the user, an application ID registered by the user, and the like. The mapping is implemented through the UUIC service. Specifically, the MAC addresses, IMEI numbers, and IMSI numbers of the different communication devices used by the same user, the application IDs registered by the user, and the like are mapped to the same user through the mapping relationship provided by the UUIC service, so that the communication record information corresponding to the different communication devices is merged into the communication record information of the same user. For example, the mobile devices used by the user U are the handsets P1 and P2, whose IMEI numbers are imei1 and imei2 respectively; the communication record (imei1, R) between the handset P1 and the router R and the communication record (imei2, R) between the handset P2 and the router R are both mapped to the communication record (U, R) between the user and the router through the UUIC service.
Specifically, the association group establishing subunit (not shown) establishes a plurality of user and communication device association groups according to the merged communication record information of the plurality of users using different network devices. For example, the user U1 has the mobile devices imei1 and imei2, where the mobile device imei1 has the communication records (imei1, R1) and (imei1, R2) with the routers R1 and R2, and the mobile device imei2 has the communication records (imei2, R1) and (imei2, R2) with the routers R1 and R2. Through the UUIC service, (imei1, R1) and (imei2, R1) are mapped to (U1, R1), and (imei1, R2) and (imei2, R2) are mapped to (U1, R2).
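The merging step can be sketched as below. The UUIC service itself is not modeled; a plain dictionary from identifier to user stands in for it, and the function name and the fallback of keeping unmapped identifiers as their own user are assumptions.

```python
def merge_records(records, id_to_user):
    """Collapse (identifier, device) records into distinct (user, device) groups.

    records:    iterable of (identifier, device), e.g. ("imei1", "R1")
    id_to_user: dict standing in for the identifier-to-user mapping service
    Identifiers without a mapping are kept as their own user (an assumption).
    """
    merged = set()
    for identifier, device in records:
        user = id_to_user.get(identifier, identifier)
        merged.add((user, device))
    return sorted(merged)

records = [("imei1", "R1"), ("imei1", "R2"), ("imei2", "R1"), ("imei2", "R2")]
mapping = {"imei1": "U1", "imei2": "U1"}
groups = merge_records(records, mapping)
```

Using a set means that records from several devices of one user collapse into a single association group per router, which matches the (U1, R1) and (U1, R2) result in the example.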
As shown in fig. 4, more preferably, the sample acquiring device 11 further comprises:
a feature information extraction unit 114 that extracts sample feature information in the sample data;
wherein the model determining device 12 is configured to:
determine corresponding association decision model information by performing machine learning on the sample data and the sample feature information in the sample data.
Specifically, the feature information extraction unit 114 extracts sample feature information from the sample data, where the sample feature information includes feature information of the network device itself, feature information of the user, and communication feature information between the user and the network device. For example, when the network device is a router, the sample feature information includes router feature information, user feature information, and user and router communication feature information. The router feature information includes, but is not limited to: the average number of users communicating with the router per day, the total number of users communicating with the router, the ratio of the number of users communicating with the router on working days to that on weekends, the proportion of users communicating with the router in different time periods, and the like. The user feature information includes, but is not limited to: the number of days on which the user communicates with any router on weekends or on working days, the number of times the user communicates with any router in different time periods of the same day, and the like. The user and router communication feature information includes, but is not limited to: the number of days on which the user communicated with the router, the last date on which the user communicated with the router, the number of days on which the user communicated with the router on working days or on weekends, the number of days on which the user communicated with the router in different time periods of the day, the number of days on which the user communicated with the router in each week, and so forth.
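A few of the listed features can be computed as in the sketch below. It assumes, purely for illustration, that communication records are (user, router, date) tuples; the function name and feature keys are hypothetical, and only a small subset of the features named above is covered.

```python
from datetime import date

def extract_features(records, user, router):
    """Compute a few user/router features from (user, router, date) records."""
    pair_days = {d for u, r, d in records if u == user and r == router}
    router_users = {u for u, r, _ in records if r == router}
    user_days_all = {d for u, _, d in records if u == user}
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    weekend = {d for d in pair_days if d.weekday() >= 5}
    return {
        "pair_days": len(pair_days),                         # days user used this router
        "pair_weekday_days": len(pair_days) - len(weekend),
        "pair_weekend_days": len(weekend),
        "pair_last_date": max(pair_days) if pair_days else None,
        "router_total_users": len(router_users),             # router feature
        "user_days_all_routers": len(user_days_all),         # user feature
    }

records = [
    ("U", "R", date(2015, 10, 5)),   # Monday
    ("U", "R", date(2015, 10, 5)),   # duplicate record on the same day
    ("U", "R", date(2015, 10, 3)),   # Saturday
    ("V", "R", date(2015, 10, 4)),   # another user on the same router
    ("U", "R2", date(2015, 10, 6)),  # same user on a different router
]
features = extract_features(records, "U", "R")
```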
Continuing in this embodiment, the model determining device 12 determines corresponding association decision model information by performing machine learning on the sample data and the sample feature information therein. Specifically, the sample data and the sample feature information therein are assembled into a training set (R_U, x1, x2, x3, x4, ..., label), where R_U represents an association group of the user U and the network device R; x1, x2, x3, x4, ... represent the items of sample feature information; and label takes the value 1 or 0, with 1 for positive samples and 0 for negative samples. The corresponding association decision model information is determined by performing machine learning on the training set (R_U, x1, x2, x3, x4, ..., label).
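The text does not name a learning algorithm, so the following sketch swaps in plain logistic regression trained by batch gradient descent as one possible stand-in. The function names, hyperparameters, and the two toy features (assumed to be scaled to the range 0 to 1) are all assumptions and are not taken from the patent.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(rows, epochs=3000, lr=0.5):
    """Fit weights and bias on rows of (features, label) with label 1 or 0."""
    n = len(rows[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for x, y in rows:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                gw[i] += err * xi
            gb += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, gw)]
        b -= lr * gb / len(rows)
    return w, b

def predict_proba(model, x):
    """Association probability for one feature vector."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy training set: (x1, x2) with label 1 for positive and 0 for negative samples.
training_set = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
    ((0.2, 0.1), 0), ((0.1, 0.2), 0), ((0.3, 0.1), 0),
]
model = train_logistic(training_set)
```

Any classifier that outputs a probability could take the place of this one; the R_U key is carried alongside the features only to identify the association group and is not itself a model input.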
More preferably, the model application device 13 is configured to:
extract predicted feature information from the usage record information of the user about the network device according to the sample feature information;
and apply the predicted feature information to the association decision model information to obtain family association information of the user and the family corresponding to the network device.
In this embodiment, the model application device 13 extracts predicted feature information from the usage record information of the user about the network device according to the sample feature information. The predicted feature information consists of an association group of a user and a communication device together with feature information, and may be represented as (R_U, x1, x2, x3, x4, ..., -1), where R_U represents the association group of the user and the network device, and x1, x2, x3, x4, ... represent the items of predicted feature information. The items included in the predicted feature information are the same as those included in the sample feature information, which are listed in the foregoing embodiments and are not described here again.
Continuing in this embodiment, the model application device 13 applies the predicted feature information to the association decision model information to obtain family association information of the user and the family corresponding to the network device. Specifically, by inputting a plurality of pieces of predicted feature information (R_U, x1, x2, x3, x4, ..., -1) into the association decision model information, the association probability of the network device R and the user U is obtained, and the family association information of the user and the family corresponding to the network device is obtained from it.
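Applying the model can be sketched as below. The trained model is represented by any callable that returns a probability; the toy stand-in model, the function names, and the 0.5 threshold value are assumptions. Comparing the probability against a preset association threshold follows claim 2.

```python
def family_association(samples, predict_proba, threshold=0.5):
    """Score each (user, device) pair and flag it as associated above the threshold.

    samples:       list of (user, device, features)
    predict_proba: callable mapping a feature vector to an association probability
    Returns a dict (user, device) -> (probability, associated flag).
    """
    result = {}
    for user, device, features in samples:
        p = predict_proba(features)
        result[(user, device)] = (p, p > threshold)
    return result

def toy_model(features):
    """Hypothetical stand-in for a trained model: the mean of the scaled features."""
    return sum(features) / len(features)

samples = [("U1", "R1", (0.9, 0.8)), ("U2", "R1", (0.2, 0.1))]
scored = family_association(samples, toy_model, threshold=0.5)
```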
Compared with the prior art, the present application acquires sample data, where the sample data includes association information of a sample user and a sample network device, such as communication time, communication frequency, and communication date; determines corresponding association decision model information by performing machine learning on the sample data; and applies the usage record information of a user about a network device to the association decision model information to obtain family association information of the user and the family corresponding to the network device. Because the association decision model information is determined through machine learning, the identification rate of the family association relationship of the user can be effectively improved.
Moreover, the present application can merge the communication record information corresponding to different user identification information of the same user into the communication record information of one user according to the mapping relationship between the user identification information, and establish a plurality of user and communication device association groups according to the merged communication record information of a plurality of users using different network devices. For example, uniformly mapping the communication devices used by a home user to the same user, that is, normalizing them, helps expand the application of internet behavior characteristics.
In addition, the present application can determine that two users belong to the same family when the family association information of each of the two users with the family corresponding to the same network device is "associated"; determine a plurality of target users included in a target family according to the target network device corresponding to the target family; and determine family portrait information of the target family according to the user portrait information of the target users, so that recommendation information, such as promotion information and advertisement information, can be provided to the target family according to the family portrait information. This benefits the development of services that take the family as a unit.
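The same-family determination and the family portrait can be illustrated with a short sketch. The text does not say how user portraits are combined, so counting portrait tags across members is an assumption, as are the function names and the example tags.

```python
from collections import Counter, defaultdict

def group_households(associations):
    """Group users by the network device they are judged to be associated with.

    associations: iterable of (user, device) pairs whose family association
    information is "associated". Users sharing a device form one family.
    """
    households = defaultdict(set)
    for user, device in associations:
        households[device].add(user)
    return dict(households)

def household_portrait(members, user_portraits):
    """Aggregate member portrait tags into a family portrait by simple counting."""
    tags = Counter()
    for user in members:
        tags.update(user_portraits.get(user, ()))
    return tags

associated = [("U1", "R1"), ("U2", "R1"), ("U3", "R2")]
households = group_households(associated)
portraits = {"U1": ["sports", "infant"], "U2": ["infant"]}
family_tags = household_portrait(households["R1"], portraits)
```

The most frequent tags of a family could then drive recommendation, for example by selecting promotions that match `family_tags.most_common(1)`.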
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (20)

1. A method for determining family attribute information of a user, wherein the method comprises:
acquiring sample data, wherein the sample data comprises associated information of a sample user and sample network equipment, the sample data comprises positive sample data and negative sample data, and the associated information is related information of the sample user accessing the sample network equipment;
determining corresponding associated decision model information by performing machine learning on the sample data;
applying usage record information of a user about network equipment to the association decision model information to obtain family association information of the user and a family corresponding to the network equipment;
acquiring sample data, comprising:
establishing a plurality of user and communication equipment association groups according to the communication record information of a plurality of users using different network equipment;
screening and determining preferred network equipment corresponding to the same user from the plurality of user and communication equipment association groups based on a predetermined rule, and recording the preferred network equipment as associated sample user and sample network equipment into the positive sample data;
preferentially selecting irrelevant network equipment corresponding to the same user according to accumulated traffic information between the same user and other used communication equipment, and recording it in the negative sample data as unassociated sample user and sample network equipment;
the predetermined rule includes at least any one of:
the distance information between the equipment position information of the preferred network equipment and the home position information of the same user is less than or equal to the preset associated distance threshold information;
the distance information between the device position information of other network devices used by the same user and the home position information is equal to or larger than the preset irrelevant distance threshold information;
the distance information between the device location information of the preferred network device and the home location information of the same user is smaller than the distance information between the device location information of other network devices used by the same user and the home location information.
2. The method of claim 1, wherein the applying usage record information of a user about a network device to the association decision model information to obtain family association information of a family of the user corresponding to the network device comprises:
applying usage record information of a user about a network device to the association decision model information to obtain device association information of the user and the network device;
and when the equipment association information exceeds the preset association threshold information, determining that the family association information of the family corresponding to the user and the network equipment is associated.
3. The method of claim 1, wherein the method further comprises:
and when the family association information of each of two users with the family corresponding to the same network equipment is associated, determining that the two users belong to the same family.
4. The method of claim 1, wherein the method further comprises:
determining a plurality of target users included in a target family according to target network equipment corresponding to the target family, wherein the target users are associated with family associated information of the family corresponding to the target network equipment.
5. The method of claim 4, wherein the method further comprises:
determining family portrait information of the target family according to the user portrait information of the target user;
and providing recommendation information for the target family according to the family portrait information.
6. The method of claim 1, wherein the number of communication users of the communication device in the user and communication device association group is less than or equal to a home user number threshold.
7. The method of claim 6, wherein the method further comprises:
and determining the threshold value of the number of the family users according to the family census information of the region where the communication equipment is located in the user and communication equipment association group.
8. The method of claim 1, wherein the establishing a plurality of user-to-communication device associations according to the communication record information of the plurality of users using different network devices comprises:
merging the communication record information corresponding to different user identification information of the same user into the communication record information of the same user according to the mapping relation among the user identification information;
and establishing a plurality of user and communication equipment association groups according to the merged communication record information of the plurality of users using different network equipment.
9. The method of claim 1, wherein the acquiring sample data, the sample data comprising associated information of a sample user and sample network equipment, further comprises:
extracting sample characteristic information in the sample data;
wherein the determining of the corresponding associated decision model information by machine learning the sample data comprises:
and determining corresponding associated decision model information by performing machine learning on the sample data and sample characteristic information in the sample data.
10. The method of claim 9, wherein the applying usage record information of a user about a network device to the association decision model information to obtain family association information of a family of the user corresponding to the network device comprises:
extracting predicted characteristic information from the use record information of the user about the network equipment according to the sample characteristic information;
and applying the predicted characteristic information to the association decision model information to obtain family association information of the user and the family corresponding to the network equipment.
11. An apparatus for determining family attribute information of a user, wherein the apparatus comprises:
the sample acquisition device is used for acquiring sample data, wherein the sample data comprises associated information of a sample user and sample network equipment, the sample data comprises positive sample data and negative sample data, and the associated information is related information of the sample user accessing the sample network equipment;
the model determining device is used for determining corresponding associated decision model information by performing machine learning on the sample data;
the model application device is used for applying the usage record information of the user about the network equipment to the association decision model information so as to obtain family association information of the user and a family corresponding to the network equipment;
the sample acquiring device comprises:
the association group establishing unit is used for establishing a plurality of user and communication equipment association groups according to the communication record information of the plurality of users using different network equipment;
a positive sample obtaining unit, configured to screen and determine, based on a predetermined rule, preferred network devices corresponding to the same user from the multiple user and communication device association groups, and to record them in the positive sample data as associated sample users and sample network devices;
a negative sample obtaining unit, configured to preferentially select irrelevant network devices corresponding to the same user according to accumulated traffic information between the same user and other used communication devices, and to record them in the negative sample data as unassociated sample users and sample network devices;
the predetermined rule includes at least any one of:
the distance information between the equipment position information of the preferred network equipment and the home position information of the same user is less than or equal to the preset associated distance threshold information;
the distance information between the device position information of other network devices used by the same user and the home position information is equal to or larger than the preset irrelevant distance threshold information;
the distance information between the device location information of the preferred network device and the home location information of the same user is smaller than the distance information between the device location information of other network devices used by the same user and the home location information.
12. The apparatus of claim 11, wherein the model application means is for:
applying usage record information of a user about a network device to the association decision model information to obtain device association information of the user and the network device;
and when the equipment association information exceeds the preset association threshold information, determining that the family association information of the family corresponding to the user and the network equipment is associated.
13. The apparatus of claim 11, wherein the apparatus further comprises:
a same family determining device, used for determining that two users belong to the same family when the family association information of each of the two users with the family corresponding to the same network equipment is associated.
14. The apparatus of claim 11, wherein the apparatus further comprises:
the home user determining device is used for determining a plurality of target users included in a target home according to target network equipment corresponding to the target home, wherein the target users are associated with home associated information of the home corresponding to the target network equipment.
15. The apparatus of claim 14, wherein the apparatus further comprises:
a family portrait determination means for determining family portrait information of the target family based on user portrait information of the target user;
and the recommendation information providing device is used for providing recommendation information for the target family according to the family portrait information.
16. The apparatus of claim 11, wherein the number of communication users of the communication device in the user and communication device association group is less than or equal to a home user number threshold.
17. The apparatus of claim 16, wherein the apparatus further comprises:
and the user number threshold determining device is used for determining the home user number threshold according to the home census information of the region where the communication equipment is located in the user and communication equipment association group.
18. The apparatus of claim 11, wherein the association group establishing unit is configured to:
merging the communication record information corresponding to different user identification information of the same user into the communication record information of one user according to the mapping relation among the user identification information;
and establishing a plurality of user and communication equipment association groups according to the merged communication record information of the plurality of users using different network equipment.
19. The apparatus of claim 11, wherein the sample acquisition device further comprises:
the characteristic information extraction unit is used for extracting sample characteristic information in the sample data;
wherein the model determining means is for:
and determining corresponding associated decision model information by performing machine learning on the sample data and sample characteristic information in the sample data.
20. The apparatus of claim 19, wherein the model application means is for:
extracting predicted characteristic information from the use record information of the user about the network equipment according to the sample characteristic information;
and applying the predicted characteristic information to the association decision model information to obtain family association information of the user and the family corresponding to the network equipment.
CN201510649771.8A 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user Active CN106570014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510649771.8A CN106570014B (en) 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510649771.8A CN106570014B (en) 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user

Publications (2)

Publication Number Publication Date
CN106570014A CN106570014A (en) 2017-04-19
CN106570014B true CN106570014B (en) 2020-09-25

Family

ID=58507703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510649771.8A Active CN106570014B (en) 2015-10-09 2015-10-09 Method and apparatus for determining home attribute information of user

Country Status (1)

Country Link
CN (1) CN106570014B (en)


Also Published As

Publication number Publication date
CN106570014A (en) 2017-04-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant