CN114143207A - Home user identification method and electronic equipment - Google Patents


Publication number
CN114143207A
Authority
CN
China
Prior art keywords
family
users
user
user data
family relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010816834.5A
Other languages
Chinese (zh)
Inventor
刘伟平
涂锋
刘忱
戚玉雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Guangdong Co Ltd
Priority to CN202010816834.5A
Publication of CN114143207A


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9536: Search customisation based on social or collaborative filtering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01: Social networking
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Algebra (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Operations Research (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a home user identification method and electronic equipment, which are used for solving the problem that home users are not accurately identified in the prior art. The scheme provided by the application comprises the following steps: acquiring user data of a plurality of characteristic dimensions in historical communication records of each user, wherein the user data of any two users can be used for indicating the family relationship between the two users; identifying whether a family relationship exists between the users based on the user data of the users; mapping the user identified to have the family relationship in the network through a complex network algorithm to obtain a family relationship model; and identifying the family groups of the users according to the family relation model and the user data to obtain the users in each family group. According to the scheme of the embodiment of the invention, the family relation among the users is effectively determined according to the historical communication records, so that the users belonging to the same family are accurately identified, and the identification efficiency is improved.

Description

Home user identification method and electronic equipment
Technical Field
The present invention relates to the field of data processing, and in particular, to a home user identification method and an electronic device.
Background
With the development of mobile communication, the home user market has emerged in recent years, and home users have become key to the strategic layout of operators. Home users are no longer satisfied with basic voice and internet access alone; for mobile communication enterprises, user demand has shifted toward entertainment and daily-life application services, so the major telecom operators have successively begun to develop home services.
A family service is a service tailored to the specific needs of family members, who often need to contact one another frequently, share communication resources, and so on. In order to provide better service to family users, each member of a family needs to be accurately identified, so that family services can be delivered in a targeted manner.
How to efficiently and accurately identify the family user is a technical problem to be solved by the application.
Disclosure of Invention
The embodiment of the application aims to provide a home user identification method and electronic equipment, which are used for solving the problem that home users are not accurately identified in the prior art.
In a first aspect, a home user identification method is provided, including:
acquiring user data of a plurality of characteristic dimensions in historical communication records of each user, wherein the user data of any two users can be used for indicating the family relationship between the two users;
identifying whether a family relationship exists between the users based on the user data of the users;
mapping the user identified to have the family relationship in the network through a complex network algorithm to obtain a family relationship model;
and identifying the family groups of the users according to the family relation model and the user data to obtain the users in each family group.
In a second aspect, an electronic device is provided, comprising:
an acquisition module, configured to acquire user data of a plurality of characteristic dimensions in historical communication records of users, wherein the user data of any two users can be used for indicating the family relationship between the two users;
the identification module is used for identifying whether a family relationship exists between the users or not based on the user data of the users;
the mapping module is used for mapping the user identified to have the family relationship in the network through a complex network algorithm to obtain a family relationship model;
and the identification module is used for identifying the family group of the user according to the family relation model to obtain the user in each family group.
In a third aspect, an electronic device is provided, the electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the method according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of the method according to the first aspect.
In the embodiment of the application, user data of a plurality of characteristic dimensions in a historical communication record of each user is obtained, wherein the user data of any two users can be used for indicating the family relationship between the two users; identifying whether a family relationship exists between the users based on the user data of the users; mapping the user identified to have the family relationship in the network through a complex network algorithm to obtain a family relationship model; and identifying the family groups of the users according to the family relation model and the user data to obtain the users in each family group. By the scheme, whether family relations exist among the users can be identified according to the user data, and the relations among the users can be judged from multiple aspects because the user data has multi-dimensional index features. By combining the multi-dimensional characteristic data, the accuracy of judging whether the family relationship exists among the users can be improved. The family relation model can show whether family relations exist among users, and whether the two users belong to the same family can be further judged by combining user data, so that family groups of the users can be identified according to requirements. The family to which each user belongs can be distinguished through the identification, and then each user belonging to the same family group is determined. Thereby efficiently and accurately identifying users belonging to the same family.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a first flowchart of a home user identification method according to an embodiment of the present invention;
FIG. 2a is a second flowchart of a home user identification method according to an embodiment of the present invention;
FIG. 2b is a partial structural schematic diagram of a family relationship model according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a home user identification method according to an embodiment of the present invention;
FIG. 4 is a fourth flowchart of a home user identification method according to an embodiment of the present invention;
FIG. 5 is a fifth flowchart of a home user identification method according to an embodiment of the present invention;
FIG. 6 is a sixth flowchart of a home user identification method according to an embodiment of the present invention;
FIG. 7 is a seventh flowchart of a home user identification method according to an embodiment of the present invention;
FIG. 8 is a first schematic structural diagram of an electronic device of the present application;
FIG. 9 is a second schematic structural diagram of an electronic device of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. The reference numbers in the present application are only used for distinguishing the steps in the scheme and are not used for limiting the execution sequence of the steps, and the specific execution sequence is described in the specification.
With the development of mobile communication, the home user market has emerged in recent years, and home users have become key to the strategic layout of operators. Home users are no longer satisfied with basic voice and internet access alone and have turned to entertainment and daily-life application services. Identifying home users accurately and efficiently helps operators provide them with the services they need.
In order to solve the problems in the prior art, an embodiment of the present application provides a home user identification method, as shown in fig. 1, including the following steps:
S11: acquiring user data of a plurality of characteristic dimensions in historical communication records of each user, wherein the user data of any two users can be used for indicating the family relationship between the two users;
S12: identifying whether a family relationship exists between the users based on the user data of the users;
S13: mapping the user identified to have the family relationship in the network through a complex network algorithm to obtain a family relationship model;
S14: identifying the family groups of the users according to the family relation model and the user data to obtain the users in each family group.
In step S11, the historical communication records of the user may be reported periodically by the user, or may be collected when the user communicates. The historical communication records include, but are not limited to, records of communication performed by a user through a mobile terminal such as a mobile phone, a tablet computer, an electronic watch, and the like, and may also include records of communication performed by a user through an electronic device such as a computer, a smart television, and the like.
After the user data for the plurality of feature dimensions is collected, the user data may be preprocessed, for example by filtering out invalid data and filling in incomplete data. A communication usually involves more than one user. For example, when a user places a call to a family member, the communication involves at least the user and that family member. If a user initiates a group chat, the communication may involve the user and a number of other users within the group. Thus, the user data in the historical communication record can indicate whether a family relationship exists between at least two users involved in the communication.
Specifically, the family attributes between two users may be scored according to the user data, two users having scores equal to or greater than a preset score value are determined as users having a family relationship, and two users having scores less than the preset score value are determined as users not having a family relationship.
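The thresholding rule described above can be sketched as follows. This is a minimal illustration only; the function name `classify_pairs`, the example scores, and the threshold value 0.60 are hypothetical and do not come from the application.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]

def classify_pairs(scores: Dict[Pair, float], threshold: float) -> Dict[Pair, bool]:
    """Mark a pair as having a family relationship when its score is >= threshold."""
    return {pair: score >= threshold for pair, score in scores.items()}

# Hypothetical family-attribute scores for three user pairs.
pair_scores = {("A", "B"): 0.82, ("A", "C"): 0.60, ("B", "D"): 0.35}
labels = classify_pairs(pair_scores, threshold=0.60)
for pair, is_family in labels.items():
    print(pair, "family" if is_family else "not family")
```

Note that a score exactly equal to the preset value counts as a family relationship, matching the "equal to or greater than" wording above.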
Subsequently, in step S12, whether there is a family relationship between two users can be determined by an important feature dimension of the feature dimensions, or whether there is a family relationship between two users can be determined by combining a plurality of feature dimensions.
For example, whether a family relationship exists between two users can be determined through night positioning data and user call relationship data.
For example, a plurality of users connected to the same WIFI can be selected according to their WIFI connection history; suspected public WIFI networks and users who merely share a rented dwelling are excluded, and whether a family relationship exists between two users can then be judged according to night positioning data and call relationship data.
Alternatively, a plurality of users who have used the same IMEI within a preset time period are determined; if each of these users has used the IMEI for longer than a certain duration, whether a family relationship exists between two of them can be further judged according to night positioning data and call relationship data. If a user has left the network, the in-network number registered under the same ownership can be determined from the user's certificate information, so that the user's communication data can still be tracked and it can be determined whether the user has a family relationship with other users.
As a further alternative, users with the same address are screened out according to their certificate information, and whether a family relationship exists between two users is further judged through night positioning data and call relationship data.
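The WIFI-based screening described above might be sketched as below. The cut-off `max_users_per_wifi` is a hypothetical heuristic for "suspected public WIFI" and is not specified in the application; the log format and all identifiers are likewise assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(wifi_log, max_users_per_wifi=6):
    """Return user pairs that shared a WIFI network, skipping networks
    with so many distinct users that they look public (assumed heuristic)."""
    users_by_wifi = defaultdict(set)
    for user, wifi_id in wifi_log:
        users_by_wifi[wifi_id].add(user)

    pairs = set()
    for users in users_by_wifi.values():
        if len(users) > max_users_per_wifi:
            continue  # suspected public WIFI: excluded from screening
        for u, v in combinations(sorted(users), 2):
            pairs.add((u, v))
    return pairs

# Hypothetical connection log: (user, WIFI identifier).
wifi_log = [("A", "home-1"), ("B", "home-1"), ("C", "home-2")]
wifi_log += [(f"U{i}", "mall") for i in range(10)]
pairs = candidate_pairs(wifi_log)
print(pairs)
```

The pairs returned here are only candidates; as the text states, night positioning data and call relationship data would still be needed to judge each pair.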
In addition, since a family may have more than two members, users having a family relationship with a given user may be determined from that user's communication circle. Examples include users whose night-time location matches that of the user, users whose registered domicile matches that of the user, users who roam to the same place as the user during the Spring Festival, users who share the user's surname, and users whose age difference from the user is consistent with a family relationship.
By the scheme, whether family relations exist among the users can be identified according to the user data, and the relations among the users can be judged from multiple aspects because the user data has multi-dimensional index features. By combining the multi-dimensional characteristic data, the accuracy of judging whether the family relationship exists among the users can be improved.
Next, in step S13, the users identified as having a family relationship are mapped into a network by the complex network algorithm to obtain a family relationship model. The preceding steps determine which users have family relationships; these users are mapped into the network through a complex network algorithm, and the resulting family relationship model embodies the relationships among them, which facilitates the subsequent division of users into families.
Finally, in step S14, the family groups of the users are identified according to the family relationship model and the user data, and the users in each family group are obtained. The family relation model can show whether family relations exist among users, and whether the two users belong to the same family can be further judged by combining user data, so that family groups of the users can be identified according to requirements. The family to which each user belongs can be distinguished through the identification, and then each user belonging to the same family group is determined.
According to the scheme, on the basis of user data of a plurality of characteristic dimensions in historical communication records of users, the relationship between the two users is judged through analysis methods such as sampling exclusion and combined analysis, whether family relationships exist between the users is firstly identified, then the users with the family relationships identified and the family relationships between the users are mapped to a complex network, and different family groups are identified by using a complex network community division algorithm so as to identify the users belonging to the same family group. The scheme integrates the user data with multiple dimensions to identify the family user, and has the advantages of high identification efficiency, high accuracy and easiness in implementation.
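To make the final grouping step concrete, the sketch below groups users by connected components over sufficiently strong family edges, using a union-find structure. This is a deliberately simple stand-in for the community division algorithm mentioned above, not the algorithm of the application; the `min_weight` cut-off and the edge data are hypothetical.

```python
def family_groups(edges, min_weight=0.5):
    """Group users linked by edges whose weight is >= min_weight (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, w in edges:
        find(u)
        find(v)  # register both users even if the edge is too weak
        if w >= min_weight:
            parent[find(u)] = find(v)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical weighted family-relationship edges: (user, user, weight).
edges = [("A", "B", 0.9), ("B", "C", 0.7), ("C", "D", 0.2), ("D", "E", 0.8)]
groups = family_groups(edges)
print(groups)
```

A real community division method would also split loosely connected components, which connected components alone cannot do.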
Based on the solution provided by the foregoing embodiment, optionally, as shown in fig. 2a, in step S13, mapping the user identified that the family relationship exists in the network through a complex network algorithm to obtain the family relationship model, including the following steps:
S21: mapping the family relationship between the users with the family relationship in the network through a complex network algorithm to obtain a family relationship model, wherein the family relationship model comprises a plurality of virtual nodes and virtual links among the virtual nodes, the virtual nodes are generated by mapping the users with the family relationship, and the virtual links are generated by mapping the family relationship among the users.
In the solution provided by this embodiment, the family relationship model includes a plurality of virtual nodes representing users and virtual links connecting the virtual nodes. Any one virtual node in the family relationship model is communicated with at least one other virtual node through a virtual link. For one virtual node, it can communicate with another plurality of different virtual nodes through virtual links. As shown in fig. 2b, a part of the family relationship model is shown, wherein the users connected to the user a include associated user 1 to associated user 5, indicating that the users having family relationship with the user a include at least the associated user 1 to associated user 5.
Further, a virtual link connected between virtual nodes of two users having a family relationship may also be used to characterize the family relationship weights of the two connected users, for example, the weights may be labeled at the endpoints of the edges of the virtual link.
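A family relationship model of this kind can be represented as a weighted undirected graph. The sketch below uses a plain adjacency dictionary; the node names mirror the user A and associated users 1 to 5 of fig. 2b, while the weights and the function name `build_family_graph` are hypothetical.

```python
from collections import defaultdict

def build_family_graph(family_pairs):
    """Map users to nodes and family relationships to weighted undirected links."""
    graph = defaultdict(dict)
    for u, v, weight in family_pairs:
        graph[u][v] = weight
        graph[v][u] = weight  # a family relationship is symmetric
    return dict(graph)

# User A linked to associated users 1..5 with illustrative weights.
family_pairs = [("A", f"assoc{i}", 0.5 + 0.1 * i) for i in range(1, 6)]
graph = build_family_graph(family_pairs)
print(sorted(graph["A"]))
```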
By the method provided by the embodiment, the users with family relations are effectively judged, and then the family relations among the users are mapped to the complex network by combining the complex network community algorithm, so that the family relations among the users can be more accurately and reasonably divided, and the prediction accuracy is greatly improved.
Based on the solution provided by the foregoing embodiment, optionally, as shown in fig. 3, in step S21, mapping the family relationship between the user with family relationship and the user in the network through a complex network algorithm to obtain a family relationship model, including the following steps:
S31: determining a core user in the family relation model through a random walk method;
S32: constructing the family relation model based on the core user and the user having family relation with the core user.
By this scheme, the family relation model can be obtained, in which users are the vertices of the network and the family relation between two users is the edge joining their vertices. In the family relationship model, nodes within the same community are densely connected, while connections between different communities are relatively sparse. The method therefore uses a random walk to evaluate the importance of each node to its network community and then builds the community around the nodes of higher importance, so that the constructed model represents the family relations among users more accurately.
According to the scheme, the sub-attribute space with good characteristics is found from the attribute space of the network of the family relation model, then the sub-attribute space is mapped to the virtual node in the network, and the virtual link between the original node and the virtual node is constructed, so that the attribute information is converted into the topological structure information. Based on the topological structure of the network, the asymmetric transition probability among the nodes is quantized, and the core coefficients of the nodes are obtained through limited random transition, so that the importance of the nodes to the community is evaluated. See the following formula (1-1):
[Formula (1-1): equation image not reproduced in the extracted text]
wherein, back represents the backtracking probability in the random walk process, P represents a node transition probability matrix, and N is the number of nodes in the original network.
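Because formula (1-1) is not available in the extracted text, the sketch below is not that formula. It is a generic finite-step random walk with restart over a row-stochastic matrix P, offered only to illustrate how a node-importance score of this kind can be computed. Treating `back` as a restart probability, the number of steps, and the star-shaped example network are all assumptions.

```python
def core_coefficients(P, back=0.15, steps=20):
    """Finite-step random walk with restart (assumed stand-in for formula (1-1)).

    P    : row-stochastic transition matrix, P[i][j] = prob. of moving i -> j
    back : probability of jumping to a uniformly chosen node at each step
    """
    n = len(P)
    c = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(steps):
        nxt = [back / n] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - back) * c[i] * P[i][j]
        c = nxt
    return c

# Hypothetical star network: node 0 is linked to nodes 1, 2 and 3.
P = [
    [0.0, 1 / 3, 1 / 3, 1 / 3],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
coeffs = core_coefficients(P)
print(coeffs)
```

In this example the hub node receives the largest score, which is the behaviour one would expect of a core user.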
The clustering direction of each node is then determined based on the transition probability and the core coefficient, so that the nodes cluster spontaneously; the cluster shape is then adjusted according to the core coefficients of the nodes at the edge of each cluster to form a community sequence. Next, the information entropy of each attribute in the attribute space is calculated, attributes whose information entropy is larger than a threshold t_h are removed, and the remaining attributes are arranged in ascending order of information entropy so as to evaluate the influence among the network nodes. The influence of the attribute relationship is shown in the following formula (1-2):
[Formula (1-2): equation image not reproduced in the extracted text]
where I is the adjacency matrix after adding the virtual nodes and α_attr is an attribute factor. Each row of the influence matrix is then normalized, so that the element f_ij of the normalized matrix is the probability of randomly transferring from node i to node j; for the normalization see the following formula (1-3):
[Formula (1-3): equation image not reproduced in the extracted text]
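The row normalization step can be illustrated with a short sketch. It assumes the usual form, in which each entry is divided by the sum of its row; since formula (1-3) is not reproduced here, that form and the example matrix are assumptions.

```python
def row_normalize(F):
    """Divide each row by its sum so that row i becomes a probability
    distribution over the next node j; all-zero rows are left as zeros."""
    P = []
    for row in F:
        total = sum(row)
        if total > 0:
            P.append([x / total for x in row])
        else:
            P.append([0.0] * len(row))
    return P

# Hypothetical influence matrix for three nodes.
F = [[0, 2, 2], [1, 0, 3], [0, 0, 0]]
P = row_normalize(F)
print(P)
```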
The scheme provided by the embodiment of the application addresses the poor performance of existing community discovery algorithms on real networks. It combines the event propagation law with the random walk method to evaluate the importance of nodes to a community, and determines the core nodes on that basis in order to divide the communities. The method constructs the attribute subspace and generates the attribute-enhanced network, evaluates the importance of each node to the community so that nodes spontaneously move toward their community, and can further prune the community edges to optimize the communities. The scheme of the invention achieves good recognition performance on both artificially simulated data sets and real data sets.
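The attribute-filtering step described earlier in this embodiment (compute the information entropy of each attribute, drop attributes above the threshold, sort the rest in ascending order) can be sketched as follows. Shannon entropy in bits is assumed, and the attribute names, values, and threshold are hypothetical.

```python
import math
from collections import Counter

def attribute_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def select_attributes(attribute_values, threshold):
    """Keep attributes with entropy <= threshold, sorted by ascending entropy."""
    entropies = {name: attribute_entropy(vals) for name, vals in attribute_values.items()}
    kept = [(name, h) for name, h in entropies.items() if h <= threshold]
    return sorted(kept, key=lambda item: item[1])

# Hypothetical attribute space for four users.
attributes = {
    "surname": ["Li", "Li", "Li", "Wang"],
    "city": ["GZ", "GZ", "GZ", "GZ"],
    "user_id": ["1", "2", "3", "4"],  # unique per user, so maximally noisy
}
selected = select_attributes(attributes, threshold=1.5)
print([name for name, _ in selected])
```

An attribute that is unique to every user carries no grouping information, which is why high-entropy attributes are the ones removed.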
Based on the solution provided by the foregoing embodiment, optionally, as shown in fig. 4, the step S12, which is executed above, of identifying whether a family relationship exists between users based on the user data of each user, includes the following steps:
S41: identifying the user data of each user through a logistic regression model so as to determine whether the family relationship exists between the users.
In the embodiment of the present application, the logistic regression model is applied to identify the user data, and actually, other models may be selected according to actual requirements to perform the identification. For example, a machine learning algorithm such as a decision tree algorithm or a random forest algorithm may be selected to perform the above recognition. Alternatively, the above identification may also be performed in connection with a distributed system.
According to the scheme, multi-dimensional data sources are fused, the family attribute score between two users is determined through combined analysis, and a logistic regression model is trained on the data, so that user classification is realized.
In order to further improve the identification accuracy, based on the solution provided in the foregoing embodiment, optionally, as shown in fig. 5, in step S41, the identifying, by using a logistic regression model, the user data of the users to determine whether a family relationship exists between the users includes the following steps:
S51: performing binning on a plurality of features in user data of each user;
S52: identifying the user data subjected to the binning processing through a logistic regression model so as to determine whether family relations exist among the users.
The main purposes of binning include denoising, discretizing continuous data, increasing granularity, and the like. Binning the features in the user data can improve the accuracy of the identification result of the logistic regression model.
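As an illustration of binning, and of the evidence weight (weight of evidence, WOE) and information value (IV) that the worked example of this embodiment computes per index, the sketch below uses equal-width bins and the commonly used WOE/IV definitions. The binning strategy, the smoothing constant `eps`, the sign convention for WOE, and the example data are assumptions, since the application does not specify them.

```python
import math

def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (0-based)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def woe_iv(bin_ids, labels, n_bins, eps=0.5):
    """Per-bin weight of evidence and total information value.

    WOE_b = ln(share of positives in bin b / share of negatives in bin b)
    IV    = sum_b (positive share - negative share) * WOE_b
    eps is a smoothing count that keeps empty bins from producing log(0).
    """
    pos = [eps] * n_bins
    neg = [eps] * n_bins
    for b, y in zip(bin_ids, labels):
        if y == 1:
            pos[b] += 1
        else:
            neg[b] += 1
    pos_total, neg_total = sum(pos), sum(neg)
    woe = [math.log((pos[b] / pos_total) / (neg[b] / neg_total)) for b in range(n_bins)]
    iv = sum((pos[b] / pos_total - neg[b] / neg_total) * woe[b] for b in range(n_bins))
    return woe, iv

# Hypothetical feature: nightly co-location hours for eight user pairs.
hours = [0, 1, 2, 3, 6, 7, 8, 9]
is_family = [0, 0, 0, 0, 1, 1, 1, 1]
bins = equal_width_bins(hours, n_bins=2)
woe, iv = woe_iv(bins, is_family, n_bins=2)
print(bins, woe, iv)
```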
In order to further improve the identification accuracy, based on the solution provided by the foregoing embodiment, optionally, as shown in fig. 6, in step S52, the step of identifying the user data subjected to the binning processing by using a logistic regression model to determine whether a family relationship exists between users includes the following steps:
S61: establishing a penalty term according to the user data of each user;
S62: identifying the user data subjected to binning processing through a logistic regression model according to the penalty term so as to determine whether family relations exist among the users.
In the following, the present solution is illustrated by way of example:
In the scheme provided by the embodiment of the application, based on the multi-dimensional characteristic indices of the user data, a logistic regression model is trained on sample data: the multi-dimensional characteristic indices of the user data are binned, the evidence weight of each characteristic index is calculated, and the information value of each characteristic index is calculated from its evidence weight. The signaling location data indices are denoted by [symbol image not reproduced in the extracted text], and the corresponding index model coefficients by [symbol image not reproduced in the extracted text]. The penalty term established based on the index model coefficients is as follows:
[Penalty term: equation image not reproduced in the extracted text]
where λ is a penalty coefficient and s is the total number of indices. In this embodiment, the probability that a relationship is a positive sample is denoted by P, and the logistic regression model can be represented by the following formula (2-1):
Figure BDA0002633028700000101
wherein xi(i ═ 1, 2.. times, s) is used as an index, s represents the index number, and as the value of P is between 0 and 1, the value range can be converted into any real value after logical conversion, and the solution is needed to solve that β ═ β (β ═ β ·01,...,βs) T, the model training solving formula is as follows (2-2):
Figure BDA0002633028700000102
The estimate of the logistic regression model parameter vector β = (β_0, β_1, ..., β_s)^T is then defined by the following formula (2-3):
Figure BDA0002633028700000103
After β = (β_0, β_1, ..., β_s)^T is solved using positive and negative sample data, an adaptive logistic regression family relationship identification model is obtained for evaluating whether two numbers form a stable family relationship. The model expression obtained by the final solution is the following formula (2-4):
Figure BDA0002633028700000104
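The formula images (2-1) to (2-4) are not reproduced in this text. For orientation only, the standard textbook form of a logistic regression model with s indicators, together with its logit transformation, is written out below. It is an assumption that the formulas of this embodiment follow this general shape; the training, estimation, and penalty formulas of the embodiment may differ and are not reconstructed here.

```latex
% Textbook logistic regression (assumed form, not copied from the patent figures)
P = \frac{\exp\left(\beta_0 + \sum_{i=1}^{s} \beta_i x_i\right)}
         {1 + \exp\left(\beta_0 + \sum_{i=1}^{s} \beta_i x_i\right)}

% Logit transformation: maps P in (0, 1) onto the whole real line
\ln\frac{P}{1 - P} = \beta_0 + \sum_{i=1}^{s} \beta_i x_i
```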
During model training and solving, a penalty term is added to ensure that the signaling location data indicators contribute a higher weight to the model. The penalty term is established based on the indicator model coefficients:
Figure BDA0002633028700000105
where λ is the penalty coefficient, a constant, and s is the total number of indicators. The penalty term constrains the indicator model coefficient of each non-signaling-location data indicator. The edge weight model is then combined to uniformly increase the evidence weights of two users who have a family relationship, so that the weight is sufficient to determine that the connected users are members of the same family.
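The idea of penalizing only some indicator coefficients during training can be sketched as below. This is an illustration under assumptions: the exact penalty of the embodiment is given in a formula image that is not reproduced here, so a ridge-style penalty (λ/2 times the squared coefficient) is assumed, and plain gradient descent stands in for whatever solver the embodiment uses. The names `fit_penalized_logistic`, `predict_proba`, and the `penalized` index set are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(beta, row):
    """Probability that a user pair is a positive (family) sample."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))


def fit_penalized_logistic(X, y, penalized, lam=0.1, lr=0.1, epochs=2000):
    """Fit beta = [beta_0, beta_1, ..., beta_s] by gradient descent.

    X: list of indicator rows (already binned/encoded), y: 0/1 labels.
    penalized: set of 0-based indicator indices whose coefficients are
    shrunk (assumed here to be the non-signaling-location indicators).
    lam: penalty coefficient (assumed ridge-style penalty).
    """
    n, s = len(X), len(X[0])
    beta = [0.0] * (s + 1)
    for _ in range(epochs):
        grad = [0.0] * (s + 1)
        for row, label in zip(X, y):
            err = predict_proba(beta, row) - label
            grad[0] += err
            for i, x in enumerate(row):
                grad[i + 1] += err * x
        for i in range(s + 1):
            g = grad[i] / n
            if i >= 1 and (i - 1) in penalized:
                g += lam * beta[i]  # gradient of (lam / 2) * beta_i ** 2
            beta[i] -= lr * g
    return beta
```

With two equally informative indicators and `penalized={1}`, the coefficient of the penalized indicator ends up smaller in magnitude than that of the unpenalized one, which is the effect the penalty term is described as having on non-signaling-location indicators.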
With the scheme provided by this embodiment of the application, whether family relationships exist between users is determined effectively. The relationships between users are then mapped onto a complex network using a complex network community algorithm, so that the family relationships between users are divided more accurately and reasonably and prediction accuracy is greatly improved.
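Mapping identified family relationships onto a network of nodes and weighted links can be sketched as follows. This is a simplified stand-in, not the community algorithm of the embodiment: groups are taken to be plain connected components, and the score threshold, the function names `build_family_graph` and `connected_groups`, and the `(user_a, user_b, score)` input format are assumptions.

```python
from collections import defaultdict, deque


def build_family_graph(scored_pairs, threshold=0.5):
    """Build an undirected weighted graph from scored user pairs.

    scored_pairs: iterable of (user_a, user_b, score), where score is the
    model's family-relationship probability for the pair.
    Pairs scoring below the assumed threshold are not linked.
    """
    graph = defaultdict(dict)
    for a, b, score in scored_pairs:
        if score >= threshold:
            graph[a][b] = score
            graph[b][a] = score
    return dict(graph)


def connected_groups(graph):
    """Split the graph into connected components via breadth-first search."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbor in sorted(graph[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(members))
    return groups
```

Connected components only give a coarse first grouping; a real community algorithm would further split components in which members of different families are linked to each other.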
Based on the solution provided by the foregoing embodiment, optionally, as shown in fig. 7, after identifying the family group of the user according to the family relationship model and the user data to obtain the users in each family group, the foregoing step S14 further includes the following steps:
S71: determining users that have a family relationship with the target virtual node in the family relationship model as neighboring nodes of the target virtual node;
S72: voting on the target virtual node according to the family relationship, indicated by the user data, between the neighboring nodes and the target virtual node;
S73: adjusting the identifier of the target virtual node according to the voting result.
The scheme screens the set of all neighboring users of each user by voting. Voting is carried out using algorithm sets, each of which gives a voting result for the data pair to be judged. For any algorithm set G_i, the voting result S_i for the data pair to be judged is calculated by the following formula (3-1):
Figure BDA0002633028700000111
where J_i is the total number of algorithms in algorithm set G_i, and S_ij is the result calculated by the j-th algorithm in algorithm set G_i.
For example, in a complex network, the set of all neighboring users of user B is obtained first. The set may include a plurality of users having a family relationship with user B, and these users may belong to the same family or to different families. User B is voted on according to the family labels of the neighboring users and the family relationship between each neighboring user and user B, and the family label that receives the most votes is taken as the family voting label of user B. After the first round of voting, users whose identifiers marked in step S14 are consistent with their family voting labels are screened out first; for these users, the original identifier can be taken to represent the user's real family. For users whose identifiers are inconsistent with their family voting labels, the label can be optimized according to the number of votes for each family voting label, combining the weight of the identifier from step S14 with the weight of the family voting label. For example, if user C is marked as a family M user in step S14 but the family voting label indicates that user C is more likely a family N user, the original identifier "family M user" can be adjusted to "family N user", so that the optimized identifier accurately represents the user's real family.
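One round of the neighbor voting described in this example can be sketched as follows. The sketch rests on assumptions: votes are weighted by the link weight between the user and each neighbor, the user's own identifier from step S14 counts with a fixed `own_weight`, and ties are broken alphabetically. The function names and the graph format (a dict mapping each user to a dict of neighbor weights) are hypothetical.

```python
from collections import defaultdict


def vote_family_label(node, graph, labels, own_weight=0.5):
    """Return (winning_label, tally) for one user.

    graph: dict user -> {neighbor: link weight}.
    labels: dict user -> family label assigned in the earlier grouping step.
    own_weight: assumed weight given to the user's current label.
    """
    tally = defaultdict(float)
    for neighbor, weight in graph.get(node, {}).items():
        if neighbor in labels:
            tally[labels[neighbor]] += weight
    if node in labels:
        tally[labels[node]] += own_weight
    if not tally:
        return None, {}
    # Highest total weight wins; the label name breaks ties deterministically.
    winner = min(tally, key=lambda label: (-tally[label], label))
    return winner, dict(tally)


def adjust_labels(graph, labels, own_weight=0.5):
    """Run one synchronous voting round and return the adjusted labels."""
    updated = dict(labels)
    for node in graph:
        winner, _ = vote_family_label(node, graph, labels, own_weight)
        if winner is not None:
            updated[node] = winner
    return updated
```

Users whose label is unchanged after a round correspond to those whose original identifier agrees with the family voting label; repeating `adjust_labels` until nothing changes is one way to approximate the layer-by-layer optimization.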
Further, based on the user with the identifier obtained in the scheme provided in the above embodiment, an identity tag of the user may be further determined, where the tag may include: suspected householders, family user structures, member attribute structures, family broadband, user master sets, active members, number of friend users and the like.
The label further indicates the identity of the user and provides data support for business development and marketing. For example, in the field of home broadband services, the scheme provided by this embodiment determines the family to which a user belongs according to the user's identifier. The user's home location is then used to analyze broadband resource coverage for the family, and broadband outbound-call marketing is directed at family users covered by broadband resources, which raises the order success rate of broadband outbound calls and increases the number of members covered by the home network. The scheme provided by this embodiment can effectively raise the success rate of broadband outbound calls and save outbound-call labor costs.
In the field of the one-line one-network convergence service, according to the scheme provided by the embodiment of the application, the family to which the user belongs can be determined according to the identification of the user, the family V-network service is further expanded through the broadband contact, the broadband ordering rate of the family user contact 58+ the client and the one-line one-network convergence rate are improved, and the handling rate of the family V-network at the contact of the one-line channel is effectively improved.
In the field of home V-network expansion business, according to the scheme provided by the embodiment of the application, the home to which the user belongs can be determined according to the user identification, and further the home V-network batch opening can be performed for the co-production right users in the home network, so that the number of successfully batch opening home networks is increased, the batch opening efficiency is increased, and the network expansion cost is reduced.
According to the scheme, on the basis of user data of historical communication records, the family attributes among users are scored, whether family relations exist among the users or not is determined, the users with the family relations and the relations among the users are mapped to a complex network, division of the users in a family group is achieved through division of the complex network community, and the family user labels are optimized through a voting election mode. Family relations among users are divided more accurately and reasonably, and prediction accuracy is greatly improved.
The scheme provided by the embodiment of the application has the following advantages:
The scheme trains on the data using a logistic regression model, classifies the family attributes, identifies whether family relationships exist between users, and scores the users' family attributes, so that the family attribute relationships between users are determined effectively.
According to the scheme, a complex network community discovery algorithm is utilized, the relation between the users is mapped to the network better, and the identification results of all the users in the same family are optimized.
The scheme determines core users based on a random walk method and identifies different family groups, so that the mapping result accurately represents the relationships between users. It not only constructs an attribute subspace and generates an attribute-enhanced network, but also evaluates the importance of each node to its community, so that nodes spontaneously move toward their communities, after which the community edges are trimmed. The invention performs well on both artificially simulated data sets and real data sets, and the analysis results are highly reliable and effective.
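Selecting a core user by random walk can be sketched as follows. The embodiment does not specify which random walk variant it uses, so a PageRank-style walk with damping is swapped in here purely as an assumption; the user with the highest visiting probability is treated as the core user. The function names, the damping factor, and the iteration count are hypothetical, and the graph is assumed to be symmetric with every user present as a key.

```python
def random_walk_scores(graph, damping=0.85, iterations=100):
    """Approximate the stationary visiting probability of each user.

    graph: dict user -> {neighbor: link weight}; every neighbor must also
    appear as a key (undirected graph stored in both directions).
    """
    nodes = sorted(graph)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    out_weight = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        # Teleport mass shared by every node.
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            if out_weight[v] == 0:
                # A user with no links spreads its mass evenly.
                for u in nodes:
                    nxt[u] += damping * score[v] / n
                continue
            for u, w in graph[v].items():
                nxt[u] += damping * score[v] * w / out_weight[v]
        score = nxt
    return score


def core_user(graph):
    """Return the user with the highest random-walk score, or None."""
    if not graph:
        return None
    scores = random_walk_scores(graph)
    return max(sorted(scores), key=scores.get)
```

In a family where one member is linked to all the others, that member receives the highest score and would be picked as the core user around which the family group is built.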
According to the scheme, all adjacent user sets of the users are screened in a voting mode, voting is carried out by utilizing an algorithm set, voting results about data pairs to be judged are respectively given, and the family user identification is optimized.
According to this technical scheme, voting is used in the complex network: the set of all neighboring users of a user is obtained first, the user is voted on according to the family labels of the neighboring users and the family attribute scores between the user and each neighbor, and the family label that receives the most votes is recorded as the user's family voting label. After the first round of voting, users whose family labels are consistent with their family voting labels are screened out first; optimization then proceeds layer by layer until the family labels of all users have been optimized.
In order to solve the problems in the prior art, an embodiment of the present application further provides an electronic device 80, as shown in fig. 8, including:
an obtaining module 81, configured to obtain user data of multiple feature dimensions in a historical communication record of each user, where the user data of any two users can be used to indicate a family relationship between the two users;
an identifying module 82, configured to identify whether a family relationship exists between users based on the user data of the users;
a mapping module 83, configured to map the users identified as having a family relationship into a network through a complex network algorithm to obtain a family relationship model;
and an identification module 84, configured to identify the family groups of the users according to the family relationship model to obtain the users in each family group.
Optionally, the mapping module 83 is configured to:
and mapping the family relationship between the users with the family relationship in the network through a complex network algorithm to obtain a family relationship model, wherein the family relationship model comprises a plurality of virtual nodes and virtual links among the virtual nodes, the virtual nodes are generated by mapping the users with the family relationship, and the virtual links are generated by mapping the family relationship among the users.
Optionally, the mapping module 83 is configured to:
determining a core user in the family relation model through a random walk method;
and constructing the family relation model based on the core user and the user having family relation with the core user.
Optionally, the identifying module 82 is configured to:
and identifying the user data of each user through a logistic regression model so as to determine whether the family relationship exists between the users.
Optionally, the identifying module 82 is configured to:
performing binning on a plurality of features in user data of each user;
and identifying the user data subjected to the box separation processing through a logistic regression model so as to determine whether family relations exist among the users.
Optionally, the identifying module 82 is configured to:
establishing a penalty item according to the user data of each user;
and identifying the user data subjected to box separation processing through a logistic regression model according to the punishment items so as to determine whether family relations exist among the users.
Optionally, as shown in fig. 9, the electronic device further includes a voting module 85, configured to:
determining users having family relations with the target virtual node in the family relation model as adjacent nodes of the target virtual node;
voting is carried out on the target virtual node according to the family relation between the adjacent node and the target virtual node indicated by the user data;
and adjusting the identification of the target virtual node according to the voting result.
The electronic device provided by the above embodiment can identify whether a family relationship exists between users according to user data, and can determine the relationship between users from multiple aspects because the user data has a multi-dimensional index feature. By combining the multi-dimensional characteristic data, the accuracy of judging whether the family relationship exists among the users can be improved. The family relation model can show whether family relations exist among users, and whether the two users belong to the same family can be further judged by combining user data, so that family groups of the users can be identified according to requirements. The family to which each user belongs can be distinguished through the identification, and then each user belonging to the same family group is determined. Thereby efficiently and accurately identifying users belonging to the same family.
Preferably, an embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the foregoing home user identification method embodiment, and can achieve the same technical effect, and details are not repeated here to avoid repetition.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the foregoing home user identification method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A home user identification method, comprising:
acquiring user data of a plurality of characteristic dimensions in historical communication records of each user, wherein the user data of any two users can be used for indicating the family relationship between the two users;
identifying whether a family relationship exists between the users based on the user data of the users;
mapping the user identified to have the family relationship in the network through a complex network algorithm to obtain a family relationship model;
and identifying the family groups of the users according to the family relation model and the user data to obtain the users in each family group.
2. The method of claim 1, wherein mapping the users identified as having a family relationship in the network through a complex network algorithm to obtain a family relationship model comprises:
and mapping the family relationship between the users with the family relationship in the network through a complex network algorithm to obtain a family relationship model, wherein the family relationship model comprises a plurality of virtual nodes and virtual links among the virtual nodes, the virtual nodes are generated by mapping the users with the family relationship, and the virtual links are generated by mapping the family relationship among the users.
3. The method of claim 2, wherein mapping the family relationship between the users with family relationship in the network through a complex network algorithm to obtain a family relationship model comprises:
determining a core user in the family relation model through a random walk method;
and constructing the family relation model based on the core user and the user having family relation with the core user.
4. The method of claim 1, wherein identifying whether a family relationship exists between users based on the user data for each user comprises:
and identifying the user data of each user through a logistic regression model so as to determine whether the family relationship exists between the users.
5. The method of claim 4, wherein identifying the user data for the users via a logistic regression model to determine whether a family relationship exists between the users comprises:
performing binning on a plurality of features in user data of each user;
and identifying the user data subjected to the box separation processing through a logistic regression model so as to determine whether family relations exist among the users.
6. The method of claim 5, wherein identifying the binned user data via a logistic regression model to determine whether a family relationship exists between users comprises:
establishing a penalty item according to the user data of each user;
and identifying the user data subjected to box separation processing through a logistic regression model according to the punishment items so as to determine whether family relations exist among the users.
7. The method according to any one of claims 1 to 6, wherein after identifying the family groups of the users according to the family relationship model and the user data to obtain the users under each family group, the method further comprises:
determining users having family relations with the target virtual node in the family relation model as adjacent nodes of the target virtual node;
voting is carried out on the target virtual node according to the family relation between the adjacent node and the target virtual node indicated by the user data;
and adjusting the identification of the target virtual node according to the voting result.
8. An electronic device, comprising:
an acquisition module, configured to acquire user data of a plurality of feature dimensions in the historical communication records of each user, wherein the user data of any two users can be used to indicate the family relationship between the two users;
the identification module is used for identifying whether a family relationship exists between the users or not based on the user data of the users;
the mapping module is used for mapping the user identified to have the family relationship in the network through a complex network algorithm to obtain a family relationship model;
and the identification module is used for identifying the family group of the user according to the family relation model to obtain the user in each family group.
9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, carries out the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202010816834.5A 2020-08-14 2020-08-14 Home user identification method and electronic equipment Pending CN114143207A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010816834.5A CN114143207A (en) 2020-08-14 2020-08-14 Home user identification method and electronic equipment

Publications (1)

Publication Number Publication Date
CN114143207A true CN114143207A (en) 2022-03-04

Family

ID=80438197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010816834.5A Pending CN114143207A (en) 2020-08-14 2020-08-14 Home user identification method and electronic equipment

Country Status (1)

Country Link
CN (1) CN114143207A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114501420A (en) * 2022-03-06 2022-05-13 北京工业大学 Method for identifying family relation by using mobile phone signaling data

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050243736A1 (en) * 2004-04-19 2005-11-03 International Business Machines Corporation System, method, and service for finding an optimal collection of paths among a plurality of paths between two nodes in a complex network
US20080052263A1 (en) * 2006-08-24 2008-02-28 Yahoo! Inc. System and method for identifying web communities from seed sets of web pages
CN102456064A (en) * 2011-04-25 2012-05-16 中国人民解放军国防科学技术大学 Method for realizing community discovery in social networking
CN105592405A (en) * 2015-10-30 2016-05-18 东北大学 Mobile communication user group construction method on the basis of fraction filtering and label propagation
CN105824813A (en) * 2015-01-05 2016-08-03 中国移动通信集团江苏有限公司 Core user excavate method and device
CN107368499A (en) * 2016-05-12 2017-11-21 中国移动通信集团广东有限公司 A kind of client's tag modeling and recommendation method and device
CN108009575A (en) * 2017-11-28 2018-05-08 武汉大学 A kind of community discovery method for complex network
US20180336488A1 (en) * 2017-05-17 2018-11-22 Microsoft Technology Licensing, Llc Machine Learning Based Family Relationship Inference
CN110019996A (en) * 2017-12-11 2019-07-16 中国移动通信集团广东有限公司 A kind of family relationship recognition methods and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘荣辉: "复杂电信社交网络中家庭群体的识别与应用", 《工业工程与管理》, vol. 21, no. 5, pages 105 - 110 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination