CN110598122B - Social group mining method, device, equipment and storage medium - Google Patents

Publication number
CN110598122B
Authority
CN
China
Prior art keywords
target
user
target user
determining
target set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810606527.7A
Other languages
Chinese (zh)
Other versions
CN110598122A (en)
Inventor
张阳
杨双全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd
Priority to CN201810606527.7A
Publication of CN110598122A
Application granted
Publication of CN110598122B
Active legal status
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01: Social networking

Abstract

The application provides a social group mining method, a social group mining device, social group mining equipment and a storage medium, wherein the method comprises the following steps: acquiring position information and network environment information of a target user, wherein the network environment information is used for representing a network address currently accessed by the target user; determining a target set to which a target user belongs according to the position information of the target user; determining the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set; and determining the attribution relationship between the target user and the target set according to the association degree between the target user and each other user in the target set. The method realizes the mining of the relationship between the user and other users based on the position information and the network environment information of the user, and not only reduces the cost of mining the user relationship, but also ensures that the finally obtained user relationship is more comprehensive and more practical due to low acquisition difficulty of mining data and wide coverage range.

Description

Social group mining method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a social group mining method, apparatus, device, and storage medium.
Background
Users of a social network can publish information on their own and view information shared by others, which together form a social network in the virtual world. In essence, a social network provides an online platform on which people share information such as interests, hobbies, status, and activities. Such a platform is timely, real-time, and interactive, retains the propagation characteristics of traditional society, and has become part of people's work and life.
In practical applications, due to the popularity of the internet, a large amount of user behavior data is generated every day. By analyzing a large amount of user behavior data, high value information can be acquired, for example, by analyzing the user behavior data, mining of user relationships and the like is realized.
At present, social data mining mostly relies on strong social behavior data, for example address books whose remarks identify family members and colleagues, or on follow data, for example microblog follow data, which is clustered by means of the follow connections between individuals. With these approaches, however, a user's strong social behavior data can be acquired only after the user grants authorization. The data is therefore difficult and costly to acquire, and the amount of data that can be acquired is limited, which hampers the mining of user relationships.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the present application is to provide a social group mining method that mines the relationships between a user and other users based on the user's location information and network environment information. Because the mined data is easy to acquire and has wide coverage, the method reduces the cost of mining user relationships and makes the resulting user relationships more comprehensive and practical.
A second object of the present application is to provide a social group mining device.
A third object of the present application is to propose a computer device.
A fourth object of the present application is to propose a computer readable storage medium.
To achieve the above object, an embodiment of a first aspect of the present application provides a social group mining method, including: acquiring position information and network environment information of a target user, wherein the network environment information is used for representing a network address currently accessed by the target user; determining a target set to which the target user belongs according to the position information of the target user; determining the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set; and determining the social relationship between the target user and each other user in the target set according to the association degree between the target user and each other user in the target set.
The social group mining method provided by the embodiment of the application comprises the steps of firstly obtaining position information and network environment information of a target user, determining a target set to which the target user belongs according to the position information of the target user, determining the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set, and then determining the affiliation relation between the target user and the target set according to the association degree between the target user and each other user in the target set. Therefore, the method and the device realize mining of the relationship between the user and other users based on the position information and the network environment information of the user, and not only reduce the cost of mining the user relationship, but also enable the finally obtained user relationship to be more comprehensive and practical due to low acquisition difficulty and wide coverage range of the mining data.
In addition, the social group mining method provided by the above embodiment of the present application may further have the following additional technical features:
Optionally, in an embodiment of the present application, before determining the target set to which the target user belongs, the method further includes: analyzing map data to determine the mapping relationship between each set and its position; and the determining the target set to which the target user belongs includes: determining the target set to which the target user belongs according to the distance between the position information of the target user and the position of each set.
Optionally, in another embodiment of the present application, the determining the association degree between the target user and each other user in the target set includes: determining, based on a preset user association model, the association degrees respectively corresponding to the network environment information of the target user and the network environment information of each other user in the target set; the preset user association model is obtained by training with the network environment information of users with known association degrees as samples.
Optionally, in another embodiment of the present application, the target set corresponds to N entity names, where N is a positive integer greater than 1; after determining the association degree between the target user and each other user in the target set, the method further includes: clustering the users in the target set according to the association degrees between all users in the target set, and determining a target cluster to which the target user belongs, wherein each cluster corresponds to one entity name.
Optionally, in another embodiment of the present application, the determining a target cluster to which the target user belongs includes: if the association degrees between the target user and each user in a first cluster are all greater than a threshold value, and the association degrees between the target user and each user in a second cluster are all less than the threshold value, determining that the target cluster to which the target user belongs is the first cluster.
Optionally, in another embodiment of the present application, the determining a target cluster to which the target user belongs includes: if the association degrees between the target user and each user in a first cluster and in a second cluster are all greater than a threshold value, and the first cluster and the second cluster correspond to a first entity name and a second entity name respectively, determining the target cluster to which the target user belongs according to the number of users contained in the first cluster and the number of users contained in the second cluster.
Optionally, in another embodiment of the present application, the determining a target cluster to which a target user belongs includes: and if the association degrees between the target user and each user in the first cluster and the second cluster are greater than a threshold value, and the first cluster and the second cluster respectively correspond to a first entity name and a second entity name, determining the target cluster to which the target user belongs according to the type and/or scale of the industry to which the first entity name and the second entity name respectively belong.
Optionally, in another embodiment of the present application, the target set corresponds to an entity name; after determining the social relationship between the target user and each of the other users in the target set, the method further includes: if the target user has social relations with other users in the target set, analyzing entity names corresponding to the target set to determine the industry type of the target set; and determining a label set corresponding to the target user according to a preset mapping relation between the industry type and the label.
Optionally, in another embodiment of the present application, the analyzing the entity name corresponding to the target set to determine the industry type to which the target set belongs includes: analyzing the entity name corresponding to the target set, and determining the weight of each word segmentation unit in the entity name; determining, according to a preset industry dictionary, the probability values of the industries corresponding to each word segmentation unit in the entity name; and determining the industry type of the target set according to the weight of each word segmentation unit in the entity name and the industry probability values corresponding to each word segmentation unit.
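The industry-type determination just described can be illustrated with a minimal Python sketch. The text does not state how the weights and the industry probability values are combined; the weighted sum used here, the function name `industry_type`, and all example segments, weights, and probabilities are assumptions made only for illustration.

```python
def industry_type(segment_weights, industry_dict):
    """Pick the industry with the highest weighted probability.

    segment_weights: {segment: weight} for one entity name (hypothetical values).
    industry_dict:   {segment: {industry: probability}} (hypothetical dictionary).
    The weighted-sum scoring rule is an assumption, not taken from the source.
    """
    scores = {}
    for segment, weight in segment_weights.items():
        # Segments missing from the dictionary contribute nothing.
        for industry, prob in industry_dict.get(segment, {}).items():
            scores[industry] = scores.get(industry, 0.0) + weight * prob
    return max(scores, key=scores.get) if scores else None


# Hypothetical entity name split into three segments with made-up weights.
weights = {"Sunrise": 0.2, "Network": 0.5, "Technology": 0.3}
dictionary = {
    "Network": {"internet": 0.8, "telecom": 0.2},
    "Technology": {"internet": 0.6, "manufacturing": 0.4},
}
result = industry_type(weights, dictionary)
```

In this example "internet" accumulates 0.5 × 0.8 + 0.3 × 0.6 = 0.58, which is higher than the scores of the other industries, so it is returned.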
To achieve the above object, an embodiment of a second aspect of the present application provides a social group mining device, including: a first acquisition module, configured to acquire the position information and the network environment information of a target user, wherein the network environment information is used for representing the network address currently accessed by the target user; a first determining module, configured to determine a target set to which the target user belongs according to the position information of the target user; a second determining module, configured to determine, according to the network environment information of the target user and the network environment information of each other user in the target set, the association degree between the target user and each other user in the target set; and a third determining module, configured to determine the social relationship between the target user and each other user in the target set according to the association degree between the target user and each other user in the target set.
The social group mining device provided by the embodiment of the application firstly obtains the position information and the network environment information of the target user, determines the target set to which the target user belongs according to the position information of the target user, determines the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set, and then determines the affiliation relation between the target user and the target set according to the association degree between the target user and each other user in the target set. Therefore, the method and the device realize mining of the relationship between the user and other users based on the position information and the network environment information of the user, and not only reduce the cost of mining the user relationship, but also enable the finally obtained user relationship to be more comprehensive and practical due to low acquisition difficulty and wide coverage range of the mining data.
To achieve the above object, a third aspect of the present application provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the computer device implements the social group mining method described in the first aspect.
The computer device provided by the embodiment of the application first obtains the location information and the network environment information of the target user, determines a target set to which the target user belongs according to the location information of the target user, determines the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set, and then determines the affiliation relationship between the target user and the target set according to the association degree between the target user and each other user in the target set. Therefore, the method and the device realize mining of the relationship between the user and other users based on the position information and the network environment information of the user, and not only reduce the cost of mining the user relationship, but also enable the finally obtained user relationship to be more comprehensive and practical due to low acquisition difficulty and wide coverage range of the mining data.
To achieve the above object, a fourth aspect of the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the social group mining method according to the first aspect.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flow chart diagram illustrating a social group mining method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram illustrating a social group mining method according to another embodiment of the present disclosure;
FIG. 3 is a schematic flow chart diagram illustrating a social group mining method according to yet another embodiment of the present application;
FIG. 4 is a schematic flow chart illustrating a process for determining the industry type to which a target set belongs according to one embodiment of the present application;
FIG. 5 is a schematic structural diagram of a social group mining device according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a social group mining device according to another embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a social group mining device according to another embodiment of the present application;
FIG. 8 is a schematic block diagram of a computer device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a computer device according to another embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The embodiments of the present application mainly address the following problem in the related art: social data mining is mostly based on strong social behavior data or on follow data, but a user's strong social behavior data can be acquired only after the user grants authorization, so the data is difficult and costly to acquire. To address this problem, a social group mining method is provided.
According to the embodiment of the application, the position information and the network environment information of the target user are obtained, wherein the network environment information is used for representing the network address of the target user currently accessed, so that the target set to which the target user belongs is determined according to the position information of the target user, the association degree between the target user and each other user in the target set is determined according to the network environment information of the target user and the network environment information of each other user in the target set, and then the social relationship between the target user and each other user in the target set is determined according to the association degree between the target user and each other user in the target set. Therefore, the method and the device realize mining of the relationship between the user and other users based on the position information and the network environment information of the user, and not only reduce the cost of mining the user relationship, but also enable the finally obtained user relationship to be more comprehensive and practical due to low acquisition difficulty and wide coverage range of the mining data.
Social group mining methods, apparatuses, devices, and storage media according to embodiments of the present application are described below with reference to the drawings.
First, referring to fig. 1, a social group mining method provided in an embodiment of the present application is specifically described.
Fig. 1 is a flowchart illustrating a social group mining method according to an embodiment of the present disclosure.
As shown in fig. 1, the social group mining method of the present application may include the following steps:
Step 101, obtaining location information and network environment information of a target user, wherein the network environment information is used for representing a network address currently accessed by the target user.
The target user may be any user in the social network, which is not specifically limited in this embodiment.
Optionally, in this embodiment, the accessed network address may be, but is not limited to, an IP address, a wireless access point (AP), a camping point, a Global System for Mobile Communications (GSM) cell, and so on.
The social group mining method provided by the embodiment of the application can be executed by the computer device provided by the embodiment of the application. The computer equipment is provided with a social group mining device so as to mine the relationship between the target user and other users. The computer device of the embodiment may be any hardware device with a data processing function, such as a smart phone, a tablet computer, a personal digital assistant, and the like.
In an optional implementation form of the present application, the location information of the target user may be obtained through a Global Positioning System (GPS for short) or network Positioning.
Network positioning may be implemented in two ways: Wi-Fi positioning and base station positioning.
It should be noted that Wi-Fi positioning locates the user according to the position of the Wi-Fi router, whereas base station positioning depends mainly on the distribution density of the base stations.
Further, after the location information of the target user is obtained, the present embodiment may obtain the network environment of the target user.
Generally, the operating system of a computer device provides a comprehensive set of network interfaces, including an interface for obtaining a network management class, for example ConnectivityManager. That is, in this embodiment, a network information class, for example NetworkInfo, can be obtained by calling the interface for obtaining the network management class, and the network environment information of the target user is then determined according to the network state information contained in the network information class.
That is, when sending a data request to the server by using an application program in the computer device, the server may send a network environment information obtaining instruction to the application program, so that the application program invokes an interface for obtaining network management classes in an operating system of the computer device according to the obtaining instruction to obtain the network environment information of the target user.
Or, when sending an access request to the server by using an application program in the computer device, the network environment information where the computer device is currently located may be carried in the request, so that the server may obtain the network environment information of the target user by analyzing the access request, which is not limited herein.
It should be noted that, in practical applications, the network address currently accessed by a user may be accidental. For example, user A may be a courier or a takeaway deliverer; while delivering an item to user B, user A's device may detect the access point AP1 of the Wi-Fi network in the area where user B is located and connect to AP1. Therefore, to improve the accuracy of determining the relationships between users, in this embodiment, when the location information and the network environment information of the target user are obtained, time information about the user's access to the current network address may also be obtained, so as to determine, by comparing the time information with a threshold, whether the user accessed the current network address accidentally.
The time information may refer to a time length for which the user accesses the current network address.
The threshold may be adaptively set according to the actual usage of the user, for example, to 2 hours (h), 3h, 5h, etc., which is not specifically limited herein.
It can be understood that, in the present application, the length of time for which the user has accessed the current network address is obtained and compared with the threshold. If the length of time exceeds the threshold, the user is considered to be accessing the network address normally; if it does not exceed the threshold, the access is considered accidental and the user is filtered out, so that the relationships between users can subsequently be determined more accurately.
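The duration-based filtering described above can be sketched in Python as follows. The record layout (`AccessRecord`), the function name `filter_accidental`, and the example values are hypothetical; the 2-hour default is one of the example thresholds mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    """Hypothetical record of one user's access to one network address."""
    user_id: str
    network_address: str   # e.g. an IP address or a Wi-Fi AP identifier
    duration_hours: float  # how long the user has accessed this address


def filter_accidental(records, threshold_hours=2.0):
    """Keep only records whose access duration exceeds the threshold."""
    return [r for r in records if r.duration_hours > threshold_hours]


records = [
    AccessRecord("user_a", "AP1", 0.1),  # e.g. a courier passing by
    AccessRecord("user_b", "AP1", 8.0),  # e.g. a regular occupant
]
kept = filter_accidental(records)
```

Only the record of `user_b` remains in `kept`, so `user_a` no longer takes part in the later association-degree computation for AP1.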
Step 102, determining a target set to which the target user belongs according to the position information of the target user.
The target set refers to a set matched with the position information of the target user.
In this embodiment, there may be multiple sets, each corresponding to different position information. For example, the position information corresponding to set A is "Guangdong Building" and the position information corresponding to set B is "All-Flowers Industrial Park"; this is not specifically limited herein.
Optionally, after the location information and the network environment information of the target user are obtained, the social group mining device may determine the target set to which the target user belongs according to the location information of the target user.
Since the position information of the user has certain stability during working hours or leisure hours, as an optional implementation form, before determining the target set to which the target user belongs, the embodiment may first analyze the map data to determine the mapping relationship between each set and the position, and then determine the target set to which the target user belongs according to the distance between the position information of the target user and the position of each set.
That is, the distance between the position of the target user and the position corresponding to each set is computed, yielding a plurality of distance values. The smallest distance value is then selected, the mapping relationship between sets and positions is queried with the position corresponding to that smallest value to determine the corresponding set, and the determined set is taken as the target set to which the target user belongs.
For example, if the acquired location information of the target user XX is "bouquet industrial park", the social group mining device may match the location information "bouquet industrial park" of the target user XX with a plurality of locations corresponding to the sets, respectively, and if the distance between the location information "bouquet industrial park" of the target user XX and the location D is the minimum, the mapping relationship between each set and the location is queried according to the location D, so that a corresponding set D 'may be determined, and at this time, the set D' may be determined as a target set to which the target user XX belongs.
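A minimal Python sketch of this nearest-set lookup is given below. It assumes that every position is a (latitude, longitude) pair and measures distance with the haversine formula; the text does not prescribe a distance measure, and the names `haversine_m`, `nearest_set`, and the coordinates are hypothetical. Ties between equally distant sets are not handled here.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def nearest_set(user_pos, set_positions):
    """Return the name of the set whose position is closest to the user."""
    return min(set_positions, key=lambda name: haversine_m(*user_pos, *set_positions[name]))


# Hypothetical mapping between sets and positions, e.g. derived from map data.
set_positions = {
    "set_a": (39.900, 116.400),
    "set_d": (40.050, 116.300),
}
target_set = nearest_set((40.051, 116.301), set_positions)
```

Here the user is roughly 0.14 km from `set_d` and more than 16 km from `set_a`, so `set_d` is chosen as the target set.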
Further, when determining the target set to which the target user belongs, the position of the target user may be equidistant from the positions of two or more sets, in which case the target set cannot be determined directly. In this case, in this embodiment, the position range corresponding to each set may be expanded in equal proportion, and the target set to which the target user belongs may then be determined according to the distance between the position information of the target user and the expanded position of each set.
For example, suppose the sets include set E, set F, set G, and set H, and the position range initially corresponding to each set is 100 meters (m) × 100 m. When the distances between the position information of target user XX and the positions corresponding to set E and set F are both 0 m, target user XX may be located at the boundary between set E and set F. In this case, to determine the target set to which target user XX belongs, the social group mining device may expand the position range corresponding to each of set E, set F, set G, and set H from 100 m × 100 m to 200m, and then compute the distance between the position information of target user XX and the position corresponding to each of set E, set F, set G, and set H. If the distances between the position information of target user XX and the positions corresponding to set E, set F, set G, and set H are: 150m, 250m, and 450m, it can be determined that the distance between the position corresponding to set E and the position information of target user XX is the smallest, and set E can then be determined as the target set to which target user XX belongs.
Step 103, determining the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set.
Optionally, in actual application, each set may further include network environment information of each user, so that after the target set to which the target user belongs is determined in this embodiment, the social group mining device may further determine, according to the network environment information of each user included in the determined target set, the association degree between the target user and each user in the target set.
In an optional implementation form of the present application, the embodiment may determine, based on a preset user association model, the association degrees respectively corresponding to the network environment information of the target user and the network environment information of each other user in the target set.
The preset user association model is obtained by training with the network environment information of users with known association degrees as samples.
Optionally, in this embodiment, when the preset user association model is trained with the network environment information of users with known association degrees as samples, a Gradient Boosting Decision Tree (GBDT) may be used: training proceeds over multiple iterations, each iteration generates one weak classifier, and each classifier is trained on the basis of the residual of the previous classifier.
The trained user association model can be expressed by the following formula (1):
$$F_M(x) = \sum_{m=1}^{M} T(x;\theta_m) \tag{1}$$
where $F_M(\cdot)$ is the preset user association model obtained by training, $m$ is the m-th round of training, $M$ is the total number of rounds of training, and $T(x;\theta_m)$ is the weak classifier generated in each round, that is, each $T(x;\theta_m)$ represents a regression tree, $x$ is the input variable, and $\theta_m$ denotes the parameters of the tree, such as its split variables, split positions, and leaf-node mean values.
It should be noted that the weak classifier generated in each round is obtained by minimizing the loss function shown in the following formula (2):
$$\hat{\theta}_m = \arg\min_{\theta_m} \sum_{i=1}^{N} L\bigl(y_i,\; F_{m-1}(x_i) + T(x_i;\theta_m)\bigr) \tag{2}$$
where $\hat{\theta}_m$ is the parameter value that minimizes the loss function, $L(\cdot)$ is a cost function representing the difference between the true value and the predicted value, $y_i$ is the true value, $F_{m-1}(\cdot)$ is the predicted value of the previous round, $T(\cdot)$ is the predicted value of the current round, $F_{m-1}(x_i) + T(x_i;\theta_m)$ is the overall predicted value of the current round, $i$ is the i-th training sample, $N$ is the total number of training samples, and $T(x;\theta_m)$ and $\theta_m$ have the same meanings as in formula (1).
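The residual-fitting idea behind GBDT can be sketched in a few lines. This is a minimal illustration, not the training procedure of the application: it assumes squared error as the cost function L (so the residual y_i - F_{m-1}(x_i) is what each new tree fits), uses depth-1 regression trees on a single numeric feature, and all function names and sample values are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: pick the split with the lowest squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def fit_gbdt(xs, ys, rounds):
    """Additive model: each round fits a new tree to the current residuals."""
    trees = []

    def predict(x):
        return sum(tree(x) for tree in trees)

    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        trees.append(fit_stump(xs, residuals))
    return predict

# Hypothetical samples: x = number of shared access points, y = known association degree.
model = fit_gbdt([0, 1, 2, 3], [0.0, 0.0, 1.0, 1.0], rounds=2)
low, high = model(0), model(3)
```

On this toy data the first tree already separates the two groups, so the second round fits residuals that are all zero.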
Further, after the preset user association model is obtained through training, the social group mining device may input the obtained network environment information of the target user and the network environment information of other users in the target set into the preset user association model as input data, so as to analyze and process the network environment information of the target user and the network environment information of other users in the target set through the preset user association model, and obtain the association degree between the target user and each other user in the target set.
For example, if user A belongs to the Hundred Flowers Industrial Park set and the set includes 3 users, namely user B, user C, and user D, the network environment information of user A and the network environment information of these 3 users may be input into the preset user association model to determine the association degree between user A and each of the 3 users. If the hotspot access point currently accessed by user A is AP1, the hotspot access point of user B is AP2, the hotspot access point of user C is AP1, and the hotspot access point of user D is AP3, then the association degrees between user A and users B and D are small, and the association degree between user A and user C is large.
Step 104, determining the social relationship between the target user and each other user in the target set according to the association degree between the target user and each other user in the target set.
In this embodiment, the social relationship may be a colleague relationship, a friend relationship, or the like, and is not specifically limited herein.
Optionally, after determining the association degrees between the target user and each of the other users in the target set, the social group mining device may determine the social relationship between the target user and each of the other users in the target set according to the association degrees between the target user and each of the other users in the target set.
For example, if target user A belongs to the Hundred Flowers Industrial Park set, and the set includes 8 users, namely the first user to the eighth user, then when the association degrees of the first, third, fourth, sixth, seventh, and eighth users with target user A are large, and the association degrees of the second and fifth users with target user A are small, it may be determined that target user A may be in a colleague relationship with the first, third, fourth, sixth, seventh, and eighth users, and that there is no social relationship between target user A and the second and fifth users.
For another example, if target user A belongs to the XX cell set, and the set includes 3 users, namely user X, user Y, and user Z, then when the association degree between user X and target user A is large, and the association degrees of user Y and user Z with target user A are small, it may be determined that user X and target user A may be relatives, and that user Y and user Z may be neighbors of the target user or have no social relationship with the target user.
In the social group mining method provided by this embodiment of the present application, the position information and the network environment information of a target user are first obtained; a target set to which the target user belongs is determined according to the position information of the target user; the association degree between the target user and each other user in the target set is determined according to the network environment information of the target user and the network environment information of each other user in the target set; and the affiliation relationship between the target user and the target set is then determined according to the association degree between the target user and each other user in the target set. In this way, the relationship between a user and other users is mined based on the user's position information and network environment information. Because the mined data is easy to acquire and has wide coverage, the cost of mining user relationships is reduced and the resulting user relationships are more comprehensive and practical.
As can be seen from the above analysis, the target set to which the target user belongs is determined according to the position information of the target user, the association degree between the target user and each user in the target set is determined according to the network environment information of the target user and the network environment information of each other user in the target set, and the social relationship between the target user and each other user in the target set is determined according to the association degree.
In practical application, one target set may correspond to a plurality of (for example, N) entity names, so when a target user belongs to a target set with a plurality of entity names, in order to more accurately determine which entity name the target user belongs to, the present application may perform clustering processing on the target set according to the association degree between users in the target set, so as to determine the entity name to which the target user specifically belongs by calculating the association degree between the target user and each user included in each cluster in the target set.
The social group mining method of the present application is further described below with reference to fig. 2.
Fig. 2 is a flowchart illustrating a social group mining method according to another embodiment of the present disclosure.
As shown in fig. 2, the social group mining method of the embodiment of the present application may include the following steps:
Step 201, obtaining the location information and the network environment information of the target user, wherein the network environment information is used for representing the network address currently accessed by the target user.
Step 202, determining a target set to which the target user belongs according to the position information of the target user.
Step 203, determining the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set.
The detailed implementation process and principle of the steps 201-203 may refer to the detailed description of the above embodiments, and are not described herein again.
Step 204, clustering the target set according to the association degrees between the users in the target set, and determining a target cluster to which the target user belongs, wherein each cluster corresponds to an entity name.
The entity name may refer to a business name, or the like, and is not particularly limited herein.
In the actual application process, a target set may include multiple users, and there is a difference in the degree of association between the users. Therefore, the target set with the multiple users is divided more finely, so that the accuracy of determining the attribution relationship between the target user and the target set is higher.
Optionally, in this embodiment of the present application, the number of users included in the target set may be determined first. If the target set includes multiple users, the association degrees between those users are analyzed, and users with consistent association degrees are then grouped into one cluster, so as to obtain multiple clusters.
In this embodiment, the common attribute that each user included in each cluster has may be address information registered by each user at the server, or may also be an entity name registered by each user at the server, and the like, which is not specifically limited herein.
For example, if the target set X is divided into 3 clusters, the social group mining device may perform analysis processing on the 3 clusters including the plurality of users, respectively, and if the common attribute of the users included in the first cluster is "XX company", the common attribute of the users included in the second cluster is "XX firm", and the common attribute of the users included in the third cluster is "XX restaurant", the "XX company" may be used as a mark of the first cluster, the "XX firm" is used as a mark of the second cluster, and the "XX restaurant" is used as a mark of the third cluster.
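One way to picture the clustering and marking described above is sketched below. The application does not name a specific clustering algorithm, so this sketch assumes a simple one: users whose pairwise association degree exceeds a threshold are linked, and each connected group forms a cluster. The function names, user identifiers, and association values are hypothetical.

```python
from collections import Counter

def cluster_users(users, association, threshold):
    """Group users into connected components of the 'degree > threshold' graph."""
    parent = {u: u for u in users}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for (a, b), degree in association.items():
        if degree > threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for u in users:
        clusters.setdefault(find(u), []).append(u)
    return list(clusters.values())

def label_cluster(cluster, registered_entity):
    """Mark a cluster with the entity name that most of its users registered."""
    return Counter(registered_entity[u] for u in cluster).most_common(1)[0][0]

users = ["u1", "u2", "u3", "u4"]
association = {("u1", "u2"): 0.97, ("u3", "u4"): 0.96, ("u2", "u3"): 0.20}
registered_entity = {"u1": "XX company", "u2": "XX company",
                     "u3": "XX restaurant", "u4": "XX restaurant"}

clusters = cluster_users(users, association, threshold=0.9)
marks = [label_cluster(c, registered_entity) for c in clusters]
```

Here the common attribute is taken to be the entity name each user registered at the server, which is one of the options mentioned in the text.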
Further, after clustering is performed on each user included in the target set, the embodiment may determine the association degree between the target user and each user in the N clusters according to the network environment information of the target user and the network environment information of each user in the N clusters, so as to determine the target cluster to which the target user belongs according to the association degree between the target user and each user in the N clusters.
Optionally, in the present embodiment, the target cluster to which the target user belongs may be determined in the following manner.
As a first optional implementation:
and if the association degrees between the target user and each user in the first cluster are both greater than the threshold value and the association degrees between the target user and each user in the second cluster are both less than the threshold value, determining that the target cluster to which the target user belongs is the first cluster.
The threshold may be set by the user, or may be set adaptively according to actual needs, which is not specifically limited in this embodiment.
That is, the association degrees between the target user and each of the plurality of clusters are determined, and the cluster to which each user whose association degree with the target user is greater than the threshold value belongs is taken as the target cluster to which the target user belongs.
For example, if the threshold is 0.95, when the association degrees between target user A and the users in the first cluster Q are 0.95, 0.96, and 0.99, respectively, the association degrees between target user A and the users in the first cluster Q are not less than 0.95; when the association degrees between target user A and the users in the second cluster W are 0.92, 0.95, 0.93, and 0.94, respectively, three of the four users in the second cluster W have association degrees with target user A that are less than 0.95. The first cluster Q may therefore be determined as the target cluster of target user A.
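The first optional implementation can be sketched as a strict threshold check. The function name and the association values below are hypothetical and were chosen so that every value lies strictly above or strictly below the threshold.

```python
def pick_cluster_by_threshold(degrees_by_cluster, threshold):
    """Return the one cluster whose users all exceed the threshold,
    provided every other cluster lies entirely below it; otherwise None."""
    above = [name for name, degrees in degrees_by_cluster.items()
             if all(d > threshold for d in degrees)]
    below = [name for name, degrees in degrees_by_cluster.items()
             if all(d < threshold for d in degrees)]
    if len(above) == 1 and len(above) + len(below) == len(degrees_by_cluster):
        return above[0]
    return None

# Hypothetical association degrees between the target user and each cluster member.
degrees_by_cluster = {"Q": [0.96, 0.97, 0.99], "W": [0.92, 0.93, 0.94]}
target_cluster = pick_cluster_by_threshold(degrees_by_cluster, threshold=0.95)
```

Returning `None` for mixed or ambiguous cases leaves room for the tie-breaking rules of the other optional implementations.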
As a second alternative implementation:
and if the association degrees between the target user and each user in the first cluster and the second cluster are both larger than the threshold value, and the first cluster and the second cluster respectively correspond to the first entity name and the second entity name, determining the target cluster to which the target user belongs according to the number of users contained in the first cluster and the number of users contained in the second cluster.
For example, if the threshold is 0.92, when the association degrees between user A and each user in the first cluster Q and in the second cluster W are all greater than 0.92, the social group mining device may determine the number of users included in the first cluster Q and the number of users included in the second cluster W. If the first cluster Q includes 100 users and the second cluster W includes 10 users, the number of users in the first cluster Q whose association degree with user A is greater than 0.92 is far greater than the number of such users in the second cluster W, and the first cluster Q may then be determined as the target cluster of user A.
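The second optional implementation amounts to a size-based tie-break among clusters that all pass the threshold. A minimal sketch, with a hypothetical function name and a made-up association value of 0.93 for every user:

```python
def pick_cluster_by_size(degrees_by_cluster, threshold):
    """Among clusters whose users all exceed the threshold, pick the largest one."""
    candidates = {name: len(degrees)
                  for name, degrees in degrees_by_cluster.items()
                  if all(d > threshold for d in degrees)}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Hypothetical data: 100 users in cluster Q and 10 users in cluster W.
degrees_by_cluster = {"Q": [0.93] * 100, "W": [0.93] * 10}
largest_cluster = pick_cluster_by_size(degrees_by_cluster, threshold=0.92)
```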
As a third optional implementation:
and if the association degrees between the target user and each user in the first cluster and the second cluster are greater than the threshold value, and the first cluster and the second cluster respectively correspond to the first entity name and the second entity name, determining the target cluster to which the target user belongs according to the type and/or scale of the industry to which the first entity name and the second entity name respectively belong.
The type of industry may include various industries, such as financial industry, catering industry, software industry, automobile industry, etc., and is not limited herein.
It can be understood that, in this embodiment, the target cluster to which the target user belongs may be determined according to the types of the industries to which the first entity name and the second entity name respectively belong; or according to the scales of the industries to which the first entity name and the second entity name respectively belong; or according to both the types and the scales of those industries, which is not specifically limited herein.
For example, if the threshold is 0.92, and the association degrees between target user A and each user in the first cluster Q and in the second cluster W are all greater than 0.92, the social group mining device may determine the first entity name corresponding to the first cluster Q and the second entity name corresponding to the second cluster W. When the first entity name is "XX bank" and the second entity name is "XX restaurant", it may be determined that the first cluster Q belongs to the banking industry and the second cluster W belongs to the catering industry. The catering industry is generally public-facing: any user may dine there, so user mobility is relatively high, whereas user mobility in the banking industry is relatively low. The first cluster Q, which belongs to the banking industry, may therefore be determined as the target cluster of target user A.
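The industry-based choice in this example can be sketched with a lookup table of user mobility per industry. The mobility scores, the table name `INDUSTRY_MOBILITY`, and the function name are assumptions made for illustration; the application only states that catering has relatively high mobility and banking relatively low mobility.

```python
# Hypothetical mobility scores; a lower score means members are more stable.
INDUSTRY_MOBILITY = {"banking": 0.1, "catering": 0.9}

def pick_cluster_by_industry(cluster_industry, mobility=INDUSTRY_MOBILITY):
    """Prefer the cluster whose industry has the lowest user mobility."""
    return min(cluster_industry, key=lambda name: mobility[cluster_industry[name]])

# Cluster Q maps to "XX bank" (banking); cluster W maps to "XX restaurant" (catering).
stable_cluster = pick_cluster_by_industry({"Q": "banking", "W": "catering"})
```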
According to the social group mining method, the target set is clustered by analyzing the relevance between the users in the target set, so that when the affiliation relationship between the target user and the target set is determined, the specific cluster to which the target user belongs can be determined more finely, and the accuracy of determining the relationship between the user and other users is improved.
According to the analysis, the target set is clustered according to the relevance between the users in the target set, and the target cluster to which the target user belongs is determined according to the relevance between the users in the clusters and the target user.
In another optional implementation form of the present application, after the target cluster to which the target user belongs is determined, in order to meet the requirements of each user in the social relationship network, the embodiment may further determine the industry type in which the target user is engaged, so as to determine the tag set corresponding to the target user according to the industry type in which the target user is engaged, so that information targeted to the user can be subsequently pushed according to the tag set, thereby effectively improving the practicability and usability of the social group data. The social group mining method of the present application is further described below with reference to fig. 3.
Fig. 3 is a flowchart illustrating a social group mining method according to another embodiment of the present application.
As shown in fig. 3, the social group mining method of the embodiment of the present application may include the following steps:
Step 301, obtaining location information and network environment information of a target user, where the network environment information is used to represent a network address to which the target user is currently accessed.
Step 302, determining a target set to which the target user belongs according to the position information of the target user.
Step 303, determining the association degree between the target user and each of the other users in the target set according to the network environment information of the target user and the network environment information of each of the other users in the target set.
Step 304, determining the social relationship between the target user and each other user in the target set according to the association degree between the target user and each other user in the target set.
The detailed implementation process and principle of the steps 301-304 can refer to the detailed description of the above embodiments, and are not described herein again.
Step 305, if the target user has a social relationship with other users in the target set, analyzing the entity name corresponding to the target set, and determining the industry type to which the target set belongs.
Optionally, after determining that the target user has a social relationship with other users in the target set, the industry to which the target set belongs may be mined, so as to determine the industry information engaged by each user in the target set. The method and the device can predict the industry type of the industry to which the target set belongs by adopting naive Bayesian inference.
In an optional implementation form of the present application, in this embodiment, a specific process of analyzing an entity name corresponding to a target set and determining an industry type to which the target set belongs may be shown in fig. 4.
As shown in fig. 4, the determining the industry type to which the target set belongs in the present embodiment may include the following steps:
Step 401, performing parsing processing on the entity name corresponding to the target set, and determining a weight value of each word segmentation unit in the entity name corresponding to the target set.
Optionally, semantic analysis may be performed on the entity name corresponding to the target set, and word segmentation operation may be performed on the entity name corresponding to the target set according to the analysis result to obtain a corresponding word segmentation unit, and determine the weight of the word segmentation unit in the entity name.
When determining the weight of each word segmentation unit in the entity name, the weight may be determined according to the semantics of each word segmentation unit. For example, after word segmentation is performed on the entity name "Baidu Building", the resulting word segmentation units are "Baidu" and "Building"; "Baidu" has a specific meaning, whereas "Building" has only a general meaning. It can therefore be determined that the weight value of "Baidu" in "Baidu Building" is greater than the weight value of "Building".
It should be noted that, when determining the weight of each word segmentation unit, whether the unit is a professional term may also be considered; in general, the weight value of a professional term is greater than that of a non-professional term.
Step 402, determining the probability values of each industry corresponding to each word segmentation unit in the entity name corresponding to the target set according to a preset industry dictionary.
The preset industry dictionary may be an industry-to-word mapping dictionary established by performing semantic analysis and word segmentation on the entity names corresponding to sets of known industry types and filtering out the invalid words in the word segmentation results. For example, after word segmentation is performed on "Baidu Building" and the invalid words are filtered out, the remaining valid word is "Baidu", and the mapped industry is the internet.
Optionally, after determining the weight of each word segmentation unit in the entity name corresponding to the target set in the entity name, the social group mining device may determine, by using a preset industry dictionary, each industry probability value corresponding to each word segmentation unit in the entity name corresponding to the target set.
For example, when the word segmentation unit is "Baidu", it may be determined according to the preset industry dictionary that the probability that the unit belongs to the internet industry is P1 and the probability that it belongs to the advertising industry is P2.
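How such an industry dictionary might be derived from entity names of known industry types is sketched below. The estimation by simple counting, the stop-word list, the sample names, and the function name are all assumptions; the application does not specify how P1 and P2 are computed. Word segmentation is assumed to have been done already, so each sample is a list of tokens.

```python
from collections import Counter, defaultdict

# Hypothetical list of invalid words to filter out of segmentation results.
STOP_WORDS = {"Building"}

def build_industry_dictionary(samples):
    """Estimate P(industry | word) by counting words over labeled entity names."""
    counts = defaultdict(Counter)
    for tokens, industry in samples:
        for token in tokens:
            if token not in STOP_WORDS:
                counts[token][industry] += 1
    return {token: {industry: n / sum(counter.values())
                    for industry, n in counter.items()}
            for token, counter in counts.items()}

# Made-up training samples: (segmented entity name, known industry type).
samples = [
    (["Baidu", "Building"], "internet"),
    (["Baidu", "Campus"], "internet"),
    (["Baidu", "Marketing"], "advertising"),
]
industry_dict = build_industry_dictionary(samples)
p1 = industry_dict["Baidu"]["internet"]      # plays the role of P1
p2 = industry_dict["Baidu"]["advertising"]   # plays the role of P2
```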
Step 403, determining the industry type of the target set according to the weight of each word segmentation unit in the entity name and the industry probability values respectively corresponding to each word segmentation unit.
Optionally, the present embodiment may determine the industry type to which the target set belongs through the following formula (3).
$$P(B) = \sum_{i=1}^{n} P(A_i)\,P(B \mid A_i) \tag{3}$$
where $P(B)$ is the probability that the entity name belongs to industry B, $A$ denotes the valid word segmentation units of the entity name, $P(A_i)$ is the proportion (weight) of word segmentation unit $A_i$ in the entity name, $P(B \mid A_i)$ is the probability that word segmentation unit $A_i$ belongs to industry B, $n$ is the number of word segmentation units, and $i$ is the index of the i-th word segmentation unit.
That is, according to formula (3), the probability of each industry to which the target set may belong can be obtained, and the industry with the highest probability is selected as the industry type to which the target set belongs.
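The selection step can be sketched as a weighted sum followed by an argmax. The sketch assumes formula (3) combines each word segmentation unit's weight P(A_i) with its industry probability P(B|A_i) by summation; the weights, the dictionary entries, and the function names are hypothetical.

```python
def industry_probabilities(weights, industry_dict):
    """Compute P(B) = sum_i P(A_i) * P(B | A_i) for every industry B."""
    total = sum(weights.values())
    scores = {}
    for token, weight in weights.items():
        for industry, prob in industry_dict.get(token, {}).items():
            # Normalize the weight so that the P(A_i) values sum to 1.
            scores[industry] = scores.get(industry, 0.0) + (weight / total) * prob
    return scores

def predict_industry(weights, industry_dict):
    """Pick the industry with the highest probability."""
    scores = industry_probabilities(weights, industry_dict)
    return max(scores, key=scores.get)

# Hypothetical weights and dictionary entries for the entity name "Baidu Building".
weights = {"Baidu": 0.8, "Building": 0.2}
industry_dict = {
    "Baidu": {"internet": 0.7, "advertising": 0.3},
    "Building": {"real estate": 1.0},
}
scores = industry_probabilities(weights, industry_dict)
industry = predict_industry(weights, industry_dict)
```

With these made-up numbers, the internet industry receives the largest share because the heavily weighted unit "Baidu" maps mostly to it.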
Step 306, determining a label set corresponding to the target user according to the mapping relation between the preset industry type and the label.
The preset mapping relationship between the industry type and the label can be established by mining the industry to which the target set belongs based on word2 vec.
In practical application, in order to make the social data exert a greater value, the embodiment can perform convenient tag enrichment construction according to the unit/industry information of the target set.
Optionally, in this embodiment, strongly related seed pan-interest words may be established for each industry. For the product interest categories to be mapped, word vectors are used as features, and the similarity between the seed pan-interest words and the product interest words is calculated to expand the vocabulary. That is, the lexicon is expanded through word2vec to form an interest pool for each industry type, and similar tags are then mined from the corresponding industry pool according to the semantic information of the entity name corresponding to the target set to which the target user belongs.
The product interest categories may be various product types, such as news, video, e-commerce, etc.
For example, seed interest words in the internet industry include computers and technologies, and after word expansion, the expansion content obtained is as follows: google, AI, big data, etc.
As another example, the seed interest words in the banking industry include finance and stock; after word expansion, the expanded content may include initial public offering (IPO), P2P, Luzhi, and the like.
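The expansion step can be illustrated with cosine similarity over word vectors. In practice the vectors would come from a trained word2vec model; here they are tiny hand-written two-dimensional vectors, and the similarity cutoff and function names are likewise assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def expand_tags(seed_words, vectors, min_similarity):
    """Return non-seed words that are close enough to at least one seed word."""
    expanded = []
    for word, vec in vectors.items():
        if word in seed_words:
            continue
        if any(cosine(vec, vectors[seed]) >= min_similarity for seed in seed_words):
            expanded.append(word)
    return expanded

# Toy vectors standing in for word2vec embeddings (made up for illustration).
vectors = {
    "computer": (1.0, 0.1),
    "technology": (0.9, 0.2),
    "big data": (0.95, 0.15),
    "stock": (0.1, 1.0),
}
internet_pool = expand_tags({"computer", "technology"}, vectors, min_similarity=0.9)
```

With these toy vectors, "big data" joins the internet interest pool while "stock" does not.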
Further, after obtaining the label pools of each industry, the embodiment may query a mapping relationship between a preset industry type and a label according to the industry type corresponding to the target set to which the target user belongs, and determine the label set corresponding to the target user.
For a scene with low accuracy requirement, the embodiment may perform information push for a target user based on a determined tag set; for a scene with a high requirement on accuracy, the embodiment may further perform accurate matching on the determined tag, and then perform information push for the target user according to the finally determined tag corresponding to the target user.
Optionally, in this embodiment, a way of further accurately matching tags may be performed by performing semantic word segmentation on entity names, and then performing similar tag calculation in a tag pool in an industry to which the entity names belong, so as to obtain a tag with high confidence as a final tag set.
According to the social group mining method, after the affiliation relationship between the target user and the target set is determined, the entity name corresponding to the target set is analyzed, the industry type to which the target set belongs is determined, and then the label set corresponding to the target user is determined according to the mapping relationship between the preset industry type and the labels. Therefore, a reliable pushing basis is provided for the service of information pushing for the user, and targeted pushing can be realized, so that the speed and efficiency of obtaining useful information by the user are improved, and the user requirements are met.
The social group mining device proposed by the embodiment of the present application is described below with reference to the drawings.
Fig. 5 is a schematic structural diagram of a social group mining device according to an embodiment of the present application.
As shown in fig. 5, the social group mining device includes: a first obtaining module 11, a first determining module 12, a second determining module 13, and a third determining module 14.
The first obtaining module 11 is configured to obtain location information and network environment information of a target user, where the network environment information is used to represent a network address to which the target user currently accesses;
the first determining module 12 is configured to determine a target set to which the target user belongs according to the location information of the target user;
the second determining module 13 is configured to determine, according to the network environment information of the target user and the network environment information of each other user in the target set, a degree of association between the target user and each other user in the target set;
the third determining module 14 is configured to determine, according to the association degrees between the target user and each of the other users in the target set, a social relationship between the target user and each of the other users in the target set.
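The way the four modules hand data to one another can be sketched as a small pipeline object. The class name, field names, type signatures, and the stub functions in the usage part are hypothetical; they only mirror the module responsibilities listed above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class SocialGroupMiningDevice:
    # First obtaining module 11: user id -> (position, network environment info).
    acquire: Callable[[str], Tuple[Tuple[float, float], str]]
    # First determining module 12: position -> target set name.
    determine_set: Callable[[Tuple[float, float]], str]
    # Second determining module 13: (network info, set name) -> user -> association degree.
    determine_association: Callable[[str, str], Dict[str, float]]
    # Third determining module 14: association degrees -> user -> social relationship.
    determine_relationship: Callable[[Dict[str, float]], Dict[str, str]]

    def mine(self, user_id: str) -> Dict[str, str]:
        position, network_info = self.acquire(user_id)
        target_set = self.determine_set(position)
        degrees = self.determine_association(network_info, target_set)
        return self.determine_relationship(degrees)

# Stub modules with made-up behavior, used only to show the data flow.
device = SocialGroupMiningDevice(
    acquire=lambda user_id: ((0.0, 0.0), "AP1"),
    determine_set=lambda position: "XX industrial park",
    determine_association=lambda info, target_set: {"B": 0.1, "C": 0.9, "D": 0.1},
    determine_relationship=lambda degrees: {
        user: ("colleague" if degree > 0.5 else "none")
        for user, degree in degrees.items()
    },
)
relationships = device.mine("A")
```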
As an optional implementation form, the social group mining device according to this embodiment further includes a fourth determining module.
The fourth determining module is used for analyzing the map data and determining the mapping relation between each set and the position.
The first determining module 12 is specifically configured to determine a target set to which the target user belongs according to a distance between the location information of the target user and each set location.
As an optional implementation form, the second determining module 13 is specifically configured to determine, based on a preset user association model, respective association degrees corresponding to the network environment information of the target user and the network environment information of each other user in the target set respectively;
The preset user association model is obtained by training with the network environment information of users with known association degrees as samples.
It should be noted that, for the implementation process and the technical principle of the social group mining device of the present embodiment, reference is made to the foregoing explanation of the social group mining method of the first embodiment, and details are not repeated here.
The social group mining device provided by this embodiment of the present application first obtains the position information and the network environment information of a target user, determines a target set to which the target user belongs according to the position information of the target user, determines the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set, and then determines the affiliation relationship between the target user and the target set according to the association degree between the target user and each other user in the target set. In this way, the relationship between a user and other users is mined based on the user's position information and network environment information. Because the mined data is easy to acquire and has wide coverage, the cost of mining user relationships is reduced and the resulting user relationships are more comprehensive and practical.
In an exemplary embodiment, a social group mining apparatus is also provided.
Fig. 6 is a schematic structural diagram of a social group mining device according to another embodiment of the present application.
Referring to fig. 6, the social group mining device of the present application includes: a first obtaining module 11, a first determining module 12, a second determining module 13, and a fifth determining module 15.
The first obtaining module 11 is configured to obtain location information and network environment information of a target user, where the network environment information is used to represent a network address to which the target user currently accesses;
the first determining module 12 is configured to determine a target set to which the target user belongs according to the location information of the target user;
the target set corresponds to N entity names, wherein N is a positive integer greater than 1.
The second determining module 13 is configured to determine, according to the network environment information of the target user and the network environment information of each other user in the target set, a degree of association between the target user and each other user in the target set;
as an optional implementation form, the social group mining device of the present application further includes: a fifth determining module 15.
The fifth determining module 15 is configured to perform clustering on the target set according to the relevance between users in the target set, and determine a target cluster to which the target user belongs, where each cluster corresponds to an entity name.
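One way to picture the clustering step is to link any two users whose association degree exceeds a threshold and to treat each connected group as a cluster. The sketch below assumes this threshold-graph approach, the function name `cluster_by_association`, and the format of the pairwise degree dictionary; the present application does not prescribe a particular clustering algorithm.

```python
def cluster_by_association(users, degree, threshold=0.5):
    """Group users into clusters via connected components of a threshold graph.

    users:  list of user ids
    degree: dict mapping frozenset({u, v}) of two distinct users to their
            association degree (hypothetical format)
    """
    parent = {u: u for u in users}

    def find(u):
        # Follow parent links to the representative, compressing the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for pair, value in degree.items():
        if value > threshold:
            u, v = tuple(pair)
            parent[find(u)] = find(v)  # merge the two groups

    clusters = {}
    for u in users:
        clusters.setdefault(find(u), set()).add(u)
    return list(clusters.values())


degree = {
    frozenset({"a", "b"}): 0.9,
    frozenset({"c", "d"}): 0.8,
    frozenset({"a", "c"}): 0.1,
}
clusters = cluster_by_association(["a", "b", "c", "d"], degree)
```

With these toy values, users a and b form one cluster and users c and d form another, each of which would then be matched to one entity name.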
As an optional implementation form, the fifth determining module 15 is specifically configured to determine that the target cluster to which the target user belongs is a first cluster, if the association degrees between the target user and each user in the first cluster are all greater than a threshold and the association degrees between the target user and each user in a second cluster are all less than the threshold.
As an optional implementation form, the fifth determining module 15 is specifically configured to determine, if the association degrees between the target user and each user in the first cluster and the second cluster are all greater than a threshold, and the first cluster and the second cluster respectively correspond to a first entity name and a second entity name, the target cluster to which the target user belongs according to the number of users included in the first cluster and the number of users included in the second cluster.
As an optional implementation form, the fifth determining module 15 is specifically configured to determine, if the association degrees between the target user and each user in the first cluster and the second cluster are all greater than a threshold, and the first cluster and the second cluster respectively correspond to a first entity name and a second entity name, the target cluster to which the target user belongs according to the type and/or scale of the industry to which the first entity name and the second entity name respectively belong.
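The optional forms above can be read as a small decision procedure. The following sketch assumes a hypothetical function `choose_cluster`, a degree dictionary keyed by user id, and a tie-break that prefers the cluster containing more users. The present application only states that the choice is made according to the user counts (or the industry type and/or scale), so the direction of the preference is an assumption, and the industry-based variant is not shown.

```python
def choose_cluster(degrees, clusters, threshold=0.5):
    """Pick the target cluster for a user.

    degrees:  dict user_id -> association degree with the target user
    clusters: dict entity_name -> set of user ids in that cluster
    Returns an entity name, or None if no cluster qualifies.
    """
    # A cluster qualifies when the target is associated with every member.
    qualifying = [
        name for name, members in clusters.items()
        if members and all(degrees.get(u, 0.0) > threshold for u in members)
    ]
    if not qualifying:
        return None
    if len(qualifying) == 1:
        return qualifying[0]
    # Several clusters qualify: assumed tie-break by number of users.
    return max(qualifying, key=lambda name: len(clusters[name]))


clusters = {"Company A": {"u2", "u3"}, "Cafe B": {"u4"}}
only_a = choose_cluster({"u2": 0.9, "u3": 0.8, "u4": 0.2}, clusters)
both = choose_cluster({"u2": 0.9, "u3": 0.8, "u4": 0.7}, clusters)
```

In the first call only "Company A" qualifies; in the second both clusters qualify and the assumed tie-break selects the larger one.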
It should be noted that, for the implementation process and the technical principle of the social group mining device of the present embodiment, reference is made to the foregoing explanation of the social group mining method of the first embodiment, and details are not repeated here.
According to the social group mining device provided by the embodiment of the application, the target set is subjected to clustering processing by analyzing the relevance among the users in the target set, so that when the affiliation relationship between the target user and the target set is determined, the specific cluster to which the target user belongs can be determined more finely, and the accuracy of determining the relationship between the user and other users is improved.
In an exemplary embodiment, a social group mining apparatus is also provided.
Fig. 7 is a schematic structural diagram of a social group mining device according to yet another embodiment of the present application.
As shown in FIG. 7, the social group mining device of the present application includes: a first obtaining module 11, a first determining module 12, a second determining module 13, a third determining module 14, a sixth determining module 16 and a seventh determining module 17.
The first obtaining module 11 is configured to obtain location information and network environment information of a target user, where the network environment information is used to represent the network address currently accessed by the target user;
the first determining module 12 is configured to determine a target set to which the target user belongs according to the location information of the target user;
wherein the target set corresponds to an entity name.
The second determining module 13 is configured to determine, according to the network environment information of the target user and the network environment information of each other user in the target set, a degree of association between the target user and each other user in the target set;
the third determining module 14 is configured to determine, according to the association degrees between the target user and each of the other users in the target set, a social relationship between the target user and each of the other users in the target set.
As an optional implementation form, the social group mining device of the present application further includes: a sixth determining module 16 and a seventh determining module 17.
The sixth determining module 16 is configured to, if the target user has a social relationship with each of the other users in the target set, parse an entity name corresponding to the target set, and determine an industry type to which the target set belongs;
the seventh determining module 17 is configured to determine, according to a mapping relationship between preset industry types and tags, a tag set corresponding to the target user.
As an optional implementation form, the sixth determining module 16 is specifically configured to: parse the entity name corresponding to the target set, and determine the weight of each word segmentation unit in the entity name corresponding to the target set;
determine, according to a preset industry dictionary, the probability values of the industries corresponding to each word segmentation unit in the entity name corresponding to the target set;
and determine the industry type to which the target set belongs according to the weight of each word segmentation unit in the entity name and the industry probability values corresponding to each word segmentation unit.
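The industry-type determination and the subsequent tag lookup can be sketched as below. The entity name is assumed to be already segmented; the weights, the toy industry dictionary, the tag mapping, and the names `industry_type` and `INDUSTRY_TAGS` are hypothetical illustrations. The sketch combines weights and probabilities by a weighted sum, which is one plausible reading of the description rather than a formula stated in the present application.

```python
from collections import defaultdict


def industry_type(unit_weights, industry_dict):
    """Score each industry as sum(weight * probability) over segmentation units.

    unit_weights:  dict unit -> weight within the entity name
    industry_dict: dict unit -> {industry: probability} (preset industry dictionary)
    Returns the highest-scoring industry, or None if no unit is in the dictionary.
    """
    scores = defaultdict(float)
    for unit, weight in unit_weights.items():
        for industry, prob in industry_dict.get(unit, {}).items():
            scores[industry] += weight * prob
    return max(scores, key=scores.get) if scores else None


# Hypothetical segmentation of an entity name into weighted units.
unit_weights = {"Beijing": 0.1, "Sunshine": 0.2, "Kindergarten": 0.7}
industry_dict = {
    "Sunshine": {"real estate": 0.4, "education": 0.2},
    "Kindergarten": {"education": 0.9},
}
# Hypothetical preset mapping between industry types and tags.
INDUSTRY_TAGS = {"education": {"parenting", "early education"}}

industry = industry_type(unit_weights, industry_dict)
tags = INDUSTRY_TAGS.get(industry, set())
```

Giving the highest weight to the most informative unit lets a generic word such as a place name contribute little to the result, while the tag set is then obtained by a direct lookup on the chosen industry type.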
It should be noted that the foregoing explanation of the embodiment of the social group mining method is also applicable to the social group mining device of the embodiment, and the implementation principle thereof is similar and will not be described herein again.
The social group mining device of the embodiment analyzes the entity name corresponding to the target set after determining the affiliation relationship between the target user and the target set, determines the industry type to which the target set belongs, and then determines the tag set corresponding to the target user according to the mapping relationship between the preset industry type and the tags. Therefore, a reliable pushing basis is provided for the service of information pushing for the user, and targeted pushing can be realized, so that the speed and efficiency of obtaining useful information by the user are improved, and the user requirements are met.
In order to implement the above embodiments, the present application also provides a computer device.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application. The computer device shown in fig. 8 is only an example, and should not bring any limitation to the function and the scope of use of the embodiments of the present application.
As shown in FIG. 8, the computer device 200 includes: a memory 210, a processor 220, and a computer program stored on the memory 210 and executable on the processor 220, wherein the processor 220 implements the social group mining method according to the first embodiment when executing the program.
In an alternative implementation form, as shown in fig. 9, the computer device 200 may further include: a memory 210 and a processor 220, a bus 230 connecting the different components (including the memory 210 and the processor 220), wherein the memory 210 stores a computer program, and when the processor 220 executes the program, the social group mining method according to the embodiment of the present application is implemented.
Bus 230 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 200 typically includes a variety of computer device readable media. Such media may be any available media that is accessible by computer device 200 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 210 may also include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 240 and/or cache memory 250. The computer device 200 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 260 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 9, and commonly referred to as a "hard drive"). Although not shown in FIG. 9, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 230 by one or more data media interfaces. Memory 210 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the application.
A program/utility 280 having a set (at least one) of program modules 270, including but not limited to an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment, may be stored in, for example, the memory 210. The program modules 270 generally perform the functions and/or methodologies of the embodiments described herein.
The computer device 200 may also communicate with one or more external devices 290 (e.g., keyboard, pointing device, display 291, etc.), with one or more devices that enable a user to interact with the computer device 200, and/or with any devices (e.g., network card, modem, etc.) that enable the computer device 200 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 292. Also, computer device 200 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet) through network adapter 293. As shown, network adapter 293 communicates with the other modules of computer device 200 via bus 230. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the computer device 200, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
It should be noted that, for the implementation process and the technical principle of the computer device of the present embodiment, reference is made to the foregoing explanation of the social group mining method of the first aspect, and details are not repeated here.
The computer device provided by the embodiment of the application first obtains the location information and the network environment information of the target user, determines a target set to which the target user belongs according to the location information of the target user, determines the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set, and then determines the affiliation relationship between the target user and the target set according to the association degree between the target user and each other user in the target set. Therefore, the method and the device realize mining of the relationship between the user and other users based on the position information and the network environment information of the user, and not only reduce the cost of mining the user relationship, but also enable the finally obtained user relationship to be more comprehensive and practical due to low acquisition difficulty and wide coverage range of the mining data.
To achieve the above object, the present application also proposes a computer-readable storage medium.
Wherein the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the social group mining method described in the first aspect embodiment.
In an alternative implementation, the embodiments may be implemented in any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A social group mining method, comprising:
acquiring position information and network environment information of a target user, wherein the network environment information is used for representing a network address currently accessed by the target user;
determining a target set to which the target user belongs according to the position information of the target user;
determining the association degree between the target user and each other user in the target set according to the network environment information of the target user and the network environment information of each other user in the target set;
determining the social relationship between the target user and each other user in the target set according to the association degree between the target user and each other user in the target set; the target set corresponds to an entity name;
after determining the social relationship between the target user and each of the other users in the target set, the method further includes:
if the target user has social relations with other users in the target set, analyzing entity names corresponding to the target set to determine the industry type of the target set;
determining a label set corresponding to the target user according to a mapping relation between a preset industry type and a label;
the analyzing the entity name corresponding to the target set and determining the industry type to which the target set belongs includes:
analyzing the entity name corresponding to the target set, and determining the weight of each word segmentation unit in the entity name corresponding to the target set;
determining, according to a preset industry dictionary, the probability values of the industries corresponding to each word segmentation unit in the entity name corresponding to the target set;
and determining the industry type to which the target set belongs according to the weight of each word segmentation unit in the entity name and the industry probability values corresponding to each word segmentation unit.
2. The method of claim 1, wherein prior to determining the target set to which the target user belongs, further comprising:
analyzing the map data, and determining the mapping relation between each set and the position;
the determining the target set to which the target user belongs includes:
and determining a target set to which the target user belongs according to the distance between the position information of the target user and the positions of the sets.
3. The method of claim 1, wherein said determining the degree of association between the target user and each of the other users in the target set comprises:
determining, based on a preset user association model, the association degrees respectively corresponding to the network environment information of the target user and the network environment information of each other user in the target set;
wherein the preset user association model is obtained by training with, as samples, the network environment information of users whose association degrees are known.
4. The method of claim 1, wherein the target set corresponds to N entity names, where N is a positive integer greater than 1;
after determining the association degree between the target user and each other user in the target set, the method further includes:
and according to the relevance between all users in the target set, clustering the target set, and determining a target cluster to which the target user belongs, wherein each cluster corresponds to an entity name.
5. The method of claim 4, wherein the determining the target cluster to which the target user belongs comprises:
and if the association degrees between the target user and each user in the first cluster are both greater than a threshold value and the association degrees between the target user and each user in the second cluster are both less than the threshold value, determining that the target cluster to which the target user belongs is the first cluster.
6. The method of claim 4, wherein the determining the target cluster to which the target user belongs comprises:
and if the association degrees between the target user and each user in the first cluster and the second cluster are both larger than a threshold value, and the first cluster and the second cluster respectively correspond to the first entity name and the second entity name, determining the target cluster to which the target user belongs according to the number of users contained in the first cluster and the number of users contained in the second cluster.
7. The method of claim 4,
the determining the target cluster to which the target user belongs includes:
and if the association degrees between the target user and each user in the first cluster and the second cluster are greater than a threshold value, and the first cluster and the second cluster respectively correspond to a first entity name and a second entity name, determining the target cluster to which the target user belongs according to the type and/or scale of the industry to which the first entity name and the second entity name respectively belong.
8. A social group mining device, comprising:
a first obtaining module, configured to obtain location information and network environment information of a target user, wherein the network environment information is used to represent the network address currently accessed by the target user;
a first determining module, configured to determine a target set to which the target user belongs according to the location information of the target user;
a second determining module, configured to determine, according to the network environment information of the target user and the network environment information of each other user in the target set, a degree of association between the target user and each other user in the target set;
a third determining module, configured to determine, according to the association degrees between the target user and each of the other users in the target set, a social relationship between the target user and each of the other users in the target set; wherein the target set corresponds to an entity name;
after determining the social relationship between the target user and each of the other users in the target set, the method further includes:
if the target user has social relations with other users in the target set, analyzing entity names corresponding to the target set to determine the industry type of the target set;
determining a label set corresponding to the target user according to a mapping relation between a preset industry type and a label;
the analyzing the entity name corresponding to the target set and determining the industry type to which the target set belongs includes:
analyzing the entity name corresponding to the target set, and determining the weight of each word segmentation unit in the entity name corresponding to the target set;
determining, according to a preset industry dictionary, the probability values of the industries corresponding to each word segmentation unit in the entity name corresponding to the target set;
and determining the industry type to which the target set belongs according to the weight of each word segmentation unit in the entity name and the industry probability values corresponding to each word segmentation unit.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the program implementing the social group mining method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the social group mining method according to any one of claims 1 to 7.
CN201810606527.7A 2018-06-13 2018-06-13 Social group mining method, device, equipment and storage medium Active CN110598122B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810606527.7A CN110598122B (en) 2018-06-13 2018-06-13 Social group mining method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810606527.7A CN110598122B (en) 2018-06-13 2018-06-13 Social group mining method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110598122A CN110598122A (en) 2019-12-20
CN110598122B true CN110598122B (en) 2022-04-01

Family

ID=68849115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810606527.7A Active CN110598122B (en) 2018-06-13 2018-06-13 Social group mining method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110598122B (en)

Similar Documents

Publication Publication Date Title
Zhang et al. An incremental CFS algorithm for clustering large data in industrial internet of things
CN107992596B (en) Text clustering method, text clustering device, server and storage medium
CN110390054B (en) Interest point recall method, device, server and storage medium
US9262438B2 (en) Geotagging unstructured text
US11861516B2 (en) Methods and system for associating locations with annotations
Zhao et al. ICFS clustering with multiple representatives for large data
CN111212383B (en) Method, device, server and medium for determining number of regional permanent population
CN106033416A (en) A string processing method and device
WO2018223331A1 (en) Systems and methods for text attribute determination using conditional random field model
CN111522838A (en) Address similarity calculation method and related device
Skoumas et al. Location estimation using crowdsourced spatial relations
CN110909540B (en) Method and device for identifying new words of short message spam and electronic equipment
CN110598122B (en) Social group mining method, device, equipment and storage medium
CN112860993A (en) Method, device, equipment, storage medium and program product for classifying points of interest
CN111310065A (en) Social contact recommendation method and device, server and storage medium
CN113821702A (en) Urban multidimensional space multivariate heterogeneous information data processing method
CN107729944B (en) Identification method and device of popular pictures, server and storage medium
CN104615620A (en) Map search type identification method and device and map search method and system
CN109657060B (en) Safety production accident case pushing method and system
CN113139110B (en) Regional characteristic processing method, regional characteristic processing device, regional characteristic processing equipment, storage medium and program product
CN115292008A (en) Transaction processing method, device, equipment and medium for distributed system
CN114638308A (en) Method and device for acquiring object relationship, electronic equipment and storage medium
CN111125272B (en) Regional characteristic acquisition method, regional characteristic acquisition device, computer equipment and medium
CN113627184B (en) Data processing method and device
CN111767722A (en) Word segmentation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant