CN109241202B - Stranger social user matching method and system based on clustering - Google Patents

Stranger social user matching method and system based on clustering

Info

Publication number
CN109241202B
Authority
CN
China
Prior art keywords
user
activity
historical
cluster
activities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811056510.5A
Other languages
Chinese (zh)
Other versions
CN109241202A (en)
Inventor
陈俊华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Feichi Network Technology Co ltd
Original Assignee
Hangzhou Feichi Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Feichi Network Technology Co ltd filed Critical Hangzhou Feichi Network Technology Co ltd
Priority to CN201811056510.5A priority Critical patent/CN109241202B/en
Publication of CN109241202A publication Critical patent/CN109241202A/en
Application granted granted Critical
Publication of CN109241202B publication Critical patent/CN109241202B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application provides a stranger social user matching method based on clustering, which comprises the following steps: clustering historical users based on the characteristic data of the historical users, and dividing the historical users into a plurality of user clusters; clustering historical activities based on the characteristic data of the historical activities, and dividing the historical activities into a plurality of activity clusters; matching the user cluster with the activity cluster, and determining the corresponding relation between the user cluster and the activity cluster; dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively; and pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation. According to the clustering-based stranger social contact user matching method, activities meeting the interests and hobbies of the user can be automatically pushed for the user, so that time is saved, user experience is improved, and social contact among strangers is facilitated.

Description

Stranger social user matching method and system based on clustering
Technical Field
The application relates to the technical field of internet application, in particular to a stranger social user matching method and system based on clustering.
Background
Social interaction refers to interpersonal communication within society: people transmit information and exchange ideas through some medium (tool) in order to carry out social activities with a certain purpose. In modern times, changes in the economic and social environment have made interpersonal communication more important, because only by continually interacting and exchanging information with many kinds of people can a person keep enriching, developing, and broadening themselves.
With the development of science and technology and the spread of internet resources into daily life, people increasingly communicate over the internet, and strangers can also socialize online and thereby further develop and broaden themselves. For example, internet platforms and services dedicated to stranger social networking have appeared in the prior art, such as searching for nearby people to chat with online and sending virtual drift bottles.
A stranger social platform that has recently appeared in the prior art works as follows: an activity organizer publishes on the platform a social activity (such as a dinner party, an outing, or a gaming session) to be held at a predetermined time and place, and sets the conditions (such as gender and age) that participants must meet; other users search the platform for social activities that interest them and whose conditions they meet, register online, and then attend at the scheduled place and time as activity participants.
However, both the social activities published on such a platform and the user groups they target are large in scale. In the prior art, a user looking for activities that match personal interests in the APP on an intelligent terminal therefore generally spends a large amount of time searching. The prior art also supports only simple keyword search and filtering by fixed, limited conditions such as activity time, location range, and activity type, so the user cannot find the social activities that best fit their interests, schedule, and so on, which harms the user experience. At the same time, a given social activity does not easily gather its most suitable participants, which hinders normal social interaction among strangers.
Disclosure of Invention
In view of the above, an object of the present application is to provide a clustering-based stranger social user matching method and system, so as to solve the technical problem in the prior art that users spend a large amount of time searching for social activities that match their interests and that a social activity does not easily gather its most suitable participants.
In view of the above, in a first aspect of the present application, a method for matching social users of strangers based on clustering is provided, including:
clustering historical users based on the characteristic data of the historical users, and dividing the historical users into a plurality of user clusters;
clustering historical activities based on the characteristic data of the historical activities, and dividing the historical activities into a plurality of activity clusters;
matching the user cluster with the activity cluster, and determining the corresponding relation between the user cluster and the activity cluster;
dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively;
and pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation between the user cluster and the activity cluster.
In some embodiments, the clustering the historical users based on the characteristic data of the historical users, and the dividing the historical users into a plurality of user clusters includes:
generating n k-dimensional feature vectors as sample points according to feature data of the historical users, wherein n is the total number of historical users, and k is the number of dimensions of the feature vectors of the historical users;
giving s central points, and respectively calculating the distance from each sample point to the s central points, wherein s is less than or equal to n;
marking each sample point as a category corresponding to a central point closest to the sample point;
updating the central point in each category as the mean value of all sample points belonging to the category;
and repeating the processes of marking the sample point categories and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of user clusters.
In some embodiments, the characteristic data of the historical users includes:
personal information of the user, personal tag categories of the user, and activity history information of the user.
In some embodiments, the activity history information of the user includes:
the number of times the user has participated in, liked, and followed activities, and attribute information of the activities the user participated in, the activities the user liked, and the activities the user followed.
In some embodiments, the clustering historical activities based on the characteristic data of the historical activities, the dividing the historical activities into a plurality of activity clusters, includes:
generating m g-dimensional feature vectors as sample points according to feature data of the historical activities, wherein m is the total number of historical activities, and g is the number of dimensions of the feature data of the historical activities;
giving q central points, and respectively calculating the distance from each sample point to the q central points, wherein q is less than or equal to m;
marking each sample point as a category corresponding to a central point closest to the sample point;
updating the central point in each category as the mean value of all sample points belonging to the category;
and repeating the processes of marking the sample point classes and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of activity clusters.
In some embodiments, the characteristic data of the historical activities includes:
activity time, activity location, activity person attributes, and activity category.
In some embodiments, further comprising:
and quantitatively scoring the characteristic data of the historical user and the characteristic data of the historical activity, and converting the characteristic data of the historical user and the characteristic data of the historical activity into numerical values of characteristic values.
In some embodiments, further comprising:
using a weight vector $(\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k)$ to correct the feature vector $(x_1, x_2, x_3, \ldots, x_k)$ of the historical user and/or historical activity, wherein $\alpha_1 + \alpha_2 + \alpha_3 + \cdots + \alpha_k = 1$, and the values of $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k$ may be set based on the user's preference factors in matching social activities.
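The weight correction above can be sketched as follows. The text does not say how the weights are combined with the feature vector, so this minimal sketch assumes an element-wise product; the function name `apply_weights` and the sample numbers are illustrative only.

```python
def apply_weights(features, weights):
    """Scale each feature value by its weight (assumed element-wise product)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    # The text requires the weights to sum to 1.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [a * x for a, x in zip(weights, features)]


# Illustrative seven-dimensional user vector and weights (hypothetical values).
user_a = [20, 100, 60, 40, 75, 50, 50]
weights = [0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1]
weighted_user_a = apply_weights(user_a, weights)
```

Larger weights on the interest dimensions make those dimensions dominate the distances used in clustering, which is one way a preference factor could take effect.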
In some embodiments, the matching the user cluster and the activity cluster to determine the correspondence between the user cluster and the activity cluster specifically includes:
and taking the activity cluster with the highest ratio of the users in the user cluster participating in the activities in the activity cluster as a matching activity cluster of the user cluster, and determining the corresponding relation between the user cluster and the activity cluster.
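A minimal sketch of this matching rule is given below. It assumes that the ratio is measured over historical participation records, so that the activity cluster with the most participations from a user cluster is its match; the function name and the data layout are hypothetical.

```python
from collections import Counter


def match_clusters(user_cluster_of, activity_cluster_of, participations):
    """Map each user cluster to the activity cluster its users joined most often.

    user_cluster_of: dict of user id -> user cluster id
    activity_cluster_of: dict of activity id -> activity cluster id
    participations: iterable of (user id, activity id) pairs from history
    """
    counts = {}
    for user_id, activity_id in participations:
        uc = user_cluster_of[user_id]
        ac = activity_cluster_of[activity_id]
        counts.setdefault(uc, Counter())[ac] += 1
    # Within one user cluster every share has the same denominator,
    # so the largest count is also the highest ratio.
    return {uc: c.most_common(1)[0][0] for uc, c in counts.items()}


user_cluster_of = {"u1": 0, "u2": 0, "u3": 1}
activity_cluster_of = {"a1": "sports", "a2": "gourmet", "a3": "sports"}
participations = [("u1", "a1"), ("u2", "a3"), ("u1", "a2"), ("u3", "a2")]
cluster_match = match_clusters(user_cluster_of, activity_cluster_of, participations)
```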
In another aspect of the present application, there is provided a system for matching stranger social users based on clustering, including:
the user cluster dividing module is used for clustering the historical users based on the characteristic data of the historical users and dividing the historical users into a plurality of user clusters;
the activity cluster dividing module is used for clustering historical activities based on the characteristic data of the historical activities and dividing the historical activities into a plurality of activity clusters;
the matching module is used for matching the user cluster with the activity cluster and determining the corresponding relation between the user cluster and the activity cluster;
the current user and activity dividing module is used for dividing a current user and a current activity respectively, and determining the user cluster and the activity cluster to which the current user and the current activity respectively belong;
and the pushing module is used for pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation.
The embodiment of the application provides a clustering-based stranger social contact user matching method and system, which can automatically push activities meeting the interests and hobbies of users for the users, save the time for the users to search social contact activities suitable for the users, increase the satisfaction degree of the social contact activities, improve the user experience, and are beneficial to normal social contact among strangers.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a flow chart of a stranger social user matching method based on clustering according to a first embodiment of the present application;
FIG. 2 is a flow chart of a stranger social user matching method based on clustering according to a second embodiment of the present application;
FIG. 3 is a schematic structural diagram of a stranger social user matching system based on clustering according to a third embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 is a flowchart of a clustering-based stranger social user matching method according to a first embodiment of the present application. The method comprehensively collects the personal attributes a user exhibits in various respects and integrates them into a feature vector for that user; likewise, it forms a feature vector reflecting the attributes of each social activity from the activity and its related factors; it then determines the social activities that match the user's personal attributes by means of clustering and historical-data matching, and pushes them to the user.
As can be seen from fig. 1, the clustering-based stranger social user matching method of the present embodiment may include the following steps:
S101: Clustering the historical users based on the characteristic data of the historical users, and dividing the historical users into a plurality of user clusters.
In the method of this embodiment, a user registers an account with a social APP (application software) installed on an intelligent terminal such as a smartphone in order to socialize with strangers. Specifically, a user can initiate a social activity through the APP and publish the activity information on the social APP platform, so that other users can see the information and choose to register for the activity, thereby achieving social interaction among strangers. Users may of course also register for social activities initiated by other users.
Using existing big data technology, a user portrait can be established for each registered user of the social APP platform. The user portrait stores the user's personal attributes, including personal information such as the gender and age entered at registration, and user tags. User tags may be added by the user or by the user's social friends and reflect the user's interests and personality; for example, a user may be associated with interest tags such as "sports", "gourmet", and "art", and/or with personality tags such as "music enthusiast", "foodie", and "young football star". In addition, registered users who have organized or registered for social activities on the social APP platform are referred to herein as historical users, and the user portrait also keeps an activity history of the historical social activities that each historical user has organized or participated in. The activity history may further record historical social activities that the user did not participate in but interacted with, for example by liking or following the activity.
In step S101, the historical users who took part in historical activities are clustered and divided into a plurality of user clusters. Specifically, each user may correspond to a multi-dimensional feature vector, and the user feature data corresponding to the feature vector dimensions may include the following.
(1) Feature values converted from the personal information in the user portrait, such as gender and age. Taking 0-100 points as the value range of a feature value, the gender feature value of a male user can be specified as 100 and that of a female user as 0. User ages can be converted into feature values according to their distribution over the range of 0-100 years (users aged 100 or over all receive the feature value 100).
(2) Feature values of the categories, such as sports, gourmet, and art, into which the interest tags in the user portrait are converted. In addition, the category corresponding to a personality tag can be identified using a preset personality-tag lexicon, and each identifiable personality tag of the user is converted into a feature value of the corresponding category according to a preset rule; for example, "music enthusiast" is converted into a feature value of the "art" category, "foodie" into a feature value of the "gourmet" category, and "young football star" into a feature value of the "sports" category. Taking 0-100 points as the value range, 20 points are added to a category's feature value for each interest tag or personality tag of the user that corresponds to that category; for example, if a user's interest tag is "sports" and personality tag is "foodie", 20 points are added to that user's feature values for both the "sports" and "gourmet" categories.
(3) Feature values converted from the historical social activities recorded in the user's activity history. As described above, historical social activities include activities the user initiated or participated in, activities the user liked, and activities the user followed. First, feature values of the corresponding categories can be derived from the category of each historical social activity and the number of times the user participated in, liked, or followed such activities; for example, the feature value of the "gourmet" category is derived from the number of times the user participated in, liked, or followed dinner-party activities; the feature value of the "sports" category from the number of times for activities such as football games and running; and the feature value of the "art" category from the number of times for activities such as watching movies and listening to music. If a user has participated in, liked, or followed sports-category historical social activities 10 or more times, 20 points are added to that user's "sports" feature value.
Meanwhile, attribute conditions of time, place, gender of the peers and the like of the historical social activities in which the user participates can be counted, for example, the holding time period of the historical social activities and the gender conditions of the activity peers (such as the percentage of the same sex peers) are counted, and the attribute conditions are converted into feature values corresponding to the attributes of each activity respectively. For example, according to whether the historical social activity holding time in which a certain user participates is distributed in the morning, afternoon or evening, the corresponding characteristic values are respectively set to 0, 50 and 100, and the characteristic value is 100 on the assumption that most of the historical social activity in which the user participates is distributed in the evening; if the majority is distributed in the afternoon, the score is 50. According to the distribution of the same-sex partner proportion of the historical social activities in which a certain user participates in the statistics, the proportion is converted into the characteristic value of the same-sex partner proportion.
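The quantization rules described above can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: age is mapped directly to its value and capped at 100, the same-sex companion proportion is mapped to its percentage, tags are assumed to be already resolved to categories, and all function and variable names are hypothetical.

```python
from collections import Counter

CATEGORIES = ("sports", "gourmet", "art")
TIME_SCORE = {"morning": 0, "afternoon": 50, "evening": 100}


def user_feature_vector(age, gender, tag_categories, interactions,
                        time_slots, same_sex_ratio):
    """Build the seven-dimensional user vector on a 0-100 scale."""
    age_value = min(max(age, 0), 100)          # ages of 100 or more map to 100
    gender_value = 100 if gender == "male" else 0
    category_values = []
    for category in CATEGORIES:
        score = 20 * tag_categories.count(category)   # +20 per matching tag
        if interactions.get(category, 0) >= 10:       # +20 for 10 or more interactions
            score += 20
        category_values.append(min(score, 100))
    # Use the time slot in which most attended activities fell (non-empty list assumed).
    most_common_slot = Counter(time_slots).most_common(1)[0][0]
    time_value = TIME_SCORE[most_common_slot]
    ratio_value = round(100 * same_sex_ratio)         # assumed linear mapping
    return [age_value, gender_value, *category_values, time_value, ratio_value]


vector = user_feature_vector(
    age=20,
    gender="male",
    tag_categories=["sports", "gourmet"],
    interactions={"sports": 12},
    time_slots=["evening", "evening", "afternoon"],
    same_sex_ratio=0.5,
)
```

The resulting order of dimensions follows the seven-dimensional example in the text: age, gender, sports, gourmet, art, historical activity time, historical activity gender situation.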
Based on the feature data above, each historical user can be associated with a multi-dimensional feature vector. For example, each historical user is associated with a seven-dimensional feature vector whose dimensions are "user age", "user gender", "sports", "gourmet", "art", "historical activity time", and "historical activity gender situation". Each dimension takes a specific numerical value as its feature value; for example, the feature vector of user A may be $(x_1, x_2, x_3, x_4, x_5, x_6, x_7)$. In this embodiment, the quantized value in each dimension may range from 0 to 100. By clustering the feature vectors of all historical users, the historical users can be divided into a plurality of user clusters such that the distances between the feature vectors of users in the same user cluster are minimized, that is, users in the same user cluster have similar interests and hobbies. For the specific clustering algorithm, refer to the second embodiment below; it is not described again here.
S102: Clustering the historical activities based on the characteristic data of the historical activities, and dividing the historical activities into a plurality of activity clusters. In this embodiment, the historical activities can likewise be clustered using big data and divided into a plurality of activity clusters. Specifically, each historical activity may correspond to a multi-dimensional feature vector, and the feature vector dimensions of an activity may include an activity time dimension, one or more dimensions representing the attributes of the activity's people (including the number of people and the age and gender attributes of the activity initiator and activity participants), and dimensions corresponding to the various activity categories (for example, sports, gourmet, and art).
For example, based on the activity feature data, each historical activity may be associated with an eight-dimensional feature vector whose dimensions are "activity time", "activity location", "number of active people", "average age of active people", "gender proportion of active people", "sports", "gourmet", and "art". Each dimension can take a specific feature value; for example, the feature vector of activity B may be $(y_1, y_2, y_3, y_4, y_5, y_6, y_7, y_8)$. By clustering the feature vectors of the historical activities, the historical activities can be divided into a plurality of activity clusters such that the distances between the feature vectors of activities in the same activity cluster are minimized.
S103: Matching the user clusters with the activity clusters, and determining the correspondence between the user clusters and the activity clusters.
In this embodiment, after the historical users and historical activities have been divided into a plurality of user clusters and activity clusters, the user clusters and activity clusters can be matched to determine the correspondence between them, for example by taking as the match of a user cluster the activity cluster in whose activities the users of that user cluster participated at the highest ratio.
S104: dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively.
When a user needs to obtain activity information (this user is the current user), the current activity information is acquired; in this embodiment, a current activity is an activity in its recruitment stage, that is, an activity yet to be held. The current user and the current activities are then matched against the user clusters and activity clusters generated in the steps above to determine the user cluster to which the current user belongs and the activity cluster to which each current activity belongs.
In this embodiment, the distance between the feature vector of the current user and the cluster center of each user cluster can be calculated, and the current user is assigned to the user cluster whose center is nearest.
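The nearest-center assignment just described can be sketched as below, assuming Euclidean distance as in the second embodiment. The function name and the two example centers are hypothetical.

```python
import math


def nearest_cluster(vector, centers):
    """Return the index of the cluster center closest to the vector."""
    distances = [math.dist(vector, center) for center in centers]
    return distances.index(min(distances))


# Two illustrative user-cluster centers in the seven-dimensional feature space.
centers = [
    [50, 0, 50, 50, 50, 0, 50],
    [20, 100, 60, 40, 70, 50, 50],
]
current_user = [20, 100, 60, 40, 75, 50, 50]
cluster_index = nearest_cluster(current_user, centers)
```

The same function could be applied to a current activity and the activity-cluster centers, since only the dimensionality of the vectors differs.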
S105: Pushing the current activities in the activity cluster corresponding to the user cluster to which the current user belongs to the current user, based on the correspondence between the user clusters and the activity clusters.
After the current user and the current activities have been assigned, the current activities in the activity cluster corresponding to the current user's user cluster can be pushed to the current user according to the correspondence determined in step S103. Pushing means that the social APP platform sends introduction information and a registration link for the current activity to the current user, either proactively or in response to a request from the current user. If the activity cluster corresponding to the current user's user cluster contains too many current activities, a suitable number can be selected for pushing according to additional screening conditions such as time and distance from the user.
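One possible form of the selection step is sketched below. The field names (`cluster`, `start_hour`, `distance_km`), the distance threshold, and the ordering by start time are all assumptions made for illustration; the text only names time and distance as example screening conditions.

```python
def select_activities(user_cluster, cluster_match, current_activities,
                      max_distance_km=10.0, limit=2):
    """Pick a limited number of current activities from the matched activity cluster."""
    target = cluster_match[user_cluster]
    candidates = [
        a for a in current_activities
        if a["cluster"] == target and a["distance_km"] <= max_distance_km
    ]
    # Earlier activities first; any other ranking rule could be substituted.
    candidates.sort(key=lambda a: a["start_hour"])
    return candidates[:limit]


cluster_match = {0: "sports"}
current_activities = [
    {"name": "evening run", "cluster": "sports", "start_hour": 19, "distance_km": 2.0},
    {"name": "dinner", "cluster": "gourmet", "start_hour": 18, "distance_km": 1.0},
    {"name": "football", "cluster": "sports", "start_hour": 15, "distance_km": 8.0},
    {"name": "far hike", "cluster": "sports", "start_hour": 9, "distance_km": 40.0},
]
pushed = select_activities(0, cluster_match, current_activities)
pushed_names = [a["name"] for a in pushed]
```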
According to the clustering-based stranger social contact user matching method, social contact activities meeting various characteristic attributes such as user interests and hobbies can be automatically pushed to the user based on clustering analysis of historical data of the user and the social contact activities, so that the time for the user to search for the social contact activities suitable for the user is saved, the satisfaction degree of the social contact activities is increased, the user experience is improved, and normal social contact among strangers is facilitated.
Fig. 2 is a flowchart of a stranger social user matching method based on clustering according to a second embodiment of the present application. The method of this embodiment also includes steps S101-S105, which are not described herein again. Fig. 2 shows a process of clustering historical users in step S101 in the clustering-based stranger social user matching method, which specifically includes the following steps:
S201: Generating n k-dimensional feature vectors as sample points according to the feature data of the historical users, wherein n is the total number of historical users, and k is the number of dimensions of the feature vectors of the historical users.
As can be seen from the first embodiment, each historical user may correspond to a feature vector, for example the seven-dimensional feature vector of the first embodiment, and the feature vector of each historical user is used as a sample point. When the total number of historical users acquired by big data technology is n, the sample size is n, and k is the total number of dimensions of the feature data of the historical users, that is, the dimensionality of the feature vector.
For example, in the present embodiment, the 7 dimensions of the feature vector of the user a are respectively ("user age", "user gender", "sports", "gourmet", "art", "historical activity time", and "historical activity gender situation"), the quantized values of the corresponding feature data are respectively (20, 100, 60, 40, 75, 50, 50), and the range of the quantized values in the present embodiment may be 0 to 100.
S202: Given s center points, calculate the distance from each sample point to each of the s center points, where s is less than or equal to n.
This embodiment may use the K-Means algorithm for clustering. When clustering starts, center points must be given, and the number of center points equals the number of user clusters to be produced; this embodiment uses s center points as an example to explain the technical solution of the application. Each center point has the same number of dimensions as the sample points, and the value of each dimension of an initial center point can be chosen arbitrarily, for example (50, 50, 50, 50, 50, 50, 50); other choices are not listed here. Once the center points are given, the distance from each sample point to each center point can be calculated. Continuing the example above, the distance from the sample point (20, 100, 60, 40, 75, 50, 50) to the center point (50, 0, 50, 50, 50, 0, 50) is

$$d = \left((20-50)^2 + (100-0)^2 + (60-50)^2 + (40-50)^2 + (75-50)^2 + (50-0)^2 + (50-50)^2\right)^{1/2}$$
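As a numerical check of the worked example, the following Python sketch evaluates the distance. The center point (50, 0, 50, 50, 50, 0, 50) is an assumption read off the values subtracted in each term of the formula, and the helper name `euclidean_distance` is illustrative only, not part of the application.

```python
import math

# Feature vector of user A from the example above (seven dimensions).
sample = (20, 100, 60, 40, 75, 50, 50)
# Center point implied by the terms of the distance formula (assumption:
# read off the subtracted values, since the printed center is incomplete).
center = (50, 0, 50, 50, 50, 0, 50)


def euclidean_distance(p, q):
    """Return the Euclidean distance between two equal-length vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


d = euclidean_distance(sample, center)
print(round(d, 2))  # 119.27, the square root of 14225
```

The "user gender" and "historical activity time" dimensions dominate this distance, which is one reason the weighting of dimensions can matter in practice.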
S203: Label each sample point with the category of its closest center point.
After the distance from each sample point to every center point has been calculated, each sample point is labeled with the category of the center point closest to it.
S204: Update the center point of each category to the mean of all sample points belonging to that category.
After all sample points have been classified for the first time, the value of each dimension of the center point of each category is updated to the mean, in that dimension, of the feature vectors of all sample points belonging to the category.
S205: Determine whether the sum of the distances from all sample points to their assigned center points is minimal.
S206: When the sum of the distances from all sample points to their assigned center points is minimal, generate a plurality of user clusters; that is, take the classification at that moment as the final classification result, with each category forming one user cluster.
If the sum of the distances from all sample points to their assigned center points is not minimal, repeat steps S203 to S205 until it is.
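Steps S201 to S206 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the initial centers are drawn from the samples (the text allows arbitrary values), the loop stops when the labels no longer change (a common practical stand-in for "the sum of distances is minimal"), and the names `k_means`, `max_iter` and `seed` as well as the four sample users are hypothetical.

```python
import math
import random


def distance(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(samples, s, max_iter=100, seed=0):
    """Cluster `samples` (a list of equal-length tuples) into `s` clusters.

    Returns (centers, labels). Iteration stops once the labels are stable.
    """
    if not 1 <= s <= len(samples):
        raise ValueError("s must satisfy 1 <= s <= n")
    rng = random.Random(seed)
    # S202: give s initial center points (assumption: s samples picked at random).
    centers = [tuple(p) for p in rng.sample(samples, s)]
    labels = None
    for _ in range(max_iter):
        # S202/S203: label each sample with the index of its nearest center.
        new_labels = [
            min(range(s), key=lambda j: distance(p, centers[j])) for p in samples
        ]
        if new_labels == labels:  # S205: nothing changed, treat as converged
            break
        labels = new_labels
        # S204: move each center to the mean of the samples assigned to it.
        for j in range(s):
            members = [p for p, lab in zip(samples, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Hypothetical seven-dimensional user vectors forming two obvious groups.
users = [
    (20, 100, 60, 40, 75, 50, 50),
    (22, 100, 65, 35, 70, 55, 50),
    (45, 0, 10, 80, 20, 30, 40),
    (47, 0, 15, 85, 25, 35, 45),
]
centers, labels = k_means(users, s=2)
print(labels)  # the first two users share one label, the last two share the other
```

K-Means can settle in a local optimum, so in practice the procedure is often run from several initializations and the result with the smallest total distance is kept.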
The clustering-based stranger social user matching method of this embodiment achieves technical effects similar to those of the foregoing embodiment, which are not repeated here.
The third embodiment of the application provides a flow chart of a clustering-based stranger social user matching method. The method of this embodiment also includes steps S101 to S105, which are not described again here. In this method, step S102, in which historical activities are clustered based on their feature data and divided into a plurality of activity clusters, includes:
generating m g-dimensional feature vectors as sample points from the feature data of the historical activities, where m is the total number of historical activities and g is the number of dimensions of each historical activity's feature vector; for example, in one embodiment an eight-dimensional feature vector is established for each historical activity;
giving q center points and calculating the distance from each sample point to each of the q center points, where q is less than or equal to m and q is the number of activity clusters into which the historical activities are to be divided;
marking each sample point with the category of the center point closest to it;
updating the value of each dimension of the center point of each category to the mean, in that dimension, of the feature vectors of all sample points belonging to the category;
and repeating the marking and updating steps until the sum of the distances from all sample points to their assigned center points is minimal, thereby generating a plurality of activity clusters.
The clustering-based stranger social user matching method of this embodiment achieves technical effects similar to those of the foregoing embodiments, which are not repeated here.
As an optional variant of embodiments 1 to 3 above, the method further comprises:
For the feature vectors of historical users and/or the feature vectors of historical activities, a weight vector $(\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k)$ is used to correct the feature vector $(x_1, x_2, x_3, \ldots, x_k)$, where $\alpha_1 + \alpha_2 + \alpha_3 + \cdots + \alpha_k = 1$. The corrected feature vector $(\alpha_1 x_1, \alpha_2 x_2, \alpha_3 x_3, \ldots, \alpha_k x_k)$ is used for the subsequent cluster analysis. The values of $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k$ can be set for a specific user or determined empirically, and can be adjusted according to which attribute factors are emphasized when matching users with social activities. For example, if matching users with social activities by shared interests is preferred, the dimensions representing interests, such as "sports", "gourmet" and "art", may be given higher weights than the other dimensions; if matching by the desired age and gender of companions is preferred, the dimensions representing age and gender ratio may be given higher weights.
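The weighting step amounts to an element-wise product of the weight vector and the feature vector. The sketch below illustrates it for user A's vector; the function name `apply_weights`, the tolerance `tol` and the specific weight values (which favor the three interest dimensions) are assumptions chosen for illustration.

```python
def apply_weights(features, weights, tol=1e-9):
    """Return the corrected vector (a_1*x_1, ..., a_k*x_k).

    Raises ValueError if the lengths differ or the weights do not sum to 1.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return tuple(a * x for a, x in zip(weights, features))


# User A: (age, gender, sports, gourmet, art, activity time, activity gender situation)
user_a = (20, 100, 60, 40, 75, 50, 50)
# Hypothetical weights that emphasize the interest dimensions (sports, gourmet, art).
weights = (0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1)

weighted = apply_weights(user_a, weights)
print(weighted)
```

Because the same weights are applied to every sample before clustering, raising a weight stretches that dimension and makes differences along it count more in the distance calculation.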
In embodiments 1 to 3 above, matching the user clusters with the activity clusters in step S103 to determine the correspondence between them specifically includes:
taking, as the matching activity cluster of a user cluster, the activity cluster in which the users of that user cluster have participated in the highest proportion of activities, and thereby determining the correspondence between the user cluster and the activity cluster. For example, if users in the user cluster have participated in 65% of all activities in activity cluster C and in 70% of all activities in activity cluster D, the activity cluster matched with the user cluster is activity cluster D.
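The matching rule can be illustrated with sets of activity identifiers. In this sketch the ratio for an activity cluster is taken to be the number of its activities that users of the user cluster have joined, divided by the total number of activities in that cluster, which follows the 65% versus 70% example above. The function name `match_activity_cluster` and the identifiers are hypothetical.

```python
def match_activity_cluster(participated, activity_clusters):
    """Pick the activity cluster with the highest participation ratio.

    participated: set of activity ids joined by users of one user cluster.
    activity_clusters: dict mapping cluster name -> set of activity ids.
    Returns (best_cluster_name, ratios_by_cluster).
    """
    ratios = {
        name: len(participated & activities) / len(activities)
        for name, activities in activity_clusters.items()
        if activities  # skip empty clusters to avoid division by zero
    }
    best = max(ratios, key=ratios.get)
    return best, ratios


# Cluster C holds 20 activities, cluster D holds 10 (made-up identifiers).
activity_clusters = {
    "C": {f"c{i}" for i in range(20)},
    "D": {f"d{i}" for i in range(10)},
}
# Users of the user cluster joined 13 activities of C and 7 activities of D.
participated = {f"c{i}" for i in range(13)} | {f"d{i}" for i in range(7)}

best, ratios = match_activity_cluster(participated, activity_clusters)
print(best, ratios)  # D wins with 0.70 against 0.65 for C
```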
Fig. 3 is a schematic structural diagram of a stranger social user matching system based on clustering according to a fourth embodiment of the present application. The stranger social user matching system based on clustering of this embodiment includes:
a user cluster dividing module 301, configured to cluster historical users based on feature data of the historical users, and divide the historical users into a plurality of user clusters;
an activity cluster dividing module 302, configured to cluster historical activities based on feature data of the historical activities, and divide the historical activities into a plurality of activity clusters;
a matching module 303, configured to match the user cluster with the activity cluster, and determine a correspondence between the user cluster and the activity cluster;
a current user and activity dividing module 304, configured to divide a current user and a current activity respectively, and determine a user cluster and an activity cluster to which the current user and the current activity belong respectively;
a pushing module 305, configured to push, to the current user, a current activity in an activity cluster corresponding to a user cluster to which the current user belongs, based on the correspondence.
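The work of modules 304 and 305 can be sketched as a nearest-center lookup followed by a filter. The example uses three-dimensional toy vectors for brevity rather than the seven- and eight-dimensional vectors of the embodiments; all cluster names, center values and the functions `nearest_cluster` and `push_activities` are assumptions made for illustration. `math.dist` requires Python 3.8 or later.

```python
import math


def nearest_cluster(vector, centers):
    """Return the id of the cluster whose center is closest to `vector`."""
    return min(centers, key=lambda cid: math.dist(vector, centers[cid]))


def push_activities(current_user, current_activities,
                    user_centers, activity_centers, correspondence):
    """Return the names of current activities to push to the current user."""
    # Module 304: assign the current user to the nearest user cluster.
    user_cluster = nearest_cluster(current_user, user_centers)
    # Look up the activity cluster matched to that user cluster.
    target = correspondence[user_cluster]
    # Module 304 again for each activity, then module 305 keeps the matches.
    return [
        name
        for name, vec in current_activities.items()
        if nearest_cluster(vec, activity_centers) == target
    ]


# Toy cluster centers and correspondence (hypothetical values).
user_centers = {"U1": (20, 80, 30), "U2": (45, 20, 70)}
activity_centers = {"A1": (10, 90, 10), "A2": (80, 10, 60)}
correspondence = {"U1": "A2", "U2": "A1"}

current_user = (22, 75, 35)
current_activities = {"city run": (75, 15, 55), "gallery night": (12, 85, 15)}

pushed = push_activities(current_user, current_activities,
                         user_centers, activity_centers, correspondence)
print(pushed)  # ['city run']
```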
The clustering-based stranger social user matching method can therefore, based on cluster analysis of historical user data and historical social activity data, automatically push to a user social activities that fit the user's interests, hobbies and other attributes. This saves the time users spend searching for suitable social activities, increases satisfaction with those activities, improves the user experience, and helps social interaction among strangers proceed smoothly.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (3)

1. A stranger social user matching method based on clustering is characterized by comprising the following steps:
establishing a user portrait for each registered user, and storing in the user portrait the personal attribute information, user interests, user personality tags and activity history information of each user, wherein the personal attribute information comprises the gender and age information filled in when the user registers, and the activity history information is a record of the historical social activities that the registered user has organized, participated in, liked or followed;
converting the gender and age personal attribute information in the user portrait of each historical user into corresponding feature values; converting the user interest tags into feature values corresponding to the sports, gourmet and art categories; identifying the category corresponding to each personality tag by using a preset personality tag lexicon, and converting the identifiable personality tags among the user's personality tags into feature values corresponding to the sports, gourmet and art categories according to preset rules; according to the categories of the historical social activities and the number of times the user has initiated, participated in, liked or followed activities, converting the activity history information into feature values of the sports, gourmet and art categories for the registered user; and counting the times of the historical social activities in which the user participated and the attributes of the companions, and converting them respectively into feature values corresponding to each activity attribute;
corresponding each historical user to a multi-dimensional feature vector based on the feature data of the historical users; the multi-dimensional feature vector comprises seven dimensions, namely 'user age', 'user gender', 'sports', 'gourmet', 'art', 'historical activity time' and 'historical activity gender situation', and each dimension corresponds to a specific numerical value serving as a feature value;
clustering historical users, and dividing the historical users into a plurality of user clusters;
based on the feature data of the historical activities, corresponding each historical activity to an eight-dimensional feature vector, the eight-dimensional feature vector comprising eight dimensions, namely 'activity time', 'activity place', 'number of participants', 'average age of participants', 'gender ratio of participants', 'sports', 'gourmet' and 'art', each dimension corresponding to a specific feature value; clustering the historical activities based on their eight-dimensional feature vectors, and dividing the historical activities into a plurality of activity clusters;
taking an activity cluster with the highest ratio of the users in the user cluster participating in activities in the activity cluster as a matching activity cluster of the user cluster, and determining the corresponding relation between the user cluster and the activity cluster;
dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively; calculating the distance between the feature vector corresponding to the current user and the clustering center of each user cluster, and dividing the current user into user clusters with the minimum distance from the clustering center; calculating the distance between the feature vector of the current activity and the clustering center of each activity cluster, and dividing the current activity into activity clusters with the minimum distance from the clustering center;
pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation between the user cluster and the activity cluster;
wherein, for the feature vectors of historical users and/or the feature vectors of historical activities, a weight vector $(\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k)$ is used to correct the feature vector $(x_1, x_2, x_3, \ldots, x_k)$, where $\alpha_1 + \alpha_2 + \alpha_3 + \cdots + \alpha_k = 1$; the corrected feature vector $(\alpha_1 x_1, \alpha_2 x_2, \alpha_3 x_3, \ldots, \alpha_k x_k)$ is used for subsequent cluster analysis; and the values of $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k$ in the weight vector are adjusted according to the emphasis placed on the various attribute factors when matching social activities for the user;
the clustering of the historical users based on the characteristic data of the historical users, the dividing of the historical users into a plurality of user clusters, comprises: generating n k-dimensional feature vectors as sample points according to feature data of the historical users, wherein n is the total amount of the historical users, and k is the degree of dimension of the feature vectors of the historical users; giving s central points, and respectively calculating the distance from each sample point to the s central points, wherein s is less than or equal to n; marking each sample point as a category corresponding to a central point closest to the sample point; updating the central point in each category as the mean value of all sample points belonging to the category; repeating the processes of marking the sample point classes and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of user clusters;
wherein, eight-dimensional eigenvector based on historical activities clusters historical activities, divides the historical activities into a plurality of activity clusters, including: generating m g-dimensional feature vectors as sample points according to feature data of the historical activities, wherein m is the total amount of the historical activities, and g is the dimension number of the feature data of the historical activities; giving q central points, and respectively calculating the distance from each sample point to the q central points, wherein q is less than or equal to m; marking each sample point as a category corresponding to a central point closest to the sample point; updating the central point in each category as the mean value of all samples belonging to the category; and repeating the processes of marking the sample point classes and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of activity clusters.
2. The method of claim 1, further comprising: and quantitatively scoring the characteristic data of the historical user and the characteristic data of the historical activity, and converting the characteristic data of the historical user and the characteristic data of the historical activity into numerical values of characteristic values.
3. A cluster-based stranger social user matching system, comprising:
a user cluster dividing module, configured to establish a user portrait for each registered user, and store in the user portrait the personal attribute information, user interests, user personality tags and activity history information of each user, wherein the personal attribute information comprises the gender and age information filled in when the user registers, and the activity history information is a record of the historical social activities that the registered user has organized, participated in, liked or followed; convert the gender and age personal attribute information in the user portrait of each historical user into corresponding feature values, convert the user interest tags into feature values corresponding to the sports, gourmet and art categories, identify the category corresponding to each personality tag by using a preset personality tag lexicon, and convert the identifiable personality tags among the user's personality tags into feature values corresponding to the sports, gourmet and art categories according to preset rules; according to the categories of the historical social activities and the number of times the user has initiated, participated in, liked or followed activities, convert the activity history information into feature values of the sports, gourmet and art categories for the registered user, count the times of the historical social activities in which the user participated and the attributes of the companions, and convert them respectively into feature values corresponding to each activity attribute; correspond each historical user to a multi-dimensional feature vector based on the feature data of the historical users, the multi-dimensional feature vector comprising seven dimensions, namely 'user age', 'user gender', 'sports', 'gourmet', 'art', 'historical activity time' and 'historical activity gender situation', each dimension corresponding to a specific numerical value serving as a feature value; and cluster the historical users and divide the historical users into a plurality of user clusters;
an activity cluster dividing module, configured to correspond each historical activity to an eight-dimensional feature vector based on the feature data of the historical activities, wherein the eight-dimensional feature vector comprises eight dimensions, namely 'activity time', 'activity place', 'number of participants', 'average age of participants', 'gender ratio of participants', 'sports', 'gourmet' and 'art', and each dimension corresponds to a specific feature value; and to cluster the historical activities based on their eight-dimensional feature vectors and divide the historical activities into a plurality of activity clusters;
a matching module, configured to use an activity cluster with a highest ratio of users in the user clusters participating in activities in the activity clusters as a matching activity cluster of the user clusters, and determine a correspondence between the user clusters and the activity clusters;
a current user and activity dividing module, configured to divide a current user and a current activity respectively, and determine the user cluster and the activity cluster to which the current user and the current activity respectively belong; calculate the distance between the feature vector corresponding to the current user and the clustering center of each user cluster, and assign the current user to the user cluster whose clustering center is at the minimum distance; and calculate the distance between the feature vector of the current activity and the clustering center of each activity cluster, and assign the current activity to the activity cluster whose clustering center is at the minimum distance;
a pushing module, configured to push, to the current user, a current activity in an activity cluster corresponding to a user cluster to which the current user belongs, based on the correspondence;
wherein, for the feature vectors of historical users and/or the feature vectors of historical activities, a weight vector $(\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k)$ is used to correct the feature vector $(x_1, x_2, x_3, \ldots, x_k)$, where $\alpha_1 + \alpha_2 + \alpha_3 + \cdots + \alpha_k = 1$; the corrected feature vector $(\alpha_1 x_1, \alpha_2 x_2, \alpha_3 x_3, \ldots, \alpha_k x_k)$ is used for subsequent cluster analysis; and the values of $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k$ in the weight vector are adjusted according to the emphasis placed on the various attribute factors when matching social activities for the user;
the clustering of the historical users based on the characteristic data of the historical users, the dividing of the historical users into a plurality of user clusters, comprises: generating n k-dimensional feature vectors as sample points according to feature data of the historical users, wherein n is the total amount of the historical users, and k is the degree of dimension of the feature vectors of the historical users; giving s central points, and respectively calculating the distance from each sample point to the s central points, wherein s is less than or equal to n; marking each sample point as a category corresponding to a central point closest to the sample point; updating the central point in each category as the mean value of all sample points belonging to the category; repeating the processes of marking the sample point classes and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of user clusters;
wherein, the eight-dimensional feature vector based on historical activities clusters historical activities, and divides the historical activities into a plurality of activity clusters, including: generating m g-dimensional feature vectors as sample points according to feature data of the historical activities, wherein m is the total amount of the historical activities, and g is the dimension number of the feature data of the historical activities; giving q central points, and respectively calculating the distance from each sample point to the q central points, wherein q is less than or equal to m; marking each sample point as a category corresponding to a central point closest to the sample point; updating the central point in each category as the mean value of all samples belonging to the category; and repeating the processes of marking the sample point classes and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of activity clusters.
CN201811056510.5A 2018-09-11 2018-09-11 Stranger social user matching method and system based on clustering Active CN109241202B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811056510.5A CN109241202B (en) 2018-09-11 2018-09-11 Stranger social user matching method and system based on clustering

Publications (2)

Publication Number Publication Date
CN109241202A 2019-01-18
CN109241202B 2020-10-16

Family

ID=65060800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811056510.5A Active CN109241202B (en) 2018-09-11 2018-09-11 Stranger social user matching method and system based on clustering

Country Status (1)

Country Link
CN (1) CN109241202B (en)

Also Published As

Publication number Publication date
CN109241202A (en) 2019-01-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant