CN109241202B

CN109241202B - Stranger social user matching method and system based on clustering

Info

Publication number: CN109241202B
Application number: CN201811056510.5A
Authority: CN
Inventors: 陈俊华
Original assignee: Hangzhou Feichi Network Technology Co ltd
Current assignee: Hangzhou Feichi Network Technology Co ltd
Priority date: 2018-09-11
Filing date: 2018-09-11
Publication date: 2020-10-16
Anticipated expiration: 2038-09-11
Also published as: CN109241202A

Abstract

The embodiment of the application provides a stranger social user matching method based on clustering, which comprises the following steps: clustering historical users based on the characteristic data of the historical users, and dividing the historical users into a plurality of user clusters; clustering historical activities based on the characteristic data of the historical activities, and dividing the historical activities into a plurality of activity clusters; matching the user cluster with the activity cluster, and determining the corresponding relation between the user cluster and the activity cluster; dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively; and pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation. According to the clustering-based stranger social contact user matching method, activities meeting the interests and hobbies of the user can be automatically pushed for the user, so that time is saved, user experience is improved, and social contact among strangers is facilitated.

Description

Stranger social user matching method and system based on clustering

Technical Field

The application relates to the technical field of internet application, in particular to a stranger social user matching method and system based on clustering.

Background

Social interaction refers to interpersonal communication within society: people transmit information and exchange ideas through some medium (tool) in order to carry out social activities with a certain purpose. In modern times, changes in the economic and social environment have made interpersonal communication more important, because only by continuously interacting with all kinds of people and exchanging information can people enrich, develop, and grow themselves.

With the development of science and technology and the application of internet resources in daily life, communication between people is increasingly carried out over the internet, and strangers can also socialize through the internet so as to further develop and grow themselves. For example, some internet platforms and services directed at stranger social networking have appeared in the prior art, such as searching for nearby people to chat with online, sending online drift bottles, and the like.

A stranger social platform that has recently appeared in the prior art works as follows: an activity organizer publishes on the platform a social activity (such as a dinner gathering, an outing, or playing games) to be held at a predetermined time and place, and sets the conditions (such as gender and age) that participants must meet; other users can search the platform for social activities that interest them and whose conditions they meet, register online, and then attend at the scheduled place and time as activity participants.

However, both the social activities published on the platform and the user groups they target are large in scale. In the prior art, a user who searches the APP on an intelligent terminal for activities matching personal interests and hobbies generally spends a large amount of time, which is wasteful. Moreover, the prior art only supports searching by simple keywords and screening by fixed, limited conditions such as activity time, location range, and activity type, so the user cannot find the social activities best suited to his or her interests, hobbies, schedule, and the like, which affects user experience. At the same time, a given social activity does not easily gather its most suitable participants, which is not conducive to normal social interaction among strangers.

Disclosure of Invention

In view of the above, an object of the present invention is to provide a stranger social user matching method and system based on clustering, so as to solve the technical problems in the prior art that users spend a large amount of time searching for social activities matching their interests and hobbies and that social activities do not easily gather their most suitable participants.

In view of the above, in a first aspect of the present application, a method for matching social users of strangers based on clustering is provided, including:

clustering historical users based on the characteristic data of the historical users, and dividing the historical users into a plurality of user clusters;

clustering historical activities based on the characteristic data of the historical activities, and dividing the historical activities into a plurality of activity clusters;

matching the user cluster with the activity cluster, and determining the corresponding relation between the user cluster and the activity cluster;

dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively;

and pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation between the user cluster and the activity cluster.

In some embodiments, the clustering the historical users based on the characteristic data of the historical users, and the dividing the historical users into a plurality of user clusters includes:

generating n k-dimensional feature vectors as sample points according to feature data of the historical users, wherein n is the total number of the historical users, and k is the number of dimensions of the feature vectors of the historical users;

giving s central points, and respectively calculating the distance from each sample point to the s central points, wherein s is less than or equal to n;

marking each sample point as a category corresponding to a central point closest to the sample point;

updating the central point in each category as the mean value of all sample points belonging to the category;

and repeating the processes of marking the sample point categories and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of user clusters.

In some embodiments, the historical user profile data includes:

personal information of the user, personal tag categories of the user, and activity history information of the user.

In some embodiments, the activity history information of the user includes:

the number of times the user has participated in, liked, and followed activities, and attribute information of the activities the user has participated in, liked, and followed.

In some embodiments, the clustering historical activities based on the characteristic data of the historical activities, the dividing the historical activities into a plurality of activity clusters, includes:

generating m g-dimensional feature vectors as sample points according to feature data of the historical activities, wherein m is the total amount of the historical activities, and g is the dimension number of the feature data of the historical activities;

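giving q central points, and respectively calculating the distance from each sample point to the q central points, wherein q is less than or equal to m;

marking each sample point as a category corresponding to a central point closest to the sample point;

updating the central point in each category as the mean value of all sample points belonging to the category;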

and repeating the processes of marking the sample point classes and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of activity clusters.

In some embodiments, the characteristic data of the historical activities includes:

activity time, activity location, activity person attributes, and activity category.

In some embodiments, the method further includes:

quantitatively scoring the characteristic data of the historical users and the characteristic data of the historical activities, thereby converting them into numerical feature values.

In some embodiments, the method further includes:

correcting the feature vector (x₁, x₂, x₃, …, x_k) of a historical user and/or historical activity using a weight vector (α₁, α₂, α₃, …, α_k), where α₁ + α₂ + α₃ + … + α_k = 1, and the values of α₁, α₂, α₃, …, α_k may be set according to the factors emphasized when matching users with social activities.

In some embodiments, the matching the user cluster and the activity cluster to determine the correspondence between the user cluster and the activity cluster specifically includes:

taking, as the matching activity cluster of a user cluster, the activity cluster in whose activities the highest proportion of the users in that user cluster have participated, and thereby determining the corresponding relation between the user cluster and the activity cluster.
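
The matching rule above can be sketched as follows. This is a minimal illustration under one possible reading of "ratio": the share of a user cluster's members who participated in at least one activity of a given activity cluster. The function name `match_clusters` and the input shapes are assumptions for illustration and do not come from the application.

```python
from collections import defaultdict


def match_clusters(participations, user_cluster_of, activity_cluster_of):
    """Map each user cluster to its best-matching activity cluster.

    participations: iterable of (user_id, activity_id) historical records.
    user_cluster_of / activity_cluster_of: dicts from id to cluster label.
    (All names and shapes here are hypothetical.)
    """
    # Members of each user cluster, used as the denominator of the ratio.
    members = defaultdict(set)
    for user_id, cluster in user_cluster_of.items():
        members[cluster].add(user_id)

    # Distinct users of each user cluster seen in each activity cluster.
    participants = defaultdict(lambda: defaultdict(set))
    for user_id, activity_id in participations:
        u = user_cluster_of[user_id]
        a = activity_cluster_of[activity_id]
        participants[u][a].add(user_id)

    correspondence = {}
    for u, by_activity_cluster in participants.items():
        size = len(members[u])
        ratios = {a: len(users) / size for a, users in by_activity_cluster.items()}
        # Keep the activity cluster with the highest participation ratio.
        correspondence[u] = max(ratios, key=ratios.get)
    return correspondence
```

A user cluster with no recorded participation gets no entry in the result; how such clusters should be handled is not specified in the text.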

In another aspect of the present application, there is provided a system for matching stranger social users based on clustering, including:

the user cluster dividing module is used for clustering the historical users based on the characteristic data of the historical users and dividing the historical users into a plurality of user clusters;

the activity cluster dividing module is used for clustering historical activities based on the characteristic data of the historical activities and dividing the historical activities into a plurality of activity clusters;

the matching module is used for matching the user cluster with the activity cluster and determining the corresponding relation between the user cluster and the activity cluster;

the current user and activity dividing module is used for dividing a current user and a current activity respectively and determining the user cluster and the activity cluster to which the current user and the current activity respectively belong;

and the pushing module is used for pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation.

The embodiment of the application provides a clustering-based stranger social contact user matching method and system, which can automatically push activities meeting the interests and hobbies of users for the users, save the time for the users to search social contact activities suitable for the users, increase the satisfaction degree of the social contact activities, improve the user experience, and are beneficial to normal social contact among strangers.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is a flow chart of a stranger social user matching method based on clustering according to a first embodiment of the present application;

FIG. 2 is a flow chart of a stranger social user matching method based on clustering according to a second embodiment of the present application;

FIG. 3 is a schematic structural diagram of a stranger social user matching system based on clustering according to a third embodiment of the present application.

Detailed Description

The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.

As an embodiment of the present application, FIG. 1 is a flowchart of a stranger social user matching method based on clustering. The method comprehensively collects the personal attributes a user exhibits in various respects and integrates them into a feature vector corresponding to the user; likewise, it forms a feature vector reflecting the attributes of each social activity from the activity itself and its related factors; it then determines the social activities that match the user's personal attributes by means of clustering and historical data matching, and pushes them to the user.

As can be seen from fig. 1, the clustering-based stranger social user matching method of the present embodiment may include the following steps:

S101: clustering the historical users based on the characteristic data of the historical users, and dividing the historical users into a plurality of user clusters.

In the clustering-based stranger social user matching method of this embodiment, a user can socialize with strangers by registering an account in a social APP (application software) installed on an intelligent terminal such as a smartphone. Specifically, the user can initiate a social activity through the APP and publish the activity information on the social APP platform, so that other users can obtain the activity information and choose to register for the activity, thereby achieving social interaction among strangers. Of course, the user may also register to participate in social activities initiated by other users.

Using big data technology in the prior art, a user portrait can be established for each registered user of the social APP platform. The user portrait stores the personal attributes of the user, including personal information filled in at registration, such as gender and age, and user tags. User tags may be added by the user or by the user's social friends, and reflect the user's interests and personality; for example, a user may be associated with interest tags such as "sports", "gourmet", and "art", and/or with personality tags such as "music enthusiast", "foodie", and "young soccer player". In addition, the invention refers to registered users who have organized or registered for social activities on the social APP platform as historical users, and the user portrait also keeps an activity history of the historical social activities that a historical user has organized or participated in. Furthermore, the activity history may also record historical social activities in which the user did not participate but with which the user interacted, for example by liking or following the activity.

In step S101, historical users who have participated in historical activities are clustered and divided into a plurality of user clusters. Specifically, each user may correspond to a multi-dimensional feature vector, and the user feature data corresponding to the dimensions of the feature vector may include the following.

(1) Feature values converted from personal information in the user portrait, such as gender and age. Taking 0-100 points as the value range of a feature value, the gender feature value of a male user can be specified as 100 and that of a female user as 0. User ages can be converted into corresponding feature values according to the distribution of user ages in the range of 0-100 years (the age feature value of any user aged 100 or older is 100).

(2) Feature values of the corresponding categories, such as sports, gourmet, and art, converted from the interest tags in the user portrait. Moreover, the category corresponding to a personality tag can be identified using a preset personality tag lexicon, and each recognizable personality tag of the user is converted into a feature value of the corresponding category according to a preset rule; for example, "music enthusiast" is converted into a feature value of the "art" category, "foodie" into a feature value of the "gourmet" category, and "young soccer player" into a feature value of the "sports" category. Taking 0-100 points as the value range, if an interest tag or personality tag of the user corresponds to a certain category, 20 points are added to the feature value of that category; for example, if the interest tag of a user is "sports" and the personality tag is "foodie", 20 points are added to the user's feature values for each of the two categories "sports" and "gourmet".

(3) Feature values converted from the historical social activities recorded in the user's activity history. As described above, historical social activities include activities initiated or participated in by the user, activities liked by the user, and activities followed by the user. First, feature values of the corresponding categories can be converted from the category of the historical social activities and the number of interactions such as participating, liking, and following. For example, the number of times the user has participated in, liked, or followed historical social activities of the dinner-party type is converted into the feature value of the "gourmet" category; the number of times for activities such as football matches and running is converted into the feature value of the "sports" category; and the number of times for activities such as watching movies and listening to music is converted into the feature value of the "art" category. If the number of times a user has participated in, liked, or followed historical social activities of the sports category is greater than or equal to 10, 20 points are added to the user's "sports" feature value.

Meanwhile, attributes such as the time, place, and partner gender of the historical social activities in which the user participated can be counted, for example the time period in which the activities were held and the gender makeup of the activity partners (such as the percentage of same-sex partners), and converted into feature values corresponding to each activity attribute. For example, depending on whether the historical social activities in which a user participated were mostly held in the morning, afternoon, or evening, the corresponding feature value is set to 0, 50, or 100 respectively: if most were held in the evening, the feature value is 100; if most were held in the afternoon, it is 50. Likewise, the proportion of same-sex partners in the historical social activities in which a user participated is counted and converted into the feature value of the same-sex partner proportion.

Based on the above feature data, each historical user may be associated with a multi-dimensional feature vector. For example, each historical user is associated with a seven-dimensional feature vector covering the seven feature dimensions "user age", "user gender", "sports", "gourmet", "art", "historical activity time", and "historical activity gender situation". Each dimension corresponds to a specific numerical value as its feature value; for example, the feature vector of user A may be (x₁, x₂, x₃, x₄, x₅, x₆, x₇). In this embodiment, the quantized value in each dimension may range from 0 to 100. By clustering the feature vectors of all historical users, the historical users can be divided into a plurality of user clusters such that the distances between the feature vectors of users in the same user cluster are minimized, that is, users in the same user cluster have similar interests and hobbies. For the specific clustering algorithm, refer to the second embodiment below; it is not described here.
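
As a hedged illustration of the scoring rules described for step S101, the sketch below builds the seven-dimensional vector for one user. The function name, argument names, and input shapes are hypothetical. Two mappings are assumptions because the text does not fix them: age is used directly as its score, and the same-sex partner proportion is scaled linearly to 0-100.

```python
CATEGORIES = ("sports", "gourmet", "art")
TIME_SCORES = {"morning": 0, "afternoon": 50, "evening": 100}


def user_feature_vector(age, gender, tag_categories, interaction_counts,
                        usual_time, same_sex_ratio):
    """Return (age, gender, sports, gourmet, art, activity time, same-sex share).

    tag_categories: list of categories that the user's interest and
        personality tags were mapped to, e.g. ["sports", "gourmet"].
    interaction_counts: dict from category to the number of historical
        activities participated in, liked, or followed.
    """
    age_score = min(age, 100)                      # 100 or older maps to 100
    gender_score = 100 if gender == "male" else 0  # convention from the text
    interest_scores = []
    for category in CATEGORIES:
        score = 20 * tag_categories.count(category)    # 20 points per tag
        if interaction_counts.get(category, 0) >= 10:  # 10+ interactions: +20
            score += 20
        interest_scores.append(min(score, 100))        # stay within 0-100
    time_score = TIME_SCORES[usual_time]
    partner_score = round(100 * same_sex_ratio)        # assumed linear mapping
    return (age_score, gender_score, *interest_scores, time_score, partner_score)
```

For instance, a 25-year-old male user tagged "sports" and "foodie" (mapped to "gourmet"), with 12 sports interactions, mostly evening activities, and half same-sex partners, would receive 40 points for sports and 20 for gourmet under these assumptions.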

S102: clustering the historical activities based on the characteristic data of the historical activities, and dividing the historical activities into a plurality of activity clusters.

In this embodiment, the historical activities may likewise be clustered using big data techniques and divided into a plurality of activity clusters. Specifically, each historical activity may correspond to a multi-dimensional feature vector, whose dimensions may include an activity time dimension, one or more dimensions representing attributes of the people in the activity (including the number of people and the age and gender attributes of the activity initiator and activity participants), and dimensions corresponding to the various activity categories (for example, dimensions corresponding to categories such as sports, gourmet, and art).

For example, based on the activity feature data, each historical activity may be associated with an eight-dimensional feature vector comprising the eight dimensions "activity time", "activity location", "number of activity participants", "average age of activity participants", "gender ratio of activity participants", "sports", "gourmet", and "art". Each dimension corresponds to a specific feature value; for example, the feature vector of activity B may be (y₁, y₂, y₃, y₄, y₅, y₆, y₇, y₈). By clustering the feature vectors of the historical activities, the historical activities can be divided into a plurality of activity clusters such that the distances between the feature vectors of activities in the same activity cluster are minimized.

S103: matching the user cluster with the activity cluster, and determining the corresponding relation between the user cluster and the activity cluster.

In this embodiment, after dividing the historical user and the historical activity into a plurality of user clusters and activity clusters, the user clusters and the activity clusters may be matched to determine the corresponding relationship between the user clusters and the activity clusters.

S104: dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively.

When a user needs to obtain activity information (this user is the current user), current activity information is acquired; in this embodiment, a current activity is an activity in the recruitment stage, that is, an activity yet to be held. The current user and the current activities are matched against the user clusters and activity clusters generated in the above steps to determine the user cluster to which the current user belongs and the activity cluster to which each current activity belongs.

In this embodiment, the distance between the feature vector of the current user and the cluster center of each user cluster may be calculated, and the current user is assigned to the user cluster whose cluster center is closest.
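
The nearest-center assignment just described can be sketched in a few lines. The function name `assign_to_cluster` is hypothetical, and Euclidean distance is assumed, in line with the distance formula of the second embodiment.

```python
import math


def assign_to_cluster(vector, centers):
    """Return the index of the cluster center closest to `vector`."""
    return min(range(len(centers)), key=lambda i: math.dist(vector, centers[i]))
```

The same function could be applied to a current activity and the activity cluster centers, provided the vector and the centers share the same dimensions.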

S105: pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation between the user cluster and the activity cluster.

After the current user and the current activities are assigned to clusters, the current activities in the activity cluster corresponding to the user cluster to which the current user belongs may be pushed to the current user according to the correspondence between user clusters and activity clusters determined in step S103. Pushing may be done either actively, with the social APP platform sending introduction information and a registration link for the current activity to the current user, or in response to a request from the current user. Of course, if the activity cluster corresponding to the current user's user cluster contains too many current activities, an appropriate number of activities may be selected and pushed to the current user according to additional screening conditions, such as time and distance from the user.
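
A minimal sketch of the push step follows, assuming the correspondence from step S103 is available as a dictionary and each current activity is a dictionary with the hypothetical keys `id`, `cluster`, and `distance_km`. Sorting by distance is only one example of the additional screening conditions mentioned above.

```python
def select_activities_to_push(user_cluster, current_activities, correspondence,
                              limit=10):
    """Pick up to `limit` current activities from the matching activity cluster."""
    target_cluster = correspondence.get(user_cluster)
    candidates = [a for a in current_activities if a["cluster"] == target_cluster]
    # Additional screening when there are many candidates: nearest first.
    candidates.sort(key=lambda a: a["distance_km"])
    return [a["id"] for a in candidates[:limit]]
```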

According to the clustering-based stranger social contact user matching method, social contact activities meeting various characteristic attributes such as user interests and hobbies can be automatically pushed to the user based on clustering analysis of historical data of the user and the social contact activities, so that the time for the user to search for the social contact activities suitable for the user is saved, the satisfaction degree of the social contact activities is increased, the user experience is improved, and normal social contact among strangers is facilitated.

Fig. 2 is a flowchart of a stranger social user matching method based on clustering according to a second embodiment of the present application. The method of this embodiment also includes steps S101-S105, which are not described herein again. Fig. 2 shows a process of clustering historical users in step S101 in the clustering-based stranger social user matching method, which specifically includes the following steps:

S201: generating n k-dimensional feature vectors as sample points according to the feature data of the historical users, wherein n is the total number of the historical users, and k is the number of dimensions of the feature vectors of the historical users.

As can be seen from the first embodiment, each historical user may correspond to a feature vector, for example the seven-dimensional feature vector of the first embodiment, and the feature vector of each historical user is used as a sample point. When the total number of historical users acquired based on big data technology is n, the number of sample points is n; k is the total number of dimensions of the feature data of the historical users, that is, the dimension of the feature vector.

For example, in the present embodiment, the 7 dimensions of the feature vector of the user a are respectively ("user age", "user gender", "sports", "gourmet", "art", "historical activity time", and "historical activity gender situation"), the quantized values of the corresponding feature data are respectively (20, 100, 60, 40, 75, 50, 50), and the range of the quantized values in the present embodiment may be 0 to 100.

S202: given s center points, the distance between each sample point and the s center points is respectively calculated, wherein s is smaller than or equal to n.

This embodiment may use the K-Means algorithm for clustering. When clustering starts, central points need to be given; the number of central points equals the number of user clusters to be produced, and this embodiment takes s central points as an example to explain the technical scheme of the application. The dimension of a central point is the same as that of a sample point, and the value of each dimension of an initial central point can be given arbitrarily, for example (50, 50, 50, 50, 50, 50, 50); further examples are not listed here. After the central points are given, the distance from each sample point to each central point can be calculated. Continuing the example above, the distance from the sample point (20, 100, 60, 40, 75, 50, 50) to the central point (50, 0, 50, 50, 50, 0, 50) is

d = ((20-50)² + (100-0)² + (60-50)² + (40-50)² + (75-50)² + (50-0)² + (50-50)²)^½ ≈ 119.27.
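
The distance in this example can be checked numerically. The sketch below assumes the central point (50, 0, 50, 50, 50, 0, 50), which is the one implied by the seven terms of the formula; the squared terms sum to 14225.

```python
import math

sample = (20, 100, 60, 40, 75, 50, 50)
center = (50, 0, 50, 50, 50, 0, 50)  # center implied by the terms of the formula

# Euclidean distance: square root of the sum of squared differences (14225).
d = math.dist(sample, center)
print(round(d, 2))  # 119.27
```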

S203: each sample point is labeled as the category corresponding to its closest center point.

After calculating the distance of each sample point to the center point, each sample point is labeled as the category corresponding to the center point with the closest distance thereto.

S204: the center point in each class is updated to be the mean of all samples belonging to that class.

And after the first classification is carried out on all the sample points, updating the characteristic values of all the dimensions of the central point in each class into the mean value of the characteristic values of the characteristic vectors of all the sample points belonging to the class in all the dimensions.

S205: and judging whether the sum of the distances from all the sample points to the center points to which the sample points belong is minimum or not.

S206: and when the sum of the distances between all the sample points and the center points to which the sample points belong is minimum, generating a plurality of user clusters, namely taking the classification result at the moment as a final classification result, and taking each classification result as one user cluster.

If the sum of the distances from all the sample points to the center points to which the sample points belong is not the minimum, the above steps S203 to S205 are repeated until the sum of the distances from all the sample points to the center points to which the sample points belong is the minimum.
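
Steps S202 to S206 can be sketched as the loop below. This is an illustrative pure-Python version, not the applicant's implementation. The name `k_means` and the `max_iterations` cap are assumptions, and the loop stops when the labels no longer change, a common practical stand-in for "the sum of the distances is minimum" (it reaches a local rather than a guaranteed global minimum).

```python
import math


def k_means(samples, centers, max_iterations=100):
    """Cluster `samples` starting from the given initial `centers`.

    Returns (labels, centers), where labels[i] is the index of the
    center that sample i was assigned to.
    """
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iterations):
        # S202/S203: label each sample with the index of its closest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
            for x in samples
        ]
        # S205/S206: stop once the assignment is stable.
        if new_labels == labels:
            break
        labels = new_labels
        # S204: move each center to the mean of the samples assigned to it.
        for j in range(len(centers)):
            members = [x for x, label in zip(samples, labels) if label == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers
```

The result depends on the initial central points, so in practice the loop is often run from several different starting points and the best result kept.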

The clustering-based stranger social user matching method of this embodiment achieves technical effects similar to those of the first embodiment, which are not repeated here.

The third embodiment of the application provides a stranger social user matching method based on clustering. The method of this embodiment also includes steps S101-S105, which are not described herein again. In the clustering-based stranger social user matching method, in step S102, historical activities are clustered based on feature data of the historical activities, and the historical activities are divided into a plurality of activity clusters, including:

generating m g-dimensional feature vectors as sample points according to feature data of the historical activities, wherein m is the total amount of the historical activities, and g is the dimension number of the feature vectors of the historical activities; for example, in one embodiment, an eight-dimensional feature vector is established for each historical activity.

giving q central points, and respectively calculating the distance from each sample point to the q central points, wherein q is less than or equal to m and corresponds to the number of activity clusters into which the historical activities are expected to be divided; marking each sample point as belonging to the category of the central point closest to it;

updating the characteristic value of each dimension of the characteristic vector of the central point in each category to be the mean value of the characteristic values of the characteristic vectors of all sample points belonging to the category in the dimension;

and repeating the process of marking the corresponding categories of the sample points and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of activity clusters.
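In symbols, the steps above can be written as follows. The notation is introduced here for illustration only and does not appear in the original text: x_i is the feature vector of the i-th historical activity, c_j is the j-th central point, a(i) is the category assigned to sample point i, S_j is the set of sample points in category j, and J is the sum of distances. Euclidean distance is assumed, since the text does not name a metric.

```latex
a(i) = \arg\min_{1 \le j \le q} \lVert x_i - c_j \rVert ,
\qquad
c_j = \frac{1}{\lvert S_j \rvert} \sum_{i \in S_j} x_i ,
\quad S_j = \{\, i : a(i) = j \,\} ,
\qquad
J = \sum_{i=1}^{m} \lVert x_i - c_{a(i)} \rVert
```

Strictly speaking, the mean update minimizes the sum of squared Euclidean distances within each category, so in practice the stopping condition is usually taken to be the point at which no assignment a(i) changes between iterations.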

As a further alternative to embodiments 1-3 above, the method further includes:

for the feature vectors of historical users and/or the feature vectors of historical activities, a weight vector $(\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k)$ is used to correct the feature vector $(x_1, x_2, x_3, \ldots, x_k)$, where $\alpha_1 + \alpha_2 + \alpha_3 + \cdots + \alpha_k = 1$; the corrected feature vector $(\alpha_1 x_1, \alpha_2 x_2, \alpha_3 x_3, \ldots, \alpha_k x_k)$ is used for the subsequent cluster analysis. The values of the weights $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k$ can be set for a specific user or determined from empirical values, and can be adjusted according to the emphasis placed on the various attribute factors when matching users with social activities. For example, if users are preferentially matched with social activities on the principle of consistent interests, a higher weight may be set for the dimensions representing interests, such as "sports", "gourmet" and "art", than for the other dimensions; if users are preferentially matched with social activities based on the desired age and gender of companions, a higher weight may be set for the dimensions representing age and gender ratio.
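The weighting correction amounts to an element-wise product of the weight vector and the feature vector. The sketch below illustrates it under stated assumptions: the function name `apply_weights`, the tolerance `tol`, and all numeric values are hypothetical and chosen only for the example.

```python
def apply_weights(features, weights, tol=1e-9):
    """Return the corrected feature vector (a1*x1, ..., ak*xk) (hypothetical helper)."""
    if len(features) != len(weights):
        raise ValueError("feature vector and weight vector must have the same length")
    # The text requires the weights to sum to 1.
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return [a * x for a, x in zip(weights, features)]


# Made-up seven-dimensional user vector: age, sex, sports, food, art,
# historical activity time, historical activity sex situation.
user = [0.3, 1.0, 0.8, 0.2, 0.5, 0.6, 0.4]
# Made-up weights that emphasize the three interest dimensions.
weights = [0.05, 0.05, 0.25, 0.25, 0.25, 0.10, 0.05]
corrected = apply_weights(user, weights)
```

Under a Euclidean distance, a difference in dimension i is scaled by its weight after this correction, so dimensions with larger weights have more influence on which cluster a sample point falls into.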

In embodiments 1-3 above, matching the user cluster with the activity cluster in step S103 to determine the correspondence between the user cluster and the activity cluster specifically includes:

taking, as the matching activity cluster of a user cluster, the activity cluster in which the users of that user cluster have participated in the highest proportion of activities, and thereby determining the correspondence between the user cluster and the activity cluster. For example, if users in the user cluster have participated in 65% of all activities in activity cluster C and in 70% of all activities in activity cluster D, the activity cluster matched with the user cluster is activity cluster D.
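The matching rule can be sketched as follows. The sketch assumes, following the 65% / 70% example, that the ratio for an activity cluster is the share of its activities in which at least one user of the user cluster has participated; the function name `match_activity_cluster` and the data layout (sets of identifiers) are hypothetical.

```python
def match_activity_cluster(user_cluster, activity_clusters, participation):
    """Pick the activity cluster with the highest participation ratio.

    user_cluster:      set of user ids
    activity_clusters: dict mapping cluster name -> set of activity ids
    participation:     dict mapping user id -> set of activity ids attended
    """
    # All activities attended by any user in the user cluster.
    attended = set()
    for user in user_cluster:
        attended |= participation.get(user, set())
    # Share of each activity cluster's activities that these users attended.
    ratios = {
        name: len(acts & attended) / len(acts)
        for name, acts in activity_clusters.items()
        if acts  # skip empty activity clusters
    }
    best = max(ratios, key=ratios.get)
    return best, ratios


# Made-up data reproducing the 65% / 70% example from the text.
clusters = {"C": set(range(1, 21)), "D": set(range(21, 41))}
history = {"u1": set(range(1, 14)), "u2": set(range(21, 35))}
best, ratios = match_activity_cluster({"u1", "u2"}, clusters, history)
```

Here the users have attended 13 of the 20 activities in C and 14 of the 20 in D, so D is selected as the matching activity cluster.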

Fig. 3 is a schematic structural diagram of a stranger social user matching system based on clustering according to a fourth embodiment of the present application. The stranger social user matching system based on clustering of this embodiment includes:

the user cluster dividing module 301 is configured to cluster historical users based on feature data of the historical users, and divide the historical users into a plurality of user clusters;

an activity cluster dividing module 302, configured to cluster historical activities based on feature data of the historical activities, and divide the historical activities into a plurality of activity clusters;

a matching module 303, configured to match the user cluster with the activity cluster, and determine a correspondence between the user cluster and the activity cluster;

a current user and activity dividing module 304, configured to divide a current user and a current activity respectively, and determine a user cluster and an activity cluster to which the current user and the current activity belong respectively;

a pushing module 305, configured to push, to the current user, a current activity in an activity cluster corresponding to a user cluster to which the current user belongs, based on the correspondence.
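How modules 304 and 305 might cooperate can be sketched as follows, using the nearest-cluster-center rule stated in the claims for dividing the current user and the current activity. The function names, the dictionary-based data layout, the Euclidean distance, and all values are assumptions made for illustration.

```python
import math


def nearest_cluster(vector, centers):
    """Return the index of the cluster center closest to `vector`."""
    return min(range(len(centers)), key=lambda j: math.dist(vector, centers[j]))


def push_activities(user_vec, user_centers, current_activities,
                    activity_centers, correspondence):
    """Return ids of current activities to push to the current user.

    current_activities: dict mapping activity id -> feature vector
    correspondence:     dict mapping user cluster index -> activity cluster index
    """
    # Module 304: assign the current user to the nearest user cluster.
    user_cluster = nearest_cluster(user_vec, user_centers)
    target = correspondence[user_cluster]
    # Module 304 again for each current activity, then module 305 keeps
    # the activities that fall in the matched activity cluster.
    return [
        activity_id
        for activity_id, vec in current_activities.items()
        if nearest_cluster(vec, activity_centers) == target
    ]


# Made-up centers, correspondence, and current activities.
user_centers = [[0, 0], [10, 10]]
activity_centers = [[0, 0, 0], [5, 5, 5]]
correspondence = {0: 1, 1: 0}
activities = {"hike": [4, 5, 6], "dinner": [0, 1, 0]}
pushed = push_activities([1, 1], user_centers, activities,
                         activity_centers, correspondence)
```

In this made-up example the current user falls in user cluster 0, which corresponds to activity cluster 1, so only the "hike" activity is pushed.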

Therefore, based on cluster analysis of the historical data of users and social activities, the clustering-based stranger social user matching method can automatically push to a user social activities that fit characteristic attributes such as the user's interests and hobbies. This saves the time the user would otherwise spend searching for suitable social activities, increases satisfaction with the social activities, improves the user experience, and facilitates social interaction among strangers.

The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims

1. A stranger social user matching method based on clustering is characterized by comprising the following steps:

establishing a user portrait for each registered user, and storing personal attribute information, user interests, user personality labels and activity history information of each user in the user portrait, wherein the personal attribute information comprises gender and age information filled in when the user registers, and the activity history information is an activity history record of historical social activities organized and participated by the registered user and having praise or concern behaviors;

converting the sex and age personal attribute information of the user portrait of the historical user into corresponding characteristic values; converting the user interest tags into characteristic values corresponding to sports, gourmet and art categories, identifying the categories corresponding to the personalized tags by using a preset personalized tag lexicon, and converting identifiable personalized tags in the user personalized tags into the characteristic values corresponding to the sports, gourmet and art categories according to preset rules; according to the category of the historical social activities and the times of activities initiated, participated, liked or concerned by the user, converting the activity history information of the activities initiated, participated, liked or concerned by the user into characteristic values of sports, gourmet and art categories corresponding to the registered user, counting the time of the historical social activities participated by the user and the attribute conditions of peers, and respectively converting the time and the attribute conditions of the peers into the characteristic values corresponding to the attributes of each activity;

corresponding each historical user to a multi-dimensional feature vector based on the feature data of the historical users; the multidimensional feature vector comprises seven feature vector dimensions of 'user age', 'user sex', 'sports', 'food', 'art', 'historical activity time' and 'historical activity sex situation', and each feature vector dimension corresponds to a specific numerical value serving as a feature value;

clustering historical users, and dividing the historical users into a plurality of user clusters;

based on the feature data of the historical activities, each historical activity corresponds to an eight-dimensional feature vector, the eight-dimensional feature vector comprising the eight dimensions 'activity time', 'activity place', 'number of activity participants', 'average age of activity participants', 'sex proportion of activity participants', 'sports', 'food' and 'art', each feature vector dimension corresponding to a specific feature value; the historical activities are clustered based on the eight-dimensional feature vectors of the historical activities and divided into a plurality of activity clusters;

taking an activity cluster with the highest ratio of the users in the user cluster participating in activities in the activity cluster as a matching activity cluster of the user cluster, and determining the corresponding relation between the user cluster and the activity cluster;

dividing a current user and a current activity respectively, and determining a user cluster and an activity cluster to which the current user and the current activity belong respectively; calculating the distance between the feature vector corresponding to the current user and the clustering center of each user cluster, and dividing the current user into user clusters with the minimum distance from the clustering center; calculating the distance between the feature vector of the current activity and the clustering center of each activity cluster, and dividing the current activity into activity clusters with the minimum distance from the clustering center;

pushing the current activity in the activity cluster corresponding to the user cluster to which the current user belongs to the current user based on the corresponding relation between the user cluster and the activity cluster;

wherein, for the feature vectors of historical users and/or the feature vectors of historical activities, a weight vector $(\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k)$ is used to correct the feature vector $(x_1, x_2, x_3, \ldots, x_k)$, where $\alpha_1 + \alpha_2 + \alpha_3 + \cdots + \alpha_k = 1$, and the corrected feature vector $(\alpha_1 x_1, \alpha_2 x_2, \alpha_3 x_3, \ldots, \alpha_k x_k)$ is used for the subsequent cluster analysis; the values of $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_k$ in the weight vector are adjusted according to the emphasis placed on the various attribute factors when matching social activities for the user;

wherein clustering the historical users based on the feature data of the historical users and dividing the historical users into a plurality of user clusters comprises: generating n k-dimensional feature vectors as sample points according to the feature data of the historical users, wherein n is the total number of historical users and k is the number of dimensions of the feature vectors of the historical users; giving s central points, and respectively calculating the distance from each sample point to the s central points, wherein s is less than or equal to n; marking each sample point as belonging to the category of the central point closest to the sample point; updating the central point of each category to the mean value of all sample points belonging to the category; repeating the processes of marking the sample point categories and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of user clusters;

wherein clustering the historical activities based on the eight-dimensional feature vectors of the historical activities and dividing the historical activities into a plurality of activity clusters comprises: generating m g-dimensional feature vectors as sample points according to the feature data of the historical activities, wherein m is the total number of historical activities and g is the number of dimensions of the feature vectors of the historical activities; giving q central points, and respectively calculating the distance from each sample point to the q central points, wherein q is less than or equal to m; marking each sample point as belonging to the category of the central point closest to the sample point; updating the central point of each category to the mean value of all sample points belonging to the category; and repeating the processes of marking the sample point categories and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of activity clusters.

2. The method of claim 1, further comprising: and quantitatively scoring the characteristic data of the historical user and the characteristic data of the historical activity, and converting the characteristic data of the historical user and the characteristic data of the historical activity into numerical values of characteristic values.

3. A cluster-based stranger social user matching system, comprising:

the user cluster dividing module is used for establishing a user portrait for each registered user, and storing personal attribute information, user interests, user personality labels and activity history information of each user in the user portrait, wherein the personal attribute information comprises gender and age information filled in when the user registers, and the activity history information is an activity history record of historical social activities organized and participated by the registered user and having praise or concern behaviors; converting the sex and age personal attribute information of the user portrait of the historical user into corresponding characteristic values, converting the interest tags of the user into characteristic values corresponding to sports, gourmet and art categories, identifying the categories corresponding to the personalized tags by utilizing a preset personalized tag lexicon, and converting identifiable personalized tags in the personalized tags of the user into the characteristic values corresponding to the sports, gourmet and art categories according to preset rules; according to the category of the historical social activities and the times of activities initiated, participated, liked or concerned by the user, converting the activity history information of the activities initiated, participated, liked or concerned by the user into characteristic values of sports, gourmet and art categories corresponding to the registered user, counting the time of the historical social activities participated by the user and the attribute conditions of peers, and respectively converting the time and the attribute conditions of the peers into the characteristic values corresponding to the attributes of each activity; corresponding each historical user to a multi-dimensional feature vector based on the feature data of the historical users; the multidimensional feature vector comprises seven feature vector dimensions of 'user age', 'user sex', 'sports', 'food', 'art', 'historical activity time' and 'historical activity sex situation', and each feature vector dimension corresponds to a specific numerical value serving as a feature value; clustering historical users, and dividing the historical users into a plurality of user clusters;

the activity cluster dividing module is used for corresponding each historical activity to an eight-dimensional feature vector based on feature data of the historical activity, wherein the eight-dimensional feature vector comprises the eight dimensions 'activity time', 'activity place', 'number of activity participants', 'average age of activity participants', 'sex proportion of activity participants', 'sports', 'food' and 'art', each feature vector dimension corresponds to a specific feature value, the historical activities are clustered based on the eight-dimensional feature vectors of the historical activities, and the historical activities are divided into a plurality of activity clusters;

a matching module, configured to use an activity cluster with a highest ratio of users in the user clusters participating in activities in the activity clusters as a matching activity cluster of the user clusters, and determine a correspondence between the user clusters and the activity clusters;

a current user and activity dividing module, configured to divide a current user and a current activity respectively, and determine the user cluster and the activity cluster to which the current user and the current activity respectively belong; calculating the distance between the feature vector corresponding to the current user and the clustering center of each user cluster, and dividing the current user into the user cluster with the minimum distance from the clustering center; calculating the distance between the feature vector of the current activity and the clustering center of each activity cluster, and dividing the current activity into the activity cluster with the minimum distance from the clustering center;

a pushing module, configured to push, to the current user, a current activity in an activity cluster corresponding to a user cluster to which the current user belongs, based on the correspondence;

wherein clustering the historical activities based on the eight-dimensional feature vectors of the historical activities and dividing the historical activities into a plurality of activity clusters comprises: generating m g-dimensional feature vectors as sample points according to the feature data of the historical activities, wherein m is the total number of historical activities and g is the number of dimensions of the feature vectors of the historical activities; giving q central points, and respectively calculating the distance from each sample point to the q central points, wherein q is less than or equal to m; marking each sample point as belonging to the category of the central point closest to the sample point; updating the central point of each category to the mean value of all sample points belonging to the category; and repeating the processes of marking the sample point categories and updating the central points until the sum of the distances from all the sample points to the central points to which the sample points belong is minimum, and generating a plurality of activity clusters.