CN113762298A - Similar population expansion method and device - Google Patents

Similar population expansion method and device

Info

Publication number
CN113762298A
CN113762298A (application number CN202010580213.1A; granted as CN113762298B)
Authority
CN
China
Prior art keywords
user
model
data
classification
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010580213.1A
Other languages
Chinese (zh)
Other versions
CN113762298B (en)
Inventor
Xie Hongbin (谢宏斌)
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority claimed from CN202010580213.1A
Publication of CN113762298A
Application granted
Publication of CN113762298B
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2155 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting, characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a similar population expansion method and device, relating to the field of computer technology. One embodiment of the method comprises: acquiring a specific user corresponding to a target object, and generating feature data of the specific user according to basic attribute data and behavior data of the specific user, wherein the specific user comprises: a seed user and a target user; training an algorithm model of an adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user; and determining, based on the algorithm model, the extensible user corresponding to the target object from the target users according to the feature data of the target user. This embodiment performs semi-supervised training on the specific user based on the adversarial network learning algorithm, avoids both the threshold-selection sensitivity of unsupervised learning and the overfitting problems of supervised learning and feature optimization methods, and improves the accuracy of the trained algorithm model.

Description

Similar population expansion method and device
Technical Field
The invention relates to the field of computer technology, and in particular to a similar population expansion method and device.
Background
The principle of similar population expansion is to find, from a large number of target users, a population similar to a given set of seed users, so as to expand the scale of the seed users. Currently, similar population expansion methods fall mainly into three categories: first, unsupervised clustering methods, which assign target users according to the clusters to which the seed users belong; second, supervised training methods, which use the seed users to train a classification model and rank target users by predicted value; and third, feature optimization methods, which compute the similarity of target users according to selected features in order to screen the target users.
In the process of implementing the invention, the inventor found at least the following problems in the prior art: first, for unsupervised clustering methods, the number of clusters and the thresholds are difficult to determine; second, supervised training methods tend to overfit the seed users, giving poor generalization to the target users; third, for feature optimization methods, it is difficult to screen out independent significant features, and overfitting is easily caused.
Disclosure of Invention
In view of this, embodiments of the present invention provide a similar population expansion method and apparatus, which perform semi-supervised training on specific users based on an adversarial network learning algorithm, avoid both the threshold-selection sensitivity of unsupervised learning and the overfitting problems of supervised learning and feature optimization methods, and improve the accuracy of the trained algorithm model.
To achieve the above object, according to a first aspect of the embodiments of the present invention, a similar population expansion method is provided.
The similar population expansion method of the embodiment of the invention comprises the following steps: acquiring a specific user corresponding to a target object, and generating feature data of the specific user according to basic attribute data and behavior data of the specific user, wherein the specific user comprises: a seed user and a target user; training an algorithm model of an adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user; and determining, based on the algorithm model, the extensible user corresponding to the target object from the target users according to the feature data of the target user.
Optionally, training an algorithm model of the adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user includes: initializing a classification model of the adversarial network learning algorithm; training discriminant parameters by stochastic gradient ascent according to the feature data of the specific user and the classification model, to obtain a discriminant model corresponding to the discriminant parameters; training classification parameters by stochastic gradient descent using the loss function of the discriminant model, and updating the classification model according to the trained classification parameters; judging whether the discriminant model and the updated classification model satisfy a preset condition; if so, determining the updated classification model as the target classification model and the discriminant model as the target discriminant model; if not, continuing model training with the feature data of the specific user and the updated classification model until the trained discriminant model and classification model satisfy the preset condition.
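The alternating optimization described above can be sketched in a minimal, self-contained form. The sketch below uses one-dimensional toy features and logistic models for both the classification model and the discriminant model; all data, names, learning rates, and step counts are illustrative assumptions, not details taken from the patent. It also uses the patent's pre-training initialization option before the adversarial rounds.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy 1-D features (assumed): seed users cluster near +1, sampled
# negative users near -1; target users are unlabelled draws from both.
seeds     = [random.gauss(+1, 0.3) for _ in range(40)]
negatives = [random.gauss(-1, 0.3) for _ in range(40)]
targets   = ([random.gauss(+1, 0.3) for _ in range(20)]
             + [random.gauss(-1, 0.3) for _ in range(20)])
labelled  = [(x, 1.0) for x in seeds] + [(x, 0.0) for x in negatives]

wc, bc = 0.0, 0.0   # classification parameters
wd, bd = 0.0, 0.0   # discriminant parameters

# Initialization option: pre-train the classification model on
# labelled samples before the adversarial rounds.
for _ in range(300):
    x, y = random.choice(labelled)
    p = sigmoid(wc * x + bc)
    wc += 0.1 * (y - p) * x
    bc += 0.1 * (y - p)

for _ in range(200):
    # Discriminant step: stochastic gradient ASCENT, so the discriminant
    # model scores true labels (1 or 0) as "real" and the classification
    # model's predicted results on target users as "fake".
    s_fake = sigmoid(wc * random.choice(targets) + bc)
    s_real = random.choice(labelled)[1]
    for s, real in ((s_real, 1.0), (s_fake, 0.0)):
        d = sigmoid(wd * s + bd)
        wd += 0.05 * (real - d) * s
        bd += 0.05 * (real - d)

    # Classification step: stochastic gradient DESCENT on the
    # discriminant model's loss, nudging predictions toward "real".
    x = random.choice(targets)
    s = sigmoid(wc * x + bc)
    d = sigmoid(wd * s + bd)
    grad = (1.0 - d) * wd * s * (1.0 - s)   # chain rule through D and C
    wc += 0.005 * grad * x
    bc += 0.005 * grad

score_seed_like = sigmoid(wc * 1.0 + bc)
score_negative_like = sigmoid(wc * -1.0 + bc)
```

In a real system the loop would run until the preset condition on both models is satisfied, rather than for a fixed step count as here.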
Optionally, training discriminant parameters by stochastic gradient ascent according to the feature data of the specific user and the classification model, to obtain a discriminant model corresponding to the discriminant parameters, includes: sampling at least one first classification sample set from the specific user; inputting the feature data corresponding to the at least one first classification sample set into the classification model to obtain a predicted classification result corresponding to the at least one first classification sample set; sampling at least one labelled sample set from the seed users and the negative users, and determining the true classification result corresponding to the at least one labelled sample set, wherein the true classification result corresponding to a seed user is 1 and that corresponding to a negative user is 0; constructing a discriminant training set from the predicted classification results of the at least one first classification sample set and the true classification results of the at least one labelled sample set; and training the discriminant parameters on the discriminant training set by stochastic gradient ascent.
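A minimal sketch of assembling that discriminant training set follows. The function and variable names, the toy scoring lambda, and the sample sizes are illustrative assumptions; in the patent the "fake" scores would come from the actual classification model.

```python
import random

random.seed(1)

def build_discriminant_training_set(classifier, specific_users, seed_users,
                                    negative_users, n_fake=4, n_real=4):
    """Pair each score with a 'realness' flag for the discriminant model.

    Fake examples (flag 0): the classifier's predicted classification
    results on samples drawn from the specific users.
    Real examples (flag 1): true classification results -- 1 for seed
    users, 0 for negative users.
    """
    fake = [(classifier(u), 0) for u in random.sample(specific_users, n_fake)]
    labelled = [(u, 1) for u in seed_users] + [(u, 0) for u in negative_users]
    real = [(float(label), 1) for _, label in random.sample(labelled, n_real)]
    return fake + real

# Toy usage: a trivial "classifier" scoring users by a single feature.
clf = lambda x: 1 / (1 + 2.718281828 ** (-x))
train_set = build_discriminant_training_set(
    clf, specific_users=[-2.0, -1.0, 0.5, 1.0, 2.0],
    seed_users=[1.0, 2.0], negative_users=[-1.0, -2.0])
```

Note the asymmetry the discriminant model must learn: real examples are exactly 0 or 1, while fake examples are probabilities strictly inside (0, 1).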
Optionally, training the classification parameters by stochastic gradient descent using the loss function of the discriminant model includes: determining the loss function of the discriminant model; sampling at least one second classification sample set from the specific user; and inputting the feature data corresponding to the at least one second classification sample set into the loss function of the discriminant model, and training the classification parameters by stochastic gradient descent.
Optionally, initializing the classification model of the adversarial network learning algorithm includes: acquiring initial values of the classification parameters and initializing directly with those initial values; or sampling at least one classification training set from the seed users and the negative users, and pre-training at least one initial classification model with the at least one classification training set, to complete initialization of the classification model.
Optionally, obtaining the target classification model and the target discriminant model of the adversarial network learning algorithm includes: obtaining at least one candidate classification model and at least one candidate discriminant model corresponding to the at least one initial classification model; and determining the target classification model and the target discriminant model from the at least one candidate classification model and the at least one candidate discriminant model.
Optionally, the method further comprises: randomly sampling the negative users from the target users; or sampling the negative users from the target users according to the first behavior data and the second behavior data of the seed users, wherein the first behavior data of the negative users intersects the first behavior data of the seed users, and the second behavior data of the negative users does not intersect the second behavior data of the seed users.
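The behaviour-based sampling rule above can be sketched as follows. The field names `first`/`second` and the example behaviours are assumptions for illustration (e.g. first behavior data as browsing events and second as purchases); the patent does not fix a concrete encoding.

```python
def sample_negative_users(target_users, seed_first, seed_second):
    """Pick negative users: their first behavior data overlaps the seed
    users' first behavior data, while their second behavior data does
    not overlap the seeds' second behavior data."""
    negatives = []
    for user in target_users:
        overlaps_first = bool(user["first"] & seed_first)
        overlaps_second = bool(user["second"] & seed_second)
        if overlaps_first and not overlaps_second:
            negatives.append(user["id"])
    return negatives

seed_first = {"browse:phone", "browse:case"}   # assumed seed behaviours
seed_second = {"buy:phone"}
targets = [
    {"id": "u1", "first": {"browse:phone"}, "second": set()},          # negative
    {"id": "u2", "first": {"browse:phone"}, "second": {"buy:phone"}},  # too similar
    {"id": "u3", "first": {"browse:shoes"}, "second": set()},          # unrelated
]
negatives = sample_negative_users(targets, seed_first, seed_second)
```

The intent of the rule: a negative user looks at the same things as the seeds (shared first behavior data) but did not convert (disjoint second behavior data), making them an informative "hard" negative.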
Optionally, the behavior data includes first behavior data and second behavior data; and generating the feature data of the specific user according to the basic attribute data and behavior data of the specific user includes: acquiring the basic attribute data, the first behavior data and the second behavior data; processing the first behavior data based on a preset embedded-feature processing rule to generate embedded feature data corresponding to the first behavior data; processing the second behavior data based on a preset word-segmentation feature processing rule to generate word-segmentation feature data corresponding to the second behavior data; and combining the basic attribute data, the embedded feature data and the word-segmentation feature data to generate the feature data.
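A minimal sketch of the final combination step, assuming each feature group arrives as a dict; the key-prefix scheme is an illustrative choice, not prescribed by the patent.

```python
def combine_feature_data(basic_attrs, embedded, segmented):
    """Merge the three feature groups into one flat feature dict,
    prefixing keys so the groups stay distinguishable."""
    features = {}
    for prefix, group in (("attr", basic_attrs),
                          ("emb", embedded),
                          ("seg", segmented)):
        for key, value in group.items():
            features[f"{prefix}:{key}"] = value
    return features

# Illustrative values for one user (all assumed).
feat = combine_feature_data(
    {"age": 30, "city": "Beijing"},          # basic attribute data
    {"item_seq_0": 0.12, "item_seq_1": -0.40},  # embedded feature data
    {"phone": 1, "case": 1})                 # word-segmentation feature data
```

A production pipeline would then vectorize this dict (e.g. one-hot or hashing) before feeding it to the models.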
Optionally, processing the first behavior data based on the preset embedded-feature processing rule to generate the embedded feature data corresponding to the first behavior data includes: acquiring at least one article attribute data corresponding to the first behavior data; segmenting the at least one article attribute data according to a preset time threshold corresponding to the at least one article attribute data, to obtain a behavior sequence corresponding to the at least one article attribute data; embedding the behavior sequence corresponding to the at least one article attribute data with a word-vector embedding algorithm, to obtain sub-embedded feature data corresponding to the at least one article attribute data; and combining the sub-embedded feature data corresponding to the at least one article attribute data to generate the embedded feature data corresponding to the first behavior data.
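The segmentation-by-time-threshold step can be sketched as below. The event layout and the threshold value are assumptions; the subsequent word-vector embedding step (e.g. running word2vec over the resulting sequences) is noted in comments but not implemented here.

```python
def split_behavior_sequences(events, gap_threshold):
    """Split a time-ordered list of (timestamp, item_id) events into
    behavior sequences: a gap larger than gap_threshold starts a new
    sequence. Each sequence would then be fed, like a sentence, to a
    word-vector embedding algorithm such as word2vec."""
    sequences, current = [], []
    last_ts = None
    for ts, item in sorted(events):
        if last_ts is not None and ts - last_ts > gap_threshold:
            sequences.append(current)
            current = []
        current.append(item)
        last_ts = ts
    if current:
        sequences.append(current)
    return sequences

# Illustrative events: two browsing sessions separated by a long gap.
events = [(0, "phone"), (10, "case"), (500, "charger"), (510, "cable")]
seqs = split_behavior_sequences(events, gap_threshold=60)
```

Treating each session as a "sentence" is what lets a word-vector algorithm learn item co-occurrence within a user's short-term intent.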
Optionally, processing the second behavior data based on the preset word-segmentation feature processing rule to generate the word-segmentation feature data corresponding to the second behavior data includes: acquiring an article description sentence corresponding to the second behavior data; performing word segmentation on the article description sentence to obtain at least one segmented word, and filtering and screening the at least one segmented word; and generating the word-segmentation feature data corresponding to the second behavior data from the filtered and screened words.
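A minimal sketch of segmentation plus filtering, using whitespace tokenization on an English description as a stand-in; real Chinese article descriptions would need a proper word segmenter (e.g. jieba), and the stopword list and length filter here are assumptions.

```python
STOPWORDS = {"the", "a", "an", "with", "and", "for"}

def segment_and_filter(description, min_len=2):
    """Tokenize an article description sentence, then filter out
    stopwords and very short tokens to keep informative words."""
    tokens = description.lower().replace(",", " ").split()
    return [t for t in tokens if t not in STOPWORDS and len(t) >= min_len]

words = segment_and_filter("A red phone case with the soft grip")
```

The surviving words would typically become bag-of-words or TF-IDF entries in the word-segmentation feature data.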
Optionally, determining, based on the algorithm model, the extensible user corresponding to the target object from the target users according to the feature data of the target user includes: inputting the feature data of the target user into the target classification model to obtain the predicted classification result corresponding to the target user; and selecting the extensible user from the target users, based on a preset extension condition, according to the predicted classification result corresponding to the target user.
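A sketch of the selection step under two plausible readings of the "preset extension condition" (a score threshold and a top-k cutoff); both readings are assumptions, as the patent leaves the condition open.

```python
def select_extensible_users(scored_targets, top_k=None, threshold=None):
    """Apply a preset extension condition to (user, predicted score)
    pairs: keep targets whose score meets the threshold, and/or keep
    only the top-k highest-scoring targets."""
    ranked = sorted(scored_targets, key=lambda p: p[1], reverse=True)
    if threshold is not None:
        ranked = [(u, s) for u, s in ranked if s >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [u for u, _ in ranked]

# Illustrative predicted classification results for four target users.
scores = [("u1", 0.91), ("u2", 0.35), ("u3", 0.78), ("u4", 0.62)]
chosen = select_extensible_users(scores, top_k=2, threshold=0.5)
```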
To achieve the above object, according to a second aspect of the embodiments of the present invention, a similar population expansion device is provided.
The similar population expansion device of the embodiment of the invention comprises: a generating module, configured to acquire a specific user corresponding to a target object and generate feature data of the specific user according to basic attribute data and behavior data of the specific user, wherein the specific user comprises: a seed user and a target user; a training module, configured to train an algorithm model of the adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user; and a determining module, configured to determine, based on the algorithm model, the extensible user corresponding to the target object from the target users according to the feature data of the target user.
Optionally, the training module is further configured to: initialize a classification model of the adversarial network learning algorithm; train discriminant parameters by stochastic gradient ascent according to the feature data of the specific user and the classification model, to obtain a discriminant model corresponding to the discriminant parameters; train classification parameters by stochastic gradient descent using the loss function of the discriminant model, and update the classification model according to the trained classification parameters; judge whether the discriminant model and the updated classification model satisfy a preset condition; if so, determine the updated classification model as the target classification model and the discriminant model as the target discriminant model; and if not, continue model training with the feature data of the specific user and the updated classification model until the trained discriminant model and classification model satisfy the preset condition.
Optionally, the training module is further configured to: sample at least one first classification sample set from the specific user; input the feature data corresponding to the at least one first classification sample set into the classification model to obtain a predicted classification result corresponding to the at least one first classification sample set; sample at least one labelled sample set from the seed users and the negative users, and determine the true classification result corresponding to the at least one labelled sample set, wherein the true classification result corresponding to a seed user is 1 and that corresponding to a negative user is 0; construct a discriminant training set from the predicted classification results of the at least one first classification sample set and the true classification results of the at least one labelled sample set; and train the discriminant parameters on the discriminant training set by stochastic gradient ascent.
Optionally, the training module is further configured to: determine the loss function of the discriminant model; sample at least one second classification sample set from the specific user; and input the feature data corresponding to the at least one second classification sample set into the loss function of the discriminant model, and train the classification parameters by stochastic gradient descent.
Optionally, the training module is further configured to: acquire initial values of the classification parameters and initialize directly with those initial values; or sample at least one classification training set from the seed users and the negative users, and pre-train at least one initial classification model with the at least one classification training set, to complete initialization of the classification model.
Optionally, the training module is further configured to: obtain at least one candidate classification model and at least one candidate discriminant model corresponding to the at least one initial classification model; and determine the target classification model and the target discriminant model from the at least one candidate classification model and the at least one candidate discriminant model.
Optionally, the apparatus further comprises a sampling module configured to: randomly sample the negative users from the target users; or sample the negative users from the target users according to the first behavior data and the second behavior data of the seed users, wherein the first behavior data of the negative users intersects the first behavior data of the seed users, and the second behavior data of the negative users does not intersect the second behavior data of the seed users.
Optionally, the behavior data includes first behavior data and second behavior data; and the generating module is further configured to: acquire the basic attribute data, the first behavior data and the second behavior data; process the first behavior data based on a preset embedded-feature processing rule to generate embedded feature data corresponding to the first behavior data; process the second behavior data based on a preset word-segmentation feature processing rule to generate word-segmentation feature data corresponding to the second behavior data; and combine the basic attribute data, the embedded feature data and the word-segmentation feature data to generate the feature data.
Optionally, the generating module is further configured to: acquire at least one article attribute data corresponding to the first behavior data; segment the at least one article attribute data according to a preset time threshold corresponding to the at least one article attribute data, to obtain a behavior sequence corresponding to the at least one article attribute data; embed the behavior sequence corresponding to the at least one article attribute data with a word-vector embedding algorithm, to obtain sub-embedded feature data corresponding to the at least one article attribute data; and combine the sub-embedded feature data corresponding to the at least one article attribute data to generate the embedded feature data corresponding to the first behavior data.
Optionally, the generating module is further configured to: acquire an article description sentence corresponding to the second behavior data; perform word segmentation on the article description sentence to obtain at least one segmented word, and filter and screen the at least one segmented word; and generate the word-segmentation feature data corresponding to the second behavior data from the filtered and screened words.
Optionally, the determining module is further configured to: input the feature data of the target user into the target classification model to obtain the predicted classification result corresponding to the target user; and select the extensible user from the target users, based on a preset extension condition, according to the predicted classification result corresponding to the target user.
To achieve the above object, according to a third aspect of embodiments of the present invention, there is provided an electronic apparatus.
An electronic device of an embodiment of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the similar population expansion method of an embodiment of the present invention.
To achieve the above object, according to a fourth aspect of embodiments of the present invention, there is provided a computer-readable medium.
A computer-readable medium of an embodiment of the present invention has a computer program stored thereon, and the program, when executed by a processor, implements the similar population expansion method of an embodiment of the present invention.
One embodiment of the above invention has the following advantages or beneficial effects: feature data can be generated from the basic attribute data and behavior data of a specific user, and introducing behavior data is significant for improving the conversion rate and thus improves the accuracy of the trained algorithm model. Then, an algorithm model of the adversarial network learning algorithm is trained by a stochastic gradient method in combination with the feature data of the specific user; semi-supervised training can be performed on the specific user based on the adversarial network learning algorithm, which differs both from methods that perform supervised training using only seed users and from methods based on unsupervised clustering over all users, avoiding the threshold-selection sensitivity of unsupervised learning and the overfitting problems of supervised learning and feature optimization methods. Finally, extensible users are selected from the target users using the trained algorithm model to complete similar population expansion.
Further effects of the above alternatives are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of a similar population expansion method according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of at least one item attribute data acquired in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram of a main flow of a method of generating profile data for a particular user, according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a main flow of a method of training classification models and discriminant models according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a similar population expansion system according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the main blocks of a similar population expansion device according to an embodiment of the present invention;
FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Similar population expansion needs to solve two main problems: first, how to determine the seed users, or how to determine their characteristics from the seed users; and second, how to expand by similarity according to the determined seed users and their characteristics. A seed user is a user interested in a certain article or store; for example, a user who purchases or favorites an article, or follows a store, is considered a seed user, and the specific source can be designated by the operator. Similar population expansion determines the characteristics of the seed users and then mines populations similar to those characteristics; the mined populations are considered to be interested in the article or store, as the seed users are. Research shows that existing similar population expansion methods fall mainly into the following three categories:
the method comprises the steps of clustering a seed user and a target user based on an unsupervised learning clustering method, and distributing the target user according to a cluster to which the seed user belongs. Unsupervised learning belongs to one of machine learning algorithms, and is used for processing a sample set which is not classified and marked when a classifier is designed. Therefore, in the clustering algorithm of the unsupervised learning algorithm, the sample data type is unknown, and the sample set needs to be classified or clustered according to the similarity between the samples, so that the intra-class difference is tried to be minimized, and the inter-class difference is tried to be maximized. In practical applications, the labels of the samples cannot be known in advance in many cases, that is, there is no class corresponding to the training samples, so that the classifier design can be learned only from the original sample set without the labels of the samples. The target users refer to users except the seed users, and the purpose of similar crowd expansion is to select users who may be interested in a certain article or a certain shop from the target users, so that recommendation messages can be sent to the users.
For the clustering method based on unsupervised learning, it is difficult to determine the number of clusters, i.e., to define in advance how many classes the seed users can be divided into; such strong prior knowledge is lacking in practice. Even if the number of clusters can be determined, it is difficult to select an adaptive threshold to screen users.
Second, the supervised training method treats the seed users as positive samples, obtains negative samples, trains a classification model, and finally ranks the target users by predicted value. Supervised learning is a class of machine learning algorithms that infers a function from a labelled training data set. In supervised training, a function (model parameters) is learned from a given training set; when new data arrives, the result can be predicted from this function. The training set contains positive samples (samples belonging to a certain category) and negative samples (samples not belonging to it); for example, in image recognition of the letter A, images of the letter A are positive samples and all other images are negative samples. Supervised learning addresses the most common classification problem (as distinct from clustering): an optimal model (belonging to some function set and optimal under some evaluation criterion) is trained from existing training samples, i.e., known data with corresponding outputs, and the model maps every input to a corresponding output; a simple judgement on the output achieves the classification goal.
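A minimal sketch of this prior-art approach: a one-feature logistic classifier fitted to seed (positive) and negative samples by gradient descent, then used to rank target users by predicted value. The data and hyperparameters are illustrative assumptions.

```python
import math

def train_logistic(samples, epochs=200, lr=0.5):
    """Fit a 1-feature logistic classifier: seed users carry label 1,
    negative samples label 0."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = 1 / (1 + math.exp(-(w * x + b)))
            w += lr * (y - p) * x
            b += lr * (y - p)
    return w, b

train = [(1.0, 1), (1.3, 1), (-1.1, 0), (-0.8, 0)]   # (feature, label)
w, b = train_logistic(train)

# Rank target users by the model's predicted value.
targets = [("t1", 0.9), ("t2", -1.0), ("t3", 0.2)]
ranking = sorted(targets,
                 key=lambda t: 1 / (1 + math.exp(-(w * t[1] + b))),
                 reverse=True)
```

With only four labelled points and one feature, the model fits the seeds exactly; the overfitting risk the next paragraph describes grows sharply as the feature dimension increases relative to the number of seed users.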
For the training method based on supervised learning, when the number of seed users is too small and the feature dimension is too large, the model easily overfits the seed users, giving poor generalization to the target users. Overfitting means that a machine learning or deep learning model performs too well on the training samples, resulting in poor performance on the validation and test data sets. Consider a model for identifying dogs: if all the training pictures happen to show a single breed, then after many training iterations the model performs well on the training set; but given a test sample of a Golden Retriever, the model may well output that the Golden Retriever is not a dog. The model has overfitted: it performs well on the training set but the opposite on the test set, which shows as excessive variance in performance and likewise in the loss function on the test set. Generalization ability is the predictive ability of a learned model on unknown data.
Third, the feature optimization method selects user features, computes the overall similarity between the target users and the seed users according to the selected features, and then screens the target users.
For the feature-selection method, relative independence among a user's features must be assumed; since local features often have stronger or weaker dependency relationships, it is difficult to screen out independent, significant features, and using such features easily causes overfitting.
To solve the above problems, the embodiment of the present invention provides a semi-supervised user classification scheme based on adversarial network learning, which differs both from methods that perform supervised training using only the seed users and from methods based on unsupervised clustering over all users. Because a semi-supervised learning mechanism is introduced, the threshold-selection sensitivity of unsupervised learning and the overfitting problems of supervised learning and of feature-selection methods are avoided. Semi-supervised learning is a learning method that combines supervised and unsupervised learning; it mainly addresses how to train and classify using a small number of labeled samples together with a large number of unlabeled samples. In the semi-supervised learning of the embodiment of the invention, an adversarial network learning algorithm is used to learn the user classifier; the goal of the adversarial training is to learn the joint distribution of a sample set that contains the seed users and also covers the target users, that is, all of the data, not only the seed users, are used for training. Fig. 1 is a schematic diagram illustrating the major steps of a similar population expansion method according to an embodiment of the present invention; as shown in fig. 1, the method may include steps S101 to S103.
Step S101: acquire the specific users corresponding to the target object, and generate feature data of the specific users according to the basic attribute data and behavior data of the specific users.
The target object may be an item or a store that a seed user is interested in. For example, if a user purchases an item W or follows a store D, that user is interested in the item W or the store D; the user can then serve as a seed user, and the item W or the store D as the target object. The specific users may include the seed users and the target users. A seed user is a user interested in a certain item or store, and a target user is any user other than a seed user, so the specific users amount to all users; for example, on an e-commerce platform, the specific users are the users registered on that platform. Alternatively, if the target object is an age-restricted game, the specific users are all users who meet the age requirement. In the embodiment of the invention, the specific users can be determined according to the actual situation.
After the specific users are obtained, the feature data of the specific users can be generated according to their basic attribute data and behavior data. The basic attribute data may include user attributes such as age, sex, education, height, marital status, occupation and income; behavioral tendencies such as activity level, dormancy, usual shopping place and time period, and tendency to return goods; and interest preferences such as frequently browsed brands and categories. These reflect characteristics of a user that are relatively stable over a certain period, i.e., the user's long-term features. In the embodiment of the invention, besides these long-term features, the user's recent features, namely the latest user behavior data, also need to be introduced; the user behavior data represent the click, favorite, follow and purchase behaviors shown within one login session. Introducing the user behavior data into the feature-generation method matters for improving the conversion rate; in an e-commerce context, the conversion rate is the proportion of browsing users who finally place an order and purchase.
Step S102: train an algorithm model of the adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific users.
After the feature data of the specific users are generated, the algorithm model of the adversarial network learning algorithm can be trained using the generated feature data and a stochastic gradient method. The adversarial network learning algorithm is a deep learning model; in the embodiment of the invention, its algorithm model may include a classification model and a discriminant model. The classification model mainly learns how to identify target users as potentially expandable users; that is, it predicts whether a user has a preference for the target object, and strives to make its predictions accurate enough to fool the discriminant model. The discriminant model must distinguish real samples from fake ones among those it receives: for example, given a training set comprising positive and negative examples, its goal is to learn by what rules such a training set is constructed. Throughout the process, the classification model strives to make its predictions more accurate while the discriminant model strives to identify real and fake samples; the two models compete continually over time and finally reach a dynamic equilibrium: the classification model predicts quite accurately whether a user has a preference for the target object, and the discriminant model can no longer tell real samples from fake ones. In other words, the adversarial equilibrium between the classification model and the discriminant model enables the classification model to learn the joint distribution over both the seed users and the target users.
In the semi-supervised learning of the embodiment of the invention, an adversarial network learning algorithm is used to learn the user classification; the goal of the adversarial training is to learn the joint distribution of a sample set that contains the seed users and also covers the target users, that is, all of the data, not only the seed users, are used for training. Therefore, in step S102, training is performed using the feature data of the specific users.
In addition, the embodiment of the invention uses a stochastic gradient method for model training. The stochastic gradient method is an optimization algorithm, commonly used in machine learning and artificial intelligence to recursively approximate a minimum-deviation model; it solves for a maximum or a minimum along the direction of gradient ascent or descent. Its characteristic is that a batch of training data is randomly drawn from the training set and fed in to obtain an average gradient, and this is repeated; this reduces the computational cost of each iteration and therefore speeds up model training.
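The mini-batch scheme just described can be sketched in a few lines of NumPy. This is a minimal illustration of stochastic gradient descent on a toy least-squares model, not the patent's actual training code; the model, batch size and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: 1000 users with 8 features, linear target plus noise.
X = rng.normal(size=(1000, 8))
true_w = rng.normal(size=8)
y = X @ true_w + 0.01 * rng.normal(size=1000)

w = np.zeros(8)          # model parameters to learn
lr, batch_size = 0.1, 32

for step in range(500):
    # Randomly draw a batch of training data, as the text describes.
    idx = rng.integers(0, len(X), size=batch_size)
    xb, yb = X[idx], y[idx]
    grad = 2 * xb.T @ (xb @ w - yb) / batch_size  # average gradient on the batch
    w -= lr * grad       # step along the descent direction
```

Each iteration touches only 32 of the 1000 rows, which is exactly why the text notes that the per-iteration cost drops and training speeds up; after enough iterations `w` approaches `true_w`.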
Step S103: based on the algorithm model, determine the expandable users corresponding to the target object from the target users according to the feature data of the target users.
By training the algorithm model in step S102, a target classification model and a target discriminant model can be obtained. That is, during model training, once the classification model and the discriminant model reach equilibrium, training can stop; the classification model and the discriminant model at that point are the target classification model and the target discriminant model. The target classification model can then predict whether a target user has a preference for the target object, so that expandable users can be selected from the target users.
Therefore, as a reference embodiment of the present invention, determining the expandable users corresponding to the target object from the target users according to the feature data of the target users based on the algorithm model may include: inputting the feature data of the target users into the target classification model to obtain the predicted classification result corresponding to each target user; and selecting expandable users from the target users according to their corresponding predicted classification results, based on a preset expansion condition.
The preset expansion condition may be a preset classification-result threshold: the feature data of a certain target user M are input into the target classification model, and if the obtained predicted classification result exceeds the preset threshold, the target user M is an expandable user. The preset expansion condition may also be a preset number N of expandable users: the feature data of all target users are input into the target classification model, the obtained predicted classification results are sorted, and the N top-ranked target users are selected as the expandable users. Of course, the preset expansion condition may also take other forms and is set according to the actual situation, which is not limited in the embodiment of the present invention.
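The two preset expansion conditions just described, a score threshold or a fixed count N of top-ranked users, can be sketched as follows; the user names and scores here are illustrative, not from the patent:

```python
# Predicted classification scores for target users, e.g. the probability of
# the "has preference" class output by the target classification model.
scores = {"M1": 0.91, "M2": 0.35, "M3": 0.77, "M4": 0.60, "M5": 0.12}

# Condition 1: a preset classification-result threshold.
threshold = 0.5
expandable_by_threshold = {u for u, s in scores.items() if s > threshold}

# Condition 2: a preset number N of expandable users (top-N by score).
N = 2
expandable_top_n = sorted(scores, key=scores.get, reverse=True)[:N]

print(sorted(expandable_by_threshold))  # ['M1', 'M3', 'M4']
print(expandable_top_n)                 # ['M1', 'M3']
```

The threshold form fixes quality and lets the audience size float, while the top-N form fixes the audience size; which to use depends on the campaign's actual needs, as the text notes.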
The similar population expansion method provided by the embodiment of the invention can generate feature data from the basic attribute data and behavior data of the specific users; because behavior data are introduced, which matters for improving the conversion rate, the accuracy of the trained algorithm model can be improved. Then an algorithm model of the adversarial network learning algorithm is trained on the feature data of the specific users by a stochastic gradient method; semi-supervised training can thus be performed over the specific users based on adversarial network learning, which differs both from methods that perform supervised training using only the seed users and from methods based on unsupervised clustering over all users, and avoids the threshold-selection sensitivity of unsupervised learning and the overfitting problems of supervised learning and of feature-selection methods. Finally, the trained algorithm model is used to select expandable users from the target users, completing the similar population expansion.
The similar population expansion function resides on a user portrait platform and relies on the user portrait system to obtain the user basic attributes as input. The user basic attributes describe the user's attributes, behavioral tendencies, interest preferences and the like, reflecting characteristics that are relatively stable over a certain period; they were described in step S101 and are not repeated here. Besides these stable characteristics, the user's recent features, i.e., recent user behaviors, also need to be introduced, which is important for improving the conversion rate. It is worth noting, however, that user behavior features are generally sparse, and training a model directly on such sparse features easily leads to overfitting, so the user behavior features need to be processed. Therefore, as a reference embodiment of the present invention, generating the feature data of the specific user based on the basic attribute data and behavior data of the specific user includes:
step S1011, acquiring basic attribute data, first behavior data and second behavior data;
step S1012, processing the first behavior data based on a preset embedded feature processing rule, and generating embedded feature data corresponding to the first behavior data;
step S1013, processing the second behavior data based on a preset word segmentation feature processing rule to generate word segmentation feature data corresponding to the second behavior data;
step S1014, combining the basic attribute data, the embedded feature data and the word segmentation feature data to generate the feature data.
A user's behaviors may include ordinary click behaviors as well as follow, favorite, add-to-cart and purchase behaviors. Since ordinary clicks are numerous while the other behaviors are relatively few, the embodiment of the invention processes the behavior data separately, dividing them into first behavior data and second behavior data: the first behavior data correspond to the numerous ordinary click behaviors, and the second behavior data correspond to the relatively few other behaviors, such as the follow, favorite, add-to-cart and purchase behaviors.
A user's ordinary click behaviors reflect the continuous evolution of the user's needs: the item sequences they form are strongly correlated, and the items can reflect the user's intrinsic, essential needs. However, a user's ordinary clicks are relatively dispersed and constitute sparse features, with the data scattered in a high-dimensional space; building feature data directly from the ordinary click behaviors would therefore cause the trained model to overfit. Hence the embodiment of the invention applies low-dimensional feature embedding to the user's ordinary click behaviors. The benefit of embedding the user features is that a global, abstract feature is obtained, so the generalization ability is strong. For example, suppose a user who wants to buy fruit browses dragon fruit, oranges, mangoes and grapefruit, and finally buys a grapefruit. If the user is described by the word-segmentation features of the items followed, favorited and bought, the conclusion would be that the user likes only grapefruit, which is an overfitting behavior: a user who bought a grapefruit would then appear to belong to an entirely different group from a user who bought an orange, whereas in fact the grapefruit buyer also has a preference for oranges, and even for mangoes and dragon fruit.
Furthermore, embedded low-dimensional features can characterize the intrinsic correlation structure of the item space better than sparse features can. For high-dimensional one-hot vectors such as A (1, 0, 0), B (0, 1, 0) and C (0, 0, 1), the pairwise distances are all equal, so it is impossible to tell whether A is more similar to B or to C; but because a low-dimensional space is dense, after embedding it is easy for A to end up closer to B or to C (similarity shows up as smaller distance). Therefore, as a reference embodiment of the present invention, processing the first behavior data based on a preset embedded feature processing rule to generate the embedded feature data corresponding to the first behavior data may include:
step S10121, acquiring at least one article attribute data corresponding to the first behavior data;
step S10122, performing segmentation processing on at least one article attribute data according to a preset time threshold corresponding to the at least one article attribute data to obtain a behavior sequence corresponding to the at least one article attribute data;
step S10123, embedding the behavior sequence corresponding to the at least one article attribute data by using a word vector embedding algorithm to obtain sub-embedded characteristic data corresponding to the at least one article attribute data;
step S10124, combining the sub-embedded characteristic data corresponding to the at least one item attribute data to generate embedded characteristic data corresponding to the first behavior data.
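Before detailing these steps, the distance intuition above (one-hot A, B, C are all equidistant, while dense embeddings recover similarity) can be made concrete with a small sketch; the dense vectors here are hypothetical stand-ins for learned embeddings:

```python
import numpy as np

# One-hot item vectors, as in the A/B/C example above: every pairwise
# distance is equal, so no similarity structure is visible.
A, B, C = np.eye(3)
d = np.linalg.norm
print(d(A - B), d(A - C), d(B - C))  # all equal to sqrt(2)

# Hypothetical dense embeddings learned from click co-occurrence: items
# clicked in the same sessions end up close together.
emb = {"A": np.array([0.9, 0.1]),
       "B": np.array([0.8, 0.2]),   # A and B co-clicked often -> nearby
       "C": np.array([-0.7, 0.6])}  # C rarely co-clicked with A -> far
print(d(emb["A"] - emb["B"]) < d(emb["A"] - emb["C"]))  # True
```

In the one-hot space the three items carry no mutual information; in the dense space the relation "A is more like B than like C" becomes a measurable distance, which is the property the embedding steps below exploit.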
The item attribute data may include the item name, the item brand, the item category and the item store. After the first behavior data of a user are obtained, the item sequence corresponding to the first behavior data can be obtained; for example, if the first behavior data cover 100 items clicked by a certain user, an item sequence of those 100 items is obtained. Because an item-level sequence is a fine-grained description of user needs, and because high-dimensional item-title tokens are undesirable as feature vectors, attributes of the items such as brand, category and store are used as alternative descriptions of the items. Accordingly, in step S10121, at least one item attribute data corresponding to the first behavior data is acquired; fig. 2 is a schematic diagram of the acquired item attribute data according to the embodiment of the present invention. In fig. 2, when the first behavior data of the user are the behavior data generated by clicking items A through F in order, the obtained item attribute data are 4 sequences: the sequence of items A through F, the sequence of the brands corresponding to items A through F, the sequence of the categories corresponding to items A through F, and the sequence of the stores corresponding to items A through F.
Because brands, categories and stores describe items at a coarse granularity, a user's click sequence contains many repeated elements, which is not conducive to dense embedding. Therefore adjacent repeated brands, categories and stores in the item attribute data need to be compressed by segmentation, and the segmented, compressed sequences are then used for brand, category and store embedding. First the click sequence is divided into segments, i.e., the behavior sequence corresponding to each item attribute data is obtained; since items, brands, stores and categories differ in granularity, different time thresholds are needed to construct the corresponding behavior sequences. For the behavior sequence corresponding to items, the sequence is split whenever the interval between two adjacent clicks exceeds 1 minute; for the behavior sequences corresponding to brands and stores, whenever the interval exceeds 5 minutes; and for the behavior sequence corresponding to categories, whenever the interval exceeds 30 minutes. In addition, if a behavior sequence obtained by segmenting at the preset time threshold still contains adjacent repeated elements, only one of the repeats is kept.
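The segmentation rule above (split a click stream wherever the gap between adjacent clicks exceeds the per-granularity threshold, then collapse adjacent repeats) can be sketched in plain Python; the timestamps and brand identifiers are made up for illustration:

```python
from itertools import groupby

def segment(events, gap_seconds):
    """Split a [(timestamp, element), ...] click stream into segments
    wherever two adjacent clicks are more than gap_seconds apart, then
    keep only one of any adjacent repeated elements in each segment."""
    segments, current, last_t = [], [], None
    for t, elem in events:
        if last_t is not None and t - last_t > gap_seconds:
            segments.append(current)
            current = []
        current.append(elem)
        last_t = t
    if current:
        segments.append(current)
    # Collapse adjacent duplicates (brand/category sequences repeat a lot).
    return [[k for k, _ in groupby(seg)] for seg in segments]

# Brand-level click stream with the 5-minute (300 s) threshold from the text.
clicks = [(0, "b1"), (30, "b1"), (60, "b2"), (500, "b2"), (520, "b3")]
print(segment(clicks, 300))  # [['b1', 'b2'], ['b2', 'b3']]
```

For item-level sequences the same function would be called with `gap_seconds=60`, and for category-level sequences with `gap_seconds=1800`, matching the thresholds stated above.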
After the behavior sequence corresponding to each item attribute data is obtained, a word vector embedding algorithm can be applied to embed it, yielding the sub-embedded feature data corresponding to that item attribute data. A word vector embedding algorithm converts words into vectors; that is, each word in a dictionary is represented by a vector of a fixed dimension.
In summary, processing the user's ordinary click behaviors to obtain the user's embedded feature data specifically comprises: processing from the four angles of clicked items, brands, categories and stores; generating the four corresponding sequences from one piece of first behavior data; obtaining the four corresponding embedded feature vectors, i.e., four pieces of sub-embedded feature data; and splicing the four pieces of sub-embedded feature data together to obtain the user's embedded feature data.
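A minimal sketch of this final assembly step: average the learned vectors of each of the four sequences (item, brand, category, store) into one sub-embedding, then concatenate. The embedding tables below are random stand-ins for vectors that would actually be learned by the word-vector algorithm, and the dimension is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 4  # embedding dimension per granularity (illustrative)

def sequence_embedding(seq, table):
    """Average the learned vectors of a behavior sequence's elements."""
    return np.mean([table[e] for e in seq], axis=0)

# Stand-in embedding tables (in practice, learned by word2vec-style training).
items  = {e: rng.normal(size=DIM) for e in ["itemA", "itemB"]}
brands = {e: rng.normal(size=DIM) for e in ["brand1"]}
cats   = {e: rng.normal(size=DIM) for e in ["cat1", "cat2"]}
stores = {e: rng.normal(size=DIM) for e in ["store1"]}

subs = [sequence_embedding(["itemA", "itemB"], items),
        sequence_embedding(["brand1"], brands),
        sequence_embedding(["cat1", "cat2"], cats),
        sequence_embedding(["store1"], stores)]

embedded_features = np.concatenate(subs)  # the user's embedded feature data
print(embedded_features.shape)            # (16,)
```

Averaging is one simple way to pool a variable-length sequence into a fixed-size vector; the patent text only requires that the four sub-embedded feature vectors be obtained and spliced, so the pooling choice here is an assumption.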
In the embodiment of the invention, the first behavior data correspond to the numerous ordinary click behaviors, and the second behavior data correspond to the relatively few other behaviors, such as the follow, favorite, add-to-cart and purchase behaviors; hence the first behavior data undergo embedded feature processing while the second behavior data undergo word segmentation feature processing. As a reference embodiment of the present invention, processing the second behavior data based on a preset word segmentation feature processing rule to generate the word segmentation feature data corresponding to the second behavior data may include:
step S10131, acquiring an article description sentence corresponding to the second behavior data;
step S10132, performing word segmentation processing on the article description sentence to obtain at least one word segmentation, and filtering and screening the at least one word segmentation;
step S10133, generating word segmentation characteristic data corresponding to the second behavior data by using the filtered and screened word segmentation.
The word segmentation statistical features are obtained by processing the user's follow, favorite, add-to-cart and purchase behaviors, i.e., the user's second behavior data: the items corresponding to these behaviors and their description sentences are obtained; the item description sentences are segmented into a number of tokens; all of the obtained tokens are filtered and screened; and the finally generated item-title word segmentation features serve as part of the user's feature data.
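The word-segmentation path can be sketched as follows. The tokenizer here is a naive whitespace split and the stop-word list is an assumption for illustration; a real pipeline over Chinese item titles would use a proper segmenter:

```python
from collections import Counter

STOP_WORDS = {"with", "for", "and", "the", "a"}  # illustrative filter list

def title_tokens(descriptions):
    """Tokenize item titles drawn from follow/favorite/add-to-cart/purchase
    behaviors, filter out stop words and one-character noise, and count the
    surviving tokens as the user's word-segmentation features."""
    counts = Counter()
    for desc in descriptions:
        for tok in desc.lower().split():
            if tok in STOP_WORDS or len(tok) < 2:
                continue
            counts[tok] += 1
    return counts

titles = ["Wireless Mouse with USB Receiver",
          "Ergonomic Wireless Keyboard for Office"]
feats = title_tokens(titles)
print(feats["wireless"])  # 2
```

The resulting token counts form a sparse bag-of-words vector that is concatenated with the basic attributes and the embedded features, per step S1014.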
After the first behavior data is processed in step S1012 to generate the embedded feature data corresponding to the first behavior data and the second behavior data is processed in step S1013 to generate the segmentation feature data corresponding to the second behavior data, the basic attribute data, the embedded feature data, and the segmentation feature data may be combined to generate the feature data of the user.
Fig. 3 is a schematic diagram of a main flow of a method of generating feature data of a specific user according to an embodiment of the present invention. As shown in fig. 3, the main flow of the method for generating feature data of a specific user may include:
step S301, acquiring basic attribute data, first behavior data and second behavior data of a specific user;
step S302, at least one item attribute data corresponding to the first behavior data is obtained;
step S303, performing segmentation processing on at least one article attribute data according to a preset time threshold corresponding to the at least one article attribute data to obtain a behavior sequence corresponding to the at least one article attribute data;
step S304, embedding the behavior sequence corresponding to at least one article attribute data by using a word vector embedding algorithm to obtain sub-embedded characteristic data corresponding to at least one article attribute data;
step S305, combining the sub-embedded characteristic data corresponding to at least one item attribute data to generate embedded characteristic data corresponding to the first behavior data;
step S306, acquiring an article description sentence corresponding to the second behavior data;
step S307, performing word segmentation processing on the article description sentence to obtain at least one word segmentation, and filtering and screening the at least one word segmentation;
step S308, generating word segmentation characteristic data corresponding to the second behavior data by using the filtered and screened word segmentation;
step S309, the basic attribute data, the embedded characteristic data and the word segmentation characteristic data are combined to generate the characteristic data of the specific user.
It should be noted that, steps S302 to S305 are to process the first behavior data to generate embedded feature data, and steps S306 to S308 are to process the second behavior data to generate participle feature data, and a specific execution sequence may be adjusted according to actual situations, and the embedded feature data may be generated first, and then the participle feature data may be generated, or the embedded feature data and the participle feature data may be generated at the same time, which is not limited in the embodiment of the present invention.
The above method for generating the feature data of the specific user considers the user's long-term features and recent behavior features comprehensively, from the three aspects of basic attribute data, first behavior data and second behavior data; this improves the accuracy of the feature data and of the algorithm model, and therefore further ensures the accuracy of the resulting expandable users, while introducing as many user behavior features as possible matters for improving the conversion rate. Considering that the first behavior data are numerous and constitute sparse features, an embedded feature processing method is used to obtain the corresponding embedded feature data, which avoids overfitting the trained model.
Training the algorithm model of the adversarial network learning algorithm is an important component of the similar population expansion method in the embodiment of the invention. As a reference embodiment of the present invention, training the algorithm model of the adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user may include: initializing the classification model of the adversarial network learning algorithm; training the discriminant parameters by stochastic gradient ascent according to the feature data of the specific user and the classification model, to obtain the discriminant model corresponding to the discriminant parameters; training the classification parameters by stochastic gradient descent using the loss function of the discriminant model, and updating the classification model according to the trained classification parameters; judging whether the discriminant model and the updated classification model satisfy a preset condition; if so, taking the updated classification model as the target classification model and the discriminant model as the target discriminant model; if not, continuing model training with the feature data of the specific user and the updated classification model until the discriminant model and classification model obtained by training satisfy the preset condition.
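The alternating scheme just described (initialize C, gradient ascent on the discriminant parameters, gradient descent on the classification parameters through the discriminator's loss, check a stopping condition) can be sketched in NumPy. This is an illustrative toy under stated assumptions: both models are reduced to single linear layers, the data are random, and the stopping rule is a fixed iteration budget; it shows the control flow, not the patent's deep networks:

```python
import numpy as np

rng = np.random.default_rng(0)
n_feat, lr = 6, 0.05

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Labeled set: seed users (y=1) and sampled negative users (y=0), one-hot.
X_lab = rng.normal(size=(80, n_feat))
y_lab = np.eye(2)[np.r_[np.ones(20, int), np.zeros(60, int)]]
# Classification set: seed users plus (unlabeled) target users.
X_all = rng.normal(size=(200, n_feat))

Wc = rng.normal(scale=0.1, size=(n_feat, 2))      # classification model C
wd = rng.normal(scale=0.1, size=n_feat + 2)       # discriminant model D

D = lambda X, Y: sigmoid(np.hstack([X, Y]) @ wd)  # D(x, y) -> P(real)

for step in range(200):                            # stop rule: fixed budget
    # D step: gradient ASCENT on log D(real) + log(1 - D(fake)).
    y_hat = softmax(X_all @ Wc)
    reals, fakes = np.hstack([X_lab, y_lab]), np.hstack([X_all, y_hat])
    grad_d = reals.T @ (1 - D(X_lab, y_lab)) - fakes.T @ D(X_all, y_hat)
    wd += lr * grad_d / (len(reals) + len(fakes))
    # C step: gradient DESCENT on log(1 - D(x, C(x))), i.e. fool D.
    y_hat = softmax(X_all @ Wc)
    dz = -sigmoid(np.hstack([X_all, y_hat]) @ wd)            # dL/dz per sample
    wy = wd[n_feat:]                                         # D's weights on y-hat
    ga = y_hat * (wy - (y_hat @ wy)[:, None]) * dz[:, None]  # back through softmax
    Wc -= lr * (X_all.T @ ga) / len(X_all)

probs = softmax(X_all @ Wc)  # P(target user is a potentially expandable user)
```

A real implementation would check an equilibrium criterion (e.g. the discriminator's accuracy hovering near chance) instead of a fixed budget, and would use mini-batches drawn at random per the stochastic gradient method described earlier.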
The input of the classification model C(x) is the feature data of a user x, and the output is the predicted classification result ŷ = C(x) corresponding to the user x.
For the classification model C(x), the user x may come from the classification sample set formed by the seed users and the target users; that is, the classification model C(x) can predict the classification results of both seed users and target users. In the embodiment of the invention, a deep neural network can be used as the classification model for training: the number of neurons in the input layer equals the number of user features; several fully connected layers sit in the middle; Leaky-ReLU is chosen as the activation function in the network, with batch normalization applied first; and the output layer has 2 neurons, with softmax used to predict the probability distribution ŷ = (ŷ₀, ŷ₁) over the two classes for the user.
It should be noted that for a seed user the corresponding true classification result is y = 1; that is, the seed users can be regarded as labeled samples, the target users as samples to be labeled, and the seed users as positive samples. The negative-sample selection problem, i.e., how to determine the users with y = 0, is discussed in detail next. Negative samples matter for training the model, and how to select high-quality negative samples bears directly on the model's generalization ability. In the embodiment of the present invention, the negative users may be randomly sampled from the target users: negative users are randomly selected from the target users and then treated as labeled samples whose true classification result is y = 0, so a negative user can be defined as a user whose true classification result is 0.
In addition, in the embodiment of the present invention, the negative users may also be sampled from the target users according to the first behavior data and the second behavior data of the seed users, such that the first behavior data of a negative user intersect those of the seed users while the second behavior data do not. The method for generating the feature data introduces the users' first behavior data, which enlarges the feature dimension and thus potentially enlarges the coverage. Therefore, to strengthen the model's generalization performance, besides random sampling, users whose first behavior data intersect the seed users' but whose second behavior data do not may be chosen as negative users; for example, a chosen negative user's follow, favorite and purchase behaviors have no intersection with the seed users', but the ordinary click behaviors do. Suppose a seed user clicks on pineapple, banana, cantaloupe, tomato, beef and egg, and finally buys pineapple, tomato and beef; then, when choosing a negative user, one can pick a user who bought eggs, bananas and cantaloupes but clicked on pineapples, tomatoes and beef. Sampling negative users from the target users according to the seed users' first and second behavior data keeps the seed users and negative users from being polarized into two extremes in the feature space, which is very important for strengthening the model's generalization ability.
Typically the ratio of positive to negative samples is kept at 1:3; if there are 100 seed users, 300 users can be randomly selected from the target users as negative users, or 300 users can be selected from the target users as negative users according to the first and second behavior data of the seed users.
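Both negative-sampling strategies described above can be sketched with Python sets; the users other than the pineapple/tomato example from the text are illustrative:

```python
import random

# Seed users' pooled behaviors, following the example in the text.
seed_clicks    = {"pineapple", "banana", "cantaloupe", "tomato", "beef", "egg"}
seed_purchases = {"pineapple", "tomato", "beef"}

# Candidate target users: (clicked items, purchased/followed/favorited items).
targets = {
    "u1": ({"egg", "pineapple", "tomato"}, {"egg", "banana", "cantaloupe"}),
    "u2": ({"laptop"},                     {"laptop"}),
    "u3": ({"beef", "banana"},             {"pineapple"}),
}

def behavior_negatives(targets):
    """Keep target users whose clicks (first behavior data) intersect the
    seed users' clicks but whose purchases etc. (second behavior data) do
    not intersect the seed users' purchases."""
    return [u for u, (clicks, buys) in targets.items()
            if clicks & seed_clicks and not (buys & seed_purchases)]

def random_negatives(targets, k, rng=random.Random(0)):
    """Plain random sampling of negative users from the target users."""
    return rng.sample(sorted(targets), k)

print(behavior_negatives(targets))  # ['u1']
```

Here u1 matches the text's example (purchases disjoint from the seed's, clicks overlapping), u2 shares no clicks, and u3 bought pineapple, which the seed also bought; with a 1:3 positive-to-negative ratio, enough such users would be drawn to triple the seed count.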
For the discriminant model D, given a sample (x, ŷ), the discriminant model D should recognize it as a fake sample; that is, when the feature data of a user x and the classification model's predicted classification result ŷ = C(x) are input, D should judge that the input classification result is a prediction of the classification model and not a true classification result. Similarly, given a sample (x, y), the discriminant model D should recognize it as a real sample; that is, when the feature data of a user x and the true classification result y are input, D should judge that the input classification result is a true one and that the input user is a seed user or a negative user. The classification model C(x) is thus equivalent to a user labeling model that labels the users in the classification sample set, while the real samples are already labeled. Note also that the user in a sample (x, ŷ) comes from the classification sample set formed by the seed users and the target users, whereas the user in a sample (x, y) comes from the labeled sample set formed by the seed users and the negative users, so a sample (x, ŷ) and a sample (x, y) may correspond to the same user.
In the embodiment of the invention, the input layer of the discrimination model D may have 2 more nodes than that of the classification model C(x); the two extra nodes respectively carry the probability that the user is a seed user and the probability that the user is not, i.e., the 2-neuron output of the classification model. The output layer of D is a single node with logistic regression, outputting the probability that the input sample is real. Leaky-ReLU is selected as the activation function in the network, with batch normalization applied first. The final loss function is:
V(D, C) = E_{(x,y) ~ P_seed(x) ∪ P_neg(x)} [log D(x, y)] + E_{x ~ P_all(x)} [log(1 - D(x, C(x)))]
where D denotes the discrimination model and C the classification model. V(D, C) is the cost function, a minimax objective that decomposes as f(C) = max_D V(D, C), with the overall objective min_C max_D V(D, C). P_seed(x) ∪ P_neg(x) denotes the labeled distribution P((x_seed, y=1) ∪ (x_neg, y=0)) of the seed users and negative users; a sample generated by this distribution is a real sample, and log D(x, y) is the log-likelihood that the discrimination model D recognizes a real sample. Since the seed users are labeled users, P_all(x) denotes the distribution of the target users; C(x) gives the predicted probability distribution of the classification model over whether a target user is a potentially extensible user, and log(1 - D(x, C(x))) is the log-likelihood that the discrimination model identifies a fake sample.
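As a numerical illustration of the value function above (not part of the patent), the two expectation terms can be estimated from the discrimination model's output probabilities on batches of real and fake samples; `d_real` and `d_fake` below are assumed arrays of those scores.

```python
import numpy as np

# Numerical illustration of
#   V(D, C) = E_real[log D(x, y)] + E_target[log(1 - D(x, C(x)))]
# given the discrimination model's scores on real and fake samples.

def value_function(d_real, d_fake):
    """d_real: D's scores on real samples (x, y); d_fake: D's scores on fake samples (x, C(x))."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A well-trained D scores real samples high and fake samples low, so V is close to 0.
v_good = value_function(np.array([0.9, 0.95]), np.array([0.05, 0.1]))
# At adversarial balance D outputs about 0.5 everywhere, giving V = 2 * log(0.5).
v_equilibrium = value_function(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
print(v_good > v_equilibrium)  # -> True
```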
V(D, C) thus represents an expected log-likelihood. The training target is: given the classification model C, obtain the discrimination model D by maximizing V(D, C), i.e., train the discrimination parameters with C fixed and build the trained discrimination model D from them. Then, for the given discrimination model D, the term E[log D(x, y)] is constant, and only E[log(1 - D(x, C(x)))] needs to be minimized, which makes the predicted classification results of the classification model C more accurate; the classification parameters are optimized accordingly and C is updated. These two steps are iterated until the classification model C and the discrimination model D satisfy the preset condition. The preset condition amounts to judging whether the two models meet the requirements: the classification model predicts whether a user has a preference for the target object, and the discrimination model judges whether an input sample is real or fake, so the condition can be set on the accuracy with which the discrimination model distinguishes real from fake. For example, if that accuracy falls to 0.5, the discrimination model can hardly tell real samples from fake ones, the prediction accuracy of the classification model is high, and both models are considered to satisfy the preset condition.
In addition, stochastic gradient algorithms include stochastic gradient ascent and stochastic gradient descent. In machine learning, when a loss function is to be minimized, the minimum and the corresponding parameter values are obtained by stepping along the negative gradient (gradient descent); conversely, when a function is to be maximized, the maximum is obtained by stepping along the gradient (gradient ascent).
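A minimal, purely didactic illustration of this duality, assuming a toy one-dimensional objective f(θ) = -(θ - 3)^2: stepping along the gradient (ascent) converges to its maximizer.

```python
# Gradient ascent on f(θ) = -(θ - 3)^2; the maximizer is θ = 3.
# Minimizing -f(θ) by gradient descent would take the same steps.

def gradient_ascent(grad, theta, lr=0.1, steps=200):
    for _ in range(steps):
        theta += lr * grad(theta)  # ascend: move with the gradient
    return theta

grad_f = lambda t: -2.0 * (t - 3.0)  # derivative of -(θ - 3)^2
print(round(gradient_ascent(grad_f, 0.0), 4))  # -> 3.0
```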
In summary, the sample set of the discrimination model D consists of real samples (x, y) and fake samples (x, ŷ). The user in a real sample (x, y) is drawn from the labeled sample set of seed users and negative users, where the true classification result is y=1 for a seed user and y=0 for a negative user; the user in a fake sample (x, ŷ) is drawn from the classification sample set of seed users and target users, and ŷ is the predicted classification result produced by the classification model C(x). The discrimination model D judges whether an input sample is real or fake. The classification model C(x) learns to identify which target users are potentially extensible, that is, to predict whether a user has a preference for the target object, and to make its predictions accurate enough to fool the discrimination model D. In the embodiment of the invention, C(x) and D are trained iteratively until they reach adversarial balance, at which point D can hardly tell real samples from fake ones, so C(x) predicts a distribution for the target users that conforms to the real samples.
In the embodiment of the invention, semi-supervised training is carried out on the specific users based on the adversarial network learning algorithm, which mainly addresses the problem of how to train a classifier with a small number of labeled samples and a large number of unlabeled samples. The real samples correspond to labeled samples, the fake samples correspond to unlabeled samples, and the classification model C(x) acts as a user-labeling model that can label the unlabeled (fake) samples. Semi-supervised training of the specific users therefore avoids both the sensitivity of adaptive threshold selection in unsupervised learning and the overfitting of supervised learning with special optimization methods, and can improve the accuracy of the trained algorithm model.
As a reference embodiment of the present invention, training the discrimination parameters by stochastic gradient ascent according to the feature data of the specific users and the classification model, to obtain the discrimination model corresponding to the discrimination parameters, may include:
step S1021, sampling at least one first classification sample set from a specific user;
step S1022, inputting the feature data corresponding to the at least one first classification sample set into the classification model to obtain a prediction classification result corresponding to the at least one first classification sample set;
step S1023, at least one mark sample set is sampled from the seed user and the negative user, and a real classification result corresponding to the at least one mark sample set is determined;
step S1024, constructing a discriminant training set by using a prediction classification result corresponding to at least one first classification sample set and a real classification result corresponding to at least one labeled sample set;
and step S1025, training the discrimination parameters according to the discriminant training set by stochastic gradient ascent.
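Steps S1021 to S1024 can be sketched as follows. Here `classify` stands in for the classification model C(x), the user lists and the Fake/Real tags are illustrative assumptions, and a real system would carry feature vectors rather than strings.

```python
import random

# Build the discriminant training set: fake samples (x, C(x)) sampled from the
# classification sample set (seed users ∪ target users), real samples (x, y)
# sampled from the labeled sample set (seed users ∪ negative users).

def build_discriminant_training_set(classification_users, labeled_users, classify, k=2, seed=0):
    rng = random.Random(seed)
    fakes = [(x, classify(x), "Fake") for x in rng.sample(classification_users, k)]
    reals = [(x, y, "Real") for x, y in rng.sample(labeled_users, k)]
    return fakes + reals

classification_users = ["x_seed1", "x_tgt1", "x_tgt2"]
labeled_users = [("x_seed1", 1), ("x_neg1", 0), ("x_neg2", 0)]  # y=1 seed, y=0 negative
train = build_discriminant_training_set(classification_users, labeled_users,
                                        classify=lambda x: 0.7, k=2)
print(len(train))  # -> 4 (2 fake + 2 real samples)
```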
In the embodiment of the invention, the seed users and the target users together form the classification sample set, and in the method of training the discrimination model with the classification model, the known classification model is used to solve for the discrimination model, that is, to solve for its discrimination parameters. The stochastic gradient method is an optimization algorithm commonly used in machine learning and artificial intelligence: it seeks a maximum or minimum by stepping along the direction of gradient ascent or descent, computing an average gradient on a batch of training data drawn at random from the training set at each step and repeating, which reduces the computational cost of each iteration and speeds up model training. As described above, the discrimination model D is obtained by maximizing V(D, C), i.e., by solving for a maximum along the direction of gradient ascent.
Therefore, in the method for training the discrimination parameters, a number of first classification sample sets are sampled from the classification sample set formed by the seed users Φ_seed and the target users Φ_target (the term "first" merely distinguishes these sets from the second classification sample sets described below and has no further significance). The user feature data in each first classification sample set is input into the classification model C(x) to obtain the predicted classification results ŷ. At least one labeled sample set is sampled from the seed users Φ_seed and the negative users Φ_neg, and the true classification results of the users in the labeled sample sets are determined. The discriminant training set is then constructed from the predicted classification results of the first classification sample sets and the true classification results of the labeled sample sets:

{(x, ŷ, Fake)} ∪ {(x, y, Real)}
where Fake marks a fake sample and Real marks a real sample. Finally, the discrimination parameters θ_D are trained by stochastic gradient ascent; over a batch of m real samples and m fake samples, the update consistent with maximizing V(D, C) is:

θ_D ← θ_D + η ∇_θD (1/m) Σ_i [log D(x_i, y_i) + log(1 - D(x_i, C(x_i)))]
In the embodiment of the invention, after the discrimination parameters are trained, they can be fixed to obtain the corresponding discrimination model D, and the classification parameters are then updated using the loss function of D, thereby training the classification model C. As explained above, the loss function is:

V(D, C) = E_{(x,y) ~ P_seed(x) ∪ P_neg(x)} [log D(x, y)] + E_{x ~ P_all(x)} [log(1 - D(x, C(x)))]

For a given discrimination model D, the term E[log D(x, y)] is constant, and only E[log(1 - D(x, C(x)))] needs to be minimized, which makes the predicted classification results of the classification model C more accurate; that is, a minimum is solved along the direction of gradient descent.
Therefore, as a reference embodiment of the present invention, training the classification parameters with the loss function of the discrimination model by stochastic gradient descent may include: determining the loss function of the discrimination model; sampling at least one second classification sample set from the specific users; and inputting the feature data corresponding to the at least one second classification sample set into the loss function of the discrimination model and training the classification parameters by stochastic gradient descent. In this method, a number of second classification sample sets are sampled from the classification sample set formed by the seed users Φ_seed and the target users Φ_target (again, "second" merely distinguishes these sets from the first classification sample sets above). The user feature data in each second classification sample set is input into the classification model C(x) to obtain the predicted classification results ŷ, and the classification parameters θ_C are trained by stochastic gradient descent:

θ_C ← θ_C - η ∇_θC (1/m) Σ_i log(1 - D(x_i, C(x_i)))
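A toy sketch of this classifier update, under the assumption that both models are simple logistic functions: with the discrimination model fixed, one stochastic-gradient-descent step on θ_C reduces the batch estimate of E[log(1 - D(x, C(x)))]. A numerical gradient is used purely for clarity; all parameter values are illustrative.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def loss(theta_c, X, w_d, v_d):
    c = sigmoid(X @ theta_c)        # classifier's predicted probabilities C(x)
    d = sigmoid(X @ w_d + v_d * c)  # fixed discriminator's "real" score D(x, C(x))
    return np.mean(np.log(1.0 - d))

def sgd_step(theta_c, X, w_d, v_d, lr=0.05, eps=1e-6):
    grad = np.zeros_like(theta_c)   # central-difference gradient, for clarity only
    for i in range(len(theta_c)):
        e = np.zeros_like(theta_c)
        e[i] = eps
        grad[i] = (loss(theta_c + e, X, w_d, v_d) - loss(theta_c - e, X, w_d, v_d)) / (2 * eps)
    return theta_c - lr * grad      # descend: move against the gradient

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 3))        # feature data of a second classification sample set
theta_c = np.zeros(3)               # classification parameters θ_C
w_d, v_d = rng.normal(size=3), 2.0  # fixed discrimination parameters θ_D

before = loss(theta_c, X, w_d, v_d)
theta_c_new = sgd_step(theta_c, X, w_d, v_d)
after = loss(theta_c_new, X, w_d, v_d)
print(after < before)  # the descent step reduces the discriminator-based loss
```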
FIG. 4 is a schematic diagram of the main flow of a method for training classification models and discriminant models according to an embodiment of the present invention. As shown in fig. 4, the main process of the method for training the classification model and the discriminant model may include:
step S401, initializing a classification model of the adversarial network learning algorithm;
step S402, sampling at least one first classification sample set from a specific user;
step S403, inputting the characteristic data corresponding to at least one first classification sample set into a classification model to obtain a prediction classification result corresponding to at least one first classification sample set;
step S404, sampling at least one marked sample set from the seed user and the negative user, and determining a real classification result corresponding to the at least one marked sample set;
step S405, constructing a discriminant training set by using a prediction classification result corresponding to at least one first classification sample set and a real classification result corresponding to at least one labeled sample set;
step S406, training the discrimination parameters according to the discriminant training set by stochastic gradient ascent, to obtain the discrimination model corresponding to the discrimination parameters and its loss function;
step S407, sampling at least one second classification sample set from a specific user;
step S408, inputting the feature data corresponding to the at least one second classification sample set into the loss function of the discrimination model, training the classification parameters by stochastic gradient descent, and updating the classification model according to the trained classification parameters;
step S409, judging whether the updated classification model and the trained discrimination model meet preset conditions, if so, executing step S410, and if not, executing step S402 to step S408 again for iterative training;
and step S410, obtaining the target classification model and the target discrimination model of the adversarial network learning algorithm.
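The control flow of steps S401 to S410, including the stopping rule under which training ends once the discrimination model's real/fake accuracy approaches 0.5, can be sketched as follows. `step_fn` stands in for one round of discriminator ascent plus classifier descent, and the toy dynamics are illustrative only.

```python
def train_adversarial(step_fn, is_converged, max_iters=1000):
    """Iterate the D-step and C-step (S402-S408) until the preset condition (S409)."""
    state = {"iter": 0, "d_accuracy": 1.0}
    for _ in range(max_iters):
        state = step_fn(state)   # one round: train D by ascent, then C by descent
        state["iter"] += 1
        if is_converged(state):  # preset condition: D's accuracy is about 0.5
            break
    return state

# Toy dynamics: each adversarial round moves D's accuracy halfway toward 0.5,
# mimicking the classifier gradually fooling the discriminator.
toy_step = lambda s: {**s, "d_accuracy": 0.5 + 0.5 * (s["d_accuracy"] - 0.5)}
converged = lambda s: abs(s["d_accuracy"] - 0.5) < 1e-3
final = train_adversarial(toy_step, converged)
print(final["iter"], round(final["d_accuracy"], 3))  # -> 9 0.501
```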
In the embodiment of the invention, the classification model and the discrimination model are trained with the adversarial network learning algorithm, so that training can be carried out on all users while avoiding both the sensitivity of adaptive threshold selection in unsupervised learning and the overfitting of supervised learning with special optimization methods. The classification model classifies the target users and outputs their predicted probability distribution, playing the role of the generator of the adversarial network; the discrimination model judges whether the predicted probability distribution of the classification model matches the real probability distribution; and the adversarial balance between the two enables the classification model to learn the joint distribution of the seed users and target users. In addition, iterative training with a stochastic gradient algorithm reduces the computational cost of each iteration and speeds up model training.
It should be noted that in step S401 the classification model is initialized first, the discrimination model is then trained with the initialized classification model, the classification model is in turn trained with the resulting discrimination model, and the iterative training continues from there. The initialization of the classification model is therefore important. As a reference embodiment of the present invention, initializing the classification model of the adversarial network learning algorithm may include: acquiring initial values of the classification parameters and initializing directly with them. That is, the initial values of the classification parameters are given directly and substituted into the classification model, which completes the initialization process.
In order to keep the overall training process stable, the classification model can also be trained in advance to reach a certain accuracy before being connected to the adversarial network for overall training. The method for pre-training the classification model may be: sampling at least one classification training set from the seed users and the negative users, and pre-training at least one initial classification model with it to complete the initialization of the classification model. Since the seed users and negative users are labeled users, with true classification result y=1 for a seed user and y=0 for a negative user, in this embodiment of the present invention a classification training set may be constructed from the seed users and negative users and used to pre-train the initial classification model.
In addition, in the embodiment of the invention, at least one classification training set can be constructed, and for each classification training set an initial classification model is pre-trained, yielding a corresponding classification model and discrimination model. Therefore, as a reference embodiment of the present invention, obtaining the target classification model and target discrimination model of the adversarial network learning algorithm may include: obtaining at least one optional classification model and at least one optional discrimination model corresponding to the at least one initial classification model; and determining the target classification model and target discrimination model from them. In general, different classification training sets train different classification models; the predicted classification results of the different classification models can be combined by voting, and the target classification model is finally selected.
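The voting idea can be sketched as a simple majority vote over per-user 0/1 predictions; the model outputs below are illustrative.

```python
# Majority vote across classification models trained on different
# classification training sets: each model predicts 0/1 per user, and the
# combined prediction is 1 when more than half of the models say 1.

def vote(predictions_per_model):
    """predictions_per_model: list of dicts user_id -> 0/1 prediction."""
    combined = {}
    for preds in predictions_per_model:
        for uid, p in preds.items():
            combined.setdefault(uid, []).append(p)
    return {uid: int(sum(votes) > len(votes) / 2) for uid, votes in combined.items()}

models = [
    {"u1": 1, "u2": 0, "u3": 1},
    {"u1": 1, "u2": 1, "u3": 0},
    {"u1": 0, "u2": 0, "u3": 1},
]
print(vote(models))  # -> {'u1': 1, 'u2': 0, 'u3': 1}
```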
Fig. 5 is a schematic structural diagram of a similar population expansion system according to an embodiment of the present invention, and as shown in fig. 5, the structure of the similar population expansion system according to the embodiment of the present invention may include: the system comprises a data processing part, a classification model part and a discrimination model part.
The data processing section may generate the feature data of a user from the user's basic attribute data and behavior data. The basic attribute data represent the user's long-term characteristics, while the behavior data represent recent characteristics and, being sparse, need further processing. The behavior data are divided into a large amount of ordinary click behavior data and a relatively small amount of other behavior data, such as attention behavior, collection behavior, and purchase behavior. In the embodiment of the present invention, feature embedding processing may be performed on the ordinary click behavior data, as described in detail in steps S10121 to S10124 above; word segmentation feature processing may be performed on the relatively small amount of other behavior data, as described in detail in steps S10131 to S10133 above. Neither is repeated here.
The classification model part and the discrimination model part form an adversarial system, and their adversarial balance enables the classification model to learn the joint distribution of the seed users and target users. The sample set of the discrimination model consists of real samples (x, y) and fake samples (x, ŷ). The user in a real sample (x, y) is drawn from the labeled sample set of seed users and negative users, where y=1 for a seed user and y=0 for a negative user; the user in a fake sample (x, ŷ) is drawn from the classification sample set of seed users and target users, and ŷ is the predicted classification result of the classification model. The discrimination model judges whether an input sample is real or fake. The classification model learns to identify which target users are potentially extensible, that is, to predict whether a user has a preference for the target object, and to make its predictions accurate enough to fool the discrimination model. The two models are trained iteratively until they reach adversarial balance, at which point the discrimination model D can hardly tell real samples from fake ones and the classification model predicts a distribution for the target users that conforms to the real samples. The process of training the classification model and the discrimination model has been described in detail in steps S401 to S410 above and is not repeated here.
Fig. 6 is a schematic diagram of the main modules of a similar population extending device according to an embodiment of the present invention. As shown in fig. 6, the similar population expanding device 600 of the embodiment of the present invention mainly includes the following modules: a generation module 601, a training module 602, and a determination module 603.
The generating module 601 may be configured to obtain the specific users corresponding to the target object and generate feature data of the specific users according to their basic attribute data and behavior data, where the specific users may include seed users and target users; the training module 602 may be configured to train the algorithm model of the adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific users; and the determining module 603 may be configured to determine, based on the algorithm model, the extensible users corresponding to the target object from the target users according to the feature data of the target users.
In this embodiment of the present invention, the training module 602 may further be configured to: initialize the classification model of the adversarial network learning algorithm; train the discrimination parameters by stochastic gradient ascent according to the feature data of the specific users and the classification model, to obtain the discrimination model corresponding to the discrimination parameters; train the classification parameters with the loss function of the discrimination model by stochastic gradient descent, and update the classification model according to the trained classification parameters; judge whether the discrimination model and the updated classification model satisfy the preset condition; if so, determine the updated classification model as the target classification model and the discrimination model as the target discrimination model; if not, continue model training with the feature data of the specific users and the updated classification model until the trained discrimination model and classification model satisfy the preset condition.
In this embodiment of the present invention, the training module 602 may further be configured to: sample at least one first classification sample set from the specific users; input the feature data corresponding to the at least one first classification sample set into the classification model to obtain the corresponding predicted classification results; sample at least one labeled sample set from the seed users and negative users and determine the corresponding true classification results, where the true classification result is 1 for a seed user and 0 for a negative user; construct the discriminant training set from the predicted classification results of the at least one first classification sample set and the true classification results of the at least one labeled sample set; and train the discrimination parameters according to the discriminant training set by stochastic gradient ascent.
In this embodiment of the present invention, the training module 602 may further be configured to: determine the loss function of the discrimination model; sample at least one second classification sample set from the specific users; and input the feature data corresponding to the at least one second classification sample set into the loss function of the discrimination model and train the classification parameters by stochastic gradient descent.
In this embodiment of the present invention, the training module 602 may further be configured to: acquiring an initial value of a classification parameter, and directly initializing by using the initial value of the classification parameter; and sampling at least one classification training set from the seed user and the negative user, and training at least one initial classification model in advance by using the at least one classification training set so as to finish initialization of the classification model.
In this embodiment of the present invention, the training module 602 may further be configured to: obtaining at least one optional classification model and at least one optional discriminant model corresponding to at least one initial classification model; from the at least one selectable classification model and the at least one selectable discriminant model, a target classification model and a target discriminant model are determined.
In an embodiment of the present invention, the similar population expanding device may further include a sampling module (not shown in the figure). The sampling module may be configured to: randomly sample negative users from the target users; or sample negative users from the target users according to the first behavior data and the second behavior data of the seed users, where the first behavior data of a negative user intersects the first behavior data of the seed users and the second behavior data of the negative user has no intersection with the second behavior data of the seed users.
In this embodiment of the present invention, the behavior data may include: first behavior data and second behavior data; and the generation module 601 is further operable to: acquiring basic attribute data, first behavior data and second behavior data; processing the first behavior data based on a preset embedded characteristic processing rule to generate embedded characteristic data corresponding to the first behavior data; processing the second behavior data based on a preset word segmentation characteristic processing rule to generate word segmentation characteristic data corresponding to the second behavior data; and combining the basic attribute data, the embedded characteristic data and the word segmentation characteristic data to generate characteristic data.
In this embodiment of the present invention, the generating module 601 may further be configured to: acquiring at least one article attribute data corresponding to the first behavior data; performing segmentation processing on at least one article attribute data according to a preset time threshold corresponding to the at least one article attribute data to obtain a behavior sequence corresponding to the at least one article attribute data; embedding the behavior sequence corresponding to the at least one article attribute data by using a word vector embedding algorithm to obtain sub-embedded characteristic data corresponding to the at least one article attribute data; and combining the sub-embedded characteristic data corresponding to at least one item attribute data to generate embedded characteristic data corresponding to the first behavior data.
In this embodiment of the present invention, the generating module 601 may further be configured to: acquiring an article description sentence corresponding to the second behavior data; performing word segmentation processing on the article description sentence to obtain at least one word segmentation, and filtering and screening the at least one word segmentation; and generating word segmentation characteristic data corresponding to the second behavior data by using the filtered and screened word segmentation.
In this embodiment of the present invention, the determining module 603 may further be configured to: inputting the characteristic data of the target user into a target classification model, and acquiring a prediction classification result corresponding to the target user; and selecting extensible users from the target users according to the corresponding prediction classification results of the target users based on preset extension conditions.
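A minimal sketch of this selection step, assuming the trained target classification model is available as a function returning a preference probability and the preset extension condition is a probability threshold (the value 0.8 and the stand-in classifier are hypothetical choices):

```python
# Feed target-user feature data to the target classification model and keep
# users whose predicted preference probability meets the extension condition.

def select_extensible_users(target_features, classify, threshold=0.8):
    """target_features: dict user_id -> feature vector; classify: x -> probability."""
    return [uid for uid, x in target_features.items() if classify(x) >= threshold]

# Stand-in classifier: the preference probability is the first feature, clipped to [0, 1].
toy_classify = lambda x: min(1.0, max(0.0, x[0]))
targets = {"t1": [0.9], "t2": [0.3], "t3": [0.85]}
print(select_extensible_users(targets, toy_classify))  # -> ['t1', 't3']
```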
From the above description, it can be seen that the similar population expanding device of the embodiment of the present invention can generate feature data from the basic attribute data and behavior data of specific users; because behavior data, which matter for improving the conversion rate, are introduced, the accuracy of the trained algorithm model can be improved. The device then trains the algorithm model of the adversarial network learning algorithm by a stochastic gradient method on the feature data of the specific users. Semi-supervised training of the specific users based on the adversarial network learning algorithm differs both from supervised training on seed users alone and from unsupervised clustering over all users, avoiding the sensitivity of adaptive threshold selection in unsupervised learning and the overfitting of supervised learning with special optimization methods. Finally, the extensible users are selected from the target users with the trained algorithm model, completing the similar population expansion.
In addition, the generating module of the similar population expanding device of the embodiment of the invention considers three aspects, namely basic attribute data, first behavior data, and second behavior data, comprehensively covering the user's long-term characteristics and recent behavior characteristics. This improves the accuracy of the feature data and the algorithm model, and thus of the obtained extensible users; introducing as many user behavior characteristics as possible is also significant for improving the conversion rate. Considering that the first behavior data are large in quantity and sparse, the corresponding embedded feature data are obtained by the embedding feature processing method, which avoids overfitting of the trained model.
In addition, the training module of the similar population expansion device of the embodiment of the present invention can train a classification model and a discriminant model with an adversarial network learning algorithm, performing semi-supervised training over all users, which avoids both the sensitivity of unsupervised learning to adaptive threshold selection and the overfitting of supervised learning and specially optimized methods. The classification model classifies the target users and outputs their predicted probability distribution, acting as the generator of the adversarial network; the discriminant model judges whether the predicted probability distribution of the classification model matches the real probability distribution. The adversarial equilibrium between the two models lets the classification model learn the joint distribution over the seed users and the target users. In addition, iterative training with a stochastic gradient algorithm reduces the computation overhead of each iteration and speeds up model training.
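The alternating adversarial loop described above can be sketched on toy data as follows. Everything here is an illustrative assumption rather than the embodiment's actual implementation: the synthetic user clusters, the plain logistic units used for both the classification model ("generator") and the discriminant model, the learning rates, and the step counts. The pre-training phase corresponds to the supervised initialization of the classification model; the loop alternates stochastic gradient ascent on the discriminant parameters with stochastic gradient descent on the classification parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy users: seed users cluster near +1, sampled negative users near -1,
# and an unlabeled target pool in between (all shapes are illustrative).
seed_x = rng.normal(+1.0, 0.5, size=(50, 2))
neg_x = rng.normal(-1.0, 0.5, size=(50, 2))
target_x = np.vstack([seed_x, neg_x, rng.normal(0.0, 1.0, size=(100, 2))])

w_c, b_c = np.zeros(2), 0.0  # classification model ("generator") parameters
w_d, b_d = 0.0, 0.0          # discriminant model scoring a predicted probability

classify = lambda x: sigmoid(x @ w_c + b_c)
discriminate = lambda p: sigmoid(w_d * p + b_d)

# 1) Initialize the classification model by pre-training on seed vs. negative users.
X = np.vstack([seed_x, neg_x])
y = np.concatenate([np.ones(50), np.zeros(50)])
for _ in range(200):  # full-batch logistic regression for simplicity
    p = classify(X)
    w_c += 0.1 * ((y - p) @ X) / len(X)
    b_c += 0.1 * float((y - p).mean())
pretrain_gap = float(classify(seed_x).mean() - classify(neg_x).mean())

# 2) Alternate stochastic-gradient updates: discriminator ascent, classifier descent.
for _ in range(100):
    # Discriminant step: "real" = true labels of labeled users, "fake" = classifier outputs.
    idx = rng.integers(0, len(X), 32)
    real_p = y[idx]
    fake_p = classify(target_x[rng.integers(0, len(target_x), 32)])
    for p, t in [(real_p, 1.0), (fake_p, 0.0)]:
        d = discriminate(p)
        w_d += 0.05 * float(((t - d) * p).mean())  # gradient ascent on log-likelihood
        b_d += 0.05 * float((t - d).mean())
    # Classification step: descend the loss log(1 - D(C(x))) on a sampled minibatch.
    xb = target_x[rng.integers(0, len(target_x), 32)]
    p = classify(xb)
    g = discriminate(p) * w_d * p * (1 - p)        # chain rule through D then C
    w_c += 0.05 * (xb * g[:, None]).mean(axis=0)
    b_c += 0.05 * float(g.mean())

scores = classify(target_x)  # predicted probability per target user
```

Because every update is computed on a small sampled minibatch, the per-iteration cost stays low, which is the speed benefit the stochastic gradient algorithm is credited with above.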
Fig. 7 shows an exemplary system architecture 700 to which the similar population expansion method or similar population expansion apparatus of the embodiments of the present invention can be applied.
As shown in fig. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves to provide a medium for communication links between the terminal devices 701, 702, 703 and the server 705. Network 704 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use the terminal devices 701, 702, 703 to interact with a server 705 over a network 704, to receive or send messages or the like. The terminal devices 701, 702, 703 may have installed thereon various communication client applications, such as a shopping-like application, a web browser application, a search-like application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).
The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 705 may be a server providing various services, for example, a background management server (merely an example) providing support for shopping websites browsed by users with the terminal devices 701, 702, 703. The background management server may analyze and otherwise process received data such as a product information query request, and feed back a processing result (for example, target push information or product information, merely an example) to the terminal device.
It should be noted that the similar population expansion method provided by the embodiment of the present invention is generally executed by the server 705, and accordingly, the similar population expansion apparatus is generally disposed in the server 705.
It should be understood that the number of terminal devices, networks, and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for implementing a terminal device of an embodiment of the present invention. The terminal device shown in fig. 8 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU)801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read out therefrom is installed into the storage section 808 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. When executed by the Central Processing Unit (CPU) 801, the computer program performs the above-described functions defined in the system of the present invention.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a generation module, a training module, and a determination module. For example, the generation module may be further described as a module that acquires a specific user corresponding to the target object and generates feature data of the specific user according to basic attribute data and behavior data of the specific user.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to: acquire a specific user corresponding to a target object, and generate feature data of the specific user according to basic attribute data and behavior data of the specific user, wherein the specific user may include a seed user and a target user; train an algorithm model of an adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user; and determine an extensible user corresponding to the target object from the target users according to the feature data of the target users based on the algorithm model.
According to the technical solution of the embodiment of the present invention, feature data can be generated from the basic attribute data and behavior data of a specific user. Because behavior data, which is significant for improving the conversion rate, is introduced, the accuracy of the trained algorithm model can be improved. An algorithm model of an adversarial network learning algorithm is then trained by a stochastic gradient method using the feature data of the specific user, so that semi-supervised training can be performed on the specific user based on the adversarial network learning algorithm. This differs both from methods that perform supervised training using only seed users and from methods that use all users based on unsupervised clustering, and thereby avoids both the sensitivity of unsupervised learning to adaptive threshold selection and the overfitting of supervised learning and specially optimized methods. Finally, extensible users are selected from the target users by the trained algorithm model, completing the similar population expansion.
In addition, the method for generating feature data of the embodiment of the present invention considers three aspects: basic attribute data, first behavior data, and second behavior data, so that the long-term characteristics and the recent behavior characteristics of the user are taken into account comprehensively. This improves the accuracy of the feature data and the algorithm model and in turn ensures the accuracy of the obtained extensible users; introducing as many user behavior characteristics as possible is also significant for improving the conversion rate. Considering that the first behavior data is large in quantity and consists of sparse features, an embedded feature processing method is adopted to obtain the corresponding embedded feature data, which avoids overfitting of the trained model.
In addition, the method for training the classification model and the discriminant model of the embodiment of the present invention can train both models with an adversarial network learning algorithm, performing semi-supervised training over all users, which avoids both the sensitivity of unsupervised learning to adaptive threshold selection and the overfitting of supervised learning and specially optimized methods. The classification model classifies the target users and outputs their predicted probability distribution, acting as the generator of the adversarial network; the discriminant model judges whether the predicted probability distribution of the classification model matches the real probability distribution. The adversarial equilibrium between the two models lets the classification model learn the joint distribution over the seed users and the target users. In addition, iterative training with a stochastic gradient algorithm reduces the computation overhead of each iteration and speeds up model training.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (14)

1. A method for similar population expansion, comprising:
acquiring a specific user corresponding to a target object, and generating feature data of the specific user according to basic attribute data and behavior data of the specific user, wherein the specific user comprises: a seed user and a target user;
training an algorithm model of an adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user;
and determining the extensible user corresponding to the target object from the target users according to the feature data of the target users, based on the algorithm model.
2. The method of claim 1, wherein the training an algorithm model of an adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user comprises:
initializing a classification model of the adversarial network learning algorithm;
training a discriminant parameter by a stochastic gradient ascent method according to the feature data of the specific user and the classification model, to obtain a discriminant model corresponding to the discriminant parameter;
training classification parameters by a stochastic gradient descent method using a loss function of the discriminant model, and updating the classification model according to the trained classification parameters;
judging whether the discriminant model and the updated classification model meet preset conditions or not;
if yes, determining the updated classification model as a target classification model, and determining the discriminant model as a target discriminant model;
if not, performing model training by using the feature data of the specific user and the updated classification model until the discriminant model and the classification model obtained by training meet the preset conditions.
3. The method according to claim 2, wherein the training a discriminant parameter by a stochastic gradient ascent method according to the feature data of the specific user and the classification model to obtain a discriminant model corresponding to the discriminant parameter comprises:
sampling at least one first classification sample set from the specific user;
inputting the feature data corresponding to the at least one first classification sample set into the classification model to obtain a prediction classification result corresponding to the at least one first classification sample set;
sampling at least one labeled sample set from the seed user and the negative user, and determining a real classification result corresponding to the at least one labeled sample set, wherein the real classification result corresponding to the seed user is 1, and the real classification result corresponding to the negative user is 0;
constructing a discriminant training set by using the prediction classification result corresponding to the at least one first classification sample set and the real classification result corresponding to the at least one labeled sample set;
and training the discriminant parameters according to the discriminant training set by a stochastic gradient ascent method.
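The construction of the discriminant training set in claim 3 can be sketched as follows: classifier predictions are tagged as "fake" (0) samples and the true seed/negative labels as "real" (1) samples. The user IDs, prediction values, and sample sizes are all hypothetical, and uniform sampling stands in for whatever sampling scheme an embodiment would use.

```python
import random

random.seed(0)

# Hypothetical inputs: the classification model's predictions for a sampled set
# of users, and true labels for sampled seed (1) / negative (0) users.
predicted = {"u1": 0.83, "u2": 0.41, "u3": 0.10}
labeled = {"s1": 1, "s2": 1, "n1": 0, "n2": 0}

def build_discriminant_training_set(predicted, labeled, k=2):
    """Tag classifier outputs as fake (0) and true labels as real (1) for the discriminator."""
    fake = [(p, 0) for p in random.sample(sorted(predicted.values()), k)]
    real = [(float(y), 1) for y in random.sample(sorted(labeled.values()), k)]
    return fake + real

train_set = build_discriminant_training_set(predicted, labeled)
```

The discriminant parameters would then be fitted on `train_set` by stochastic gradient ascent, maximizing the likelihood of telling the two tags apart.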
4. The method of claim 2, wherein the training classification parameters using a loss function of the discriminant model by a stochastic gradient descent method comprises:
determining a loss function of the discriminant model;
sampling at least one second classification sample set from the specific user;
and inputting the feature data corresponding to the at least one second classification sample set into the loss function of the discriminant model, and training the classification parameters by a stochastic gradient descent method.
5. The method of claim 2, wherein the initializing a classification model of the adversarial network learning algorithm comprises:
acquiring an initial value of a classification parameter, and initializing directly with the initial value of the classification parameter; or
sampling at least one classification training set from the seed user and the negative user, and pre-training at least one initial classification model with the at least one classification training set, so as to complete the initialization of the classification model.
6. The method of claim 5, wherein obtaining the target classification model and the target discriminant model of the adversarial network learning algorithm comprises:
obtaining at least one optional classification model and at least one optional discriminant model corresponding to the at least one initial classification model;
determining the target classification model and the target discriminant model from the at least one alternative classification model and the at least one alternative discriminant model.
7. The method according to claim 3 or 5, characterized in that the method further comprises:
randomly sampling the negative users from the target users; or
sampling the negative users from the target users according to the first behavior data and the second behavior data of the seed users; wherein
the first behavior data of the negative user intersects the first behavior data of the seed user, and the second behavior data of the negative user does not intersect the second behavior data of the seed user.
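The behavior-based negative sampling rule of claim 7 can be sketched with sets: a candidate qualifies as a negative user when its first behavior data intersects the seed users' first behavior data while its second behavior data does not intersect theirs. The concrete behaviors below (browsed vs. ordered categories) are illustrative assumptions, not the patent's definition of first and second behavior data.

```python
# Hypothetical behavior sets per user. Here "first behavior data" stands for broad
# interactions (e.g. browsed categories) and "second behavior data" for recent
# conversion-related actions (e.g. ordered categories); both are illustrative.
seed_first = {"phones", "laptops", "headphones"}
seed_second = {"phones"}

candidates = {
    "u1": ({"laptops", "books"}, {"books"}),  # intersects first, not second -> negative
    "u2": ({"phones"}, {"phones"}),           # intersects second -> excluded
    "u3": ({"toys"}, set()),                  # no intersection with first -> excluded
}

def sample_negative_users(candidates, seed_first, seed_second):
    """Keep users whose first behavior intersects the seed users' while the second does not."""
    return sorted(
        uid for uid, (first, second) in candidates.items()
        if (first & seed_first) and not (second & seed_second)
    )

negatives = sample_negative_users(candidates, seed_first, seed_second)
```

The intuition: such users were exposed to similar items as the seed users (comparable first behavior) yet did not convert (disjoint second behavior), which makes them informative negatives.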
8. The method of claim 1, wherein the behavior data comprises: first behavior data and second behavior data; and
generating feature data of the specific user according to the basic attribute data and the behavior data of the specific user, including:
acquiring the basic attribute data, the first behavior data and the second behavior data;
processing the first behavior data based on a preset embedded feature processing rule to generate embedded feature data corresponding to the first behavior data;
processing the second behavior data based on a preset word segmentation feature processing rule to generate word segmentation feature data corresponding to the second behavior data;
and combining the basic attribute data, the embedded feature data and the word segmentation feature data to generate the feature data.
9. The method according to claim 8, wherein the processing the first behavior data based on a preset embedded feature processing rule to generate embedded feature data corresponding to the first behavior data comprises:
acquiring at least one item attribute data corresponding to the first behavior data;
performing segmentation processing on the at least one article attribute data according to a preset time threshold corresponding to the at least one article attribute data to obtain a behavior sequence corresponding to the at least one article attribute data;
embedding the behavior sequence corresponding to the at least one article attribute data by using a word vector embedding algorithm to obtain sub-embedded feature data corresponding to the at least one article attribute data;
and combining the sub-embedded feature data corresponding to the at least one article attribute data to generate the embedded feature data corresponding to the first behavior data.
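The steps of claim 9 (segmenting each article attribute's events by a time threshold into a behavior sequence, embedding each sequence, and concatenating the per-attribute sub-embeddings) can be sketched as follows. The attribute names, timestamps, threshold, and random embedding tables are hypothetical; the tables stand in for vectors a word2vec-style algorithm would actually learn.

```python
import numpy as np

rng = np.random.default_rng(0)
NOW, TIME_THRESHOLD = 16, 7  # "now" in days and a preset time threshold (illustrative)

# Hypothetical timestamped behaviors for two item attributes: (day, attribute value).
cat_events = [(1, "books"), (10, "phones"), (12, "phones"), (15, "laptops")]
brand_events = [(14, "acme"), (15, "zenit")]

def to_behavior_sequence(events, now=NOW, threshold=TIME_THRESHOLD):
    """Segment events by the time threshold into an ordered recent-behavior sequence."""
    return [v for t, v in sorted(events) if now - t <= threshold]

# Stand-ins for word-vector embedding tables learned per attribute.
cat_vocab = {"books": 0, "phones": 1, "laptops": 2}
brand_vocab = {"acme": 0, "zenit": 1}
cat_emb = rng.normal(0.0, 0.1, size=(3, 4))
brand_emb = rng.normal(0.0, 0.1, size=(2, 4))

def sub_embedding(seq, vocab, table):
    """Mean-pool one attribute's behavior sequence into its sub-embedded feature data."""
    return table[[vocab[v] for v in seq]].mean(axis=0) if seq else np.zeros(table.shape[1])

cat_seq = to_behavior_sequence(cat_events)
embedded = np.concatenate([
    sub_embedding(cat_seq, cat_vocab, cat_emb),
    sub_embedding(to_behavior_sequence(brand_events), brand_vocab, brand_emb),
])
```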
10. The method according to claim 8, wherein the processing the second behavior data based on a preset word segmentation feature processing rule to generate word segmentation feature data corresponding to the second behavior data includes:
acquiring an article description sentence corresponding to the second behavior data;
performing word segmentation processing on the article description sentence to obtain at least one word segmentation, and filtering and screening the at least one word segmentation;
and generating word segmentation characteristic data corresponding to the second behavior data by using the filtered and screened word segmentation.
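The word segmentation feature processing of claim 10 can be sketched as follows. For simplicity the example tokenizes English descriptions with a regular expression and a tiny stopword list; an actual embodiment would likely segment Chinese item descriptions with a dedicated segmenter, and the sentences, stopwords, and bag-of-words representation here are all illustrative assumptions.

```python
import re

STOPWORDS = {"the", "a", "of", "with", "and", "for"}  # illustrative filter list

def segment_and_filter(sentence, min_len=2):
    """Tokenize an item description and drop stopwords and very short tokens."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) >= min_len]

def to_term_features(sentences):
    """Bag-of-words word segmentation features over a user's item descriptions."""
    feats = {}
    for s in sentences:
        for t in segment_and_filter(s):
            feats[t] = feats.get(t, 0) + 1
    return feats

feats = to_term_features(["The phone with a large display", "Case for the phone"])
```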
11. The method according to claim 2, wherein the determining, based on the algorithm model and according to the feature data of the target user, an extensible user corresponding to the target object from the target users comprises:
inputting the feature data of the target user into the target classification model to obtain a prediction classification result corresponding to the target user;
and selecting the extensible user from the target users according to the prediction classification result corresponding to the target user, based on a preset extension condition.
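The selection step of claim 11 can be sketched as filtering the target users by their prediction classification results under a preset extension condition. The score values, the threshold of 0.6, and the optional top-k cap are hypothetical stand-ins for whatever extension condition an embodiment actually configures.

```python
# Hypothetical prediction classification results (probability of being seed-like)
# for target users, as output by the trained target classification model.
scores = {"u1": 0.91, "u2": 0.34, "u3": 0.77, "u4": 0.50}

def select_extensible_users(scores, threshold=0.6, top_k=None):
    """Apply a preset extension condition: a score threshold, optionally capped at top_k."""
    picked = sorted((u for u, s in scores.items() if s >= threshold),
                    key=lambda u: -scores[u])
    return picked[:top_k] if top_k else picked

extensible = select_extensible_users(scores, threshold=0.6)
```

A top-k cap is a common practical variant when the downstream campaign can only reach a fixed audience size.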
12. A similar population extension device, comprising:
a generating module, configured to obtain a specific user corresponding to a target object, and generate feature data of the specific user according to basic attribute data and behavior data of the specific user, where the specific user includes: a seed user and a target user;
the training module is used for training an algorithm model of an adversarial network learning algorithm by a stochastic gradient method according to the feature data of the specific user;
and the determining module is used for determining the extensible user corresponding to the target object from the target users according to the feature data of the target users, based on the algorithm model.
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-11.
14. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-11.
CN202010580213.1A 2020-06-23 2020-06-23 Similar crowd expansion method and device Active CN113762298B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010580213.1A CN113762298B (en) 2020-06-23 2020-06-23 Similar crowd expansion method and device


Publications (2)

Publication Number Publication Date
CN113762298A true CN113762298A (en) 2021-12-07
CN113762298B CN113762298B (en) 2024-06-18

Family

ID=78785422

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114792256A (en) * 2022-06-23 2022-07-26 上海维智卓新信息科技有限公司 Population expansion method and device based on model selection

Citations (2)

Publication number Priority date Publication date Assignee Title
WO2019019860A1 (en) * 2017-07-24 2019-01-31 华为技术有限公司 Method and apparatus for training classification model
CN110532377A (en) * 2019-05-13 2019-12-03 南京大学 A kind of semi-supervised file classification method based on dual training and confrontation learning network


Non-Patent Citations (2)

Title
QINGQING WANG: "Similar Handwritten Chinese Character Recognition Using Hierarchical CNN Model", IEEE, 29 January 2018 (2018-01-29) *
HAN Dong; WANG Chunhua; XIAO Min: "A text classification method combining semi-supervised learning and the LDA model", Computer Engineering and Design, no. 10, 16 October 2018 (2018-10-16) *



Similar Documents

Publication Publication Date Title
CN109063163B (en) Music recommendation method, device, terminal equipment and medium
JP6704930B2 (en) Technical and semantic signal processing in large unstructured data fields
CN110069709B (en) Intention recognition method, device, computer readable medium and electronic equipment
Zhao et al. Deep image clustering with category-style representation
CN111667022A (en) User data processing method and device, computer equipment and storage medium
CN112395487B (en) Information recommendation method and device, computer readable storage medium and electronic equipment
CN105225135B (en) Potential customer identification method and device
CN113190670A (en) Information display method and system based on big data platform
CN106537423A (en) Adaptive featurization as service
Mohanapriya et al. Comparative study between decision tree and knn of data mining classification technique
Hassanat et al. Magnetic force classifier: a Novel Method for Big Data classification
CN111898704A (en) Method and device for clustering content samples
CN115759748A (en) Risk detection model generation method and device and risk individual identification method and device
CN113051911B (en) Method, apparatus, device, medium and program product for extracting sensitive words
CN113762298B (en) Similar crowd expansion method and device
Zhang et al. A resource limited artificial immune system algorithm for supervised classification of multi/hyper‐spectral remote sensing imagery
Hamad et al. Sentiment analysis of restaurant reviews in social media using naïve bayes
CN112632275B (en) Crowd clustering data processing method, device and equipment based on personal text information
CN116861226A (en) Data processing method and related device
Liang et al. Incremental deep forest for multi-label data streams learning
Babar et al. Real-time fake news detection using big data analytics and deep neural network
Desale et al. Fake review detection with concept drift in the data: a survey
Bai Multiobjective clustering using support vector machine: application to microarray cancer data
Maji et al. Neural network tree for identification of splice junction and protein coding region in DNA
Patil et al. Efficient processing of decision tree using ID3 & improved C4.5 algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant