CN110765352B - User interest identification method and device - Google Patents

Info

Publication number
CN110765352B
Authority
CN
China
Prior art keywords
app
user
data
neural network
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910967146.6A
Other languages
Chinese (zh)
Other versions
CN110765352A (en)
Inventor
李君浩
李铁铮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shanghu Information Technology Co ltd
Original Assignee
Shanghai Shanghu Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shanghu Information Technology Co ltd filed Critical Shanghai Shanghu Information Technology Co ltd
Priority to CN201910967146.6A priority Critical patent/CN110765352B/en
Publication of CN110765352A publication Critical patent/CN110765352A/en
Application granted granted Critical
Publication of CN110765352B publication Critical patent/CN110765352B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a user interest identification method and device, and belongs to the technical field of deep learning. The method comprises: inputting APP list data containing user APP time sequence information into a neural network recognition model, performing calculation including weighted aggregation through the neural network recognition model, and outputting a user interest recognition result, wherein the neural network recognition model is obtained by training on APP list training data containing user APP time sequence information. The method and the device solve the prior-art problem of inaccurate model prediction caused by ignoring the user's APP time sequence information, obtain richer semantic information related to the user's interest, and improve the accuracy of user interest identification.

Description

User interest identification method and device
Technical Field
The invention relates to the technical field of deep learning, in particular to a user interest identification method and device.
Background
For machine learning models that characterize user interest preferences, the traditional method, shown in fig. 1, feeds all APPs together into the network for learning and obtains vector representation parameters for the APPs. Timing information is not considered and each APP is treated independently: the model uses only static statistics of which types of APPs the user has historically used as feature variables. A user's behavior preference often changes over a short time, but such a model treats these changes as noise and interference, which makes its predictions inaccurate. Therefore, timing information needs to be considered during modeling so as to capture both the long-term stable interests and the short-term sudden preferences of the user.
For example, in big data application scenarios such as online marketing, recommendation, and risk control, a user often actively authorizes a service provider to obtain the list of APPs installed on the user's mobile device in order to get a better experience. The service provider uses the list to evaluate the user's interest preferences and, in combination with the actual service scenario, to make more accurate recommendations or offer a higher credit limit. Data analysis shows that a user's interest preferences change over time. For example, a user who likes music may download many recruitment APPs in a short period while job hunting and delete them after starting a job, but is unlikely to uninstall music APPs.
In addition, the current application market contains millions of APPs, the APP description and category information provided by different download channels can differ, and new APPs are added every day. Manually organizing and classifying this information would be very costly for a service provider, so a machine learning model is needed to automatically extract the semantic information of APPs and thereby identify user interest.
Disclosure of Invention
In order to solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for identifying user interest, which solve the problem in the prior art that model prediction is inaccurate due to ignoring user APP timing information, obtain richer semantic information related to user interest, and improve accuracy of user interest identification.
The technical scheme is as follows:
in one aspect, a user interest identification method is provided, and the method includes:
inputting APP list data containing user APP time sequence information into a neural network recognition model, performing calculation including weighted aggregation through the neural network recognition model, and outputting a user interest recognition result, wherein the neural network recognition model is obtained by training on APP list training data containing user APP time sequence information.
Further, the neural network recognition model is obtained by training in the following way:
inputting APP list training data containing user APP time sequence information into a preset neural network recognition model to generate APP vector data;
performing weighted aggregation on the APP vector data;
and continuously training by utilizing a preset multilayer perceptron algorithm to obtain the neural network recognition model.
Further, before performing weighted aggregation on the APP vector data, each APP weight of each user is calculated respectively, so as to obtain respective APP weight data of multiple users.
Further, calculating each APP weight of each user respectively, thereby obtaining respective APP weight data of a plurality of users, including:
multiplying the original embedding vectors in the APP vector data element-wise, pair by pair, to obtain an N-dimensional interaction vector for each pair; concatenating the two original embedding vectors and the N-dimensional interaction vector into a 3N-dimensional input vector; and calculating from it the respective APP weight data of the plurality of users, wherein N is an integer greater than 1.
Further, the method further comprises:
acquiring APP list data, cleaning the APP list data according to the number of installation people, removing low-frequency APPs in the APP list data, labeling all APPs, and acquiring respective numbers of the APPs;
and acquiring corresponding APP information of a plurality of users at different moments, and marking each user and the corresponding APP information of the user at different moments respectively based on the respective numbers of the APPs to obtain APP list data containing the APP time sequence information of the users.
Further, the weighted aggregation comprises:
computing, in sequence, weights between the APP vectors of each later moment and the APP vectors of the earlier moments to obtain weights w21, w22, w23, …, w2n, w2(n+1); aggregating, in sequence, the weighted APP vectors of each moment with those of all earlier moments to obtain an aggregate vector for each moment; and summing or averaging the aggregate vectors of all moments, wherein n is an integer greater than 1.
In another aspect, an apparatus for identifying a user interest is provided, the apparatus including a model identification module configured to:
input APP list data containing user APP time sequence information into a neural network recognition model, perform calculation including weighted aggregation through the neural network recognition model, and output a user interest recognition result, wherein the neural network recognition model is obtained by training on APP list training data containing user APP time sequence information.
Further, the device also comprises a model training module, wherein the model training module comprises a vector generation sub-module, a weighting aggregation sub-module and an MLP sub-module, the vector generation sub-module inputs APP list training data containing user APP time sequence information into a preset neural network recognition model to generate APP vector data; the weighted aggregation sub-module carries out weighted aggregation on the APP vector data; and the MLP sub-module continues training by utilizing a preset multilayer perceptron algorithm to obtain the neural network recognition model.
Further, the model training module further comprises a weight calculation sub-module configured to: before the APP vector data are subjected to weighted aggregation, multiply the original embedding vectors in the APP vector data element-wise, pair by pair, to obtain an N-dimensional interaction vector for each pair; concatenate the two original embedding vectors and the N-dimensional interaction vector into a 3N-dimensional input vector; and calculate from it the respective APP weight data of the plurality of users, wherein N is an integer greater than 1.
Further, the weighted aggregation comprises: computing, in sequence, weights between the APP vectors of each later moment and the APP vectors of the earlier moments to obtain weights w21, w22, w23, …, w2n, w2(n+1); aggregating, in sequence, the weighted APP vectors of each moment with those of all earlier moments to obtain an aggregate vector for each moment; and summing or averaging the aggregate vectors of all moments, wherein n is an integer greater than 1.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
by using a neural network recognition model and an APP semantic feature extraction scheme that fuses time sequence information, low-dimensional APP vectors containing time-sequence behavior information and other semantically richer recognition results are obtained. Because the recognition process fully considers dynamic APP time sequence information and, through the weighted aggregation operation, fuses interaction information among different APPs, the long-term and short-term interests of a user can be calculated dynamically and the accuracy of user interest identification is improved. The automatically obtained low-dimensional APP vectors can also be used as input data for other deep learning models or subsequent applications (such as clustering and similarity calculation).
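As an illustration of the similarity calculation mentioned above, the following is a minimal sketch that ranks APPs by cosine similarity of their low-dimensional vectors. The choice of cosine similarity, the function names, and the 4-dimensional toy vectors are assumptions made for illustration; the text only says that the learned vectors can be used for similarity calculation and does not prescribe a measure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def most_similar(app_id, embeddings, top_k=3):
    """Rank all other APPs by cosine similarity to app_id."""
    query = embeddings[app_id]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != app_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; real ones would come from the trained model.
embeddings = {
    "a_1": [1.0, 0.0, 0.0, 0.0],
    "a_2": [0.9, 0.1, 0.0, 0.0],
    "a_3": [0.0, 0.0, 1.0, 0.0],
}
ranked = most_similar("a_1", embeddings, top_k=2)
```

In this toy data, `a_2` ranks above `a_3` for the query `a_1` because its vector points in nearly the same direction.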
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow diagram of user interest identification through a machine learning model in the prior art;
FIG. 2 is a flowchart of a user interest identification method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a neural network recognition model training process according to a preferred embodiment;
FIG. 4 is a flow chart of a preferred embodiment of a user interest identification method;
FIG. 5 is a diagram of a main network framework of a neural network model in the user interest identification method according to a preferred embodiment;
FIG. 6 is a diagram of the weighted aggregation sub-network framework of FIG. 5;
FIG. 7 is a schematic structural diagram of a user interest recognition apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a preferred embodiment of a user interest recognition apparatus.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
The user interest identification method and device provided by the embodiments of the invention fully consider the user's APP time sequence information and perform calculation and identification with a user interest recognition model trained on that information. This yields richer semantic information vectors and user interest recognition results and improves the accuracy of user interest identification. As an additional effect, the richer and more comprehensive semantic information vectors can be used directly as input data for other data processing (clustering, other machine learning algorithms, and the like). The method and device are therefore suitable for various big data application scenarios involving user interest identification, such as user information mining, user information analysis, online marketing, online recommendation, and online credit risk control.
The following describes a user interest identification method and apparatus provided in an embodiment of the present invention in detail with reference to specific embodiments and accompanying drawings.
Fig. 2 is a flowchart of a user interest identification method according to an embodiment of the present invention. As shown in fig. 2, the method for identifying user interest provided by the embodiment of the present invention includes the following steps:
101. APP list data containing user APP timing information is input.
Specifically, APP list data containing user APP time sequence information is input into the neural network recognition model. Here, the user APP time sequence information refers to time sequence behavior action information for one or more APPs installed or uninstalled by a certain user, for example, installation or uninstallation time, installation or uninstallation times, installation or uninstallation frequency, and the like, and besides, may also include any other time sequence behavior action information possible in the prior art, and the embodiment of the present invention is not particularly limited thereto. In order to comprehensively identify the user interest preference, the APP list data may include, in addition to the user APP timing information, APP name information installed or uninstalled by the user, APP category information, and/or legally acquired user personal information. In addition, the neural network recognition model is preferably created based on the user interest specific recognition purpose and the APP list data, and any other possible user interest recognition models in the prior art may be adopted, and the embodiment of the present invention is not limited thereto.
The method and the device are different from the user interest identification scheme in the traditional technology, the dynamic APP time sequence information of the user is fully considered, and richer, comprehensive and accurate user interest related semantic information is learned and obtained through the learning model, so that the long-term interest and the short-term interest of the user are dynamically calculated, and the accuracy of user interest identification is improved.
102. The calculation including the weighted aggregation is performed by a neural network recognition model.
Specifically, the input user APP list data is calculated through a neural network recognition model, the calculation process comprises weighted aggregation operation, and the weighted aggregation mainly has the effect that user APP time sequence information and/or other APP information data in the user APP list data are fused or spliced according to corresponding weights, so that subsequent calculation is facilitated. The neural network recognition model is obtained by training through inputting APP list training data containing user APP time sequence information, and the specific training process is not limited in the embodiment of the invention, and any possible mode in the prior art can be adopted as long as the obtained neural network model can realize the user interest recognition function claimed by the embodiment of the invention.
In addition, in a preferred embodiment, before performing weighted aggregation on the APP vector data, each APP weight of each user is calculated, so as to obtain respective APP weight data of a plurality of users. As for the method for calculating and acquiring, any possible method in the prior art may be adopted, and the embodiment of the present invention is not particularly limited thereto.
103. And outputting a user interest identification result.
Specifically, according to the purpose of user interest identification, corresponding selection of APP list data and corresponding parameter setting of a calculation model are carried out, and after the calculation process, a corresponding user interest identification result is output. How to select the APP list data, how to set the parameters of the calculation model, and what kind of user interest recognition result (the specific data form or the interest item content of the user interest recognition result) is output may be set according to specific needs, and the embodiment of the present invention is not particularly limited thereto. For example, APP list data related to the user transaction credit may be selected, where the APP list data includes APP timing information related to the user transaction credit, APP categories related to the transaction credit, and personal information of the user related to the transaction credit, and the output identification result is a user transaction credit level or a user default risk probability.
FIG. 3 is a schematic diagram of a neural network recognition model training process according to a preferred embodiment. As shown in fig. 3, the neural network recognition model provided in the embodiment of the present invention is obtained by training through the following steps:
201. inputting APP list training data containing user APP time sequence information into a preset neural network recognition model to generate APP vector data;
202. carrying out weighted aggregation on APP vector data;
203. and continuously training by utilizing a preset multilayer perceptron algorithm to obtain the neural network recognition model.
First, the preset neural network recognition model is a network model designed and created in advance according to training requirements, and preferably includes an input layer, an aggregation layer, an MLP network layer, and an output layer. APP list training data containing user APP time sequence information are input into the input layer of the preset neural network recognition model to generate APP vector data; the APP vector data are weighted and aggregated in the aggregation layer; and training continues in the MLP network layer, whose output is used for verification, thereby obtaining the neural network recognition model.
Secondly, in a preferred embodiment, before the step 201, the following operation steps are further included:
acquiring APP list training data, cleaning the APP list training data according to the number of installation persons, removing low-frequency APPs in the APP list training data, and then labeling all APPs to acquire the respective numbers of the APPs;
corresponding APP information of a plurality of users at different moments is obtained, based on respective numbers of the APPs, the corresponding APP information of each user and the corresponding APP information of the user at different moments are respectively marked, and APP list training data containing APP time sequence information of the users are obtained.
In addition, in a preferred embodiment, before step 202, the following steps are further included:
the original embedding vectors in the APP vector data are multiplied element-wise, pair by pair, to obtain an N-dimensional interaction vector for each pair; the two original embedding vectors and the N-dimensional interaction vector are then concatenated into a 3N-dimensional input vector, and the respective APP weight data of the plurality of users are obtained through MLP network calculation, where N is an integer greater than 1.
Namely, the preset neural network recognition model further comprises a weight calculation sub-network or a weight calculation network layer, and the operation steps execute a training process through the weight calculation sub-network or the weight calculation network layer.
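The weight calculation step can be sketched as follows. This is a minimal illustration, not the patented implementation: the single hidden layer, the ReLU activation, the sigmoid output (used here to keep the weight between 0 and 1), the initialization scale, and all function names are assumptions. The text itself only specifies the 3N-dimensional input and an MLP network.

```python
import math
import random


def build_pair_input(emb_new, emb_old):
    """Concatenate two N-dim embeddings with their element-wise product (3N dims)."""
    if len(emb_new) != len(emb_old):
        raise ValueError("embeddings must have the same dimension")
    interaction = [a * b for a, b in zip(emb_new, emb_old)]
    return list(emb_new) + list(emb_old) + interaction


def init_weight_mlp(n_dim, hidden, rng):
    """Randomly initialise a one-hidden-layer MLP mapping 3N -> hidden -> 1."""
    in_dim = 3 * n_dim
    return {
        "w1": [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.gauss(0.0, 0.1) for _ in range(hidden)],
        "b2": 0.0,
    }


def pair_weight(params, emb_new, emb_old):
    """Forward pass: ReLU hidden layer, sigmoid output as the pair weight."""
    x = build_pair_input(emb_new, emb_old)
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["w1"], params["b1"])
    ]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
N = 4  # embedding dimension, kept small for the example
params = init_weight_mlp(N, hidden=8, rng=rng)
emb_later = [rng.gauss(0.0, 1.0) for _ in range(N)]    # APP at a later moment
emb_earlier = [rng.gauss(0.0, 1.0) for _ in range(N)]  # APP at an earlier moment
w = pair_weight(params, emb_later, emb_earlier)
```

Because the weight depends on both embeddings, it changes whenever either APP vector is updated during training, which is consistent with the dynamic weights described for the sub-network.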
Fig. 4 is a flow chart of a preferred method for identifying user interest. As shown in fig. 4, the method for identifying user interest in the preferred embodiment includes the following steps:
301. acquiring APP list data, cleaning the APP list data according to the number of installation people, removing low-frequency APPs in the APP list data, labeling all APPs, and acquiring respective numbers of the APPs;
302. acquiring corresponding APP information of a plurality of users at different moments, and labeling each user and the corresponding APP information of the user at different moments respectively based on the respective numbers of the APPs to obtain APP list data containing user APP time sequence information;
303. inputting APP list data containing user APP time sequence information into a neural network identification model to generate APP vector data;
304. multiplying the original embedding vectors in the APP vector data element-wise, pair by pair, to obtain N-dimensional interaction vectors, concatenating the two original embedding vectors and the N-dimensional interaction vector into a 3N-dimensional input vector, and calculating the respective APP weight data of a plurality of users, wherein N is an integer greater than 1;
305. computing, in sequence, weights between the APP vectors of each later moment and the APP vectors of the earlier moments to obtain weights w21, w22, w23, …, w2n, w2(n+1), aggregating, in sequence, the weighted APP vectors of each moment with those of all earlier moments to obtain an aggregate vector for each moment, and summing or averaging the aggregate vectors of all moments, wherein n is an integer greater than 1;
306. MLP calculation is carried out through (an MLP network layer of) a neural network model, and a user interest recognition result is output.
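The weighted aggregation of step 305 can be sketched as below, under stated assumptions. The function names and toy data are hypothetical, averaging is used where the text allows either sum or average, and the rule for combining pair weights when several APPs are added at the same moment (a simple mean) is an assumption, since the text does not specify it. The weight function is passed in as a parameter so that any weight calculation method can be plugged in.

```python
def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def aggregate_over_time(app_lists, embeddings, weight_fn):
    """
    app_lists: APP-id lists ordered by time, [T0, T1, ..., Tn].
    embeddings: dict mapping APP id -> N-dim vector.
    weight_fn(emb_later, emb_earlier) -> float weight for the earlier APP.
    Returns one aggregate vector per later moment T1..Tn.
    """
    aggregates = []
    for t in range(1, len(app_lists)):
        later = app_lists[t]
        earlier = [app for step in app_lists[:t] for app in step]
        if not later or not earlier:
            continue
        weighted = []
        for old in earlier:
            # Assumption: average the pair weights over all APPs added at time t.
            w = sum(
                weight_fn(embeddings[new], embeddings[old]) for new in later
            ) / len(later)
            weighted.append([w * x for x in embeddings[old]])
        aggregates.append(mean_vectors(weighted))
    return aggregates


# Toy data: constant weight 0.5 stands in for a learned weight function.
embeddings = {
    "a_1": [2.0, 0.0],
    "a_2": [0.0, 2.0],
    "a_3": [4.0, 4.0],
    "a_4": [1.0, 1.0],
}
app_lists = [["a_1", "a_2"], ["a_3"], ["a_4"]]  # T0, T1, T2
aggregates = aggregate_over_time(app_lists, embeddings, lambda new, old: 0.5)
user_vector = mean_vectors(aggregates)  # input to the MLP of step 306
```

With three moments the sketch produces two aggregate vectors, matching the description that a sample with moments up to Tn yields n aggregate vectors.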
Fig. 5 is a diagram of a main network framework of a neural network model in the user interest recognition method according to the preferred embodiment. Fig. 6 is a diagram of the weighted aggregation sub-network framework of fig. 5.
In a preferred embodiment, the user interest recognition method is implemented as follows.
In a big data analysis scenario related to user transactions, the moment when a user first grants authorization and the service provider obtains the user's APP installation list is marked as time T0, the initial time. The order in which the APPs in the list obtained at this time were installed cannot be determined. Once the user becomes an active user of the platform, the later times are denoted T1, T2, ….
To obtain APP list data as input data, data preprocessing is required. First, an APP list library is prepared: a relatively long time window is selected, for example starting from 2017; the lists of APPs installed by all borrowing users on the platform are cleaned; low-frequency APPs are removed according to the number of installing users; and finally a de-duplicated APP list is generated in which each APP is assigned a unique number, such as a_1, a_2, …, a_n;
secondly, a data set is constructed: from the borrowing users in the selected time window, those for whom APP lists at several different moments are available are selected. Based on each user's APP information at different moments and the user's label information Y (such as whether the user has defaulted on credit), each user is expressed symbolically according to the APP list library as X: [U_1: {T0: [a_100, a_2, a_20, …], T1: [a_10], T2: [a_1, a_25], …}, U_2: {T0: [a_2, a_3, a_10, a_100, …], T1: [a_13], …}, …], with the corresponding labels Y: [1, 0, …].
The input data obtained by the above data preprocessing are as follows:
X: the list of APPs installed on the borrowing user's mobile phone, including each APP name and its installation time, as shown in the table.
[Table image not recovered: APP names and installation times]
Y: whether the user has defaulted on the platform, 0 or 1.
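The preprocessing described above can be sketched as follows. This is a minimal illustration under assumptions: the function names, the `min_users` threshold, the use of integer time indices (0 for T0, 1 for T1, and so on), and the sample APP names are all hypothetical and not taken from the patent.

```python
from collections import Counter


def build_app_vocab(user_snapshots, min_users):
    """Assign numbers a_1, a_2, ... to APPs installed by at least min_users users."""
    installs = Counter()
    for snapshots in user_snapshots.values():
        # Count each APP once per user, regardless of how many moments list it.
        seen = {app for apps in snapshots.values() for app in apps}
        installs.update(seen)
    kept = sorted(app for app, n in installs.items() if n >= min_users)
    return {app: f"a_{i}" for i, app in enumerate(kept, start=1)}


def encode_users(user_snapshots, vocab):
    """Replace APP names by their numbers, keeping snapshots in time order."""
    encoded = {}
    for user, snapshots in user_snapshots.items():
        encoded[user] = {
            t: [vocab[app] for app in snapshots[t] if app in vocab]
            for t in sorted(snapshots)
        }
    return encoded


# Hypothetical raw data: user -> {time index -> APP names seen at that time}.
raw = {
    "U_1": {0: ["music", "bank", "rare_game"], 1: ["jobs"]},
    "U_2": {0: ["music", "bank"], 1: ["jobs"]},
}
labels = {"U_1": 1, "U_2": 0}  # 1 = defaulted, 0 = did not default

vocab = build_app_vocab(raw, min_users=2)  # drops the low-frequency APP
X = encode_users(raw, vocab)
Y = [labels[user] for user in X]
```

Here `rare_game` is installed by only one user and is removed, so it receives no number and does not appear in `X`.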
In the embodiment of the invention, the core step is the network design of a neural network recognition model, and the detailed process is as follows:
1) Initializing network parameters: in fig. 5, each small rectangular box represents the semantic vector representation of one APP. At the initial time T0 the APP list contains relatively many APPs, and the order among the APPs at time T0 is ignored during training (no order relation is considered). In the network design, the vector dimension of each APP is set in advance to N (16 or 64). To accelerate model convergence, the parameters are initialized with Gaussian random noise. During model learning these initial parameters are updated, and a vector representation carrying semantic information is obtained for each APP;
2) The main network design process comprises:
a. as shown in fig. 5, the input of the main network includes the APP list at the initial time T0 and the APP lists at the subsequent times T1, T2, …, Tn. The embedding vectors of the APPs at earlier times are weighted by the APPs at each subsequent time to obtain weighted APP embedding vectors. Specifically, for the APP list at time T1 and all APPs in the list at time T0, a group of weights is generated by the sub-network or by another weight calculation method, T1 -> T0: [w11, w12, w13, …, w1N]; the original embedding vectors at time T0 are weighted with these weights, and the weighted vectors are aggregated (sum or average) to obtain an N-dimensional vector agg_emb_T0. In turn, for time T2, the APPs at times T0 and T1 are weighted using the APP newly added at time T2 to obtain another group of weights, T2 -> T0, T1: [w21, w22, w23, …, w2N, w2(N+1)] (note that the (N+1)-th weight corresponds to the additional APP from time T1); the original embedding vectors of these APPs are weighted with the new weights and aggregated to obtain a new N-dimensional vector agg_emb_T1. Following these steps, a sample with moments up to Tn finally yields n aggregate vectors, [agg_emb_T0, agg_emb_T1, …, agg_emb_T(n-1)]; these vectors are further aggregated and used as the input of the next-layer MLP network, and the probability that the user becomes overdue is finally obtained through the multilayer neural network;
b. according to given label information Y of a user, training is carried out by utilizing a multi-layer MLP network, as shown in an upper frame after weighting aggregation in fig. 5, the partial network is a typical MLP, a multi-layer perceptron algorithm is adopted, and the depth of the network and the number of hidden nodes can be set according to the size of data;
3) Design of the sub-network:
as shown in fig. 6, the purpose of the sub-network is to obtain weight information between the APPs at a later time and all APPs at earlier times; the original embedding vectors are then weighted with the obtained weight information w. The sub-network takes the original embedding vectors of two APPs as input, obtains an N-dimensional interaction vector by multiplying the two APP vectors element-wise, concatenates the two input vectors and the interaction vector into a 3N-dimensional input vector, and outputs a weight between [0, 1] through a multilayer MLP network.
4) Other hyper-parameters are set according to the size of the data set, and the model is trained and evaluated to obtain the final model parameters. The obtained APP vectors can be used as features for other machine learning algorithms.
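Steps 1) and 2)b above can be sketched as follows: Gaussian initialization of the APP embedding table, a small MLP that maps an aggregated user vector to a default probability, and a loss against the label Y. The standard deviations, the layer sizes, the ReLU and sigmoid activations, and the binary cross-entropy loss are assumptions made for illustration. The text specifies only Gaussian random initialization, a vector dimension N of 16 or 64, and a multilayer perceptron trained on Y.

```python
import math
import random


def init_embeddings(app_ids, dim, rng, std=0.01):
    """Give every APP a dim-sized vector drawn from a zero-mean Gaussian."""
    return {app: [rng.gauss(0.0, std) for _ in range(dim)] for app in app_ids}


def init_layer(in_dim, out_dim, rng, std=0.1):
    """One fully connected layer with Gaussian weights and zero biases."""
    return {
        "w": [[rng.gauss(0.0, std) for _ in range(in_dim)] for _ in range(out_dim)],
        "b": [0.0] * out_dim,
    }


def dense(layer, x):
    """Affine transform: one output per weight row."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(layer["w"], layer["b"])
    ]


def mlp_probability(layers, x):
    """ReLU on hidden layers, sigmoid on the single output unit."""
    for layer in layers[:-1]:
        x = [max(0.0, v) for v in dense(layer, x)]
    logit = dense(layers[-1], x)[0]
    return 1.0 / (1.0 + math.exp(-logit))


def binary_cross_entropy(p, y):
    """Loss between predicted probability p in (0, 1) and label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


rng = random.Random(42)
DIM = 16  # one of the dimensions mentioned in the text
embeddings = init_embeddings(["a_1", "a_2", "a_3"], DIM, rng)
layers = [init_layer(DIM, 8, rng), init_layer(8, 1, rng)]

user_vector = embeddings["a_1"]  # stand-in for the aggregated vector
p_default = mlp_probability(layers, user_vector)
loss = binary_cross_entropy(p_default, 1)
```

In a full training loop, the gradient of this loss would update both the MLP layers and the embedding table, which is how each APP would acquire a vector carrying semantic information; that loop is omitted here.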
The weight information output by the sub-network is dynamically adjusted according to the APP at the next moment, and is not a static weight coefficient. The interactive information between APP and APP is fully considered, and more various input information is provided for the main network. In order to depict the time sequence of the APP installed by the user, the weight information is automatically learned through a sub-network and is used for weighting the vector of the APP at the initial moment, and therefore semantic information is abundant when the vector representation of the APP is compared with that of the APP in the traditional method.
The user interest identification method provided by the embodiment of the invention can be realized by the main network alone; the sub-network is designed mainly to optimize the weight calculation sub-step that precedes the weighted aggregation. The inputs of both the main network and the sub-network are the embedding vectors of the APPs (parameters that the model needs to learn). The main network is a fully connected MLP network. Through a deep neural network model, it takes as input the APPs installed by the user together with their installation timing information, learns an n-dimensional vector representation for each APP, and outputs the probability that the user defaults. The sub-network, namely the weighting network, takes the embedding vectors of two APPs as input and calculates the weight between the APP at a later time of each sample and an APP at the initial time. The sub-network outputs a weight in [0, 1], and the embedding vectors of the APPs at the initial time are weighted by this weight, which gives the network a richer representation capability.
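The sub-network described above can be sketched as a small forward pass. This is an illustration under stated assumptions, not the patented implementation: the class name `WeightSubNetwork`, the single hidden layer, the ReLU activation, the random initialization, and the sigmoid output are all choices made here for the example, since the source specifies only a multilayer MLP whose output lies in [0, 1].

```python
import math
import random
from typing import List

Vector = List[float]


def pair_features(later: Vector, earlier: Vector) -> Vector:
    """Concatenate both embeddings with their element-wise product (3N values)."""
    interaction = [a * b for a, b in zip(later, earlier)]
    return list(later) + list(earlier) + interaction


class WeightSubNetwork:
    """One-hidden-layer MLP with a sigmoid output (an assumed architecture)."""

    def __init__(self, emb_dim: int, hidden: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        in_dim = 3 * emb_dim
        # Untrained parameters; in practice they are learned with the main network.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def __call__(self, later: Vector, earlier: Vector) -> float:
        x = pair_features(later, earlier)
        hidden = [
            max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU
            for row, b in zip(self.w1, self.b1)
        ]
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))  # sigmoid keeps the weight in (0, 1)
```

A sigmoid output is one simple way to keep the weight inside the stated range; any other bounded output layer would serve the same purpose.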
Through the calculation of the neural network recognition model, the probability that the user defaults and the low-dimensional vector representation of each APP are output.
Fig. 7 is a schematic structural diagram of a user interest recognition apparatus according to an embodiment of the present invention. FIG. 8 is a schematic diagram of a preferred embodiment of a user interest recognition apparatus.
As shown in fig. 7, the user interest recognition apparatus 1 according to the embodiment of the present invention includes a model recognition module 11, in which a neural network recognition model is stored in advance, and the model recognition module 11 is configured to:
input APP list data containing user APP timing information into the neural network recognition model, perform a calculation that includes weighted aggregation through the neural network recognition model, and output a user interest recognition result, wherein the neural network recognition model is obtained by training on APP list training data containing user APP timing information.
As shown in fig. 8, the user interest recognition apparatus 2 of the preferred embodiment includes a model training module 21 and a model recognition module 22, and the model training module 21 includes a vector generation sub-module 211, a weight calculation sub-module 212, a weighting aggregation sub-module 213, and an MLP sub-module 214.
The vector generation sub-module 211 inputs the APP list training data containing the APP timing information of the user into the preset neural network recognition model, and generates APP vector data.
The weight calculation sub-module 212 is configured to: before the weighted aggregation of the APP vector data, multiply two original embedding vectors in the APP vector data element-wise, pair by pair, to obtain an N-dimensional interaction vector; concatenate the two original embedding vectors and the N-dimensional interaction vector into a 3N-dimensional input vector; and calculate the respective APP weight data of a plurality of users, wherein N is an integer greater than 1.
The weighted aggregation sub-module 213 is configured to perform weighted aggregation on the APP vector data.
The MLP sub-module 214 continues training by using a preset multi-layer perceptron algorithm to obtain the neural network recognition model.
The weighted aggregation comprises the following steps: using the APP vector at each later time to weight the APP vectors at the preceding times in turn, which gives weights such as w21, w22, w23, ..., w2n, w2(n+1); aggregating, for each time, the weighted APP vectors of all preceding times to obtain an aggregate vector for that time; and summing or averaging the aggregate vectors over all times, wherein n is an integer greater than 1.
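The weighted aggregation can also be written compactly. The notation below is an assumption introduced for illustration and does not appear in the source: e_j is the original embedding vector of APP j, e_t is the embedding of the APP added at time T_t (assuming one new APP per time step), H_t is the set of APPs present before time T_t, f is the weight function (the sub-network or another weight calculation method), a_{t-1} corresponds to agg_emb_T(t-1), and z is the input of the MLP.

```latex
\begin{aligned}
w_{t,j} &= f(e_t, e_j) \in [0, 1], \qquad j \in H_t \\
a_{t-1} &= \sum_{j \in H_t} w_{t,j}\, e_j
  \quad \text{or} \quad
  a_{t-1} = \frac{1}{|H_t|} \sum_{j \in H_t} w_{t,j}\, e_j \\
z &= \sum_{t=1}^{n} a_{t-1}
  \quad \text{or} \quad
  z = \frac{1}{n} \sum_{t=1}^{n} a_{t-1}
\end{aligned}
```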
It should be noted that the user interest identification device provided in the foregoing embodiment is described, for the user interest identification service, only in terms of the above division of functional modules. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. In addition, the user interest identification device provided in the above embodiment and the user interest identification method embodiment belong to the same concept; the specific implementation process is described in the method embodiment and is not repeated here.
All the above-mentioned optional technical solutions can be combined arbitrarily to form the optional embodiments of the present invention, and are not described herein again.
In summary, the method and the device for identifying user interests provided by the embodiments of the present invention have the following beneficial effects, compared with the prior art:
1. The user's APP timing information is fully taken into account and integrated into the design of the deep network model. Calculation and identification are performed by a user interest identification model trained with the APP timing information, which yields semantically richer vectors and user interest recognition results and improves the accuracy of user interest identification;
2. The output of the model includes a specific recognition result, for example the probability that the user defaults, together with the embedding vector representation of each APP. These vectors are part of the model parameters and come as an additional benefit: the richer and more comprehensive semantic vectors can serve as automatically extracted features, or can be used directly as input data for other data processing (clustering, machine learning algorithms, etc.), which is convenient and practical.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, those skilled in the art may make additional variations and modifications to those embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Matters not described in detail in this specification are well known to those skilled in the art.

Claims (5)

1. A method for identifying user interests, the method comprising:
inputting APP list data containing user APP time sequence information into a neural network recognition model, performing calculation containing weighted aggregation through the neural network recognition model, and outputting a user interest recognition result, wherein the neural network recognition model is obtained by inputting APP list training data containing user APP time sequence information for training;
the neural network recognition model is obtained by training in the following way:
inputting APP list training data containing user APP time sequence information into a preset neural network recognition model to generate APP vector data;
performing weighted aggregation on the APP vector data;
training is continued by utilizing a preset multilayer perceptron algorithm to obtain the neural network recognition model;
before performing weighted aggregation on the APP vector data, calculating each APP weight of each user respectively, thereby obtaining respective APP weight data of a plurality of users;
calculating each APP weight of each user respectively, thereby obtaining respective APP weight data of a plurality of users, including:
two original embedding vectors in the APP vector data are subjected to point multiplication in pairs to obtain N-dimensional interaction vectors, then the two original embedding vectors and the N-dimensional interaction vectors are spliced into 3N-dimensional input vectors respectively, and respective APP weight data of a plurality of users are obtained through calculation, wherein N is an integer greater than 1.
2. The method of claim 1, further comprising:
acquiring APP list data, cleaning the APP list data according to the number of installation persons, removing low-frequency APPs in the APP list data, and then labeling all APPs to acquire the respective numbers of the APPs;
and acquiring corresponding APP information of a plurality of users at different moments, and marking each user and the corresponding APP information of the user at different moments respectively based on the respective numbers of the APPs to obtain APP list data containing the APP time sequence information of the users.
3. The method of claim 1, wherein the weighted aggregation comprises:
obtaining a group of weight information corresponding to each moment according to the respective APP weight data, weighting the original embedding vectors of all previous moments according to the group of weight information corresponding to each moment, aggregating the weighted vectors to obtain an aggregate vector of each moment, and then summing or averaging the aggregate vector of each moment.
4. A user interest recognition apparatus, configured to implement a user interest recognition method, the apparatus comprising a model recognition module, where the model recognition module is configured to:
inputting APP list data containing user APP time sequence information into a neural network recognition model, performing calculation containing weighted aggregation through the neural network recognition model, and outputting a user interest recognition result, wherein the neural network recognition model is obtained by inputting APP list training data containing user APP time sequence information for training;
the device further comprises a model training module, wherein the model training module comprises a vector generation submodule, a weighted aggregation submodule and an MLP submodule, and the vector generation submodule inputs APP list training data containing user APP time sequence information into a preset neural network recognition model to generate APP vector data; the weighted aggregation sub-module carries out weighted aggregation on the APP vector data; the MLP submodule continues to train by utilizing a preset multilayer perceptron algorithm to obtain the neural network recognition model;
the model training module further comprises a weight calculation sub-module for: before the APP vector data are subjected to weighted aggregation, two original embedding vectors in the APP vector data are subjected to pairwise point multiplication to obtain N-dimensional interaction vectors, the two original embedding vectors and the N-dimensional interaction vectors are spliced into 3N-dimensional input vectors, and respective APP weight data of a plurality of users are obtained through calculation, wherein N is an integer greater than 1.
5. The apparatus of claim 4, wherein the weighted aggregation comprises: obtaining a group of weight information corresponding to each moment according to the respective APP weight data, weighting the original embedding vectors of all previous moments according to the group of weight information corresponding to each moment, aggregating the weighted vectors to obtain an aggregate vector of each moment, and then summing or averaging the aggregate vector of each moment.
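The data preparation recited in claim 2 (removing low-frequency APPs, numbering the remaining APPs, and recording each user's APPs in time order) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `build_app_ids` and `encode_sequences`, the `min_users` threshold, and the log layout (user mapped to a list of (timestamp, APP name) pairs) are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Assumed layout: user -> list of (install timestamp, APP name)
InstallLog = Dict[str, List[Tuple[int, str]]]


def build_app_ids(logs: InstallLog, min_users: int) -> Dict[str, int]:
    """Drop APPs installed by fewer than min_users users, then number the rest."""
    installers: Counter = Counter()
    for events in logs.values():
        # Count each APP once per user.
        installers.update({app for _, app in events})
    kept = sorted(app for app, count in installers.items() if count >= min_users)
    return {app: idx for idx, app in enumerate(kept)}


def encode_sequences(logs: InstallLog, app_ids: Dict[str, int]) -> Dict[str, List[int]]:
    """Per user, list the APP numbers in install order; removed APPs are skipped."""
    return {
        user: [app_ids[app] for _, app in sorted(events) if app in app_ids]
        for user, events in logs.items()
    }
```

The resulting per-user sequences of APP numbers are the kind of APP list data with timing information that an embedding layer could consume.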
CN201910967146.6A 2019-10-11 2019-10-11 User interest identification method and device Active CN110765352B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910967146.6A CN110765352B (en) 2019-10-11 2019-10-11 User interest identification method and device

Publications (2)

Publication Number Publication Date
CN110765352A CN110765352A (en) 2020-02-07
CN110765352B true CN110765352B (en) 2022-11-11

Family

ID=69331716

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910967146.6A Active CN110765352B (en) 2019-10-11 2019-10-11 User interest identification method and device

Country Status (1)

Country Link
CN (1) CN110765352B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015439B (en) * 2020-09-21 2024-01-12 北京百度网讯科技有限公司 Embedding method, device, equipment and storage medium of user APP interest
CN115098789B (en) * 2022-08-05 2024-04-16 湖南工商大学 Multidimensional interest fusion recommendation method and device based on neural network and related equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105117440A (en) * 2015-08-11 2015-12-02 北京奇虎科技有限公司 Method and apparatus for determining to-be-recommended application (APP)
CN106503025A (en) * 2015-09-08 2017-03-15 北京搜狗科技发展有限公司 Method and system is recommended in a kind of application
CN108536572A (en) * 2018-04-16 2018-09-14 浙江大学 Smart mobile phone App based on AppUsage2Vec models uses prediction technique
CN109597844A (en) * 2019-01-31 2019-04-09 中科人工智能创新技术研究院(青岛)有限公司 Core customer's method for digging and system based on deep neural network Yu figure network

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160162972A1 (en) * 2014-12-05 2016-06-09 Google Inc. Systems and methods of curation and recommendation of applications and accessory devices to provide expanded user experience
CN107423442B (en) * 2017-08-07 2020-09-25 火烈鸟网络(广州)股份有限公司 Application recommendation method and system based on user portrait behavior analysis, storage medium and computer equipment
US20190129727A1 (en) * 2017-10-27 2019-05-02 Salesforce.Com, Inc. User pattern prediction using a neural network
CN107832426B (en) * 2017-11-13 2021-11-02 上海交通大学 APP recommendation method and system based on using sequence context
US11244326B2 (en) * 2018-03-06 2022-02-08 Verizon Media Inc. Analytical precursor mining for personalized recommendation
CN110309427B (en) * 2018-05-31 2023-03-10 腾讯科技(深圳)有限公司 Object recommendation method and device and storage medium
CN109104620B (en) * 2018-07-26 2020-05-19 腾讯科技(深圳)有限公司 Short video recommendation method and device and readable medium
CN109359140B (en) * 2018-11-30 2022-09-09 苏州大学 Sequence recommendation method and device based on adaptive attention

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Automated Interestingness Calculator for mobile app recommendation; V. S. Gupta & S. Kohli; 2015 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions); 20151217; full text *
APP installation recommendation and uninstallation prediction based on time series mining; 许冶金; China Master's Theses Full-text Database, Information Science and Technology; 20190415; full text *

Also Published As

Publication number Publication date
CN110765352A (en) 2020-02-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant