CN111291264B - Access object prediction method and device based on machine learning and computer equipment - Google Patents

Publication number
CN111291264B
CN111291264B CN202010076468.4A CN202010076468A CN111291264B CN 111291264 B CN111291264 B CN 111291264B CN 202010076468 A CN202010076468 A CN 202010076468A CN 111291264 B CN111291264 B CN 111291264B
Authority
CN
China
Prior art keywords
user
data
access object
target
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010076468.4A
Other languages
Chinese (zh)
Other versions
CN111291264A (en)
Inventor
田帅
鲁梦平
吴汉杰
戴云峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010076468.4A
Publication of CN111291264A
Application granted
Publication of CN111291264B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/55 Push-based network services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to a machine learning-based access object prediction method, apparatus, and computer device. The method comprises: acquiring user portrait data and user behavior data corresponding to a user identifier; acquiring access object data, sampling the access object data according to the user portrait data and the user behavior data, and generating a candidate access object set from the sampled access object data; inputting the user portrait data and the candidate access object set into a prediction model, and performing feature extraction on them to obtain a target user feature representation and a target object feature representation; calculating, according to the target user feature representation and the target object feature representation, a predicted value of the user identifier for each access object; and extracting target access object data that satisfies a condition threshold according to the predicted values, and pushing the target access object data to the corresponding user terminal. The scheme can effectively improve the accuracy of predicting a user's degree of interest in an access object.

Description

Access object prediction method and device based on machine learning and computer equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and apparatus for predicting an access object based on machine learning, and a computer device.
Background
With the rapid development of internet technology, many personalized information push applications have emerged that push information matching users' interests, for example in store recommendation scenarios. Existing push approaches generally extract user features and features of the objects to be pushed, apply the same feature extraction to all samples, and obtain matching target objects by calculating the similarity between the user features and the object features; the matching objects are then pushed to users. However, such approaches cannot deeply mine a user's personalized interests, so the user's interests are recognized poorly and target information is pushed with low accuracy.
Disclosure of Invention
In view of the above technical problem that target information is pushed with low accuracy, it is necessary to provide a machine learning-based access object prediction method, apparatus, and computer device.
An access object prediction method based on machine learning, comprising:
acquiring user portrait data and user behavior data corresponding to the user identifier;
acquiring access object data, sampling the access object data according to the user portrait data and the user behavior data, and generating a candidate access object set from the sampled access object data;
inputting the user portrait data and the candidate access object set into a trained prediction model, and extracting features of the user portrait data and the candidate access object set to obtain a target user feature representation and a target object feature representation;
calculating, according to the target user feature representation and the target object feature representation, a predicted value of the user identifier for each access object;
and extracting target access object data meeting a condition threshold according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier.
An access object prediction apparatus based on machine learning, the apparatus comprising:
the data acquisition module is used for acquiring user portrait data and user behavior data corresponding to the user identifier; acquiring access object data;
the data sampling module is used for sampling the access object data according to the user portrait data and the user behavior data, and generating a candidate access object set by using the access object data obtained by sampling;
the data prediction module is used for inputting the user portrait data and the candidate access object set into a prediction model, and extracting features of the user portrait data and the candidate access object set to obtain a target user feature representation and a target object feature representation; and for calculating, according to the target user feature representation and the target object feature representation, a predicted value of the user identifier for each access object;
and the data pushing module is used for extracting target access object data meeting a condition threshold according to the predicted value and pushing the target access object data to the user terminal corresponding to the user identifier.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above-described machine learning-based access object prediction method.
A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the machine learning based access object prediction method described above.
According to the above machine learning-based access object prediction method, apparatus, computer-readable storage medium, and computer device, the user portrait data and the user behavior data corresponding to the user identifier are acquired; after the access object data is acquired, it is sampled according to the user portrait data and the user behavior data, and a candidate access object set is generated from the sampled access object data. Because the access object data corresponding to positive samples is obtained first and the access object data corresponding to negative samples is then obtained by sampling, the candidate access object data can be sampled effectively. The user portrait data and the candidate access object set are input into the prediction model, which performs feature extraction on them to obtain the target user feature representation and the target object feature representation, so that the predicted value of the user identifier for each access object can be calculated accurately and effectively from these representations. Because the prediction model is trained in advance using a mutual attention mechanism, it can accurately and effectively predict the user's degree of interest in each access object.
Target access object data that satisfies the condition threshold is then extracted according to the predicted values and pushed to the user terminal corresponding to the user identifier. This effectively improves the accuracy of predicting the user's degree of interest in access objects, and thereby the efficiency and accuracy of pushing access object data.
Drawings
FIG. 1 is an application environment diagram of a machine learning based access object prediction method in one embodiment;
FIG. 2 is a flow diagram of a method of machine learning based access object prediction in one embodiment;
FIG. 3 is a flow chart of a method of machine learning based access object prediction in another embodiment;
FIG. 4 is a flow diagram of a method of machine learning based access object prediction in yet another embodiment;
FIG. 5 is a flow chart of a method of machine learning based access object prediction in yet another embodiment;
FIG. 6 is a block diagram of an access object prediction apparatus based on machine learning in one embodiment;
FIG. 7 is a block diagram of an access object prediction apparatus based on machine learning in another embodiment;
FIG. 8 is a block diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
FIG. 1 is an application environment diagram of a machine learning based access object prediction method in one embodiment. For example, referring to fig. 1, the access object prediction method based on machine learning is applied to a data push system. The data push system includes a terminal 102 and a server 104. The terminal 102 and the server 104 are connected through a network. The terminal 102 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
As shown in fig. 2, in one embodiment, a method of access object prediction based on machine learning is provided. The present embodiment is mainly exemplified by the application of the method to the server 104 in fig. 1. Referring to fig. 2, the access object prediction method based on machine learning specifically includes the following steps:
Step S202, acquiring user portrait data and user behavior data corresponding to the user identifier.
The user portrait data is labeled user data constructed from information such as user characteristics, service scenarios, and user behaviors; that is, labeled information about a typical user. For example, the user portrait data may include user information such as gender, age, location, and interests. The user behavior data refers to historical behavior events triggered by the user, such as search records and web browsing records, including when, on which platform, and under which ID a search was made and what was searched, as well as information about each click and each page view. The user behavior data may also include network connection behavior data, which contains, for example, the user identifier and network attribute information.
Specifically, the server may acquire, according to the user identifier, the user portrait data of the user and the user behavior data within a preset time period; this data can be used, for example, to analyze the user's degree of interest in access objects.
Step S204, acquiring access object data, sampling the access object data according to the user portrait data and the user behavior data, and generating a candidate access object set from the sampled access object data.
The access object refers to a target object that users can access, for example a physical store, an online merchant, an official account, or another entity accessible to multiple users. The access object data includes attribute information of the access object, such as the access object identifier, access object category, access object address, score, average spend per person, attribute range, and the like.
After the server acquires user portrait data and user behavior data of the user, a plurality of access object data are acquired, wherein the plurality of access object data is two or more.
In one embodiment, the server may further obtain a plurality of access object data according to the user portrait data, for example, may obtain a plurality of access object data of a city in which the user is located according to a residence of the user.
The server then samples the access object data according to the user portrait data and the user behavior data. Specifically, the server may identify, based on the user portrait data and the user behavior data, the access object data that has an association with the user, determine that associated access object data as positive sample data, determine the remaining access object data as negative sample data, and generate a candidate access object set from the sampled positive sample data and negative sample data.
For example, the server may identify, based on the network connection data in the user behavior data, the access object data for which a connection record with the user exists, which indicates that the user may have visited the access object. The server then determines the access object data with such a connection record as positive sample data. Because the server first obtains the access object data corresponding to the positive samples and then obtains the access object data corresponding to the negative samples by sampling, the candidate access object data can be sampled effectively.
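The sampling in step S204 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name build_candidate_set, the uniform random draw of negatives, and the num_negatives parameter are all assumptions introduced here, since the text only states that negatives are obtained by sampling from the remaining access object data.

```python
import random

def build_candidate_set(object_ids, connected_ids, num_negatives, seed=0):
    """Label objects with a connection record as positives (1) and
    sample negatives (0) from the remaining objects."""
    positives = [o for o in object_ids if o in connected_ids]
    remaining = [o for o in object_ids if o not in connected_ids]
    rng = random.Random(seed)  # fixed seed only to keep the sketch reproducible
    # Assumption: negatives are drawn uniformly at random from the remainder.
    negatives = rng.sample(remaining, min(num_negatives, len(remaining)))
    return [(o, 1) for o in positives] + [(o, 0) for o in negatives]

object_ids = ["shop_a", "shop_b", "shop_c", "shop_d", "shop_e"]
connected_ids = {"shop_b", "shop_d"}  # objects with a connection record for this user
candidates = build_candidate_set(object_ids, connected_ids, num_negatives=2)
```

Each entry of candidates pairs an access object identifier with a 1/0 label, which is the form a downstream model would typically consume.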
Step S206, inputting the user portrait data and the candidate access object set into a trained prediction model, and extracting features of the user portrait data and the candidate access object set to obtain a target user feature representation and a target object feature representation.
The prediction model may be a machine learning-based neural network model, and in particular a neural network model based on an attention mechanism. The server may train a preset machine learning model in advance on a large amount of training user sample data and training access object data to obtain the trained prediction model, which then has prediction capability. The target user feature representation and the target object feature representation are the final user feature representation and access object feature representation that the prediction model learns through a mutual attention mechanism.
The server samples the access object data to obtain a candidate access object set, and then inputs the user portrait data and the access object data in the candidate access object set into a trained prediction model. The prediction model may first perform preprocessing and normalization processing on the user portrait data and the access object data through the data preprocessing layer, for example, perform data preprocessing such as data cleaning, feature conversion, and vector extraction on the user portrait data and the access object data.
The server then performs feature extraction on the user portrait data and the candidate access object set through the prediction model. Specifically, the server may extract multiple kinds of features from the user portrait data and from the access objects respectively; for example, these may include continuous features and discrete features. The server then concatenates the features of the user portrait data and, separately, the features of the access object, and extracts a user vector and an access object vector respectively.
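One simple way to picture the extraction and concatenation of continuous and discrete features is the sketch below. The choice of min-max scaling for continuous features and one-hot encoding for discrete features is an assumption made for illustration; the text does not say which encodings the model uses, and encode_features and its parameters are hypothetical names.

```python
def encode_features(record, continuous_ranges, discrete_vocab):
    """Concatenate scaled continuous features and one-hot discrete features."""
    vector = []
    for name, (low, high) in continuous_ranges.items():
        # Assumption: min-max scaling into [0, 1] for continuous features.
        vector.append((record[name] - low) / (high - low))
    for name, values in discrete_vocab.items():
        # Assumption: one-hot encoding over a fixed vocabulary for discrete features.
        vector.extend(1.0 if record[name] == v else 0.0 for v in values)
    return vector

user = {"age": 30, "gender": "f", "city": "shenzhen"}
user_vector = encode_features(
    user,
    continuous_ranges={"age": (0, 100)},
    discrete_vocab={"gender": ["f", "m"], "city": ["beijing", "shenzhen"]},
)
```

The same helper could be applied to an access object record (for example score and category) to obtain the access object vector.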
The server then performs mutual attention analysis on the user vector and the access object vector through the mutual attention mechanism in the prediction model, obtaining a user attention weight and an access object attention weight. The prediction model performs mutual attention learning on the user vector, the access object vector, the user attention weight, and the access object attention weight to obtain the access object features that the user attends to and the user features that the access object attends to, and outputs the final target user feature representation and target object feature representation.
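The passage above does not give formulas for the mutual attention mechanism, so the following is only a sketch of one common co-attention pattern, under stated assumptions: the user and the access object are each represented as a list of per-field embedding vectors (hypothetical inputs), affinity is a dot product, and attention scores are mean-pooled and normalized with a softmax. A trained model would also use learned parameters, which are omitted here.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]

def mutual_attention(user_fields, object_fields):
    """Return (target_user, target_object, user_weights, object_weights)."""
    # Affinity between every user field and every object field (assumed dot product).
    affinity = [[dot(u, v) for v in object_fields] for u in user_fields]
    # How strongly each user field matches the object, averaged over object fields.
    user_scores = [sum(row) / len(row) for row in affinity]
    # How strongly each object field matches the user, averaged over user fields.
    object_scores = [
        sum(affinity[i][j] for i in range(len(user_fields))) / len(user_fields)
        for j in range(len(object_fields))
    ]
    user_weights = softmax(user_scores)
    object_weights = softmax(object_scores)
    return (weighted_sum(user_weights, user_fields),
            weighted_sum(object_weights, object_fields),
            user_weights, object_weights)

user_fields = [[1.0, 0.0], [0.0, 1.0]]
object_fields = [[2.0, 0.0], [1.0, 0.0], [0.0, 0.5]]
target_user, target_object, user_w, object_w = mutual_attention(user_fields, object_fields)
```

In this toy input the first user field aligns most strongly with the object fields, so it receives the larger user attention weight, and the first object field receives the largest object attention weight.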
Step S208, calculating, according to the target user feature representation and the target object feature representation, a predicted value of the user identifier for each access object.
The predicted value may be a predicted probability that the user is interested in the access object, that is, the user's degree of interest in that access object.
After the server obtains the target user feature representation and the target object feature representation by performing mutual attention analysis on the user vector and the access object vector through the prediction model, it calculates the predicted value of the user identifier for each access object from these two representations.
Specifically, the server may calculate the similarity between the target user feature representation and the target object feature representation using a loss function preset in the prediction model, and then calculate the predicted value of the user identifier for each access object according to the similarity.
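As a rough illustration of turning a similarity into a probability-like predicted value, the sketch below uses cosine similarity followed by a sigmoid. Both choices, and the scale parameter, are assumptions made here for illustration; the text only says that a similarity is computed with a function preset in the model and that the predicted value is derived from it.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    numerator = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return numerator / norm if norm else 0.0

def predicted_value(user_repr, object_repr, scale=1.0):
    """Map the similarity to (0, 1) with a sigmoid; scale is a hypothetical knob."""
    similarity = cosine_similarity(user_repr, object_repr)
    return 1.0 / (1.0 + math.exp(-scale * similarity))

aligned = predicted_value([1.0, 0.0], [1.0, 0.0])
orthogonal = predicted_value([1.0, 0.0], [0.0, 1.0])
```

Representations that point in the same direction yield a higher predicted value than orthogonal ones, which map to 0.5 under this particular choice.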
Step S210, extracting target access object data that satisfies the condition threshold according to the predicted values, and pushing the target access object data to the user terminal corresponding to the user identifier.
After the server calculates, through the prediction model, the predicted value of the user identifier for each access object, it extracts the target access object data that satisfies the condition threshold according to the predicted values and pushes that data to the user terminal corresponding to the user identifier. Specifically, the server extracts the access object identifiers that satisfy the condition threshold according to the predicted values, acquires the target access object data corresponding to those identifiers, sorts the extracted target access object data by predicted value, and pushes the target access object data to the user terminal according to the sorting result. This effectively improves the accuracy of predicting the user's degree of interest in access objects, and thereby the efficiency and accuracy of pushing access object data.
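The threshold filtering and sorting in step S210 can be sketched as below. The function name select_targets, the optional top_k cap, and the use of "greater than or equal to" as the threshold condition are illustrative assumptions; the actual push to the terminal is outside the scope of the sketch.

```python
def select_targets(predictions, threshold, top_k=None):
    """Keep objects whose predicted value meets the threshold, highest first."""
    kept = [(obj, p) for obj, p in predictions.items() if p >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)  # rank by predicted value
    return kept[:top_k] if top_k is not None else kept

predictions = {"shop_a": 0.91, "shop_b": 0.42, "shop_c": 0.77}
push_list = select_targets(predictions, threshold=0.5)
```

Here push_list holds the access object identifiers in the order they would be pushed, with shop_b dropped because it falls below the threshold.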
In the above machine learning-based access object prediction method, the server acquires the user portrait data and the user behavior data corresponding to the user identifier; after acquiring the access object data, it samples that data according to the user portrait data and the user behavior data and generates a candidate access object set from the sampled access object data. Because the access object data corresponding to positive samples is obtained first and the access object data corresponding to negative samples is then obtained by sampling, the candidate access object data can be sampled effectively. The user portrait data and the candidate access object set are input into the prediction model, which performs feature extraction on them to obtain the target user feature representation and the target object feature representation, so that the predicted value of the user identifier for each access object can be calculated accurately and effectively from these representations. Because the prediction model is trained in advance using a mutual attention mechanism, it can accurately and effectively predict the user's degree of interest in each access object. Target access object data that satisfies the condition threshold is then extracted according to the predicted values and pushed to the user terminal corresponding to the user identifier, which effectively improves the accuracy of predicting the user's degree of interest in access objects, and thereby the efficiency and accuracy of pushing access object data.
In one embodiment, as shown in FIG. 3, a machine learning-based access object prediction method is provided, which includes the following steps:
Step S302, acquiring user portrait data and user behavior data corresponding to the user identifier.
Step S304, acquiring access object data, and performing association screening on the user portrait data and the access object data according to the user behavior data to obtain user access object interaction data.
Step S306, generating positive sample data and negative sample data according to the user access object interaction data.
Step S308, generating a candidate access object set using the positive sample data and the negative sample data.
Step S310, inputting the user portrait data and the candidate access object set into a trained prediction model, and extracting features of the user portrait data and the candidate access object set to obtain a target user feature representation and a target object feature representation.
Step S312, calculating, according to the target user feature representation and the target object feature representation, a predicted value of the user identifier for each access object.
Step S314, extracting target access object data that satisfies the condition threshold according to the predicted values, and pushing the target access object data to the user terminal corresponding to the user identifier.
Positive sample data refers to sample data that belongs to a given category, and negative sample data refers to sample data that does not. For example, access object data that has an association with the user is positive sample data, and access object data that has no association with the user is negative sample data. As another example, the server may estimate the feature distribution of all sample data and, when a new sample appears, use that distribution to estimate the probability of the sample occurring; if the probability is too small, the sample is regarded as a negative sample.
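The distribution-based example can be illustrated with a small density check. This sketch assumes an independent Gaussian per feature and a hand-picked log-density threshold; neither the distribution family nor the threshold is specified in the text, and fit_gaussian, log_density, and is_negative are hypothetical helper names.

```python
import math

def fit_gaussian(samples):
    """Estimate a per-feature mean and variance from the existing samples."""
    n, dim = len(samples), len(samples[0])
    means = [sum(s[k] for s in samples) / n for k in range(dim)]
    variances = [
        max(sum((s[k] - means[k]) ** 2 for s in samples) / n, 1e-6)  # avoid zero variance
        for k in range(dim)
    ]
    return means, variances

def log_density(x, means, variances):
    """Log probability density of x under independent Gaussians."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
        for xi, mu, var in zip(x, means, variances)
    )

def is_negative(x, means, variances, log_threshold):
    """Treat a new sample as negative when its density is below the threshold."""
    return log_density(x, means, variances) < log_threshold

samples = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1]]
means, variances = fit_gaussian(samples)
far_sample_is_negative = is_negative([5.0, 9.0], means, variances, log_threshold=-10.0)
near_sample_is_negative = is_negative([1.0, 2.0], means, variances, log_threshold=-10.0)
```

A sample far from the observed feature distribution falls below the threshold and is flagged as negative, while a sample close to the existing data is not.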
After acquiring the user portrait data and the user behavior data of the user, the server acquires a plurality of access object data and then performs association screening on the access object data according to the user portrait data and the user behavior data. Specifically, the server may identify the access object data that has an association with the user; for example, it may extract, from record information in the user behavior data such as WiFi network connection records and transaction records, the access object data that has an association record with the user. When an association record exists between the user identifier and access object data, that access object data may be determined to be associated with the user. The server then determines the access object data associated with the user as user access object interaction data, which indicates that the user has an association relationship with the access object.
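The association screening described above can be sketched as a join between behavior records and access objects. The record layouts (user_id, wifi_id, object_id fields) and the function name build_interactions are hypothetical; the text only names WiFi connection records and transaction records as examples of association records.

```python
def build_interactions(user_id, objects, wifi_records, transaction_records):
    """Map each access object associated with the user to the record types that link them."""
    known_ids = {o["object_id"] for o in objects}
    wifi_to_object = {o["wifi_id"]: o["object_id"] for o in objects}
    interactions = {}
    for rec in wifi_records:
        object_id = wifi_to_object.get(rec["wifi_id"])
        if rec["user_id"] == user_id and object_id is not None:
            interactions.setdefault(object_id, set()).add("wifi")
    for rec in transaction_records:
        if rec["user_id"] == user_id and rec["object_id"] in known_ids:
            interactions.setdefault(rec["object_id"], set()).add("transaction")
    return interactions

objects = [
    {"object_id": "shop_a", "wifi_id": "ap1"},
    {"object_id": "shop_b", "wifi_id": "ap2"},
    {"object_id": "shop_c", "wifi_id": "ap3"},
]
wifi_records = [{"user_id": "u1", "wifi_id": "ap1"}, {"user_id": "u2", "wifi_id": "ap2"}]
transaction_records = [
    {"user_id": "u1", "object_id": "shop_c"},
    {"user_id": "u1", "object_id": "shop_a"},
]
interactions = build_interactions("u1", objects, wifi_records, transaction_records)
```

The keys of interactions are the access objects that would be treated as user access object interaction data for user u1.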
The server determines the user access object interaction data as positive sample data, determines the remaining access object data as negative sample data, and generates a candidate access object set from the sampled positive sample data and negative sample data. For example, the server may identify, based on the network connection data in the user behavior data, the access object data for which a connection record with the user exists, which indicates that the user may have visited the access object, and determine that access object data as positive sample data.
In one embodiment, the server may further determine access object categories that are negatively associated with the user according to the user portrait data and the user behavior data; for example, a negatively associated category may be a category of access objects that the user is not interested in or never visits. The server may reject the access object data belonging to such categories and then determine the remaining access object data, other than the positive sample data, as negative sample data, so that the acquired plurality of access object data is sampled and screened effectively.
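A minimal sketch of this category-based rejection follows. The field names and the function name negatives_after_rejection are hypothetical, and how the negatively associated categories are derived from the portrait and behavior data is not shown.

```python
def negatives_after_rejection(objects, positive_ids, rejected_categories):
    """Drop objects in negatively associated categories, then treat the
    remaining non-positive objects as negative samples."""
    return [
        o["object_id"]
        for o in objects
        if o["category"] not in rejected_categories and o["object_id"] not in positive_ids
    ]

objects = [
    {"object_id": "shop_a", "category": "coffee"},
    {"object_id": "shop_b", "category": "bar"},
    {"object_id": "shop_c", "category": "bookstore"},
    {"object_id": "shop_d", "category": "bar"},
]
negatives = negatives_after_rejection(objects, positive_ids={"shop_a"}, rejected_categories={"bar"})
```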
Because the server first obtains the access object data corresponding to the positive samples and then obtains the access object data corresponding to the negative samples by sampling, the candidate access object data can be sampled effectively. The user portrait data and the candidate access object set are input into the prediction model, which performs feature extraction on them to obtain the target user feature representation and the target object feature representation. Because the prediction model is trained in advance using a mutual attention mechanism, it can accurately and effectively predict the user's degree of interest in each access object. Target access object data that satisfies the condition threshold is extracted according to the predicted values and pushed to the user terminal corresponding to the user identifier, which effectively improves the accuracy of predicting the user's degree of interest in access objects.
In one embodiment, the user behavior data includes network connection data, and generating positive sample data and negative sample data according to the user access object interaction data includes: identifying, from the network connection data, the access object data that has a connection record with the user identifier; associating the access object data that has a connection record with the user identifier to generate the user access object interaction data; determining the user access object interaction data as positive sample data; and generating negative sample data from the unassociated access object data among the access object data.
The user behavior data includes network connection data, where the network connection data may include lan connection data and network access data corresponding to the user identifier, for example, the lan connection data may be WiFi signal data connected or scanned by the terminal of the user.
The server acquires user portrait data and user behavior data of the user, and identifies access object data with a connection record with the user identifier according to network connection data in the user behavior data after acquiring a plurality of access object data. And the server associates the access object data with the connection record with the user identifier to generate user access object interaction data, and the server determines the user access object interaction data as positive sample data.
The server further generates negative sample data by using unassociated access object data in the access object data, and specifically, the server may further acquire unassociated access object data according to a location area where the access object in the positive sample data is located, and generate negative sample data by using the acquired access object data. Therefore, the acquired multiple access object data can be effectively sampled and screened.
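The positive and negative sample generation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this embodiment: the function name `build_samples`, the record layout (user identifier, access object identifier) and the use of an area label to restrict negatives to the location areas of the positives are all hypothetical.

```python
def build_samples(user_id, connection_records, access_objects):
    """Split access objects into positive and negative samples for one user.

    connection_records: iterable of (user_id, object_id) pairs (assumed layout).
    access_objects: dict mapping object_id -> location area label (assumed layout).
    """
    # Positive samples: access objects that have a connection record with the user.
    connected = {
        obj_id
        for uid, obj_id in connection_records
        if uid == user_id and obj_id in access_objects
    }
    positives = sorted(connected)

    # Negative samples: unassociated access objects located in the same
    # areas as the positive samples.
    positive_areas = {access_objects[obj_id] for obj_id in positives}
    negatives = sorted(
        obj_id
        for obj_id, area in access_objects.items()
        if obj_id not in connected and area in positive_areas
    )
    return positives, negatives


records = [("u1", "s1"), ("u1", "s1"), ("u2", "s3")]
objects = {"s1": "mall_a", "s2": "mall_a", "s3": "mall_b"}
positives, negatives = build_samples("u1", records, objects)
```

Here the store `s3` is excluded from the negatives of user `u1` because it lies in an area where the user has no positive sample.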
In one embodiment, generating the candidate access object set using the positive sample data and the negative sample data includes: extracting the user association feature degrees of the positive sample data and the negative sample data according to the connection records in the network connection data; computing a sampling distribution over the access object data in the candidate access object set according to the user association feature degrees, to obtain the sampling probability of each access object data; and extracting the access object data whose sampling probability meets a preset condition, and generating the candidate access object set from the extracted access object data.
The user association feature degree indicates the degree of association between the user identifier and an access object. For example, if the user has accessed an access object many times, the user association feature degree between the user and that access object is high.
After acquiring the user portrait data and user behavior data of the user and a plurality of access object data, the server identifies, according to the network connection data in the user behavior data, the access object data that has a connection record with the user identifier, and associates it with the user identifier to generate user access object interaction data. After the positive sample data and negative sample data are determined from the user access object interaction data, the server extracts the user association feature degrees of the positive sample data and the negative sample data according to the connection records in the network connection data.
Specifically, the server calculates the access frequency of the user according to the connection records in the network connection data, and determines the user association feature degree of each access object data according to the access frequency. The server sorts the access object data by access frequency, and computes a sampling distribution over the access object data in the candidate access object set according to the user association feature degrees, to obtain the sampling probability of each access object data.
For example, the server may calculate the sampling distribution using the following formula:
Figure GDA0004168419050000101
where $S_i$ denotes an access object, $p(S_i)$ denotes the sampling probability of that access object, $k$ is the sequence index of the access object within the access object set (for example, when the access object is a store, its rank by the frequency with which the user passes the store), and $\Gamma$ is the number of access objects in the candidate access object set.
The server further extracts the access object data whose sampling probability meets the preset condition and generates the candidate access object set from the extracted access object data. The acquired access object data is thereby effectively sampled and the negative samples are obtained, producing an effective candidate access object set on which the server can perform interestingness prediction for the access objects.
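The rank-based sampling step can be illustrated with a short sketch. The formula image is not reproduced in this text, so the distribution below is an assumption: a log-uniform (Zipf-like) distribution over ranks, which is one common choice for frequency-ranked sampling and is normalized by $\log(\Gamma + 1)$. The function names and the probability threshold used as the "preset condition" are likewise hypothetical.

```python
import math


def rank_sampling_probabilities(visit_counts):
    """Assign a sampling probability to each access object from its frequency rank.

    Assumed distribution (not taken from the source formula image):
        p(S_i) = (log(k + 2) - log(k + 1)) / log(gamma + 1)
    where k is the 0-based rank by descending access frequency and gamma is
    the number of access objects.
    """
    ranked = sorted(visit_counts, key=lambda obj: (-visit_counts[obj], obj))
    gamma = len(ranked)
    norm = math.log(gamma + 1)
    return {
        obj: (math.log(k + 2) - math.log(k + 1)) / norm
        for k, obj in enumerate(ranked)
    }


def select_candidates(probabilities, min_probability):
    """Keep access objects whose sampling probability meets the preset condition."""
    kept = [obj for obj, p in probabilities.items() if p >= min_probability]
    return sorted(kept, key=lambda obj: -probabilities[obj])


probs = rank_sampling_probabilities({"a": 5, "b": 2, "c": 1})
candidates = select_candidates(probs, 0.25)
```

Under this assumed distribution the probabilities telescope to a sum of 1, and more frequently accessed objects receive a higher sampling probability.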
In one embodiment, as shown in Fig. 4, there is provided a machine learning-based access object prediction method, which specifically includes the following steps:
Step S402, user portrait data and user behavior data corresponding to the user identification are obtained.
Step S404, access object data is obtained, the access object data is sampled according to the user portrait data and the user behavior data, and a candidate access object set is generated by using the access object data obtained by sampling.
Step S406, inputting the user portrait data and the candidate access object set into a trained prediction model, and respectively extracting the user continuous type feature and the user discrete type feature corresponding to the user portrait data and the access object continuous type feature and the access object discrete type feature corresponding to the access object data through the prediction model.
Step S408, inputting the user continuous type feature and the access object continuous type feature into a feedforward neural network layer for feature mapping to obtain a continuous feature matrix.
Step S410, inputting the user discrete type feature and the access object discrete type feature into the embedding layer for feature mapping to obtain a discrete feature matrix.
Step S412, performing feature connection processing on the continuous feature matrix and the discrete feature matrix to obtain a user vector and an access object vector, and obtaining a target user feature representation and a target object feature representation by using the user vector and the access object vector.
Step S414, calculating the predicted value of each access object corresponding to the user identifier according to the target user characteristic representation and the target object characteristic representation.
Step S416, extracting target access object data meeting the condition threshold according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier.
A continuous feature is a variable that can take any value within a given interval, whereas a discrete feature is a variable whose values can be enumerated in some order and are usually integers. For example, if the user portrait data of a user is gender: female, age: 20, and education: bachelor's degree, then by feature type the continuous feature is the age of 20, and the discrete features are the gender (female), the education level (bachelor's degree), and so on.
The server acquires user portrait data and user behavior data corresponding to the user identification, acquires access object data, samples the access object data according to the user portrait data and the user behavior data, and generates a candidate access object set by using the access object data obtained through sampling. The server inputs the user portrait data and the candidate access object set into a prediction model to extract the characteristics of the user portrait data and the candidate access object set.
Specifically, the prediction model may include a fully connected layer and an embedding layer. Through the fully connected layer of the prediction model, the server extracts the user continuous features and user discrete features corresponding to the user portrait data, and the access object continuous features and access object discrete features corresponding to the access object data.
For the continuous features, the server may input the user continuous features and the access object continuous features into a feedforward neural network in the prediction model for nonlinear mapping, thereby obtaining a continuous feature matrix. For the discrete features, the server may input the user discrete features and the access object discrete features into the embedding layer in the prediction model for feature mapping, which maps the discrete features into dense vectors, thereby obtaining a discrete feature matrix.
The server further performs feature connection processing on the continuous feature matrix and the discrete feature matrix. After the prediction model connects the continuous features and the discrete features, it applies a nonlinear mapping to the connected features of the user and of the access object respectively, thereby obtaining a user vector and an access object vector.
For example, the prediction model may perform the nonlinear mapping through the feedforward neural network according to the following formulas:
Figure GDA0004168419050000111
Figure GDA0004168419050000112
where $u$ is the user portrait data, $s$ is the access object data, $c$ denotes continuous data, $W_u$ and $W_s$ are parameter matrices, $b_u$ and $b_s$ are the corresponding biases, and $f(\cdot)$ is the activation function, which may be, for example, the sigmoid function.
Further, let the symbol shown in Figure GDA0004168419050000121 denote the i-th discrete feature of the user, and let the matrix shown in Figure GDA0004168419050000122 be the embedding matrix of the i-th discrete feature, where $m$ is the dimension of the embedding vector, $D$ is the size of the set of values of the discrete feature, and $v$ is the index of a given value of the discrete feature within that set. The embedding layer may be expressed mathematically as follows:
Figure GDA0004168419050000123
where the symbol shown in Figure GDA0004168419050000124 is the embedding vector of the i-th discrete feature of the user, and $W[v]$ denotes column $v$ of the matrix $W$. All discrete features are connected to form the feature vector of the user's discrete features, shown in Figure GDA0004168419050000125. The feature vector of the store's discrete features, shown in Figure GDA0004168419050000126, is obtained in the same way.
After connecting the discrete features and the continuous features according to the above formulas, the server applies a nonlinear mapping to each to obtain the user vector $h_u$ and the access object vector $h_s$. The corresponding mathematical representations may be as follows:
Figure GDA0004168419050000127
Figure GDA0004168419050000128
where the symbols shown in Figure GDA0004168419050000129 and Figure GDA00041684190500001210 are parameter matrices, and the symbols shown in Figure GDA00041684190500001211 and Figure GDA00041684190500001212 are the corresponding biases. Through the prediction model, the server performs continuous feature extraction, discrete feature extraction and feature connection on the user portrait data and the access object data respectively, so that the user vector and the access object vector are effectively extracted. The target user feature representation and the target object feature representation are then obtained, and the predicted value of each access object for the user identifier can be calculated accurately and effectively from them. Because the prediction model is trained in advance with a mutual attention mechanism, it can accurately and effectively predict the interestingness value of the user for each access object. The server extracts the target access object data meeting the condition threshold according to the predicted values and pushes it to the user terminal corresponding to the user identifier, which effectively improves the accuracy of predicting the user's interest in access objects.
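The feature mapping and connection described above can be sketched in plain Python. This is an illustrative sketch only: the layer sizes, the random initialization, the helper names (`dense`, `embed`, `encode`, `random_matrix`) and the parameter dictionary layout are assumptions, and the embedding table is stored with one row per feature value, which is the transpose of the column-indexed matrix $W[v]$ described in the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(x, weights, bias):
    """Nonlinear mapping f(Wx + b) with a sigmoid activation."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def embed(table, index):
    """Look up the dense embedding vector of one discrete feature value."""
    return list(table[index])


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode(continuous, discrete_indices, params):
    """Map continuous and discrete features, connect them, and map again."""
    h_continuous = dense(continuous, params["W_c"], params["b_c"])
    h_discrete = []
    for table, index in zip(params["embeddings"], discrete_indices):
        h_discrete.extend(embed(table, index))
    connected = h_continuous + h_discrete  # feature connection
    return dense(connected, params["W_out"], params["b_out"])


rng = random.Random(0)
user_params = {
    "W_c": random_matrix(4, 1, rng),       # one continuous feature (scaled age)
    "b_c": [0.0] * 4,
    "embeddings": [
        random_matrix(2, 3, rng),          # gender: 2 values, 3-dim embedding
        random_matrix(4, 3, rng),          # education: 4 values, 3-dim embedding
    ],
    "W_out": random_matrix(5, 10, rng),    # 4 + 3 + 3 connected inputs
    "b_out": [0.0] * 5,
}
h_u = encode([0.20], [1, 2], user_params)  # user vector
```

The access object vector would be produced the same way with its own parameter set.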
In one embodiment, as shown in Fig. 5, there is provided a machine learning-based access object prediction method, which specifically includes the following steps:
Step S502, user portrait data and user behavior data corresponding to the user identification are obtained.
Step S504, access object data is obtained, the access object data is sampled according to the user portrait data and the user behavior data, and a candidate access object set is generated by using the access object data obtained by sampling.
Step S506, inputting the user portrait data and the candidate access object set into a trained prediction model, and respectively extracting the user continuous type feature and the user discrete type feature corresponding to the user portrait data and the access object continuous type feature and the access object discrete type feature corresponding to the access object data through the prediction model.
Step S508, inputting the user continuous type feature and the access object continuous type feature into a feedforward neural network layer for feature mapping to obtain a continuous feature matrix; and inputting the user discrete type features and the access object discrete type features into an embedding layer for feature mapping to obtain a discrete feature matrix.
Step S510, performing feature connection processing on the continuous feature matrix and the discrete feature matrix to obtain a user vector and an access object vector, and performing mutual attention analysis on the user vector and the access object vector by using a mutual attention network layer of the prediction model to obtain a user attention weight and an access object attention weight.
Step S512, calculating the user attention access object characteristics according to the user attention weight; and calculating the attention user characteristics of the access object according to the attention weight of the access object.
Step S514, determining a target user feature representation from the user vector and the user attention access object feature, and determining a target object feature representation from the access object vector and the access object attention user feature.
Step S516, calculating the similarity between the target user characteristic representation and the target object characteristic representation; and calculating predicted values of the user identifications corresponding to the access objects according to the similarity.
Step S518, extracting target access object data meeting the condition threshold according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier.
The user attention access object feature represents the attention degree of the user to the access object, and the access object attention user feature represents the attention degree of the access object to the user.
The server acquires the user portrait data and user behavior data corresponding to the user identification, acquires access object data, samples the access object data according to the user portrait data and the user behavior data, and generates a candidate access object set from the sampled access object data. The server inputs the user portrait data and the candidate access object set into the prediction model, which extracts the continuous features and discrete features of the user portrait data and of the access object data and connects them. After the user vector and the access object vector are obtained, the target user feature representation and the target object feature representation corresponding to them are extracted.
Specifically, the prediction model includes a mutual attention network layer based on a mutual attention mechanism. The server performs mutual attention analysis on the user vector and the access object vector with this layer, which learns the user attention weight corresponding to the user vector and the access object attention weight corresponding to the access object vector, so that both weights are obtained effectively.
The server then calculates the user attention access object features from the user attention weights and calculates the access object attention user features from the access object attention weights. Specifically, the server focuses the user attention weight and the access object attention weight on the user vector and the access object vector, respectively, through the mutual attention network layer, thereby generating a user attention access object feature and an access object attention user feature. The server can determine the target user feature representation according to the user vector and the user attention access object feature, and determine the target object feature representation according to the access object vector and the access object attention user feature, so that the final target user feature representation and the target object feature representation can be accurately and effectively obtained.
After extracting the target user feature representation and the target object feature representation, the server calculates the similarity between them, and then calculates the predicted value of each access object for the user identifier according to the similarity.
For example, when the server extracts the user attention weight and the access object attention weight with the mutual attention network layer of the prediction model, it may extract the corresponding user-biased attention weight and access-object-biased attention weight respectively. The specific mathematical expressions are as follows:
Figure GDA0004168419050000141
Figure GDA0004168419050000142
where $\alpha$ is the user-biased attention weight, $\beta$ is the access-object-biased attention weight, softmax(·) is the weight normalization function, $W_u^{att}$ and $W_s^{att}$ are parameter matrices, and the symbols shown in Figure GDA0004168419050000143 and Figure GDA0004168419050000144 are the corresponding biases. The server then performs vector splicing: it applies the user-biased attention weight $\alpha$ and the access-object-biased attention weight $\beta$ to the user vector and the access object vector respectively, generating the user attention access object feature (Figure GDA0004168419050000145) and the access object attention user feature (Figure GDA0004168419050000146). The user vector $h_u$ is then connected with the user attention access object feature (Figure GDA0004168419050000147) to generate the final target user feature representation (Figure GDA0004168419050000148), and the access object vector $h_s$ is connected with the access object attention user feature (Figure GDA0004168419050000149) to generate the final target access object representation (Figure GDA00041684190500001410). The specific mathematical representation may be as follows:
Figure GDA00041684190500001411
Figure GDA00041684190500001412
where $h_u$ is the originally extracted user vector and the symbol shown in Figure GDA00041684190500001413 is the user attention access object feature. The original user vector and the user attention access object feature are spliced, and their connection generates the final target user feature representation (Figure GDA0004168419050000151). This retains the original features extracted by the fully connected layer and the embedding layer and splices on the personalized features generated by the attention mechanism, so that the corresponding target feature representation is obtained effectively and the accuracy of predicting the user's interest in an access object is effectively improved.
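As an illustration of the mutual attention step, the sketch below shows one plausible reading of it; the exact formulas are in images not reproduced here, so the details are assumptions. In this reading, $\alpha$ is a softmax over an affine map of the user vector and re-weights the access object vector element-wise, $\beta$ does the same in the opposite direction, and each original vector is concatenated with the attended feature from the other side. All function names and matrix shapes are hypothetical.

```python
import math


def softmax(values):
    """Normalize a list of scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def affine(x, weights, bias):
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def mutual_attention(h_u, h_s, w_u_att, b_u_att, w_s_att, b_s_att):
    """Return (target user representation, target object representation, alpha, beta).

    Assumed shapes: w_u_att is len(h_s) x len(h_u), w_s_att is len(h_u) x len(h_s).
    """
    alpha = softmax(affine(h_u, w_u_att, b_u_att))  # user-biased attention weight
    beta = softmax(affine(h_s, w_s_att, b_s_att))   # access-object-biased attention weight

    user_attends_object = [a * x for a, x in zip(alpha, h_s)]
    object_attends_user = [b * x for b, x in zip(beta, h_u)]

    # Keep the original vector and splice on the attended feature.
    target_user = h_u + user_attends_object
    target_object = h_s + object_attends_user
    return target_user, target_object, alpha, beta


h_u = [0.2, 0.8]
h_s = [0.5, 0.1, 0.9]
w_u_att = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
w_s_att = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
target_user, target_object, alpha, beta = mutual_attention(
    h_u, h_s, w_u_att, [0.0] * 3, w_s_att, [0.0] * 2
)
```

With this layout both target representations have the same length, so an inner product between them is well defined.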
The server then measures the similarity between the target user feature representation and the target access object representation by their inner product, and passes the similarity score through a sigmoid function to convert it into a probability p:
Figure GDA0004168419050000152
p=sigmoid(score)
where p represents the probability that the user-store pair is a positive sample, i.e. pETR, and the symbol shown in Figure GDA0004168419050000153 is the vector inner product operator. The connection label c is replaced by pETR, as shown in Figure GDA0004168419050000154. For a user $u_0$, the pETR values of all corresponding access objects are then sorted, for example with a heap sort, and the top N access objects are recommended.
The server extracts target access object data meeting the condition threshold according to the predicted value, and pushes the target access object data to the user terminal corresponding to the user identifier, so that the prediction accuracy of the interest degree of the user on the access object can be effectively improved, and the pushing efficiency and the pushing accuracy of the access object data are effectively improved.
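The scoring and top-N recommendation step can be sketched as follows, assuming the target representations are plain lists of equal length. The function names are hypothetical, and the use of `heapq.nlargest` is one convenient way to take the N highest-scoring access objects, not necessarily the sorting method of this embodiment.

```python
import heapq
import math


def predict_probability(target_user, target_object):
    """p = sigmoid(score), where score is the inner product of the two representations."""
    score = sum(u * s for u, s in zip(target_user, target_object))
    return 1.0 / (1.0 + math.exp(-score))


def recommend_top_n(target_user, target_objects, n):
    """Return the n access objects with the highest predicted probability."""
    scored = (
        (predict_probability(target_user, vector), obj_id)
        for obj_id, vector in target_objects.items()
    )
    return [(obj_id, p) for p, obj_id in heapq.nlargest(n, scored)]


user_repr = [1.0, 0.0]
store_reprs = {"a": [2.0, 0.0], "b": [-1.0, 0.0], "c": [0.0, 0.0]}
top_two = recommend_top_n(user_repr, store_reprs, 2)
```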
In one embodiment, extracting target access object data meeting a condition threshold according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier includes: extracting access object identifiers meeting a condition threshold according to the predicted value, and acquiring target access object data corresponding to the access object identifiers; and sequencing the extracted target access object data according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier according to the sequencing result.
The server acquires user portrait data and user behavior data corresponding to the user identification, acquires access object data, samples the access object data according to the user portrait data and the user behavior data, and generates a candidate access object set by using the access object data obtained through sampling. The server further performs feature extraction on the user portrait data and the candidate access object set by the prediction model to obtain target user feature representation and target object feature representation, and further can accurately and effectively calculate predicted values of the user identifications corresponding to the access objects according to the target user feature representation and the target object feature representation.
After calculating the predicted value of each access object for the user identifier with the prediction model, the server extracts the target access object data meeting the condition threshold according to the predicted values and pushes it to the user terminal corresponding to the user identifier. Specifically, the server extracts the access object identifiers meeting the condition threshold according to the predicted values, acquires the target access object data corresponding to those identifiers, sorts the extracted target access object data by predicted value, and pushes it to the user terminal in the sorted order. The access objects are thus pushed to the user in order of predicted interestingness, so that the user receives the access object data with the highest predicted interestingness first. This effectively improves the accuracy of predicting the user's interest in access objects, as well as the efficiency and accuracy of pushing access object data.
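A minimal sketch of the threshold-and-sort step described above, assuming the predicted values are held in a dictionary keyed by access object identifier. The function name `select_for_push`, the catalog layout and the threshold value are hypothetical.

```python
def select_for_push(predicted_values, catalog, threshold):
    """Return access object data meeting the threshold, highest prediction first.

    predicted_values: dict of access object identifier -> predicted value.
    catalog: dict of access object identifier -> access object data.
    """
    kept = [
        (obj_id, value)
        for obj_id, value in predicted_values.items()
        if value >= threshold and obj_id in catalog
    ]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [catalog[obj_id] for obj_id, _ in kept]


predictions = {"s1": 0.91, "s2": 0.35, "s3": 0.77}
catalog = {"s1": {"name": "Store 1"}, "s2": {"name": "Store 2"}, "s3": {"name": "Store 3"}}
push_list = select_for_push(predictions, catalog, threshold=0.5)
```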
In one embodiment, before inputting the user portrait data and the candidate access object set into the prediction model, the method further includes: acquiring training user sample data and training access object data, and generating a training set and a verification set from the training user sample data and the training access object data; inputting the training set into a preset machine learning model for learning and training to obtain a training result; iteratively updating the model parameters of the machine learning model according to the training result to obtain an initial prediction model; and verifying the initial prediction model with the verification set until the verification condition threshold is met, to obtain the trained prediction model.
Before using the prediction model to perform predictive analysis on the user portrait data and the candidate access object set, the server needs to construct the machine-learning-based prediction model in advance. Specifically, the server may obtain a large amount of training user sample data and training access object data in advance from a local database or a third-party database, where the training user sample data includes user portrait data and user behavior data. For example, the server may obtain the user behavior data of a user from historical log information.
The server then generates a training set and a verification set from the obtained training user sample data and training access object data. The training user sample data in the training set may include annotated connection labels. The server performs data cleaning and data preprocessing on the data in the training set and the verification set. For example, the server may vectorize the training user sample data and the training access object data to obtain the corresponding feature vectors, and convert the feature vectors into corresponding feature variables. The server further performs derivation processing on the feature variables, such as filling missing values and extracting and replacing abnormal values, to obtain a plurality of processed feature variables.
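The cleaning of a single feature variable mentioned above (filling missing values and replacing abnormal values) might look like the sketch below. The specific choices are assumptions made for illustration: missing values are filled with the median of the observed values, and abnormal values are clipped to fixed bounds supplied by the caller. The function name `clean_feature` is hypothetical.

```python
import statistics


def clean_feature(values, lower, upper):
    """Fill missing entries with the median and clip abnormal values to [lower, upper]."""
    observed = [v for v in values if v is not None]
    fill_value = statistics.median(observed)
    cleaned = []
    for v in values:
        if v is None:
            v = fill_value               # fill the missing value
        cleaned.append(min(max(v, lower), upper))  # replace abnormal values
    return cleaned


ages = clean_feature([20, None, 30, 400], lower=0, upper=100)
```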
The server obtains a preset machine learning model, which may, for example, be based on a mutual attention network. For example, the machine learning model includes a plurality of neural network models, which may include an input layer, a data preprocessing layer, a fully connected layer, an embedding layer, a mapping layer, a mutual attention network layer, an output layer, and so on. The network layers of the neural network model may include an activation function and a bias loss function. The neural network model also specifies how the error is determined, for example by minimizing cross entropy, and how the weight parameters are iteratively updated, for example with the Adam optimization algorithm, which updates the parameters in the network iteratively based on the training data.
After obtaining the preset machine learning model, the server inputs the preprocessed training user sample data and training access object data in the training set into the model for learning and training. The model undergoes multi-target joint training on the training set to obtain a training result, and the model parameters are iteratively updated according to the training result until the prediction performance no longer improves, yielding an initial prediction model. During training there are multiple target loss functions, which need to be fused to obtain the loss function of the whole model. For example, the loss function of each target may be weighted according to the directivity of the prediction target, and the weighted loss functions of the targets are then summed. During training, the Adam optimizer may be used to update the parameters in the network iteratively based on the training data; it can find a locally optimal solution to the optimization problem in a reasonable time, so that the prediction model is trained effectively and continuously optimized to obtain the initial prediction model.
After obtaining the initial prediction model, the server inputs the training user sample data and training access object data in the verification set into the initial prediction model for further training and verification, and obtains the class probabilities corresponding to the verification data. Training stops when the number of verification samples meeting the condition threshold reaches the verification threshold, yielding the trained prediction model. By training and learning on a large amount of training user sample data and training access object data, a prediction model with high prediction accuracy can be constructed and trained effectively, which effectively improves the accuracy of predicting the user's interest in candidate access object data.
For example, the prediction model may use the binary cross-entropy function as the loss function, and the specific formula may be as follows:
Figure GDA0004168419050000171
where $\Omega$ is the training set, $M$ is the number of samples in the training set, and $y$ is the true label of a sample. The model updates and learns its parameters with the gradient-based optimization algorithm Adam and the back-propagation algorithm. After model training is complete, during the generalization test the user and all access objects in the area are formed into user-access-object pairs, so that the predicted value of each access object in the candidate access object set is effectively predicted.
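For reference, the standard binary cross-entropy averaged over M samples can be computed as in the sketch below. The formula image is not reproduced in this text, so the code uses the textbook definition, the mean of $-[y \log p + (1 - y)\log(1 - p)]$; the clipping constant `eps` is an added numerical safeguard and not part of the source.

```python
import math


def binary_cross_entropy(labels, probabilities, eps=1e-12):
    """Mean binary cross-entropy between true labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


uninformed = binary_cross_entropy([1, 0], [0.5, 0.5])
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
```

A model that always predicts 0.5 scores log 2 per sample, and more accurate probabilities give a lower loss.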
In a specific embodiment, the access object may be a physical object accessible to multiple users, such as a physical store, a network merchant, and a public number. Take the access object as a store as an example. After the server obtains user portrait data and user behavior data corresponding to the user identifier, a plurality of store data are obtained. Wherein the user behavior data comprises network connection data. The server samples store data according to the user portrait data and the user behavior data. Specifically, the server may identify store data associated with the user presence based on the user portrait data and the network connection data, and determine a store data portion associated with the user presence as user store pair data. The user store pair data indicates paired data formed by a certain user and a certain store. The server determines the data of the identified store pair of the user as positive sampling data, determines the rest store data as negative sampling data, and generates a candidate store set by using the positive sampling data and the negative sampling data obtained by sampling.
The server further inputs the user image data and store data in the candidate store set into the trained predictive model, and performs feature extraction on the user image data and store data to extract various features of the user image data and various features of the store, respectively, for example, the various features may include continuous features and discrete features. The server then performs feature connection on various features of the user and the store, and then extracts a user vector and a store vector, respectively.
The server further performs mutual attention analysis on the user vector and the store vector through a mutual attention mechanism in the prediction model, so that the user attention weight and the store attention weight can be obtained. And the prediction model carries out mutual attention learning on the user vector, the store vector, the user attention weight and the store attention weight to obtain store attention characteristics and store attention user characteristics of the user, and outputs final target user characteristic representation and target store characteristic representation. The server further calculates the similarity between the target user characteristic representation and the target store characteristic representation by using a loss function preset in the prediction model, and further calculates the predicted value of each store corresponding to the user identification according to the similarity. The predicted value may represent a store entry prediction probability value for the user for each store to predict a store entry probability for the user to enter each store. The server further sorts the predicted values of all shops corresponding to the user from large to small, extracts the preset quantity of shop data with higher sorting, pushes the extracted shop data to the corresponding user terminal according to the sorting, and accordingly can accurately and effectively predict the store entering prediction probability of each shop corresponding to the user. For example, the server may recommend marketing to potential target users by calculating predicted values for each store for the user based on the WiFi connection data, the user portrayal data, and the store feature data, and by calculating predicted values for each store for the user based on the predicted values.
In a specific embodiment, the server acquires a large amount of user portrait data, user behavior data and store data, and performs experimental analysis using the above method. Specifically, analysis data is constructed from the collected data, and 30,000 user-store pairs are randomly selected from it. For each selected pair, 10 negative samples are drawn as negative sample data, so that 330,000 user-store pairs in total form the experimental data set. The experimental data set is divided into a training set, a validation set and a test set in the ratio 7:1:2. Two experimental models are trained using the training set and the validation set: the prediction model based on the mutual attention mechanism provided in this embodiment, and an existing model without an attention mechanism. Each trained model is then evaluated on the test set to obtain a test result. The experimental results are shown in Table 1 below:
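The data set construction described above can be sketched as follows. This is a minimal illustration on toy data, not the procedure actually used in the experiment: the function names, the uniform random choice of negatives and the fixed seed are all assumptions, and 30 positive pairs stand in for the 30,000 mentioned in the text.

```python
import random


def build_dataset(positive_pairs, all_stores, negatives_per_positive=10, seed=0):
    """Return (user, store, label) rows: each positive pair plus sampled negatives."""
    rng = random.Random(seed)
    visited = {}
    for user, store in positive_pairs:
        visited.setdefault(user, set()).add(store)
    rows = []
    for user, store in positive_pairs:
        rows.append((user, store, 1))
        # Assumption: negatives are drawn uniformly from stores the user never visited.
        candidates = [s for s in all_stores if s not in visited[user]]
        for negative in rng.sample(candidates, negatives_per_positive):
            rows.append((user, negative, 0))
    return rows


def split_7_1_2(rows, seed=0):
    """Shuffle the rows and split them into training, validation and test sets."""
    rng = random.Random(seed)
    rows = rows[:]
    rng.shuffle(rows)
    n_train = len(rows) * 7 // 10
    n_val = len(rows) // 10
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


# Toy data: 30 positive pairs over 50 hypothetical stores.
stores = [f"store_{i}" for i in range(50)]
positives = [(f"user_{i}", stores[i]) for i in range(30)]
rows = build_dataset(positives, stores)
train, val, test = split_7_1_2(rows)
print(len(rows), len(train), len(val), len(test))  # 330 231 33 66
```

Scaling the toy numbers by 1,000 reproduces the arithmetic in the text: 30,000 positive pairs plus 10 negatives each give 330,000 pairs, which a 7:1:2 split divides into 231,000, 33,000 and 66,000.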
Experimental model | Predictive value | Accuracy
Prediction model based on the mutual attention mechanism in this embodiment | 64.81 | 84.69
Prior-art model without an attention mechanism | 61.53 | 83.52
Table 1
As can be seen from Table 1 above, the machine learning based access object prediction method provided in this embodiment outperforms the existing model without an attention mechanism in both accuracy and F1 value. The prediction accuracy of the user's degree of interest in the access objects is thereby effectively improved, and so are the efficiency and accuracy of pushing access object data.
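For reference, accuracy and F1 value can be computed from binary labels and binary predictions as in the sketch below. The function name and the toy labels are illustrative assumptions; the source does not state how the predicted values were thresholded into binary decisions.

```python
def accuracy_and_f1(labels, predictions):
    """Compute accuracy and F1 for binary labels and binary predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    accuracy = sum(1 for y, p in pairs if y == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy example with many more negatives than positives.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
predictions = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
accuracy, f1 = accuracy_and_f1(labels, predictions)
print(round(accuracy, 4), round(f1, 4))  # 0.8182 0.6667
```

On a data set with many more negative than positive pairs, accuracy is dominated by the negatives, so the F1 value is a useful complement when comparing models.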
Figs. 2-5 are flow diagrams of the machine learning based access object prediction method in one embodiment. It should be understood that, although the steps in the flowcharts of figs. 2-5 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in figs. 2-5 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 6, there is provided a machine learning based access object prediction apparatus 600 comprising a data acquisition module 602, a data sampling module 604, a data prediction module 606, and a data push module 608, wherein:
a data acquisition module 602, configured to acquire user portrait data and user behavior data corresponding to a user identifier; acquiring access object data;
The data sampling module 604 is configured to sample the access object data according to the user portrait data and the user behavior data, and generate a candidate access object set by using the access object data obtained by sampling;
the data prediction module 606 is configured to input the user portrait data and the candidate access object set into a prediction model, and perform feature extraction on the user portrait data and the candidate access object set to obtain a target user feature representation and a target object feature representation; calculating predicted values of the user identifications corresponding to the access objects according to the target user characteristic representation and the target object characteristic representation;
and the data pushing module 608 is configured to extract, according to the predicted value, target access object data that meets the condition threshold, and push the target access object data to a user terminal corresponding to the user identifier.
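The division of labour among the four modules can be sketched as a small pipeline. The class name, the callable signatures, the default threshold of 0.5 and the stub stages in the usage example are all hypothetical; they only illustrate the order in which modules 602 to 608 hand data to one another.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AccessObjectPredictionApparatus:
    """Wires four stages together in the order of modules 602 to 608."""
    acquire: Callable[[str], dict]                           # data acquisition module 602
    sample: Callable[[dict], List[str]]                      # data sampling module 604
    predict: Callable[[dict, List[str]], Dict[str, float]]   # data prediction module 606
    push: Callable[[str, List[str]], None]                   # data pushing module 608
    threshold: float = 0.5                                   # hypothetical condition threshold

    def run(self, user_id: str) -> List[str]:
        data = self.acquire(user_id)
        candidates = self.sample(data)
        scores = self.predict(data, candidates)
        # Keep objects whose predicted value meets the threshold, best first.
        targets = sorted((obj for obj in candidates if scores[obj] >= self.threshold),
                         key=lambda obj: scores[obj], reverse=True)
        self.push(user_id, targets)
        return targets


# Usage with stub stages standing in for the real modules.
pushed = {}
apparatus = AccessObjectPredictionApparatus(
    acquire=lambda uid: {"user": uid, "objects": ["a", "b", "c"]},
    sample=lambda data: data["objects"],
    predict=lambda data, cands: {"a": 0.9, "b": 0.2, "c": 0.6},
    push=lambda uid, targets: pushed.update({uid: targets}),
)
print(apparatus.run("user_1"))  # ['a', 'c']
```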
In one embodiment, the data sampling module 604 is further configured to perform association screening on the user portrait data and the access object data according to the user behavior data, so as to obtain user access object interaction data; generating positive sample data and negative sample data from the access object data according to the user access object interaction data; a set of candidate access objects is generated using the positive sample data and the negative sample data.
In one embodiment, the user behavior data includes network connection data; the data sampling module 604 is further configured to identify access object data in the network connection data, where a connection record exists between the access object data and the user identifier; associating the access object data with the connection record with the user identifier to generate user access object interaction data; determining user access object interaction data as positive sample data; negative sample data is generated using unassociated access object data in the access object data.
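The labelling rule in this embodiment can be sketched as follows, assuming that the network connection data is available as a list of (user identifier, access object identifier) connection records. The function name and the record layout are illustrative assumptions.

```python
def label_access_objects(user_id, connection_records, access_objects):
    """Split access objects into positives (connected) and negatives (not connected)."""
    connected = {obj for uid, obj in connection_records if uid == user_id}
    positives = [obj for obj in access_objects if obj in connected]
    negatives = [obj for obj in access_objects if obj not in connected]
    return positives, negatives


records = [("user_1", "store_a"), ("user_2", "store_b"), ("user_1", "store_c")]
objects = ["store_a", "store_b", "store_c", "store_d"]
positives, negatives = label_access_objects("user_1", records, objects)
print(positives, negatives)  # ['store_a', 'store_c'] ['store_b', 'store_d']
```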
In one embodiment, the data sampling module 604 is further configured to extract user association feature degrees of the positive sample data and the negative sample data from connection records in the network connection data; perform sampling allocation on each access object data in the candidate access object set according to the user association feature degrees to obtain a sampling probability of each access object data; and extract the access object data whose sampling probability meets a preset condition, and generate a candidate access object set by using the extracted access object data.
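One possible reading of this step is sketched below. The source does not define how the user association feature degree is computed or what the preset condition is; here the degree is assumed to be the number of connection records per access object with add-one smoothing, the sampling probability is the normalized degree, and the preset condition is a minimum probability. All of these are assumptions made for illustration.

```python
from collections import Counter


def sampling_probabilities(connection_records, candidates):
    """Normalize per-object association degrees into sampling probabilities."""
    counts = Counter(obj for _, obj in connection_records)
    # Assumption: degree = connection count + 1, so unconnected objects stay sampleable.
    degrees = {obj: counts[obj] + 1 for obj in candidates}
    total = sum(degrees.values())
    return {obj: degree / total for obj, degree in degrees.items()}


def select_candidates(probabilities, min_probability):
    """Keep the objects whose sampling probability meets the (assumed) preset condition."""
    return sorted(obj for obj, p in probabilities.items() if p >= min_probability)


records = [("user_1", "a"), ("user_2", "a"), ("user_3", "b")]
probabilities = sampling_probabilities(records, ["a", "b", "c"])
selected = select_candidates(probabilities, 0.3)
print(selected)  # ['a', 'b']
```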
In one embodiment, the data prediction module 606 is further configured to extract, through a prediction model, a user continuous feature and a user discrete feature corresponding to the user portrait data, and an access object continuous feature and an access object discrete feature corresponding to the access object data, respectively; inputting the user continuous type feature and the access object continuous type feature into a forward neural network layer for feature mapping to obtain a continuous type feature matrix; inputting the discrete features of the user and the discrete features of the access object into the embedded layer for feature mapping to obtain a discrete feature matrix; and carrying out feature connection processing on the continuous feature matrix and the discrete feature matrix to obtain a user vector and an access object vector.
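The feature mapping described in this embodiment can be sketched in plain Python as follows. The layer sizes, the ReLU activation, the random initial weights and the per-entity handling are assumptions; the source only states that continuous features pass through a forward neural network layer, discrete features pass through an embedding layer, and the results are concatenated.

```python
import random


def dense_layer(x, weights, bias):
    """Feed-forward layer with ReLU: maps continuous features to a hidden vector."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def embed(indices, tables):
    """Look up one embedding row per discrete feature and flatten the result."""
    out = []
    for index, table in zip(indices, tables):
        out.extend(table[index])
    return out


def entity_vector(continuous, discrete, weights, bias, tables):
    """Concatenate the mapped continuous and discrete features into one vector."""
    return dense_layer(continuous, weights, bias) + embed(discrete, tables)


rng = random.Random(0)
# Hypothetical user: 3 continuous features, 2 discrete features (vocabularies of 5 and 4).
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
bias = [0.0] * 4
tables = [[[rng.uniform(-1, 1) for _ in range(2)] for _ in range(5)],
          [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(4)]]
user_vector = entity_vector([0.3, 1.2, 0.0], [2, 1], weights, bias, tables)
print(len(user_vector))  # 8
```

The access object vector would be built the same way from the access object's own continuous and discrete features.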
In one embodiment, the data prediction module 606 is further configured to perform a mutual attention analysis on the user vector and the access object vector by using a mutual attention network layer of the prediction model, to obtain a user attention weight and an access object attention weight; calculating the user attention access object characteristics according to the user attention weights; calculating the attention user characteristics of the access object according to the attention weight of the access object; determining a target user feature representation from the user vector and the user attention access object feature, and determining a target object feature representation from the access object vector and the access object attention user feature; calculating a similarity between the target user feature representation and the target object feature representation; and calculating predicted values of the user identifications corresponding to the access objects according to the similarity.
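The source does not give the formulas of the mutual attention network layer, so the following is only a generic co-attention sketch under stated assumptions: each side is represented as a list of field vectors of equal dimension, attention weights are a softmax over mean dot-product affinities, each target representation concatenates a mean-pooled vector with the attended feature, similarity is the cosine, and a sigmoid maps it to a predicted value.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    return [sum(w * v[k] for w, v in zip(weights, vectors))
            for k in range(len(vectors[0]))]


def mean_pool(vectors):
    return weighted_sum([1.0 / len(vectors)] * len(vectors), vectors)


def mutual_attention_score(user_fields, object_fields):
    # Affinity between every user field and every object field.
    affinity = [[dot(u, v) for v in object_fields] for u in user_fields]
    # User attention weights: how strongly the user attends to each object field.
    user_attn = softmax([sum(row[j] for row in affinity) / len(user_fields)
                         for j in range(len(object_fields))])
    # Object attention weights: how strongly the object attends to each user field.
    object_attn = softmax([sum(row) / len(object_fields) for row in affinity])
    user_attended_object = weighted_sum(user_attn, object_fields)
    object_attended_user = weighted_sum(object_attn, user_fields)
    target_user = mean_pool(user_fields) + user_attended_object
    target_object = mean_pool(object_fields) + object_attended_user
    cosine = dot(target_user, target_object) / (
        math.sqrt(dot(target_user, target_user)) *
        math.sqrt(dot(target_object, target_object)))
    # Squash the similarity into (0, 1) so it can be read as a predicted value.
    return 1.0 / (1.0 + math.exp(-cosine)), user_attn, object_attn


user_fields = [[1.0, 0.0], [0.0, 1.0]]
object_fields = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
score, user_attn, object_attn = mutual_attention_score(user_fields, object_fields)
print(round(score, 4))
```

In a trained model the field vectors would come from the feature connection step and the attention parameters would be learned; here they are fixed toy values.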
In one embodiment, the data pushing module 608 is further configured to extract, according to the predicted value, an access object identifier that meets a condition threshold, and obtain target access object data corresponding to the access object identifier; and sequencing the extracted target access object data according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier according to the sequencing result.
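The extraction and ordering performed by the data pushing module can be sketched as follows. The function name, the optional `top_n` cut-off and the example threshold of 0.6 are illustrative assumptions.

```python
def select_targets(predicted, threshold, top_n=None):
    """Keep objects whose predicted value meets the threshold, sorted best first."""
    kept = [(obj, value) for obj, value in predicted.items() if value >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_n] if top_n is not None else kept


predicted = {"store_a": 0.91, "store_b": 0.35, "store_c": 0.77, "store_d": 0.62}
targets = select_targets(predicted, threshold=0.6)
print(targets)  # [('store_a', 0.91), ('store_c', 0.77), ('store_d', 0.62)]
```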
In one embodiment, as shown in fig. 7, the apparatus further includes a model training module 601, configured to obtain training user sample data and training access object data, and generate a training set and a verification set using the training user sample data and the training access object data; input the training set into a preset machine learning model for learning and training to obtain a training result; iteratively update model parameters of the machine learning model according to the training result to obtain an initial prediction model; and verify the initial prediction model by using the verification set until a verification condition threshold is met, to obtain the trained prediction model.
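The train-then-verify loop can be illustrated with a deliberately small stand-in model. The one-feature logistic regression, the learning rate, the validation accuracy threshold of 0.95 and the synthetic data below are all assumptions; the source does not specify the preset machine learning model or the verification condition.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def accuracy(params, rows):
    w, b = params
    return sum((sigmoid(w * x + b) >= 0.5) == (y == 1) for x, y in rows) / len(rows)


def train(train_rows, val_rows, val_threshold=0.95, lr=0.5, max_epochs=200):
    """Update the parameters epoch by epoch until the validation threshold is met."""
    w, b = 0.0, 0.0
    for epoch in range(1, max_epochs + 1):
        for x, y in train_rows:
            error = sigmoid(w * x + b) - y  # gradient of the log loss w.r.t. the logit
            w -= lr * error * x
            b -= lr * error
        if accuracy((w, b), val_rows) >= val_threshold:
            break
    return (w, b), epoch


# Synthetic data: the label is 1 when the single feature is positive.
rng = random.Random(0)
data = [(x, 1 if x > 0 else 0) for x in (rng.uniform(-1, 1) for _ in range(200))]
train_rows, val_rows = data[:160], data[160:]
params, epochs_used = train(train_rows, val_rows)
print(epochs_used, round(accuracy(params, val_rows), 3))
```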
FIG. 8 illustrates an internal block diagram of a computer device in one embodiment. The computer device may specifically be the server 104 of fig. 1. As shown in fig. 8, the computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing user portrait data, user behavior data, access object data and other data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the steps of the machine learning based access object prediction method provided in any one of the embodiments of the present application.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, the machine learning based access object prediction apparatus provided herein may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 8. The memory of the computer device may store various program modules that make up the machine learning-based access object prediction apparatus, such as the data acquisition module 602, the data sampling module 604, the data prediction module 606, and the data pushing module 608 shown in fig. 6. The computer program constituted by the respective program modules causes the processor to execute the steps in the machine learning-based access object prediction method of the respective embodiments of the present application described in the present specification.
For example, the computer apparatus shown in fig. 8 may perform step 202 by the data acquisition module 602 in the machine learning-based access object prediction apparatus shown in fig. 6. The computer device may perform step 204 through the data sampling module 604. The computer device may perform steps 206 and 208 via the data prediction module 606, and the computer device may perform step 210 via the data push module 608.
In one embodiment, a computer device is provided that includes a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the machine learning based access object prediction method described above. The step of the machine learning-based access object prediction method herein may be a step in the machine learning-based access object prediction method of the above-described respective embodiments.
In one embodiment, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above machine learning based access object prediction method. The steps of the machine learning-based access object prediction method herein may be the steps in the machine learning-based access object prediction method of the above-described respective embodiments.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware, where the program may be stored in a non-volatile computer readable storage medium, and where the program, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described in relative detail but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (16)

1. An access object prediction method based on machine learning, comprising:
acquiring user portrait data and user behavior data corresponding to the user identifier;
acquiring access object data, sampling the access object data according to the user portrait data and the user behavior data, and generating a candidate access object set by using the access object data obtained by sampling;
inputting the user portrait data and the candidate access object set into a trained prediction model, and extracting features of the user portrait data and the candidate access object set to obtain target user feature representation and target object feature representation;
calculating predicted values of the user identifications corresponding to all access objects according to the target user characteristic representation and the target object characteristic representation;
extracting target access object data meeting a condition threshold according to the predicted value, and pushing the target access object data to a user terminal corresponding to the user identifier;
the feature extraction of the user portrait data and the candidate access object set includes:
extracting user continuous type features and user discrete type features corresponding to the user portrait data and access object continuous type features and access object discrete type features corresponding to the access object data respectively through the prediction model;
inputting the user continuous type feature and the access object continuous type feature into a forward neural network layer for feature mapping to obtain a continuous type feature matrix;
inputting the user discrete features and the access object discrete features into an embedded layer for feature mapping to obtain a discrete feature matrix;
and carrying out feature connection processing on the continuous feature matrix and the discrete feature matrix to obtain a user vector and an access object vector.
2. The method of claim 1, wherein the sampling the access object data according to the user portrait data and the user behavior data, and generating a candidate access object set by using the access object data obtained by sampling includes:
performing association screening on the user portrait data and the access object data according to the user behavior data to obtain user access object interaction data;
generating positive sample data and negative sample data from the access object data according to the user access object interaction data;
and generating a candidate access object set by utilizing the positive sample data and the negative sample data.
3. The method of claim 2, wherein the user behavior data comprises network connection data; the generating positive sample data and negative sample data from the access object data according to the user access object interaction data includes:
identifying, in the network connection data, access object data having a connection record with the user identifier;
associating the access object data having the connection record with the user identifier to generate user access object interaction data;
determining the user access object interaction data as positive sample data;
and generating negative sample data by using unassociated access object data in the access object data.
4. The method of claim 3, wherein the generating a set of candidate access objects using the positive sample data and negative sample data comprises:
extracting user-associated feature degrees of the positive sample data and the negative sample data according to connection records in the network connection data;
sampling and distributing each access object data in the candidate access object set according to the user association feature degree to obtain sampling probability of each access object data;
and extracting the access object data of which the sampling probability meets the preset condition, and generating a candidate access object set by using the extracted access object data.
5. The method of claim 1, wherein calculating the predicted value of the user identification for each access object based on the target user characteristic representation and target object characteristic representation comprises:
performing mutual attention analysis on the user vector and the access object vector by using a mutual attention network layer of the prediction model to obtain a user attention weight and an access object attention weight;
calculating the user attention access object characteristics according to the user attention weights; calculating the attention user characteristics of the access object according to the attention weight of the access object;
determining a target user feature representation according to the user vector and the user attention access object feature, and determining a target object feature representation according to the access object vector and the access object attention user feature;
calculating the similarity between the target user feature representation and the target object feature representation;
and calculating the predicted value of each access object corresponding to the user identification according to the similarity.
6. The method according to claim 1, wherein extracting target access object data meeting a condition threshold according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier comprises:
extracting access object identifiers meeting a condition threshold according to the predicted value, and acquiring target access object data corresponding to the access object identifiers;
and sequencing the extracted target access object data according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier according to the sequencing result.
7. The method of any one of claims 1 to 6, wherein before the inputting the user representation data and the set of candidate access objects to a predictive model, further comprising:
acquiring training user sample data and training access object data, and generating a training set and a verification set by using the training user sample data and the training access object data;
inputting the training set into a preset machine learning model for learning and training to obtain a training result; iteratively updating model parameters of the machine learning model according to the training result to obtain an initial prediction model;
and verifying the initial prediction model by using the verification set until a verification condition threshold is met, and obtaining a trained prediction model.
8. An access object prediction apparatus based on machine learning, the apparatus comprising:
the data acquisition module is used for acquiring user portrait data and user behavior data corresponding to the user identifier; acquiring access object data;
the data sampling module is used for sampling the access object data according to the user portrait data and the user behavior data, and generating a candidate access object set by using the access object data obtained by sampling;
the data prediction module is used for inputting the user portrait data and the candidate access object set into a prediction model, and extracting the characteristics of the user portrait data and the candidate access object set to obtain target user characteristic representation and target object characteristic representation; calculating predicted values of the user identifications corresponding to all access objects according to the target user characteristic representation and the target object characteristic representation;
the data pushing module is used for extracting target access object data meeting a condition threshold according to the predicted value and pushing the target access object data to the user terminal corresponding to the user identifier;
the data prediction module is also used for respectively extracting the user continuous type feature and the user discrete type feature corresponding to the user portrait data and the access object continuous type feature and the access object discrete type feature corresponding to the access object data through the prediction model; inputting the user continuous type feature and the access object continuous type feature into a forward neural network layer for feature mapping to obtain a continuous type feature matrix; inputting the user discrete features and the access object discrete features into an embedded layer for feature mapping to obtain a discrete feature matrix; and carrying out feature connection processing on the continuous feature matrix and the discrete feature matrix to obtain a user vector and an access object vector.
9. The device of claim 8, wherein the data sampling module is further configured to perform association filtering on the user portrait data and the access object data according to the user behavior data to obtain user access object interaction data; generating positive sample data and negative sample data from the access object data according to the user access object interaction data; and generating a candidate access object set by utilizing the positive sample data and the negative sample data.
10. The apparatus of claim 9, wherein the user behavior data comprises network connection data; the data sampling module is also used for identifying, in the network connection data, access object data having a connection record with the user identifier; associating the access object data having the connection record with the user identifier to generate user access object interaction data; determining the user access object interaction data as positive sample data; and generating negative sample data by using unassociated access object data in the access object data.
11. The apparatus of claim 10, wherein the data sampling module is further configured to extract user association feature degrees of the positive sample data and the negative sample data from connection records in the network connection data; performing sampling allocation on each access object data in the candidate access object set according to the user association feature degrees to obtain sampling probability of each access object data; and extracting the access object data of which the sampling probability meets the preset condition, and generating a candidate access object set by using the extracted access object data.
12. The apparatus of claim 8, wherein the data prediction module is further configured to perform a mutual attention analysis on the user vector and the access object vector using a mutual attention network layer of the prediction model to obtain a user attention weight and an access object attention weight; calculating the user attention access object characteristics according to the user attention weights; calculating the attention user characteristics of the access object according to the attention weight of the access object; determining a target user feature representation according to the user vector and the user attention access object feature, and determining a target object feature representation according to the access object vector and the access object attention user feature; calculating the similarity between the target user feature representation and the target object feature representation; and calculating the predicted value of each access object corresponding to the user identification according to the similarity.
13. The apparatus of claim 8, wherein the data pushing module is further configured to extract, according to the predicted value, an access object identifier that meets a condition threshold, and obtain target access object data corresponding to the access object identifier; and sequencing the extracted target access object data according to the predicted value, and pushing the target access object data to the user terminal corresponding to the user identifier according to the sequencing result.
14. The apparatus of any one of claims 8-13, further comprising a model training module to obtain training user sample data and training access object data, and to generate a training set and a validation set using the training user sample data and the training access object data; inputting the training set into a preset machine learning model for learning and training to obtain a training result; iteratively updating model parameters of the machine learning model according to the training result to obtain an initial prediction model; and verifying the initial prediction model by using the verification set until a verification condition threshold is met, and obtaining a trained prediction model.
15. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of any one of claims 1 to 7.
16. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 7.
CN202010076468.4A 2020-01-23 2020-01-23 Access object prediction method and device based on machine learning and computer equipment Active CN111291264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010076468.4A CN111291264B (en) 2020-01-23 2020-01-23 Access object prediction method and device based on machine learning and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010076468.4A CN111291264B (en) 2020-01-23 2020-01-23 Access object prediction method and device based on machine learning and computer equipment

Publications (2)

Publication Number Publication Date
CN111291264A CN111291264A (en) 2020-06-16
CN111291264B true CN111291264B (en) 2023-06-23

Family

ID=71023375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010076468.4A Active CN111291264B (en) 2020-01-23 2020-01-23 Access object prediction method and device based on machine learning and computer equipment

Country Status (1)

Country Link
CN (1) CN111291264B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113781076A (en) * 2020-06-29 2021-12-10 北京沃东天骏信息技术有限公司 Prompting method, device, equipment and readable storage medium
WO2022016363A1 (en) * 2020-07-21 2022-01-27 Telefonaktiebolaget Lm Ericsson (Publ) Similar data set identification
CN111898904B (en) * 2020-07-28 2024-03-22 拉扎斯网络科技(上海)有限公司 Data processing method and device
CN113780607A (en) * 2020-11-16 2021-12-10 北京沃东天骏信息技术有限公司 Method and device for generating model and method and device for generating information
CN112581246A (en) * 2020-12-23 2021-03-30 上海永骁智能技术有限公司 Differentiation method and device based on deep learning tax service
CN112819355A (en) * 2021-02-09 2021-05-18 浙江工商大学 Student learning state evaluation method and device and computer equipment
CN112925982B (en) * 2021-03-12 2023-04-07 上海意略明数字科技股份有限公司 User redirection method and device, storage medium and computer equipment
CN113159840A (en) * 2021-04-12 2021-07-23 深圳市腾讯信息技术有限公司 Object type prediction method, device and storage medium
CN114048392B (en) * 2022-01-13 2022-07-01 北京达佳互联信息技术有限公司 Multimedia resource pushing method and device, electronic equipment and storage medium
CN117422530B (en) * 2023-12-19 2024-03-26 深圳华强电子交易网络有限公司 Electronic component information pushing method and device and electronic equipment

Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324686A (en) * 2013-06-03 2013-09-25 中国科学院自动化研究所 Real-time individuation video recommending method based on text stream network
WO2015188699A1 (en) * 2014-06-10 2015-12-17 华为技术有限公司 Item recommendation method and device
CN105354202A (en) * 2014-08-20 2016-02-24 阿里巴巴集团控股有限公司 Data pushing method and apparatus
CN105447013A (en) * 2014-08-18 2016-03-30 南京理工大学常熟研究院有限公司 News recommendation system
CN105843953A (en) * 2016-04-12 2016-08-10 乐视控股(北京)有限公司 Multimedia recommendation method and device
CN105956146A (en) * 2016-05-12 2016-09-21 腾讯科技(深圳)有限公司 Article information recommending method and device
CN106168980A (en) * 2016-07-26 2016-11-30 合网络技术(北京)有限公司 Multimedia resource recommends sort method and device
CN106326391A (en) * 2016-08-17 2017-01-11 合智能科技(深圳)有限公司 Method and device for recommending multimedia resources
CN106599226A (en) * 2016-12-19 2017-04-26 深圳大学 Content recommendation method and content recommendation system
CN106709037A (en) * 2016-12-29 2017-05-24 武汉大学 Movie recommendation method based on heterogeneous information network
CN106909594A (en) * 2016-06-06 2017-06-30 阿里巴巴集团控股有限公司 Information-pushing method and device
CN107305557A (en) * 2016-04-20 2017-10-31 北京陌上花科技有限公司 Content recommendation method and device
CN108416625A (en) * 2018-02-28 2018-08-17 阿里巴巴集团控股有限公司 The recommendation method and apparatus of marketing product
CN108665064A (en) * 2017-03-31 2018-10-16 阿里巴巴集团控股有限公司 Neural network model training, object recommendation method and device
CN109408729A (en) * 2018-12-05 2019-03-01 广州市百果园信息技术有限公司 Material is recommended to determine method, apparatus, storage medium and computer equipment
CN109511015A (en) * 2018-08-10 2019-03-22 腾讯科技(深圳)有限公司 Multimedia resource recommended method, device, storage medium and equipment
CN109670161A (en) * 2017-10-13 2019-04-23 北京京东尚科信息技术有限公司 Commodity similarity calculating method and device, storage medium, electronic equipment
CN109903168A (en) * 2019-01-18 2019-06-18 平安科技(深圳)有限公司 The method and relevant device of recommendation insurance products based on machine learning
CN110019943A (en) * 2017-09-11 2019-07-16 中国移动通信集团浙江有限公司 Video recommendation method, device, electronic equipment and storage medium
CN110119474A (en) * 2018-05-16 2019-08-13 华为技术有限公司 Recommended models training method, the prediction technique based on recommended models and device
CN110213325A (en) * 2019-04-02 2019-09-06 腾讯科技(深圳)有限公司 Data processing method and data push method
WO2019205795A1 (en) * 2018-04-26 2019-10-31 腾讯科技(深圳)有限公司 Interest recommendation method, computer device, and storage medium
CN110569446A (en) * 2019-09-04 2019-12-13 第四范式(北京)技术有限公司 Method and system for constructing recommended object candidate set

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN103678329B (en) * 2012-09-04 2018-05-04 中兴通讯股份有限公司 Recommend method and device
US9146894B2 (en) * 2013-08-08 2015-09-29 Facebook, Inc. Objective value models for entity recommendation
US20160364783A1 (en) * 2014-06-13 2016-12-15 Truecar, Inc. Systems and methods for vehicle purchase recommendations
US9671862B2 (en) * 2014-10-15 2017-06-06 Wipro Limited System and method for recommending content to a user based on user's interest
CN104881642B (en) * 2015-05-22 2018-10-26 海信集团有限公司 A kind of content delivery method, device and equipment
CN110717069B (en) * 2018-07-11 2022-08-05 阿里巴巴(中国)有限公司 Video recommendation method and device

Patent Citations (24)

Publication number Priority date Publication date Assignee Title
CN103324686A (en) * 2013-06-03 2013-09-25 中国科学院自动化研究所 Real-time individuation video recommending method based on text stream network
WO2015188699A1 (en) * 2014-06-10 2015-12-17 华为技术有限公司 Item recommendation method and device
CN105447013A (en) * 2014-08-18 2016-03-30 南京理工大学常熟研究院有限公司 News recommendation system
CN105354202A (en) * 2014-08-20 2016-02-24 阿里巴巴集团控股有限公司 Data pushing method and apparatus
CN105843953A (en) * 2016-04-12 2016-08-10 乐视控股(北京)有限公司 Multimedia recommendation method and device
WO2017177643A1 (en) * 2016-04-12 2017-10-19 乐视控股(北京)有限公司 Multimedia recommendation method and device
CN107305557A (en) * 2016-04-20 2017-10-31 北京陌上花科技有限公司 Content recommendation method and device
CN105956146A (en) * 2016-05-12 2016-09-21 腾讯科技(深圳)有限公司 Article information recommending method and device
CN106909594A (en) * 2016-06-06 2017-06-30 阿里巴巴集团控股有限公司 Information-pushing method and device
CN106168980A (en) * 2016-07-26 2016-11-30 合网络技术(北京)有限公司 Multimedia resource recommends sort method and device
CN106326391A (en) * 2016-08-17 2017-01-11 合智能科技(深圳)有限公司 Method and device for recommending multimedia resources
CN106599226A (en) * 2016-12-19 2017-04-26 深圳大学 Content recommendation method and content recommendation system
CN106709037A (en) * 2016-12-29 2017-05-24 武汉大学 Movie recommendation method based on heterogeneous information network
CN108665064A (en) * 2017-03-31 2018-10-16 阿里巴巴集团控股有限公司 Neural network model training, object recommendation method and device
CN110019943A (en) * 2017-09-11 2019-07-16 中国移动通信集团浙江有限公司 Video recommendation method, device, electronic equipment and storage medium
CN109670161A (en) * 2017-10-13 2019-04-23 北京京东尚科信息技术有限公司 Commodity similarity calculating method and device, storage medium, electronic equipment
CN108416625A (en) * 2018-02-28 2018-08-17 阿里巴巴集团控股有限公司 The recommendation method and apparatus of marketing product
WO2019205795A1 (en) * 2018-04-26 2019-10-31 腾讯科技(深圳)有限公司 Interest recommendation method, computer device, and storage medium
CN110119474A (en) * 2018-05-16 2019-08-13 华为技术有限公司 Recommended models training method, the prediction technique based on recommended models and device
CN109511015A (en) * 2018-08-10 2019-03-22 腾讯科技(深圳)有限公司 Multimedia resource recommended method, device, storage medium and equipment
CN109408729A (en) * 2018-12-05 2019-03-01 广州市百果园信息技术有限公司 Material is recommended to determine method, apparatus, storage medium and computer equipment
CN109903168A (en) * 2019-01-18 2019-06-18 平安科技(深圳)有限公司 The method and relevant device of recommendation insurance products based on machine learning
CN110213325A (en) * 2019-04-02 2019-09-06 腾讯科技(深圳)有限公司 Data processing method and data push method
CN110569446A (en) * 2019-09-04 2019-12-13 第四范式(北京)技术有限公司 Method and system for constructing recommended object candidate set

Non-Patent Citations (2)

Title
Designing Deep Convolutional Neural Networks for Continuous Object Orientation Estimation;Hara, Kota;《Computer Vision and Pattern Recognition》;1-10 *
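Research on Ranking Models for Recommender Systems Based on Deep Neural Networks (基于深度神经网络的推荐系统排序模型研究);冯健飞;China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技》);I138-1428 *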

Also Published As

Publication number Publication date
CN111291264A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
CN111291264B (en) Access object prediction method and device based on machine learning and computer equipment
CN109165840B (en) Risk prediction processing method, risk prediction processing device, computer equipment and medium
CN110489520B (en) Knowledge graph-based event processing method, device, equipment and storage medium
CN110263265B (en) User tag generation method, device, storage medium and computer equipment
CN109345302B (en) Machine learning model training method and device, storage medium and computer equipment
Bashar et al. Performance of machine learning algorithms in predicting the pavement international roughness index
CN111191092B (en) Label determining method and label determining model training method
CN111966914B (en) Content recommendation method and device based on artificial intelligence and computer equipment
CN109063921B (en) Optimization processing method and device for client risk early warning, computer equipment and medium
Sohnesen et al. Is random forest a superior methodology for predicting poverty? An empirical assessment
CN114298417A (en) Anti-fraud risk assessment method, anti-fraud risk training method, anti-fraud risk assessment device, anti-fraud risk training device and readable storage medium
CN112905876B (en) Information pushing method and device based on deep learning and computer equipment
CN110135943B (en) Product recommendation method, device, computer equipment and storage medium
CN112035611B (en) Target user recommendation method, device, computer equipment and storage medium
WO2019200742A1 (en) Short-term profit prediction method, apparatus, computer device, and storage medium
CN111429161B (en) Feature extraction method, feature extraction device, storage medium and electronic equipment
CN114202336A (en) Risk behavior monitoring method and system in financial scene
CN112288279A (en) Business risk assessment method and device based on natural language processing and linear regression
CN115936159A (en) Interpretable credit default rate prediction method and system based on automatic feature mining
CN116307671A (en) Risk early warning method, risk early warning device, computer equipment and storage medium
Rodriguez‐Lozano et al. Efficient data dimensionality reduction method for improving road crack classification algorithms
CN114445121A (en) Advertisement click rate prediction model construction and advertisement click rate prediction method
CN114692785B (en) Behavior classification method, device, equipment and storage medium
Keerthana et al. Accurate prediction of fake job offers using machine learning
CN115758271A (en) Data processing method, data processing device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40024297; Country of ref document: HK)

GR01 Patent grant