CN111460300A - Network content pushing method and device and storage medium - Google Patents

Network content pushing method and device and storage medium

Info

Publication number
CN111460300A
Authority
CN
China
Prior art keywords
user
sample
current user
sequence
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010247149.5A
Other languages
Chinese (zh)
Other versions
CN111460300B (en)
Inventor
刘志煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd
Priority to CN202010247149.5A
Publication of CN111460300A
Application granted
Publication of CN111460300B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a network content pushing method and device, computer equipment and a storage medium, and relates to the technical field of cloud computing. The method comprises the following steps: firstly, a server acquires the combined feature and the sequence feature of a current user and the sequence feature of each sample user to acquire the weight corresponding to the current user, then clusters the current user and each sample user to acquire the category of the current user, and finally pushes network content to the current user according to the category of the current user. According to the scheme, the cloud server can mine the current user and the sample user according to the behavior sequence through cloud computing to determine the weight of the current user, perform clustering based on the weight, and realize personalized recommendation of related network content according to the clustering result, so that the efficiency of prediction of recommended network content is improved on the premise of ensuring accurate content recommendation.

Description

Network content pushing method and device and storage medium
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular, to a method and an apparatus for pushing network content, and a storage medium.
Background
With the continuing development of cloud computing technology, various methods have been proposed to give users more intelligent and targeted service scene recommendations on internet platforms. These include recommendation methods based on demographics, recommendation methods based on user-related content, and recommendation methods based on a collaborative filtering algorithm.
In the related art, the recommendation method based on the collaborative filtering algorithm may perform collaborative filtering through an association model algorithm, a clustering model algorithm, a classification model algorithm, a regression model algorithm, matrix decomposition, a graph model, and the like, and it considers both user-based collaboration and item-based collaboration.
However, in the solutions in the related art, the neural network model constructed by the above model algorithm needs a large amount of data for support, and a large amount of actual data needs to be input to obtain a more accurate model, which results in that the neural network model needs to process a large amount of data before accurate prediction, thereby affecting the prediction efficiency.
Disclosure of Invention
The embodiment of the application provides a network content pushing method, a network content pushing device, computer equipment and a storage medium, which can improve the recommendation efficiency of related content, and the technical scheme is as follows:
In one aspect, a method for pushing network content is provided, where the method is performed by a platform server, and the method includes:
acquiring a combination characteristic and a sequence characteristic of a current user, wherein the combination characteristic comprises a user characteristic of a corresponding user and an article characteristic of the corresponding user; the sequence characteristics are used for indicating network behavior characteristics sequentially executed by corresponding users;
combining the sequence characteristics of the current user and the sequence characteristics of each sample user to obtain the weight corresponding to the current user;
clustering the current user and each sample user according to the weight corresponding to the current user, the combination characteristics of the current user and the combination characteristics of each sample user to obtain the category of the current user;
and pushing the network content to the current user according to the category of the current user.
In one aspect, a network content pushing apparatus is provided, where the apparatus is used in a platform server, and the apparatus includes:
a characteristic acquisition module, configured to acquire the combination characteristic and the sequence characteristic of a current user, wherein the combination characteristic comprises a user characteristic of a corresponding user and an article characteristic of the corresponding user; the sequence characteristics are used for indicating network behavior characteristics sequentially executed by corresponding users;
the weight obtaining module is used for obtaining the weight corresponding to the current user by combining the sequence characteristics of the current user and the sequence characteristics of each sample user;
a category obtaining module, configured to cluster the current user and each sample user according to a weight corresponding to the current user, a combination feature of the current user, and a combination feature of each sample user, so as to obtain a category to which the current user belongs;
and the content pushing module is used for pushing the network content to the current user according to the category of the current user.
In a possible implementation manner, the weight obtaining module includes:
the common mode determining submodule is used for determining the common sequence mode of each sample user and the longest common sequence mode of the current user according to the sequence characteristics of the current user and the sequence characteristics of each sample user through a sequence mode mining algorithm;
a number obtaining sub-module, configured to obtain the number of sample users having the longest common sequence pattern and the total number of sample users from among the sample users;
and the weight determining submodule is used for determining the common sequence mode support degree of the current user according to the number of the sample users with the longest common sequence mode and the total number of the sample users, and the common sequence mode support degree is used as the weight corresponding to the current user.
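The weight computation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes behavior sequences are lists of single behavior symbols, that a pattern "occurs" in a sequence when it is an in-order subsequence, and that the candidate patterns have already been mined. The function names (`is_subsequence`, `longest_common_pattern`, `pattern_support`) are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def longest_common_pattern(
    patterns: Sequence[Tuple[str, ...]], current_sequence: Sequence[str]
) -> Optional[Tuple[str, ...]]:
    """Pick the longest mined pattern that the current user's sequence contains."""
    matches = [p for p in patterns if is_subsequence(p, current_sequence)]
    return max(matches, key=len) if matches else None


def pattern_support(
    pattern: Optional[Sequence[str]], sample_sequences: Sequence[Sequence[str]]
) -> float:
    """Support = sample users containing the pattern / total sample users."""
    if pattern is None or not sample_sequences:
        return 0.0
    hits = sum(is_subsequence(pattern, s) for s in sample_sequences)
    return hits / len(sample_sequences)


# Hypothetical data: four sample users and one current user.
samples = [["A", "a", "B", "c"], ["A", "B", "c"], ["A", "a", "c"], ["B", "c"]]
mined = [("A",), ("A", "c"), ("A", "B", "c")]
current = ["A", "a", "B", "c", "A"]

longest = longest_common_pattern(mined, current)
weight = pattern_support(longest, samples)  # used as the current user's weight
```

In this toy data, two of the four sample users contain the longest shared pattern, so the current user's weight is that fraction of the sample set.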
In a possible implementation manner, the weight obtaining module further includes:
and the sample weight determining submodule is used for determining the sample weight of each sample user according to the common sequence mode of the sample users, and the sequence mode comprises at least one of a behavior sequence mode and a browsing sequence mode.
In one possible implementation, the sample weight determining sub-module includes:
the weight determining unit is used for determining the weight of the field type according to the frequency occupied by the field type in the common sequence mode;
a sample weight determining unit, configured to average weights of at least one field type included in the common sequence pattern of the sample user, and determine a sample weight of the sample user.
In a possible implementation manner, the category obtaining module includes:
a clustering center determining submodule, configured to determine, through a weighted clustering algorithm, a clustering center of each piece of network content corresponding to each sample user according to the sample weight of each sample user and the combination feature of each sample user;
the distance determining submodule is used for determining the distance between the current user and each clustering center according to the combination characteristics of the current user and the weight corresponding to the current user;
and the first category obtaining submodule is used for obtaining the category of the current user according to the distance between the current user and each clustering center.
In a possible implementation manner, the category obtaining module includes:
a sample user determining submodule, configured to determine, according to the sample weight of each sample user, the combined feature of each sample user, the weight corresponding to the current user, and the combined feature of the current user, the sample user belonging to the same category as the current user through a weighted clustering algorithm;
and the second category acquisition submodule is used for acquiring the category to which the current user belongs, corresponding to the sample user belonging to the same category as the current user, as the category to which the current user belongs.
In one possible implementation manner, the content pushing module includes:
the first content obtaining submodule is used for obtaining the network content corresponding to the minimum value of the distances between the current user and each clustering center;
and the first content pushing submodule is used for pushing the network content to the terminal of the current user.
In one possible implementation manner, the content pushing module includes:
a second content obtaining sub-module, configured to obtain, from the sample users belonging to the same category as the current user, a pushed percentage of each network content corresponding to each sample user, where the pushed percentage is a ratio between the number of times that the corresponding network content is pushed and a sum of the number of times that the network content is pushed to all the sample users in the same category;
and the second content pushing submodule is used for pushing target network content to the terminal of the current user, wherein the target network content is the network content with the highest pushed ratio in each network content corresponding to each sample user.
In one possible implementation manner, the feature obtaining module includes:
the user characteristic obtaining submodule is used for obtaining the user data of the current user and generating the user characteristics of the current user;
the article characteristic acquisition sub-module is used for acquiring article data of the current user and generating article characteristics of the current user;
and the combined feature generation submodule is used for performing feature processing according to the user features and the article features to generate the combined features.
In one possible implementation, the apparatus further includes:
and the sample library construction module is used for acquiring the conversion user as a sample user and constructing a user sample library, wherein the conversion user is used for indicating the user with the actual conversion.
In one possible implementation, the apparatus further includes:
a sample feature obtaining module, configured to obtain a sample sequence feature of each sample user before obtaining a weight corresponding to the current user according to the sequence feature of the current user and the sequence feature of each sample user;
and the sample sequence mode determining module is used for determining the sample sequence mode of each sample user according to the sample sequence characteristics.
In one aspect, a computer device is provided, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the network content pushing method according to any one of the above-mentioned optional implementation manners.
In one aspect, a computer-readable storage medium is provided, where at least one instruction, at least one program, a code set, or a set of instructions is stored in the storage medium, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the network content pushing method according to any one of the above-mentioned optional implementation manners.
The technical scheme provided by the application can comprise the following beneficial effects:
according to the content recommendation scheme provided by the embodiment of the application, firstly, the server acquires the weight corresponding to the current user by acquiring the combined feature and the sequence feature of the current user and combining the sequence feature of the current user and the sequence feature of each sample user, then, the current user and each sample user are clustered according to the weight corresponding to the current user, the combined feature of the current user and the combined feature of each sample user to acquire the affiliated category of the current user, and finally, the network content is pushed to the current user according to the affiliated category of the current user. According to the scheme, the cloud server can mine the current user and the sample user according to the behavior sequence through cloud computing to determine the weight of the current user, perform clustering based on the weight, and realize personalized recommendation of related network content according to the clustering result, so that the efficiency of prediction of recommended network content is improved on the premise of ensuring accurate content recommendation.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic diagram of a network content push system according to an exemplary embodiment of the present application;
Fig. 2 is a schematic diagram of a network content pushing method according to an exemplary embodiment of the present application;
Fig. 3 is a flow diagram of a network content push provided by an exemplary embodiment of the present application;
Fig. 4 is a flowchart illustrating a network content pushing method according to an exemplary embodiment of the present application;
Fig. 5 is a flowchart illustrating a network content pushing method according to an exemplary embodiment of the present application;
Fig. 6 is a block diagram showing the construction of a network content push apparatus according to an exemplary embodiment;
Fig. 7 is a block diagram illustrating a computer device according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
It is to be understood that reference herein to "a number" means one or more and "a plurality" means two or more. "And/or" describes the association relationship of the associated objects, meaning that there may be three relationships; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
For convenience of understanding, terms referred to in the embodiments of the present disclosure are explained below.
(1) Artificial intelligence AI
Artificial intelligence is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises computer vision, voice processing, natural language processing, machine learning/deep learning, and the like. The scheme provided by the embodiment of the application mainly relates to technologies such as machine learning/deep learning in artificial intelligence.
(2) Neural network
A neural network, also called an Artificial Neural Network (ANN) or a connectionist model, is an algorithmic mathematical model that simulates the behavior characteristics of animal neural networks, such as those of human beings, and performs distributed parallel information processing. Depending on the complexity of the system, the network processes information by adjusting the interconnections among a large number of internal nodes.
(3) Cloud technology (Cloud technology)
The cloud technology is a hosting technology for unifying series resources such as hardware, software, network and the like in a wide area network or a local area network to realize the calculation, storage, processing and sharing of data.
Cloud technology is the general term for the network technology, information technology, integration technology, management platform technology, application technology, and the like applied under the cloud computing business model. These can form a resource pool that is used on demand and is flexible and convenient. Cloud computing technology will become an important support. Background services of technical network systems, such as video websites, picture websites, and web portals, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each article may have its own identification mark that needs to be transmitted to a background system for logic processing, data at different levels are processed separately, and all kinds of industry data need strong back-end system support, which can only be realized through cloud computing.
(4) Cloud Computing (Cloud Computing)
Cloud Computing refers to a delivery and use mode of an IT infrastructure, which refers to obtaining required resources in an on-demand and easily-extensible manner through a Network, and generalized cloud Computing refers to a delivery and use mode of services, which refers to obtaining required services in an on-demand and easily-extensible manner through a Network.
With the development of diversification of internet, real-time data stream and connecting equipment and the promotion of demands of search service, social network, mobile commerce, open collaboration and the like, cloud computing is rapidly developed. Different from the prior parallel distributed computing, the generation of cloud computing can promote the revolutionary change of the whole internet mode and the enterprise management mode in concept.
(5) Database
A database can be regarded, in short, as an electronic file cabinet, that is, a place for storing electronic files, in which a user can add, query, update, and delete data. A "database" is a collection of data that is stored together in a manner that can be shared by multiple users, has as little redundancy as possible, and is independent of the application.
A Database Management System (DBMS) is a computer software system designed to manage databases, typically providing basic functions such as storage, retrieval, security, and backup. A DBMS can be classified by the database model it supports, such as relational or XML (Extensible Markup Language); by the type of computer it supports, such as server cluster or mobile phone; by the query language it uses, such as SQL (Structured Query Language) or XQuery; by its performance emphasis, such as maximum scale or maximum operating speed; or by other criteria.
(6) Big data
Big data is a data set that cannot be captured, managed, and processed by conventional software tools within a certain time range. It is a massive, high-growth-rate, and diversified information asset that yields stronger decision-making power, insight discovery, and process optimization capability only under new processing modes. With the advent of the cloud era, big data has attracted more and more attention, and it requires special techniques to process large amounts of data effectively within a tolerable elapsed time. Technologies suited to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.
Fig. 1 is a schematic diagram illustrating a network content push system according to an example embodiment. The network content push system includes a terminal 110 and a platform server 120.
A user may enter a platform scenario corresponding to the platform server 120 on the terminal 110, and the user may perform a service in the platform scenario.
After the user enters the platform scenario, the platform server 120 may record user data of the user in the platform scenario.
The user data may include browsing data of the user in the scene, behavior data of the user in the scene, and basic data of the user.
The platform server 120 may include a memory therein, which may be used to store various user data.
The terminal 110 may perform data transmission with the platform server 120 through a wired or wireless network.
The platform server 120 may be a server, or may be a server cluster composed of several servers, or may include one or more virtualization platforms, or may be a cloud computing service center.
The platform server 120 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
The network is typically the Internet, but may be any network including, but not limited to, any combination of a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile or wireless network, a private network, or a virtual private network.
Fig. 2 is a diagram illustrating a network content push method according to an example embodiment. As shown in fig. 2, the method for pushing network content includes the following steps:
In step 201, user features and item features are constructed, and users with actual conversions are mined as positive samples.
In one possible implementation, the server constructs the user characteristics and item characteristics from the user data and the item data.
The user characteristics may include basic attribute characteristics of the user, such as age, gender, education level, city tier, and the like; the user characteristics may include user consumption characteristics, such as the total number of payments, the total amount, the distribution of the number of payments over a period of time (24 hours, week, month, half year), the distribution of payment amounts, the average amount per transaction, etc.; the user characteristics may also include user behavior characteristics such as browsing duration, page clicks, and the like.
The item characteristics can include item basic attribute characteristics, such as item category, item price, item brand, item purchase score, item comment sentiment and other characteristics; the item characteristics may include item consumption characteristics such as the number of times an item was purchased, the number of times it was clicked to browse, the number of times a shopping cart was added, the number of times items of the same type were purchased, and the like.
Optionally, the server may construct a combination feature of <user, item> by splicing and combining the user feature and the item feature, and perform data preprocessing.
The processing steps may include:
1) Discarding features with too many missing values.
For example, the server may set a missing-value filtering threshold equal to the number of samples × 0.4, filter out a feature when its number of missing values exceeds the threshold, and also delete single-valued features.
2) Handling abnormal values.
For example, according to the feature distribution, abnormally large feature values that rank in the top 0.001 (one thousandth) are discarded.
3) Handling missing values.
For example, missing values of continuous features may be filled with the mean, and missing values of discrete features may be filled with a constant that is treated as a separate category.
4) Deriving features.
For example, the server may combine and derive features through feature transformation, feature squaring, and feature addition and subtraction.
5) Processing features.
For example, continuous features may be discretized by binning, and discrete features may be one-hot encoded.
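Several of the preprocessing steps above can be sketched in plain Python. This is an illustrative sketch only: it covers missing-value filtering, missing-value filling, binning, and one-hot encoding, and it leaves out outlier handling and feature derivation. It assumes features are stored column-wise with `None` marking a missing value, reads the filtering threshold as the sample count times 0.4, and uses equal-width binning; all of these, along with the function names and the `"MISSING"` constant, are assumptions rather than details given in the text.

```python
from statistics import mean

MISSING_RATIO = 0.4  # assumed reading of the threshold: sample count x 0.4


def drop_sparse_and_constant(columns):
    """Step 1: drop features with too many missing values or one distinct value."""
    kept = {}
    for name, values in columns.items():
        missing = sum(v is None for v in values)
        distinct = {v for v in values if v is not None}
        if missing > MISSING_RATIO * len(values) or len(distinct) <= 1:
            continue
        kept[name] = values
    return kept


def fill_missing(values):
    """Step 3: mean for continuous features, a separate class for discrete ones."""
    present = [v for v in values if v is not None]
    if all(isinstance(v, (int, float)) for v in present):
        fill = mean(present)
    else:
        fill = "MISSING"  # hypothetical constant standing for its own category
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins=3):
    """Step 5: discretize a continuous feature into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def one_hot(values):
    """Step 5: one-hot encode a discrete feature (categories in sorted order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical feature table with four users.
table = {
    "age": [20, None, 40, 30],
    "city": ["a", None, "b", "a"],
    "sparse": [None, None, 1, 2],   # 2 missing > 4 * 0.4, so it is dropped
    "const": [1, 1, 1, 1],          # single-valued, so it is dropped
}
kept = drop_sparse_and_constant(table)
age = fill_missing(kept["age"])
city = fill_missing(kept["city"])
```

Column-wise storage keeps each step a simple pass over one feature; a production pipeline would more likely use a dataframe library.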
Optionally, users with actual conversions in the service scenario are treated as high-value users. High-value users include users who have purchased a membership, users with high points who redeem them, and users with large historical transaction amounts. These users are used as positive samples to construct a high-value user sample library.
In step 202, the behavior sequences of different users for the same item are mined with the PrefixSpan sequential pattern mining algorithm.
In one possible implementation, the server may mine user behavior sequence patterns with the PrefixSpan algorithm to find the behavior habits and browsing habits that the user group has in common on the path from reach to conversion.
Meanwhile, a multiple-minimum-support strategy can be used. The minimum support is calculated as min_sup = a × n, where n is the number of samples in the sample set and a is the minimum support rate parameter, which is adjusted according to the size of the sample set.
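For orientation, a simplified PrefixSpan can be sketched as below. It is a sketch under stated assumptions, not the patent's implementation: each sequence element is a single behavior symbol (no itemsets), a single support rate `a` is used rather than several, and min_sup = a × n is rounded up to an integer. The function name `prefixspan` and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def prefixspan(sequences, min_support_rate):
    """Mine frequent sequential patterns; returns {pattern tuple: support count}."""
    n = len(sequences)
    min_sup = math.ceil(min_support_rate * n)  # min_sup = a * n (rounding assumed)
    patterns = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, start offset) pairs: the suffix of
        # each sequence that remains after matching `prefix`.
        counts = Counter()
        for sid, start in projected:
            counts.update(set(sequences[sid][start:]))  # count once per sequence
        for item in sorted(counts):
            if counts[item] < min_sup:
                continue
            pattern = prefix + (item,)
            patterns[pattern] = counts[item]
            next_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                if item in seq[start:]:
                    # Project on the first occurrence of `item` in the suffix.
                    next_projected.append((sid, seq.index(item, start) + 1))
            mine(pattern, next_projected)

    mine((), [(sid, 0) for sid in range(n)])
    return patterns


# Hypothetical behavior sequences of four converted users for the same item.
behavior_sequences = [
    ["A", "a", "B", "c"],
    ["A", "B", "c"],
    ["A", "a", "c"],
    ["B", "c"],
]
frequent = prefixspan(behavior_sequences, min_support_rate=0.5)
```

With four sequences and a = 0.5, a pattern must occur in at least two sequences to be kept, so rarer orderings are pruned before the search goes deeper.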
In step 203, the sample weights are obtained by a weighted calculation over the features mined from the sequence patterns.
In a possible implementation, the common behavior patterns of converted users are mined from the user behavior sequences and the user browsing sequences, so that the modeling pays more attention to these feature types and weights them. Field types that do not appear in any common prefix of the sequence patterns are removed, which filters out factors with little influence on user conversion. The weight of each field type is set to its frequency ratio. For example, with the minimum support threshold set to 0.5, a field is rejected if the frequency ratio of every one of its values is less than the minimum support. If the field type "collection behavior f" has a frequency ratio of 0.7, its weight is 0.7; if the field type "browsing sequence AaBcA" has a frequency ratio of 0.56, its weight is 0.56. The average weight of the field types in the common conversion behavior sequences that a user contains is taken as that user's sample weight.
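The weighting rule in this step can be illustrated with a short sketch. It assumes the frequency ratio of each field type has already been computed from the mined patterns. The function names and the third field type, "share behavior s", are hypothetical; the values 0.7 and 0.56 are the ones quoted in the text.

```python
def field_type_weights(frequency_ratio, min_support=0.5):
    """Keep field types whose frequency ratio reaches the minimum support.

    The surviving ratio is used directly as the weight of that field type.
    """
    return {f: r for f, r in frequency_ratio.items() if r >= min_support}


def sample_weight(user_field_types, weights):
    """Average the weights of the retained field types the user exhibits."""
    matched = [weights[f] for f in user_field_types if f in weights]
    return sum(matched) / len(matched) if matched else 0.0


ratios = {
    "collection behavior f": 0.7,
    "browsing sequence AaBcA": 0.56,
    "share behavior s": 0.3,  # hypothetical field below the 0.5 threshold
}
weights = field_type_weights(ratios)
w = sample_weight(["collection behavior f", "browsing sequence AaBcA"], weights)
# w is the mean of 0.7 and 0.56, i.e. about 0.63
```

Returning 0.0 for a user with no retained field types is a choice made for this sketch; the text does not say how such users are handled.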
In step 204, a weighted clustering algorithm is constructed to cluster the sample combination features.
In a possible implementation manner, user combination feature construction and feature processing are performed based on step 201, user combination features are weighted according to the sample weight of each sample calculated in step 203, and a sample weighted clustering algorithm is constructed to cluster feature vectors.
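The text does not name a specific weighted clustering algorithm. One common reading is a sample-weighted k-means, in which each cluster center is the weighted mean of its members. The sketch below takes that reading as an assumption; the function names, the explicit `init_centers` parameter, and the toy data are all hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def weighted_kmeans(points, weights, init_centers, iterations=20):
    """Sample-weighted k-means: centers are weighted means of assigned points."""
    centers = [list(c) for c in init_centers]
    k, dims = len(centers), len(points[0])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        # Update step: heavier samples pull the center harder.
        for c in range(k):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == c]
            total = sum(w for _, w in members)
            if total > 0:
                centers[c] = [
                    sum(w * p[d] for p, w in members) / total for d in range(dims)
                ]
    return centers, labels


def nearest_center(point, centers):
    """Return the index of the cluster center closest to `point`."""
    return min(range(len(centers)), key=lambda c: sq_dist(point, centers[c]))


# Hypothetical 2-D combination features and sample weights for four users.
points = [[0, 0], [0, 1], [10, 10], [10, 11]]
sample_weights = [1, 3, 1, 1]
centers, labels = weighted_kmeans(points, sample_weights, init_centers=[[0, 0], [10, 10]])
new_user_cluster = nearest_center([1, 1], centers)
```

In the first cluster, the sample with weight 3 draws the center toward itself, which is the effect the sample weights are meant to have. How the current user's own weight should enter the distance is not specified in the text, so `nearest_center` uses the unweighted distance.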
In step 205, different users are recommended based on the clustering result and preset conditions.
In a possible implementation, user combination feature construction and feature processing are performed based on step 201, and the user combination features are weighted according to the sample weight of each sample calculated in step 203 to construct new sample-weighted features. The combined feature vector of the new sample and the combined feature vectors of the converted-user samples are clustered with the weighted clustering of step 204. After clustering is finished, the conversion ratio of each item in the category to which the new sample belongs is calculated, and the item with the highest conversion rate in that category is recommended to the user to be predicted.
Please refer to fig. 3, which illustrates a flowchart of network content pushing according to an exemplary embodiment of the present application. The user may be a user of an e-commerce platform or a user of a terminal with a recommendation system platform. When the user is on an e-commerce platform, the platform can, through the content recommendation process, recommend commodities the user prefers according to the relevant attributes of the user and the sample attributes in the database; when the user is on a terminal system with an advertisement recommendation function, the advertisement recommendation platform can, through the same process, recommend advertisements relevant to the user in a targeted manner according to the user's behavior preferences. The content recommendation method can greatly improve the content recommendation effect. As shown in fig. 3, in an e-commerce platform scenario, a user 301 enters the e-commerce platform through a terminal 302, and the terminal may exchange data with a platform server 303 of the e-commerce platform through a wired or wireless network. First, data of high-value users and commodity data are stored in a database in the platform server 303 as sample data. A high-value user may be a user who is a member of the e-commerce platform and trades on it frequently; because such users provide data of high reference value to the platform, they may be selected as sample users. The platform server 303 acquires the data of the high-value users and the commodity data in real time and preprocesses them.
Then, the platform server 303 mines the behavior sequences of the sample users based on sequence patterns. Browsing sequence patterns and behavior sequence patterns are formed from the sequence information left by the sample users' clicks and browsing on the platform and from the series of behavior tracks of users converted from other channels. All common sequence patterns in the sample set, which may be common browsing patterns or common behavior patterns of the sample users, are obtained through a sequence pattern mining algorithm. A sample weight is then calculated according to the frequency ratio of each field type, and the samples are clustered with a weighted clustering algorithm according to the sample weights to obtain a clustering result. The combined features of the user 301 and the combined features of the samples are then clustered together with weights. After clustering is completed, the conversion ratio of each commodity in the category to which the user 301 belongs is calculated, that is, the proportion of sample users in that category who prefer each commodity, and the commodity with the highest conversion rate, that is, the highest proportion in the category, is recommended to the user 301.
Referring to fig. 4, a flowchart of a network content pushing method according to an exemplary embodiment of the present application is shown. The network content pushing method may be performed by a platform server. The platform server may be the platform server 120 in the system shown in fig. 1. As shown in fig. 4, the network content pushing method may include the following steps:
in step 401, obtaining a combination feature and a sequence feature of a current user, where the combination feature includes a user feature of a corresponding user and an article feature of the corresponding user; the sequence feature is used for indicating the network behavior features which are sequentially executed by the corresponding users.
Optionally, the platform server may obtain user data of the current user to generate a user feature of the current user, the platform server may obtain item data of the current user to generate an item feature of the current user, and the platform server may generate the combination feature according to the user feature and the item feature.
Optionally, the user characteristics may include at least one of user basic attribute characteristics, user consumption characteristics, and user behavior characteristics.
The user basic attribute features can be user features such as user age, user gender, user education level, and user city tier. The user consumption features can be user features such as the total payment amount of the user, the distribution of the user's payment amounts within a certain time period (within 24 hours, within a week, within a month, or within half a year), and the average payment amount of the user. The user behavior features can be user features such as user browsing duration and the number of page clicks by the user.
Optionally, the item characteristics may include at least one of item base attribute characteristics and item consumer characteristics.
The item basic attribute features can be item features such as item category, item price, item brand, item purchase rating, and the sentiment of item reviews. The item consumption features can be item features such as the number of times an item is purchased, the number of times an item is clicked and browsed, the number of times an item is added to a shopping cart, and the number of times similar items are purchased.
Alternatively, the combined feature may be obtained by combining the user feature and the item feature, and may be expressed in the form of < user feature, item feature >.
In step 402, the sequence feature of the current user and the sequence feature of each sample user are combined to obtain the weight corresponding to the current user.
Optionally, the platform server may mine the sequence pattern of the current user and the sample sequence patterns of the sample users based on the PrefixSpan (sequence pattern mining) algorithm. By obtaining the sequence patterns common to the current user and the sample users, it can find the sample users who, from first reaching the platform onward, share behavior habits or browsing habits with the current user.
In step 403, the current user and each sample user are clustered according to the weight corresponding to the current user, the combination feature of the current user, and the combination feature of each sample user, so as to obtain the category to which the current user belongs.
In step 404, the web content is pushed to the current user according to the category to which the current user belongs.
Optionally, different content may be recommended depending on the scenario the user is in.
For example, in an e-commerce platform scenario, the platform server may send the user terminal a purchase link for a certain commodity; in an application store scenario on the terminal, the platform server may send the user a download link for a certain application. In addition to a link address, the content sent by the platform server may also be a picture, a video, or a piece of audio.
To sum up, in the network content pushing method provided in the embodiment of the present disclosure, the server first obtains the weight corresponding to the current user by obtaining the combined feature and the sequence feature of the current user, and combining the sequence feature of the current user and the sequence feature of each sample user, then clusters the current user and each sample user according to the weight corresponding to the current user, the combined feature of the current user, and the combined feature of each sample user, obtains the category to which the current user belongs, and finally pushes the network content to the current user according to the category to which the current user belongs. According to the scheme, the cloud server can mine the current user and the sample user according to the behavior sequence through cloud computing to determine the weight of the current user, perform clustering based on the weight, and realize personalized recommendation of related network content according to the clustering result, so that the efficiency of prediction of recommended network content is improved on the premise of ensuring accurate content recommendation.
Referring to fig. 5, a flowchart of a content recommendation method according to an exemplary embodiment of the present application is shown. The content recommendation method may be performed by a platform server. The platform server may be the platform server 120 shown in fig. 1. As shown in fig. 5, the content recommendation method may include the steps of:
step 501, the platform server obtains the conversion user as a sample user, and constructs a user sample library.
In the embodiment of the present disclosure, the platform server may obtain, in real time or periodically, the platform conversion user as the sample user of the platform, and record data of each sample user, and the platform server may store data information including each sample user in a special database of the platform server, where the database may be used as a user sample library.
A conversion user is a user who has actually converted. Conversion refers to the process by which a new user entering the platform for the first time becomes an existing user of the platform through certain browsing or other behavior.
The user sample library may be a database in which the platform server stores data of sample users.
Step 502, the platform server obtains the combination feature and the sequence feature of the current user.
In the embodiment of the present disclosure, the platform server may obtain data of the current user, and obtain the combination characteristic of the current user according to the obtained data information.
The combined features comprise user features corresponding to the users and article features corresponding to the users; the sequence characteristics are used for indicating the network behavior characteristics which are sequentially executed by the corresponding users.
Optionally, the platform server obtains user data of the current user, generates user characteristics of the current user, the platform server obtains item data of the current user, generates item characteristics of the current user, and the platform server generates the combination characteristics according to the user characteristics and the item characteristics.
Optionally, the platform server splices and combines the user characteristics and the article characteristics to form a combined characteristic form of < user characteristics, article characteristics >, and then may perform data preprocessing on the characteristics.
The data preprocessing can remove abnormal feature data, single-value feature data, and feature data with many missing values. Derived feature data can also be obtained from the known feature data, which augments the amount of feature data. Finally, the augmented feature data can be displayed.
Optionally, when a feature has too many missing values across the obtained users, the platform server may set a missing-value filtering threshold and automatically filter out features whose number of missing values exceeds the threshold.
For example, the platform server may preset the missing-value filtering threshold as the number of samples × 0.4. Assuming there are 10 samples, the threshold calculated by this formula is 4, and the platform server filters out features with more than 4 missing values.
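The missing-value filter can be sketched as follows. This is a minimal illustration, assuming each sample is a dictionary in which a missing value is stored as `None` and that the threshold is a rate times the number of samples, as the 10-sample example suggests; the function name `filter_sparse_features` and the data layout are hypothetical.

```python
def filter_sparse_features(samples, rate=0.4):
    """Drop features whose missing count exceeds rate * number of samples."""
    threshold = rate * len(samples)          # e.g. 0.4 * 10 = 4
    features = list(samples[0].keys())
    kept = [
        f for f in features
        if sum(1 for s in samples if s.get(f) is None) <= threshold
    ]
    # Rebuild every sample with only the features that survived the filter.
    return [{f: s.get(f) for f in kept} for s in samples]


# Hypothetical data: "age" is missing in 5 of 10 samples, "city" in 4 of 10.
samples = [
    {"age": None if i < 5 else 30, "city": None if i < 4 else "A"}
    for i in range(10)
]
filtered = filter_sparse_features(samples)
print(list(filtered[0].keys()))
```

With these inputs only the feature missing in more than 4 samples is removed.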
The single-value feature is a feature with only one numerical value, so the single-value feature has no calculation significance, and the platform server can directly delete the single-value feature.
Optionally, according to the distribution of a feature, when the value of that feature for a certain user among the users acquired by the platform server is an outlier among all values of the feature, that feature data may be deleted.
For example, if a feature value falls within the top one-thousandth of the values of that feature, the platform server may treat it as an outlier and discard it.
Optionally, if a feature has only a small number of missing values across the obtained users, the platform server may fill in the missing part.
If the feature is a continuous feature, the missing values can be filled with the mean of the continuous data; if the feature is a discrete feature, the missing values may be filled with a constant.
Optionally, the features directly obtained by the platform server may be combined and derived through at least one of feature transformation, feature squaring, or feature addition and subtraction to generate new features.
Optionally, the continuous features may be subjected to bin discretization, and the discrete features may be subjected to one-hot encoding.
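The filling, binning, and encoding steps described above can be sketched as follows. This is an illustrative sketch only: the helper names, the fill constant "unknown", and the bin edges are assumptions, not values given in the text.

```python
from statistics import mean


def fill_missing(values, continuous=True, constant="unknown"):
    """Fill None with the mean (continuous) or a constant (discrete)."""
    present = [v for v in values if v is not None]
    fill = mean(present) if continuous else constant
    return [fill if v is None else v for v in values]


def bin_index(value, edges):
    """Return the index of the first bin whose upper edge is >= value."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)  # value lies above the last edge


def one_hot(values):
    """One-hot encode discrete values; columns follow sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


ages = fill_missing([10, None, 30])                      # continuous feature
cities = fill_missing(["a", None], continuous=False)     # discrete feature
age_bins = [bin_index(a, edges=[18, 30, 50]) for a in ages]
gender = one_hot(["m", "f", "m"])
```

Derived features (squares, sums, differences) could be appended in the same style before clustering.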
Step 503, the platform server may determine, through the sequence pattern mining algorithm PrefixSpan, the common sequence patterns of the sample users and the longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user.
In this disclosure, the platform server may first obtain the sample sequence feature of each sample user, then determine the sample sequence pattern of each sample user according to the sample sequence feature, and finally determine the common sequence pattern of each sample user and the longest common sequence pattern of the current user.
The platform server can obtain, from the daily operations of each sample user on the platform, the sequence information left by the user's clicks and browsing and the series of behavior tracks of users converted from other channels, and can represent them as sample sequence patterns.
Optionally, the sequence mode includes at least one of a behavior sequence mode and a browsing sequence mode.
A sequence pattern is ordered, and the behavior information or browsing information of a user can be recorded by labeling the sequence.
For example, a browsing sequence pattern may be labeled as follows. Suppose a first user clicks button a on page A to enter page B, browses for a period of time, and then clicks button b to enter page C; a second user clicks button a on page A to enter page B, browses for a period of time, and then clicks button c to return to page A. The browsing sequence of the first user can then be labeled AaBbC, and the browsing sequence of the second user can be labeled AaBcA.
In addition, a behavior sequence pattern may be used to represent the series of behavior tracks of a user from first reaching the platform to conversion. The behavior sequence information may consist of a series of behavior tags and may be labeled as follows. In a shopping platform scenario, the platform may preset a correspondence table between user behavior tags and user behavior codes, as shown in Table 1: for example, the purchasing behavior of a user is marked as code h and the add-to-shopping-cart behavior is marked as code g. In this scenario, suppose a first user enters the platform through a channel, registers and logs in, browses pages for a period of time, clicks an item to view its detail page, browses for a further period of time, clicks the collection button to collect the item, and then adds the item to the shopping cart and purchases it. The behavior sequence of this user is labeled bcafgh. A second user enters the platform through a channel, registers and logs in, browses pages for a period of time, searches for a specific commodity, adds it to the shopping cart after browsing, pays for it, and adds it to the collection after the purchase. The behavior sequence of this user is labeled bcdaghf.
Behavior tag | Behavior code
Purchasing behavior | h
Add-to-shopping-cart behavior | g
Collecting behavior | f
Commenting behavior | e
Search behavior | d
Login behavior | c
Registration behavior | b
Browsing behavior | a
TABLE 1
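The correspondence in Table 1 can be held in a lookup table and applied to an ordered list of behavior tags. The sketch below is illustrative: the English tag names used as dictionary keys and the function name `encode_behaviors` are assumptions, and the lowercase codes follow the example sequences in the text.

```python
# Behavior tag -> behavior code, following Table 1 (tag names are hypothetical).
BEHAVIOR_CODES = {
    "purchase": "h",
    "add_to_cart": "g",
    "collect": "f",
    "comment": "e",
    "search": "d",
    "login": "c",
    "register": "b",
    "browse": "a",
}


def encode_behaviors(events):
    """Turn an ordered list of behavior tags into a behavior sequence label."""
    return "".join(BEHAVIOR_CODES[event] for event in events)


first_user = encode_behaviors(
    ["register", "login", "browse", "collect", "add_to_cart", "purchase"]
)
second_user = encode_behaviors(
    ["register", "login", "search", "browse", "add_to_cart", "purchase", "collect"]
)
print(first_user, second_user)  # bcafgh bcdaghf
```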
Optionally, the corresponding relationship between the behavior tag and the tag number may be tagged according to an actual application scenario and a behavior category, and further refinement and change may be performed.
In addition, the platform server can use a multiple-minimum-support strategy, applying different minimum support thresholds to obtain the common sequence patterns of each length that satisfy the corresponding threshold.
Wherein, the calculation method of the minimum support degree can be as follows,
min_sup=a×n
wherein n is the number of sample users in the sample library, and a is the minimum support rate parameter.
Optionally, the minimum support rate parameter may be adjusted according to the number of sample users.
Optionally, the calculation process of the Prefixspan algorithm may be divided into the following steps:
1. The platform server finds all sequence prefixes of length 1 and their corresponding projected data sets.
2. The occurrence frequency of each prefix is counted, the prefixes whose support is higher than the minimum support threshold are added to the result set, and the frequent one-item sequence patterns are obtained.
3. All prefixes of length i that satisfy the minimum support requirement are mined recursively. The projected data set of the prefix is found, and if it is empty, the recursion returns. The support of each item in the projected data set is counted, and each single item that satisfies the support requirement is combined with the current prefix to obtain a new prefix; if no item satisfies the support requirement, the recursion returns. Then i = i + 1, the prefixes become the new prefixes obtained by combining single items, and step 3 is executed recursively for each of them.
4. All common sequence patterns in the sequence sample library are returned.
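The four steps above can be sketched as a short recursive function. This is a minimal single-item PrefixSpan illustration, not the implementation of the embodiment: sequences are plain strings with one character per item, the function name `prefixspan` is hypothetical, and a pattern is kept only when its support count is strictly higher than min_sup = a × n, which is an assumption taken from the wording "higher than the minimum support threshold".

```python
from collections import Counter


def prefixspan(sequences, min_rate):
    """Return {pattern: support count} for all frequent sequential patterns."""
    min_sup = min_rate * len(sequences)      # min_sup = a * n
    patterns = {}

    def mine(prefix, projected):
        # Count in how many projected suffixes each item occurs (at most once each).
        counts = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        for item, count in sorted(counts.items()):
            if count <= min_sup:             # keep only support > min_sup
                continue
            new_prefix = prefix + item
            patterns[new_prefix] = count
            # Project: keep what follows the first occurrence of the item.
            new_projected = [
                s[s.index(item) + 1:] for s in projected if item in s
            ]
            mine(new_prefix, new_projected)  # recurse on the longer prefix

    mine("", list(sequences))
    return patterns


browsing = prefixspan(["AaBbC", "AaBcA"], min_rate=0.5)
behavior = prefixspan(["bcafgh", "bcdaghf"], min_rate=0.5)
print(sorted(browsing, key=len))
print(max(behavior, key=len))
```

Because this sketch mines general (non-contiguous) subsequences, it may return patterns beyond those listed in a hand-built prefix table, for example AB alongside Aa and aB.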
For example, if the browsing sequence of a first user is labeled AaBbC, the browsing sequence of a second user is labeled AaBcA, and the minimum support threshold set by the platform server is 0.5, the one-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 2.
One-item prefix | Corresponding suffixes
A | aBbC, aBcA
a | BbC, BcA
B | bC, cA
TABLE 2
The two-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 3.
Two-item prefix | Corresponding suffixes
Aa | BbC, BcA
aB | bC, cA
TABLE 3
The three-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 4.
TABLE 4 (provided as an image in the original publication)
For example, if the behavior sequence of a first user is labeled bcafgh, the behavior sequence of a second user is labeled bcdaghf, and the minimum support threshold set by the platform server is 0.5, the one-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 5.
TABLE 5 (provided as an image in the original publication)
The two-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 6.
TABLE 6 (provided as an image in the original publication)
The three-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 7.
TABLE 7 (provided as an image in the original publication)
The four-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 8.
TABLE 8 (provided as an image in the original publication)
The five-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 9.
Five-item prefix | Corresponding suffix
bcagh | f
TABLE 9
Optionally, among the common sequence patterns obtained by the above method, the one with the longest prefix that the current user has is determined to be the longest common sequence pattern of the current user.
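One way to read this step is: among the mined common patterns, pick the longest one that occurs, in order, in the current user's own sequence. The sketch below follows that reading; the function names and the choice of an in-order (not necessarily contiguous) match are assumptions.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def longest_common_pattern(current_sequence, common_patterns):
    """Longest mined pattern contained in the current user's sequence."""
    matches = [p for p in common_patterns if is_subsequence(p, current_sequence)]
    return max(matches, key=len, default="")


# Hypothetical current user whose browsing sequence is AaBbA.
mined = ["A", "a", "B", "Aa", "AB", "aB", "AaB"]
print(longest_common_pattern("AaBbA", mined))  # AaB
```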
At step 504, the platform server may determine the sample weight of each sample user according to at least one of the common sequence pattern and the non-sequence pattern of the sample user.
A non-sequence pattern is the part of a sequence pattern other than the common sequence pattern, and the sequence pattern includes at least one of a behavior sequence pattern and a browsing sequence pattern.
When the sample weight of each sample user is determined according to the common sequence patterns of the sample user, the platform server determines the weight of each field type according to the frequency ratio of that field type in the common sequence patterns, and then averages the weights of the field types included in the common sequence patterns of the sample user to obtain the sample weight of the sample user.
For example, if the frequency ratio of the field type "collecting behavior f" is 0.7, the platform server may determine that the weight of this field type is 0.7; if the frequency ratio of the field type "browsing sequence AaBcA" is 0.56, the platform server may determine that the weight of this field type is 0.56. The platform server may take the average weight of the field types of the common conversion behavior sequences that a user contains as the sample weight of that converted user.
The common conversion behavior sequence patterns contained in a user's behavior may be as shown in Table 10, and the platform server may calculate the sample weight of this converted user as: (0.56 + 0.7)/2 = 0.63.
TABLE 10 (provided as an image in the original publication)
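The averaging step can be written out as a few lines of code. This is a minimal sketch, assuming the field-type weights are held in a dictionary keyed by a descriptive name; the key strings, the function name `sample_weight`, and the extra low-frequency field are hypothetical.

```python
def sample_weight(field_weights, min_support=0.5):
    """Average the weights of field types at or above the minimum support."""
    kept = [w for w in field_weights.values() if w >= min_support]
    return sum(kept) / len(kept) if kept else 0.0


weights = {
    "browsing sequence AaBcA": 0.56,
    "collecting behavior f": 0.7,
    "commenting behavior e": 0.2,   # below the threshold, so it is dropped
}
print(round(sample_weight(weights), 2))  # 0.63
```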
Optionally, the platform server may delete a field type that does not appear in each common prefix in the sequence pattern.
For example, when the minimum support threshold is set to 0.5, if the frequency ratios of the various types of values of a field are all less than the minimum support, the field may be deleted.
Optionally, in addition to determining the sample weight from the feature weights obtained by sequence pattern mining, the platform server may also use non-sequence patterns in determining the sample weight.
When the sample weight of each sample user is determined according to the non-sequence patterns of the sample users, the platform server may determine it according to the number of sample users having the non-sequence feature and the total number of sample users.
The feature weight of a non-sequence pattern can be determined in either of two ways:
1. the platform server may set the minimum support rate a as a feature weight of the non-sequence mode.
Wherein the non-sequence mode may have a lower characteristic weight than the sequence mode.
2. The platform server can calculate the feature weight of the non-sequence pattern by dividing the number of samples in which the feature appears by the total number of samples, namely: feature weight = (number of samples in which the feature appears) / (total number of samples).
Wherein the non-sequence mode may have a lower characteristic weight than the sequence mode.
Optionally, when the platform server obtains the sequence mode feature weight and the non-sequence mode feature weight of the user, the platform server may weight each feature weight to determine the sample weight.
For example, when the feature of a user is "AaBcAort", where the sequence pattern feature is "AaBcA" with a feature weight of 0.56 and the non-sequence pattern feature is "ort" with a feature weight of 0.5, the sample weight of the user can be calculated as: (0.56 × 5 + 0.5 × 2)/(5 + 2) ≈ 0.54.
For example, the sample weights of the user samples can be calculated according to the above method, and the weights of some samples are shown in table 11 below.
TABLE 11 (provided as an image in the original publication)
In step 505, the platform server obtains the number of sample users having the longest common sequence pattern and the total number of sample users.
In the embodiment of the present disclosure, the platform server may obtain, through a sequence pattern mining algorithm, the number of sample users in the user sample library having the longest common sequence pattern and the total number of sample users in the user sample library.
Step 506, the platform server determines the common sequence mode support degree of the current user according to the number of the sample users with the longest common sequence mode and the total number of the sample users, and the common sequence mode support degree is used as the weight corresponding to the current user.
Optionally, the sequence pattern support may be calculated as the ratio of the number of sample users having the longest common sequence pattern to the total number of sample users.
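Steps 505 and 506 amount to counting the sample users whose sequences contain the pattern and dividing by the size of the sample library. A minimal sketch follows, assuming sequences are strings and that "having" a pattern means containing it as an in-order subsequence; the names and sample data are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def common_pattern_support(pattern, sample_sequences):
    """Share of sample users whose sequence contains the pattern."""
    having = sum(1 for s in sample_sequences if is_subsequence(pattern, s))
    return having / len(sample_sequences)


# Hypothetical sample library of four browsing sequences.
library = ["AaBbC", "AaBcA", "AbC", "Aa"]
current_user_weight = common_pattern_support("AaB", library)
print(current_user_weight)  # 0.5
```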
Step 507, the platform server clusters the current user and each sample user according to the weight corresponding to the current user, the combination characteristics of the current user and the combination characteristics of each sample user, and obtains the category to which the current user belongs.
In the embodiment of the present disclosure, the platform server may obtain the category to which the current user belongs in either of two ways: by first performing weighted clustering on the sample users and then determining the category of the current user, or by performing weighted clustering on the sample users and the current user together.
Optionally, when the platform server performs weighted clustering on the sample users first and then obtains the category to which the current user belongs, the platform server may determine, by using a weighted clustering algorithm, a cluster center of each piece of network content corresponding to each sample user according to the sample weight of each sample user and the combined feature of each sample user, then determine, according to the combined feature of the current user and the weight corresponding to the current user, a distance between the current user and each cluster center, and finally obtain the category to which the current user belongs.
Optionally, when the sample user and the current user are weighted and clustered together and then the category to which the current user belongs is obtained, the platform server may determine, through a weighted clustering algorithm, the sample user that belongs to the same category as the current user according to the sample weight of each sample user, the combined feature of each sample user, the weight corresponding to the current user, and the combined feature of the current user.
The conventional clustering algorithm may be based on partitioned clustering, and each sample is treated equally when performing clustering calculation.
Alternatively, the conventional clustering Algorithm may include a k-means clustering Algorithm (k-means clustering Algorithm) or an Expectation Maximization Algorithm (Expectation Maximization Algorithm).
Without considering the sample weights, the k-means clustering algorithm finishes clustering when the criterion function converges. The criterion function may be written as:
$$J = \sum_{i=1}^{k} \sum_{j=1}^{m_i} \mathrm{sim}(d_{ij}, c_i)$$
where $J$ is the degree of aggregation and can be used to measure the clustering effect, $k$ is the total number of clusters, $m_i$ is the total number of members in cluster $i$, $d_{ij}$ is the $j$-th member of cluster $i$, and $c_i$ is the center vector of cluster $i$, calculated as:
$$c_i = \frac{1}{m_i} \sum_{j=1}^{m_i} d_{ij}$$
where $\mathrm{sim}(d_{ij}, c_i)$ is the similarity between the text $d_{ij}$ and the cluster center $c_i$.
Optionally, on the premise of considering the sample weight, the similarity may be calculated by using the cosine of the vector angle.
For the sample-weighted clustering algorithm, the criterion function after sample weighting and the formula for the sample-weighted class center vector are provided as images in the original publication. In these formulas, w_j is the weight of a clustering sample, and the weights satisfy a condition that is likewise provided only as an image.
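A sample-weighted variant of k-means with cosine similarity can be sketched as follows. This is an illustration under stated assumptions, not the formulas of the embodiment: each cluster center is taken to be the weighted mean of its members, the criterion is the weight-scaled sum of similarities to the assigned center, and the centers are initialized naively from the first k samples. All function names and the sample data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_center(points, weights):
    """Weighted mean of the member vectors (assumed form of the center)."""
    total = sum(weights)
    return [
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(len(points[0]))
    ]


def weighted_kmeans(points, weights, k, max_iter=50):
    centers = [list(p) for p in points[:k]]   # naive init: first k samples
    labels = None
    for _ in range(max_iter):
        # Assign every sample to the most similar center.
        new_labels = [
            max(range(k), key=lambda i: cosine(p, centers[i])) for p in points
        ]
        if new_labels == labels:              # assignments stable: converged
            break
        labels = new_labels
        for i in range(k):
            idx = [j for j, lab in enumerate(labels) if lab == i]
            if idx:
                centers[i] = weighted_center(
                    [points[j] for j in idx], [weights[j] for j in idx]
                )
    criterion = sum(
        w * cosine(p, centers[lab]) for p, w, lab in zip(points, weights, labels)
    )
    return labels, centers, criterion


points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
weights = [0.63, 0.54, 0.7, 0.5]              # hypothetical sample weights
labels, centers, criterion = weighted_kmeans(points, weights, k=2)
print(labels)
```

Samples with larger weights pull the center of their cluster more strongly, which is the intended effect of weighting converted users by their common sequence patterns.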
Step 508, the platform server pushes the network content to the current user according to the category of the current user.
In the embodiment of the present disclosure, the platform server may obtain the network content corresponding to the category having the shortest cluster center distance, or obtain the network content having the highest conversion rate as the recommended network content.
Optionally, the platform server may obtain the network content corresponding to the minimum distance between the current user and the cluster center, and then push the network content to the terminal of the current user. Or, the platform server may also obtain the pushed percentage of each network content corresponding to each sample user among the sample users belonging to the same category as the current user, and then push the target network content to the terminal of the current user.
The pushed ratio of a piece of network content is the ratio of the number of times that content was pushed to the total number of times all network content was pushed to the sample users in the same category. The target network content is the content with the highest pushed ratio among the network contents corresponding to those sample users.
For example, in an e-commerce platform scenario, suppose the platform server finds that user A and user B are in the same category as the current user, that item a was pushed to user A once and item b 4 times, and that item a was pushed to user B twice, item b once, and item c twice. The ratio of item a is then 0.3, the ratio of item b is 0.5, and the ratio of item c is 0.2, so item b has the highest ratio.
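The pushed ratio is a simple share of counts. The sketch below uses one reading of the example in which user A received item a once and item b 4 times, and user B received item a twice, item b once, and item c twice; the data layout and the function name `pushed_ratios` are hypothetical.

```python
from collections import Counter


def pushed_ratios(push_counts):
    """Share of all pushes in the category that went to each item."""
    totals = Counter()
    for per_user in push_counts.values():
        totals.update(per_user)               # add this user's counts per item
    grand_total = sum(totals.values())
    return {item: count / grand_total for item, count in totals.items()}


push_counts = {
    "user A": {"a": 1, "b": 4},
    "user B": {"a": 2, "b": 1, "c": 2},
}
ratios = pushed_ratios(push_counts)
target = max(ratios, key=ratios.get)
print(ratios, target)
```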
The platform server can acquire the recommended content in the following two ways:
1. The platform server can cluster the features of the historical converted-user samples to obtain the cluster center of each content preference category, where each content preference category corresponds to the content with the highest conversion rate in that category. For the current user, after the current user's features are weighted according to the sequence pattern, the distance between the user and each content preference category center is calculated (for example, the cosine distance), so as to obtain the nearest content preference category, to which the current user belongs, and the platform server recommends content to the current user accordingly.
2. The platform server can perform weighted clustering on the current user together with the user samples, determine the conversion-rate share of each content within the category to which the current user belongs, and recommend the content with the highest conversion rate in that category to the current user.
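As a hedged illustration of the first way, the following Python sketch picks the content preference category whose center has the smallest cosine distance to the current user. The center vectors, the names `cosine_distance` and `nearest_category`, and the sample values are assumptions made for illustration. Because cosine distance is unchanged when a vector is multiplied by a positive scalar, the sketch simply takes the already-weighted feature vector as input and makes no assumption about how the sequence pattern weighting is applied.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def nearest_category(user_vec, centers):
    """Return the key of the center closest to user_vec in cosine distance."""
    return min(centers, key=lambda name: cosine_distance(user_vec, centers[name]))

# Hypothetical cluster centers keyed by the content each category prefers.
centers = {"item_a": [1.0, 0.0, 0.2], "item_b": [0.1, 1.0, 0.9]}
weighted_user = [0.2, 0.8, 1.0]  # assumed to be already weighted
category = nearest_category(weighted_user, centers)
```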
Optionally, the platform server may send the recommended network content to a terminal of the current user for display.
To sum up, in the network content pushing method provided in the embodiments of the present disclosure, the server first obtains the combined feature and the sequence feature of the current user, and obtains the weight corresponding to the current user by combining the sequence feature of the current user with the sequence feature of each sample user. The server then clusters the current user and each sample user according to the weight corresponding to the current user, the combined feature of the current user, and the combined feature of each sample user, to obtain the category to which the current user belongs, and finally pushes network content to the current user according to that category. With this scheme, the cloud server can, through cloud computing, mine the behavior sequences of the current user and the sample users to determine the weight of the current user, perform clustering based on the weight, and provide personalized recommendation of related network content according to the clustering result, thereby improving the efficiency of predicting the network content to recommend while keeping the content recommendation accurate.
Fig. 6 is a block diagram illustrating a structure of a network content pushing apparatus according to an exemplary embodiment. The network content pushing apparatus can be implemented as all or part of the server, in hardware or in a combination of software and hardware, to execute all or part of the steps of the method shown in the embodiment corresponding to fig. 4 or fig. 5. The network content pushing apparatus may include:
a feature obtaining module 610, configured to obtain a combination feature and a sequence feature of a current user, where the combination feature includes a user feature of a corresponding user and an article feature of the corresponding user; the sequence characteristics are used for indicating network behavior characteristics sequentially executed by corresponding users;
a weight obtaining module 620, configured to obtain, by combining the sequence features of the current user and the sequence features of each sample user, a weight corresponding to the current user;
a category obtaining module 630, configured to cluster the current user and each sample user according to a weight corresponding to the current user, a combination feature of the current user, and a combination feature of each sample user, so as to obtain a category to which the current user belongs;
and the content pushing module 640 is configured to push network content to the current user according to the category to which the current user belongs.
In a possible implementation manner, the weight obtaining module 620 includes:
a common pattern determining submodule, configured to determine, through a sequence pattern mining algorithm, the common sequence pattern of each sample user and the longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user;
a number obtaining submodule, configured to obtain, from among the sample users, the number of sample users having the longest common sequence pattern and the total number of sample users;
and a weight determining submodule, configured to determine the common sequence pattern support of the current user according to the number of sample users having the longest common sequence pattern and the total number of sample users, and to use the common sequence pattern support as the weight corresponding to the current user.
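The weight computation performed by these submodules can be sketched as follows, as a non-limiting example. The sketch assumes that the common sequence patterns have already been mined (the mining algorithm itself is not shown), that a user "has" a pattern when the pattern occurs in the user's behavior sequence in order but not necessarily contiguously, and that the support is the count of such sample users divided by the total number of sample users. All names and the sample data are hypothetical.

```python
def contains_pattern(sequence, pattern):
    """True if pattern is an in-order (not necessarily contiguous) subsequence."""
    it = iter(sequence)
    # `step in it` advances the iterator, so order is preserved.
    return all(step in it for step in pattern)

def longest_common_pattern(current_seq, mined_patterns):
    """Longest mined pattern that also occurs in the current user's sequence."""
    matching = [p for p in mined_patterns if contains_pattern(current_seq, p)]
    return max(matching, key=len) if matching else ()

def support_weight(pattern, sample_seqs):
    """Share of sample users whose sequence contains the pattern."""
    if not sample_seqs or not pattern:
        return 0.0
    hits = sum(contains_pattern(seq, pattern) for seq in sample_seqs)
    return hits / len(sample_seqs)

# Hypothetical behavior sequences of four sample users.
samples = [
    ["search", "view", "cart", "buy"],
    ["view", "cart", "buy"],
    ["search", "view", "buy"],
    ["view", "favorite", "buy"],
]
# Patterns assumed to come from a sequence pattern mining step.
mined = [("view", "buy"), ("view", "cart", "buy"), ("search", "view")]
current = ["home", "view", "cart", "buy"]

pattern = longest_common_pattern(current, mined)
weight = support_weight(pattern, samples)
```

In this example the longest pattern shared with the current user is (view, cart, buy), which two of the four sample users have, so the weight is 0.5.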
In a possible implementation manner, the weight obtaining module 620 further includes:
and a sample weight determining submodule, configured to determine the sample weight of each sample user according to the common sequence pattern of the sample users, where the sequence pattern includes at least one of a behavior sequence pattern and a browsing sequence pattern.
In one possible implementation, the sample weight determining sub-module includes:
a weight determining unit, configured to determine the weight of a field type according to the frequency with which the field type occurs in the common sequence pattern;
a sample weight determining unit, configured to average the weights of the at least one field type included in the common sequence pattern of the sample user to determine the sample weight of the sample user.
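One possible reading of these two units is sketched below. It assumes that each element of a common sequence pattern is a field type, that the weight of a field type is its share of all field-type occurrences across the common sequence patterns, and that a sample user's weight is the mean weight of the distinct field types in that user's pattern. These are illustrative assumptions, and the function names are hypothetical.

```python
from collections import Counter

def field_type_weights(common_patterns):
    """Weight of each field type = its share of all occurrences in the patterns."""
    counts = Counter(ft for pattern in common_patterns for ft in pattern)
    total = sum(counts.values())
    return {ft: n / total for ft, n in counts.items()} if total else {}

def sample_weight(user_pattern, weights):
    """Mean weight of the distinct field types in one sample user's pattern."""
    types = set(user_pattern)
    if not types:
        return 0.0
    return sum(weights.get(ft, 0.0) for ft in types) / len(types)

# Hypothetical common sequence patterns of the sample users.
patterns = [("view", "cart", "buy"), ("view", "buy"), ("search", "view")]
weights = field_type_weights(patterns)      # view: 3/7, buy: 2/7, cart: 1/7, search: 1/7
w_user = sample_weight(("view", "buy"), weights)  # (3/7 + 2/7) / 2
```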
In a possible implementation manner, the category obtaining module 630 includes:
a clustering center determining submodule, configured to determine, through a weighted clustering algorithm, a clustering center of each piece of network content corresponding to each sample user according to the sample weight of each sample user and the combination feature of each sample user;
the distance determining submodule is used for determining the distance between the current user and each clustering center according to the combination characteristics of the current user and the weight corresponding to the current user;
and the first category obtaining submodule is used for obtaining the category of the current user according to the distance between the current user and each clustering center.
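A minimal sketch of this first implementation is given below. It assumes that the cluster center for each network content is the sample-weight-weighted mean of the combined features of the sample users associated with that content, and, purely for illustration, that the current user's weight scales the user's combined feature vector before a Euclidean distance is taken. The disclosure does not fix either choice, and the names are hypothetical.

```python
import math
from collections import defaultdict

def weighted_centers(samples):
    """samples: iterable of (content_label, feature_vector, sample_weight).

    Returns {content_label: weighted mean feature vector}.
    """
    sums = {}
    weight_totals = defaultdict(float)
    for label, vec, w in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += w * x
        weight_totals[label] += w
    return {
        label: [s / weight_totals[label] for s in acc]
        for label, acc in sums.items()
        if weight_totals[label] > 0
    }

def assign_category(user_vec, user_weight, centers):
    """Nearest center to the weight-scaled user vector (assumed weighting)."""
    scaled = [user_weight * x for x in user_vec]
    return min(centers, key=lambda label: math.dist(scaled, centers[label]))

# Hypothetical sample users: (content, combined feature, sample weight).
samples = [
    ("item_a", [1.0, 0.0], 0.5),
    ("item_a", [3.0, 0.0], 1.5),
    ("item_b", [0.0, 2.0], 1.0),
]
centers = weighted_centers(samples)
category = assign_category([4.0, 1.0], 0.5, centers)
```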
In a possible implementation manner, the category obtaining module 630 includes:
a sample user determining submodule, configured to determine, according to the sample weight of each sample user, the combined feature of each sample user, the weight corresponding to the current user, and the combined feature of the current user, the sample user belonging to the same category as the current user through a weighted clustering algorithm;
and a second category obtaining submodule, configured to take the category of the sample users that belong to the same category as the current user as the category to which the current user belongs.
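For this second implementation, one way to cluster the current user together with the sample users is a weighted variant of k-means (Lloyd's algorithm), sketched below. The choice of k-means, the Euclidean distance, the seeding with the first k points, and the convention of placing the current user last in the list are all assumptions made for illustration; the disclosure only requires a weighted clustering algorithm.

```python
import math

def weighted_kmeans(points, weights, k, iterations=20):
    """Lloyd's algorithm in which each point pulls its center by its weight."""
    if k > len(points):
        raise ValueError("k must not exceed the number of points")
    dim = len(points[0])
    centers = [list(p) for p in points[:k]]  # deterministic seeding (assumption)

    def assign():
        return [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]

    for _ in range(iterations):
        labels = assign()
        for c in range(k):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == c]
            total = sum(w for _, w in members)
            if total > 0:
                centers[c] = [sum(w * p[i] for p, w in members) / total
                              for i in range(dim)]
    return assign(), centers

# Four hypothetical sample users followed by the current user (last entry).
points = [[0.0, 0.0], [5.0, 5.0], [0.2, 0.1], [5.1, 4.9], [4.8, 5.2]]
weights = [1.0, 0.8, 0.6, 0.9, 0.5]  # sample weights, then the current user's weight
labels, centers = weighted_kmeans(points, weights, k=2)

current_label = labels[-1]
# Indices of the sample users that share the current user's cluster.
peers = [i for i, l in enumerate(labels[:-1]) if l == current_label]
```

The category of the sample users listed in `peers` would then be taken as the category of the current user.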
In one possible implementation manner, the content pushing module 640 includes:
the first content obtaining submodule is used for obtaining the network content corresponding to the minimum value of the distances between the current user and each clustering center;
and the first content pushing submodule is used for pushing the network content to the terminal of the current user.
In one possible implementation manner, the content pushing module 640 includes:
a second content obtaining submodule, configured to obtain, among the sample users belonging to the same category as the current user, the pushed ratio of each network content corresponding to those sample users, where the pushed ratio is the ratio of the number of times the corresponding network content has been pushed to the total number of times network content has been pushed to all the sample users in the same category;
and the second content pushing submodule is used for pushing target network content to the terminal of the current user, wherein the target network content is the network content with the highest pushed ratio in each network content corresponding to each sample user.
In one possible implementation manner, the feature obtaining module 610 includes:
the user characteristic obtaining submodule is used for obtaining the user data of the current user and generating the user characteristics of the current user;
the article characteristic acquisition sub-module is used for acquiring article data of the current user and generating article characteristics of the current user;
and the combined feature generation submodule is used for performing feature processing according to the user features and the article features to generate the combined features.
In one possible implementation, the apparatus further includes:
a sample library construction module, configured to obtain converted users as sample users and construct a user sample library, where a converted user is a user for whom a conversion has actually occurred.
In one possible implementation, the apparatus further includes:
a sample feature obtaining module, configured to obtain a sample sequence feature of each sample user before obtaining a weight corresponding to the current user according to the sequence feature of the current user and the sequence feature of each sample user;
and a sample sequence pattern determining module, configured to determine the sample sequence pattern of each sample user according to the sample sequence features.
FIG. 7 is a block diagram illustrating a computer device according to an example embodiment. The computer device 700 includes a Central Processing Unit (CPU) 701, a system Memory 704 including a Random Access Memory (RAM) 702 and a Read-Only Memory (ROM) 703, and a system bus 705 connecting the system Memory 704 and the CPU 701. The computer device 700 also includes a basic Input/Output system (I/O system) 706 for facilitating information transfer between devices within the computer device, and a mass storage device 707 for storing an operating system 713, application programs 714, and other program modules 715.
The basic input/output system 706 comprises a display 708 for displaying information and an input device 709, such as a mouse or a keyboard, for a user to input information. The display 708 and the input device 709 are both connected to the central processing unit 701 through an input/output controller 710 coupled to the system bus 705. The basic input/output system 706 may also include the input/output controller 710 for receiving and processing input from a number of other devices, such as a keyboard, a mouse, or an electronic stylus. Similarly, the input/output controller 710 may also provide output to a display screen, a printer, or another type of output device.
The mass storage device 707 is connected to the central processing unit 701 through a mass storage controller (not shown) connected to the system bus 705. The mass storage device 707 and its associated computer device-readable media provide non-volatile storage for the computer device 700. That is, the mass storage device 707 may include a computer device readable medium (not shown) such as a hard disk or a Compact Disc Read-Only Memory (CD-ROM) drive.
Without loss of generality, the computer device readable media may comprise computer device storage media and communication media. Computer device storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer device readable instructions, data structures, program modules, or other data. Computer device storage media include RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), CD-ROM, Digital Video Disk (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer device storage media are not limited to the foregoing. The system memory 704 and the mass storage device 707 described above may be collectively referred to as memory.
According to various embodiments of the present disclosure, the computer device 700 may also operate by way of a remote computer device connected to a network such as the Internet. That is, the computer device 700 may be connected to the network 712 through the network interface unit 711 connected to the system bus 705, or may be connected to other types of networks or remote computer device systems (not shown) using the network interface unit 711.
The memory further stores one or more programs, and the central processing unit 701 implements all or part of the steps of the method shown in fig. 4 or fig. 5 by executing the one or more programs.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in the embodiments of the disclosure may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on a computer-device-readable medium, or transmitted over such a medium, as one or more instructions or code. Computer device readable media include both computer device storage media and communication media, the latter including any medium that facilitates transfer of a computer device program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer device.
The embodiments of the present disclosure further provide a computer device storage medium, configured to store computer device software instructions for the above network content pushing apparatus, where the computer device software instructions include a program designed to execute the network content pushing method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (15)

1. A method for pushing network contents, the method comprising:
acquiring a combination characteristic and a sequence characteristic of a current user, wherein the combination characteristic comprises a user characteristic of a corresponding user and an article characteristic of the corresponding user; the sequence characteristics are used for indicating network behavior characteristics sequentially executed by corresponding users;
combining the sequence characteristics of the current user and the sequence characteristics of each sample user to obtain the weight corresponding to the current user;
clustering the current user and each sample user according to the weight corresponding to the current user, the combination characteristics of the current user and the combination characteristics of each sample user to obtain the category of the current user;
and pushing the network content to the current user according to the category of the current user.
2. The method according to claim 1, wherein the obtaining the weight corresponding to the current user by combining the sequence feature of the current user and the sequence feature of each sample user comprises:
determining, through a sequence pattern mining algorithm, a common sequence pattern of each sample user and a longest common sequence pattern of the current user according to the sequence characteristics of the current user and the sequence characteristics of each sample user;
obtaining, from among the sample users, the number of sample users having the longest common sequence pattern and the total number of sample users;
and determining the common sequence pattern support of the current user according to the number of sample users having the longest common sequence pattern and the total number of sample users, and taking the common sequence pattern support as the weight corresponding to the current user.
3. The method of claim 2, wherein before the obtaining the number of sample users having the longest common sequence pattern and the total number of sample users, the method further comprises:
determining the sample weight of each sample user according to the common sequence pattern of the sample users, wherein the sequence pattern comprises at least one of a behavior sequence pattern and a browsing sequence pattern.
4. The method of claim 3, wherein the determining the sample weight of each sample user according to the common sequence pattern of the sample users comprises:
determining the weight of a field type according to the frequency with which the field type occurs in the common sequence pattern;
averaging weights of at least one of the field types included in the common sequence pattern of the sample user to determine a sample weight of the sample user.
5. The method according to claim 1, wherein the clustering the current user and the sample users according to the weight corresponding to the current user, the combined feature of the current user, and the combined feature of the sample users to obtain the category to which the current user belongs comprises:
determining a clustering center of each network content corresponding to each sample user according to the sample weight of each sample user and the combination characteristic of each sample user through a weighted clustering algorithm;
determining the distance between the current user and each clustering center according to the combination characteristics of the current user and the weight corresponding to the current user;
and obtaining the category of the current user according to the distance between the current user and each clustering center.
6. The method according to claim 1, wherein the clustering the current user and the sample users according to the weight corresponding to the current user, the combined feature of the current user, and the combined feature of the sample users to obtain the category to which the current user belongs comprises:
determining the sample users belonging to the same category as the current user according to the sample weight of each sample user, the combined feature of each sample user, the weight corresponding to the current user and the combined feature of the current user through a weighted clustering algorithm;
and taking the category of the sample users belonging to the same category as the current user as the category to which the current user belongs.
7. The method of claim 5, wherein the pushing network content to the current user according to the category to which the current user belongs comprises:
acquiring the network content corresponding to the minimum value of the distance between the current user and each clustering center;
and pushing the network content to the terminal of the current user.
8. The method according to claim 5 or 6, wherein the pushing network content to the current user according to the category to which the current user belongs comprises:
acquiring a pushed ratio of each network content corresponding to each sample user in the sample users belonging to the same category as the current user, wherein the pushed ratio is a ratio between the number of times that the corresponding network content is pushed and the sum of the number of times that the network content is pushed to all the sample users in the same category;
and pushing target network content to the terminal of the current user, wherein the target network content is the network content with the highest pushed ratio in each network content corresponding to each sample user.
9. The method according to claim 1, wherein the acquiring a combination characteristic and a sequence characteristic of a current user comprises:
acquiring user data of the current user and generating user characteristics of the current user;
acquiring the article data of the current user and generating the article characteristics of the current user;
and performing feature processing according to the user features and the article features to generate the combined features.
10. The method of claim 1, further comprising:
acquiring converted users as sample users, and constructing a user sample library, wherein a converted user is a user for whom a conversion has actually occurred.
11. The method according to claim 1, wherein before the combining the sequence characteristics of the current user and the sequence characteristics of each sample user to obtain the corresponding weight of the current user, further comprising:
acquiring sample sequence characteristics of each sample user;
and determining the sample sequence pattern of each sample user according to the sample sequence characteristics.
12. A network content pushing apparatus, the apparatus comprising:
a feature obtaining module, configured to obtain a combined feature and a sequence feature of a current user, wherein the combined feature comprises a user feature of a corresponding user and an article feature of the corresponding user; the sequence feature is used for indicating network behavior features sequentially executed by the corresponding user;
the weight obtaining module is used for obtaining the weight corresponding to the current user by combining the sequence characteristics of the current user and the sequence characteristics of each sample user;
a category obtaining module, configured to cluster the current user and each sample user according to a weight corresponding to the current user, a combination feature of the current user, and a combination feature of each sample user, so as to obtain a category to which the current user belongs;
and the content pushing module is used for pushing the network content to the current user according to the category of the current user.
13. The apparatus of claim 12, wherein the weight obtaining module comprises:
a common pattern determining submodule, configured to determine, through a sequence pattern mining algorithm, the common sequence pattern of each sample user and the longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user;
a number obtaining submodule, configured to obtain, from among the sample users, the number of sample users having the longest common sequence pattern and the total number of sample users;
and a weight determining submodule, configured to determine the common sequence pattern support of the current user according to the number of sample users having the longest common sequence pattern and the total number of sample users, and to use the common sequence pattern support as the weight corresponding to the current user.
14. A computer device comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the network content pushing method according to any one of claims 1 to 11.
15. A computer-readable storage medium, wherein at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the storage medium, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by a processor to implement the network content pushing method according to any one of claims 1 to 11.
CN202010247149.5A 2020-03-31 2020-03-31 Network content pushing method, device and storage medium Active CN111460300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010247149.5A CN111460300B (en) 2020-03-31 2020-03-31 Network content pushing method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111460300A true CN111460300A (en) 2020-07-28
CN111460300B CN111460300B (en) 2023-04-25

Family

ID=71682409

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010247149.5A Active CN111460300B (en) 2020-03-31 2020-03-31 Network content pushing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111460300B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120054040A1 (en) * 2010-08-30 2012-03-01 Abraham Bagherjeiran Adaptive Targeting for Finding Look-Alike Users
CN106021305A (en) * 2016-05-05 2016-10-12 北京邮电大学 Mode and preference sensing POI recommendation method and system
CN106778876A (en) * 2016-12-21 2017-05-31 广州杰赛科技股份有限公司 User classification method and system based on mobile subscriber track similitude
CN108076154A (en) * 2017-12-21 2018-05-25 广东欧珀移动通信有限公司 Application message recommends method, apparatus and storage medium and server
CN110827044A (en) * 2018-08-07 2020-02-21 北京京东尚科信息技术有限公司 Method and device for extracting user interest mode

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
THANH TRAN等: "regularizing matrix factorization with user and item embeddings for recommendation", 《PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE》 *
都奕冰等: "融合项目嵌入表征与注意力机制的推荐算法", 《计算机工程与设计》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112000863A (en) * 2020-08-14 2020-11-27 北京百度网讯科技有限公司 User behavior data analysis method, device, equipment and medium
CN112000863B (en) * 2020-08-14 2024-04-09 北京百度网讯科技有限公司 Analysis method, device, equipment and medium of user behavior data

Also Published As

Publication number Publication date
CN111460300B (en) 2023-04-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40026286; Country of ref document: HK)
GR01 Patent grant