CN111460300B - Network content pushing method, device and storage medium - Google Patents

Network content pushing method, device and storage medium

Info

Publication number
CN111460300B
Authority
CN
China
Prior art keywords
user
sample
current user
sequence
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010247149.5A
Other languages
Chinese (zh)
Other versions
CN111460300A (en
Inventor
刘志煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd filed Critical Tencent Cloud Computing Beijing Co Ltd
Priority to CN202010247149.5A priority Critical patent/CN111460300B/en
Publication of CN111460300A publication Critical patent/CN111460300A/en
Application granted granted Critical
Publication of CN111460300B publication Critical patent/CN111460300B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The application relates to a network content pushing method, a device, computer equipment and a storage medium, in the technical field of cloud computing. The method comprises the following steps: first, a server acquires the combined features and the sequence features of a current user, and obtains the weight corresponding to the current user by combining the sequence features of the current user with the sequence features of each sample user; the server then clusters the current user and each sample user to obtain the category of the current user; finally, it pushes network content to the current user according to that category. With this scheme, a cloud server can use cloud computing to mine the behavior sequences of the current user and the sample users to determine the weight of the current user, cluster the current user based on the weight, and recommend related network content in a personalized way according to the clustering result, thereby improving the efficiency of predicting which network content to recommend while keeping the recommendation accurate.

Description

Network content pushing method, device and storage medium
Technical Field
The disclosure relates to the technical field of cloud computing, and in particular relates to a network content pushing method, a device and a storage medium.
Background
With the continuing development of cloud computing technology, several methods exist for giving users more intelligent and targeted service scene recommendations on an internet platform, including recommendation based on demographics, recommendation based on content related to the user, and recommendation based on collaborative filtering algorithms.
In the related art, recommendation based on collaborative filtering can be carried out through an association model algorithm, a clustering model algorithm, a classification model algorithm, a regression model algorithm, matrix decomposition, a graph model and the like, and it considers collaboration from two sides, users and content: user-based collaboration and item-based collaboration.
However, in the related art, a neural network model built with the above algorithms needs a large amount of data for support, and a large amount of actual data must be input to obtain a reasonably accurate model. The model therefore has to process a large amount of data before it can predict accurately, which reduces prediction efficiency.
Disclosure of Invention
The embodiment of the application provides a network content pushing method, a device, a computer device and a storage medium, which can improve the recommendation efficiency of related content, and the technical scheme is as follows:
In one aspect, a network content pushing method is provided, the method being performed by a platform server and comprising:
acquiring the combined features and the sequence features of the current user, wherein the combined features comprise the user features of the corresponding user and the item features of the corresponding user, and the sequence features indicate the network behaviors executed in sequence by the corresponding user;
obtaining the weight corresponding to the current user by combining the sequence features of the current user with the sequence features of each sample user;
clustering the current user and each sample user according to the weight corresponding to the current user, the combined features of the current user and the combined features of each sample user, to obtain the category of the current user;
and pushing network content to the current user according to the category of the current user.
In one aspect, a network content pushing device is provided, the device being used in a platform server and comprising:
a feature acquisition module, used for acquiring the combined features and the sequence features of the current user, wherein the combined features comprise the user features of the corresponding user and the item features of the corresponding user, and the sequence features indicate the network behaviors executed in sequence by the corresponding user;
a weight acquisition module, used for obtaining the weight corresponding to the current user by combining the sequence features of the current user with the sequence features of each sample user;
a category acquisition module, used for clustering the current user and each sample user according to the weight corresponding to the current user, the combined features of the current user and the combined features of each sample user, to obtain the category of the current user;
and a content pushing module, used for pushing network content to the current user according to the category to which the current user belongs.
In one possible implementation manner, the weight acquisition module includes:
a common pattern determining submodule, used for determining, through a sequence pattern mining algorithm, the common sequence patterns of the sample users and the longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user;
a number obtaining submodule, configured to obtain, among the sample users, the number of sample users having the longest common sequence pattern and the total number of sample users;
and a weight determining submodule, used for determining the common sequence pattern support of the current user according to the number of sample users having the longest common sequence pattern and the total number of sample users, the common sequence pattern support being used as the weight corresponding to the current user.
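The support computation described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `is_subsequence` and `pattern_support` are hypothetical, behaviors are encoded as single characters, and a sample user is assumed to "have" a pattern when the pattern occurs as an ordered (not necessarily contiguous) subsequence of that user's behavior sequence.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is respected.
    return all(item in remaining for item in pattern)


def pattern_support(pattern, sample_sequences):
    """Share of sample users whose sequence contains `pattern`.

    Assumption: this ratio (matching users / total sample users) is what the
    text calls the common sequence pattern support, used as the user's weight.
    """
    if not sample_sequences:
        return 0.0
    hits = sum(1 for seq in sample_sequences if is_subsequence(pattern, seq))
    return hits / len(sample_sequences)


# Hypothetical behavior sequences of four sample users.
samples = ["AaBcA", "AaBc", "Bc", "AB"]
weight = pattern_support("AaB", samples)
```

Here two of the four sample sequences contain the pattern, so the resulting weight is 0.5.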
In one possible implementation manner, the weight obtaining module further includes:
and the sample weight determining submodule is used for determining the sample weight of each sample user according to the common sequence mode of the sample users, wherein the sequence mode comprises at least one of a behavior sequence mode and a browsing sequence mode.
In one possible implementation, the sample weight determination submodule includes:
the weight determining unit is used for determining the weight of the field type according to the frequency occupied by the field type in the common sequence mode;
a sample weight determining unit, configured to average the weights of the at least one field type included in the common sequence pattern of a sample user, and to determine the sample weight of that sample user.
In one possible implementation manner, the category obtaining module includes:
the clustering center determining sub-module is used for determining the clustering center of each network content corresponding to each sample user according to the sample weight of each sample user and the combination characteristics of each sample user through a weighted clustering algorithm;
the distance determining submodule is used for determining the distance between the current user and each clustering center according to the combination characteristics of the current user and the weight corresponding to the current user;
and the first category acquisition sub-module is used for acquiring the category of the current user according to the distance between the current user and each clustering center.
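A minimal sketch of assigning the current user to the nearest clustering center is given below. The text does not say how the weight enters the distance, so the sketch assumes the combined feature vector is scaled by the user's weight before a Euclidean distance is taken; `nearest_center` is a hypothetical name.

```python
import math


def nearest_center(features, weight, centers):
    """Return (index of the closest center, list of all distances).

    Assumption: the current user's combined features are multiplied by the
    user's weight, then compared with each center by Euclidean distance.
    """
    weighted = [weight * x for x in features]
    distances = [math.dist(weighted, center) for center in centers]
    best = min(range(len(centers)), key=distances.__getitem__)
    return best, distances


# Hypothetical two-dimensional combined features and three cluster centers.
category, dists = nearest_center([2.0, 4.0], 0.5, [[0.0, 0.0], [1.0, 2.0], [5.0, 5.0]])
```

The index returned as `category` identifies the clustering center, and therefore the category, closest to the current user.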
In one possible implementation manner, the category obtaining module includes:
the sample user determining submodule is used for determining the sample users belonging to the same category with the current user according to the sample weight of each sample user, the combined characteristic of each sample user, the weight corresponding to the current user and the combined characteristic of the current user through a weighted clustering algorithm;
and the second category obtaining sub-module is used for obtaining the category which corresponds to the sample user which belongs to the same category as the current user.
In one possible implementation manner, the content pushing module includes:
the first content acquisition sub-module is used for acquiring the network content corresponding to the minimum value of the distance between the current user and each clustering center;
and the first content pushing sub-module is used for pushing the network content to the terminal of the current user.
In one possible implementation manner, the content pushing module includes:
a second content obtaining sub-module, configured to obtain, from the sample users belonging to the same category as the current user, the pushed ratio of each network content corresponding to those sample users, where the pushed ratio of a network content is the ratio between the number of times that network content was pushed and the total number of times network content was pushed to all the sample users in the same category;
and a second content pushing sub-module, used for pushing target network content to the terminal of the current user, wherein the target network content is the network content with the highest pushed ratio among the network contents corresponding to the sample users.
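The pushed ratio and the choice of target network content can be illustrated with a short sketch. The push log format (a flat list of content identifiers pushed to sample users of one category) and the names `pushed_ratios` and `target_content` are assumptions made for illustration.

```python
from collections import Counter


def pushed_ratios(push_log):
    """Map each content id to (times pushed) / (total pushes in the category)."""
    counts = Counter(push_log)
    total = sum(counts.values())
    return {content: n / total for content, n in counts.items()}


def target_content(push_log):
    """Return the content id with the highest pushed ratio, or None if empty."""
    ratios = pushed_ratios(push_log)
    if not ratios:
        return None
    return max(ratios, key=ratios.get)


# Hypothetical pushes recorded for sample users in the current user's category.
log = ["ad1", "ad2", "ad1", "ad3", "ad1"]
best = target_content(log)
```

In this log "ad1" accounts for three of five pushes, so it has the highest pushed ratio and is selected.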
In one possible implementation manner, the feature acquisition module includes:
the user characteristic acquisition sub-module is used for acquiring the user data of the current user and generating the user characteristics of the current user;
the article characteristic obtaining sub-module is used for obtaining the article data of the current user and generating the article characteristics of the current user;
and the combined characteristic generating sub-module is used for carrying out characteristic processing according to the user characteristics and the object characteristics to generate the combined characteristics.
In one possible implementation, the apparatus further includes:
and a sample library construction module, used for acquiring converted users as sample users and constructing a user sample library, wherein a converted user is a user for whom an actual conversion has occurred.
In one possible implementation, the apparatus further includes:
the sample feature acquisition module is used for acquiring sample sequence features of each sample user before acquiring the weight corresponding to the current user by combining the sequence features of the current user and the sequence features of each sample user;
and the sample sequence mode determining module is used for determining the sample sequence mode of each sample user according to the sample sequence characteristics.
In one aspect, a computer device is provided, the computer device comprising a processor and a memory, the memory storing at least one instruction, at least one program, code set, or instruction set, the at least one instruction, at least one program, code set, or instruction set being loaded and executed by the processor to implement a web content push method as described in any of the above alternative implementations.
In one aspect, a computer readable storage medium is provided, where at least one instruction, at least one program, code set, or instruction set is stored, where at least one instruction, at least one program, code set, or instruction set is loaded and executed by a processor to implement a network content pushing method according to any of the above alternative implementations.
The technical scheme provided by the application can have the following beneficial effects:
According to the content recommendation scheme provided by the embodiments of the application, a server first acquires the combined features and the sequence features of the current user and obtains the weight corresponding to the current user by combining the sequence features of the current user with the sequence features of each sample user. It then clusters the current user and each sample user according to the weight corresponding to the current user, the combined features of the current user and the combined features of each sample user to obtain the category of the current user, and finally pushes network content to the current user according to that category. With this scheme, a cloud server can use cloud computing to mine the behavior sequences of the current user and the sample users to determine the weight of the current user, cluster the current user based on the weight, and recommend related network content in a personalized way according to the clustering result, thereby improving the efficiency of predicting which network content to recommend while keeping the recommendation accurate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of a web content push system provided in an exemplary embodiment of the present application;
FIG. 2 is a schematic diagram of a network content pushing method according to an exemplary embodiment of the present application;
FIG. 3 is a flow chart of a network content push provided in an exemplary embodiment of the present application;
FIG. 4 is a flowchart of a method for pushing web content according to an exemplary embodiment of the present application;
FIG. 5 is a flowchart of a method for pushing web content according to an exemplary embodiment of the present application;
FIG. 6 is a block diagram illustrating a structure of a web content pushing apparatus according to an exemplary embodiment;
FIG. 7 is a schematic diagram of a computer device, according to an example embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be understood that "a number" herein means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate that A exists alone, that A and B exist together, or that B exists alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship.
For ease of understanding, terms involved in embodiments of the present disclosure are described below.
(1) Artificial intelligence AI
Artificial intelligence is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing and machine learning/deep learning. The scheme provided by the embodiments of the application mainly relates to machine learning/deep learning and related technologies in artificial intelligence.
(2) Neural network
A neural network, also called an artificial neural network (Artificial Neural Networks, ANNs) or a connection model (Connection Model), is a mathematical algorithm model that imitates the behavior of the neural networks of animals such as humans and performs distributed parallel information processing. Such a network relies on the complexity of the system and processes information by adjusting the interconnections among a large number of nodes.
(3) Cloud technology (Cloud technology)
Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and networks within a wide area network or a local area network to realize the computation, storage, processing and sharing of data.
Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like that are applied under the cloud computing business model; these can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. The background services of technical network systems, such as video websites, picture websites and other portal websites, require large amounts of computing and storage resources. With the rapid development and application of the internet industry, every item may in the future have its own identification mark, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong backend system support, which can only be achieved through cloud computing.
(4) Cloud Computing (Cloud Computing)
Cloud computing in the narrow sense refers to the delivery and usage mode of IT infrastructure, meaning that required resources are obtained over a network in an on-demand and easily scalable manner; cloud computing in the broad sense refers to the delivery and usage mode of services, meaning that the required services are obtained over a network in an on-demand and easily scalable manner. Such services may be IT, software or internet related, or other services. Cloud computing is a product of the development and fusion of traditional computer and network technologies such as Grid Computing, Distributed Computing, Parallel Computing, Utility Computing, Network Storage Technologies, Virtualization and Load Balancing.
With the development of the internet, real-time data flow and diversification of connected devices, and the promotion of demands of search services, social networks, mobile commerce, open collaboration and the like, cloud computing is rapidly developed. Unlike the previous parallel distributed computing, the generation of cloud computing will promote the revolutionary transformation of the whole internet mode and enterprise management mode in concept.
(5) Database
The Database (Database), which can be considered as an electronic filing cabinet, is a place for storing electronic files, and users can perform operations such as adding, inquiring, updating, deleting and the like on the data in the files. A "database" is a collection of data stored together in a manner that can be shared with multiple users, with as little redundancy as possible, independent of the application.
A database management system (Database Management System, DBMS) is a computer software system designed for managing databases, and generally has basic functions such as storage, retrieval, security and backup. Database management systems may be classified according to the database model they support, e.g., relational or XML (Extensible Markup Language); according to the type of computer supported, e.g., server cluster or mobile phone; according to the query language used, e.g., SQL (Structured Query Language) or XQuery; according to their performance emphasis, e.g., maximum scale or maximum operating speed; or according to other classification schemes. Regardless of the classification used, some DBMSs span categories, for example by supporting multiple query languages at the same time.
(6) Big data
Big data (Big Data) refers to a data set that cannot be captured, managed and processed by conventional software tools within a certain time range; it is a massive, fast-growing and diversified information asset that requires new processing modes to deliver stronger decision-making power, insight discovery and process optimization capabilities. With the advent of the cloud era, big data has attracted more and more attention, and special techniques are required to process large amounts of data effectively within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the internet and scalable storage systems.
Fig. 1 is a schematic diagram of a web content push system, according to an example embodiment. The network content push system includes a terminal 110 and a platform server 120.
The user can enter a platform scene corresponding to the platform server 120 on the terminal 110, and the user can perform services under the platform scene.
After the user enters the platform scene, the platform server 120 may record the user data of the user in the platform scene.
The user data may include browsing data of the user in the scene, behavior data of the user in the scene, and basic data of the user.
The platform server 120 may include memory that may be used to store various user data.
The terminal 110 may perform data transmission with the platform server 120 through a wired or wireless network.
The platform server 120 may be a server, or may be a server cluster formed by a plurality of servers, or may include one or more virtualized platforms, or may also be a cloud computing service center.
The platform server 120 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited herein.
Alternatively, the wired or wireless network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network including, but not limited to, a local area network (Local Area Network, LAN), metropolitan area network (Metropolitan Area Network, MAN), wide area network (Wide Area Network, WAN), mobile, wireless network, private network, or any combination of virtual private networks. In some embodiments, data exchanged over the network is represented using techniques and/or formats including HyperText Mark-up Language (HTML), extensible markup Language (Extensible Markup Language, XML), and the like. All or some of the links may also be encrypted using conventional encryption techniques such as secure socket layer (Secure Socket Layer, SSL), transport layer security (Transport Layer Security, TLS), virtual private network (Virtual Private Network, VPN), internet protocol security (Internet Protocol Security, IPsec), and the like. In other embodiments, custom and/or dedicated data communication techniques may also be used in place of or in addition to the data communication techniques described above.
Fig. 2 is a schematic diagram illustrating a network content pushing method according to an exemplary embodiment. As shown in fig. 2, the network content pushing method includes the following steps:
In step 201, user features and item features are constructed, and users who have actually converted are mined as positive samples.
In one possible implementation, the server constructs user features and item features from the user data and the item data.
The user features may include basic attribute features of the user, such as age, gender, education level and city tier; consumption features of the user, such as the total number of payments, the total amount, the distribution of payments over a period of time (24 hours, a week, a month, half a year), the distribution of payment methods and the average amount; and behavior features of the user, such as browsing duration and number of page clicks.
Wherein, the item features may include item base attribute features such as item category, item price, item brand, item purchase score, item comment emotion, etc. features; the item characteristics may include item consumption characteristics such as the number of times an item is purchased, the number of times it is clicked, the number of times a shopping cart is added, the number of times similar items are purchased, etc.
Optionally, the server may construct <user, item> combined features by combining the user features and the item features, and perform data preprocessing.
The preprocessing steps may comprise:
1) Discard features with too many missing values.
For example, the server may set a missing-value filtering threshold = sample data size × 0.4 and filter out a feature when its number of missing values exceeds this threshold; single-valued features are deleted at the same time.
2) Perform outlier processing.
For example, according to the feature distribution, outliers whose feature values are too large, namely the top 0.001 (one thousandth), are discarded.
3) Handle missing values.
For example, continuous features may be filled with the mean, and missing discrete features may be filled with a constant that is treated as a separate category.
4) Perform feature derivation.
For example, the server may combine and derive features through feature transformation, feature squaring, and feature addition and subtraction.
5) Perform feature processing.
For example, continuous features may be discretized by binning, and discrete features may be one-hot encoded.
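Part of the preprocessing list above, namely step 1), step 3) and the one-hot part of step 5), can be sketched in plain Python as follows. The record layout (a list of dictionaries with `None` for missing values), the function name `preprocess` and the `"MISSING"` category label are assumptions for illustration; outlier removal, feature derivation and binning are left out to keep the sketch short.

```python
from statistics import mean


def preprocess(rows, continuous, discrete, missing_ratio=0.4):
    """Filter, fill and encode features of `rows` (list of dicts, None = missing)."""
    threshold = len(rows) * missing_ratio  # missing-value filtering threshold
    kept = []
    for name in continuous + discrete:
        values = [row.get(name) for row in rows]
        missing = sum(v is None for v in values)
        distinct = {v for v in values if v is not None}
        # Step 1: drop features with too many missing values or a single value.
        if missing > threshold or len(distinct) <= 1:
            continue
        kept.append(name)

    out = [dict() for _ in rows]
    for name in kept:
        values = [row.get(name) for row in rows]
        if name in continuous:
            # Step 3: fill missing continuous values with the feature mean.
            fill = mean(v for v in values if v is not None)
            for target, v in zip(out, values):
                target[name] = fill if v is None else v
        else:
            # Step 3 + 5: missing discrete values become their own category,
            # then every category is one-hot encoded.
            categories = sorted({v for v in values if v is not None}) + ["MISSING"]
            for target, v in zip(out, values):
                v = "MISSING" if v is None else v
                for cat in categories:
                    target[f"{name}={cat}"] = 1 if v == cat else 0
    return out


# Hypothetical <user, item> records.
records = [
    {"age": 20, "city": "A", "flag": 1, "sparse": None},
    {"age": None, "city": "B", "flag": 1, "sparse": None},
    {"age": 40, "city": None, "flag": 1, "sparse": 3},
    {"age": 30, "city": "A", "flag": 1, "sparse": None},
]
clean = preprocess(records, continuous=["age", "flag", "sparse"], discrete=["city"])
```

In this example "flag" is dropped because it is single-valued and "sparse" is dropped because three of its four values are missing, which exceeds the threshold of 4 × 0.4 = 1.6.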
Optionally, users who have actually converted in the business scene are treated as high-value users. High-value users are defined to include users who purchase memberships, users who redeem large amounts of points and users with a large historical transaction amount; these users are used as positive samples to construct a high-value user sample library.
In step 202, the behavior sequences of different users of the same item are mined based on the PrefixSpan sequence pattern mining algorithm.
In one possible implementation, the server may mine user behavior sequence patterns based on the PrefixSpan algorithm to find user groups that share common behavior habits or browsing habits on the path from reach to conversion.
Common sequence patterns of every length that meet the minimum support threshold are mined from the user behavior tracks based on the PrefixSpan algorithm. A multiple-minimum-support strategy can also be used. The minimum support is calculated as min_sup = a × n, where n is the number of samples in the sample set and a is the minimum support rate parameter, which is adjusted according to the size of the sample set.
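The mining step can be illustrated with a compact PrefixSpan sketch for sequences of single items. It is a simplified teaching version rather than the implementation used in the application: itemsets within one sequence element are not supported, behaviors are encoded as single characters, and the helper name `minimum_support_count` is hypothetical. The threshold follows the a × n rule described above.

```python
def minimum_support_count(n, a):
    """Minimum support as the product of support rate `a` and sample count `n`."""
    return a * n


def prefixspan(sequences, min_support):
    """Return {pattern (tuple): number of sequences containing it}."""
    results = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, start position) pairs, i.e. the
        # suffixes that remain after matching `prefix`.
        counts = {}
        for sid, pos in projected:
            for item in set(sequences[sid][pos:]):
                counts[item] = counts.get(item, 0) + 1
        for item, count in counts.items():
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project each suffix past the first occurrence of `item`.
            next_projected = []
            for sid, pos in projected:
                seq = sequences[sid]
                for i in range(pos, len(seq)):
                    if seq[i] == item:
                        next_projected.append((sid, i + 1))
                        break
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical behavior tracks of four converted users.
tracks = ["AaBcA", "AaBc", "ABc", "Bc"]
patterns = prefixspan(tracks, minimum_support_count(len(tracks), 0.5))
longest = max(patterns, key=len)
```

With a support rate of 0.5 and four tracks, a pattern must appear in at least two tracks; the longest pattern that qualifies here is A, a, B, c.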
In step 203, sample weights are calculated from the feature weights obtained by sequence pattern mining.
In one possible implementation, the common behavior patterns of converted users are mined from the user behavior sequences and the user browsing sequences, so the modeling method pays more attention to these feature types and weights them. Feature type fields that do not appear in any common prefix of the sequence patterns are removed, which filters out factors that have little influence on user conversion. The weight of each field type is set to its frequency share. For example, with the minimum support threshold set to 0.5, a field is removed if the frequency share of every type value of that field is below the minimum support; if the field type "collection behavior f" has a frequency share of 0.7, its weight is 0.7; if the field type "browsing sequence AaBcA" has a frequency share of 0.56, its weight is 0.56. The average weight of the field types contained in a user's converted common behavior sequence is taken as the sample weight of that converted user.
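The weighting rule in this step can be worked through with the figures from the text. In the sketch below the function names are hypothetical, the third field type ("share behavior s" with a frequency share of 0.3) is invented to show the rejection rule, and returning 0.0 for a user with no retained field type is an assumption.

```python
def field_type_weights(frequency_share, min_support=0.5):
    """Keep field types whose frequency share reaches the minimum support.

    The retained frequency share is used directly as the field type weight.
    """
    return {field: share for field, share in frequency_share.items()
            if share >= min_support}


def sample_weight(user_field_types, weights):
    """Average the weights of the retained field types a user exhibits."""
    matched = [weights[f] for f in user_field_types if f in weights]
    # Assumption: a user with no retained field type gets weight 0.0.
    return sum(matched) / len(matched) if matched else 0.0


shares = {
    "collection behavior f": 0.7,
    "browsing sequence AaBcA": 0.56,
    "share behavior s": 0.3,  # hypothetical field, below the 0.5 threshold
}
weights = field_type_weights(shares)
w = sample_weight(["collection behavior f", "browsing sequence AaBcA"], weights)
```

For a user whose converted common behavior sequence contains both retained field types, the sample weight is the average of 0.7 and 0.56, that is 0.63.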
In step 204, a weighted clustering algorithm is constructed to cluster the sample combination features.
In one possible implementation, based on the user combined feature construction and feature processing in step 201, the user combined features are weighted according to the sample weight of each sample calculated in step 203, and a sample weighted clustering algorithm is constructed to cluster the feature vectors.
In step 205, different users are recommended based on the clustering result and preset conditions.
In one possible implementation, the combined features of a new user sample are constructed and processed as in step 201 and weighted according to the sample weight calculated as in step 203, so as to construct new sample weighted features. The combined feature vector of the new sample and the combined feature vectors of the converted user samples are clustered together using the weighted clustering of step 204. After clustering is complete, the conversion ratio of each item in the category to which the new sample belongs is calculated, and the item with the highest conversion rate in that category is recommended to the user sample being predicted.
Referring to fig. 3, a flow chart of a network content pushing method according to an exemplary embodiment of the present application is shown. The user may be a user of an e-commerce platform or a user of a terminal with a recommendation system platform. When the user is on the e-commerce platform, the platform can use this content recommendation flow to recommend commodities the user is likely to prefer, according to the relevant attributes of the user and the sample attributes in the database. When the user is on a terminal system with an advertisement recommendation function, the advertisement recommendation platform can use this flow to recommend targeted advertisements according to the user's behavior preferences. The content recommendation method can greatly improve the content recommendation effect. As shown in fig. 3, in the e-commerce platform scenario, a user 301 enters the e-commerce platform through a terminal 302, and the terminal can exchange data with a platform server 303 of the e-commerce platform through a wired or wireless network. First, relevant data of high-value users and commodity data are stored as sample data in a database in the platform server 303. High-value users may include members of the e-commerce platform and users with transaction behavior on the platform; because such users provide data of high reference value to the platform, they can be selected as sample users. The platform server 303 acquires the data of the high-value users and the commodity data in real time and preprocesses them.
Then, the platform server 303 mines the behavior sequences of the sample users based on sequence patterns. The sequence information left by the sample users' clicks and browsing on the platform, together with the series of behavior tracks from contact through other channels to conversion, forms browsing sequence patterns and behavior sequence patterns. All common sequence patterns in the sample set are obtained through a sequence pattern mining algorithm; a common sequence pattern may be a common browsing pattern or a common behavior pattern of the sample users. Sample weights are calculated according to the frequency ratios of the field types, and a weighted clustering algorithm is applied to the samples according to the sample weights to obtain a clustering result. The combined features of the user 301 and the combined features of the samples are weighted and clustered. The conversion ratio of each commodity in the category to which the user 301 belongs, that is, the share of each commodity favored by the sample users in that category, is then calculated, and the commodity with the highest conversion rate in the category, that is, the commodity with the highest share, is recommended to the user 301.
Referring to fig. 4, a flowchart of a network content pushing method according to an exemplary embodiment of the present application is shown. The web content push method may be performed by a platform server. The platform server may be the platform server 120 in the system shown in fig. 1. As shown in fig. 4, the network content pushing method may include the steps of:
In step 401, a combined feature and a sequence feature of the current user are acquired, wherein the combined feature comprises a user feature of the corresponding user and an item feature of the corresponding user; the sequence feature is used to indicate the network behavior features that the corresponding user performs in sequence.
Optionally, the platform server may obtain user data of the current user, generate user features of the current user, the platform server may obtain item data of the current user, generate item features of the current user, and the platform server may generate the combined feature according to the user features and the item features.
Optionally, the user characteristics may include at least one of user base attribute characteristics, user consumption characteristics, and user behavior characteristics.
The user basic attribute features may be user features such as user age, user gender, user education level, and the tier of the city where the user is located. The user consumption features may be user features such as the user's total number of payments, total payment amount, distribution of payments over a period of time (24 hours, one week, one month, half a year), payment distribution, and average payment amount. The user behavior features may be user features such as the user's browsing duration and number of page clicks.
Optionally, the item characteristics may include at least one of an item base attribute characteristic and an item consumption characteristic.
The item basic attribute features may be item features such as item category, item price, item brand, item purchase rating, and item review sentiment. The item consumption features may be item features such as the number of times the item is purchased, the number of times the item is clicked and browsed, the number of times the item is added to a shopping cart, and the number of times similar items are purchased.
Alternatively, the combined feature may be obtained by combining the user feature and the item feature, and the combined feature may be expressed in the form of < user feature, item feature >.
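As a minimal sketch of the < user feature, item feature > form, the following Python snippet concatenates two feature records into one combined record. The function name combine_features, the "user." and "item." prefixes, and all feature names and values are hypothetical and chosen only for illustration.

```python
from typing import Dict


def combine_features(user_features: Dict[str, float],
                     item_features: Dict[str, float]) -> Dict[str, float]:
    """Concatenate user and item features into one <user feature, item feature> record."""
    # Prefix the keys so user and item features with the same name cannot collide.
    combined = {f"user.{name}": value for name, value in user_features.items()}
    combined.update({f"item.{name}": value for name, value in item_features.items()})
    return combined


# Hypothetical feature values for one user and one item.
combined = combine_features(
    {"age": 28.0, "total_payments": 12.0},
    {"price": 59.9, "times_purchased": 340.0},
)
```

User features come first and item features second, which mirrors the order of the pair described above.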
In step 402, the weight corresponding to the current user is obtained by combining the sequence features of the current user and the sequence features of each sample user.
Optionally, the platform server may mine the sequence pattern of the current user and the sample sequence patterns of the sample users based on the Prefixspan (sequence pattern mining) algorithm. By obtaining the sequence patterns common to the current user and the sample users, the sample users who share behavior habits or browsing habits with the current user on the path from contact to conversion can be found.
In step 403, the current user and each sample user are clustered according to the weight corresponding to the current user, the combination feature of the current user and the combination feature of each sample user, so as to obtain the category to which the current user belongs.
In step 404, web content is pushed to the current user according to the category to which the current user belongs.
Optionally, the pushed content may differ according to the scenario in which the user is located.
For example, in an e-commerce platform scenario, the platform server may send a purchase link for a commodity to the user's terminal; in a terminal application store platform scenario, the platform server may send the user a download link for an application. In addition to an address link, the content sent by the platform server may also be a picture, a video, or a piece of audio.
In summary, in the network content pushing method provided in the embodiments of the present disclosure, firstly, a server obtains a weight corresponding to a current user by obtaining a combination feature and a sequence feature of the current user, combining the sequence feature of the current user and the sequence feature of each sample user, then clusters the current user and each sample user according to the weight corresponding to the current user, the combination feature of the current user and the combination feature of each sample user, obtains a category to which the current user belongs, and finally pushes network content to the current user according to the category to which the current user belongs. Through the scheme, the cloud server can mine the current user and the sample user according to the behavior sequence through cloud computing so as to determine the weight of the current user, cluster the current user based on the weight, and realize personalized recommendation of related network content according to the clustering result, so that the efficiency of recommending network content prediction is improved on the premise of ensuring accurate content recommendation.
Referring to fig. 5, a flowchart of a content recommendation method according to an exemplary embodiment of the present application is shown. The content recommendation method may be performed by a platform server. The platform server may be the platform server 120 shown in fig. 1. As shown in fig. 5, the content recommendation method may include the steps of:
In step 501, the platform server obtains converted users as sample users and constructs a user sample library.
In the embodiment of the disclosure, the platform server may acquire the platform's converted users as sample users in real time or periodically, record the data of each sample user, and store this data in a dedicated database of the platform server, which may serve as the user sample library.
A converted user is a user for whom an actual conversion has occurred. Conversion refers to the process by which a new user who has just entered the platform becomes an established user of the platform through browsing or other behaviors.
The user sample library may be a database in which data of sample users is stored in a platform server.
In step 502, the platform server obtains the combined feature and the sequence feature of the current user.
In the embodiment of the disclosure, the platform server may acquire the data of the current user, and obtain the combination feature of the current user according to the acquired data information.
Wherein the combined features include user features of the corresponding user and item features of the corresponding user; the sequence features are used to indicate network behavior features that the corresponding user performs in sequence.
Optionally, the platform server obtains user data of the current user, generates user characteristics of the current user, obtains item data of the current user, generates item characteristics of the current user, and generates the combination characteristics according to the user characteristics and the item characteristics.
Optionally, after the platform server splices and combines the two parts of features of the user feature and the article feature to form a combined feature form of < user feature, article feature >, the features can be subjected to data preprocessing.
Data preprocessing can remove abnormal feature data, single-value feature data, and feature data with many missing values. Derived feature data can also be obtained from known feature data to expand the amount of feature data. Finally, the expanded feature data is presented.
Optionally, when a feature has too many missing values across the acquired users, the platform server may set a missing-value filtering threshold and automatically filter out features whose number of missing values exceeds the threshold.
For example, the platform server may preset missing-value filtering threshold = sample data size × 0.4. Assuming there are 10 samples, the threshold is 4 according to this formula, so the platform server filters out features for which more than 4 values are missing.
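The missing-value filter in this example can be sketched in Python as follows. This is an illustration only: the function name filter_sparse_features, the representation of a missing value as None, and the sample feature names are assumptions, not part of the described method.

```python
from typing import Dict, List, Optional


def filter_sparse_features(samples: List[Dict[str, Optional[float]]],
                           missing_ratio: float = 0.4) -> List[str]:
    """Return the names of features whose missing count does not exceed the threshold."""
    threshold = len(samples) * missing_ratio  # e.g. 10 samples x 0.4 = 4
    feature_names = sorted({name for sample in samples for name in sample})
    kept = []
    for name in feature_names:
        # A value counts as missing when the key is absent or the value is None.
        missing = sum(1 for sample in samples if sample.get(name) is None)
        if missing <= threshold:
            kept.append(name)
    return kept


# Ten hypothetical samples: "age" is missing 5 times, "total_paid" only once.
samples = [
    {"age": None if i < 5 else 30.0, "total_paid": None if i == 0 else 100.0}
    for i in range(10)
]
kept_features = filter_sparse_features(samples)
```

With 10 samples the threshold is 4, so "age" (5 missing values) is filtered out and "total_paid" (1 missing value) is kept.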
The single-value feature is a feature with only one numerical value, so that the single-value feature has no calculating meaning, and the platform server can directly delete the single-value feature.
Optionally, according to the distribution of the features, when a certain feature of a certain user of the plurality of users acquired by the platform server is an outlier in all feature values of the feature, the feature data may be deleted.
For example, if a feature value lies in the extreme one-thousandth of that feature's value distribution, the platform server may discard it as an outlier.
Alternatively, if a feature of the acquired plurality of users has a small number of missing values, the platform server may process the missing portions.
If the feature is a continuous feature, the feature can be filled with the average value of continuous data; if the feature is a discrete feature, it may be filled with a constant.
Optionally, the features directly acquired by the platform server may be combined and derived by at least one of feature transformation, feature squaring, or feature addition and subtraction, to generate new features.
Optionally, the continuous features may be discretized by binning, and the discrete features may be one-hot encoded.
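The filling and encoding operations described above can be sketched as follows. The text does not specify the binning scheme or the fill constant, so equal-width binning and the constant "unknown" are assumptions, and all function names are hypothetical.

```python
from statistics import mean
from typing import List, Optional


def fill_with_mean(values: List[Optional[float]]) -> List[float]:
    """Fill missing values of a continuous feature with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def fill_with_constant(values: List[Optional[str]], constant: str = "unknown") -> List[str]:
    """Fill missing values of a discrete feature with a constant (assumed: "unknown")."""
    return [constant if v is None else v for v in values]


def equal_width_bins(values: List[float], n_bins: int) -> List[int]:
    """Discretize a continuous feature into n_bins equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width when all values are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def one_hot(values: List[str]) -> List[List[int]]:
    """One-hot encode a discrete feature, with categories in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


filled = fill_with_mean([1.0, None, 3.0])
binned = equal_width_bins([0.0, 5.0, 10.0], n_bins=2)
encoded = one_hot(fill_with_constant(["f", None, "f"]))
```

Equal-frequency binning or a learned binning could be substituted without changing the rest of the flow.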
In step 503, the platform server may determine the common sequence pattern of each sample user and the longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user by using a Prefixspan algorithm for sequence pattern mining.
In the embodiment of the disclosure, the platform server may first obtain the sample sequence features of the respective sample users, then determine the sample sequence patterns of the respective sample users according to the sample sequence features, and finally determine the common sequence pattern of the respective sample users and the longest common sequence pattern of the current user.
From the sample users' daily operations on the platform, the platform server can obtain the sequence information left by each sample user's clicks and browsing on the platform, as well as the series of behavior tracks from contact through other channels to conversion; this sequence information can be represented by sample sequence patterns.
Optionally, the sequence pattern includes at least one of a behavior sequence pattern and a browsing sequence pattern.
Sequence patterns are ordered, and marking a sequence pattern yields the user's behavior information or browsing information in order.
For example, a browsing sequence pattern may be marked as follows. Suppose user Xiao Ming clicks button a on page A to enter page B, browses for a period of time, and then clicks button b to enter page C; user Xiao Li clicks button a on page A to enter page B, browses for a period of time, and then clicks button c to return to page A. Xiao Ming's browsing sequence can be marked as AaBbC, and Xiao Li's browsing sequence can be marked as AaBcA.
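The marking of a browsing sequence can be illustrated with a short Python sketch. It assumes that each step is recorded as a (page, button) pair, with None as the button on the final page; this representation and the function name mark_browsing_sequence are hypothetical.

```python
from typing import List, Optional, Tuple


def mark_browsing_sequence(steps: List[Tuple[str, Optional[str]]]) -> str:
    """Join (page, clicked button) steps into a browsing sequence string."""
    # The last page has no clicked button, so its button is None.
    return "".join(page + (button or "") for page, button in steps)


xiao_ming = mark_browsing_sequence([("A", "a"), ("B", "b"), ("C", None)])
xiao_li = mark_browsing_sequence([("A", "a"), ("B", "c"), ("A", None)])
```

These two calls reproduce the sequences AaBbC and AaBcA from the example.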
In addition, a behavior sequence pattern can represent the series of behavior tracks of a user from entering the platform to conversion. Behavior sequence information can be formed from a series of behavior labels, and a behavior sequence pattern can be marked as follows. In a shopping platform scenario, the platform can preset a correspondence table between user behavior labels and user behavior codes, as shown in Table 1: the user's purchasing behavior is marked with code h, the user's add-to-shopping-cart behavior is marked with code g, and so on; the detailed correspondence is given in Table 1. In this scenario, suppose user Xiao Ming enters the platform through a channel, registers and logs in, browses for a period of time, clicks to view an item's detail page, browses for a period of time, clicks the collection button to collect the item, then adds it to the shopping cart and purchases it. This user's behavior sequence is marked as bcafgh. User Xiao Li enters the platform through a channel, registers and logs in, clicks to search for a specific commodity after browsing pages for a period of time, adds it to the shopping cart after browsing, pays for the purchase, and adds it to the collection after purchasing. This user's behavior sequence is marked as bcdaghf.
Behavior label                     Behavior code
Purchasing behavior                h
Adding-to-shopping-cart behavior   g
Collection behavior                f
Comment behavior                   e
Search behavior                    d
Login behavior                     c
Registration behavior              b
Browsing behavior                  a
TABLE 1
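The correspondence in Table 1 can be applied as a simple lookup. The sketch below assumes the lowercase codes used in the example sequences; the dictionary keys and the function name encode_behaviors are hypothetical labels introduced for illustration.

```python
from typing import Dict, List

# Behavior label -> behavior code, following Table 1 (lowercase codes assumed).
BEHAVIOR_CODES: Dict[str, str] = {
    "purchase": "h",
    "add_to_cart": "g",
    "collect": "f",
    "comment": "e",
    "search": "d",
    "login": "c",
    "register": "b",
    "browse": "a",
}


def encode_behaviors(events: List[str]) -> str:
    """Translate an ordered list of behavior labels into a behavior sequence string."""
    return "".join(BEHAVIOR_CODES[event] for event in events)


xiao_ming_behaviors = encode_behaviors(
    ["register", "login", "browse", "collect", "add_to_cart", "purchase"]
)
xiao_li_behaviors = encode_behaviors(
    ["register", "login", "search", "browse", "add_to_cart", "purchase", "collect"]
)
```

The two event lists are chosen so that the results match the sequences bcafgh and bcdaghf given for Xiao Ming and Xiao Li.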
Optionally, the correspondence between behavior labels and behavior codes can be defined according to the actual application scenario and behavior categories, and can be further refined and changed.
In addition, the platform server may use a multiple-minimum-support strategy, with different minimum support thresholds, to obtain the common sequence patterns of each length that satisfy the respective thresholds.
The minimum support may be calculated as follows:
min_sup = a×n
where n is the number of sample users in the sample library, and a is the minimum support rate parameter.
Optionally, the minimum support rate parameter may be adjusted according to the number of sample users.
Alternatively, the calculation process of the Prefixspan algorithm may be divided into the following steps:
1. The platform server finds the user sequence prefixes of length 1 and their corresponding projected data sets.
2. The occurrence frequency of each sequence prefix is counted, and the prefixes whose support is higher than the minimum support threshold are added to the data set, yielding the frequent one-item sequence patterns.
3. Each prefix of length i that meets the minimum support requirement is mined recursively. The projected data set of the prefix is found; if the projected data set is empty, the recursion returns. The support of each item in the projected data set is counted, and each item that meets the support requirement is merged with the current prefix to obtain a new prefix; if no item meets the support requirement, the recursion returns. Let i = i+1, take each new prefix obtained by merging a single item as the current prefix, and recursively execute step 3 for each of them.
4. All common sequence patterns in the sequence sample library are returned.
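The four steps above can be sketched in Python for the simple case in which every sequence is a string of single-character items. This is a minimal illustration under that assumption, not the implementation used by the platform server; the function name prefixspan and the parameter min_count (the minimum number of supporting sequences) are hypothetical. The value min_count = 2 is chosen here so that only patterns shared by both example users are kept.

```python
from typing import Dict, List


def prefixspan(sequences: List[str], min_count: int) -> Dict[str, int]:
    """Mine frequent sequential patterns; returns pattern -> number of supporting sequences."""
    patterns: Dict[str, int] = {}

    def mine(prefix: str, projected: List[str]) -> None:
        # Count, for each item, how many projected suffixes contain it at least once.
        counts: Dict[str, int] = {}
        for suffix in projected:
            for item in set(suffix):
                counts[item] = counts.get(item, 0) + 1
        for item, count in sorted(counts.items()):
            if count < min_count:
                continue  # the item does not meet the support requirement
            new_prefix = prefix + item
            patterns[new_prefix] = count
            # Project: keep what follows the first occurrence of the item.
            new_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            mine(new_prefix, new_projected)

    mine("", sequences)
    return patterns


browse_patterns = prefixspan(["AaBbC", "AaBcA"], min_count=2)
behavior_patterns = prefixspan(["bcafgh", "bcdaghf"], min_count=2)
longest_behavior_pattern = max(behavior_patterns, key=len)
```

Because a sequential pattern does not have to be contiguous, this sketch also reports patterns such as AB for the browsing sequences. For the two behavior sequences, the longest common pattern it finds is bcagh.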
For example, if Xiao Ming's browsing sequence is marked as AaBbC, Xiao Li's browsing sequence is marked as AaBcA, and the minimum support threshold set by the platform server is 0.5, the one-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 2.
One-item prefix    Corresponding suffixes
A                  aBbC; aBcA
a                  BbC; BcA
B                  bC; cA
TABLE 2
The two-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 3.
Two-item prefix    Corresponding suffixes
Aa                 BbC; BcA
aB                 bC; cA
TABLE 3
The three-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 4.
[Table image not reproduced]
TABLE 4
For example, if Xiao Ming's behavior sequence is marked as bcafgh, Xiao Li's behavior sequence is marked as bcdaghf, and the minimum support threshold set by the platform server is 0.5, the one-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 5.
[Table image not reproduced]
TABLE 5
The two-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 6.
[Table image not reproduced]
TABLE 6
The three-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 7.
[Table image not reproduced]
TABLE 7
The four-item prefixes that satisfy the minimum support threshold and their corresponding suffixes may be as shown in Table 8.
[Table image not reproduced]
TABLE 8
The five-item prefix that satisfies the minimum support threshold and its corresponding suffix may be as shown in Table 9.
Five-item prefix   Corresponding suffix
bcagh              f
TABLE 9
Optionally, among the common sequence patterns obtained by the above method, the one with the longest prefix for the current user is determined to be the longest common sequence pattern.
In step 504, the platform server may determine sample weights for the respective sample users based on at least one of the common sequence pattern and the non-sequence pattern of the sample users.
The non-sequence pattern is the part of a sequence pattern that remains after the common sequence pattern is removed, and the sequence pattern comprises at least one of a behavior sequence pattern and a browsing sequence pattern.
When the sample weights of the respective sample users are determined according to the common sequence patterns of the sample users, the platform server may determine the weight of each field type according to the frequency ratio of that field type in the common sequence patterns, and then average the weights of the field types contained in a sample user's common sequence patterns to determine the sample weight of that sample user.
For example, if the frequency ratio of the field type "collection behavior f" is 0.7, the platform server may determine that the weight of this field type is 0.7; if the frequency ratio of the field type "browsing sequence AaBcA" is 0.56, the platform server may determine that the weight of this field type is 0.56. The platform server may take the average weight of the field types contained in a user's converted common behavior sequences as the sample weight of that converted user.
The converted common behavior sequence patterns contained in a user's behavior may be as shown in Table 10, in which case the platform server may calculate the sample weight of the converted user as (0.56+0.7)/2=0.63.
[Table image not reproduced]
TABLE 10
Optionally, the platform server may delete field types that do not appear in any common prefix of the sequence patterns.
For example, when the minimum support threshold is set to 0.5, a field may be deleted if the frequency ratio of each of its type values is smaller than the minimum support.
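The field weighting, field deletion, and averaging described above can be sketched together as follows. The function names are hypothetical, and the third field type ("comment behavior e" with a frequency ratio of 0.2) is invented solely to show a field being deleted.

```python
from typing import Dict, List


def field_weights(frequency_ratios: Dict[str, float],
                  min_support: float = 0.5) -> Dict[str, float]:
    """Keep field types whose frequency ratio reaches the minimum support; weight = ratio."""
    return {field: ratio for field, ratio in frequency_ratios.items() if ratio >= min_support}


def sample_weight(user_fields: List[str], weights: Dict[str, float]) -> float:
    """Average the weights of the retained field types that the user contains."""
    matched = [weights[field] for field in user_fields if field in weights]
    return sum(matched) / len(matched) if matched else 0.0


ratios = {
    "collection behavior f": 0.7,
    "browsing sequence AaBcA": 0.56,
    "comment behavior e": 0.2,  # hypothetical field, below the 0.5 threshold
}
weights = field_weights(ratios)
user_weight = sample_weight(["collection behavior f", "browsing sequence AaBcA"], weights)
```

For a user containing both retained field types, the result is (0.56+0.7)/2, that is, approximately 0.63, as in the example.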
Optionally, in addition to determining the sample weights from the sequence pattern mining feature weights alone, the platform server may also determine the sample weights in combination with the non-sequence patterns.
When the sample weights of the respective sample users are determined according to the non-sequence patterns of the sample users, the platform server may determine the sample weights according to the number of sample users having the non-sequence features and the total number of sample users.
The feature weight of a non-sequence pattern can be determined in either of the following two ways:
1. The platform server may set the minimum support rate a as the feature weight of the non-sequence pattern.
In this case, the feature weight of the non-sequence pattern may be lower than the feature weight of the sequence pattern.
2. The platform server may calculate the feature weight of the non-sequence pattern by dividing the number of samples in which the feature occurs by the total number of samples, namely
feature weight = (number of samples in which the feature occurs) / (total number of samples)
In this case as well, the feature weight of the non-sequence pattern may be lower than the feature weight of the sequence pattern.
Optionally, when the platform server has obtained both the sequence pattern feature weight and the non-sequence pattern feature weight of a user, the feature weights may be combined by weighting to determine the sample weight.
For example, when the features of user Xiao Hong are "AaBcAort", the sequence pattern feature is "AaBcA" with a sequence pattern feature weight of 0.56, and the non-sequence pattern feature is "ort" with a non-sequence pattern feature weight of 0.5, the sample weight of the user can be calculated as: (0.56×5+0.5×2)/(5+2)=0.54.
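The weighted combination in this example is a count-weighted mean, which can be sketched as follows. The function name combined_sample_weight is hypothetical, and the counts 5 and 2 are taken directly from the formula in the example rather than derived from the feature strings.

```python
from typing import List, Tuple


def combined_sample_weight(parts: List[Tuple[float, int]]) -> float:
    """Count-weighted mean of feature weights; parts holds (feature_weight, count) pairs."""
    total = sum(count for _, count in parts)
    return sum(weight * count for weight, count in parts) / total


# (sequence pattern weight, count) and (non-sequence pattern weight, count).
xiao_hong_weight = combined_sample_weight([(0.56, 5), (0.5, 2)])
```

The result is 3.8/7, which rounds to 0.54 as stated in the example.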
For example, the sample weights of the respective user samples can be calculated according to the above method, and some of the sample weights are shown in Table 11 below.
[Table image not reproduced]
TABLE 11
In step 505, the platform server obtains the number of sample users having the longest common sequence pattern among the sample users and the total number of sample users.
In the embodiment of the disclosure, the platform server may obtain the number of sample users in the user sample library with the longest common sequence pattern and the total number of sample users in the user sample library through a sequence pattern mining algorithm.
In step 506, the platform server determines the common sequence mode support of the current user according to the number of sample users with the longest common sequence mode and the total number of sample users, and uses the common sequence mode support as the weight corresponding to the current user.
Optionally, the sequence pattern support may be calculated as the ratio of the number of sample users having the longest common sequence pattern to the total number of sample users.
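This ratio can be sketched in Python as follows. The sketch assumes that a sample user "has" a pattern when the pattern is an ordered, not necessarily contiguous, subsequence of that user's sequence; the function names and the four sample sequences are hypothetical.

```python
from typing import List


def contains_pattern(sequence: str, pattern: str) -> bool:
    """True if pattern is an ordered (not necessarily contiguous) subsequence of sequence."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the matched item.
    return all(item in remaining for item in pattern)


def pattern_support(pattern: str, sample_sequences: List[str]) -> float:
    """Share of sample users whose sequence contains the pattern."""
    hits = sum(1 for sequence in sample_sequences if contains_pattern(sequence, pattern))
    return hits / len(sample_sequences)


# Hypothetical sample library of four behavior sequences.
current_user_weight = pattern_support("bcagh", ["bcafgh", "bcdaghf", "bca", "bcgh"])
```

Two of the four hypothetical sample users contain bcagh, so the support, and therefore the weight of the current user in this sketch, is 0.5.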
In step 507, the platform server clusters the current user and each sample user according to the weight corresponding to the current user, the combined feature of the current user, and the combined feature of each sample user, and obtains the category to which the current user belongs.
In the embodiment of the disclosure, the platform server may obtain the category of the current user in either of two ways: by performing weighted clustering on the sample users and then determining the category to which the current user belongs, or by performing weighted clustering on the sample users and the current user together and then determining the category to which the current user belongs.
Optionally, when the platform server performs weighted clustering on the sample users and then obtains the category to which the current user belongs, the platform server may determine, through a weighted clustering algorithm, a cluster center of each network content corresponding to each sample user according to the sample weight of each sample user and the combination characteristic of each sample user, then determine, according to the combination characteristic of the current user and the weight corresponding to the current user, a distance between the current user and each cluster center, and finally obtain the category to which the current user belongs.
Optionally, when the sample user and the current user are weighted and clustered together and then the category to which the current user belongs is obtained, the platform server may determine, through a weighted clustering algorithm, the sample user belonging to the same category as the current user according to the sample weights of the respective sample users, the combined features of the respective sample users, the weights corresponding to the current user, and the combined features of the current user.
A traditional clustering algorithm may be a partition-based clustering algorithm, in which all samples are treated equally during the clustering calculation.
Optionally, traditional clustering algorithms may include the k-means clustering algorithm or the expectation maximization (EM) algorithm.
Without considering sample weights, the k-means clustering algorithm finishes clustering when the criterion function converges. The criterion function is:
$$J = \sum_{i=1}^{k} \sum_{j=1}^{m_i} \mathrm{sim}\left(d_j^{(i)}, c_i\right)$$
where J denotes the degree of aggregation, which can be used to measure the clustering effect; k denotes the total number of class clusters; $m_i$ is the total number of members in class cluster i; $d_j^{(i)}$ is the j-th member of class cluster i; and $c_i$ is the center vector of class cluster i. The center vector is calculated as:
$$c_i = \frac{1}{m_i} \sum_{j=1}^{m_i} d_j^{(i)}$$
where $\mathrm{sim}\left(d_j^{(i)}, c_i\right)$ is the similarity between text $d_j^{(i)}$ and the class cluster center $c_i$.
Optionally, the similarity may be calculated using the cosine of the angle between the vectors.
When sample weights are considered, the sample weighted clustering algorithm clusters the weighted samples according to the following criterion function:
[Formula image not reproduced]
where the sample-weighted class center vector is calculated as follows:
[Formula image not reproduced]
where w_j is the weight of a clustered sample, which may satisfy:
[Formula image not reproduced]
that is,
[Formula image not reproduced]
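Because the weighted formulas are available only as images, the following Python sketch shows one plausible form of sample-weighted clustering and should not be read as the formulas of this disclosure. It assumes that the class center is the weight-normalized mean of the members, that similarity is the cosine of the angle between vectors, and that the criterion sums each member's weight times its similarity to its center. All function names, feature vectors, and weights are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_center(members: List[Sequence[float]], weights: List[float]) -> List[float]:
    """Assumed form: class center = sum_j(w_j * d_j) / sum_j(w_j)."""
    total = sum(weights)
    return [sum(w * m[d] for m, w in zip(members, weights)) / total
            for d in range(len(members[0]))]


def weighted_criterion(clusters, cluster_weights, centers) -> float:
    """Assumed form: J = sum_i sum_j w_j * sim(d_j, c_i)."""
    return sum(w * cosine(x, centers[i])
               for i, (members, ws) in enumerate(zip(clusters, cluster_weights))
               for x, w in zip(members, ws))


def nearest_cluster(x: Sequence[float], centers: List[Sequence[float]]) -> int:
    """Index of the class center most similar to x."""
    return max(range(len(centers)), key=lambda i: cosine(x, centers[i]))


# Hypothetical combined-feature vectors and sample weights for two clusters.
clusters = [[[1.0, 0.0], [0.8, 0.2]], [[0.0, 1.0]]]
cluster_weights = [[0.63, 0.54], [0.7]]
centers = [weighted_center(m, w) for m, w in zip(clusters, cluster_weights)]
criterion = weighted_criterion(clusters, cluster_weights, centers)
current_user_category = nearest_cluster([0.9, 0.1], centers)
```

Cosine similarity is unaffected by scaling a single vector, so in this sketch the sample weights influence the centers and the criterion but not the final nearest-center assignment of the current user.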
In step 508, the platform server pushes the network content to the current user according to the category to which the current user belongs.
In the embodiment of the disclosure, the platform server may obtain the network content corresponding to the category with the shortest cluster center distance, or obtain the network content with the highest conversion rate as the recommended network content.
Optionally, the platform server may obtain the network content corresponding to the cluster center at the minimum distance from the current user, and then push that network content to the terminal of the current user. Alternatively, the platform server may obtain the pushed ratio of each network content corresponding to the sample users that belong to the same category as the current user, and then push the target network content to the terminal of the current user.
The pushed ratio of a network content is the ratio of the number of times that content has been pushed to the total number of times all network content has been pushed to the sample users in the same category; the target network content is the network content with the highest pushed ratio among the network content corresponding to the sample users.
For example, in an e-commerce platform scenario, suppose the platform server determines that the sample users in the same category as the current user are user A and user B, that item a has been pushed to user A 1 time and item b 4 times, and that item a has been pushed to user B 2 times, item b 1 time, and item c 2 times. The pushed ratio of item a is then 0.3, that of item b is 0.5, and that of item c is 0.2, so item b, which has the highest pushed ratio, is the target network content.
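The pushed ratio calculation can be checked with a short Python sketch. The function name pushed_ratios and the nested dictionary layout (user to item to push count) are assumptions made for illustration; the counts are those of the example.

```python
from collections import Counter
from typing import Dict


def pushed_ratios(push_counts_by_user: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Share of all pushes in the category that went to each item."""
    totals: Counter = Counter()
    for counts in push_counts_by_user.values():
        totals.update(counts)  # adds the per-user counts item by item
    grand_total = sum(totals.values())
    return {item: count / grand_total for item, count in totals.items()}


ratios = pushed_ratios({
    "A": {"a": 1, "b": 4},
    "B": {"a": 2, "b": 1, "c": 2},
})
target_item = max(ratios, key=ratios.get)
```

With these counts there are 10 pushes in total: 3 for item a, 5 for item b, and 2 for item c, so item b has the largest pushed ratio (0.5) and is selected as the target.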
The platform server can acquire the recommended content in either of the following two ways:
1. The platform server can obtain a cluster center for each content preference category from the features of historically converted sample users, where the content preference category of a category is the content with the highest conversion rate in that category. For the current user, after the current user's features are weighted according to the sequence pattern, the distance between the user and the center of each content preference category is calculated (for example, the cosine distance), which yields the nearest content preference category as the category to which the current user belongs, and the platform server recommends the content of that category to the current user.
2. The platform server can perform weighted clustering on the current user together with the sample users, determine the share of conversions accounted for by each content within the category to which the current user belongs, and recommend to the current user the content with the highest conversion rate in that category.
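The first way can be sketched as a nearest-center lookup. This is a minimal sketch under stated assumptions: the cluster centers are already available, the feature vector passed in has already been weighted, and the labels `content X` and `content Y` and all numbers are made up for illustration. The passage does not say how the sequence-pattern weight enters the feature vector; note that multiplying the whole vector by one positive scalar would leave the cosine distance unchanged.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def nearest_category(user_features, centers):
    """Return the label of the cluster center closest to the user."""
    return min(centers, key=lambda label: cosine_distance(user_features, centers[label]))


# Hypothetical cluster centers, one per content preference category.
centers = {
    "content X": [1.0, 0.0, 0.2],
    "content Y": [0.1, 1.0, 0.9],
}
current_user = [0.2, 0.8, 1.0]  # assumed to be already weighted
category = nearest_category(current_user, centers)
```

The content associated with the returned category would then be the candidate to push to the current user's terminal.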
Optionally, the platform server may send the recommended network content to the terminal of the current user for display.
In summary, in the network content pushing method provided in the embodiments of the present disclosure, the server first obtains the combined features and sequence features of the current user and combines the sequence features of the current user with those of each sample user to obtain the weight corresponding to the current user. It then clusters the current user and the sample users according to that weight, the combined features of the current user, and the combined features of each sample user, to obtain the category to which the current user belongs. Finally, it pushes network content to the current user according to that category. With this scheme, the cloud server can use cloud computing to mine the behavior sequences of the current user and the sample users so as to determine the weight of the current user, cluster the current user based on that weight, and make personalized recommendations of relevant network content according to the clustering result, which improves the efficiency of predicting the network content to recommend while keeping content recommendation accurate.
Fig. 6 is a block diagram illustrating a structure of a web content pushing apparatus according to an exemplary embodiment. The network content pushing device may be implemented as all or part of a server by means of hardware or a combination of hardware and software, so as to execute all or part of the steps of the method shown in the corresponding embodiment of fig. 4 or fig. 5. The network content pushing apparatus may include:
a feature obtaining module 610, configured to obtain combined features and sequence features of a current user, where the combined features include user features of the corresponding user and item features of the corresponding user, and the sequence features indicate network behavior features executed in sequence by the corresponding user;
the weight obtaining module 620 is configured to combine the sequence features of the current user and the sequence features of each sample user to obtain a weight corresponding to the current user;
the category obtaining module 630 is configured to cluster the current user and each sample user according to the weight corresponding to the current user, the combination feature of the current user, and the combination feature of each sample user, so as to obtain a category to which the current user belongs;
and the content pushing module 640 is configured to push network content to the current user according to the category to which the current user belongs.
In one possible implementation, the weight obtaining module 620 includes:
a common pattern determining sub-module, configured to determine, through a sequence pattern mining algorithm, a common sequence pattern of each sample user and a longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user;
a number obtaining sub-module, configured to obtain the number of sample users having the longest common sequence pattern among the sample users, and the total number of sample users;
and a weight determining sub-module, configured to determine, according to the number of sample users having the longest common sequence pattern and the total number of sample users, a common sequence pattern support degree of the current user as the weight corresponding to the current user.
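The support-degree weight described by these sub-modules can be sketched as follows. The sketch assumes that the common sequence patterns have already been mined by some sequence pattern mining algorithm, and that a user "has" a pattern when the pattern's steps occur in the user's behavior sequence in the same order, with gaps allowed. The function names and the behavior labels are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the pattern's steps occur in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(step in remaining for step in pattern)


def longest_common_pattern(common_patterns, current_sequence):
    """Longest mined pattern that the current user's sequence contains, or None."""
    matches = [p for p in common_patterns if is_subsequence(p, current_sequence)]
    return max(matches, key=len, default=None)


def support_weight(pattern, sample_sequences):
    """Share of sample users whose sequence contains the pattern."""
    if pattern is None or not sample_sequences:
        return 0.0
    hits = sum(is_subsequence(pattern, s) for s in sample_sequences)
    return hits / len(sample_sequences)


# Hypothetical behavior sequences of four sample users.
samples = [
    ["search", "view", "cart", "pay"],
    ["view", "cart", "pay"],
    ["search", "view", "favorite"],
    ["view", "pay"],
]
# Patterns assumed to come from a prior mining step.
common_patterns = [("view",), ("view", "pay"), ("view", "cart", "pay")]
current = ["search", "view", "cart", "share", "pay"]

pattern = longest_common_pattern(common_patterns, current)
weight = support_weight(pattern, samples)
```

Here the current user matches the three-step pattern, which two of the four sample users also contain, so the weight is the ratio of those two counts.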
In one possible implementation, the weight obtaining module 620 further includes:
a sample weight determining sub-module, configured to determine the sample weight of each sample user according to the common sequence pattern of the sample users, where the sequence pattern includes at least one of a behavior sequence pattern and a browsing sequence pattern.
In one possible implementation, the sample weight determination submodule includes:
a weight determining unit, configured to determine the weight of a field type according to the frequency with which the field type occurs in the common sequence pattern;
a sample weight determining unit, configured to average the weights of at least one field type included in the common sequence pattern of a sample user, to determine the sample weight of that sample user.
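One possible reading of these two units is sketched below. It assumes that a field type's weight is its relative frequency across all common sequence patterns, and that a sample user's weight is the plain average of the weights of the field types in that user's pattern. Both the normalization and the names used are assumptions for illustration; the text does not fix them.

```python
from collections import Counter


def field_type_weights(common_patterns):
    """Weight of each field type = its share of all occurrences in the patterns."""
    counts = Counter(field for pattern in common_patterns for field in pattern)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {field: n / total for field, n in counts.items()}


def sample_weight(user_pattern, weights):
    """Average the weights of the field types in one sample user's pattern."""
    if not user_pattern:
        return 0.0
    return sum(weights.get(f, 0.0) for f in user_pattern) / len(user_pattern)


# Hypothetical common sequence patterns built from behavior field types.
common_patterns = [("view", "cart", "pay"), ("view", "pay"), ("view", "favorite")]
weights = field_type_weights(common_patterns)
w = sample_weight(("view", "pay"), weights)
```

With these patterns, "view" occurs 3 times out of 7 and "pay" 2 times out of 7, so a sample user whose pattern is ("view", "pay") gets the average of 3/7 and 2/7.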
In one possible implementation, the category obtaining module 630 includes:
the clustering center determining sub-module is used for determining the clustering center of each network content corresponding to each sample user according to the sample weight of each sample user and the combination characteristics of each sample user through a weighted clustering algorithm;
the distance determining submodule is used for determining the distance between the current user and each clustering center according to the combination characteristics of the current user and the weight corresponding to the current user;
and the first category acquisition sub-module is used for acquiring the category of the current user according to the distance between the current user and each clustering center.
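The cluster-center step can be illustrated with a weighted mean. This sketch assumes that each sample user is already labeled with the network content it converted on and that the center of each content is the sample-weighted average of the combined features; the actual weighted clustering algorithm may differ. The function names, labels, and numbers are hypothetical.

```python
def weighted_center(features, weights):
    """Weighted mean of feature vectors: sum(w_i * x_i) / sum(w_i)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("sample weights must not sum to zero")
    dims = len(features[0])
    return [
        sum(w * x[d] for x, w in zip(features, weights)) / total
        for d in range(dims)
    ]


def weighted_centers(samples):
    """samples: list of (content_label, combined_features, sample_weight)."""
    grouped = {}
    for label, feats, w in samples:
        feature_list, weight_list = grouped.setdefault(label, ([], []))
        feature_list.append(feats)
        weight_list.append(w)
    return {label: weighted_center(f, w) for label, (f, w) in grouped.items()}


# Hypothetical sample users with combined features and sample weights.
samples = [
    ("content X", [1.0, 0.0], 0.75),
    ("content X", [0.0, 1.0], 0.25),
    ("content Y", [2.0, 2.0], 0.5),
]
centers = weighted_centers(samples)
```

A sample user with a larger weight pulls the center of its content closer to its own features; the distance from the current user to each center then decides the category.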
In one possible implementation, the category obtaining module 630 includes:
a sample user determining sub-module, configured to determine, through a weighted clustering algorithm, the sample users belonging to the same category as the current user according to the sample weight of each sample user, the combined features of each sample user, the weight corresponding to the current user, and the combined features of the current user;
and a second category obtaining sub-module, configured to take the category of the sample users belonging to the same category as the current user as the category to which the current user belongs.
In one possible implementation, the content pushing module 640 includes:
the first content acquisition sub-module is used for acquiring the network content corresponding to the minimum value of the distance between the current user and each clustering center;
and the first content pushing sub-module is used for pushing the network content to the terminal of the current user.
In one possible implementation, the content pushing module 640 includes:
a second content obtaining sub-module, configured to obtain, for the sample users belonging to the same category as the current user, the pushed ratio of each network content corresponding to those sample users, where the pushed ratio is the number of times the corresponding network content was pushed divided by the total number of times network content was pushed to all the sample users in the same category;
and a second content pushing sub-module, configured to push target network content to the terminal of the current user, where the target network content is the network content with the highest pushed ratio among the network content corresponding to those sample users.
In one possible implementation manner, the feature obtaining module 610 includes:
a user feature obtaining sub-module, configured to obtain the user data of the current user and generate the user features of the current user;
an item feature obtaining sub-module, configured to obtain the item data of the current user and generate the item features of the current user;
and a combined feature generating sub-module, configured to perform feature processing on the user features and the item features to generate the combined features.
In one possible implementation, the apparatus further includes:
a sample library construction module, configured to obtain converted users as sample users and construct a user sample library, where a converted user is a user for whom an actual conversion has occurred.
In one possible implementation, the apparatus further includes:
a sample feature obtaining module, configured to obtain the sample sequence features of each sample user before the weight corresponding to the current user is obtained by combining the sequence features of the current user and the sequence features of each sample user;
and a sample sequence pattern determining module, configured to determine the sample sequence pattern of each sample user according to the sample sequence features.
Fig. 7 is a schematic diagram of a computer device, according to an example embodiment. The computer apparatus 700 includes a central processing unit (Central Processing Unit, CPU) 701, a system Memory 704 including a random access Memory (Random Access Memory, RAM) 702 and a Read-Only Memory (ROM) 703, and a system bus 705 connecting the system Memory 704 and the central processing unit 701. The computer device 700 also includes a basic Input/Output system (I/O) 706, which helps to transfer information between various devices within the computer device, and a mass storage device 707 for storing an operating system 713, application programs 714, and other program modules 715.
The basic input/output system 706 includes a display 708 for displaying information and an input device 709, such as a mouse, keyboard, or the like, for a user to input information. Wherein the display 708 and the input device 709 are coupled to the central processing unit 701 through an input output controller 710 coupled to a system bus 705. The basic input/output system 706 may also include an input/output controller 710 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input output controller 710 also provides output to a display screen, a printer, or other type of output device.
The mass storage device 707 is connected to the central processing unit 701 through a mass storage controller (not shown) connected to the system bus 705. The mass storage device 707 and its associated computer device readable media provide non-volatile storage for the computer device 700. That is, the mass storage device 707 may include a computer device readable medium (not shown) such as a hard disk or a Compact Disc Read-Only Memory (CD-ROM) drive.
The computer device readable medium may include computer device storage media and communication media without loss of generality. Computer device storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer device readable instructions, data structures, program modules or other data. Computer device storage media includes RAM, ROM, erasable programmable read-Only Memory (Erasable Programmable Read Only Memory, EPROM), electrically erasable programmable read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), CD-ROM, digital video disk (Digital Video Disc, DVD), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will recognize that the computer device storage medium is not limited to the ones described above. The system memory 704 and mass storage device 707 described above may be collectively referred to as memory.
According to various embodiments of the present disclosure, the computer device 700 may also operate by means of a remote computer device connected to it through a network such as the Internet. That is, the computer device 700 may be connected to the network 712 through a network interface unit 711 coupled to the system bus 705; in other words, the network interface unit 711 may also be used to connect to other types of networks or remote computer device systems (not shown).
The memory further includes one or more programs stored in the memory, and the central processor 701 implements all or part of the steps of the method shown in fig. 4 or 5 by executing the one or more programs.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described by the embodiments of the present disclosure may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on, or transmitted as one or more instructions or code over, a computer device readable medium. Computer device readable media include both computer device storage media and communication media, the latter including any medium that facilitates transfer of a computer device program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer device.
The embodiment of the disclosure also provides a computer device storage medium for storing computer device software instructions for the testing device, which contains a program designed for executing the network content pushing method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (13)

1. A method for pushing web content, the method comprising:
acquiring combined features and sequence features of a current user, wherein the combined features comprise user features of the corresponding user and item features of the corresponding user, and the sequence features are used for indicating network behavior features executed in sequence by the corresponding user;
determining, through a sequence pattern mining algorithm, a common sequence pattern of each sample user and a longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user;
acquiring the number of sample users having the longest common sequence pattern among the sample users, and the total number of the sample users;
determining, according to the number of sample users having the longest common sequence pattern and the total number of the sample users, a common sequence pattern support degree of the current user as the weight corresponding to the current user;
clustering the current user and each sample user according to the weight corresponding to the current user, the combined features of the current user, and the combined features of each sample user, to obtain the category to which the current user belongs;
and pushing network content to the current user according to the category to which the current user belongs.
2. The method of claim 1, wherein before acquiring the number of sample users having the longest common sequence pattern among the sample users and the total number of the sample users, the method further comprises:
determining sample weights of the sample users according to the common sequence patterns of the sample users, wherein the sequence pattern comprises at least one of a behavior sequence pattern and a browsing sequence pattern.
3. The method of claim 2, wherein determining the sample weights of the sample users according to the common sequence patterns of the sample users comprises:
determining the weight of a field type according to the frequency with which the field type occurs in the common sequence pattern;
and averaging the weights of at least one field type contained in the common sequence pattern of a sample user to determine the sample weight of that sample user.
4. The method according to claim 1, wherein the clustering the current user and the respective sample users according to the weight corresponding to the current user, the combined feature of the current user, and the combined feature of the respective sample users, to obtain the category to which the current user belongs, includes:
determining a clustering center of each network content corresponding to each sample user according to the sample weight of each sample user and the combination characteristics of each sample user through a weighted clustering algorithm;
determining the distance between the current user and each cluster center according to the combined features of the current user and the weight corresponding to the current user;
and obtaining the category of the current user according to the distance between the current user and each clustering center.
5. The method according to claim 1, wherein the clustering the current user and the respective sample users according to the weight corresponding to the current user, the combined feature of the current user, and the combined feature of the respective sample users, to obtain the category to which the current user belongs, includes:
determining the sample users belonging to the same category with the current user according to the sample weights of the sample users, the combined characteristics of the sample users, the weights corresponding to the current user and the combined characteristics of the current user through a weighted clustering algorithm;
and taking the category of the sample users belonging to the same category as the current user as the category to which the current user belongs.
6. The method of claim 4, wherein pushing network content to the current user according to the category to which the current user belongs comprises:
acquiring the network content corresponding to the minimum of the distances between the current user and the cluster centers;
and pushing the network content to the terminal of the current user.
7. The method according to claim 4 or 5, wherein pushing network content to the current user according to the category to which the current user belongs comprises:
acquiring, for the sample users belonging to the same category as the current user, the pushed ratio of each network content corresponding to those sample users, wherein the pushed ratio is the number of times the corresponding network content was pushed divided by the total number of times network content was pushed to all the sample users in the same category;
and pushing target network content to the terminal of the current user, wherein the target network content is the network content with the highest pushed ratio among the network content corresponding to those sample users.
8. The method of claim 1, wherein acquiring the combined features and the sequence features of the current user, the combined features comprising user features of the corresponding user and item features of the corresponding user, and the sequence features being used for indicating network behavior features executed in sequence by the corresponding user, comprises:
acquiring user data of the current user and generating the user features of the current user;
acquiring item data of the current user and generating the item features of the current user;
and performing feature processing on the user features and the item features to generate the combined features.
9. The method according to claim 1, wherein the method further comprises:
acquiring converted users as sample users and constructing a user sample library, wherein a converted user is a user for whom an actual conversion has occurred.
10. The method according to claim 1, wherein before determining, through the sequence pattern mining algorithm, the common sequence pattern of each sample user and the longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user, the method further comprises:
acquiring sample sequence features of each sample user;
and determining the sample sequence pattern of each sample user according to the sample sequence features.
11. A network content pushing apparatus, the apparatus comprising:
a feature acquisition module, configured to acquire combined features and sequence features of a current user, wherein the combined features comprise user features of the corresponding user and item features of the corresponding user, and the sequence features are used for indicating network behavior features executed in sequence by the corresponding user;
a weight acquisition module, configured to determine, through a sequence pattern mining algorithm, a common sequence pattern of each sample user and a longest common sequence pattern of the current user according to the sequence features of the current user and the sequence features of each sample user; acquire the number of sample users having the longest common sequence pattern among the sample users, and the total number of the sample users; and determine, according to the number of sample users having the longest common sequence pattern and the total number of the sample users, a common sequence pattern support degree of the current user as the weight corresponding to the current user;
a category acquisition module, configured to cluster the current user and each sample user according to the weight corresponding to the current user, the combined features of the current user, and the combined features of each sample user, to obtain the category to which the current user belongs;
and the content pushing module is used for pushing the network content to the current user according to the category to which the current user belongs.
12. A computer device comprising a processor and a memory, wherein the memory has stored therein at least one program that is loaded and executed by the processor to implement the network content pushing method of any of claims 1 to 10.
13. A computer readable storage medium, wherein at least one program is stored in the storage medium, and the at least one program is loaded and executed by a processor to implement the network content pushing method according to any one of claims 1 to 10.
CN202010247149.5A 2020-03-31 2020-03-31 Network content pushing method, device and storage medium Active CN111460300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010247149.5A CN111460300B (en) 2020-03-31 2020-03-31 Network content pushing method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010247149.5A CN111460300B (en) 2020-03-31 2020-03-31 Network content pushing method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111460300A CN111460300A (en) 2020-07-28
CN111460300B true CN111460300B (en) 2023-04-25

Family

ID=71682409

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010247149.5A Active CN111460300B (en) 2020-03-31 2020-03-31 Network content pushing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111460300B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112000863B (en) * 2020-08-14 2024-04-09 北京百度网讯科技有限公司 Analysis method, device, equipment and medium of user behavior data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106021305A (en) * 2016-05-05 2016-10-12 北京邮电大学 Mode and preference sensing POI recommendation method and system
CN106778876A (en) * 2016-12-21 2017-05-31 广州杰赛科技股份有限公司 User classification method and system based on mobile subscriber track similitude
CN108076154A (en) * 2017-12-21 2018-05-25 广东欧珀移动通信有限公司 Application message recommends method, apparatus and storage medium and server
CN110827044A (en) * 2018-08-07 2020-02-21 北京京东尚科信息技术有限公司 Method and device for extracting user interest mode

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9087332B2 (en) * 2010-08-30 2015-07-21 Yahoo! Inc. Adaptive targeting for finding look-alike users

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
regularizing matrix factorization with user and item embeddings for recommendation;Thanh Tran等;《proceedings of the 27th ACM international conference on information and knowledge》;687-696 *
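Recommendation algorithm fusing item embedding representation and attention mechanism (融合项目嵌入表征与注意力机制的推荐算法); 都奕冰 et al.; 《计算机工程与设计》 (Computer Engineering and Design); Vol. 40 (No. 3); 682-688 *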

Also Published As

Publication number Publication date
CN111460300A (en) 2020-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40026286; Country of ref document: HK)
GR01 Patent grant