CN115470407A - Training content recommendation method, device, equipment and medium - Google Patents

Publication number
CN115470407A
CN115470407A CN202211110364.6A CN202211110364A CN115470407A CN 115470407 A CN115470407 A CN 115470407A CN 202211110364 A CN202211110364 A CN 202211110364A CN 115470407 A CN115470407 A CN 115470407A
Authority
CN
China
Prior art keywords
content
user
training
historical behavior
training content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211110364.6A
Other languages
Chinese (zh)
Inventor
张振强
薛飞
王俐
刘水泉
魏聪惠
王怡冰
叶敏
连维淞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Application filed by China Construction Bank Corp and CCB Finetech Co Ltd
Priority to CN202211110364.6A
Publication of CN115470407A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance
    • G06Q50/2057 Career enhancement or continuing education service

Abstract

The application discloses a training content recommendation method, device, equipment and medium. The method comprises the following steps: acquiring historical behavior data of M first users; sorting the historical behavior data of each first user in chronological order to obtain M historical behavior sequences; acquiring, from a first content set and based on a target recall algorithm, training content matching each historical behavior sequence, to obtain M second content sets associated with the M first users; inputting the input features of each first user and the second content set associated with that first user into a target ranking model to obtain the click probability of each first user on each training content in the associated second content set; and outputting the content to be recommended for each first user based on the click probability of that first user on each training content in the associated second content set. The method and device can improve the fit between training content and employees and thereby improve training quality.

Description

Training content recommendation method, device, equipment and medium
Technical Field
The application belongs to the technical field of computers, and particularly relates to a training content recommendation method, a device, equipment and a medium thereof.
Background
Enterprise training refers to planned, systematic training activities that an enterprise carries out to improve employees' abilities. At present, enterprises usually train employees in a uniform way; for example, when training employees in data analysis posts, the training content consists of the same books or video courses for everyone.
In the related art, this conventional training mode gives all employees roughly the same training content, so the content often fits individual employees poorly and fails to interest them. As a result, employees' enthusiasm for learning is not mobilized and training quality is low.
Disclosure of Invention
The embodiments of the present application provide a training content recommendation method, device, equipment and medium, which can solve the problem that training content fits employees poorly and fails to interest them.
In a first aspect, an embodiment of the present application provides a method for recommending training content, where the method includes: acquiring historical behavior data of M first users; sequencing the historical behavior data of each first user according to the time sequence to obtain M historical behavior sequences; based on a target recall algorithm, training content matched with each historical behavior sequence is obtained from the first content set, and M second content sets associated with M first users are obtained; inputting the input characteristics of each first user and a second content set associated with the first user into the target ranking model to obtain the click probability of each first user on each training content in the associated second content set, wherein the input characteristics comprise at least one of business characteristics, project characteristics and skill characteristics; and outputting the content to be recommended matched with each first user based on the click probability of each first user on each training content in the associated second content set.
In some implementations of the first aspect, obtaining historical behavior data for a plurality of first users includes: acquiring historical behavior data of a plurality of first users based on target data buried points; the target data buried point comprises a user identifier, a historical behavior type, a stay time, training contents, training content categories and event time, and the historical behavior data comprises historical exposure behavior data and historical click behavior data.
In some implementations of the first aspect, the target recall algorithm includes at least one of a word vector recall algorithm, a collaborative filtering recall algorithm, and a tag recall algorithm, and the obtaining training content matching each historical behavior sequence from the first content set based on the target recall algorithm includes at least one of: acquiring training content matched with each historical behavior sequence from the first content set based on a word vector recall algorithm and a time attenuation weight; acquiring training content matched with each historical behavior sequence from the first content set based on a collaborative filtering recall algorithm and a user activity attenuation weight; training content matched with each historical behavior sequence is obtained from the first content set based on the label recall algorithm, the time attenuation weight and the frequency attenuation weight.
In some implementations of the first aspect, the method further comprises: acquiring historical behavior data and input features of a plurality of second users; constructing training sample data based on the historical behavior data and input features of the plurality of second users; and training a logistic regression model on the training sample data to obtain the target ranking model.
In some implementations of the first aspect, outputting the content to be recommended that matches each first user based on a click probability of each first user on each training content in the associated second content set includes: outputting contents to be recommended matched with each first user according to the sequence of the click probability from high to low; the content to be recommended comprises all training content in a second content set associated with the first user, or the content to be recommended comprises target training content in the second content set associated with the first user, and the click probability of the target training content is larger than a preset probability threshold.
In some implementations of the first aspect, the method further comprises: acquiring input characteristics of a third user; determining a first user with input feature similarity greater than a preset similarity threshold with a third user as a fourth user; and recommending the contents to be recommended, which are matched with the fourth user, for the third user.
In a second aspect, an embodiment of the present application provides a training content recommendation device, including: the acquisition module is used for acquiring historical behavior data of M first users; the sorting module is used for sorting the historical behavior data of each first user according to the time sequence to obtain M historical behavior sequences corresponding to M first users; the acquisition module is further used for acquiring training content matched with each historical behavior sequence from the first content set based on a target recall algorithm to obtain M second content sets associated with M first users; the input module is used for inputting the input characteristics of each first user and a second content set associated with the first user into the target ranking model to obtain the click probability of each first user on each training content in the associated second content set, wherein the input characteristics comprise at least one of business characteristics, project characteristics and skill characteristics; and the output module is used for outputting the contents to be recommended matched with each first user based on the click probability of each first user on each training content in the second content set.
In some implementations of the second aspect, the acquisition module is specifically configured to: acquire historical behavior data of a plurality of first users based on target data buried points; the target data buried points comprise user identification, historical behavior types, stay time, training contents, training content categories and event time, and the historical behavior data comprise historical exposure behavior data and historical click behavior data.
In some implementations of the second aspect, the target recall algorithm includes at least one of a word vector recall algorithm, a collaborative filtering recall algorithm, and a tag recall algorithm, and the acquisition module is specifically configured to perform at least one of the following: acquiring training content matched with each historical behavior sequence from the first content set based on the word vector recall algorithm and a time attenuation weight; acquiring training content matched with each historical behavior sequence from the first content set based on the collaborative filtering recall algorithm and a user activity attenuation weight; acquiring training content matched with each historical behavior sequence from the first content set based on the tag recall algorithm, the time attenuation weight and a frequency attenuation weight.
In some implementations of the second aspect, the apparatus further comprises: the acquisition module is also used for acquiring historical behavior data and input characteristics of a plurality of second users; the building module is used for building training sample data based on historical behavior data and input features of a plurality of second users; and the model training module is used for training the logistic regression model based on the training sample data to obtain a target ranking model.
In some implementations of the second aspect, the output module is specifically configured to: outputting contents to be recommended matched with each first user according to the sequence of the click probability from high to low; the content to be recommended comprises all training content in a second content set associated with the first user, or the content to be recommended comprises target training content in the second content set associated with the first user, and the click probability of the target training content is larger than a preset probability threshold.
In some implementations of the second aspect, the apparatus further comprises: the acquisition module, which is also used for acquiring the input characteristics of a third user; a determining module, which is used for determining, as a fourth user, a first user whose input feature similarity with the third user is greater than a preset similarity threshold; and a recommending module, which is used for recommending to the third user the content to be recommended that is matched with the fourth user.
In a third aspect, an embodiment of the present application provides an electronic device, where the electronic device includes: a processor and a memory storing computer program instructions; the steps of the training content recommendation method as shown in any of the embodiments of the first aspect are implemented when the processor executes the computer program instructions.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, implement the steps of the training content recommendation method as shown in any one of the embodiments of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product, which is stored on a non-volatile storage medium and executed by at least one processor to implement the steps of the training content recommendation method as shown in any one of the embodiments of the first aspect.
According to the training content recommendation method, device, equipment, medium and product of the embodiments of the present application, after the historical behavior data of the M first users is obtained, the historical behavior data of each first user is sorted in chronological order to obtain M historical behavior sequences, each of which reflects the habits and preferences of the corresponding user. On this basis, training content matching each historical behavior sequence is obtained from the initial first content set based on a target recall algorithm, yielding M second content sets associated with the M first users. In this way, training content with a high degree of fit is screened out for each first user from an initial content set containing a large amount of training content, and a second content set is constructed for each first user. Then, by inputting at least one of the business feature, the project feature and the skill feature of each first user, together with the second content set associated with that first user, into the target ranking model, the click probability of each first user on each training content in the associated second content set can be predicted. The click probability represents how well the training content fits the first user and how interested the first user is in it, so the content to be recommended is output for each first user based on the click probability of each training content in the second content set. Each user thus receives individually tailored recommendations, which helps tap employees' potential, mobilizes their enthusiasm for learning, improves the fit between training content and employees, and realizes personalized and precise recommendation of training content in line with employees' interests.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below, and for those skilled in the art, other drawings may be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart diagram illustrating a training content recommendation method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram illustrating a training content recommendation method according to another embodiment of the present application;
FIG. 3 is a schematic structural diagram of a training content recommendation device according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are intended to be illustrative only and are not intended to be limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It should be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising …" does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
Enterprise training refers to planned, systematic training activities that an enterprise carries out to improve employees' abilities. At present, enterprises usually train employees in a uniform way; for example, when training employees in data analysis posts, the training content consists of the same books or video courses for everyone. In the related art, this conventional training mode gives all employees roughly the same training content, so the content often fits individual employees poorly and fails to interest them. As a result, employees' enthusiasm for learning is not mobilized and training quality is low.
To address these problems in the related art, the embodiments of the present application provide a training content recommendation method. After the historical behavior data of the M first users is obtained, the historical behavior data of each first user is sorted in chronological order to obtain M historical behavior sequences, each of which reflects the habits and preferences of the corresponding user. On this basis, training content matching each historical behavior sequence is obtained from the initial first content set based on a target recall algorithm, yielding M second content sets associated with the M first users. In this way, training content with a high degree of fit is screened out for each first user from an initial content set containing a large amount of training content, and a second content set is constructed for each first user. Then, by inputting at least one of the business feature, the project feature and the skill feature of each first user, together with the second content set associated with that first user, into the target ranking model, the click probability of each first user on each training content in the associated second content set can be predicted. The click probability represents how well the training content fits the first user and how interested the first user is in it, so the content to be recommended is output for each first user based on the click probability of each training content in the second content set. Each user thus receives individually tailored recommendations, which helps tap employees' potential, mobilizes their enthusiasm for learning, improves the fit between training content and employees, realizes personalized and precise recommendation of training content in line with employees' interests, and improves training quality.
The training content recommendation method provided by the embodiment of the present application is described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
It should be noted that, in the embodiments of the present application, the acquisition, storage, use, processing, and the like of data all conform to relevant regulations of national laws and regulations.
Fig. 1 is a flowchart illustrating a training content recommendation method according to an embodiment of the present application, where an execution subject of the training content recommendation method may be an electronic device. The above-described execution body does not constitute a limitation of the present application.
Here, the electronic device may be a device with a communication function, such as a mobile phone, a tablet computer or an all-in-one computer; a device simulated by a virtual machine or an emulator; or a device with storage and computing functions, such as a cloud server or a server cluster.
As shown in fig. 1, the training content recommendation method provided by the embodiment of the application may include steps 110 to 150.
Step 110, obtaining historical behavior data of M first users.
The first user may be a user to be recommended, and the historical behavior data may include, but is not limited to: browsing records, exposure records, and browsing duration.
Step 120, sorting the historical behavior data of each first user in chronological order to obtain M historical behavior sequences.
Each historical behavior sequence is a group of data sequences arranged according to the time sequence.
For example, sorting the historical behavior data of user A in chronological order yields the historical behavior sequence: user A: [content 3, content 1, content 7, …, content 9001], where content 3 was browsed before content 1.
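As an illustration of step 120, the following minimal sketch groups raw behavior records by user and sorts each group by time. The record layout `(user_id, content_id, event_time)`, the function name and the sample values are assumptions made for this example and are not specified in the application.

```python
from collections import defaultdict


def build_behavior_sequences(events):
    """Group (user_id, content_id, event_time) records by user and sort each group by time."""
    by_user = defaultdict(list)
    for user_id, content_id, event_time in events:
        by_user[user_id].append((event_time, content_id))
    # sorted() is stable, so records with equal timestamps keep their input order.
    return {
        user_id: [content for _, content in sorted(records, key=lambda r: r[0])]
        for user_id, records in by_user.items()
    }


# Hypothetical records; event_time is any sortable timestamp.
events = [
    ("user_a", "content_1", 20),
    ("user_a", "content_3", 10),
    ("user_b", "content_2", 5),
    ("user_a", "content_7", 30),
]
sequences = build_behavior_sequences(events)
```

With M distinct users in the input, the returned dictionary holds M sequences, one per first user.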
Step 130, based on the target recall algorithm, training content matched with each historical behavior sequence is obtained from the first content set, and M second content sets associated with M first users are obtained.
Specifically, the first content set may include a large number of training content resources. For the historical behavior sequence of each first user, the electronic device may filter out, from these resources, the training content matching that sequence and use it to construct a second content set, resulting in M second content sets associated with the M first users.
Step 140, inputting the input characteristics of each first user and the second content set associated with the first user into the target ranking model, and obtaining the click probability of each first user on each training content in the associated second content set.
Wherein the input characteristics can comprise at least one of business characteristics, project characteristics and skill characteristics; the click probability may be used to characterize a degree of match of the corresponding first user with the training content.
The business characteristics characterize the job post of the first user, such as data analysis, data mining, software development or software testing; the project characteristics characterize the project experience of the first user, such as the development of project XXX; the skill characteristics characterize the skills of the first user, such as programming languages and tools (Oracle, Java, C, etc.) or office application skills.
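The application states elsewhere that the target ranking model is obtained by training a logistic regression model. A minimal sketch of how such a model could turn input characteristics into a click probability is shown below. The feature encoding (string features crossed with a content identifier), the weight values and all names are hypothetical; real weights would come from training on the second users' data.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def click_probability(user_features, content_id, weights, bias=0.0):
    """Score one (user, content) pair with a logistic regression over crossed features."""
    z = bias
    for feature in user_features:
        # Each active user feature contributes the weight of its cross with the content.
        z += weights.get((feature, content_id), 0.0)
    return sigmoid(z)


# Hypothetical weights, for illustration only.
weights = {
    ("post:data_analysis", "sql_course"): 1.2,
    ("skill:java", "sql_course"): 0.3,
    ("post:data_analysis", "ui_design_course"): -0.8,
}
user_features = ["post:data_analysis", "skill:java"]
second_content_set = ["sql_course", "ui_design_course"]
probabilities = {
    content: click_probability(user_features, content, weights)
    for content in second_content_set
}
```

Because only the contents of the second content set are scored, the cost of this step grows with the size of the recalled candidate pool rather than with the size of the first content set.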
Step 150, outputting the content to be recommended for each first user based on the click probability of each first user on each training content in the associated second content set.
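Step 150 can be sketched as a sort by click probability with an optional cut-off, matching the two variants described in the first aspect (output the whole second content set, or only the content whose click probability exceeds a preset threshold). The function name, the sample probabilities and the threshold value are assumptions for this example.

```python
def select_recommendations(click_probs, threshold=None):
    """Return content ids ordered by click probability, highest first.

    If a threshold is given, keep only content whose probability is strictly greater.
    """
    ranked = sorted(click_probs.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(content, p) for content, p in ranked if p > threshold]
    return [content for content, _ in ranked]


# Hypothetical click probabilities for one first user's second content set.
click_probs = {"content_1": 0.42, "content_7": 0.81, "content_3": 0.15}
all_ranked = select_recommendations(click_probs)
above_threshold = select_recommendations(click_probs, threshold=0.4)
```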
According to the training content recommendation method of the embodiments of the present application, after the historical behavior data of the M first users is obtained, the historical behavior data of each first user is sorted in chronological order to obtain M historical behavior sequences, each of which reflects the habits and preferences of the corresponding user. On this basis, training content matching each historical behavior sequence is obtained from the initial first content set based on a target recall algorithm, yielding M second content sets associated with the M first users. In this way, training content with a high degree of fit is screened out for each first user from an initial content set containing a large amount of training content, and a second content set is constructed for each first user. Then, by inputting at least one of the business feature, the project feature and the skill feature of each first user, together with the second content set associated with that first user, into the target ranking model, the click probability of each first user on each training content in the associated second content set can be predicted. The click probability represents how well the training content fits the first user and how interested the first user is in it, so the content to be recommended is output for each first user based on the click probability of each training content in the second content set. Each user thus receives individually tailored recommendations, which helps tap employees' potential, mobilizes their enthusiasm for learning, improves the fit between training content and employees, and realizes personalized and precise recommendation of training content in line with employees' interests.
The above steps are described in detail below, specifically as follows:
Referring to step 110, historical behavior data of the M first users is obtained.
Specifically, the electronic device may obtain historical behavior data of the M first users when receiving a recommendation request sent by the client.
In some embodiments of the present application, step 110 may specifically include: historical behavior data of a plurality of first users is obtained based on the target data buried points.
The target data buried points can comprise user identification, historical behavior types, stay time, training contents, training content categories and event time, and the historical behavior data can comprise historical exposure behavior data and historical click behavior data.
Specifically, the target data buried point is a data buried point preset in a target application and/or a target website, the interface of the target application and the target website can display training related content, and the target data buried point can collect operation data of a user according to a buried point event so as to obtain buried point data (namely historical behavior data). The buried point event may include a click event, an input event, a time period event, a sharing event, and the like. The click event is obtained based on the click operation of the user on the interface of the target application and/or the target website, such as the click operation on controls such as buttons and menus on the interface; the input event is obtained based on the input operation of the user on the interface of the target application and/or the target website, such as the input operation of inputting control parameters and the like in an input column of the interface; the time period event is obtained based on the browsing duration of the user on the interface of the target application and/or the target website; the sharing event is obtained based on the sharing operation of the user on the interface of the target application and/or the target website.
In an embodiment, obtaining the historical behavior data of the plurality of first users based on the target data buried points may specifically include: upon receiving a recommendation request sent by a client, acquiring buried point logs of the plurality of first users based on the target data buried points; forwarding the recommendation request to nginx, a Hyper Text Transfer Protocol (HTTP) and reverse proxy web server, and converting the buried point logs into JavaScript Object Notation (JSON) format using the scripting language Lua to obtain user behavior data; and sending the user behavior data to a distributed publish-subscribe messaging system (a Kafka cluster) through a producer configured in Lua in asynchronous (async) mode.
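The buried point fields listed above and the conversion to JSON can be illustrated with a small record type. This is only a sketch: Python stands in for the Lua script, the nginx and Kafka stages are left out, and the field names, the ISO 8601 timestamp format and the sample values are assumptions not given in the application.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class BuriedPointRecord:
    """One tracked event; the fields mirror those listed for the target data buried point."""
    user_id: str
    behavior_type: str       # e.g. "exposure" or "click"
    stay_seconds: float      # stay time on the content
    content_id: str          # training content
    content_category: str    # training content category
    event_time: str          # assumed ISO 8601 timestamp


# Hypothetical event for illustration.
record = BuriedPointRecord(
    user_id="u001",
    behavior_type="click",
    stay_seconds=37.5,
    content_id="content_3",
    content_category="data_analysis",
    event_time="2022-09-01T10:15:00",
)
payload = json.dumps(asdict(record), ensure_ascii=False)
```

The resulting `payload` string is the kind of message a producer could publish to a message queue topic for downstream offline and real-time consumers.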
In one embodiment, the Kafka cluster stores the user behavior data, and the distributed log collection framework Logstash persists the historical behavior data to a distributed file storage system for long-term offline storage. The stored user behavior data is used to generate long-term user portraits, and analysts can analyze user behavior paths from the buried point logs. The generated long-term user portrait is stored in the key-value database Redis, so that online interfaces (APIs) can retrieve it by the user identification of the first user.
In another embodiment, a short-term user portrait is generated by consuming the user behavior data in real time with the distributed processing engine Flink, which acts as a real-time feature building unit. This responds promptly to the real-time online behavior of the first user and dynamically generates the short-term user portrait and short-term behavior, as well as the activity level, exposure count and click count of each training content.
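The per-content exposure and click counts mentioned above can be maintained incrementally as events arrive. The sketch below is a single-process stand-in for the streaming job, not a Flink program; the class name, the event dictionary keys and the sample stream are assumptions for this example.

```python
from collections import Counter


class RealTimeContentStats:
    """Incrementally maintained exposure and click counts per training content."""

    def __init__(self):
        self.exposures = Counter()
        self.clicks = Counter()

    def consume(self, event):
        """Update the counters from one behavior event (a dict with assumed keys)."""
        if event["behavior_type"] == "exposure":
            self.exposures[event["content_id"]] += 1
        elif event["behavior_type"] == "click":
            self.clicks[event["content_id"]] += 1


# Hypothetical event stream.
stream = [
    {"content_id": "content_3", "behavior_type": "exposure"},
    {"content_id": "content_3", "behavior_type": "click"},
    {"content_id": "content_3", "behavior_type": "exposure"},
    {"content_id": "content_1", "behavior_type": "exposure"},
]
stats = RealTimeContentStats()
for event in stream:
    stats.consume(event)
```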
Referring to step 130, training content matching each historical behavior sequence is obtained from the first content set based on the target recall algorithm, yielding M second content sets associated with the M first users.
In one embodiment, the electronic device may perform algorithmic recall calculations based on an offline data cycle scheduling script.
In some embodiments of the present application, the target recall algorithm comprises at least one of a word vector recall algorithm, a collaborative filtering recall algorithm, and a tag recall algorithm, and based on the target recall algorithm, step 130 may comprise at least one of: acquiring training content matched with each historical behavior sequence from the first content set based on the word vector recall algorithm and a time attenuation weight; acquiring training content matched with each historical behavior sequence from the first content set based on the collaborative filtering recall algorithm and a user activity attenuation weight; and acquiring training content matched with each historical behavior sequence from the first content set based on the tag recall algorithm, the time attenuation weight and a frequency attenuation weight.
In this embodiment of the application, the electronic device may perform recall based on at least one of a word vector recall algorithm, a collaborative filtering recall algorithm, and a tag recall algorithm, preferentially recalling from a large amount of training content the training content that may interest each first user, so as to construct a candidate resource pool for each first user and thereby obtain the second content set associated with that first user. On this basis, the electronic device only needs to calculate the click probability of each first user on the training content in the corresponding second content set, and does not need to search thousands of training contents (the first content set) for the content in which the first user is most interested. This effectively reduces the online computing resource overhead and speeds up the real-time computation of the subsequent ranking. Meanwhile, time attenuation, frequency attenuation and user activity attenuation are considered in the recall process, which helps ensure that the recalled training content is content of interest to the first user, improves the matching degree between the training content in the second content set and the first user, and enables accurate recall and subsequent accurate recommendation.
In one embodiment, obtaining training content matched with each historical behavior sequence from the first content set based on the word vector recall algorithm and the time attenuation weight may specifically include the following steps: constructing a historical behavior matrix based on each piece of historical behavior data; acquiring word vectors of the plurality of training contents corresponding to each historical behavior matrix; multiplying the word vector of each training content by the corresponding time attenuation weight, then accumulating and averaging the results to obtain the vector feature of each first user; and recalling training content from the first content set based on the vector feature of each first user.
Specifically, the electronic device may obtain the word vectors of the plurality of training contents corresponding to each historical behavior matrix by using gensim, a natural language processing tool for the programming language Python; the electronic device may recall, from the first content set, training content that matches the vector feature of the first user by using a Python vector retrieval tool (e.g., the faiss package).
The time attenuation weight is determined in a manner shown in formula (1):
Figure BDA0003843827910000101
Here, xr is the time attenuation weight of training content r, nowday is the current time, and startday is the time of the event corresponding to training content r, for example the time at which training content r was browsed.
In the embodiment of the application, since user behavior varies over time, a time dimension needs to be added during coarse ranking; a time attenuation weight is therefore assigned to each training content to distinguish long-term interest from short-term interest.
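The word vector recall with time attenuation can be sketched as follows. Formula (1) is not reproduced in this text, so the exponential half-life decay in `time_decay_weight` is an assumed stand-in, not the application's formula. The function names, the 30-day half-life, and the brute-force cosine search (used in place of gensim and faiss so that the example needs only the standard library) are likewise assumptions for illustration.

```python
import math
from datetime import date


def time_decay_weight(now_day, start_day, half_life_days=30.0):
    """Assumed stand-in for formula (1): the weight halves every half_life_days."""
    elapsed = max((now_day - start_day).days, 0)
    return 0.5 ** (elapsed / half_life_days)


def user_vector(events, content_vectors, now_day):
    """Time-weighted average of the vectors of the contents in a user's history.

    events: list of (content_id, event_day) pairs.
    """
    dim = len(next(iter(content_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for content_id, event_day in events:
        w = time_decay_weight(now_day, event_day)
        for k, value in enumerate(content_vectors[content_id]):
            total[k] += w * value
        weight_sum += w
    return [t / weight_sum for t in total] if weight_sum else total


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(user_vec, content_vectors, seen, top_k=2):
    """Return the top_k unseen contents most similar to the user vector."""
    scored = [(cosine(user_vec, vec), cid)
              for cid, vec in content_vectors.items() if cid not in seen]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:top_k]]
```

In this sketch a content browsed a year ago contributes almost nothing to the user vector, so the recalled candidates follow the recent behavior.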
In one embodiment, obtaining training content matched with each historical behavior sequence from the first content set based on the collaborative filtering recall algorithm and the user activity attenuation weight may specifically include the following steps: acquiring N training contents related to the M historical behavior sequences; calculating the similarity among the N training contents based on the user activity attenuation weight; and recalling training content from the first content set based on the similarity among the N training contents.
The determination mode of the user activity attenuation weight is shown as formula (2):
Figure BDA0003843827910000111
Here, N(i) is the first user set, that is, the set of first users who have browsed training content i; N(j) is the second user set, that is, the set of first users who have browsed training content j; |N(i)| is the number of users in the first user set; |N(j)| is the number of users in the second user set; and Yij is the user activity attenuation weight corresponding to training contents i and j.
In one embodiment, the similarity between any two training contents may be calculated based on formula (3).
Figure BDA0003843827910000112
Here, Wij is the similarity between training contents i and j, U is the intersection of N(i) and N(j), and N(U) is the number of training contents browsed by the first users in the intersection.
In the embodiment of the application, the historical behavior sequences of the M first users are considered, and the similarity among the N training contents associated with the M historical behavior sequences is calculated. User activity attenuation needs to be considered in this calculation. For example, if a user B has browsed more than eighty percent of the training contents, the browsing range of user B is too wide for the browsing record to carry much importance: user B may be browsing the training contents of the website aimlessly, so the record has little reference value. The influence of user B therefore needs to be reduced, which is why the user activity attenuation weight is introduced.
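The effect of penalizing overly active users can be sketched as follows. Formulas (2) and (3) are not reproduced in this text, so the sketch substitutes a widely used item-based collaborative filtering variant in which each co-browsing user contributes 1 / log(1 + number of contents browsed) and the sum is normalized by sqrt(|N(i)| · |N(j)|). This substitution, the function name `item_similarity`, and the input layout are assumptions; the application's exact formulas may differ.

```python
import math
from collections import defaultdict
from itertools import combinations


def item_similarity(user_histories):
    """Compute pairwise content similarity with a user activity penalty.

    user_histories: dict mapping user id -> set of browsed content ids.
    Returns a dict mapping (i, j) -> similarity, filled symmetrically.
    """
    # N(i): the set of users who browsed content i.
    item_users = defaultdict(set)
    for user, items in user_histories.items():
        for item in items:
            item_users[item].add(user)

    co_scores = defaultdict(float)
    for user, items in user_histories.items():
        if len(items) < 2:
            continue  # a single content produces no co-browsed pair
        # Assumed penalty: the more contents a user browsed, the less each pair counts.
        penalty = 1.0 / math.log(1.0 + len(items))
        for i, j in combinations(sorted(items), 2):
            co_scores[(i, j)] += penalty

    similarity = {}
    for (i, j), score in co_scores.items():
        norm = math.sqrt(len(item_users[i]) * len(item_users[j]))
        similarity[(i, j)] = similarity[(j, i)] = score / norm
    return similarity
```

With this penalty, a user who browsed five contents adds 1 / log(6) to each of their pairs, while a user who browsed two contents adds 1 / log(3), so focused users weigh more.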
In one embodiment, obtaining training content matched with each historical behavior sequence from the first content set based on the tag recall algorithm, the time attenuation weight and the frequency attenuation weight may specifically include the following steps: acquiring target historical behavior data within a preset time period; determining P tags of the P training contents related to the target historical behavior data; determining a target tag corresponding to the first user based on the time attenuation weight, the frequency attenuation weight and the tags corresponding to the P training contents; and recalling training content from the first content set based on the target tag corresponding to each first user.
The frequency attenuation weight is determined in a manner shown in formula (4):
Figure BDA0003843827910000121
Here, zr is the frequency attenuation weight of training content r, and times is the number of times the first user has operated on the training content, such as the number of browses or exposures.
In the embodiment of the present application, since a user may repeat the same behavior many times within a short period, a frequency attenuation weight needs to be assigned to each training content to prevent the calculation result from concentrating in a narrow range and making the personalized training content monotonous.
Step 140 involves inputting the input features of each first user and the second content set associated with the first user into the target ranking model to obtain the click probability of each first user on each training content in the associated second content set.
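The tag recall steps can be sketched as follows. Formulas (1) and (4) are not reproduced in this text, so both weights here are assumed stand-ins: an exponential half-life decay for the time attenuation weight and a logarithmic damping, log(1 + times), for the frequency attenuation weight. All function names and the input layout are hypothetical.

```python
import math
from collections import defaultdict


def time_weight(days_ago, half_life_days=30.0):
    """Assumed time attenuation: the weight halves every half_life_days."""
    return 0.5 ** (days_ago / half_life_days)


def frequency_weight(times):
    """Assumed frequency attenuation: repeated operations add less and less."""
    return math.log1p(times)


def target_tags(history, content_tags, top_n=1):
    """Score each tag from a user's recent history and return the top_n tags.

    history: list of (content_id, days_ago, times) tuples.
    content_tags: dict mapping content id -> list of tags.
    """
    scores = defaultdict(float)
    for content_id, days_ago, times in history:
        weight = time_weight(days_ago) * frequency_weight(times)
        for tag in content_tags[content_id]:
            scores[tag] += weight
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return ranked[:top_n]


def recall_by_tags(tags, content_tags, seen):
    """Return unseen contents that carry at least one of the target tags."""
    wanted = set(tags)
    return sorted(cid for cid, item_tags in content_tags.items()
                  if cid not in seen and wanted & set(item_tags))
```

In this sketch, fifty views of one content four months ago score lower than a single view today, which keeps one burst of repeated behavior from dominating the target tag.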
In one embodiment, the input features may also include, but are not limited to: exposure of training content, click rate, training content type, title length.
In some embodiments of the present application, the target ranking model is obtained by training. Fig. 2 is a flowchart illustrating a training content recommendation method according to another embodiment of the present application. As shown in fig. 2, before step 140, the method may further include steps 210 to 230.
Step 210, obtaining historical behavior data and input characteristics of a plurality of second users.
The plurality of second users may include the first user, and the second user may be a platform user of the recommendation platform.
Step 220, constructing training sample data based on the historical behavior data and the input features of the plurality of second users.
Step 230, training the logistic regression model based on the training sample data to obtain the target ranking model.
In the embodiment of the present application, the historical behavior data of the second user relates to the historical exposure and click behavior records of the second user, that is, to the training contents in which the second user is more interested, and the input features can reflect the past project experience, the mastered skills, the post responsibilities, and the like of the second user. On this basis, training sample data is constructed from the historical behavior data and the input features and is used to train the logistic regression model. When the target ranking model is subsequently used, it can combine the preference habits and employee features of the first user to mine the interests of the first user, and predict the click probability of the first user on training contents according to both user interest and post suitability, thereby improving the prediction accuracy.
Step 150 involves outputting the content to be recommended matched with each first user based on the click probability of each first user on each training content in the associated second content set.
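Steps 210 to 230 can be illustrated with a small logistic regression trained by batch gradient descent. This is a minimal sketch using only the standard library; the two-feature sample layout (a post-match flag and a historical click rate), the learning rate, and the epoch count are assumptions for illustration, and a production system would more likely rely on an established machine learning library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(samples, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss.

    samples: list of feature vectors; labels: 1 for a click, 0 for exposure only.
    """
    dim = len(samples[0])
    n = len(samples)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y
            for k in range(dim):
                grad_w[k] += error * x[k]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def click_probability(weights, bias, x):
    """Predicted probability that the user clicks the content described by x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
```

Each sample here pairs user-side input features with content-side statistics; a clicked exposure is labeled 1 and an exposure without a click is labeled 0.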
Optionally, the electronic device may output the contents to be recommended, which are matched with each first user, in the order from high click probability to low click probability; the content to be recommended may include all training content in a second content set associated with the first user, or the content to be recommended may include target training content in the second content set associated with the first user, and the click probability of the target training content is greater than a preset probability threshold.
It should be noted that the preset probability threshold may be set according to specific requirements, for example, may be set to 0.8, and may also be set to 0.85, which is not specifically limited in this application.
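The two output options described above (all training content in the second content set, or only the content whose click probability exceeds the preset probability threshold) can be sketched as follows. The function name `contents_to_recommend` and the dictionary input are assumptions for illustration.

```python
def contents_to_recommend(click_probs, threshold=None):
    """Order contents by click probability from high to low.

    click_probs: dict mapping content id -> predicted click probability.
    threshold: if given, keep only contents whose probability is strictly greater.
    """
    # Sort by descending probability; ties are broken by content id for determinism.
    ranked = sorted(click_probs.items(), key=lambda kv: (-kv[1], kv[0]))
    if threshold is not None:
        ranked = [(cid, p) for cid, p in ranked if p > threshold]
    return [cid for cid, _ in ranked]
```

Calling the function without a threshold returns the whole second content set in ranked order, while passing 0.8 or 0.85 reproduces the thresholded variant.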
In some embodiments of the present application, a newly onboarded employee may not have a corresponding history record, so during the cold start phase popular training content for the same post may be preferentially recommended to that employee. Accordingly, the method may further include: acquiring input features of a third user; determining, as a fourth user, a first user whose input feature similarity with the third user is greater than a preset similarity threshold; and recommending to the third user the content to be recommended that is matched with the fourth user.
The third user is a user without historical behavior data or with a small amount of historical behavior data; the preset similarity threshold may be set according to specific requirements, for example, may be set to 0.7, and may also be set to other values, which is not specifically limited herein.
In one embodiment, the input features may include at least one of business features, project features, and skill features, and in the case where the input features include business features, project features, and skill features, the similarity may be determined based on the business features, project features, and skill features of the third user and the first user and weights thereof.
Illustratively, the weight corresponding to the business feature may be set to 0.6, and the weights corresponding to the project feature and the skill feature may both be set to 0.2.
It should be noted that the training content recommendation method provided in the embodiment of the present application may be executed by a training content recommendation device, or by a control module in the training content recommendation device for executing the training content recommendation method. In the embodiment of the present application, the training content recommendation device is described by taking as an example the case in which it executes the training content recommendation method. The training content recommendation device will be described in detail below.
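The cold start matching can be sketched with the example weights (0.6, 0.2, 0.2) and the example similarity threshold of 0.7. The application does not state how the similarity of an individual feature is computed, so the Jaccard overlap of feature value sets used here is an assumption, as are the function names and the choice to return the recommendations of the single most similar qualifying first user.

```python
WEIGHTS = {"business": 0.6, "project": 0.2, "skill": 0.2}


def jaccard(a, b):
    """Assumed per-feature similarity: overlap of two sets of feature values."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def feature_similarity(user_a, user_b, weights=WEIGHTS):
    """Weighted sum of the per-feature similarities of two users."""
    return sum(w * jaccard(user_a[name], user_b[name])
               for name, w in weights.items())


def cold_start_recommend(third_user, first_users, recommendations, threshold=0.7):
    """Reuse the recommendations of the most similar first user above the threshold.

    first_users: dict mapping user id -> feature dict.
    recommendations: dict mapping user id -> ranked list of content ids.
    """
    best_id, best_sim = None, threshold
    for user_id, features in first_users.items():
        similarity = feature_similarity(third_user, features)
        if similarity > best_sim:
            best_id, best_sim = user_id, similarity
    if best_id is None:
        return []  # no first user qualifies as a fourth user
    return recommendations.get(best_id, [])
```

If no first user exceeds the threshold, the sketch returns an empty list; a fallback such as popular content for the same post would be handled separately.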
Fig. 3 is a schematic structural diagram of a training content recommendation device according to an embodiment of the present application. As shown in fig. 3, the training content recommendation device 300 may include: an acquisition module 310, a sorting module 320, an input module 330, and an output module 340.
The obtaining module 310 is configured to obtain historical behavior data of M first users; the sorting module 320 is configured to sort the historical behavior data of each first user according to the time sequence to obtain M historical behavior sequences corresponding to M first users; the obtaining module 310 is further configured to obtain training content matched with each historical behavior sequence from the first content set based on a target recall algorithm, so as to obtain M second content sets associated with M first users; an input module 330, configured to input, to the target ranking model, an input feature of each first user and a second content set associated with the first user, to obtain a click probability of each first user on each training content in the associated second content set, where the input feature includes at least one of a business feature, a project feature, and a skill feature; and the output module 340 is configured to output the content to be recommended, which is matched with each first user, based on the click probability of each first user on each training content in the second content set.
According to the training content recommendation device, after the historical behavior data of the M first users are obtained, the historical behavior data of each first user are sorted in chronological order to obtain M historical behavior sequences, each of which can reflect the habits and preferences of the corresponding user. On this basis, training content matched with each historical behavior sequence can be obtained from the initial first content set based on a target recall algorithm, so as to obtain M second content sets associated with the M first users. In this way, training contents with a high degree of fit can be screened out for each first user from the initial content set containing a large amount of training content, and a second content set corresponding to each first user is constructed. Then, by inputting at least one of the business feature, the project feature and the skill feature of each first user, together with the second content set associated with the first user, into the target ranking model, the click probability of each first user on each training content in the associated second content set can be accurately predicted. The click probability can represent how well the training content fits the first user and how interested the first user is in it. The content to be recommended is therefore output for each first user based on the click probability of each training content in the second content set, so that each user receives an individually tailored result. This fully taps the potential of the employees, motivates them to learn, effectively improves the fit between the training content and the employees, and realizes personalized and precise recommendation of training content in combination with the employees' interests.
In some embodiments of the present application, the obtaining module 310 is specifically configured to: acquire historical behavior data of a plurality of first users based on target data buried points; the target data buried points comprise user identification, historical behavior types, stay time, training contents, training content categories and event time, and the historical behavior data comprise historical exposure behavior data and historical click behavior data.
In some embodiments of the present application, the target recall algorithm includes at least one of a word vector recall algorithm, a collaborative filtering recall algorithm, and a tag recall algorithm, and the obtaining module 310 is specifically configured to at least one of: acquiring training content matched with each historical behavior sequence from the first content set based on a word vector recall algorithm and a time attenuation weight; based on a collaborative filtering recall algorithm and a user activity attenuation weight, acquiring training content matched with each historical behavior sequence from a first content set; training content matched with each historical behavior sequence is obtained from the first content set based on the label recall algorithm, the time attenuation weight and the frequency attenuation weight.
In some embodiments of the present application, the obtaining module 310 is further configured to obtain historical behavior data and input features of a plurality of second users, and the apparatus further comprises: a building module, configured to build training sample data based on the historical behavior data and the input features of the plurality of second users; and a model training module, configured to train the logistic regression model based on the training sample data to obtain the target ranking model.
In some embodiments of the present application, the output module 340 is specifically configured to: outputting contents to be recommended matched with each first user according to the sequence of the click probability from high to low; the content to be recommended comprises all training content in a second content set associated with the first user, or the content to be recommended comprises target training content in the second content set associated with the first user, and the click probability of the target training content is larger than a preset probability threshold.
In some embodiments of the present application, the obtaining module 310 is further configured to obtain input features of a third user, and the apparatus further comprises: a determining module, configured to determine, as a fourth user, a first user whose input feature similarity with the third user is greater than the preset similarity threshold; and a recommending module, configured to recommend to the third user the content to be recommended that is matched with the fourth user.
The training content recommendation device in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.
The training content recommendation device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or other possible operating systems, which is not specifically limited in the embodiment of the present application.
Fig. 4 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
As shown in fig. 4, the electronic device 400 in this embodiment may include a processor 401 and a memory 402 storing computer program instructions.
In particular, the processor 401 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
Memory 402 may include mass storage for data or instructions. By way of example, and not limitation, memory 402 may include a Hard Disk Drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 402 may include removable or non-removable (or fixed) media, where appropriate. The memory 402 may be internal or external to the electronic device, where appropriate. In a particular embodiment, the memory 402 is a non-volatile solid-state memory. The memory may include Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, or electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., a memory device) encoded with software comprising computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the methods in accordance with embodiments of the application.
The processor 401 may implement any of the training content recommendation methods in the above embodiments by reading and executing computer program instructions stored in the memory 402.
In one example, electronic device 400 may also include a communication interface 403 and a bus 410. As shown in fig. 4, the processor 401, the memory 402, and the communication interface 403 are connected via a bus 410 to complete communication therebetween.
The communication interface 403 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present application.
Bus 410 includes hardware, software, or both, coupling the components of the electronic device to one another. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 410 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device provided in the embodiment of the present application can implement each process implemented in the method embodiments of fig. 1 and fig. 2, and can implement the same technical effect, and is not described herein again to avoid repetition.
With reference to the training content recommendation method in the foregoing embodiment, an embodiment of the present application may provide a training content recommendation system, where the training content recommendation system includes the electronic device in the foregoing embodiment. For specific contents of the electronic device, reference may be made to the relevant description in the above embodiments, and details are not described herein again.
In addition, in combination with the training content recommendation method in the foregoing embodiments, an embodiment of the present application may provide a computer-readable storage medium. The computer-readable storage medium has computer program instructions stored thereon; when executed by a processor, the computer program instructions implement the steps of any of the training content recommendation methods in the above embodiments.
In combination with the training content recommendation method in the foregoing embodiment, the embodiment of the present application may be implemented by providing a computer program product. The (computer) program product is stored on a non-volatile storage medium, and when executed by at least one processor, performs the steps of any of the training content recommendation methods in the above embodiments.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic Circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments can be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include an electronic circuit, a semiconductor memory device, a ROM, a flash memory, an Erasable ROM (EROM), a floppy disk, a CD-ROM, an optical disk, a hard disk, an optical fiber medium, a Radio Frequency (RF) link, and so forth. The code segments may be downloaded via computer networks such as the internet, intranets, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (10)

1. A method for recommending training content, the method comprising:
acquiring historical behavior data of M first users;
sequencing the historical behavior data of each first user according to the time sequence to obtain M historical behavior sequences corresponding to M first users;
acquiring training content matched with each historical behavior sequence from a first content set based on a target recall algorithm to obtain M second content sets associated with the M first users;
inputting the input characteristics of each first user and a second content set associated with the first user into a target ranking model to obtain the click probability of each first user on each training content in the associated second content set, wherein the input characteristics comprise at least one of business characteristics, project characteristics and skill characteristics;
and outputting the content to be recommended matched with each first user based on the click probability of each first user on each training content in the associated second content set.
2. The method of claim 1, wherein obtaining historical behavior data for a plurality of first users comprises:
acquiring historical behavior data of the plurality of first users based on target data buried points;
the target data buried point comprises a user identification, a historical behavior type, a stay time, training contents, a training content category and an event time, and the historical behavior data comprises historical exposure behavior data and historical click behavior data.
3. The method of claim 1, wherein the target recall algorithm comprises at least one of a word vector recall algorithm, a collaborative filtering recall algorithm, and a tag recall algorithm, and wherein the acquiring, based on the target recall algorithm, of training content matched with each historical behavior sequence from the first content set comprises at least one of:
acquiring training content matched with each historical behavior sequence from a first content set based on the word vector recall algorithm and the time attenuation weight;
acquiring training content matched with each historical behavior sequence from a first content set based on the collaborative filtering recall algorithm and the user activity attenuation weight;
and acquiring training content matched with each historical behavior sequence from the first content set based on the tag recall algorithm, the time attenuation weight and the frequency attenuation weight.
4. The method of claim 1, further comprising:
acquiring historical behavior data and input characteristics of a plurality of second users;
constructing training sample data based on historical behavior data and input features of a plurality of second users;
and training a logistic regression model based on the training sample data to obtain the target ranking model.
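For illustration only, training a logistic regression ranking model on such samples might look like the following dependency-free sketch. It assumes each sample is a feature vector paired with a label of 1 (clicked) or 0 (exposed but not clicked); that labeling scheme, the function names and the hyperparameters `epochs` and `lr` are assumptions, and a production system would more likely use an established library.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(
    samples: Sequence[Tuple[Sequence[float], int]],
    epochs: int = 500,
    lr: float = 0.1,
) -> Tuple[List[float], float]:
    """Fit weights and bias by stochastic gradient descent on log loss."""
    dim = len(samples[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def click_probability(
    weights: Sequence[float], bias: float, x: Sequence[float]
) -> float:
    """Predicted click probability for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
```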
5. The method of claim 1, wherein outputting the content to be recommended that matches each first user based on the click probability of each first user on each training content in the associated second content set comprises:
outputting the content to be recommended matched with each first user in descending order of click probability;
wherein the content to be recommended comprises all training content in the second content set associated with the first user, or comprises target training content in the second content set associated with the first user, the click probability of the target training content being greater than a preset probability threshold.
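The two output options of claim 5 can be illustrated with a short sketch. The function name `select_recommendations` and the convention that `threshold=None` means "return everything" are assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple


def select_recommendations(
    click_probs: Dict[str, float],
    threshold: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Rank content by click probability, optionally keeping only items above a threshold."""
    ranked = sorted(click_probs.items(), key=lambda item: item[1], reverse=True)
    if threshold is None:
        # Option 1: all training content in the second content set.
        return ranked
    # Option 2: only content whose probability is strictly greater than the threshold.
    return [(cid, p) for cid, p in ranked if p > threshold]
```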
6. The method of claim 1, further comprising:
acquiring input characteristics of a third user;
determining, as a fourth user, a first user whose input feature similarity with the third user is greater than a preset similarity threshold;
and recommending, to the third user, the content to be recommended matched with the fourth user.
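As a hedged sketch of claim 6, a user with no behavior history could borrow recommendations from sufficiently similar existing users. The claim does not name a similarity measure; cosine similarity over numeric feature vectors is assumed here, as are the function names, the default threshold of 0.8, and the choice to merge results when several users qualify.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend_for_new_user(
    new_user_features: Sequence[float],
    first_user_features: Dict[str, Sequence[float]],
    first_user_recs: Dict[str, List[str]],
    threshold: float = 0.8,   # assumed preset similarity threshold
) -> List[str]:
    """Reuse the recommendations of existing users whose similarity exceeds the threshold."""
    sims = sorted(
        ((cosine_similarity(new_user_features, feats), uid)
         for uid, feats in first_user_features.items()),
        reverse=True,
    )
    merged: List[str] = []
    for sim, uid in sims:
        if sim > threshold:
            # Most similar users contribute first; duplicates are skipped.
            for cid in first_user_recs.get(uid, []):
                if cid not in merged:
                    merged.append(cid)
    return merged
```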
7. A training content recommendation apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire historical behavior data of M first users;
a sorting module, configured to sort the historical behavior data of each first user in chronological order to obtain M historical behavior sequences corresponding to the M first users;
the acquisition module being further configured to acquire, based on a target recall algorithm, training content matched with each historical behavior sequence from a first content set to obtain M second content sets associated with the M first users;
an input module, configured to input the input features of each first user and the second content set associated with that first user into a target ranking model to obtain the click probability of each first user for each training content item in the associated second content set, wherein the input features comprise at least one of business features, project features and skill features;
and an output module, configured to output the content to be recommended matched with each first user based on the click probability of each first user for each training content item in the associated second content set.
8. An electronic device, characterized in that the device comprises: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements the training content recommendation method of any of claims 1-6.
9. A computer-readable storage medium, having stored thereon computer program instructions, which when executed by a processor, implement the steps of the training content recommendation method of any one of claims 1-6.
10. A computer program product, stored in a non-volatile storage medium, the program product being executable by at least one processor to perform the steps of the training content recommendation method of any one of claims 1-6.
CN202211110364.6A 2022-09-13 2022-09-13 Training content recommendation method, device, equipment and medium Pending CN115470407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211110364.6A CN115470407A (en) 2022-09-13 2022-09-13 Training content recommendation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211110364.6A CN115470407A (en) 2022-09-13 2022-09-13 Training content recommendation method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN115470407A true CN115470407A (en) 2022-12-13

Family

ID=84333306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211110364.6A Pending CN115470407A (en) 2022-09-13 2022-09-13 Training content recommendation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115470407A (en)

Similar Documents

Publication Publication Date Title
CN109460513B (en) Method and apparatus for generating click rate prediction model
CN110825957B (en) Deep learning-based information recommendation method, device, equipment and storage medium
CN107679211B (en) Method and device for pushing information
CN106202453B (en) Multimedia resource recommendation method and device
US10671684B2 (en) Method and apparatus for identifying demand
CN109976997B (en) Test method and device
CN110619076B (en) Search term recommendation method and device, computer and storage medium
CN107908789A (en) Method and apparatus for generating information
CN112801719A (en) User behavior prediction method, user behavior prediction device, storage medium, and apparatus
CN113220734A (en) Course recommendation method and device, computer equipment and storage medium
CN111814056A (en) Supplier recommendation method based on information processing and related equipment
CN116541610B (en) Training method and device for recommendation model
CN114422267A (en) Flow detection method, device, equipment and medium
CN111552835B (en) File recommendation method, device and server
CN113378067B (en) Message recommendation method, device and medium based on user mining
US20220198487A1 (en) Method and device for processing user interaction information
CN110634024A (en) User attribute marking method and device, electronic equipment and storage medium
CN116578774A (en) Method, device, computer equipment and storage medium for pre-estimated sorting
CN110971973A (en) Video pushing method and device and electronic equipment
CN114265777B (en) Application program testing method and device, electronic equipment and storage medium
CN115470407A (en) Training content recommendation method, device, equipment and medium
CN113780318B (en) Method, device, server and medium for generating prompt information
CN111522747B (en) Application processing method, device, equipment and medium
CN111507471B (en) Model training method, device, equipment and storage medium
CN111131354B (en) Method and apparatus for generating information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination