CN111858702A - User behavior data acquisition and weighting method for dynamic portrait - Google Patents

Info

Publication number
CN111858702A
Authority
CN
China
Prior art keywords
user
time
behavior
behaviors
collecting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010597643.4A
Other languages
Chinese (zh)
Other versions
CN111858702B (en)
Inventor
朱欣娟
赵璟博
罗云川
吴哲
高岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Public Cultural Development Center Of Ministry Of Culture And Tourism
Xian Polytechnic University
Original Assignee
National Public Cultural Development Center Of Ministry Of Culture And Tourism
Xian Polytechnic University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Public Cultural Development Center Of Ministry Of Culture And Tourism and Xian Polytechnic University
Priority to CN202010597643.4A
Publication of CN111858702A
Application granted
Publication of CN111858702B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477 Temporal data queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a user behavior data acquisition and weighting method for dynamic portraits, implemented by the following steps: 1. users are divided into self-publishing users and non-self-publishing users, time is divided into time slices of length T, and user behavior data are sampled from four time slices, N items in total; 2. different weight coefficients are assigned to the contents of different user behaviors, the weight coefficients of the N contents of the same user are normalized, and the normalized weight coefficients are assigned to the contents; 3. the N items obtained in step 1 are classified to obtain N 42-dimensional label vectors, each label vector is weighted by the weight coefficient of its content, and the three highest-ranked labels are selected as the interests of the user. The method can improve the accuracy of real-time user interest prediction.

Description

User behavior data acquisition and weighting method for dynamic portrait
Technical Field
The invention belongs to the technical field of big-data analysis, mining, and processing of user behavior data, and relates to a user behavior data acquisition and weighting method for dynamic portraits.
Background
In the era of the mobile internet, fine-grained operation has gradually become an important competitive advantage for enterprises, and the concept of the "user portrait" has emerged accordingly. User portraiture is the process of cleaning, clustering, analyzing, and mining the massive behavior data that users generate in the big data era, abstracting that data into labels, and using the labels to give the user a concrete image. Building user portraits helps enterprises provide better-targeted services to users.
In the Web 2.0 era, content on the network is produced mainly by users, and every user can generate their own content. Websites such as CSDN and Wikipedia popularize knowledge and answer questions for internet users; users generate a variety of behavior data on these sites every day, and their interest preferences can be predicted by analyzing that data. CSDN is a forum dedicated to popularizing computer-science knowledge, and its users generate large amounts of many kinds of behavior data each day: for example, publishing a blog, forwarding a blog, collecting a blog, liking a blog, browsing a blog, and following other users. These behavior data reflect different interests of the user, and how to build a dynamic portrait of the user from them is a recent research focus in the computing field.
Several user-portrait methods exist, but they currently have two problems. First, user behaviors include publishing, forwarding, collecting, browsing, liking, and following, and the corresponding behavior content data fall into several types, yet existing methods do not highlight the different contributions that different types of behavior by different types of users make to the user portrait. Second, every user has individual characteristics: behavior data are produced with different periods and frequencies, so the amount of behavior and content data generated differs between time periods. Analyzing all of a user's historical data makes the data volume too large, lowers analysis efficiency, and fails to reflect real-time changes in the user's attention. How to base portrait technology on the periodic characteristics of each user's personalized behavior content data, and reflect the dynamic change of user interest, is a challenge for current research.
Disclosure of Invention
The invention aims to provide a user behavior data acquisition and weighting method for dynamic portraits, which solves the problem that the prior art does not highlight the different contributions that different types of behavior by different types of users make to the user portrait.
The technical scheme adopted by the invention is a user behavior data acquisition and weighting method for dynamic portraits, implemented according to the following steps:
step 1, dividing users into self-publishing users and non-self-publishing users, dividing time into time slices of length T, and sampling user behavior data from the current time slice and the three time slices before it, N items in total;
step 2, assigning different weight coefficients to the contents of different user behaviors, normalizing the weight coefficients of the N contents of the same user, and assigning the normalized weight coefficients to the contents;
step 3, classifying the N items obtained in step 1 to obtain N 42-dimensional label vectors, obtaining the weighted label vector of each content from the weight coefficient of that content and its label vector, combining the N weighted label vectors by weighted summation into a single label vector, and selecting the three highest-ranked labels as the interests of the user.
The invention is also characterized in that:
In step 1, the user behaviors of a self-publishing user include publishing, forwarding, collecting, browsing, liking, and following; the user behaviors of a non-self-publishing user include forwarding, collecting, browsing, liking, and following.
The sampling of user behavior data in step 1 is implemented according to the following steps:
step 1.1, deriving a personalized time decay function from the Ebbinghaus memory curve, the function determining the weight coefficient for the user behavior data sampled in a given time slice;
step 1.2, sampling the different user behaviors of self-publishing users and non-self-publishing users in set proportions, N items in total across the current time slice and the three time slices before it;
step 1.3, calculating, from the weight coefficient formula of step 1.1, the number of items to be sampled for each user behavior in each time slice.
Step 1.1 is carried out according to the following steps:
step 1.1.1, fitting the Ebbinghaus memory curve with a power function, the fitted function being given by formula (1):
$$L(t) = 32.03\,(t_c - t_0)^{-0.1236} \qquad (1)$$
where L(t) is the percentage of memory retained, t_0 is the time at which the user memorized the material, t_c is the time at which the retained amount is measured, and time t is in days;
step 1.1.2, adjusting formula (1) to obtain the personalized time decay function of formula (2):
$$L(i) = 32.03\,[(i - 1/2)\,k]^{-0.1236}, \quad i = 1, 2, 3, 4 \qquad (2)$$
The current time period is defined as the 1st time slice, and the slices going back in time are the 2nd, 3rd, and 4th time slices. L(i) is the weight coefficient for the user behaviors sampled in the i-th time slice, and k = T'/5, where the current time point is set as time 0 and T' is the length of time, counted back from the current time point, over which the user most recently performed 5 consecutive behaviors, each being a publishing, forwarding, or collecting behavior.
In step 1.2, for a self-publishing user, N/2 publishing behaviors, N/6 forwarding behaviors, N/6 collecting behaviors, N/12 browsing behaviors, and N/12 liking behaviors are sampled across the current time slice and the three time slices before it; for a non-self-publishing user, N/3 forwarding behaviors, N/3 collecting behaviors, N/6 browsing behaviors, and N/6 liking behaviors are sampled across the same four time slices.
When fewer than N/2 publishing behavior data are available, the shortfall is made up from the forwarding and collecting behaviors, each supplying half of the missing amount.
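The supplement rule can be sketched as follows, reading it (as the detailed description does) as splitting the missing publishing quota equally between forwarded and collected items. The function name, the dictionary keys, and the choice to give an odd leftover item to the collecting behavior are illustrative assumptions, not part of the original text.

```python
def supplement_publish_shortfall(targets: dict[str, int],
                                 available_publish: int) -> dict[str, int]:
    """Return adjusted sampling targets when too few published items exist.

    `targets` holds the nominal per-behavior quotas (e.g. N/2, N/6, ...).
    The shortfall in published items is split between "forward" and
    "collect"; an odd leftover item goes to "collect" (an assumption).
    """
    adjusted = dict(targets)
    shortfall = max(0, targets["publish"] - available_publish)
    adjusted["publish"] = targets["publish"] - shortfall
    half = shortfall // 2
    adjusted["forward"] += half
    adjusted["collect"] += shortfall - half
    return adjusted


# Hypothetical example with N = 120 and only 50 published items available.
targets = {"publish": 60, "forward": 20, "collect": 20, "browse": 10, "like": 10}
adjusted = supplement_publish_shortfall(targets, available_publish=50)
print(adjusted)
```

The total number of sampled items stays at N because every missing published item is replaced by a forwarded or collected one.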
Step 2 specifically includes assigning a weight coefficient of 5 to published content, 2.5 to forwarded content, 2.5 to collected content, 0.5 to browsed content, and between 0.5 and 2.5 to liked content.
The weight coefficient for liked content is assigned as follows: when the liked content is content that a user followed by this user has browsed, published, or liked, the weight coefficient is 0.5; when the liked content is content that a followed user has forwarded or collected, the weight coefficient is 0.7; and when the liked content has no association with any user this user follows, the weight coefficient is 2.5.
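A minimal sketch of this weight table is shown below. The numeric values come from the text; the enum, the dictionary names, and the `behavior_weight` function are hypothetical names introduced only for illustration.

```python
from enum import Enum
from typing import Optional


class LikeRelation(Enum):
    """How a liked item relates to the users this user follows (names assumed)."""
    FOLLOWED_BROWSED_PUBLISHED_OR_LIKED = "followed user browsed, published or liked it"
    FOLLOWED_FORWARDED_OR_COLLECTED = "followed user forwarded or collected it"
    UNRELATED = "no association with any followed user"


# Fixed weights stated in the text.
FIXED_WEIGHTS = {"publish": 5.0, "forward": 2.5, "collect": 2.5, "browse": 0.5}

# Like weights depend on the relation to followed users.
LIKE_WEIGHTS = {
    LikeRelation.FOLLOWED_BROWSED_PUBLISHED_OR_LIKED: 0.5,
    LikeRelation.FOLLOWED_FORWARDED_OR_COLLECTED: 0.7,
    LikeRelation.UNRELATED: 2.5,
}


def behavior_weight(behavior: str,
                    like_relation: Optional[LikeRelation] = None) -> float:
    """Look up the weight coefficient for one behavior on one content item."""
    if behavior == "like":
        if like_relation is None:
            raise ValueError("a like needs its relation to followed users")
        return LIKE_WEIGHTS[like_relation]
    return FIXED_WEIGHTS[behavior]


print(behavior_weight("publish"))
print(behavior_weight("like", LikeRelation.UNRELATED))
```

A like on content unrelated to any followed user receives the highest like weight, presumably because it cannot be explained by social influence and so signals the user's own interest more directly; that rationale is an interpretation, not a statement from the source.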
Step 3 specifically includes feeding the N items into a Bi-LSTM + Attention model to obtain N 42-dimensional label vectors. Each dimension of a label vector holds a probability value, the 42 probability values sum to 1, and the value of each dimension is the proportion of the corresponding interest field for the user. The weight coefficient of each content is multiplied by the label vector of that content to obtain the weighted label vector of the content, giving N weighted label vectors in total; these are combined by weighted summation into a single label vector, and the 3 labels with the highest probability values are selected as the interests of the user.
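One way to write step 3 down is given below. The symbols $w_j$, $\hat{w}_j$, $\mathbf{p}_j$, and $\mathbf{v}$ are introduced here for illustration and do not appear in the original. Two assumptions are made: normalization means dividing by the sum of the weights (the text does not name the normalization), and "weighted summation" is read as summing the already-weighted vectors.

```latex
% w_j: accumulated weight of content j; p_j: its 42-dimensional label vector
\hat{w}_j = \frac{w_j}{\sum_{m=1}^{N} w_m},
\qquad
\mathbf{v} = \sum_{j=1}^{N} \hat{w}_j \,\mathbf{p}_j,
\qquad
\sum_{d=1}^{42} v_d
  = \sum_{j=1}^{N} \hat{w}_j \sum_{d=1}^{42} p_{j,d}
  = \sum_{j=1}^{N} \hat{w}_j = 1 .
```

Under these assumptions the aggregated vector $\mathbf{v}$ is itself a probability distribution over the 42 interest fields, and the user's interests are the three indices $d$ with the largest $v_d$.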
The beneficial effects of the invention are as follows. Addressing the need to analyze and process big-data user behavior, the invention provides a method for weighting and dynamically acquiring the different behavior data of different types of users, and to a certain extent solves the problem of predicting user interest in real time when different types of data influence the portrait result differently and the data volume is massive.
Drawings
FIG. 1 is a flow chart of the user behavior data acquisition and weighting method for dynamic portraits of the present invention;
FIG. 2 is the Ebbinghaus memory curve used in the user behavior data acquisition and weighting method for dynamic portraits of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention discloses a user behavior data acquisition and weighting method for dynamic portraits. As shown in FIG. 1, the method is described taking the behavior of blog users as an example, and is equally applicable to the behavior of users of CSDN, Wikipedia, short-video platforms, and the like. It is implemented according to the following steps:
step 1, users are divided into users who publish blogs themselves and users who do not. The behaviors of users who do not publish blogs include browsing blogs, forwarding blogs, collecting blogs, liking blogs, and following other users; the behaviors of users who publish blogs include publishing blogs, forwarding blogs, collecting blogs, liking blogs, browsing blogs, and following other users. Different types of behavior data are sampled for different types of users, and different weight coefficients are assigned according to how much each kind of data contributes to the portrait result, which lays a foundation for accurately describing the user's interest characteristics later. At the same time, the amount of user content behavior data sampled is reduced and the sampling procedure is optimized, so that the user's dynamic interest characteristics are reflected in real time and the efficiency of portrait analysis and mining is improved;
Time is divided into time slices of length T (T is 30 to 90 days). User behavior data of the current time slice and the three preceding time slices are dynamically sampled, and the sampled data are analyzed and mined to predict the user's interests in the current time slice. Because a user generates a large amount of behavior data on the internet every day, building a portrait from all of it cannot meet real-time requirements. The dynamic sampling technique provided by the invention reduces the amount of content behavior data sampled, optimizes the sampling procedure, and accurately reflects the user's dynamic interest characteristics in real time.
Dynamic sampling is implemented as follows. First, time is divided into time slices of length T, user behavior data of the current time slice and the 3 preceding time slices are dynamically sampled, and N blogs in total are sampled across the 4 time slices.
Because the goal is to predict the user's interests in the current time period, the number of blogs sampled in the four time periods follows this rule: the most blogs are sampled in the current time period, and the number decreases for each earlier period. N blogs are sampled across the four periods; a personalized time decay function is derived from the Ebbinghaus memory curve, and this function determines the specific number of blogs sampled in each period;
The Ebbinghaus forgetting curve is shown in FIG. 2. The curve is fitted with a power function, and the fitted function is given by formula (1):
$$L(t) = 32.03\,(t_c - t_0)^{-0.1236} \qquad (1)$$
where L(t) is the percentage of memory retained, t_0 is the time at which the user memorized the material, t_c is the time at which the retained amount is measured, and time t is in days;
Because the time slice used by the method is T, and according to the Ebbinghaus memory curve the retained amount changes little after 10 days, the weight coefficients that formula (1) would give for the four time slices differ very little and cannot satisfy the sampling rule (the most blogs in the current period, fewer in each earlier period). Formula (1) is therefore adjusted to obtain the personalized time decay function of formula (2):
$$L(i) = 32.03\,[(i - 1/2)\,k]^{-0.1236}, \quad i = 1, 2, 3, 4 \qquad (2)$$
The current time period is defined as the 1st time period, and the periods going back in time are the 2nd, 3rd, and 4th time periods. L(i) is the weight coefficient for the number of blogs sampled in the i-th time period, and k = T'/5, where the current time point is set as time 0 and T' is the length of time, counted back from the current time point, over which the user most recently performed 5 consecutive behaviors of publishing, forwarding, or collecting blogs;
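Formula (2) can be evaluated directly, as in the sketch below. The function names and the example value of T' are hypothetical. The text does not say how the raw L(i) values are scaled into per-slice counts, so `slice_shares` divides by their sum as one possible reading; this is an assumption.

```python
def decay_weight(i: int, k: float) -> float:
    """Formula (2): L(i) = 32.03 * ((i - 1/2) * k) ** -0.1236, i in 1..4."""
    if i not in (1, 2, 3, 4):
        raise ValueError("i must be 1, 2, 3 or 4")
    if k <= 0:
        raise ValueError("k must be positive")
    return 32.03 * ((i - 0.5) * k) ** -0.1236


def slice_shares(k: float) -> list[float]:
    """Scale L(1)..L(4) so they sum to 1 (assumed normalization)."""
    raw = [decay_weight(i, k) for i in range(1, 5)]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical user: the 5 most recent publish/forward/collect behaviors
# span T' = 10 days, so k = T' / 5 = 2 days per behavior.
t_prime = 10.0
k = t_prime / 5
raw_weights = [decay_weight(i, k) for i in range(1, 5)]
shares = slice_shares(k)
print(raw_weights)
print(shares)
```

Under this sum normalization the common factor k^(-0.1236) cancels, so the shares depend only on i. If the personalization through k is meant to affect the counts, a different scaling would be needed; the source does not settle this.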
Since users are divided into two categories, those who publish blogs themselves and those who do not, the dynamic blog sampling algorithm is designed for two cases.
① A user who publishes blogs has five types of blogs. Because the behaviors differ, a different number is sampled from each type: N/2 published blogs, N/6 forwarded blogs, N/6 collected blogs, N/12 browsed blogs, and N/12 liked blogs, for N blogs in total across the four time periods;
The weight coefficient L(i) for the number of blogs sampled in each time period is calculated from formulas (1) and (2). Over the whole portrait process, N blogs are sampled across the four time periods: N/2 published blogs, N/6 forwarded blogs, N/6 collected blogs, N/12 browsed blogs, and N/12 liked blogs. The number of blogs sampled in each time period is L(i)·N, comprising L(i)·(N/2) published blogs, L(i)·(N/6) forwarded blogs, L(i)·(N/6) collected blogs, L(i)·(N/12) browsed blogs, and L(i)·(N/12) liked blogs. Owing to the characteristics of users, published blogs are fewer than forwarded, collected, liked, and browsed blogs; if the user's published blogs number fewer than N/2, the shortfall is made up from forwarded and collected blogs, each supplying half of the missing quantity;
② A user who does not publish blogs has four types of blogs. N/3 forwarded blogs, N/3 collected blogs, N/6 browsed blogs, and N/6 liked blogs are sampled, for N blogs in total across the four time periods;
The weight coefficient L(i) for the number of blogs sampled in each time period is calculated from formulas (1) and (2). Over the whole portrait process, N blogs are sampled across the four time periods: N/3 forwarded blogs, N/3 collected blogs, N/6 browsed blogs, and N/6 liked blogs. The number of blogs sampled in each time period is L(i)·N, comprising L(i)·(N/3) forwarded blogs, L(i)·(N/3) collected blogs, L(i)·(N/6) browsed blogs, and L(i)·(N/6) liked blogs;
If a calculated number of blogs is not an integer, it is rounded to the nearest integer;
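The per-slice, per-behavior counts for both user types can be sketched as below. The quota fractions come from the text; the names, the sum-to-one scaling of L(i), and the half-up rounding helper are assumptions (Python's built-in `round` rounds halves to even, which may not match the intended rounding rule).

```python
from fractions import Fraction

# Quota of the N sampled blogs per behavior, as stated in the text.
PUBLISHER_QUOTAS = {
    "publish": Fraction(1, 2), "forward": Fraction(1, 6),
    "collect": Fraction(1, 6), "browse": Fraction(1, 12),
    "like": Fraction(1, 12),
}
NON_PUBLISHER_QUOTAS = {
    "forward": Fraction(1, 3), "collect": Fraction(1, 3),
    "browse": Fraction(1, 6), "like": Fraction(1, 6),
}


def slice_shares(k: float) -> list[float]:
    """L(1)..L(4) from formula (2), scaled to sum to 1 (assumed scaling)."""
    raw = [32.03 * ((i - 0.5) * k) ** -0.1236 for i in range(1, 5)]
    total = sum(raw)
    return [w / total for w in raw]


def round_half_up(x: float) -> int:
    """Round a non-negative number to the nearest integer, halves upward."""
    return int(x + 0.5)


def allocate(n: int, quotas: dict[str, Fraction], k: float) -> dict[str, list[int]]:
    """Blogs to sample per behavior in time slices 1 (current) to 4 (oldest)."""
    shares = slice_shares(k)
    return {
        behavior: [round_half_up(float(quota) * n * s) for s in shares]
        for behavior, quota in quotas.items()
    }


# Hypothetical example: N = 120 blogs, k = 2 days per behavior.
plan = allocate(120, PUBLISHER_QUOTAS, k=2.0)
for behavior, counts in plan.items():
    print(behavior, counts)
```

Because each cell is rounded independently, the grand total can differ from N by a few items; a production implementation would need a rule for distributing that remainder, which the source does not specify.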
step 2: the user's behavior data include publishing blogs, forwarding blogs, collecting blogs, liking blogs, browsing blogs, and following other users. Different weight coefficients are assigned to different user behaviors, mainly to highlight the different influence that each type of behavior data has on the portrait result, so that the result is more accurate;
First, weights are assigned per behavior for a given blog. If the blog was published by the user, the weight coefficient is 5. For a blog liked by the user, three cases are distinguished: if the liked blog is one that a user followed by this user has browsed, published, or liked, the weight coefficient is 0.5; if it is one that a followed user has forwarded or collected, the weight coefficient is 0.7; and if it has no association with any user this user follows, the weight coefficient is 2.5. If the blog was forwarded by the user, the weight coefficient is 2.5; if it was collected by the user, the weight coefficient is 2.5; and if it was browsed by the user, the weight coefficient is 0.5. The weight coefficient of the blog is then calculated: several behaviors of the user may apply to the same blog, and the weight coefficient of each behavior that occurred is accumulated into a total weight coefficient, which is finally normalized and assigned to the blog. For example, if a user browses a blog and also forwards and collects it, the weight coefficient of the blog is the browsing weight 0.5 + the forwarding weight 2.5 + the collecting weight 2.5, giving a total weight of 5.5. Finally, the weight coefficients of the N blogs are normalized, and the normalized weight coefficient is assigned to each blog;
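The accumulate-then-normalize step can be illustrated with the worked example from the text (browse + forward + collect = 5.5). The function names and the other two sample blogs are hypothetical, and normalization is assumed to mean dividing by the sum of all N totals, since the text does not name the normalization.

```python
def blog_weight(behavior_weights: list[float]) -> float:
    """Total weight of one blog: sum of the weights of all behaviors on it."""
    return sum(behavior_weights)


def normalize(weights: list[float]) -> list[float]:
    """Scale the N blog weights so they sum to 1 (assumed normalization)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


raw = [
    blog_weight([0.5, 2.5, 2.5]),  # browsed + forwarded + collected (from the text)
    blog_weight([5.0]),            # published only (hypothetical blog)
    blog_weight([0.5]),            # browsed only (hypothetical blog)
]
normalized = normalize(raw)
print(raw)         # [5.5, 5.0, 0.5]
print(normalized)
```

With these three blogs the totals sum to 11, so the first blog carries exactly half of the normalized weight.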
step 3: the blogs sampled in step 1 are processed to obtain three interests of the user. The task is to derive the user's three interests by analyzing the blogs associated with the user. There are 42 interest fields in total, so there are 42 interest labels. Each blog is classified to obtain a 42-dimensional label vector; each dimension of the vector holds a probability value, the 42 probability values sum to 1, and the value of each dimension is the proportion of the corresponding interest field for the user. The step is implemented as follows. First, N blogs are sampled according to the method of step 1. Then the weight coefficient of each blog is calculated according to the method of step 2, and the weight coefficients of the N blogs are normalized for later use. Finally, the N blogs are fed into a Bi-LSTM + Attention model to obtain N 42-dimensional label vectors. Each blog also has a weight coefficient; the weight coefficient of each blog is multiplied by the label vector of that blog to obtain the weighted label vector of the blog, giving N weighted label vectors in total. These are combined by weighted summation into a single label vector, and the 3 labels with the highest probability values are selected as the interests of the user.
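The aggregation at the end of step 3 can be sketched as follows. The Bi-LSTM + Attention classifier is not reproduced: the label vectors are made-up stand-ins for its output, and the function names, the `peaked` helper, and the sample weights are hypothetical. "Weighted summation" is read as summing the vectors after each has been multiplied by its blog's normalized weight.

```python
NUM_FIELDS = 42  # number of interest fields stated in the text


def aggregate(weights: list[float], label_vectors: list[list[float]]) -> list[float]:
    """Sum of w_j * p_j over all blogs j, giving one 42-dimensional profile."""
    if len(weights) != len(label_vectors):
        raise ValueError("one weight per label vector is required")
    profile = [0.0] * NUM_FIELDS
    for w, vec in zip(weights, label_vectors):
        for d, p in enumerate(vec):
            profile[d] += w * p
    return profile


def top_labels(profile: list[float], count: int = 3) -> list[int]:
    """Indices of the `count` largest entries, largest first."""
    order = sorted(range(len(profile)), key=lambda d: profile[d], reverse=True)
    return order[:count]


def peaked(field: int, peak: float = 0.58) -> list[float]:
    """Stand-in classifier output: probability `peak` on one field, rest uniform."""
    rest = (1.0 - peak) / (NUM_FIELDS - 1)
    vec = [rest] * NUM_FIELDS
    vec[field] = peak
    return vec


# Four hypothetical blogs with normalized weights that sum to 1.
weights = [0.5, 0.3, 0.15, 0.05]
vectors = [peaked(7), peaked(19), peaked(3), peaked(30)]
profile = aggregate(weights, vectors)
top3 = top_labels(profile)
print(top3)
```

Because the weights sum to 1 and each label vector sums to 1, the aggregated profile is again a distribution over the 42 fields; the fourth blog's field is outranked because its weight is small.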
The invention provides a user behavior data acquisition and weighting method for dynamic portraits. On the one hand, it makes full use of the user's various behavior data and assigns different weights to the user's different behavior content data; on the other hand, it samples the data to be analyzed on the basis of the user's personalized Ebbinghaus memory curve, which improves analysis efficiency and lays a foundation for generating the labels of the user's dynamic portrait.

Claims (9)

1. A user behavior data acquisition and weighting method for dynamic portraits, characterized by comprising the following steps:
step 1, dividing users into self-publishing users and non-self-publishing users, dividing time into time slices of length T, and sampling user behavior data from the current time slice and the three time slices before it, N items in total;
step 2, assigning different weight coefficients to the contents of different user behaviors, normalizing the weight coefficients of the N contents of the same user, and assigning the normalized weight coefficients to the contents;
step 3, classifying the N items obtained in step 1 to obtain N 42-dimensional label vectors, obtaining the weighted label vector of each content from the weight coefficient of that content and its label vector, combining the N weighted label vectors by weighted summation into a single label vector, and selecting the three highest-ranked labels as the interests of the user.
2. The method as claimed in claim 1, wherein in step 1 the user behaviors of the self-publishing user include publishing, forwarding, collecting, browsing, liking, and following; and the user behaviors of the non-self-publishing user include forwarding, collecting, browsing, liking, and following.
3. The method as claimed in claim 2, wherein the sampling of user behavior data in step 1 is implemented by the following steps:
step 1.1, deriving a personalized time decay function from the Ebbinghaus memory curve, the function determining the weight coefficient for the user behavior data sampled in a given time slice;
step 1.2, sampling the different user behaviors of self-publishing users and non-self-publishing users in set proportions, N items in total across the current time slice and the three time slices before it;
step 1.3, calculating, from the weight coefficient formula of step 1.1, the number of items to be sampled for each user behavior in each time slice.
4. The method as claimed in claim 3, wherein step 1.1 is implemented by the following steps:
step 1.1.1, fitting the Ebbinghaus memory curve with a power function, the fitted function being given by formula (1):
$$L(t) = 32.03\,(t_c - t_0)^{-0.1236} \qquad (1)$$
where L(t) is the percentage of memory retained, t_0 is the time at which the user memorized the material, t_c is the time at which the retained amount is measured, and time t is in days;
step 1.1.2, adjusting formula (1) to obtain the personalized time decay function of formula (2):
$$L(i) = 32.03\,[(i - 1/2)\,k]^{-0.1236}, \quad i = 1, 2, 3, 4 \qquad (2)$$
the current time period being defined as the 1st time slice and the slices going back in time being the 2nd, 3rd, and 4th time slices, wherein L(i) is the weight coefficient for the user behaviors sampled in the i-th time slice, and k = T'/5, where the current time point is set as time 0 and T' is the length of time, counted back from the current time point, over which the user most recently performed 5 consecutive behaviors, each being a publishing, forwarding, or collecting behavior.
5. The method as claimed in claim 3, wherein in step 1.2, the self-publishing user publishes N/2 behaviors in the current time slice and the first three time slices of the time slice, N/6 forwarding behaviors, N/6 collecting behavior, N/12 browsing behaviors, N/12 favorites behaviors, N/3 collecting behaviors without self-publishing user forwarding behaviors in the four time slices, N/3 collecting behavior, N/6 collecting browsing behaviors, and N/6 collecting favorites behaviors.
6. A method as claimed in claim 5, wherein when fewer than N/2 items of publishing behavior data are available, half of the shortfall is additionally collected from the forwarding behavior and half from the collecting behavior to make up the publishing behavior data.
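The proportions of claims 5 and 6 can be sketched as a quota table. The names below are hypothetical, the fifth behavior is labeled `praise` following claim 7, and claim 6 is read here as splitting the publishing shortfall evenly between forwarding and collecting, which is an interpretation rather than a confirmed detail. Exact fractions are used so that no rounding rule has to be assumed.

```python
from fractions import Fraction
from typing import Dict, Optional

# Shares of the N collected items per behavior (claim 5).
SELF_PUBLISHING_SHARES = {
    "publish": Fraction(1, 2),
    "forward": Fraction(1, 6),
    "collect": Fraction(1, 6),
    "browse": Fraction(1, 12),
    "praise": Fraction(1, 12),
}
NON_SELF_PUBLISHING_SHARES = {
    "forward": Fraction(1, 3),
    "collect": Fraction(1, 3),
    "browse": Fraction(1, 6),
    "praise": Fraction(1, 6),
}


def behavior_quotas(
    n: int,
    self_publishing: bool,
    available_publish: Optional[int] = None,
) -> Dict[str, Fraction]:
    """Return the number of items to collect per behavior.

    ASSUMPTION (claim 6): if fewer publishing items exist than the N/2
    quota, the shortfall is split evenly between forwarding and collecting.
    """
    shares = SELF_PUBLISHING_SHARES if self_publishing else NON_SELF_PUBLISHING_SHARES
    quotas = {behavior: share * n for behavior, share in shares.items()}
    if self_publishing and available_publish is not None:
        shortfall = quotas["publish"] - available_publish
        if shortfall > 0:
            quotas["publish"] = Fraction(available_publish)
            quotas["forward"] += shortfall / 2
            quotas["collect"] += shortfall / 2
    return quotas


full = behavior_quotas(120, self_publishing=True)
short = behavior_quotas(120, self_publishing=True, available_publish=40)
```

In both cases the quotas still sum to N, since the supplement only moves the missing publishing items to the two other behaviors.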
7. A method as claimed in claim 1, wherein step 2 assigns a weighting coefficient of 5 to the content of publishing behaviors, 2.5 to the content of forwarding behaviors, 2.5 to the content of collecting behaviors, 0.5 to the content of browsing behaviors, and 0.5 to 2.5 to the content of praise behaviors.
8. A method as claimed in claim 7, wherein the weighting coefficient of the content of a praise behavior is assigned as follows: when the praised content is content that an object the user follows has browsed, published or praised, the weighting coefficient is 0.5; when the praised content is content that an object the user follows has forwarded or collected, the weighting coefficient is 0.7; and when the praised content is not associated with any object the user follows, the weighting coefficient is 2.5.
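The weighting rules of claims 7 and 8 amount to a small lookup. The sketch below is illustrative only: the behavior labels and the `followed_relation` parameter (how an object the user follows interacted with the praised content, or `None` if there is no such link) are hypothetical names, not taken from the patent.

```python
from typing import Optional

# Fixed coefficients from claim 7.
BASE_WEIGHTS = {"publish": 5.0, "forward": 2.5, "collect": 2.5, "browse": 0.5}


def praise_weight(followed_relation: Optional[str]) -> float:
    """Coefficient for praised content, following claim 8."""
    if followed_relation is None:
        # Praised content has no link to any followed object.
        return 2.5
    if followed_relation in {"browse", "publish", "praise"}:
        return 0.5
    if followed_relation in {"forward", "collect"}:
        return 0.7
    raise ValueError(f"unknown relation: {followed_relation!r}")


def behavior_weight(behavior: str, followed_relation: Optional[str] = None) -> float:
    """Weighting coefficient for one piece of content, by behavior type."""
    if behavior == "praise":
        return praise_weight(followed_relation)
    if behavior not in BASE_WEIGHTS:
        raise ValueError(f"unknown behavior: {behavior!r}")
    return BASE_WEIGHTS[behavior]
```

Praise that is independent of followed objects receives the highest praise coefficient, 2.5, on a par with forwarding and collecting.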
9. The method as claimed in claim 1, wherein step 3 comprises feeding the N items of data into a Bi-LSTM + Attention model to obtain N 42-dimensional tag vectors, each dimension of a tag vector holding a probability value and the 42 probability values summing to 1, the probability value of each dimension being the proportion of the corresponding field of interest for the user; multiplying the tag vector of each content by the weight coefficient corresponding to that content to obtain N weighted tag vectors; summing the weighted tag vectors to obtain a single tag vector; and selecting the 3 tags with the three highest values as the interests of the user.
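The aggregation in claim 9 can be sketched as a weighted sum followed by a top-3 selection. This is a minimal sketch under stated assumptions: the Bi-LSTM + Attention model is not reproduced, so the tag vectors are passed in as plain lists, and the function name, the four-dimensional example and its labels are hypothetical stand-ins for the 42 interest fields.

```python
from typing import List, Sequence


def top_interests(
    tag_vectors: Sequence[Sequence[float]],
    weights: Sequence[float],
    labels: Sequence[str],
    top_k: int = 3,
) -> List[str]:
    """Weight each tag vector, sum them, and return the top_k labels."""
    if len(tag_vectors) != len(weights):
        raise ValueError("one weight is required per tag vector")
    dim = len(labels)
    total = [0.0] * dim
    for vector, weight in zip(tag_vectors, weights):
        if len(vector) != dim:
            raise ValueError("tag vector length must match the label count")
        for d, probability in enumerate(vector):
            total[d] += weight * probability
    # Rank dimensions by their summed weighted value, highest first.
    ranked = sorted(range(dim), key=lambda d: total[d], reverse=True)
    return [labels[d] for d in ranked[:top_k]]


# Toy example with 4 dimensions instead of 42 (labels are made up).
labels = ["sports", "music", "travel", "food"]
vectors = [
    [0.7, 0.1, 0.1, 0.1],  # published content, weight 5
    [0.1, 0.6, 0.2, 0.1],  # forwarded content, weight 2.5
    [0.1, 0.1, 0.1, 0.7],  # browsed content, weight 0.5
]
interests = top_interests(vectors, [5.0, 2.5, 0.5], labels)
```

Because the summed vector is only ranked, it does not need to be renormalized to sum to 1 before the top three tags are chosen.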
CN202010597643.4A 2020-06-28 2020-06-28 User behavior data acquisition and weighting method for dynamic portrait Active CN111858702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010597643.4A CN111858702B (en) 2020-06-28 2020-06-28 User behavior data acquisition and weighting method for dynamic portrait

Publications (2)

Publication Number Publication Date
CN111858702A (en) 2020-10-30
CN111858702B (en) 2022-02-11

Family

ID=72988662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010597643.4A Active CN111858702B (en) 2020-06-28 2020-06-28 User behavior data acquisition and weighting method for dynamic portrait

Country Status (1)

Country Link
CN (1) CN111858702B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505309A (en) * 2021-05-21 2021-10-15 深圳市蘑菇财富技术有限公司 Marketing recommendation method, device, equipment and storage medium based on distributed data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609460A (en) * 2012-01-13 2012-07-25 中国科学院计算技术研究所 Method and system for microblog data acquisition
CN104166668A (en) * 2014-06-09 2014-11-26 南京邮电大学 News recommendation system and method based on FOLFM model
CN107368519A (en) * 2017-06-05 2017-11-21 桂林电子科技大学 A kind of cooperative processing method and system for agreeing with user interest change
CN109359244A (en) * 2018-10-30 2019-02-19 中国科学院计算技术研究所 A kind of recommendation method for personalized information and device
US20200019595A1 (en) * 2018-07-12 2020-01-16 Giovanni Azua Garcia System and method for graphical vector representation of a resume

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘恩海等: "基于两级修正的页面排序改进算法", 《计算机工程与设计》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505309A (en) * 2021-05-21 2021-10-15 深圳市蘑菇财富技术有限公司 Marketing recommendation method, device, equipment and storage medium based on distributed data
CN113505309B (en) * 2021-05-21 2024-02-09 深圳市蘑菇财富技术有限公司 Marketing recommendation method, device, equipment and storage medium based on distributed data

Also Published As

Publication number Publication date
CN111858702B (en) 2022-02-11

Similar Documents

Publication Publication Date Title
US8682830B2 (en) Information processing apparatus, information processing method, and program
CN107818344A (en) The method and system that user behavior is classified and predicted
CN107862022B (en) Culture resource recommendation system
WO2015192667A1 (en) Advertisement recommending method and advertisement recommending server
CN109033408B (en) Information pushing method and device, computer readable storage medium and electronic equipment
CN111611478B (en) Information recommendation method and device and electronic equipment
CN111125574A (en) Method and apparatus for generating information
CN112765480B (en) Information pushing method and device and computer readable storage medium
CN104933622A (en) Microblog popularity degree prediction method based on user and microblog theme and microblog popularity degree prediction system based on user and microblog theme
CN110880127B (en) Consumption level prediction method and device, electronic equipment and storage medium
WO2020018812A1 (en) Artificial intelligence engine for generating semantic directions for websites for automated entity targeting to mapped identities
CN109389424B (en) Flow distribution method and device, electronic equipment and storage medium
CN112487283A (en) Method and device for training model, electronic equipment and readable storage medium
Liu et al. Online recommendations based on dynamic adjustment of recommendation lists
CN111858702B (en) User behavior data acquisition and weighting method for dynamic portrait
CN105432038A (en) Application ranking calculating apparatus and usage information collecting apparatus
CN116320626B (en) Method and system for calculating live broadcast heat of electronic commerce
Liu et al. A framework to compute page importance based on user behaviors
CN110990706B (en) Corpus recommendation method and device
CN115545349B (en) Time sequence social media popularity prediction method and device based on attribute sensitive interaction
CN106156232B (en) Network information propagation monitoring method and device
Sun Machine learning-driven enterprise human resource management optimization and its application
CN117114766A (en) Cost control factor determining method, device, equipment and storage medium
CN112118486B (en) Content item delivery method and device, computer equipment and storage medium
CN113094602A (en) Hotel recommendation method, system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant