CN112182396A - Information pushing method based on user behaviors - Google Patents

Information pushing method based on user behaviors

Info

Publication number
CN112182396A
Authority
CN
China
Prior art keywords
user
text data
data
user behavior
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011084095.1A
Other languages
Chinese (zh)
Inventor
罗列异
黄吉琦
任益斌
张轲
程韶曦
金松
张帆
王迪先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Xinlan Network Media Co ltd
Original Assignee
Zhejiang Xinlan Network Media Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Xinlan Network Media Co ltd
Priority to CN202011084095.1A
Publication of CN112182396A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an information pushing method based on user behaviors, comprising the following steps: classifying text data through an LDA text clustering model; analyzing all text data belonging to the same category to obtain a text label for each category; labeling picture data and video data by matching against a labeled standard celebrity picture library; receiving user behavior data sent by a user side; calculating and analyzing the user behavior data to obtain a user behavior calculation result and storing it in the user configuration information of the user; calculating a preference result of the user from all of the updated user behavior calculation results; and recommending targeted content to the user according to the preference result. Because the text data is first classified by the LDA text clustering model and all text data in the same category is then analyzed to obtain a text label for that category, a news information provider can accurately identify the preferences of a user.

Description

Information pushing method based on user behaviors
Technical Field
The invention relates to an information pushing method based on user behaviors.
Background
With the development of internet technology, more and more people prefer to obtain information from the network, and news apps of all kinds keep emerging. To improve the user experience, many news apps collect the operation behaviors of a user and infer the user's preferences, so that content the user is likely to enjoy can be recommended in a targeted manner.
When news data in a news app is put into storage, it must first be classified and labeled. Text data in particular is generally classified by an LDA text clustering model. Because it learns without supervision, the LDA text clustering model can divide massive text data into a user-defined number of categories, which avoids the problem that a model cannot be trained when the data volume is too large to label by hand. However, the categories produced by the LDA text clustering model carry no meaningful Chinese-language label, so the information provider cannot tell what the text data in each category is about.
Disclosure of Invention
The invention provides an information pushing method based on user behaviors, which adopts the following technical scheme:
An information pushing method based on user behaviors comprises the following steps:
acquiring a plurality of unlabeled news data, wherein the news data comprises text data, picture data and video data;
cleaning the news data;
classifying the text data through an LDA text clustering model;
analyzing all classified text data belonging to the same category to obtain a text label of each category;
matching the picture data against a labeled standard celebrity picture library to label the picture data;
extracting frames from the video data, and matching the extracted frames against the labeled standard celebrity picture library to label the video data;
storing the classified and labeled news data in a system for users to browse, and receiving user behavior data sent by a user side;
calculating and analyzing the user behavior data to obtain a user behavior calculation result, and storing the user behavior calculation result in user configuration information of the user, wherein all user behavior calculation results of the user are stored in the user configuration information;
and calculating a preference result of the user from all of the updated user behavior calculation results, and recommending targeted content to the user according to the preference result.
Furthermore, an effective calculation period is set, the preference result is updated periodically, and all user behavior calculation results within the effective calculation period are recalculated to obtain a new preference result.
Furthermore, the user behavior calculation results are divided into a plurality of statistical categories, and different effective calculation periods are set for the calculation results of different statistical categories.
Further, when the preference result is calculated, different calculation weights are set according to the order in which the user behavior calculation results were generated.
Further, when calculating the preference result of the user, the closer the generation time of a user behavior calculation result is to the current time, the larger the corresponding calculation weight.
Further, if the user is currently browsing text data, the category of that text data is obtained, and a plurality of text data are selected from the other text data of the category and pushed to the user.
Further, one specific method for selecting a plurality of text data from the other text data of the category and pushing them to the user is as follows:
acquiring the other text data under the category from the system;
sorting the acquired text data by heat (popularity);
and pushing the top-ranked text data to the user.
Further, another specific method for selecting a plurality of text data from the other text data of the category and pushing them to the user is as follows:
acquiring the other text data under the category from the system;
sorting the acquired text data by heat to obtain a first ordering;
sorting the acquired text data by release time to obtain a second ordering;
setting a calculation weight for each text data in the first ordering according to a first rule;
setting a calculation weight for each text data in the second ordering according to a second rule;
calculating the comprehensive weight of each text data;
reordering the text data under the category by comprehensive weight to obtain a third ordering;
and pushing the top-ranked text data in the third ordering to the user.
Further, text data is classified through an LDA text clustering model, and the classification quantity is set according to needs.
Further, the number of classifications is set according to the total amount of text data.
The information pushing method based on user behaviors has the advantage that, after the text data is classified through the LDA text clustering model, all classified text data belonging to the same category is analyzed to obtain a text label for each category, so that a news information provider can accurately identify the preferences of a user.
Drawings
Fig. 1 is a schematic diagram of an information pushing method based on user behavior according to the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and the embodiments.
Fig. 1 shows an information pushing method based on user behaviors according to the present invention, which mainly includes the following steps:
S1: acquiring a plurality of unlabeled news data, wherein the news data comprises text data, picture data and video data.
S2: cleaning the news data.
S3: classifying the text data through an LDA text clustering model.
S4: analyzing all classified text data belonging to the same category to obtain the text label of each category.
S5: matching the picture data against a labeled standard celebrity picture library to label the picture data.
S6: extracting frames from the video data, and matching the extracted frames against the labeled standard celebrity picture library to label the video data.
S7: storing the classified and labeled news data in a system for users to browse, and receiving user behavior data sent by the user side.
S8: calculating and analyzing the user behavior data to obtain a user behavior calculation result, and storing the user behavior calculation result in user configuration information of the user, wherein all user behavior calculation results of the user are stored in the user configuration information.
S9: calculating the preference result of the user from all of the updated user behavior calculation results, and recommending targeted content to the user according to the preference result.
Through these steps, the preference of the user is calculated, and suitable content is recommended to the user in a targeted manner. Each step is described in detail below.
For step S1: acquiring a plurality of unlabeled news data, wherein the news data comprises text data, picture data and video data.
First, news data is acquired through multiple channels. Generally, news data is acquired over the internet from a plurality of information acquisition ports, such as a green wave, a microblog and the like. The data includes text data, picture data, and video data.
For step S2: cleaning the news data.
The acquired news data is cleaned in this step, for example by removing watermarks from the text and removing descriptions that some data providers add to the text.
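The cleaning step can be illustrated with a minimal sketch. The source does not say how provider-added descriptions are detected, so the two regular expressions below (`Source:` lines and `(Editor: ...)` notes) and the function name `clean_text` are hypothetical examples only; a real system would maintain its own pattern list.

```python
import re

# Hypothetical provider boilerplate patterns (assumptions, not from the source).
PROVIDER_PATTERNS = [
    re.compile(r"^Source:.*$", re.MULTILINE),  # a whole "Source: ..." line
    re.compile(r"\(Editor:[^)]*\)"),           # an inline "(Editor: ...)" note
]


def clean_text(raw: str) -> str:
    """Remove provider-added descriptions and tidy up leftover whitespace."""
    text = raw
    for pattern in PROVIDER_PATTERNS:
        text = pattern.sub("", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Strip each line and drop the blank lines left behind by the removals.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

For example, `clean_text("Source: Example Wire\nFlood warning issued.  (Editor: A. B.)")` keeps only the sentence about the flood warning.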
For step S3: classifying the text data through an LDA text clustering model.
The text data is classified through an LDA text clustering model. Before classification, the number of categories is set as required. Preferably, to avoid having too much text data in any one category, the number of categories is set according to the total amount of text data. The number of categories is proportional to the total amount of text data; that is, the larger the amount of text data, the larger the number of categories.
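The proportional rule for the number of categories can be sketched as follows. The source gives no concrete ratio, so the target of 500 documents per category, the lower bound of 2, and the function name `choose_num_categories` are all assumptions made for illustration. The returned value would then be passed to whatever LDA implementation is used as its topic count.

```python
import math


def choose_num_categories(total_docs: int,
                          docs_per_category: int = 500,
                          min_categories: int = 2) -> int:
    """Return a category count that grows in proportion to the corpus size.

    docs_per_category and min_categories are hypothetical tuning values.
    """
    if total_docs < 0 or docs_per_category <= 0:
        raise ValueError("total_docs must be >= 0 and docs_per_category > 0")
    # Round up so that no category is expected to exceed the target size.
    return max(min_categories, math.ceil(total_docs / docs_per_category))
```

With these assumed defaults, 10,000 documents yield 20 categories, while a very small corpus falls back to the minimum of 2.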
It can be understood that, when the text data is classified by the LDA text clustering model, negative data in the text data can also be filtered out according to the configured settings.
For step S4: analyzing all classified text data belonging to the same category to obtain the text label of each category.
The LDA text clustering model in step S3 classifies the text data, but the resulting categories carry only meaningless distinguishing codes. From these codes alone it is impossible to tell what the text data in each category is about, for example whether it concerns military, cultural, or political topics. In step S4, for each category produced by the LDA text clustering model, intelligent semantic analysis is performed on the text data in that category to obtain a text label for it. After step S4, each previously unlabeled category has a text label that distinguishes it from the other categories. With these text labels, it is possible to quickly identify which topic the text data in each category belongs to.
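The source does not specify how the intelligent semantic analysis is done. As a stand-in, the sketch below labels each category with its most frequent non-stopword terms, which is a simple term-frequency heuristic and not the method of the invention. The stopword set, the whitespace tokenization (Chinese text would need a word segmenter instead), and the function name `label_categories` are assumptions.

```python
from collections import Counter

# Tiny illustrative stopword list (assumption).
STOPWORDS = {"the", "a", "of", "and", "in", "to", "on", "for"}


def label_categories(docs_by_category: dict, top_n: int = 2) -> dict:
    """Map each category code to a label built from its most frequent terms."""
    labels = {}
    for code, docs in docs_by_category.items():
        counts = Counter(
            word
            for doc in docs
            for word in doc.lower().split()
            if word not in STOPWORDS
        )
        # Join the top terms into a short human-readable label.
        labels[code] = "/".join(word for word, _ in counts.most_common(top_n))
    return labels
```

Given two categories whose documents repeatedly mention "army" and "museum" respectively, the function replaces the codes 0 and 1 with those words as labels.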
For step S5: matching the picture data against the labeled standard celebrity picture library to label the picture data.
Picture data is labeled mainly by matching it against the labeled standard celebrity picture library: the persons in the picture data are identified, and the corresponding labels are attached to the picture data.
For step S6: extracting frames from the video data, and matching the extracted frames against the labeled standard celebrity picture library to label the video data.
Video data is labeled in a similar way to picture data. First, frames are extracted from the video data. The extracted frames are then matched against the labeled standard celebrity picture library, the persons in the frames are identified, and the corresponding labels are attached to the video data.
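The matching in steps S5 and S6 can be sketched under strong assumptions. The source does not describe the matching algorithm, so this example assumes each picture or frame has already been converted to a numeric feature vector by some face-recognition model (not shown), and it uses cosine similarity with a hypothetical threshold of 0.8. Frame extraction is reduced to sampling every `every_n`-th frame. All names here are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_label(features, library, threshold=0.8):
    """Return the best-matching library label, or None if below the threshold."""
    best_label, best_score = None, threshold
    for label, reference in library.items():
        score = cosine_similarity(features, reference)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


def label_video(frame_features, library, every_n=25, threshold=0.8):
    """Sample every `every_n`-th frame and collect the distinct matched labels."""
    labels = []
    for features in frame_features[::every_n]:
        label = match_label(features, library, threshold)
        if label is not None and label not in labels:
            labels.append(label)
    return labels
```

A single picture is labeled with `match_label`; a video is labeled with `label_video`, which simply applies the same matching to the sampled frames.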
For step S7: storing the classified and labeled news data in a system for users to browse, and receiving user behavior data sent by the user side.
The news data is classified and labeled and then imported into the system. The user browses the data through the APP on the user side, and the system automatically collects the user behavior data sent by the user side. Such behavior data includes, but is not limited to, clicking on, forwarding, and commenting on news data, dwell time, and dragging a video.
For step S8: calculating and analyzing the user behavior data to obtain a user behavior calculation result, and storing the user behavior calculation result in user configuration information of the user, wherein all user behavior calculation results of the user are stored in the user configuration information.
Different weights are assigned to different behaviors to indicate how much the user likes the news on which each behavior was performed. The received user behavior data is analyzed and calculated to obtain a user behavior calculation result, which is then stored in the user configuration information of the user. Each user corresponds to one piece of user configuration information, which stores all of that user's behavior calculation results and therefore reflects all of the user's operation behaviors.
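A minimal sketch of step S8 follows. The source states only that different behaviors receive different weights; the specific weight values, the `BehaviorResult` and `UserProfile` structures, and the function `record_behavior` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Hypothetical behavior weights (assumption): stronger signals score higher.
BEHAVIOR_WEIGHTS = {"click": 1.0, "comment": 3.0, "forward": 4.0}


@dataclass
class BehaviorResult:
    category: str         # text label of the news item acted upon
    score: float          # weight derived from the behavior type
    created_at: datetime  # generation time of this result


@dataclass
class UserProfile:
    user_id: str
    results: list = field(default_factory=list)  # all results for this user


def record_behavior(profile, category, behavior, created_at):
    """Score one behavior and append the result to the user's profile."""
    score = BEHAVIOR_WEIGHTS.get(behavior, 0.0)  # unknown behaviors score 0
    result = BehaviorResult(category, score, created_at)
    profile.results.append(result)
    return result
```

Here the profile plays the role of the user configuration information: one object per user that accumulates every calculation result.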
For step S9: calculating the preference result of the user from all of the updated user behavior calculation results, and recommending targeted content to the user according to the preference result.
Whenever a new user behavior calculation result is added to the user configuration information, the preference result of the user is calculated from all of the updated user behavior calculation results. This preference result reflects the user's preferences, and the system can use it to accurately recommend content of interest to the user.
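One simple way to turn the stored results into a preference result is to sum the scores per category and keep the highest-scoring categories. The source does not give the formula, so this aggregation, the `(category, score)` input shape, and the name `preference_result` are assumptions.

```python
from collections import defaultdict


def preference_result(results, top_n=3):
    """Return the top categories by total score.

    results: iterable of (category, score) pairs taken from the user's profile.
    """
    totals = defaultdict(float)
    for category, score in results:
        totals[category] += score
    # Highest total first; ties keep first-seen order because sorted() is stable.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [category for category, _ in ranked[:top_n]]
```

The returned category list can then drive which labeled news data is recommended.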
In a preferred embodiment, an effective calculation period is set, the preference result is updated periodically, and all user behavior calculation results within the effective calculation period are recalculated to obtain a new preference result.
It will be appreciated that a user's preferences are time-limited: over time, the user's interests may shift. Because all of the user's behavior calculation results are stored in the user configuration information, calculating the preference from every stored result each time would bias the result toward outdated behavior.
Therefore, preferably, an effective calculation period is set and the user preference is updated periodically: user behavior calculation results stored in the user configuration information that fall outside the effective calculation period are removed, and the preference result is recalculated. In the present embodiment, the effective calculation period is set to three months. It will be appreciated that the effective calculation period may be adjusted as desired.
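The pruning described above can be sketched as a filter on generation time. Three months is approximated here as 90 days, and the `(category, score, created_at)` tuple layout and the name `prune_expired` are assumptions.

```python
from datetime import datetime, timedelta

# Three months, approximated as 90 days (assumption).
EFFECTIVE_PERIOD = timedelta(days=90)


def prune_expired(results, now, period=EFFECTIVE_PERIOD):
    """Keep only results generated within the effective calculation period.

    results: list of (category, score, created_at) tuples.
    """
    cutoff = now - period
    return [r for r in results if r[2] >= cutoff]
```

The preference result would then be recalculated from the list this function returns.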
As a preferred embodiment, the user behavior calculation results are divided into a plurality of statistical categories, and different effective calculation periods are set for the calculation results of different statistical categories.
It will be appreciated that users lose interest in different categories of content at different rates. Different effective calculation periods therefore need to be set for different categories of news data, with longer periods for categories in which interest lasts longer.
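Per-category periods can be sketched with a lookup table and a default. The category names and period lengths below are invented for illustration, as is the function name `prune_by_category`.

```python
from datetime import datetime, timedelta

# Hypothetical per-category effective calculation periods (assumption).
CATEGORY_PERIODS = {
    "sports": timedelta(days=30),
    "technology": timedelta(days=180),
}
DEFAULT_PERIOD = timedelta(days=90)


def prune_by_category(results, now):
    """Drop results older than the period configured for their category.

    results: list of (category, score, created_at) tuples.
    """
    kept = []
    for category, score, created_at in results:
        period = CATEGORY_PERIODS.get(category, DEFAULT_PERIOD)
        if now - created_at <= period:
            kept.append((category, score, created_at))
    return kept
```

With this table, a 72-day-old result is dropped for the short-lived category but kept for the long-lived one and for categories using the default.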
As a preferred embodiment, when calculating the preference result of the user, the closer the generation time of the user behavior result is to the current time, the larger the corresponding calculation weight.
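The source requires only that more recent results carry larger weights; it does not name a weighting function. The sketch below uses exponential decay with a 30-day half-life as one possible choice. The half-life, the tuple layout, and the function names are assumptions.

```python
from datetime import datetime


def recency_weight(created_at, now, half_life_days=30.0):
    """Weight that halves every `half_life_days`; newer results weigh more."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def weighted_preference(results, now, half_life_days=30.0):
    """Sum recency-weighted scores per category.

    results: list of (category, score, created_at) tuples.
    """
    totals = {}
    for category, score, created_at in results:
        weight = recency_weight(created_at, now, half_life_days)
        totals[category] = totals.get(category, 0.0) + score * weight
    return totals
```

Any monotonically decreasing function of age would satisfy the stated requirement; exponential decay is used here only because it is simple to tune with a single parameter.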
As a preferred embodiment, if the user is currently browsing text data, the category of that text data is acquired, and a plurality of text data are selected from the other text data of the category and pushed to the user.
As a preferred embodiment, a specific method for selecting a plurality of text data from the other text data in the category and pushing them to the user is as follows: the other text data under the category is acquired from the system, the acquired text data is sorted by heat, and the top-ranked text data is pushed to the user. That is, text data with higher popularity is pushed to the user preferentially.
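The heat-based selection can be sketched as a filter followed by a sort. The dictionary fields `id`, `category`, and `heat`, and the function name `push_by_heat`, are hypothetical.

```python
def push_by_heat(articles, current_id, category, top_n=3):
    """Return the ids of the hottest other articles in the same category.

    articles: list of dicts with 'id', 'category', and 'heat' keys.
    """
    # Keep articles of the browsed category, excluding the one being read.
    candidates = [
        a for a in articles
        if a["category"] == category and a["id"] != current_id
    ]
    candidates.sort(key=lambda a: a["heat"], reverse=True)
    return [a["id"] for a in candidates[:top_n]]
```

The article currently being browsed is excluded so that only the other text data of the category is pushed.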
As an optional implementation, a specific method for selecting a plurality of text data from the other text data in the category and pushing them to the user is as follows. The other text data under the category is acquired from the system. The acquired text data is sorted by heat to obtain a first ordering, and sorted by release time to obtain a second ordering. A calculation weight is set for each text data in the first ordering according to a first rule, under which the higher the heat of the text data, the larger the weight. A calculation weight is set for each text data in the second ordering according to a second rule, under which the closer the release time is to the current time, the larger the weight. The comprehensive weight of each text data is then calculated: each text data has one heat weight and one release-time weight, and the two are added to obtain the comprehensive weight. The text data under the category is reordered by comprehensive weight to obtain a third ordering, and the top-ranked text data in the third ordering is pushed to the user. That is, when selecting text data to push to a user, both the popularity and the release time of the text data are taken into account. It is understood that the calculation weights can be set as required.
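The three orderings can be sketched as follows. The source leaves the first and second rules open, so this example assumes both rules assign a weight of n minus the position in the ordering, where n is the number of candidates. The field names (`id`, `heat`, `released_at` as a comparable timestamp) and the function name are hypothetical.

```python
def push_by_combined_weight(articles, top_n=3):
    """Rank articles by the sum of a heat-rank weight and a recency-rank weight.

    articles: list of dicts with 'id', 'heat', and 'released_at' keys.
    """
    n = len(articles)
    # First ordering: hottest first. Second ordering: newest first.
    by_heat = sorted(articles, key=lambda a: a["heat"], reverse=True)
    by_time = sorted(articles, key=lambda a: a["released_at"], reverse=True)

    combined = {a["id"]: 0 for a in articles}
    # Assumed first rule: weight = n - position in the heat ordering.
    for position, article in enumerate(by_heat):
        combined[article["id"]] += n - position
    # Assumed second rule: weight = n - position in the release-time ordering.
    for position, article in enumerate(by_time):
        combined[article["id"]] += n - position

    # Third ordering: largest comprehensive weight first.
    third = sorted(articles, key=lambda a: combined[a["id"]], reverse=True)
    return [a["id"] for a in third[:top_n]]
```

Because the two weights are simply added, heat and recency count equally in this sketch; scaling one of them would shift the balance, which corresponds to setting the calculation weights as required.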
The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It should be understood by those skilled in the art that the above embodiments do not limit the present invention in any way, and all technical solutions obtained by using equivalent alternatives or equivalent variations fall within the scope of the present invention.

Claims (10)

1. An information pushing method based on user behaviors, characterized by comprising the following steps:
acquiring a plurality of unlabeled news data, wherein the news data comprises text data, picture data and video data;
cleaning the news data;
classifying the text data through an LDA text clustering model;
analyzing all the classified text data belonging to the same category to obtain a text label of each category;
matching the picture data against a labeled standard celebrity picture library to label the picture data;
extracting frames from the video data, and matching the extracted frames against the labeled standard celebrity picture library to label the video data;
storing the classified and labeled news data in a system for users to browse, and receiving user behavior data sent by a user side;
calculating and analyzing the user behavior data to obtain a user behavior calculation result, and storing the user behavior calculation result in user configuration information of a user, wherein all the user behavior calculation results of the user are stored in the user configuration information;
and calculating the preference result of the user according to the updated user behavior calculation results, and recommending targeted content to the user according to the preference result.
2. The information pushing method based on user behaviors as claimed in claim 1, wherein
an effective calculation period is set, the preference result is updated periodically, and all the user behavior calculation results within the effective calculation period are recalculated to obtain a new preference result.
3. The information pushing method based on user behaviors as claimed in claim 2, wherein
the user behavior calculation results are divided into a plurality of statistical categories, and different effective calculation periods are set for the calculation results of different statistical categories.
4. The information pushing method based on user behaviors as claimed in claim 1, wherein
when the preference result is calculated, different calculation weights are set according to the generation time of the user behavior calculation results.
5. The information pushing method based on user behaviors as claimed in claim 1, wherein
when the preference result of the user is calculated, the closer the generation time of a user behavior calculation result is to the current time, the larger the corresponding calculation weight.
6. The information pushing method based on user behaviors as claimed in claim 1, wherein
if the user is currently browsing text data, the category of the text data being browsed is obtained, and a plurality of text data are selected from the other text data of the category and pushed to the user.
7. The information pushing method based on user behaviors as claimed in claim 6, wherein
the specific method for selecting a plurality of text data from the other text data of the category and pushing them to the user comprises the following steps:
acquiring the other text data under the category from the system;
sorting the acquired text data by heat;
and pushing the top-ranked text data to the user.
8. The information pushing method based on user behaviors as claimed in claim 6, wherein
the specific method for selecting a plurality of text data from the other text data of the category and pushing them to the user comprises the following steps:
acquiring the other text data under the category from the system;
sorting the acquired text data by heat to obtain a first ordering;
sorting the acquired text data by release time to obtain a second ordering;
setting a calculation weight for each text data in the first ordering according to a first rule;
setting a calculation weight for each text data in the second ordering according to a second rule;
calculating the comprehensive weight of each text data;
reordering the text data under the category by comprehensive weight to obtain a third ordering;
and pushing the top-ranked text data in the third ordering to the user.
9. The information pushing method based on user behaviors as claimed in claim 1, wherein
the text data is classified through an LDA text clustering model, and the number of categories is set as required.
10. The information pushing method based on user behaviors as claimed in claim 9, wherein
the number of categories is set according to the total amount of the text data.
CN202011084095.1A 2020-10-12 2020-10-12 Information pushing method based on user behaviors Pending CN112182396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011084095.1A CN112182396A (en) 2020-10-12 2020-10-12 Information pushing method based on user behaviors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011084095.1A CN112182396A (en) 2020-10-12 2020-10-12 Information pushing method based on user behaviors

Publications (1)

Publication Number Publication Date
CN112182396A (en) 2021-01-05

Family

ID=73948104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011084095.1A Pending CN112182396A (en) 2020-10-12 2020-10-12 Information pushing method based on user behaviors

Country Status (1)

Country Link
CN (1) CN112182396A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156204A (en) * 2015-04-23 2016-11-23 深圳市腾讯计算机系统有限公司 The extracting method of text label and device
CN106028071A (en) * 2016-05-17 2016-10-12 Tcl集团股份有限公司 Video recommendation method and system
CN108810095A (en) * 2018-05-18 2018-11-13 歌尔科技有限公司 A kind of news push method and apparatus
CN108984657A (en) * 2018-06-28 2018-12-11 Oppo广东移动通信有限公司 Image recommendation method and apparatus, terminal, readable storage medium storing program for executing
US20200125574A1 (en) * 2018-10-18 2020-04-23 Oracle International Corporation Smart content recommendations for content authors
CN110069663A (en) * 2019-04-29 2019-07-30 厦门美图之家科技有限公司 Video recommendation method and device

Similar Documents

Publication Publication Date Title
CN108009228B (en) Method and device for setting content label and storage medium
CN110059271B (en) Searching method and device applying tag knowledge network
CN106202475B (en) Method and device for pushing video recommendation list
CN105701498B (en) User classification method and server
CN103229169B (en) Content providing and system
US9798741B2 (en) Interactive image selection method
CN106326391A (en) Method and device for recommending multimedia resources
CN102737029A (en) Searching method and system
CN110543598A (en) information recommendation method and device and terminal
CN103577478A (en) Web page pushing method and system
CN109903127A (en) A kind of group recommending method, device, storage medium and server
CN108256537A (en) A kind of user gender prediction method and system
CN110502664A (en) Video tab indexes base establishing method, video tab generation method and device
CN111597446B (en) Content pushing method and device based on artificial intelligence, server and storage medium
CN111914172A (en) Medical information recommendation method and system based on user tags
CN110163703A (en) A kind of disaggregated model method for building up, official documents and correspondence method for pushing and server
CN111400586A (en) Group display method, terminal, server, system and storage medium
CN111914079A (en) Topic recommendation method and system based on user tags
CN111861550A (en) OTT (over the Top) equipment-based family portrait construction method and system
CN113626638A (en) Short video recommendation processing method and device, intelligent terminal and storage medium
KR20140010679A (en) System and method for recommendation
CN108810577B (en) User portrait construction method and device and electronic equipment
CN117235362A (en) Analysis system based on big data of wisdom text travel
CN112269906A (en) Automatic extraction method and device of webpage text
CN114282119B (en) Scientific and technological information resource retrieval method and system based on heterogeneous information network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210105