CN109189892B - Recommendation method and device based on article comments - Google Patents

Recommendation method and device based on article comments

Info

Publication number
CN109189892B
Authority
CN
China
Prior art keywords
comment
classes
article
processed
comments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811084474.3A
Other languages
Chinese (zh)
Other versions
CN109189892A (en
Inventor
孔滕
王国斐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yidian Wangju Technology Co ltd
Original Assignee
Beijing Yidian Wangju Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yidian Wangju Technology Co ltd filed Critical Beijing Yidian Wangju Technology Co ltd
Priority to CN201811084474.3A priority Critical patent/CN109189892B/en
Publication of CN109189892A publication Critical patent/CN109189892A/en
Application granted granted Critical
Publication of CN109189892B publication Critical patent/CN109189892B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a recommendation method and device based on article comments, belonging to the technical field of computers. The method comprises: collecting user comments; dividing the comments into N classes by unsupervised clustering; finding the most representative keywords in the N classes by means of information gain and the chi-square test; expanding the keywords through a shallow neural network to generate a comment lexicon comprising a plurality of broad comment categories; determining whether a comment to be processed matches the comment lexicon; and, if the comment to be processed matches any broad comment category in the comment lexicon, recommending the article corresponding to that category to the user. In this way, feature labels in more dimensions are added to the articles and the accuracy of article recommendation is effectively improved, thereby overcoming the technical problem in the prior art that articles cannot be recommended accurately.

Description

Recommendation method and device based on article comments
Technical Field
The invention relates to the technical field of computers, in particular to a recommendation method and device based on article comments.
Background
Current content recommendation mainly computes a user portrait from basic user information and user behavior, computes an article portrait from basic article information (including title, content, source, and the like), and recommends articles that match the user portrait through a correlation calculation. However, features in some dimensions are difficult or costly to identify from the body of the article alone, such as clickbait ("headline party" articles whose content does not match the headline), plagiarized articles, rumors, and embedded advertisements.
Disclosure of Invention
The recommendation method and device based on article comments provided by the embodiments of the invention can solve the technical problem in the prior art that articles cannot be accurately identified.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
In a first aspect, an article comment-based recommendation method provided in an embodiment of the present invention includes: collecting user comments; dividing the comments into N classes by unsupervised clustering; finding the most representative keywords in the N classes by means of information gain and the chi-square test; expanding the keywords through a shallow neural network to generate a comment lexicon comprising a plurality of broad comment categories; determining whether a comment to be processed matches the comment lexicon; and, if the comment to be processed matches any broad comment category in the comment lexicon, recommending the article corresponding to that category to the user.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation manner of the first aspect, and after determining whether the comment to be processed matches the comment lexicon, the method further includes: and if the to-be-processed comment is not matched with any of the comment large classes in the comment word stock, filtering the to-be-processed comment.
With reference to the first aspect, an embodiment of the present invention provides a second possible implementation manner of the first aspect, in which recommending the article corresponding to the broad comment category to the user includes: determining whether the article corresponding to the broad comment category meets a preset requirement; and if so, recommending the article to the user through multiple channels.
With reference to the second possible implementation manner of the first aspect, an embodiment of the present invention provides a third possible implementation manner of the first aspect, and after determining whether the article corresponding to the review broad category meets a preset requirement, the method further includes: if the article corresponding to the comment large class does not meet the preset requirement, marking the comment to be processed; and clustering the marked comments to be processed.
With reference to the first aspect, an embodiment of the present invention provides a fourth possible implementation manner of the first aspect, and finding out the most representative keywords in the N classes in a manner of information gain and chi-square test includes: determining a candidate set from the N classes by a chi-squared test and information gain method; performing K-means clustering on the candidate set according to a preset word vector to obtain M classes; and determining the correlation and information entropy between the word vectors in the candidate set and the corresponding M classes, and selecting the candidate words with small information entropy as the keywords.
In a second aspect, an article comment-based recommendation apparatus provided by an embodiment of the present invention includes: the collection unit is used for collecting user comments; a first processing unit for dividing the comments into N classes by unsupervised clustering; the second processing unit is used for finding out the most representative key words in the N classes in a mode of information gain and chi-square test; the third processing unit is used for expanding the keywords through a shallow neural network to generate a plurality of comment word libraries of comment large classes; the fourth processing unit is used for determining whether the comment to be processed is matched with the comment lexicon or not; and the recommending unit is used for recommending the article corresponding to the large comment class to the user if the to-be-processed comment is matched with any large comment class in the comment word stock.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation manner of the second aspect, and after the fourth processing unit, the apparatus further includes: and the fifth processing unit is used for filtering the to-be-processed comment if the to-be-processed comment is not matched with any of the large classes of comments in the comment word stock.
With reference to the second aspect, an embodiment of the present invention provides a second possible implementation manner of the second aspect, where the recommending unit includes: a first subunit, used for determining whether the article corresponding to the broad comment category meets a preset requirement; and a second subunit, used for recommending the article to the user through multiple channels if the article meets the preset requirement.
With reference to the second possible implementation manner of the second aspect, an embodiment of the present invention provides a third possible implementation manner of the second aspect, and after the first subunit, the method further includes: the third subunit is configured to label the comment to be processed if the article corresponding to the comment large category does not meet the preset requirement; and the fourth subunit is used for clustering the marked comments to be processed.
With reference to the second aspect, an embodiment of the present invention provides a fourth possible implementation manner of the second aspect, where the second processing unit is further configured to: determining a candidate set from the N classes by a chi-squared test and information gain method; performing K-means clustering on the candidate set according to a preset word vector to obtain M classes; and determining the correlation and information entropy between the word vectors in the candidate set and the corresponding M classes, and selecting the candidate words with small information entropy as the keywords.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
The recommendation method and device based on article comments provided by the embodiments of the invention collect user comments; divide the comments into N classes by unsupervised clustering; find the most representative keywords in the N classes by means of information gain and the chi-square test; expand the keywords through a shallow neural network to generate a comment lexicon comprising a plurality of broad comment categories; determine whether a comment to be processed matches the comment lexicon; and, if the comment to be processed matches any broad comment category in the comment lexicon, recommend the article corresponding to that category to the user. In this way, feature labels in more dimensions are added to the articles and the accuracy of article recommendation is effectively improved, thereby overcoming the technical problem in the prior art that articles cannot be recommended accurately.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a flowchart of an article review-based recommendation method according to a first embodiment of the present invention;
fig. 2 is a functional module schematic diagram of an article review-based recommendation apparatus according to a second embodiment of the present invention;
fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are some, but not all, embodiments of the present invention. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention. All other embodiments that can be derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
First embodiment
Please refer to fig. 1, which is a flowchart illustrating a recommendation method based on article comments according to a first embodiment of the present invention. The specific process shown in FIG. 1 will be described in detail below.
Step S101, collecting user comments.
The user comment refers to a comment of a user on an article.
In actual use, user comments on all currently available kinds of articles can be collected. Alternatively, a preset number of user comments can be collected for each category of article.
The preset number can be chosen according to actual requirements; in general, the larger the preset number, the better.
Step S102, the comments are divided into N classes through unsupervised clustering.
In actual use, the comments can be divided into N classes based on the K-means algorithm, where N is a positive integer. For example, taking N = 2, two points are randomly chosen as the cluster centroids; the distances between every comment and the two centroids are then computed, and each comment is assigned to the class of the nearer centroid; the centroids are recalculated from the newly divided clusters, and the process repeats until a given requirement (such as a minimum mean square error) is met. In this way the comments are clustered into N classes.
For example, reviews are classified in different dimensions according to semantic attributes, such as article quality, positive energy, rumors, advertisements, plagiarisms, falseness, smelling, irrelevance, abuse, and the like.
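The clustering loop of step S102 can be sketched as follows. This is a minimal illustration, assuming each comment has already been converted to a numeric vector (the patent does not say how); the function names, the fixed iteration cap, and the toy vectors are all hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(vectors, n_classes, iterations=100, seed=0):
    """Plain K-means: pick random centroids, assign, recompute, repeat."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_classes)]
    assignments = None
    for _ in range(iterations):
        # Assign every comment vector to its nearest centroid.
        new_assignments = [
            min(range(n_classes), key=lambda k: squared_distance(v, centroids[k]))
            for v in vectors
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so the clusters are stable
        assignments = new_assignments
        # Recompute each centroid from the comments currently assigned to it.
        for k in range(n_classes):
            members = [v for v, a in zip(vectors, assignments) if a == k]
            if members:
                centroids[k] = mean_vector(members)
    return assignments, centroids

# Toy comment vectors (assumed, for illustration only).
comment_vectors = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
                   [5.0, 5.1], [5.1, 4.9], [4.9, 5.0]]
labels, centers = kmeans(comment_vectors, n_classes=2)
```

Here the stopping rule is "assignments no longer change"; a threshold on the mean square error, as the text mentions, would be an equally reasonable choice.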
Step S103, finding the most representative keywords in the N classes by means of information gain and the chi-square test.
The information gain is an asymmetric measure of the difference between two probability distributions P and Q. It describes the extra coding cost incurred when samples from P are encoded using a code based on Q rather than a code based on P. Usually P represents the distribution of the samples or observations, or possibly an exactly calculated theoretical distribution, while Q represents a theory, model, description, or approximation of P.
The chi-square test measures the degree of deviation between the actual observed values of a sample and the theoretically inferred values, and this degree of deviation determines the size of the chi-square value: the larger the chi-square value, the greater the deviation and the less consistent the two are; the smaller the chi-square value, the smaller the deviation and the more consistent they are. If the two are exactly equal, the chi-square value is 0, indicating that the observed values agree completely with the theoretical values.
In this embodiment, information gain and the chi-square test make it possible to find more representative keywords for each class, so that the corresponding articles can be identified and retrieved more easily and quickly, and the recommendation of those articles to the user can be completed quickly.
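Both quantities can be computed in a few lines. The sketch below assumes the information gain is evaluated as the relative entropy (Kullback-Leibler divergence) that the description above matches, and that the chi-square statistic is computed on a contingency table of counts (for example, keyword present or absent versus comment inside or outside a class). The function names and the sample numbers are illustrative, not taken from the patent.

```python
import math

def chi_square(observed):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (obs - expected) ** 2 / expected
    return stat

def kl_divergence(p, q):
    """Relative entropy D(P || Q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Rows: keyword present / absent; columns: comment in class / not in class.
dependent = chi_square([[30, 10], [10, 30]])    # keyword strongly tied to the class
independent = chi_square([[10, 10], [10, 10]])  # keyword unrelated to the class

p = [0.5, 0.5]
q = [0.25, 0.75]
forward_gain = kl_divergence(p, q)
reverse_gain = kl_divergence(q, p)  # differs from forward_gain: the measure is asymmetric
```

A keyword whose table gives a large chi-square value is a stronger candidate for representing its class than one whose value is close to 0.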
As an embodiment, step S103 includes: determining a candidate set from the N classes by the chi-square test and the information gain method; performing K-means clustering on the candidate set according to preset word vectors to obtain M classes; and determining the correlation and information entropy between the word vectors in the candidate set and the corresponding M classes, and selecting the candidate words with small information entropy as the keywords. For example, in practical use, a candidate set of representative keywords is found by the chi-square test and the information gain method; a word vector model is trained separately for each class in advance; K-means clustering is performed according to the word vector model to obtain M classes; finally, the correlation between the word vectors in the candidate set and the corresponding M classes is calculated, the information entropy is calculated, and the candidate words with small information entropy are selected as the final representative words, namely the keywords.
Where M is a positive integer, M may be equal to N.
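One possible reading of the entropy-based selection is sketched below. The patent does not specify how the correlation or the entropy is computed, so this sketch assumes cosine similarity between a candidate word vector and each of the M cluster centroids, a softmax (with an assumed temperature) to turn the similarities into a distribution, and Shannon entropy over that distribution. All names and vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def softmax(scores, temperature=0.1):
    """Turn similarity scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def select_keywords(candidates, centroids, top_k):
    """Keep the top_k candidate words whose class distribution has the lowest entropy."""
    scored = []
    for word, vector in candidates.items():
        probs = softmax([cosine(vector, c) for c in centroids])
        scored.append((entropy(probs), word))
    scored.sort()
    return [word for _, word in scored[:top_k]]

# Two class centroids (M = 2) and three candidate words, all assumed.
centroids = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "plagiarized": [0.9, 0.1],  # close to class 0 only: low entropy
    "the": [0.5, 0.5],          # equally close to both classes: high entropy
    "rumor": [0.05, 0.95],      # close to class 1 only: low entropy
}
keywords = select_keywords(candidates, centroids, top_k=2)
```

A word spread evenly over the M classes has high entropy and says little about any one class, which is why low-entropy candidates are kept.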
In the embodiment, the most representative keywords are found by selecting a proper mode, so that the accuracy of article recommendation can be improved when articles are recommended.
Step S104, expanding the keywords through a shallow neural network to generate a comment lexicon comprising a plurality of broad comment categories.
The shallow neural network comprises an input layer, an average pooling layer, a word vector layer (output layer), a ReLU (rectified linear unit) activation function, and a Softmax classifier. The input layer receives a plurality of word tokens and cluster tokens; the average pooling layer averages the data from the input layer and passes the result to the word vector layer; the output of the word vector layer is passed through the activation function and then classified by the Softmax classifier. The comment lexicon is thereby divided into a plurality of broad comment categories.
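The forward pass of such a network can be sketched as follows. The patent gives no dimensions, weight shapes, or training procedure, so everything numeric here is an assumption: the weights are random and untrained, the final projection onto the categories is added so that Softmax has one score per category, and the token ids are made up.

```python
import math
import random

def average_pool(vectors):
    """Average a list of token vectors component-wise."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def linear(x, weights, bias):
    """Dense layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    top = max(x)
    exps = [math.exp(v - top) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def forward(token_ids, params):
    """Input tokens -> average pooling -> word vector layer -> ReLU -> Softmax."""
    pooled = average_pool([params["embeddings"][t] for t in token_ids])
    hidden = relu(linear(pooled, params["w_vec"], params["b_vec"]))
    return softmax(linear(hidden, params["w_out"], params["b_out"]))

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Assumed sizes: 10 tokens in the vocabulary, 4-dim inputs, 3-dim word
# vectors, 5 broad comment categories.
rng = random.Random(0)
params = {
    "embeddings": random_matrix(rng, 10, 4),
    "w_vec": random_matrix(rng, 3, 4),
    "b_vec": [0.0] * 3,
    "w_out": random_matrix(rng, 5, 3),
    "b_out": [0.0] * 5,
}
probs = forward([1, 4, 7], params)  # word tokens and cluster tokens as ids
```

After training, words whose learned vectors lie near a keyword's vector would be natural candidates for expanding that keyword's category.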
In this embodiment, after the comments are clustered by unsupervised clustering, the most representative keywords of each class, namely those that differ most from the other classes, are found, and the keywords are semantically expanded to form a comment lexicon with dozens of comment categories. The words in the comment lexicon capture a semantic understanding specific to comments. By analyzing and identifying article comments (that is, user comments) by broad category, the features of an article can be expanded more accurately from an objective perspective and the article can be understood more deeply, thereby improving the accuracy of article recommendation.
In the embodiment, keyword expansion is realized by selecting a shallow neural network mode, so that more deep understanding of the articles is improved, and the accuracy of article recommendation is improved.
Step S105, determining whether the comment to be processed is matched with the comment word stock.
The to-be-processed comment may be the user comment acquired in step S101, or may be a newly acquired user comment, and is used to recommend an appropriate article to the user according to the to-be-processed comment.
In actual use, a hit rate (or matching degree) is calculated by matching the comment to be processed against the comment lexicon. When the hit rate is zero, the comment to be processed does not match the comment lexicon; otherwise, it matches.
The comment lexicon comprises a plurality of broad comment categories; if the comment to be processed matches at least one of them, it matches the comment lexicon.
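The matching rule can be sketched as below. The patent does not define how the hit rate is computed, so this sketch assumes it is the fraction of a comment's tokens found in a category's word set; the lexicon contents and function names are hypothetical.

```python
def hit_rates(comment_tokens, lexicon):
    """Fraction of the comment's tokens found in each category's word set."""
    if not comment_tokens:
        return {category: 0.0 for category in lexicon}
    return {
        category: sum(1 for t in comment_tokens if t in words) / len(comment_tokens)
        for category, words in lexicon.items()
    }

def matched_categories(comment_tokens, lexicon):
    """Categories with a non-zero hit rate; an empty list means no match."""
    return [c for c, rate in hit_rates(comment_tokens, lexicon).items() if rate > 0]

# Assumed lexicon: category name -> set of expanded keywords.
lexicon = {
    "plagiarism": {"copied", "stolen", "plagiarized"},
    "rumor": {"fake", "hoax", "rumor"},
}

rates = hit_rates(["this", "article", "is", "copied"], lexicon)
matches = matched_categories(["this", "article", "is", "copied"], lexicon)
no_matches = matched_categories(["nice", "weather"], lexicon)  # would be filtered
```

A comment with an empty match list corresponds to the case in which the comment does not match the lexicon.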
In actual use, the accuracy rate of hitting the articles through negative comments is over 70%, and the accuracy rate of hitting the articles through positive comments is over 85%.
In an optional embodiment, after step S105, the method further includes: and if the to-be-processed comment is not matched with any comment large class in the comment word stock, filtering the to-be-processed comment (or ignoring the to-be-processed comment and not performing other subsequent processing on the to-be-processed comment).
Step S106, if the comment to be processed matches any broad comment category in the comment lexicon, recommending the article corresponding to that category to the user.
And when the comment to be processed is matched with at least one comment large class in the comment word stock, recommending the article corresponding to the comment large class to the user.
Optionally, recommending the article corresponding to the broad comment category to the user includes: determining whether the article corresponding to the broad comment category meets a preset requirement; and if so, recommending the article to the user through multiple channels. With multi-channel recommendation, the user can receive the article more reliably, avoiding the situation in which the user cannot receive the recommended article because a particular channel has a problem.
The preset requirement may concern article quality: if the article's quality score is greater than or equal to 80 points, the article corresponding to the broad comment category is determined to meet the preset requirement; otherwise, it does not.
In an optional embodiment, after determining whether the article corresponding to the comment broad class meets a preset requirement, the method further includes: if the article corresponding to the comment large class does not meet the preset requirement, marking the comment to be processed; and clustering the marked comments to be processed.
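The branch between recommending and marking can be sketched as follows. The threshold of 80 comes from the text; the comment structure, the channel names, and the function name are hypothetical, and how the quality score is produced is left outside the sketch.

```python
QUALITY_THRESHOLD = 80  # preset requirement from the text: quality score >= 80

def handle_matched_comment(comment, article_quality, channels, marked_comments):
    """Recommend through every channel if the article passes; otherwise mark the comment."""
    if article_quality >= QUALITY_THRESHOLD:
        return [(channel, comment["article_id"]) for channel in channels]
    # The article fails the preset requirement: keep the comment for later clustering.
    marked_comments.append(comment)
    return []

marked = []
channels = ["push", "feed"]  # assumed channel names

sent = handle_matched_comment(
    {"article_id": "a1", "text": "well researched"}, 85, channels, marked)
skipped = handle_matched_comment(
    {"article_id": "a2", "text": "copied from elsewhere"}, 60, channels, marked)
```

The comments collected in `marked` are the ones that would subsequently be clustered.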
The comments to be processed are marked to generate real training samples, and a recurrent neural network is retrained on them according to the semantic attributes of the comments, for use in identifying and classifying comments. Because the grammar and wording of user comments are highly irregular, a conventional word segmenter has difficulty segmenting them reasonably; preferably, individual characters are used as the basic processing units, which avoids errors caused by incorrect word segmentation and greatly reduces the size of the model. In addition, considering that user comments have a sequential (time-series) relationship, the vector generated by the recurrent neural network is used as the feature of the comment, and a 2-layer neural network is used for prediction. Because the marked comments serve as real and effective training data, the training effect can be continuously enhanced, and retraining effectively improves recognition precision.
In the recommendation method based on article comments provided by this embodiment, user comments are collected; the comments are divided into N classes by unsupervised clustering; the most representative keywords in the N classes are found by means of information gain and the chi-square test; the keywords are expanded through a shallow neural network to generate a comment lexicon comprising a plurality of broad comment categories; whether a comment to be processed matches the comment lexicon is determined; and, if the comment to be processed matches any broad comment category in the comment lexicon, the article corresponding to that category is recommended to the user. In this way, feature labels in more dimensions are added to the articles and the accuracy of article recommendation is effectively improved, thereby overcoming the technical problem in the prior art that articles cannot be recommended accurately.
Second embodiment
Corresponding to the recommendation method based on article comments in the first embodiment, Fig. 2 shows a recommendation apparatus based on article comments whose units correspond one-to-one to the steps of that method. As shown in Fig. 2, the article comment-based recommendation apparatus 400 includes a collecting unit 410, a first processing unit 420, a second processing unit 430, a third processing unit 440, a fourth processing unit 450, and a recommending unit 460. The functions implemented by these units correspond one-to-one to the corresponding steps in the first embodiment and, to avoid redundancy, are not described in detail in this embodiment.
And the collecting unit 410 is used for collecting the user comments.
A first processing unit 420 for dividing the comments into N classes by unsupervised clustering.
And the second processing unit 430 is configured to find out the most representative keywords in the N categories by means of information gain and chi-square test.
Optionally, the second processing unit is further configured to: determining a candidate set from the N classes by a chi-squared test and information gain method; performing K-means clustering on the candidate set according to a preset word vector to obtain M classes; and determining the correlation and information entropy between the word vectors in the candidate set and the corresponding M classes, and selecting the candidate words with small information entropy as the keywords.
And the third processing unit 440 is configured to expand the keyword through a shallow neural network to generate a plurality of review word libraries of review broad categories.
And the fourth processing unit 450 is configured to determine whether the comment to be processed matches the comment lexicon.
And the recommending unit 460 is configured to recommend the article corresponding to the large comment class to the user if the to-be-processed comment is matched with any large comment class in the comment lexicon.
Optionally, the recommending unit 460 includes: a first subunit and a second subunit.
The first subunit is used for determining whether the article corresponding to the comment large class meets a preset requirement or not;
and the second subunit is used for recommending the article to the user through multiple channels if the article meets the preset requirement.
Optionally, after the first sub-unit, the recommending unit 460 further includes: a third subunit and a fourth subunit.
The third subunit is configured to label the comment to be processed if the article corresponding to the comment large category does not meet the preset requirement;
and the fourth subunit is used for clustering the marked comments to be processed.
In an optional embodiment, after the fourth processing unit 450, the apparatus 400 further includes:
and the fifth processing unit is used for filtering the to-be-processed comment if the to-be-processed comment is not matched with any of the large classes of comments in the comment word stock.
Third embodiment
Fig. 3 is a schematic diagram of an electronic device 300. The electronic device 300 includes a memory 302, a processor 304, and a computer program 303 stored in the memory 302 and capable of running on the processor 304. When executed by the processor 304, the computer program 303 implements the article comment-based recommendation method in the first embodiment; to avoid repetition, details are not repeated here. Alternatively, when executed by the processor 304, the computer program 303 implements the functions of each module/unit in the article comment-based recommendation apparatus of the second embodiment; to avoid repetition, details are not repeated here.
Illustratively, the computer program 303 may be partitioned into one or more modules/units, which are stored in the memory 302 and executed by the processor 304 to implement the present invention. One or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 303 in the electronic device 300. For example, the computer program 303 may be divided into the acquisition unit 410, the first processing unit 420, the second processing unit 430, the third processing unit 440, the fourth processing unit 450, and the recommendation unit 460 in the second embodiment, and specific functions of each unit are as described in the first embodiment or the second embodiment, which are not described herein again.
The electronic device 300 may be a desktop computer, a notebook, a palmtop, or a smart phone.
The Memory 302 may be, but is not limited to, a Random Access Memory (RAM), a Read Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Read-Only Memory (EPROM), an electrically Erasable Read-Only Memory (EEPROM), and the like. The memory 302 is used for storing a program, and the processor 304 executes the program after receiving an execution instruction, and the method defined by the flow disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 304, or implemented by the processor 304.
The processor 304 may be an integrated circuit chip having signal processing capabilities. The Processor 304 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field-Programmable Gate arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It is understood that the configuration shown in fig. 3 is only a schematic configuration of the electronic device 300, and the electronic device 300 may further include more or less components than those shown in fig. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof.
Fourth embodiment
An embodiment of the present invention further provides a storage medium in which instructions are stored. When the instructions are run on a computer (that is, when the computer program is executed by a processor), the article comment-based recommendation method in the first embodiment is implemented; to avoid repetition, details are not repeated here. Alternatively, the computer program, when executed by the processor, implements the functions of each module/unit in the article comment-based recommendation apparatus of the second embodiment; to avoid redundancy, details are not repeated here.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by hardware, or by software plus a necessary general hardware platform, and based on such understanding, the technical solution of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions to make a computer device (which can be a personal computer, a server, or a network device, etc.) execute the method of the various implementation scenarios of the present invention.
In summary, the recommendation method and device based on article comments provided by the invention collect user comments; divide the comments into N classes by unsupervised clustering; find the most representative keywords in the N classes by means of information gain and the chi-square test; expand the keywords through a shallow neural network to generate a comment lexicon comprising a plurality of broad comment categories; determine whether a comment to be processed matches the comment lexicon; and, if the comment to be processed matches any broad comment category in the comment lexicon, recommend the article corresponding to that category to the user. In this way, feature labels in more dimensions are added to the articles and the accuracy of article recommendation is effectively improved, thereby overcoming the technical problem in the prior art that articles cannot be recommended accurately.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention. It should be noted that like reference numbers and letters refer to like items in the following figures; thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.

Claims (8)

1. A recommendation method based on article comments, characterized by comprising the following steps:
collecting user comments;
dividing the comments into N classes by unsupervised clustering;
finding the most representative keywords in the N classes by means of information gain and the chi-square test;
expanding the keywords through a shallow neural network to generate comment lexicons for a plurality of broad comment categories;
determining whether a comment to be processed matches the comment lexicons;
if the comment to be processed matches any one of the broad comment categories in the comment lexicons, recommending an article corresponding to that broad comment category to the user;
wherein recommending the article corresponding to the broad comment category to the user comprises:
determining whether the article corresponding to the broad comment category meets a preset requirement;
if the article corresponding to the broad comment category does not meet the preset requirement, marking the comment to be processed;
and clustering the marked comments to be processed.
2. The method of claim 1, further comprising, after determining whether the comment to be processed matches the comment lexicons:
if the comment to be processed does not match any of the broad comment categories in the comment lexicons, filtering out the comment to be processed.
3. The method of claim 1, further comprising, after determining whether the article corresponding to the broad comment category meets the preset requirement:
if the article meets the preset requirement, recommending the article to the user through multiple channels.
4. The method of claim 1, wherein finding the most representative keywords in the N classes by means of information gain and the chi-square test comprises:
determining a candidate set from the N classes by the chi-square test and the information gain method;
performing K-means clustering on the candidate set according to preset word vectors to obtain M classes;
and determining the correlation and information entropy between the word vectors in the candidate set and the corresponding M classes, and selecting the candidate words with low information entropy as the keywords.
5. A recommendation apparatus based on article comments, comprising:
a collection unit for collecting user comments;
a first processing unit for dividing the comments into N classes by unsupervised clustering;
a second processing unit for finding the most representative keywords in the N classes by means of information gain and the chi-square test;
a third processing unit for expanding the keywords through a shallow neural network to generate comment lexicons for a plurality of broad comment categories;
a fourth processing unit for determining whether a comment to be processed matches the comment lexicons;
a recommending unit for recommending the article corresponding to a broad comment category to the user if the comment to be processed matches any one of the broad comment categories in the comment lexicons;
wherein the recommending unit includes:
a first subunit for determining whether the article corresponding to the broad comment category meets a preset requirement;
a third subunit for marking the comment to be processed if the article corresponding to the broad comment category does not meet the preset requirement;
and a fourth subunit for clustering the marked comments to be processed.
6. The apparatus of claim 5, further comprising, after the fourth processing unit:
a fifth processing unit for filtering out the comment to be processed if the comment to be processed does not match any of the broad comment categories in the comment lexicons.
7. The apparatus of claim 5, wherein the recommending unit further comprises:
a second subunit for recommending the article to the user through multiple channels if the article meets the preset requirement.
8. The apparatus of claim 5, wherein the second processing unit is further configured for:
determining a candidate set from the N classes by the chi-square test and the information gain method;
performing K-means clustering on the candidate set according to preset word vectors to obtain M classes;
and determining the correlation and information entropy between the word vectors in the candidate set and the corresponding M classes, and selecting the candidate words with low information entropy as the keywords.
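The candidate-set step recited in claims 4 and 8 can be illustrated with a small, hedged Python sketch. It scores each term against one class using the standard 2x2 chi-square statistic and binary information gain, then keeps terms that pass both thresholds. The thresholds, the rule of keeping only positively associated terms, and the way the two scores are combined are assumptions made for illustration; the claims do not specify them. The later K-means and information-entropy steps are not shown.

```python
import math
from typing import List, Sequence, Set, Tuple


def entropy(counts: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def contingency(term: str, docs: List[Set[str]], labels: List[int],
                target: int) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d): term present/absent vs. in/out of the target class."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1          # term present, in class
        elif has_term:
            b += 1          # term present, other class
        elif in_class:
            c += 1          # term absent, in class
        else:
            d += 1          # term absent, other class
    return a, b, c, d


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 term/class contingency table."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def information_gain(a: int, b: int, c: int, d: int) -> float:
    """Reduction in class entropy from knowing whether the term is present."""
    n = a + b + c + d
    prior = entropy([a + c, b + d])
    conditional = ((a + b) / n) * entropy([a, b]) + ((c + d) / n) * entropy([c, d])
    return prior - conditional


def select_candidates(docs: List[Set[str]], labels: List[int], target: int,
                      min_chi2: float = 3.84, min_ig: float = 0.1) -> List[str]:
    """Keep terms that pass both thresholds (threshold values are assumptions)."""
    vocab = sorted(set().union(*docs))
    selected = []
    for term in vocab:
        a, b, c, d = contingency(term, docs, labels, target)
        if a * d <= b * c:
            continue  # assumption: keep only terms positively tied to the class
        if chi_square(a, b, c, d) >= min_chi2 and information_gain(a, b, c, d) >= min_ig:
            selected.append(term)
    return selected
```

Here each document is a set of tokens and `labels` holds the cluster index assigned by the earlier unsupervised clustering step; both representations are hypothetical.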
CN201811084474.3A 2018-09-17 2018-09-17 Recommendation method and device based on article comments Active CN109189892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811084474.3A CN109189892B (en) 2018-09-17 2018-09-17 Recommendation method and device based on article comments

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811084474.3A CN109189892B (en) 2018-09-17 2018-09-17 Recommendation method and device based on article comments

Publications (2)

Publication Number Publication Date
CN109189892A CN109189892A (en) 2019-01-11
CN109189892B CN109189892B (en) 2021-04-27

Family

ID=64912027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811084474.3A Active CN109189892B (en) 2018-09-17 2018-09-17 Recommendation method and device based on article comments

Country Status (1)

Country Link
CN (1) CN109189892B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109885770B (en) * 2019-02-20 2022-01-07 杭州威佩网络科技有限公司 Information recommendation method and device, electronic equipment and storage medium
CN111858901A (en) * 2019-04-30 2020-10-30 北京智慧星光信息技术有限公司 Text recommendation method and system based on semantic similarity
CN110347781B (en) * 2019-07-18 2023-10-20 深圳市雅阅科技有限公司 Article reverse arrangement method, article recommendation method, device, equipment and storage medium
CN111783468B (en) * 2020-06-28 2023-08-15 百度在线网络技术(北京)有限公司 Text processing method, device, equipment and medium
CN112348662B (en) * 2020-10-21 2023-04-07 上海淇玥信息技术有限公司 Risk assessment method and device based on user occupation prediction and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102200981A (en) * 2010-03-25 2011-09-28 三星电子(中国)研发中心 Feature selection method and feature selection device for hierarchical text classification
CN103226597A (en) * 2013-04-19 2013-07-31 北京集奥聚合科技有限公司 Keyword advertisement matching method based on natural semantics
CN104750798A (en) * 2015-03-19 2015-07-01 腾讯科技(深圳)有限公司 Application program recommendation method and device
CN108170794A (en) * 2017-12-27 2018-06-15 杭州网易云音乐科技有限公司 Information recommendation method and device, storage medium and electronic equipment

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9152948B2 (en) * 2012-02-20 2015-10-06 Yahoo! Inc. Method and system for providing a structured topic drift for a displayed set of user comments on an article
CN105989107A (en) * 2015-02-12 2016-10-05 广东欧珀移动通信有限公司 Application recommendation method and device
CN104715049B (en) * 2015-03-26 2017-11-28 无锡中科泛在信息技术研发中心有限公司 Comment on commodity attribute word abstracting method based on body dictionary
CN104809177A (en) * 2015-04-14 2015-07-29 华信弘道(北京)科技有限公司 Webpage commenting and recommending methods and systems based on client
CN105389329B (en) * 2015-09-21 2019-02-12 中国人民解放军国防科学技术大学 A kind of open source software recommended method based on community review
CN106815297B (en) * 2016-12-09 2020-04-10 宁波大学 Academic resource recommendation service system and method
CN106777139A (en) * 2016-12-19 2017-05-31 浙江工业大学 User based on reading time reads the personalized push method of preference statistics
CN106952129A (en) * 2017-02-23 2017-07-14 广东小天才科技有限公司 Method and device is recommended in a kind of application in application shop
CN107577759B (en) * 2017-09-01 2021-07-30 安徽广播电视大学 Automatic recommendation method for user comments
CN107609960A (en) * 2017-10-18 2018-01-19 口碑(上海)信息技术有限公司 Rationale for the recommendation generation method and device
CN108320176A (en) * 2017-12-26 2018-07-24 爱品克科技(武汉)股份有限公司 One kind is classified based on socialization relational users and recommendation method

Also Published As

Publication number Publication date
CN109189892A (en) 2019-01-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant