CN109241402A - Method for automatically importing virtual comments based on news content
- Publication number
- CN109241402A (application CN201810858862.6A)
- Authority
- CN
- China
- Prior art keywords
- comment
- news
- keyword
- data
- virtual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a method for automatically importing virtual comments based on news content. The method comprises the following steps: an administrator or editor manually enters comments on a Web page of the information system, and the comments are saved to a virtual comment lexicon; and/or a big data system deployed in the back end of the information system crawls comments on trending news and local government affairs news in real time and saves them to the virtual comment lexicon; the comment data is analyzed with a comment association algorithm, the analyzed data is stored in a MySQL database, and the MySQL data is synchronized in real time to a Redis database, where it is stored in hash form; the virtual comment data is imported into related news items. The scheme targets comments on new media news in the radio, film and television industry, requires no manual intervention, and ensures to the greatest extent the timeliness, authenticity and objectivity of the lexicon.
Description
Technical field
The present invention relates to the field of news information processing, and in particular to a method for automatically importing virtual comments based on news content.
Background technique
In the era of Internet Plus and converged media, and especially for broadcast new media news content, traditional mechanical comments can no longer meet users' needs. Users want to see comments that are timely, closely tied to the news content, and objectively authentic.
Existing traditional mechanical comments have the following shortcomings:
The comment lexicon is insufficient. Constrained by technology, traditional products generally store the comment lexicon in a relational database (such as MySQL or Oracle). Either all comments are stored in a single table, so that data access becomes slow once the data reaches a certain volume; or they are stored in a main table plus multiple category tables, in which case join queries are still time-consuming when the data volume is large.
The comment lexicon content is monotonous and lacks objectivity and timeliness. In traditional products, comment data is usually added to the lexicon manually, which has drawbacks: people are highly subjective and the sentences are monotonous, typically far-fetched comments such as "This news is very good" or "This news is worth recommending". Because the comments are added manually, they are only written after a news item has appeared and someone has read it, and they are then reused elsewhere.
News and virtual comments are associated manually. Virtual comments and news are stored in two separate places in the information system, and one or more comments are usually selected manually from the lexicon and associated with a news item. As a result, news comments depend too heavily on manual work.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a method for automatically importing virtual comments based on news content. The method targets comments on new media news in the radio, film and television industry, requires no manual intervention, and ensures to the greatest extent the timeliness, authenticity and objectivity of the lexicon.
The object of the present invention is achieved through the following technical solution:
A method for automatically importing virtual comments based on news content, comprising the following steps:
S1: on a Web page of the information system, an administrator or editor manually enters comments, which are saved to a virtual comment lexicon; and/or a big data system deployed in the back end of the information system crawls comments on trending news and local government affairs news in real time and saves them to the virtual comment lexicon;
S2: the comment data is analyzed with a comment association algorithm, the analyzed data is stored in a MySQL database, and the MySQL data is synchronized in real time to a Redis database, where it is stored in hash form;
S3: the virtual comment data is imported into related news items.
Further, the sub-steps of step S2 are as follows:
S01: set the web pages from which data is crawled;
S02: parse the news list and the news detail pages from the web pages;
S03: find the comment URL of the news item in its detail page;
S04: for each news item, crawl the headline, keywords, content and comments;
S05: process the crawled data and record the association between the news keywords and the comments.
Further, the comment data is analyzed as follows: N keywords are set for each news item and a weight W is assigned to each keyword such that W1 + W2 + ... + WN = 1; the keywords are combined, and the comment list is stored under the keyword combination;
Assuming there are n comments, the total number of comments associated across all keywords is greater than or equal to n.
Further, the specific steps of step S4 are as follows:
S11: after writing a news article, the editor sets the keywords of the news item and the keyword weights;
S12: when the editor saves the manuscript, the news back-end system performs similarity matching between the manuscript keywords and the keywords set in the lexicon;
S13: the matching process is as follows:
A similarity coefficient f is set. Assuming one keyword of the editor's manuscript is X, a keyword in the virtual lexicon is Y, and the matching result is Y1, their relationship is:
Y1 = 0 ;(f=0)
Y1 = X*f ;(0<f<100%)
Y1 = Y ;(f=100%).
Further, the similarity coefficient f is set manually.
The beneficial effects of the present invention are as follows. A conventional relational database, MySQL (used for the physical records and convenient for interfacing with other systems), is combined with a non-relational database, Redis (used as a cache for real-time reads). The virtual comment system classifies the lexicon by topic (usually trending news and current political news) and also by region (for example Shenzhen or Guangxi). For current political news, trending news and the like, big data crawler technology collects comment data from major websites, trending news and local government affairs news sites, and the comment association analysis algorithm associates manuscript keywords with their comments. In rare cases, manual review and reordering are required. The analyzed data is stored in the MySQL database and synchronized in real time to the Redis database; Redis can store large volumes of data and offers fast reads.
Using crawler technology together with the comment association algorithm, the latest comments from major websites can be collected quickly, ensuring to the greatest extent the timeliness, authenticity and objectivity of the lexicon.
Using the weighted association algorithm, no manual intervention is needed: editors are only responsible for writing manuscripts, and after a manuscript is saved, some comments are automatically imported into the news item. Editors only need to review and modify them.
Brief description of the drawings
Fig. 1 is a flowchart of the present invention.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to specific embodiments, but the scope of protection of the present invention is not limited to the following.
As shown in Fig. 1, the main steps of the method for automatically importing virtual comments based on news content are as follows:
S1: on a Web page of the information system, an administrator or editor manually enters comments, which are saved to a virtual comment lexicon; and/or a big data system deployed in the back end of the information system crawls comments on trending news and local government affairs news in real time and saves them to the virtual comment lexicon;
S2: the comment data is analyzed with a comment association algorithm, the analyzed data is stored in a MySQL database, and the MySQL data is synchronized in real time to a Redis database, where it is stored in hash form;
S3: the virtual comment data is imported into related news items.
The purpose of step S1 is to obtain virtual comments. The sources of virtual comments include:
Manual entry: the information system Web page lets an administrator or editor enter comments manually, and the comments are saved to the database, as shown in the following figure.
Crawling news and its comments with big data crawler technology: a big data system deployed in the back end of the information system crawls comments on trending news and local government affairs news in real time. The crawl steps are as follows:
1> set the web pages from which data is crawled, for example People's Net and Guangxi News Network;
2> parse the news list and the news detail pages from the web pages;
3> find the comment URL of the news item in its detail page;
4> for each news item, crawl the related data such as the headline, keywords, content and comments;
5> process the crawled data and record the association between the news keywords and the comments.
// Insert the crawl flowchart and part of the core code here.
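The five crawl steps above can be sketched roughly as follows. This is a minimal illustration and not the patent's core code: the page contents, the CSS class names `news` and `comments`, and the helper names `PageParser`, `parse`, `hrefs` and `crawl` are all hypothetical, and inline HTML strings stand in for pages that a real crawler would fetch over the network.

```python
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects <a> links by CSS class and the <meta name="keywords"> content."""

    def __init__(self):
        super().__init__()
        self.links = []     # (css_class, href) pairs
        self.keywords = []  # keywords declared by the page, if any

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a" and "href" in attr:
            self.links.append((attr.get("class") or "", attr["href"]))
        elif tag == "meta" and attr.get("name") == "keywords":
            content = attr.get("content") or ""
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]


def parse(html):
    parser = PageParser()
    parser.feed(html)
    return parser


def hrefs(page, css_class):
    return [href for cls, href in page.links if cls == css_class]


# Step 1: hypothetical site content keyed by URL (stands in for HTTP fetches).
SITE = {
    "/list.html": '<a class="news" href="/news/1.html">Flood relief</a>'
                  '<a class="nav" href="/about.html">About</a>',
    "/news/1.html": '<meta name="keywords" content="Guangxi, flood">'
                    '<a class="comments" href="/comments/1.html">Comments</a>',
}


def crawl(site, list_url):
    records = []
    list_page = parse(site[list_url])
    for news_url in hrefs(list_page, "news"):     # step 2: list -> detail pages
        detail = parse(site[news_url])
        comment_urls = hrefs(detail, "comments")  # step 3: locate the comment URL
        records.append({                          # steps 4 and 5: one record per news item
            "news_url": news_url,
            "keywords": detail.keywords,
            "comment_url": comment_urls[0] if comment_urls else None,
        })
    return records
```

Calling `crawl(SITE, "/list.html")` yields one record per news link on the list page; a production crawler would additionally fetch the comment URL and store the comment text with the record.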
Step S2 implements the analysis and storage of virtual comments. The specific steps are as follows:
The comment data is analyzed with the comment association algorithm: the content of each news item is analyzed, keywords are set for it, a weight is assigned to each keyword, the keywords are combined, and the comment list is stored under the keyword combination. For example, suppose a news item has 3 keywords with weights W=30% for keyword 1, W=50% for keyword 2 and W=20% for keyword 3. Suppose the news item has n comments; these n comments are analyzed and associated with the keywords of the news item. One possible result is that keyword 1 is associated with n/m comments, keyword 2 is associated with n/p comments, and keyword 3 is associated with none, in which case n/m + n/p >= n.
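The association step can be illustrated with a small sketch. The patent does not specify how a comment is tied to a keyword, so this example assumes plain substring matching, and it assumes that a comment matching no keyword falls back to the highest-weight keyword so that every comment is associated at least once. The function name `associate_comments` and the sample data are hypothetical.

```python
def associate_comments(keyword_weights, comments):
    """Map each keyword to the comments associated with it.

    keyword_weights: dict of keyword -> weight, with weights summing to 1.
    comments: list of comment strings for one news item.
    """
    if abs(sum(keyword_weights.values()) - 1.0) > 1e-9:
        raise ValueError("keyword weights must sum to 1")

    index = {kw: [] for kw in keyword_weights}
    # Assumption: unmatched comments go to the highest-weight keyword.
    fallback = max(keyword_weights, key=keyword_weights.get)

    for comment in comments:
        hits = [kw for kw in keyword_weights if kw in comment]
        for kw in hits or [fallback]:
            index[kw].append(comment)
    return index


weights = {"Guangxi": 0.3, "flood": 0.5, "rescue": 0.2}
comments = [
    "The flood response in Guangxi was fast",
    "Hoping the flood recedes soon",
    "Well reported",
]
index = associate_comments(weights, comments)
total = sum(len(v) for v in index.values())
```

With this sample data the first comment is associated with two keywords and the keyword "rescue" with none, so the total number of associations (4) is at least the number of comments (3), which mirrors the inequality in the example above.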
The data is stored using a conventional relational database together with a newer non-relational database: the data analyzed above is stored in the MySQL database, and the MySQL data is synchronized in real time to the Redis database, where it is stored in hash form.
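The two-tier storage described above might look like the following sketch. To keep the example self-contained, the standard-library `sqlite3` module stands in for MySQL and a plain nested dict stands in for a Redis hash (one hash key, one field per keyword); neither substitution comes from the patent. The table name `keyword_comment`, the hash key `virtual_comments` and the function `save_and_sync` are hypothetical.

```python
import json
import sqlite3


def save_and_sync(conn, cache, associations):
    """Persist keyword -> comments rows, then mirror them into a hash-like cache."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS keyword_comment (keyword TEXT, comment TEXT)"
    )
    rows = [(kw, c) for kw, cs in associations.items() for c in cs]
    conn.executemany("INSERT INTO keyword_comment VALUES (?, ?)", rows)
    conn.commit()

    # Mirror the relational data into the cache: field = keyword,
    # value = JSON-encoded list of comments (the stand-in for a Redis hash).
    bucket = cache.setdefault("virtual_comments", {})
    keywords = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT keyword FROM keyword_comment"
        ).fetchall()
    ]
    for keyword in keywords:
        stored = conn.execute(
            "SELECT comment FROM keyword_comment WHERE keyword = ? ORDER BY rowid",
            (keyword,),
        ).fetchall()
        bucket[keyword] = json.dumps([c for (c,) in stored])


conn = sqlite3.connect(":memory:")
cache = {}
save_and_sync(conn, cache, {"flood": ["a", "b"], "Guangxi": ["a"]})
```

The relational table remains the system of record, while reads for a given keyword only touch the cache; with a real Redis deployment the dict update would be replaced by a hash write through the client library in use.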
Step S3 imports virtual comment data into related news items. The steps are as follows:
After writing a news article, the editor sets the keywords of the news item and the keyword weights; by default, the program can set the weights according to the order of the keywords.
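The patent does not say how default weights are derived from keyword order. One possible scheme, shown here purely as an assumption, is a linear decay by rank that is normalized so the weights sum to 1; the function name `default_weights` is hypothetical, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def default_weights(keywords):
    """Assign rank-based weights: earlier keywords weigh more, total is 1."""
    n = len(keywords)
    if n == 0:
        return {}
    total = n * (n + 1) // 2  # 1 + 2 + ... + n
    return {kw: Fraction(n - i, total) for i, kw in enumerate(keywords)}


weights = default_weights(["Guangxi", "flood", "rescue"])
```

For three keywords this gives 1/2, 1/3 and 1/6 in order, and an editor could still override any value as long as the weights continue to sum to 1.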
When the editor saves the manuscript, the news back-end system performs similarity matching between the manuscript keywords and the keywords set in the lexicon.
Matching process. A similarity coefficient f is set here, and its value can be set manually. Assuming one keyword of the editor's manuscript is X, a keyword in the virtual lexicon is Y, and the matching result is Y1, their relationship is:
Y1 = 0 ;(f=0)
Y1 = X*f ;(0<f<100%)
Y1 = Y ;(f=100%)
Explanation: for one input value X, zero or more Y1 values can be obtained, and adjusting the coefficient f changes the number of Y1 values. For example, when f=0, the degree of association between X and Y is 0, that is, no comment in the lexicon can serve as a comment on the news item; when f=100%, X and Y match exactly, and all comments under Y in the lexicon can serve as comments on the news item; when f=40%, assuming X is "Guangxi" and Y = [Guangxi, Guangxi TV Station, Guangxi Network TV Station], "Guangxi Network TV Station" will not be matched.
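The text gives only the boundary behavior of f, not a similarity formula. The sketch below is one interpretation that reproduces the f=40% example: it assumes similarity is the share of Y's characters covered by X when X occurs in Y, and that a lexicon keyword matches when this share is at least f. The function names are hypothetical, and the Chinese strings are an assumed reconstruction of the terms "Guangxi", "Guangxi TV Station" and "Guangxi Network TV Station" (2, 5 and 7 characters respectively).

```python
def similarity(x, y):
    """Assumed measure: fraction of y covered by x when x is a substring of y."""
    if not x or x not in y:
        return 0.0
    return len(x) / len(y)


def match_keywords(x, lexicon_keywords, f):
    """Return the lexicon keywords that match manuscript keyword x at coefficient f."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if f == 0.0:
        return []  # f = 0: nothing in the lexicon is treated as related
    return [y for y in lexicon_keywords if similarity(x, y) >= f]


# Assumed Chinese forms of the three lexicon keywords in the example.
LEXICON = ["广西", "广西电视台", "广西网络电视台"]

matched_40 = match_keywords("广西", LEXICON, 0.4)
matched_100 = match_keywords("广西", LEXICON, 1.0)
matched_0 = match_keywords("广西", LEXICON, 0.0)
```

Under these assumptions, raising f narrows the set of matched lexicon keywords, and the comments stored under the matched keywords are the candidates to import into the news item.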
The above is only a preferred embodiment of the present invention. It should be understood that the present invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments; it may be used in various other combinations, modifications and environments, and may be modified within the scope contemplated herein through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the scope of protection of the appended claims.
Claims (5)
1. A method for automatically importing virtual comments based on news content, characterized in that the method comprises the following steps:
S1: on a Web page of the information system, an administrator or editor manually enters comments, which are saved to a virtual comment lexicon; and/or a big data system deployed in the back end of the information system crawls comments on trending news and local government affairs news in real time and saves them to the virtual comment lexicon;
S2: the comment data is analyzed with a comment association algorithm, the analyzed data is stored in a MySQL database, and the MySQL data is synchronized in real time to a Redis database, where it is stored in hash form;
S3: the virtual comment data is imported into related news items.
2. The method for automatically importing virtual comments based on news content according to claim 1, characterized in that the sub-steps of step S2 are as follows:
S01: set the web pages from which data is crawled;
S02: parse the news list and the news detail pages from the web pages;
S03: find the comment URL of the news item in its detail page;
S04: for each news item, crawl the headline, keywords, content and comments;
S05: process the crawled data and record the association between the news keywords and the comments.
3. The method for automatically importing virtual comments based on news content according to claim 2, characterized in that the comment data is analyzed as follows: N keywords are set for each news item and a weight W is assigned to each keyword such that W1 + W2 + ... + WN = 1; the keywords are combined, and the comment list is stored under the keyword combination;
Assuming there are n comments, the total number of comments associated across all keywords is greater than or equal to n.
4. The method for automatically importing virtual comments based on news content according to claim 3, characterized in that the specific steps of step S4 are as follows:
S11: after writing a news article, the editor sets the keywords of the news item and the keyword weights;
S12: when the editor saves the manuscript, the news back-end system performs similarity matching between the manuscript keywords and the keywords set in the lexicon;
S13: the matching process is as follows:
A similarity coefficient f is set. Assuming one keyword of the editor's manuscript is X, a keyword in the virtual lexicon is Y, and the matching result is Y1, their relationship is:
Y1 = 0 ;(f=0)
Y1 = X*f ;(0<f<100%)
Y1 = Y ;(f=100%).
5. The method for automatically importing virtual comments based on news content according to claim 1, characterized in that the similarity coefficient f is set manually.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810858862.6A CN109241402A (en) | 2018-07-31 | 2018-07-31 | A kind of virtual comment machine introduction method based on news content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810858862.6A CN109241402A (en) | 2018-07-31 | 2018-07-31 | A kind of virtual comment machine introduction method based on news content |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109241402A true CN109241402A (en) | 2019-01-18 |
Family
ID=65073370
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810858862.6A Pending CN109241402A (en) | 2018-07-31 | 2018-07-31 | A kind of virtual comment machine introduction method based on news content |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109241402A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102087648A (en) * | 2009-12-03 | 2011-06-08 | 北京大学 | Method and system for fetching news comment page |
US20120215798A1 (en) * | 2011-02-18 | 2012-08-23 | International Business Machines Corporation | System and Method for a Centralized URL Commenting Service Enabling Metadata Aggregation |
CN102279894A (en) * | 2011-09-19 | 2011-12-14 | 嘉兴亿言堂信息科技有限公司 | Method for searching, integrating and providing comment information based on semantics and searching system |
CN103034722A (en) * | 2012-12-13 | 2013-04-10 | 合一网络技术(北京)有限公司 | Network video comment gathering device and network video comment gathering method |
CN106202563A (en) * | 2016-08-02 | 2016-12-07 | 西南石油大学 | A kind of real time correlation evental news recommends method and system |
CN106951409A (en) * | 2017-03-17 | 2017-07-14 | 黄淮学院 | A kind of network social intercourse media viewpoint tendency analysis system and method |
CN107220352A (en) * | 2017-05-31 | 2017-09-29 | 北京百度网讯科技有限公司 | The method and apparatus that comment collection of illustrative plates is built based on artificial intelligence |
CN108153723A (en) * | 2017-12-27 | 2018-06-12 | 北京百度网讯科技有限公司 | Hot spot information comment generation method, device and terminal device |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109885770A (en) * | 2019-02-20 | 2019-06-14 | 杭州威佩网络科技有限公司 | A kind of information recommendation method, device, electronic equipment and storage medium |
CN109885770B (en) * | 2019-02-20 | 2022-01-07 | 杭州威佩网络科技有限公司 | Information recommendation method and device, electronic equipment and storage medium |
CN116306514A (en) * | 2023-05-22 | 2023-06-23 | 北京搜狐新媒体信息技术有限公司 | Text processing method and device, electronic equipment and storage medium |
CN116306514B (en) * | 2023-05-22 | 2023-09-08 | 北京搜狐新媒体信息技术有限公司 | Text processing method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101174273B (en) | News event detecting method based on metadata analysis | |
CN108154395B (en) | Big data-based customer network behavior portrait method | |
JP4489994B2 (en) | Topic extraction apparatus, method, program, and recording medium for recording the program | |
Jäschke et al. | Tag recommendations in folksonomies | |
US9430568B2 (en) | Method and system for querying information | |
CN100462969C (en) | Method for providing and inquiry information for public by interconnection network | |
CN103246644B (en) | Method and device for processing Internet public opinion information | |
US9959326B2 (en) | Annotating schema elements based on associating data instances with knowledge base entities | |
CN103020159A (en) | Method and device for news presentation facing events | |
JP2005525657A (en) | Managing expressions in database systems | |
CN106383887A (en) | Environment-friendly news data acquisition and recommendation display method and system | |
CN105718585B (en) | Document and label word justice correlating method and its device | |
CN113297457B (en) | High-precision intelligent information resource pushing system and pushing method | |
CN111125297B (en) | Massive offline text real-time recommendation method based on search engine | |
CN109241402A (en) | A kind of virtual comment machine introduction method based on news content | |
CN106776640A (en) | A kind of stock information information displaying method and device | |
US20160246794A1 (en) | Method for entity-driven alerts based on disambiguated features | |
KR102413961B1 (en) | Method for providing news analysis service using robotic process automation monitoring | |
Singhal et al. | DataGopher: Context-based search for research datasets | |
CN104216901B (en) | The method and system of information search | |
Savyanavar et al. | Multi-document summarization using TF-IDF Algorithm | |
Saravanan et al. | Extraction of Core Web Content from Web Pages using Noise Elimination. | |
CN114706948A (en) | News processing method and device, storage medium and electronic equipment | |
CN112434126B (en) | Information processing method, device, equipment and storage medium | |
CN108733687A (en) | A kind of information retrieval method and system based on Text region |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190118 |