CN106021526B - News category method and device - Google Patents
News category method and device
- Publication number: CN106021526B
- Application number: CN201610352644.6A
- Authority: CN (China)
- Prior art keywords: keyword, press release, score, preliminary classification, determined
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
Abstract
The application proposes a news classification method and device. The method comprises: receiving a press release; determining the matching degree between the press release and each preset information template, where each information template corresponds to one news category; determining, according to the matching degrees, the preliminary classification to which the press release belongs; determining, according to a preset algorithm, the score of each keyword in the press release; and determining, according to the score of each keyword, the dimension of the press release within the preliminary classification. Press releases are thereby classified automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
Description
Technical field
This application relates to the technical field of information processing, and more particularly to a news classification method and device.
Background art
At present, news reading products usually organize and arrange news by the field to which the news content belongs: a first-level classification is made (for example hot topics, domestic, international), sub-classification is then carried out within each category, and the news is finally published by category.
Currently, this process of classifying news for publication is usually carried out manually. This not only wastes manpower; the classification results are also affected by personal subjective judgment, so they are not accurate enough.
Summary of the invention
The application is intended to solve at least some of the technical problems in the related art.
To this end, the first purpose of the application is to propose a news classification method. The method classifies press releases automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
The second purpose of the application is to propose a news classification device.
To achieve the above purposes, an embodiment of the first aspect of the application proposes a news classification method, comprising:
receiving a press release;
determining the matching degree between the press release and each preset information template, where each information template corresponds to one news category;
determining, according to the matching degrees, the preliminary classification to which the press release belongs;
determining, according to a preset algorithm, the score of each keyword in the press release;
determining, according to the score of each keyword, the dimension of the press release within the preliminary classification, where each dimension in the preliminary classification corresponds to N keywords, and N is a positive integer greater than or equal to 1.
In the news classification method of the embodiment of the application, after a press release is received, the matching degree between the press release and each preset information template is determined first; the preliminary classification to which the press release belongs is determined according to the matching degrees; the score of each keyword in the press release is then determined according to a preset algorithm; and the dimension of the press release within the preliminary classification is determined according to the score of each keyword. Press releases are thereby classified automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
To achieve the above purposes, an embodiment of the second aspect of the application proposes a news classification device, comprising:
a receiving module, configured to receive a press release; a first determining module, configured to determine the matching degree between the press release and each preset information template, where each information template corresponds to one news category; a second determining module, configured to determine, according to the matching degrees, the preliminary classification to which the press release belongs; a computing module, configured to determine, according to a preset algorithm, the score of each keyword in the press release; and a third determining module, configured to determine, according to the score of each keyword, the dimension of the press release within the preliminary classification, where each dimension in the preliminary classification corresponds to N keywords, and N is a positive integer greater than or equal to 1.
In the news classification device of the embodiment of the application, after a press release is received, the matching degree between the press release and each preset information template is determined first; the preliminary classification to which the press release belongs is determined according to the matching degrees; the score of each keyword in the press release is then determined according to a preset algorithm; and the dimension of the press release within the preliminary classification is determined according to the score of each keyword. Press releases are thereby classified automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of the news classification method of one embodiment of the application;
Fig. 2 is a flow chart of the news classification method of another embodiment of the application;
Fig. 3 is a structural schematic diagram of the news classification device of one embodiment of the application;
Fig. 4 is a structural schematic diagram of the news classification device of another embodiment of the application.
Detailed description of the embodiments
Embodiments of the application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, where the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the accompanying drawings are exemplary; they are intended to explain the application and should not be construed as limiting it.
The news classification method and device of the embodiments of the application are described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of the news classification method of one embodiment of the application.
As shown in Fig. 1, the news classification method includes:
Step 101, receive a press release.
Specifically, the news classification method provided by the embodiments of the application is executed by a news classification device.
Step 102, determine the matching degree between the press release and each preset information template, where each information template corresponds to one news category.
A variety of information templates can be stored in the news classification device in advance, each corresponding to one news category.
For example, the information templates for military news may include: military affairs-weapons, military affairs-military situation, military affairs-military history, military affairs-current events, and so on.
After a press release is received, it can be matched against the preset information templates to determine the matching degree between the press release and each information template.
Specifically, the matching degree between the press release and an information template can be determined according to the number of words in the press release that are identical to words in the information template.
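The word-overlap idea in step 102 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text only says the matching degree depends on the number of identical words, so normalizing by the template size (to obtain values between 0 and 1, like the 0.9 and 0.88 used in the examples) is an assumption, and the tokenizer, the function names, and the template word lists are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words (a simple stand-in tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matching_degree(release_text, template_words):
    """Share of template words that also occur in the press release.

    Dividing by the template size is an assumption: the source only says the
    degree depends on the number of identical words.
    """
    template = set(template_words)
    if not template:
        return 0.0
    shared = set(tokenize(release_text)) & template
    return len(shared) / len(template)


# Hypothetical templates; the word lists are illustrative only.
templates = {
    "military affairs-weapons": {"missile", "carrier", "fighter", "tank"},
    "military affairs-military history": {"battle", "war", "history", "carrier"},
}

release = "Fighter jets and a missile test on the aircraft carrier"
degrees = {name: matching_degree(release, words) for name, words in templates.items()}
```

Here three of the four weapons-template words occur in the sample text, so that template scores higher than the military-history template, which shares only one word.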
Step 103, determine, according to the matching degrees, the preliminary classification to which the press release belongs.
In general, the news category corresponding to the information template that has the highest matching degree with the press release is taken as the preliminary classification of the press release.
For example, if the matching degree of a press release is 0.9 with military affairs-weapons, 0.88 with military affairs-military history, 0.5 with military affairs-military situation, and 0.7 with military affairs-current events, the preliminary classification of the press release can be determined to be military affairs-weapons.
It should be noted that a matching degree threshold can also be set: when the matching degree between the press release and an information template is greater than the set threshold, the press release is considered to belong to the preliminary classification corresponding to that template. The threshold can be set and adjusted according to the type of text. For example, if the matching degree threshold is set to 0.8, the press release is considered to belong to the news categories corresponding to both the military affairs-military history and military affairs-weapons templates; that is, the press release can be assigned under both information templates.
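The two selection rules described in step 103 (highest matching degree, or every template above a threshold) can be sketched as below. The matching degrees and the 0.8 threshold come from the example in the text; the function names are hypothetical.

```python
def best_category(degrees):
    """Return the category whose template has the highest matching degree."""
    return max(degrees, key=degrees.get)


def categories_above(degrees, threshold):
    """Return every category whose matching degree is greater than the threshold."""
    return sorted(name for name, degree in degrees.items() if degree > threshold)


# Matching degrees taken from the example in the text.
degrees = {
    "military affairs-weapons": 0.9,
    "military affairs-military history": 0.88,
    "military affairs-military situation": 0.5,
    "military affairs-current events": 0.7,
}

top = best_category(degrees)
selected = categories_above(degrees, 0.8)
```

With these values the first rule yields a single category, while the threshold rule yields two, which matches the case where one press release is assigned to both categories.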
Step 104, determine, according to a preset algorithm, the score of each keyword in the press release.
The keywords in the press release can be obtained using any keyword extraction method; alternatively, words that occur in the title and body text of the press release can be taken as keywords, or words whose number of occurrences in the press release exceeds a preset value can be taken as keywords. This embodiment does not limit this.
Specifically, the score of each keyword can be determined using the following formula:
$s = a \times t_1 + b \times t_2 + c \times t_3$
where $s$ is the score of the keyword; $a$, $b$, $c$ are proportionality constants; $t_1$ is the number of times the keyword occurs in the title; $t_2$ is the number of times the keyword occurs in the body; and $t_3$ is the number of times that words similar to the keyword, obtained according to the preliminary classification, occur in the press release.
The sum of $a$, $b$, and $c$ is 1. For example, when the score of each keyword in a press release is calculated, $a$ can be 0.5, $b$ can be 0.3, and $c$ can be 0.2.
It should be noted that the values of the proportionality constants $a$, $b$, $c$ can change dynamically: the constants can take different values for different keywords.
For example, suppose the received press release reads as follows:
[March 8, International Women's Day special issue] Flying warplanes, firing guided missiles, boarding the aircraft carrier, entering the jungle: who says a woman is not as good as a man
This is a collective that has a strong fighting spirit yet has not lost the delicacy peculiar to women. The professions they are engaged in are no longer confined to the medical and service fields as in the past; instead they serve at nearly all battle stations of the Liaoning warship, such as steering, electromechanics, damage control, supervision, and radar, injecting more vigor into a Chinese Navy that is moving toward the deep blue.
This glorious collective of more than 90 female soldiers, the Navy's Liaoning warship female crew member team, has since its establishment excellently completed major tasks such as all previous tests and trial voyages of the Liaoning warship and the boarding and take-off of carrier-based fighter planes.
After keyword extraction, the determined keywords include: female soldier, Liaoning warship, fighter plane, guided missile, aircraft carrier, warplane.
The keyword "female soldier" does not occur in the title and occurs once in the body. Of the words similar to "female soldier", "woman" occurs once in the title, and "women" and "female crew member" each occur once in the body. According to the formula above, the score of the keyword "female soldier" is therefore:
$s = a \times 0 + b \times 1 + c \times 3$
The scores of the other keywords can be determined in the same way.
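As a worked check of the formula, the sketch below plugs in the counts from the example (0 occurrences in the title, 1 in the body, 3 occurrences of similar words). The text does not say which weights apply to this keyword, so using the illustrative values 0.5, 0.3, and 0.2 is an assumption, and the function name is hypothetical.

```python
def keyword_score(t1, t2, t3, a=0.5, b=0.3, c=0.2):
    """s = a*t1 + b*t2 + c*t3, with a + b + c expected to be 1."""
    if abs((a + b + c) - 1.0) > 1e-9:
        raise ValueError("a, b and c should sum to 1")
    return a * t1 + b * t2 + c * t3


# Counts from the example: 0 in title, 1 in body, 3 similar-word occurrences.
score = keyword_score(t1=0, t2=1, t3=3)
print(round(score, 2))  # 0.9
```

Because the weights may differ per keyword, they are passed as parameters rather than fixed inside the function.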
It should be noted that a near-synonym dictionary can also be stored in the news classification device. After the keywords are obtained, the words similar to each keyword can be obtained by querying the dictionary, and the number of times those similar words occur in the press release can then be determined.
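The dictionary lookup described above can be sketched as a function that produces the three counts used by the scoring formula. This is an illustration under stated assumptions: the dictionary is modeled as a plain mapping from a keyword to its similar words, counting uses naive case-insensitive substring matching, and the sample dictionary entry and text are hypothetical.

```python
def count_occurrences(text, phrase):
    """Count non-overlapping, case-insensitive occurrences of phrase in text."""
    return text.lower().count(phrase.lower())


def keyword_counts(keyword, title, body, near_synonyms):
    """Return (t1, t2, t3) for one keyword.

    near_synonyms maps a keyword to its similar words; t3 counts those
    similar words across both the title and the body.
    """
    t1 = count_occurrences(title, keyword)
    t2 = count_occurrences(body, keyword)
    t3 = sum(
        count_occurrences(title, word) + count_occurrences(body, word)
        for word in near_synonyms.get(keyword, [])
    )
    return t1, t2, t3


# Hypothetical dictionary entry and text, for illustration only.
near_synonyms = {"warship": ["frigate", "destroyer"]}
title = "New destroyer joins the fleet"
body = "The warship sailed with a frigate and another destroyer."

counts = keyword_counts("warship", title, body, near_synonyms)
```

Substring counting would also match a word inside a longer word, so a real system would more likely count over tokenized text.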
Step 105, determine, according to the score of each keyword, the dimension of the press release within the preliminary classification, where each dimension in the preliminary classification corresponds to N keywords, and N is a positive integer greater than or equal to 1.
Specifically, in order to classify press releases accurately, each news category can be further divided into different dimensions according to keywords, so that a press release can be classified more precisely within its preliminary classification.
In practice, when the dimension of the press release within the preliminary classification is determined according to the score of each keyword, the dimension to which each keyword belongs can be determined in turn, in descending order of keyword score, and the dimension to which the press release belongs can then be determined.
Specifically, step 105 includes:
1051: determine the keyword sorted list of the press release according to the score of each keyword.
In this embodiment, if there are n keywords in total, the scores of all n keywords can be calculated as described above, and the keywords are sorted by score from high to low.
1052: choose the top N keywords from the keyword sorted list.
It can be understood that, to improve classification accuracy, a subset of the keywords can be chosen rather than all of them, where 1≤N≤n/2 and N is an integer.
For example, if there are 5 keywords, the 1 or 2 highest-scoring keywords can be chosen from the keyword sorted list for the subsequent operations.
1053: determine the dimension of the press release within the preliminary classification according to the top N keywords.
It should be noted that the top N highest-scoring keywords can be chosen as the basis for determining the dimension of the press release, as described above; alternatively, the dimension can be determined according to all keywords, which makes the determined dimension more precise but places higher demands on the data-processing capacity and speed of the news classification device.
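Sub-steps 1051 and 1052 can be sketched as a sort followed by a slice. The scores below are hypothetical, and two choices are assumptions rather than something the text specifies: N defaults to the upper end of the stated range, and the upper bound is raised to 1 when there is only a single keyword (where n/2 would otherwise fall below 1).

```python
def top_keywords(scores, n_top=None):
    """Sort keywords by score (high to low) and keep the first N.

    If n_top is omitted, N defaults to n // 2 (at least 1), the upper end of
    the 1 <= N <= n/2 range given in the text.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    n = len(ranked)
    if n == 0:
        return []
    upper = max(1, n // 2)
    if n_top is None:
        n_top = upper
    if not 1 <= n_top <= upper:
        raise ValueError(f"N must be between 1 and {upper}")
    return [keyword for keyword, _ in ranked[:n_top]]


# Hypothetical scores for five keywords.
scores = {"warship": 1.9, "missile": 0.8, "carrier": 1.1, "radar": 0.3, "navy": 0.5}
chosen = top_keywords(scores)
```

With five keywords the permitted values of N are 1 and 2, which matches the example of choosing the first 1 or 2 keywords.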
For example, suppose that after matching against the preset information templates, the preliminary classification of the above press release is determined to be "military affairs-weapons", and that, using the approach above, the highest-scoring keyword chosen for the press release is "Liaoning warship". The preliminary classification "military affairs-weapons" includes 8 dimensions, each defined by 1 keyword, among them "fighter plane", "warship", "rifle", "guided missile", "tank", "submarine", and "nuclear weapon". By querying the near-synonym dictionary in the news classification device, it is determined that "Liaoning warship" is similar to "warship", or that "Liaoning warship", once replaced by its hypernym, falls under "warship". The specific classification of the press release can therefore be determined to be "military affairs-weapons-warship", which achieves a precise classification of the press release.
It should be noted that if each dimension under a preset preliminary classification corresponds to 1 keyword, and the approach above selects 2 or more keywords with identical scores for a press release, the press release can be assigned to the corresponding dimensions at the same time.
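The mapping from selected keywords to dimensions can be sketched as below. The dimension names and the "hydrogen bomb" to "nuclear weapon" substitution come from the text; the dictionary contents, the function name, and the decision to return every matched dimension in keyword order are assumptions made for illustration.

```python
def dimensions_for(keywords, dimension_keywords, near_synonyms, hypernyms):
    """Map each selected keyword to a dimension of the preliminary classification.

    A keyword matches a dimension if it equals the dimension's keyword, is a
    near-synonym of it, or has it as a hypernym. Several keywords may map to
    several dimensions, in which case all of them are returned.
    """
    found = []
    for keyword in keywords:
        candidates = {keyword}
        candidates.update(near_synonyms.get(keyword, []))
        if keyword in hypernyms:
            candidates.add(hypernyms[keyword])
        for dimension in dimension_keywords:
            if dimension in candidates and dimension not in found:
                found.append(dimension)
    return found


# Dimensions listed in the text for "military affairs-weapons".
dimension_keywords = ["fighter plane", "warship", "rifle", "guided missile",
                      "tank", "submarine", "nuclear weapon"]
# Hypothetical dictionary contents.
near_synonyms = {"Liaoning warship": ["warship"]}
hypernyms = {"hydrogen bomb": "nuclear weapon"}

single = dimensions_for(["Liaoning warship"], dimension_keywords,
                        near_synonyms, hypernyms)
tied = dimensions_for(["Liaoning warship", "hydrogen bomb"], dimension_keywords,
                      near_synonyms, hypernyms)
```

The second call models the tie case: two equally scored keywords lead to two dimensions, and the press release is assigned to both.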
In the news classification method of the embodiment of the application, after a press release is received, the matching degree between the press release and each preset information template is determined first; the preliminary classification to which the press release belongs is determined according to the matching degrees; the score of each keyword in the press release is then determined according to a preset algorithm; and the dimension of the press release within the preliminary classification is determined according to the score of each keyword. Press releases are thereby classified automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
From the above analysis, after receiving a press release, the news classification device can determine the preliminary classification of the press release according to the matching degree between the press release and the preset information templates. Accordingly, the information template corresponding to each preliminary classification needs to be stored in the news classification device in advance; alternatively, the information templates can be obtained by the news classification device through model training on all press releases in the news library. That is, the method further includes:
carrying out model training on the press release library, and determining the information template corresponding to each preliminary classification.
For example, a support vector machine (Support Vector Machine, SVM) algorithm can be used to carry out model training on the press release library, so as to determine the information template corresponding to each preliminary classification.
It can be understood from the foregoing embodiments that the preliminary classification corresponding to an information template is defined by two features. For example, the "military affairs-weapons" classification is defined by the two features "military affairs" and "weapons". Therefore, the SVM algorithm can first be used to carry out a first-level classification of the news in the press release library, for example dividing the press releases into "current events", "entertainment", "real estate", "economy", "military affairs", and so on. The SVM algorithm is then used again to carry out a second-level classification within each first-level category, so that, for example, "military affairs" is finally divided into military affairs-weapons, military affairs-military situation, military affairs-military history, and military affairs-current events, with each second-level classification corresponding to one information template. The preliminary classification of a press release can thus be determined directly from the matching degree between the press release and each information template.
It should be noted that, after the information templates have been determined from the press release library, the news classification device can also continue model training on newly received news, so as to supplement and refine the determined information templates, making the preliminary classification determined from the information templates increasingly accurate.
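The two-level structure described above (one classifier for the first level, then one classifier per first-level category) can be sketched as follows. To keep the example self-contained, a simple word-count profile classifier stands in for the SVM; it is not an SVM and is only a placeholder for whatever trained model is used. The tiny library, the "finance" label, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_overlap_classifier(samples):
    """Stand-in for SVM training: learn a word-count profile per label.

    samples is a list of (text, label) pairs. The returned function picks the
    label whose profile shares the most word occurrences with the input text.
    """
    profiles = defaultdict(Counter)
    for text, label in samples:
        profiles[label].update(text.lower().split())

    def classify(text):
        words = text.lower().split()
        return max(profiles, key=lambda label: sum(profiles[label][w] for w in words))

    return classify


def train_two_level(library):
    """library: list of (text, first_level, second_level) triples."""
    first = train_overlap_classifier([(t, l1) for t, l1, _ in library])
    by_first = defaultdict(list)
    for text, l1, l2 in library:
        by_first[l1].append((text, l2))
    second = {l1: train_overlap_classifier(rows) for l1, rows in by_first.items()}

    def classify(text):
        l1 = first(text)
        return f"{l1}-{second[l1](text)}"

    return classify


# Tiny hypothetical library, for illustration only.
library = [
    ("new missile and tank unveiled", "military affairs", "weapons"),
    ("the famous battle of last century", "military affairs", "military history"),
    ("stock market and bank rates rise", "economy", "finance"),
]
classify = train_two_level(library)
label = classify("army tests a new missile")
```

The point of the sketch is the routing: the second-level classifier is chosen by the first-level result, so each second-level label only competes with its siblings.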
Further, in the above embodiment, when the score of each keyword is determined, the dictionary in the news classification device can be used to determine the words similar to each keyword. To further improve the precision of news classification, duplicate keywords can also be removed when the dimension of the press release is determined according to the keyword scores. The news classification method provided by the application is further described below with reference to Fig. 2.
Fig. 2 is a flow chart of the news classification method of another embodiment of the application.
As shown in Fig. 2, the news classification method may include the following steps:
Step 201, carry out model training on the press release library, and determine the information template corresponding to each preliminary classification.
Step 202, receive a press release.
Step 203, determine the matching degree between the press release and each preset information template, where each information template corresponds to one news category.
Step 204, determine the preliminary classification of the press release according to the news category corresponding to the information template that has the highest matching degree with the press release.
Step 205, obtain each keyword in the press release.
Specifically, the keywords in the press release can be obtained using an existing keyword extraction method; words that occur in the title and body text can also be chosen as keywords; or words whose number of occurrences in the press release reaches a preset value can be chosen as keywords. This embodiment does not limit this.
Step 206, query a preset dictionary, and determine the near-synonyms and/or substitutes of each keyword.
Specifically, the preset dictionary can be generated by the news classification device itself through training on the press release library, or can be determined according to user input.
The preset dictionary may include the near-synonyms and substitutes of various words. A substitute can refer to the hypernym of a word. For example, the word "hydrogen bomb" can be replaced with its substitute "nuclear weapon".
Step 207, determine the score of each keyword using the preset algorithm.
Step 208, determine the keyword sorted list of the press release according to the score of each keyword.
Step 209, choose the top N highest-scoring keywords from the keyword sorted list.
N can be a fixed value, for example 1, 3, 5, 6, or 8, or can be determined according to the actual scenario.
For example, only the first 2 or 3 keywords in the keyword sorted list are chosen at first. If the dimension of the press release can be determined accurately from the first 2 keywords alone, only the first 2 keywords need be selected; if the dimension determined from the first 3 keywords is not unique, further keywords can be selected to revise or correct the previously determined dimension, so that the dimension of the press release is finally determined.
Step 210, determine the dimension of the press release within the preliminary classification according to the N keywords.
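The adaptive choice of N in step 209 (start with the first 2 keywords and add more while the dimension is not unique) can be sketched as below. The text does not say how "not unique" is resolved, so the majority-vote rule used here is an assumption, as are the function name and the sample keyword-to-dimension mapping.

```python
def resolve_dimension(ranked_keywords, keyword_to_dimension, start=2):
    """Grow the keyword window until one dimension clearly leads.

    ranked_keywords is sorted by score, high to low. keyword_to_dimension maps
    a keyword to its dimension (keywords without a dimension are skipped).
    Returns (dimension, number_of_keywords_used); dimension is None if the
    tie never breaks.
    """
    for n_used in range(start, len(ranked_keywords) + 1):
        votes = {}
        for keyword in ranked_keywords[:n_used]:
            dimension = keyword_to_dimension.get(keyword)
            if dimension is not None:
                votes[dimension] = votes.get(dimension, 0) + 1
        if votes:
            best = max(votes.values())
            leaders = [d for d, v in votes.items() if v == best]
            if len(leaders) == 1:
                return leaders[0], n_used
    return None, len(ranked_keywords)


# Hypothetical ranking and mapping, for illustration only.
ranked = ["Liaoning warship", "guided missile", "aircraft carrier", "radar"]
mapping = {
    "Liaoning warship": "warship",
    "guided missile": "guided missile",
    "aircraft carrier": "warship",
}
result = resolve_dimension(ranked, mapping)
```

In this example the first 2 keywords point to two different dimensions, so a third keyword is added and breaks the tie.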
In the news classification method of the embodiment of the application, a press release is received first, and the matching degree between the press release and each preset information template is then determined. The preliminary classification of the press release is determined according to the news category corresponding to the information template that has the highest matching degree with the press release. Keywords are then chosen from the press release, and the near-synonyms and/or substitutes of each keyword are determined by querying the preset dictionary. The score of each keyword in the press release is determined according to the preset algorithm, the keyword sorted list of the press release is determined according to the scores, and after the top N keywords are chosen from the sorted list, the dimension of the press release is determined according to those top N keywords. Press releases are thereby classified automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
To implement the above embodiments, the application also proposes a news classification device.
Fig. 3 is a structural schematic diagram of the news classification device of one embodiment of the application.
As shown in Fig. 3, the news classification device includes:
a receiving module 31, configured to receive a press release;
a first determining module 32, configured to determine the matching degree between the press release and each preset information template, where each information template corresponds to one news category;
a second determining module 33, configured to determine, according to the matching degrees, the preliminary classification to which the press release belongs;
a computing module 34, configured to determine, according to a preset algorithm, the score of each keyword in the press release;
a third determining module 35, configured to determine, according to the score of each keyword, the dimension of the press release within the preliminary classification, where each dimension in the preliminary classification corresponds to N keywords, and N is a positive integer greater than or equal to 1.
The news classification device provided in this embodiment is used to execute the news classification method provided by the above embodiments.
Specifically, the computing module 34 is configured to:
determine the score of each keyword using $s = a \times t_1 + b \times t_2 + c \times t_3$,
where $s$ is the score of the keyword; $a$, $b$, $c$ are proportionality constants; $t_1$ is the number of times the keyword occurs in the title; $t_2$ is the number of times the keyword occurs in the body; and $t_3$ is the number of times that words similar to the keyword, obtained according to the preliminary classification, occur in the press release.
In one embodiment, the third determining module 35 is configured to:
determine the keyword sorted list of the press release according to the score of each keyword;
choose the top N highest-scoring keywords from the keyword sorted list;
determine the dimension of the press release within the preliminary classification according to the N keywords.
It should be noted that the foregoing explanation of the news classification method embodiments also applies to the news classification device of this embodiment and is not repeated here.
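The module structure of Fig. 3 can be pictured as a small pipeline object. The sketch below is one hypothetical way to wire the five modules together as plain callables; the class name, field names, and the stub lambdas in the usage example are all assumptions and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class NewsClassificationDevice:
    """Wires together the modules of Fig. 3 as plain callables (hypothetical design)."""
    match: Callable[[str], Dict[str, float]]                      # first determining module 32
    pick_category: Callable[[Dict[str, float]], str]              # second determining module 33
    score_keywords: Callable[[str, str], Dict[str, float]]        # computing module 34
    pick_dimension: Callable[[Dict[str, float], str], List[str]]  # third determining module 35

    def receive(self, release: str) -> Dict[str, object]:         # receiving module 31
        degrees = self.match(release)
        category = self.pick_category(degrees)
        scores = self.score_keywords(release, category)
        return {"category": category,
                "dimensions": self.pick_dimension(scores, category)}


# Stub modules with fixed outputs, for illustration only.
device = NewsClassificationDevice(
    match=lambda release: {"military affairs-weapons": 0.9,
                           "military affairs-current events": 0.7},
    pick_category=lambda degrees: max(degrees, key=degrees.get),
    score_keywords=lambda release, category: {"warship": 1.9, "radar": 0.3},
    pick_dimension=lambda scores, category: [max(scores, key=scores.get)],
)
result = device.receive("sample press release text")
```

Passing the modules in as callables keeps each one replaceable, which mirrors how the description treats the matching, scoring, and dimension steps as separate units.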
In the news classification device of the embodiment of the application, after a press release is received, the matching degree between the press release and each preset information template is determined first; the preliminary classification to which the press release belongs is determined according to the matching degrees; the score of each keyword in the press release is then determined according to a preset algorithm; and the dimension of the press release within the preliminary classification is determined according to the score of each keyword. Press releases are thereby classified automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
Fig. 4 is a structural schematic diagram of the news classification device of another embodiment of the application.
As shown in Fig. 4, on the basis of Fig. 3, the news classification device further includes:
a query module 41, configured to query a preset dictionary and determine the near-synonyms and/or substitutes of each keyword.
Further, from the above analysis, after receiving a press release, the news classification device can determine the preliminary classification of the press release according to the matching degree between the press release and the preset information templates. Accordingly, the information template corresponding to each preliminary classification needs to be stored in the news classification device in advance; alternatively, the information templates can be obtained by the news classification device through model training on all press releases in the news library. The device therefore further includes:
a training module 42, configured to carry out model training on the press release library and determine the information template corresponding to each preliminary classification.
It should be noted that the foregoing explanation of the news classification method embodiments also applies to the news classification device of this embodiment and is not repeated here.
In the news classification device of the embodiment of the application, a press release is received first, and the matching degree between the press release and each preset information template is then determined. The preliminary classification of the press release is determined according to the news category corresponding to the information template that has the highest matching degree with the press release. Keywords are then chosen from the press release, and the near-synonyms and/or substitutes of each keyword are determined by querying the preset dictionary. The score of each keyword in the press release is determined according to the preset algorithm, the keyword sorted list of the press release is determined according to the scores, and after the top N keywords are chosen from the sorted list, the dimension of the press release is determined according to those top N keywords. Press releases are thereby classified automatically, which improves classification efficiency; and because the classification results are not affected by personal subjective judgment, they are more accurate.
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the application. In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features.
It should be understood that each part of the application can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented with software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies well known in the art can be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.
Those skilled in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and when executed, the program includes one of the steps of the method embodiments or a combination thereof.
The storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the application have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the application; those skilled in the art can change, modify, replace, and vary the above embodiments within the scope of the application.
Claims (6)
1. A news classification method, characterized by comprising the following steps:
performing model training on a press release library using a support vector machine to determine the information template corresponding to each preliminary classification;
receiving a press release;
determining each matching degree between the press release and each preset information template according to the number of words in the press release that are identical to words in that information template, wherein each information template corresponds to one news category;
determining, according to each matching degree, the preliminary classification to which the press release belongs;
determining, according to a preset algorithm, the score of each keyword in the press release;
determining, according to the score of each keyword, the dimension of the press release within the preliminary classification, wherein each dimension in the preliminary classification corresponds to N keywords, N being a positive integer greater than or equal to 1;
wherein determining the score of each keyword in the press release according to the preset algorithm comprises:
determining the score of each keyword using s = a × t1 + b × t2 + c × t3;
wherein s is the score of the keyword; a, b, and c are proportionality constants; t1 is the number of times the keyword occurs in the title; t2 is the number of times the keyword occurs in the body; and t3 is the number of times that words similar to the keyword, obtained according to the preliminary classification, occur in the press release.
2. The method according to claim 1, characterized in that, before determining the score of each keyword in the press release according to the preset algorithm, the method further comprises:
querying a preset dictionary to determine the near-synonyms and/or substitutes of each keyword.
3. The method according to claim 1, characterized in that determining the dimension of the press release within the preliminary classification according to the score of each keyword comprises:
determining a sorted keyword list for the press release according to the score of each keyword;
selecting the N highest-scoring keywords from the sorted keyword list;
determining, according to the N keywords, the dimension of the press release within the preliminary classification.
4. A news classification device, characterized by comprising:
a training module, configured to perform model training on a press release library using a support vector machine to determine the information template corresponding to each preliminary classification;
a receiving module, configured to receive a press release;
a first determining module, configured to determine each matching degree between the press release and each preset information template according to the number of words in the press release that are identical to words in that information template, wherein each information template corresponds to one news category;
a second determining module, configured to determine, according to each matching degree, the preliminary classification to which the press release belongs;
a computing module, configured to determine, according to a preset algorithm, the score of each keyword in the press release;
a third determining module, configured to determine, according to the score of each keyword, the dimension of the press release within the preliminary classification, wherein each dimension in the preliminary classification corresponds to N keywords, N being a positive integer greater than or equal to 1;
wherein the computing module is specifically configured to:
determine the score of each keyword using s = a × t1 + b × t2 + c × t3;
wherein s is the score of the keyword; a, b, and c are proportionality constants; t1 is the number of times the keyword occurs in the title; t2 is the number of times the keyword occurs in the body; and t3 is the number of times that words similar to the keyword, obtained according to the preliminary classification, occur in the press release.
5. The device according to claim 4, characterized by further comprising:
a query module, configured to query a preset dictionary to determine the near-synonyms and/or substitutes of each keyword.
6. The device according to claim 4, characterized in that the third determining module is specifically configured to:
determine a sorted keyword list for the press release according to the score of each keyword;
select the N highest-scoring keywords from the sorted keyword list;
determine, according to the N keywords, the dimension of the press release within the preliminary classification.
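The matching-degree and keyword-score steps recited in the claims can be sketched roughly as below. This is a hedged illustration rather than the claimed implementation: the function names, the example templates, the weights a = 3.0, b = 1.0, c = 0.5, and the choice to count distinct shared words are all assumptions, and the support-vector-machine training that would produce the templates is not shown.

```python
from typing import Dict, List


def matching_degree(article_words: List[str], template_words: List[str]) -> int:
    """Number of distinct words shared by the article and a template (assumed counting rule)."""
    return len(set(article_words) & set(template_words))


def preliminary_classification(article_words: List[str],
                               templates: Dict[str, List[str]]) -> str:
    """Return the category whose template has the highest matching degree."""
    return max(sorted(templates),
               key=lambda cat: matching_degree(article_words, templates[cat]))


def keyword_score(keyword: str, title_words: List[str], body_words: List[str],
                  similar_words: List[str],
                  a: float = 3.0, b: float = 1.0, c: float = 0.5) -> float:
    """Compute s = a*t1 + b*t2 + c*t3 for one keyword.

    The weights a, b, c are hypothetical; the claims only call them
    proportionality constants.
    """
    t1 = title_words.count(keyword)            # occurrences in the title
    t2 = body_words.count(keyword)             # occurrences in the body
    article = title_words + body_words
    t3 = sum(article.count(w) for w in similar_words)  # occurrences of similar words
    return a * t1 + b * t2 + c * t3


# Hypothetical tokenized article and templates.
title = ["football", "final"]
body = ["the", "football", "league", "final", "soccer", "match", "soccer"]
templates = {
    "sports": ["football", "league", "match", "team"],
    "finance": ["stock", "market", "bank"],
}

category = preliminary_classification(title + body, templates)
score = keyword_score("football", title, body, similar_words=["soccer"])
```

Here the similar words for a keyword are passed in directly; in the claimed method they would be obtained according to the preliminary classification, for example from a preset dictionary of near-synonyms.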
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610352644.6A CN106021526B (en) | 2016-05-25 | 2016-05-25 | News category method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106021526A CN106021526A (en) | 2016-10-12 |
CN106021526B true CN106021526B (en) | 2019-09-27 |
Family
ID=57093745
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610352644.6A Active CN106021526B (en) | 2016-05-25 | 2016-05-25 | News category method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106021526B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111209390B (en) * | 2020-01-06 | 2023-09-05 | 新方正控股发展有限责任公司 | News display method and system and computer readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103218432A (en) * | 2013-04-15 | 2013-07-24 | 北京邮电大学 | Named entity recognition-based news search result similarity calculation method |
CN103530334A (en) * | 2013-09-29 | 2014-01-22 | 方正国际软件有限公司 | System and method for data matching based on comparison module |
CN103870474A (en) * | 2012-12-11 | 2014-06-18 | 北京百度网讯科技有限公司 | News topic organizing method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9448992B2 (en) * | 2013-06-04 | 2016-09-20 | Google Inc. | Natural language search results for intent queries |
Non-Patent Citations (1)
Title |
---|
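Research on text-content-based extraction and classification of agricultural web page information (基于文本内容的农业网页信息抽取和分类研究); Zhu Xuefang (朱学芳); Information Science (《情报科学》); 2012-07-31; Vol. 30, No. 7; pp. 1012-1015 * |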
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||