CN105893467A - Information classification method and apparatus - Google Patents

Info

Publication number
CN105893467A
Authority
CN
China
Prior art keywords
information
determined
internet
predetermined
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610183721.XA
Other languages
Chinese (zh)
Inventor
李涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kylin Hesheng Network Technology Co Ltd
Original Assignee
Beijing Kylin Hesheng Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kylin Hesheng Network Technology Co Ltd
Priority to CN201610183721.XA
Publication of CN105893467A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation

Abstract

The present application discloses an information classification method, so as to solve the problem of low information classification efficiency caused by a method for classifying information topics by means of manual screening in the prior art. The method comprises: determining a search keyword related to a tag of a pre-determined information topic; according to a category of the pre-determined information topic and the determined search keyword, determining information, which corresponds to the determined search keyword and belongs to the category, from information on the Internet; according to a feature of the determined information from the Internet, determining information with the feature from a to-be-screened information collection and using the determined information as target information, wherein the to-be-screened information collection is formed by information that belongs to the category; and categorizing the determined information with the feature into the pre-determined information topic. The present application further discloses an information classification apparatus.

Description

Information classification method and apparatus
Technical Field
The present application relates to the field of Internet technology, and in particular to an information classification method and apparatus.
Background
The world has entered the Internet era. Users can publish information on the Internet through carriers such as text, pictures, and video, which greatly facilitates the spread of information.
The Internet is flooded with a large amount of information every day. When obtaining information through the Internet, users often pay attention only to the information that interests them. That information usually belongs to a particular category, for example news.
In view of this user behaviour, Internet information publishers often classify information before displaying it, in order to give users a better reading experience. Information can be classified according to its target user group. For example, some users like to follow news, some like to follow technology information, and some like to follow financial information, so a publisher can divide information into categories such as news, technology, and finance. Information in the same category can be further grouped by relevance: information related to the same subject is grouped together to form an information topic. For example, news items in the news category that relate to the same event can be grouped into one topic, which can be called a news topic.
In the prior art, classifying information by information topic requires staff to screen the information set manually. For example, to select the news for a certain topic from 1,000 news items stored in the database of website A, an editor of website A has to judge from personal experience which items belong to the topic, and then select the relevant items from those 1,000 news items as the news of that topic.
At present, dividing information into topics by manual screening results in low classification efficiency.
Summary of the Invention
The embodiments of the present application provide an information classification method to solve the prior-art problem that dividing information into topics by manual screening results in low classification efficiency.
The embodiments of the present application also provide an information classification apparatus to solve the same problem.
The embodiments of the present application adopt the following technical solutions:
An information classification method, comprising:
determining a search keyword related to a label of a predetermined information topic;
according to the category to which the predetermined information topic belongs and the determined search keyword, determining, from information on the Internet, information that corresponds to the determined search keyword and belongs to the category;
according to a feature of the information determined from the Internet, determining, from an information set to be screened, information that has the feature as target information, the information set to be screened consisting of information belonging to the category;
classifying the determined information that has the feature into the predetermined information topic.
An information classification apparatus, comprising:
a search keyword determining unit, configured to determine a search keyword related to a label of a predetermined information topic;
an Internet information determining unit, configured to determine, from information on the Internet, information that corresponds to the determined search keyword and belongs to the category, according to the category to which the predetermined information topic belongs and the determined search keyword;
a target information determining unit, configured to determine, from an information set to be screened, information that has a feature of the information determined from the Internet as target information, the information set to be screened consisting of information belonging to the category;
a classifying unit, configured to classify the determined target information that has the feature into the predetermined information topic.
At least one of the above technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects:
A search keyword related to the label of a predetermined information topic is determined, and the information belonging to the predetermined information topic is determined according to the category to which the topic belongs, the determined search keyword, and information on the Internet. Compared with the prior-art method of dividing information into topics by manual screening, this improves classification efficiency.
Brief Description of the Drawings
The drawings described here are provided for further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of an implementation of the information classification method provided in Embodiment 1 of the present application;
Fig. 2 is a flowchart of one implementation of the information classification method provided in Embodiment 2 of the present application;
Fig. 3 is a structural diagram of the information classification apparatus provided in Embodiment 3 of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments of the application and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
The technical solutions provided by the embodiments of the application are described in detail below with reference to the drawings.
Embodiment 1
To solve the prior-art problem that dividing information into topics by manual screening results in low classification efficiency, Embodiment 1 of the present application provides an information classification method. The method may be executed by a server, for example the server of an information website, the server corresponding to an information client, the server of a news website, or the server corresponding to a news client.
For ease of description, the method is described below taking the server corresponding to an information client as the executing entity. This choice is only an illustrative example and should not be construed as limiting the method.
The implementation flow of the method is shown in Fig. 1 and comprises the following steps:
Step 11: determine a search keyword related to the label of a predetermined information topic;
In the embodiments of the present application, the information may be carried by text, audio, video, and the like. The information may comprise an information title and information details. The information title may be text that summarises the information; the information details comprise the carriers that record the information, such as text, audio, and video.
In the embodiments of the present application, an information topic is obtained by grouping information related to the same subject; the information contained in an information topic is information related to that subject.
In practice, when related news reports on the same news event are grouped into one news topic, the predetermined information topic may be a news topic.
In the embodiments of the present application, the label of an information topic may be a keyword that summarises the subject of the information contained in the topic. For example, for a news topic that gives special coverage of the "Malaysia Airlines 370 aviation accident", the labels may be "Ma Hang" (the Chinese short name of Malaysia Airlines), "370", and "air crash".
In the embodiments of the present application, a search keyword related to a label may be a keyword whose meaning is the same as or close to that of the label, a keyword whose meaning includes that of the label, or a keyword that contains words from the label. For example, a search keyword related to the label "Ma Hang" may be "Malaysia's flight".
In the embodiments of the present application, a search keyword is a character or word entered by a user to retrieve information, also called a search term.
In the embodiments of the present application, when a predetermined information topic is to be built, the label of the topic may be determined in advance, and the search keyword related to that label is then determined from it.
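Step 11 can be illustrated with a minimal Python sketch. The source does not say how related keywords are found, so the lookup table `RELATED_TERMS` and the function `search_keywords` are hypothetical names introduced only for illustration; joining the labels into one phrase is likewise an assumption.

```python
from typing import Dict, List

# Hypothetical table of related terms; the source does not say how they are obtained.
RELATED_TERMS: Dict[str, List[str]] = {
    "Ma Hang": ["Malaysia's flight"],
}


def search_keywords(labels: List[str], related: Dict[str, List[str]]) -> List[str]:
    """Build candidate search keywords from topic labels.

    Candidates are the labels joined into one phrase, each label on its own,
    and any related terms listed for a label. Order is kept, duplicates dropped.
    """
    candidates = [" ".join(labels)]
    for label in labels:
        candidates.append(label)
        candidates.extend(related.get(label, []))
    unique: List[str] = []
    for keyword in candidates:
        if keyword not in unique:
            unique.append(keyword)
    return unique


keywords = search_keywords(["Ma Hang", "370", "air crash"], RELATED_TERMS)
```

In a real system the related terms would more plausibly come from a synonym dictionary or query logs; the fixed table only keeps the sketch self-contained.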
Step 12: according to the category to which the predetermined information topic belongs and the determined search keyword, determine, from information on the Internet, information that corresponds to the determined search keyword and belongs to the category;
In practice, to display information by category to users with different reading preferences, information can be classified according to its target user group. For example, some users like to follow news, some like to follow technology information, and some like to follow financial information, so information can be divided into categories such as news, technology, and finance. The predetermined information topic and the news it contains may then belong to the same category.
In the embodiments of the present application, when the target information to be classified into the predetermined information topic is news, the category may be the news category.
In the embodiments of the present application, the target information corresponding to the search keyword can be determined by means of the determined search keyword related to the label of the predetermined information topic.
However, if the determined search keyword is used directly to select target information from the information to be screened, the selected target information may include items with low relevance to the search keyword, for example items that contain the search keyword but whose subject has little or nothing to do with it. If such items are taken as target information and displayed under the predetermined information topic, the user experience is likely to suffer.
In practice, a search engine's ranking algorithm orders the pages in its search results according to signals such as the relevance of the page's information to the search keyword, how often the page is clicked in search results, when the page content was published, and the quality of the site's content. Therefore, after the search keyword related to the label of the predetermined information topic has been determined, a search engine can be used to obtain from the Internet pages that are highly relevant to the keyword. Using the information titles of the information in those pages to determine the target information then yields target information that is highly relevant to the search keyword. This avoids the problem, caused by selecting target information directly from the information to be screened, that the selected target information may include items with low relevance to the search keyword.
Specifically, first, candidate information that corresponds to the determined search keyword and belongs to the category can be determined from information on the Internet according to the category and the keyword. For example, a search engine can be used to retrieve all pages in its index that contain the search keyword. From those pages, the pages containing information that belongs to the category are then obtained, and the information contained in these further-obtained pages is taken as the candidate information.
For example, when the category is news, the pages whose information is categorised as news are obtained from all the retrieved pages, and the information contained in those pages is the candidate information.
Second, because the number of further-obtained pages containing candidate information is usually large, these pages can be screened to obtain the pages that are highly relevant to the search keyword.
In practice, the pages highly relevant to the determined search keyword can be determined according to the search engine ranking of the further-obtained pages and a preset candidate information screening condition. Because the further-obtained pages contain the candidate information, this amounts to determining, from the candidate information, the information that corresponds to the determined search keyword and belongs to the category.
In the embodiments of the present application, the preset candidate information screening condition may be that the search engine ranking of a further-obtained page is higher than a preset search engine ranking threshold.
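The two-stage screening of step 12 (filter by category, then by search engine ranking) might look like the following sketch. The `SearchResult` record and the `select_candidates` function are hypothetical; the sketch also assumes that each page carries a category and that `rank` is the page's position in the search results, with 1 as the best position.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SearchResult:
    title: str      # information title of the page
    category: str   # assumed to be known for every page, e.g. "news"
    rank: int       # position in the search results, 1 is the best


def select_candidates(results: List[SearchResult], category: str,
                      max_rank: int) -> List[SearchResult]:
    """Keep pages of the given category, then keep those ranked within max_rank."""
    in_category = [r for r in results if r.category == category]
    return [r for r in in_category if r.rank <= max_rank]


results = [
    SearchResult("A", "news", 1),
    SearchResult("B", "finance", 2),
    SearchResult("C", "news", 3),
    SearchResult("D", "news", 9),
]
selected = select_candidates(results, "news", max_rank=6)
```

Here "ranking higher than the threshold" is read as a rank number no greater than `max_rank`; the source leaves open whether the ranking is taken over all results or only over the pages in the category.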
Step 13: according to a feature of the information determined from the Internet, determine, from the information set to be screened, information that has the feature as target information; the information set to be screened consists of information belonging to the category;
In the embodiments of the present application, after the information that corresponds to the determined search keyword and belongs to the category has been determined from the Internet, a feature of that information can be obtained so that information having the feature can subsequently be selected from the information set to be screened as target information. The feature may be, for example, the information title; when the category is news, the feature is the news title.
In the embodiments of the present application, the information set to be screened may be a set of previously stored information, which may be stored in a database. The information set to be screened consists of information belonging to the category; it may be, for example, the set of news items stored in a database.
In practice, to make information easier to manage, it can be stored according to a certain convention, for example by storing the information title and the information details in different fields of a database table.
In the embodiments of the present application, according to the information title of the information determined from the Internet, the database storing the information set to be screened can be searched for information titles that match it. The information in the database whose title matches is taken as the target information.
In practice, when the category is news, the news items having a given news title can be determined from the information set to be screened according to the news titles of the determined news.
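The title lookup of step 13 can be sketched with Python's standard `sqlite3` module. The table name `news`, its columns, and the helper names `make_store` and `find_targets` are assumptions made for illustration, and "matching" is simplified to exact string equality, which the source does not require.

```python
import sqlite3
from typing import Iterable, List, Tuple


def make_store(rows: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory store with title and details in separate fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, details TEXT)"
    )
    conn.executemany("INSERT INTO news (title, details) VALUES (?, ?)", rows)
    return conn


def find_targets(conn: sqlite3.Connection, titles: List[str]) -> List[Tuple[int, str]]:
    """Return (id, title) of stored items whose title equals one of the given titles."""
    if not titles:
        return []
    placeholders = ", ".join("?" for _ in titles)
    sql = f"SELECT id, title FROM news WHERE title IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, titles).fetchall()


conn = make_store([("Title A", "..."), ("Title B", "..."), ("Title C", "...")])
targets = find_targets(conn, ["Title C", "Title A", "Not stored"])
```

Reposted news often has slightly edited titles, so a practical system would probably replace the equality test with a similarity measure; that choice is outside what the source specifies.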
Step 14: classify the determined information that has the feature into the predetermined information topic.
In the embodiments of the present application, after information having the feature has been selected from the information set to be screened as target information according to the feature of the information determined from the Internet, the selected information can be classified into the predetermined information topic.
The amount of information in the information set to be screened may grow over time, and the newly added information may include items whose subject is the same as or similar to that of the predetermined information topic. Target information can therefore be determined from the information set to be screened at a predetermined time interval and classified into the predetermined information topic, so that such newly added items are added to the topic and supplement the information it already contains.
In the embodiments of the present application, after the target information has been determined in this way, the target information in the predetermined information topic can also be displayed according to a predetermined target information display rule.
Because the target information stored in the database may have been reposted or collected from the Internet, the information determined from the Internet and the target information may have the same or similar content. In practice, therefore, the two can be assumed to have the same degree of attention. Thus, when the target information is information whose attention is rising quickly, the target information having the feature can be displayed according to the attention index of the information determined from the Internet and the predetermined target information display rule.
In practice, the target information display rule may be that, when the items of target information are displayed as a list, they are arranged in order from the top to the bottom of the information display page. For example, the items can be displayed from top to bottom in descending order of their attention index.
In the embodiments of the present application, the attention index of the information determined from the Internet is determined from at least one of the following data: the search index of the determined search keyword corresponding to that information; and the search engine ranking of the page corresponding to that information among the pages corresponding to the candidate information. For example, the search indexes of the determined search keywords can be ranked, and the attention index of the information can then be determined from that ranking.
In practice, the attention index can also be obtained by weighting these two values: the search index of the determined search keyword corresponding to the information, and the search engine ranking of the page corresponding to the information among the pages corresponding to the candidate information.
The specific weighting algorithm is not limited here. For example, the weighting can follow these rules: the higher the search index of the determined search keyword corresponding to the information, the higher the attention index of the information; and the higher the search engine ranking of the page corresponding to the information among the pages corresponding to the candidate information, the higher the attention index.
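Since the source leaves the weighting algorithm open, the following is only one possible sketch that satisfies the two monotonic rules above. The normalisation, the default weights of 0.5, and the names `attention_index` and `display_order` are all assumptions.

```python
from typing import List, Tuple


def attention_index(search_index: float, max_search_index: float,
                    rank: int, total_pages: int,
                    w_index: float = 0.5, w_rank: float = 0.5) -> float:
    """Weighted sum of a normalised search index and a normalised rank score.

    A higher search index or a better rank (smaller number) gives a higher result.
    """
    if total_pages <= 0 or not 1 <= rank <= total_pages:
        raise ValueError("rank must lie between 1 and total_pages")
    index_score = search_index / max_search_index if max_search_index > 0 else 0.0
    rank_score = (total_pages - rank + 1) / total_pages  # rank 1 maps to 1.0
    return w_index * index_score + w_rank * rank_score


def display_order(items: List[Tuple[str, float]]) -> List[str]:
    """Return titles sorted from highest to lowest attention index."""
    return [title for title, _ in sorted(items, key=lambda it: it[1], reverse=True)]


scored = [
    ("Title A", attention_index(40, 100, rank=3, total_pages=10)),
    ("Title B", attention_index(90, 100, rank=1, total_pages=10)),
]
ordered = display_order(scored)
```

The resulting list corresponds to the top-to-bottom order on the information display page described above.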
The information classification method provided in Embodiment 1 of the present application determines a search keyword related to the label of a predetermined information topic, and determines the information belonging to the predetermined information topic according to the category to which the topic belongs, the determined search keyword, and information on the Internet. Compared with the prior-art method of dividing information into topics by manual screening, this improves classification efficiency.
Embodiment 2
Embodiment 2 of the present application mainly describes one practical application of the method provided in Embodiment 1.
The information classification process in Embodiment 2 is similar to that in Embodiment 1. For steps not described in Embodiment 2, refer to the corresponding description in Embodiment 1; they are not repeated here.
Before the implementation of this scheme is described in detail, its implementation scenario is briefly introduced.
In this scenario, the server obtains a large number of news items from the Internet by reposting or collection and stores them in a database. When a news item is stored, its news title and its news details are stored in different fields of the database.
A news topic for the "Malaysia Airlines 370 accident" event is now to be built; that is, the news related to the event is to be identified among the news in the database and classified into the news topic for the event.
Based on this scenario, the information classification process in Embodiment 2 is shown in Fig. 2 and comprises the following steps:
Step 21: determine that the labels of the "Malaysia Airlines 370 accident" news topic to be built are "Ma Hang" (the Chinese short name of Malaysia Airlines), "370", and "accident";
Step 22: according to the determined labels, determine that the search keyword related to the labels is "Ma Hang 370 air crash";
Step 23: according to the determined search keyword, use a search engine to obtain the search results for that keyword;
Step 24: from the search results, determine the news pages that belong to the news category and rank in the top 6 of the search engine ranking;
Step 25: obtain the news titles of the news pages determined in step 24;
Step 26: according to the obtained news titles, determine the news items in the database whose news titles match them;
Step 27: classify the news determined in step 26 into the "Malaysia Airlines 370 accident" news topic;
Step 28: at a predetermined time interval, obtain from the information set to be screened the news related to the "Malaysia Airlines 370 accident" event, and classify the obtained news into the "Malaysia Airlines 370 accident" news topic.
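Steps 22 to 27 can be strung together in a short Python sketch. Everything here is illustrative: `build_topic` is a hypothetical function, the search engine is passed in as a stub callable that returns (title, category) pairs in ranking order, the stored news is reduced to a list of titles, the search keyword is assumed to be the labels joined by spaces, and title matching is assumed to be exact.

```python
from typing import Callable, List, Tuple

Page = Tuple[str, str]  # (news title, category), listed in search-engine order


def build_topic(labels: List[str],
                search: Callable[[str], List[Page]],
                stored_titles: List[str],
                category: str = "news",
                top_n: int = 6) -> List[str]:
    """Return the stored titles that belong in the topic described by the labels."""
    keyword = " ".join(labels)                    # step 22 (assumed construction)
    results = search(keyword)                     # step 23
    wanted = {
        title
        for rank, (title, cat) in enumerate(results, start=1)
        if cat == category and rank <= top_n      # steps 24 and 25
    }
    return [t for t in stored_titles if t in wanted]  # steps 26 and 27


def fake_search(keyword: str) -> List[Page]:
    """Stub standing in for a real search engine; returns fixed results."""
    return [("A", "news"), ("B", "blog"), ("C", "news")]


topic = build_topic(["Ma Hang", "370", "accident"], fake_search, ["C", "Z", "A"])
```

Calling `build_topic` again at a fixed interval against the current contents of the database would correspond to step 28.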
The news category method that the embodiment of the present application 2 provides, is determined by and the label of predetermined Special Topics in Journalism Relevant search key, and according to the classification belonging to predetermined Special Topics in Journalism, the search key determined The news belonging to described predetermined Special Topics in Journalism is determined with the news in the Internet, logical relative in prior art Cross and manually carry out manual screening to the method dividing Special Topics in Journalism, improve news category efficiency.
Embodiment 3
To solve the prior-art problem that dividing information topics by manual screening makes information classification inefficient, Embodiment 3 of the present application provides an information classification device. A structural diagram of the device is shown in Figure 3; it mainly includes the following functional units:
Search keyword determining unit 31, configured to determine search keywords related to the label of a predetermined information topic;
Internet information determining unit 32, configured to determine, from information on the Internet, information that corresponds to the determined search keywords and belongs to the category of the predetermined information topic, according to that category and the determined search keywords;
Target information determining unit 33, configured to determine, from an information set to be screened, information that has a feature of the determined Internet information as target information, where the information set to be screened consists of information belonging to the category;
Classification unit 34, configured to classify the determined target information having the feature into the predetermined information topic.
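The way the four units feed into one another can be sketched as a small pipeline. This is only an illustrative sketch: the record layout (`title`, `category`), the `related` keyword table, the injected `search` callable, and the use of the title as the shared feature are assumptions for the example and are not specified in this form by the application.

```python
from typing import Callable, Dict, List

Info = Dict[str, str]  # hypothetical record: {"title": ..., "category": ...}


def determine_keywords(topic_label: str, related: Dict[str, List[str]]) -> List[str]:
    """Unit 31: look up search keywords related to the topic label."""
    return related.get(topic_label, [topic_label])


def determine_internet_info(keywords: List[str], category: str,
                            search: Callable[[str], List[Info]]) -> List[Info]:
    """Unit 32: keep search results that belong to the topic's category."""
    found: List[Info] = []
    for kw in keywords:
        found.extend(r for r in search(kw) if r["category"] == category)
    return found


def determine_targets(internet_info: List[Info], to_screen: List[Info]) -> List[Info]:
    """Unit 33: pick items from the set to be screened that share a feature (here, the title)."""
    titles = {r["title"] for r in internet_info}
    return [item for item in to_screen if item["title"] in titles]


def classify(topic_label: str, category: str, related: Dict[str, List[str]],
             search: Callable[[str], List[Info]],
             to_screen: List[Info]) -> Dict[str, List[Info]]:
    """Unit 34: file the target information under the predetermined topic."""
    keywords = determine_keywords(topic_label, related)
    internet_info = determine_internet_info(keywords, category, search)
    return {topic_label: determine_targets(internet_info, to_screen)}
```

Passing the search function in as a parameter keeps the sketch independent of any particular search engine, so a stub can stand in for the Internet when trying it out.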
In one embodiment, the predetermined information topic is a news topic, and the category is the news category.
In one embodiment, the information set to be screened is the set of news stored in a database.
In one embodiment, the target information determining unit 33 is specifically configured to determine, from the information set to be screened, the news having the headline of the determined news as target information.
In one embodiment, the device further includes a topic information display unit 35, configured to display the information in the determined predetermined information topic according to a predetermined topic information display rule.
In one embodiment, the topic information display unit 35 is specifically configured to display the information in the determined predetermined information topic according to the predetermined topic information display rule, based on the attributes of the determined Internet information.
In one embodiment, the Internet information determining unit 32 is specifically configured to: determine, from information on the Internet, candidate information corresponding to the determined search keywords; and determine, from the candidate information and according to preset candidate information screening conditions, the information that corresponds to the determined search keywords and belongs to the category.
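The two-stage behaviour of unit 32 (first collect candidates for a keyword, then apply preset screening conditions) might look like the sketch below. The substring search, the `Candidate` record fields, and the two example conditions (category check and a source whitelist with a made-up domain) are hypothetical; the application does not say what the preset screening conditions are.

```python
from typing import Callable, Dict, List

Candidate = Dict[str, str]
Condition = Callable[[Candidate], bool]


def search_candidates(keyword: str, internet: List[Candidate]) -> List[Candidate]:
    """Stand-in for a search engine: candidates whose title contains the keyword."""
    return [page for page in internet if keyword.lower() in page["title"].lower()]


def screen_candidates(candidates: List[Candidate],
                      conditions: List[Condition]) -> List[Candidate]:
    """Keep only candidates that satisfy every preset screening condition."""
    return [c for c in candidates if all(cond(c) for cond in conditions)]


# Hypothetical preset conditions: the page is in the news category
# and comes from a listed source.
TRUSTED_SOURCES = {"example-news.test"}
CONDITIONS: List[Condition] = [
    lambda c: c["category"] == "news",
    lambda c: c["source"] in TRUSTED_SOURCES,
]
```

Expressing each condition as a separate predicate means conditions can be added or removed without touching the screening function itself.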
The attributes of the determined Internet information include at least one of the following:
the search index of the determined search keyword corresponding to the information determined from the Internet;
the search-engine ranking of the web page corresponding to the information determined from the Internet, among the web pages corresponding to the candidate information.
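One possible display rule built on these two attributes is sketched below. The text only states that the display rule may use the attributes; the specific ordering (higher search index first, then better search-engine rank), the use of both attributes together, and the `TopicEntry` type are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TopicEntry:
    title: str
    search_index: int  # search index of the keyword that surfaced the item
    engine_rank: int   # position of the page in the search results (1 = top)


def order_for_display(entries: List[TopicEntry]) -> List[TopicEntry]:
    """Sort by search index (descending), breaking ties by search-engine rank (ascending)."""
    return sorted(entries, key=lambda e: (-e.search_index, e.engine_rank))
```

Since the text says "at least one of" the attributes, a rule that uses only one of them would be equally consistent with the description; the key function is the only part that would change.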
The information classification device provided in Embodiment 3 of the present application determines search keywords related to the label of a predetermined information topic, and then uses the category to which the information topic belongs, the determined search keywords, and the information on the Internet to determine the information belonging to that topic. Compared with the prior-art approach of dividing information topics by manual screening, this improves the efficiency of information classification.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing describes only embodiments of the present application and is not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.

Claims (16)

1. An information classification method, characterized in that the method comprises:
determining a search keyword related to the label of a predetermined information topic;
determining, from information on the Internet, according to the category to which the predetermined information topic belongs and the determined search keyword, information that corresponds to the determined search keyword and belongs to the category;
determining, from an information set to be screened, according to a feature of the determined Internet information, information having the feature as target information, wherein the information set to be screened consists of information belonging to the category;
classifying the determined information having the feature into the predetermined information topic.
2. The method as claimed in claim 1, characterized in that:
the predetermined information topic is a news topic;
the category is the news category.
3. The method as claimed in claim 2, characterized in that the information set to be screened is a set of news stored in a database.
4. The method as claimed in claim 3, characterized in that determining, from the information set to be screened, according to the feature of the determined Internet information, the information having the feature as target information comprises:
determining, from the information set to be screened, according to the headline of the determined news, the news having the headline as target information.
5. The method as claimed in claim 1, characterized in that, after the determined information having the feature is classified into the predetermined information topic, the method further comprises:
displaying the information in the determined predetermined information topic according to a predetermined topic information display rule.
6. The method as claimed in claim 5, characterized in that displaying the information in the determined predetermined information topic according to the predetermined topic information display rule comprises:
displaying the information in the determined predetermined information topic according to the predetermined topic information display rule, based on the attributes of the determined Internet information.
7. The method as claimed in claim 6, characterized in that determining, from information on the Internet, according to the category to which the predetermined information topic belongs and the determined search keyword, the information that corresponds to the determined search keyword and belongs to the category comprises:
determining, from information on the Internet, according to the determined search keyword, candidate information corresponding to the determined search keyword;
determining, from the determined candidate information, according to preset candidate information screening conditions, the information that corresponds to the determined search keyword and belongs to the category.
8. The method as claimed in claim 7, characterized in that the attributes of the determined Internet information include at least one of the following:
the search index of the determined search keyword corresponding to the information determined from the Internet;
the search-engine ranking of the web page corresponding to the information determined from the Internet, among the web pages corresponding to the candidate information.
9. An information classification device, characterized in that the device comprises:
a search keyword determining unit, configured to determine a search keyword related to the label of a predetermined information topic;
an Internet information determining unit, configured to determine, from information on the Internet, according to the category to which the predetermined information topic belongs and the determined search keyword, information that corresponds to the determined search keyword and belongs to the category;
a target information determining unit, configured to determine, from an information set to be screened, according to a feature of the determined Internet information, information having the feature as target information, wherein the information set to be screened consists of information belonging to the category;
a classification unit, configured to classify the determined target information having the feature into the predetermined information topic.
10. The device as claimed in claim 9, characterized in that:
the predetermined information topic is a news topic;
the category is the news category.
11. The device as claimed in claim 10, characterized in that the information set to be screened is a set of news stored in a database.
12. The device as claimed in claim 11, characterized in that:
the target information determining unit is specifically configured to determine, from the information set to be screened, according to the headline of the determined news, the news having the headline as target information.
13. The device as claimed in claim 9, characterized in that the device further comprises:
a topic information display unit, configured to display the information in the determined predetermined information topic according to a predetermined topic information display rule.
14. The device as claimed in claim 13, characterized in that:
the topic information display unit is specifically configured to display the information in the determined predetermined information topic according to the predetermined topic information display rule, based on the attributes of the determined Internet information.
15. The device as claimed in claim 14, characterized in that:
the Internet information determining unit is specifically configured to determine, from information on the Internet, according to the determined search keyword, candidate information corresponding to the determined search keyword;
and to determine, from the determined candidate information, according to preset candidate information screening conditions, the information that corresponds to the determined search keyword and belongs to the category.
16. The device as claimed in claim 15, characterized in that the attributes of the determined Internet information include at least one of the following:
the search index of the determined search keyword corresponding to the information determined from the Internet;
the search-engine ranking of the web page corresponding to the information determined from the Internet, among the web pages corresponding to the candidate information.
CN201610183721.XA 2016-03-28 2016-03-28 Information classification method and apparatus Pending CN105893467A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610183721.XA CN105893467A (en) 2016-03-28 2016-03-28 Information classification method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610183721.XA CN105893467A (en) 2016-03-28 2016-03-28 Information classification method and apparatus

Publications (1)

Publication Number Publication Date
CN105893467A true CN105893467A (en) 2016-08-24

Family

ID=57014557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610183721.XA Pending CN105893467A (en) 2016-03-28 2016-03-28 Information classification method and apparatus

Country Status (1)

Country Link
CN (1) CN105893467A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503266A (en) * 2016-11-30 2017-03-15 政和科技股份有限公司 Document Classification Method and device
CN111460257A (en) * 2020-03-27 2020-07-28 北京百度网讯科技有限公司 Thematic generation method and device, electronic equipment and storage medium
CN111552879A (en) * 2020-04-29 2020-08-18 百度在线网络技术(北京)有限公司 Data processing method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101984423A (en) * 2010-10-21 2011-03-09 百度在线网络技术(北京)有限公司 Hot-search word generation method and system
CN103577501A (en) * 2012-08-10 2014-02-12 深圳市世纪光速信息技术有限公司 Hot topic searching system and hot topic searching method
CN104182443A (en) * 2014-03-28 2014-12-03 无锡天脉聚源传媒科技有限公司 News searching method and device
CN104572846A (en) * 2014-12-12 2015-04-29 百度在线网络技术(北京)有限公司 Method, device and system for recommending hot words
CN105224699A (en) * 2015-11-17 2016-01-06 Tcl集团股份有限公司 A kind of news recommend method and device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503266A (en) * 2016-11-30 2017-03-15 政和科技股份有限公司 Document Classification Method and device
CN111460257A (en) * 2020-03-27 2020-07-28 北京百度网讯科技有限公司 Thematic generation method and device, electronic equipment and storage medium
CN111460257B (en) * 2020-03-27 2023-10-31 北京百度网讯科技有限公司 Thematic generation method, apparatus, electronic device and storage medium
CN111552879A (en) * 2020-04-29 2020-08-18 百度在线网络技术(北京)有限公司 Data processing method and device
CN111552879B (en) * 2020-04-29 2023-10-03 百度在线网络技术(北京)有限公司 Data processing method and device

Similar Documents

Publication Publication Date Title
CN104834729B (en) Topic recommends method and topic recommendation apparatus
US9704185B2 (en) Product recommendation using sentiment and semantic analysis
US10521469B2 (en) Image Re-ranking method and apparatus
US10031975B2 (en) Presentation of search results based on the size of the content sources from which they are obtained
US8635281B2 (en) System and method for attentive clustering and analytics
JP6141305B2 (en) Image search
US20110264651A1 (en) Large scale entity-specific resource classification
US8843483B2 (en) Method and system for interactive search result filter
US20160048754A1 (en) Classifying resources using a deep network
US20130157234A1 (en) Storyline visualization
US20080065602A1 (en) Selecting advertisements for search results
CA2832911C (en) System and method for filtering documents
CN111680254B (en) Content recommendation method and device
CN104077415A (en) Searching method and device
CN106354867A (en) Multimedia resource recommendation method and device
KR100954842B1 (en) Method and System of classifying web page using category tag information and Recording medium using by the same
CN107977420A (en) The abstract extraction method, apparatus and readable storage medium storing program for executing of a kind of evolved document
CN105893467A (en) Information classification method and apparatus
CN103412880A (en) Method and device for determining implicit associated information between multimedia resources
TW201333727A (en) Open-ended detection and categorization of word clusters in text data
US20180137198A1 (en) Data retrieval system
CN105868345A (en) Method and device for determining information
CN104077281B (en) It is a kind of to generate the method and apparatus for promoting language
US11803574B2 (en) Clustering approach for auto generation and classification of regional sports
JP7042720B2 (en) Information processing equipment, information processing methods, and programs

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160824

RJ01 Rejection of invention patent application after publication