CN107239497A - Hot content searching method and system - Google Patents
Publication number: CN107239497A (application CN201710301979.XA)
Authority: CN (China)
Prior art keywords: value, hot, content, text data, dimensional parameter
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Abstract
The present invention relates to a hot content searching method and system. The hot content searching method may comprise the following steps: obtaining a search keyword; retrieving in a preset index library according to the search keyword to obtain pieces of text data; obtaining, according to a preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data; taking the product of the heat increase value and a preset decay value as the heat value of the dimension parameter, and summing the heat values of the dimension parameters to obtain the content heat value of the text data; sorting the pieces of text data according to the content heat value to obtain the sorted text data; and displaying the sorted text data, or sending it to a corresponding external application, as the hot content found for the search keyword. The invention can reflect the heat situation within a time period and the timeliness of the content heat value, and effectively improves the accuracy of the hot content information obtained.
Description
Technical field
The present invention relates to the technical field of data retrieval, and in particular to a hot content searching method and system.
Background
In a data retrieval service, content information is first collected, and an index is then built from the collected content information data. When an external application uses this content information data, it performs full-text retrieval through the index, and the results are sorted by default according to dimensions such as the release time, number of comments and number of thumbs-up of the information, so as to obtain the content information that attracts the most attention.
In the course of implementation, the inventors found that the conventional technology has at least the following problem. With a conventional content retrieval method, the number of comments, the number of thumbs-up and similar counts grow as time passes, so the resulting content heat value keeps increasing. A content heat value, however, is usually time-sensitive and fluctuates over time. The conventional hot content searching method cannot reflect this timeliness and cannot obtain an accurate content heat value, so the accuracy of the hot content information obtained is low.
Summary of the invention
In view of this, it is necessary to provide a hot content searching method and system that address the low accuracy with which the conventional hot content searching method obtains hot content information.
To achieve the above object, in one aspect, an embodiment of the present invention provides a hot content searching method, comprising the following steps:
Obtaining a search keyword; retrieving in a preset index library according to the search keyword to obtain pieces of text data;
Obtaining, according to a preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data; taking the product of the heat increase value and a preset decay value as the heat value of the dimension parameter, and summing the heat values of the dimension parameters to obtain the content heat value of the text data;
Sorting the pieces of text data according to the content heat value to obtain the sorted text data;
Displaying the sorted text data, or sending it to a corresponding external application, as the hot content found for the search keyword.
In another aspect, an embodiment of the present invention further provides a hot content search system, comprising:
A full-text retrieval unit, configured to obtain a search keyword and to retrieve in a preset index library according to the search keyword to obtain pieces of text data;
A content heat value acquiring unit, configured to obtain, according to a preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data; to take the product of the heat increase value and a preset decay value as the heat value of the dimension parameter; and to sum the heat values of the dimension parameters to obtain the content heat value of the text data;
A sorting unit, configured to sort the pieces of text data according to the content heat value to obtain the sorted text data;
A feedback unit, configured to display the sorted text data, or send it to a corresponding external application, as the hot content found for the search keyword.
The present invention has the following advantages and beneficial effects:
The hot content searching method and system of the present invention obtain the content heat value of each piece of text data according to a preset time-fluctuation heat algorithm. Obtaining the heat value by multiplying the time decay value by the heat increase value greatly reduces the bias that the passage of time introduces into the assessment of content heat, so the resulting content heat value is more accurate. The pieces of text data are then sorted according to the content heat value, which yields sorting results that accurately reflect content heat. These steps allow the present invention to reflect the heat situation within a time period and the timeliness of the content heat value. In addition, because the calculation is based on the heat increase value within a time period and the sum of the heat values of the dimension parameters is used as the content heat value, the accuracy of the hot content information obtained is effectively improved.
Brief description of the drawings
Fig. 1 is a schematic flowchart of Embodiment 1 of the hot content searching method of the present invention;
Fig. 2 is a schematic flowchart of Embodiment 2 of the hot content searching method of the present invention;
Fig. 3 is a schematic structural diagram of Embodiment 1 of the hot content search system of the present invention;
Fig. 4 is a schematic structural diagram of Embodiment 2 of the hot content search system of the present invention.
Detailed description
To facilitate understanding of the present invention, the present invention is described more fully below with reference to the accompanying drawings, which show preferred embodiments of the present invention. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. Rather, these embodiments are provided so that the disclosure of the present invention will be more thorough and complete.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the description of the present invention are intended only to describe specific embodiments and are not intended to limit the present invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Application scenarios of the hot content searching method and system of the present invention:
In a conventional content retrieval method, the degree of attention is the value obtained by summing dimension data such as release time, number of comments and number of thumbs-up; the higher the value, the higher the degree of attention. The index serves full-text retrieval, and the degree of attention is the reference by which the retrieval results are sorted. When searching, the conventional hot content searching method obtains the pieces of text data according to the search keyword and then determines the final sorting results in combination with the degree of attention. However, the conventional method operates directly on the values of the dimension parameters (i.e. the dimension data). The dimension data of different content can easily differ greatly, so the sheer magnitude of the values themselves ultimately makes the sorting results inaccurate.
The hot content searching method and system of the present invention are particularly suitable for targeted websites, such as the websites of various industries. Preferably, they are suitable for content-cloud series software projects such as an intelligent semantic knowledge graph. As the "central kitchen" of media operations, the intelligent semantic knowledge graph plays the key role of collecting, cleaning and warehousing media data and of providing retrieval services for content editing. That is, the intelligent semantic platform crawls relevant media data from the websites of cooperating media clients according to preset crawling rules, stores it in a database to accumulate media data, and provides data search services for media content editors. Because the present invention can crawl relevant data from the websites of cooperating media clients, the hot content finally found is closer to the industry heat of a given industry, which improves the accuracy of the search results.
Embodiment 1 of the hot content searching method of the present invention:
To solve the problem that the conventional hot content searching method obtains hot content information with low accuracy, the present invention provides Embodiment 1 of a hot content searching method. Fig. 1 is a schematic flowchart of Embodiment 1 of the hot content searching method of the present invention. As shown in Fig. 1, the method may comprise the following steps:
Step S110: obtaining a search keyword; retrieving in a preset index library according to the search keyword to obtain pieces of text data;
Step S120: obtaining, according to a preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data; taking the product of the heat increase value and a preset decay value as the heat value of the dimension parameter, and summing the heat values of the dimension parameters to obtain the content heat value of the text data;
Step S130: sorting the pieces of text data according to the content heat value to obtain the sorted text data;
Step S140: displaying the sorted text data, or sending it to a corresponding external application, as the hot content found for the search keyword.
Specifically, the present invention obtains the pieces of text data by retrieval (preferably by full-text retrieval). For the text data, according to the preset time-fluctuation heat algorithm, the heat value of each dimension parameter is calculated by multiplying the decay value by the heat increase value, and the content heat value of the text data is obtained from these heat values. When a user enters a keyword to search, full-text retrieval is first performed according to the keyword, the pieces of text data are then sorted according to the content heat value, and the sorted results are returned to the user.
The dimension parameters are parameters that measure content heat and are obtained from user behavior data. Preferably, a dimension parameter is dimension data that reflects user attention to the text data (for example, data recording user behavior such as the number of likes, thumbs-up, comments and reposts). The decay value may be a numeric constant that differs from one time period to another and gradually decreases as time increases. The heat increase value may be the amount by which a given piece of dimension data (i.e. a given dimension parameter) grows within a time range. The content heat value reflects how popular the content is as it changes over time; the larger the value, the more popular the content. Preferably, the heat increase value is the increase of a dimension parameter of the text data (such as the number of thumbs-up, reads or comments) calculated over a period of time. Decay values may be assigned flexibly according to how the time periods are divided. Preferably, the decay value for three days is 0.8, the decay value for one week is 0.5, and the decay value for half a month is 0.3. The smaller the decay value, the greater the degree of decay.
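The preferred decay values above can be sketched as a simple lookup. This is a minimal sketch rather than the patented implementation: the function name `decay_value`, the reading of half a month as 15 days, and the reuse of 0.3 for any longer period are assumptions made for illustration.

```python
from bisect import bisect_left

# Period lengths in days and their decay values, from the preferred example
# in the text (half a month is assumed to mean 15 days).
PERIOD_DAYS = [3, 7, 15]
DECAY_VALUES = [0.8, 0.5, 0.3]


def decay_value(period_days: float, floor: float = 0.3) -> float:
    """Return the decay value for a period of the given length in days.

    A period that falls between two listed lengths takes the value of the
    next longer one. Periods longer than 15 days fall back to `floor`,
    which is an assumption; the text does not cover them.
    """
    i = bisect_left(PERIOD_DAYS, period_days)
    return DECAY_VALUES[i] if i < len(DECAY_VALUES) else floor
```

Keeping the table in two parallel lists makes it easy to re-divide the time periods, which the text says may be done flexibly.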
The present invention performs full-text retrieval in the preset index library and then sorts the obtained pieces of text data according to the content heat value. Such sorting results accurately reflect the heat situation of the text data and the timeliness of the content heat value, thereby effectively improving the accuracy of the hot content information obtained.
In a specific embodiment, according to the preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data is obtained based on the following formula:
heat increase value = parameter value of the dimension parameter at the current time - parameter value of the dimension parameter in the previous time period.
Specifically, the preset time-fluctuation heat algorithm of the present invention reflects the heat situation within a time period, instead of simply operating directly on the values of dimension parameters such as the number of comments, reads and thumbs-up. Different content can differ greatly in its number of comments, reads and thumbs-up, so the magnitude of the values themselves would ultimately make the sorting results inaccurate. With the preset time-fluctuation heat algorithm, the present invention calculates from the increase within a time period, which effectively improves accuracy.
Preferably, the preset time-fluctuation heat algorithm may be implemented according to the following formulas:
(1) heat increase value = value of a given dimension (i.e. a given dimension parameter) at the current time - value of that dimension in the previous time period;
(2) the decay value is a specific constant that keeps decreasing as time passes;
(3) heat value of a given dimension = decay value × heat increase value of that dimension (i.e. the product of the two);
(4) content heat value = sum of the heat values of the multiple dimensions.
Further, when there is only one dimension parameter, the content heat value may also be obtained by multiplying the decay value by the heat increase value of that dimension parameter.
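Formulas (1) to (4) can be sketched in a few lines. The function and dictionary key names below are hypothetical, and treating a dimension with no previous value as starting from 0 is an assumption the text does not state.

```python
def heat_increase(current: float, previous: float) -> float:
    """Formula (1): value at the current time minus value in the previous period."""
    return current - previous


def content_heat_value(current: dict, previous: dict, decay: float) -> float:
    """Formulas (3) and (4): sum of decay * heat increase over all dimensions.

    `current` and `previous` map a dimension name to its counter value.
    A dimension missing from `previous` is treated as 0 (an assumption).
    """
    return sum(
        decay * heat_increase(value, previous.get(name, 0.0))
        for name, value in current.items()
    )


# Illustrative counters for one piece of text data (made-up numbers).
now = {"thumbs_up": 120, "comments": 40, "reads": 1000}
before = {"thumbs_up": 100, "comments": 30, "reads": 700}
score = content_heat_value(now, before, decay=0.5)  # 0.5 * (20 + 10 + 300)
```

With a single dimension in `current`, the same function reduces to the one-parameter case described above.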
In a specific embodiment, the dimension parameters include a thumbs-up parameter, a comment parameter and a reading parameter;
The step of summing the heat values of the dimension parameters to obtain the content heat value of the text data comprises:
Obtaining the product of each heat value and the heat weight corresponding to its dimension parameter, and summing the products to obtain the content heat value.
Specifically, to obtain the content heat value more accurately, a heat weight may be set for each dimension parameter according to how frequently the corresponding user behavior (such as thumbs-up, likes and comments) occurs. Preferably, the heat weight of user likes is set to the largest value, followed by that of comments. The heat weight is then multiplied by the heat value of the dimension parameter, and the products are summed to obtain the content heat value of the text data. In this way, the true heat of the text data is reflected more accurately.
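The weighted variant can be sketched as follows. The weight values are hypothetical; the text only states that likes preferably carry the largest weight and comments the next largest.

```python
# Hypothetical heat weights: likes largest, comments second, as the text prefers.
HEAT_WEIGHTS = {"like": 3.0, "comment": 2.0, "read": 1.0}


def weighted_content_heat(heat_values: dict, weights: dict = HEAT_WEIGHTS) -> float:
    """Sum of (heat weight * heat value) over the dimension parameters.

    `heat_values` maps a dimension name to its heat value, i.e. the product
    of the decay value and the heat increase value for that dimension.
    """
    return sum(weights[name] * value for name, value in heat_values.items())


# Made-up heat values for one piece of text data.
example = weighted_content_heat({"like": 10.0, "comment": 5.0, "read": 4.0})
```

A dimension without a configured weight raises `KeyError` here, which makes a missing weight visible instead of silently ignoring that dimension.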
In a specific embodiment, the following steps are further included before the step of obtaining the search keyword:
Crawling the content information of a website according to a preset crawling rule to obtain the text data of the content information;
Performing word segmentation on the text data to obtain the segmented words and sentences;
Building an inverted index from the segmented words and sentences, and constructing the preset index library from the inverted index.
Specifically, the industry website in the present invention may be an industry portal website. The present invention can build the index from the text of the crawled website content: word segmentation is first performed on the text data, and an inverted index is then built from the words and sentences that are cut out. In full-text search, an inverted index is the data that stores the mapping from a word to its storage locations in a document or a group of documents. Because relevant data is crawled from an industry portal website, the hot content finally found is closer to the industry heat of a given industry, which further improves the accuracy of the search results.
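The mapping described above, from a word to its positions in each document, can be sketched with the standard library. Splitting on whitespace is only a stand-in for the word segmentation step, and the function name and document ids are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs: dict) -> dict:
    """Map each token to {document id: [token positions]}.

    `docs` maps a document id to its text. Lowercasing and whitespace
    splitting stand in for a real word segmentation algorithm.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, token in enumerate(text.lower().split()):
            index[token][doc_id].append(position)
    # Convert the nested defaultdicts to plain dicts for predictable lookups.
    return {token: dict(postings) for token, postings in index.items()}


index = build_inverted_index({1: "hot content search", 2: "content index"})
```

Looking up `index["content"]` returns the documents that contain the word together with the positions where it occurs, which is what a keyword search needs.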
It should be noted that the preset crawling rule may refer to a web crawler. The step of performing word segmentation on the text data to obtain the segmented words and sentences may be implemented with, for example, a segmentation method based on dictionary matching, a segmentation algorithm based on semantic analysis, or a segmentation method based on a probabilistic statistical model.
Preferably, in the present invention the step of performing full-text retrieval according to the search keyword to obtain the pieces of text data may be implemented with Solr (an enterprise search application server). This further improves how well each piece of text data matches the keyword and ensures the accuracy of the hot content search. It also provides technical support for building the index and for obtaining accurate heat increase values of the dimension parameters.
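A full-text query against Solr is typically an HTTP request to a select handler. The sketch below only builds such a request URL with the Python standard library. The host, the core name `articles` and the field name `text` are hypothetical, since the source does not describe how its Solr instance is configured.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, core: str, keyword: str, rows: int = 10) -> str:
    """Build a Solr select URL that searches a text field for the keyword.

    q, rows and wt are standard Solr query parameters. The field name
    "text" and the caller-supplied core name are assumptions.
    """
    params = {"q": f"text:{keyword}", "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


url = solr_select_url("http://localhost:8983", "articles", "hot")
```

The resulting URL could be fetched with any HTTP client, and the returned documents would then be passed to the heat calculation and sorting steps.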
Embodiment 1 of the hot content searching method of the present invention obtains the content heat value of each piece of text data according to the preset time-fluctuation heat algorithm. The heat value obtained by multiplying the time decay value by the heat increase value greatly reduces the bias that the passage of time introduces into the assessment of content heat, so the resulting content heat value is more accurate. The pieces of text data are then sorted according to the content heat value, which yields sorting results that accurately reflect content heat. These steps allow the present invention to reflect the heat situation within a time period and the timeliness of the content heat value. In addition, because the calculation is based on the heat increase value within a time period and the sum of the heat values of the dimension parameters is used as the content heat value, the accuracy of the hot content information obtained is effectively improved.
Embodiment 2 of the hot content searching method of the present invention:
To solve the problem that the conventional hot content searching method obtains hot content information with low accuracy, the present invention further provides Embodiment 2 of a hot content searching method. Compared with Embodiment 1 above, in addition to sorting the pieces of text data according to the content heat value, Embodiment 2 also calculates a matching value (score) from the degree of text matching when performing full-text retrieval on the text data, sorts the results by the combination of the matching value and the heat value, and then returns the retrieved content. Such sorting results better reflect the heat situation of an article. Fig. 2 is a schematic flowchart of Embodiment 2 of the hot content searching method of the present invention. As shown in Fig. 2, the method may comprise the following steps:
Step S210: obtaining a search keyword; retrieving in a preset index library according to the search keyword to obtain pieces of text data;
Step S220: obtaining, according to a preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data; taking the product of the heat increase value and a preset decay value as the heat value of the dimension parameter, and summing the heat values of the dimension parameters to obtain the content heat value of the text data;
Step S230: obtaining the matching value of each piece of text data according to the degree of matching between the search keyword and the words and sentences in the preset index library;
Step S240: summing the content heat value and the matching value to obtain a final score;
Step S250: sorting the pieces of text data in descending order of final score to obtain the sorted text data;
Step S260: displaying the sorted text data, or sending it to a corresponding external application, as the hot content found for the search keyword.
Specifically, before the step in Embodiment 1 of sorting the pieces of text data according to the content heat value, the following step is further included:
Obtaining the matching value of each piece of text data according to the degree of matching between the search keyword and the words and sentences in the preset index library;
The step in Embodiment 1 of sorting the pieces of text data to obtain the sorted text data may comprise:
Summing the content heat value and the matching value to obtain a final score;
Sorting the pieces of text data in descending order of final score to obtain the sorted text data.
Preferably, when searching, the present invention first performs full-text matching in the index library according to the keyword and calculates a score from the degree to which the keyword matches the words in the index library (for example, the matching value is obtained with a similarity algorithm). This score is then added to the content heat value to obtain the final score, and the pieces of text data are returned sorted by score from largest to smallest. Such sorting results better reflect the heat situation of the text data.
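Steps S240 and S250 amount to adding two numbers per result and sorting in descending order. The sketch below assumes that the matching value and the content heat value have already been computed; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One retrieved piece of text data with its two partial scores."""
    doc_id: str
    matching_value: float      # from keyword matching, e.g. a similarity score
    content_heat_value: float  # from the time-fluctuation heat algorithm

    @property
    def final_score(self) -> float:
        # Step S240: final score = content heat value + matching value.
        return self.content_heat_value + self.matching_value


def rank(hits: list) -> list:
    """Step S250: sort the hits in descending order of final score."""
    return sorted(hits, key=lambda hit: hit.final_score, reverse=True)


hits = [Hit("a", 1.2, 3.0), Hit("b", 2.5, 0.5), Hit("c", 0.8, 5.0)]
ranked_ids = [hit.doc_id for hit in rank(hits)]
```

Because the two values are added directly, their scales decide which one dominates the ranking. The text does not say whether either value is normalized first.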
It should be clear that the other steps of Embodiment 2 of the hot content searching method of the present invention may be the same as those of Embodiment 1 above and achieve the same or better technical effects (for example, reflecting the heat of the searched content more accurately, or obtaining a more accurate content heat value), and they are not repeated here.
Embodiment 1 of the hot content search system of the present invention:
Based on the technical solutions of the above embodiments of the hot content searching method, and to solve the problem that the conventional hot content searching method obtains hot content information with low accuracy, the present invention further provides Embodiment 1 of a hot content search system. Fig. 3 is a schematic structural diagram of Embodiment 1 of the hot content search system of the present invention. As shown in Fig. 3, the system may include:
A full-text retrieval unit 310, configured to obtain a search keyword and to perform full-text retrieval in a preset index library according to the search keyword to obtain pieces of text data;
A content heat value acquiring unit 320, configured to obtain, according to a preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data; to take the product of the heat increase value and a preset decay value as the heat value of the dimension parameter; and to sum the heat values of the dimension parameters to obtain the content heat value of the text data;
A sorting unit 330, configured to sort the pieces of text data according to the content heat value to obtain the sorted text data;
A feedback unit 340, configured to display the sorted text data, or send it to a corresponding external application, as the hot content found for the search keyword.
In a specific embodiment, the content heat value acquiring unit obtains, according to the preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimension parameter of the text data based on the following formula:
heat increase value = parameter value of the dimension parameter at the current time - parameter value of the dimension parameter in the previous time period.
In a specific embodiment, the dimension parameters include a thumbs-up parameter, a comment parameter and a reading parameter;
The content heat value acquiring unit 320 is further configured to obtain the product of each heat value and the heat weight corresponding to its dimension parameter, and to sum the products to obtain the content heat value.
In a specific embodiment, the hot content search system further includes an index library construction unit 350;
The index library construction unit 350 includes:
A crawling module 352, configured to crawl the content information of an industry website according to a preset crawling rule to obtain the text data of the content information;
A word segmentation module 354, configured to perform word segmentation on the text data to obtain the segmented words and sentences;
An index library building module 356, configured to build an inverted index from the segmented words and sentences and to construct the preset index library from the inverted index.
It should be noted in particular that Embodiment 1 of the hot content search system of the present invention can correspondingly implement the method steps of Embodiment 1 of the hot content searching method above, which are not repeated here.
Embodiment 2 of the hot content search system of the present invention:
Based on the technical solutions of the above embodiments of the hot content searching method, and to solve the problem that the conventional hot content searching method obtains hot content information with low accuracy, the present invention further provides Embodiment 2 of a hot content search system on the basis of the system structure of Embodiment 1 of the hot content search system. Fig. 4 is a schematic structural diagram of Embodiment 2 of the hot content search system of the present invention. As shown in Fig. 4, the hot content search system may further include:
A matching value acquiring unit 460, configured to obtain the matching value of each piece of text data according to the degree of matching between the search keyword and the words and sentences in the preset index library;
The sorting unit 430 may include:
A summation module 432, configured to sum the content heat value and the matching value to obtain a final score;
A sorting module 434, configured to sort the pieces of text data in descending order of final score to obtain the sorted text data.
It should be noted in particular that Embodiment 2 of the hot content search system of the present invention can correspondingly implement the method steps of Embodiment 2 of the hot content searching method above, which are not repeated here.
Each embodiment of the hot content search system of the present invention obtains the content heat value of each piece of text data according to the preset time-fluctuation heat algorithm. The heat value obtained by multiplying the time decay value by the heat increase value greatly reduces the bias that the passage of time introduces into the assessment of content heat, so the resulting content heat value is more accurate. The pieces of text data are then sorted according to the content heat value, which yields sorting results that accurately reflect content heat. These steps allow the present invention to reflect the heat situation within a time period and the timeliness of the content heat value. In addition, because the calculation is based on the heat increase value within a time period and the sum of the heat values of the dimension parameters is used as the content heat value, the accuracy of the hot content information obtained is effectively improved.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification. A person of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps described in the above methods. The storage medium may be, for example, ROM/RAM, a magnetic disk or an optical disc.
The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be pointed out that a person of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (10)
1. A hot content search method, characterized by comprising the following steps:
obtaining a search keyword; retrieving, according to the search keyword, from a preset index library to obtain each piece of text data;
obtaining, according to a preset time-fluctuation heat algorithm, a heat increase value corresponding to each dimensional parameter of the text data;
taking the product of the heat increase value and a preset attenuation value as the hot value of the dimensional parameter, and summing the hot values of the dimensional parameters to obtain the content hot value of the text data;
sorting the pieces of text data according to the content hot value to obtain the sorted text data;
displaying the sorted text data as the hot content found for the search keyword, or sending it to a corresponding external application.
2. The hot content search method according to claim 1, characterized in that, according to the preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimensional parameter of the text data is obtained based on the following formula:
heat increase value = parameter value of the dimensional parameter at the current time - parameter value of the dimensional parameter in the previous time period.
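As an illustration only, the formula of claim 2 can be sketched in Python as follows. The function name `heat_increase` and the sample counts are hypothetical and do not come from the patent; the length of a "time period" is left to the caller.

```python
def heat_increase(current_value: float, previous_value: float) -> float:
    """Heat increase value of one dimensional parameter.

    current_value:  parameter value at the current time
    previous_value: parameter value in the previous time period
    """
    return current_value - previous_value


# Hypothetical like counts for one piece of text data.
likes_now, likes_previous = 120, 90
print(heat_increase(likes_now, likes_previous))  # 30
```

A negative result simply means the parameter fell over the period; the claim does not state whether such values are clamped, so this sketch leaves them unchanged.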
3. The hot content search method according to claim 1, characterized in that the dimensional parameters include a like parameter, a comment parameter and a reading parameter;
the step of summing the hot values of the dimensional parameters to obtain the content hot value of the text data comprises:
obtaining the product of each hot value and the heat weight corresponding to its dimensional parameter, and summing the products to obtain the content hot value.
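The weighted summation of claims 1 and 3 might look like the minimal sketch below. It assumes a single preset attenuation value shared by all dimensional parameters, which the claims do not state explicitly; the function name, dictionary keys, weights and sample numbers are all hypothetical.

```python
from typing import Dict


def content_hot_value(
    increases: Dict[str, float],
    attenuation: float,
    weights: Dict[str, float],
) -> float:
    """Content hot value of one piece of text data.

    increases:   heat increase value per dimensional parameter
    attenuation: preset attenuation value (assumed shared by all parameters)
    weights:     heat weight per dimensional parameter
    """
    total = 0.0
    for name, increase in increases.items():
        hot_value = increase * attenuation   # hot value of this parameter (claim 1)
        total += hot_value * weights[name]   # weighted by its heat weight (claim 3)
    return total


# Hypothetical values for the like, comment and reading parameters.
increases = {"likes": 30, "comments": 10, "reads": 200}
weights = {"likes": 0.5, "comments": 0.25, "reads": 0.25}
print(content_hot_value(increases, attenuation=0.5, weights=weights))  # 33.75
```

Setting every weight to 1 reduces this to the unweighted sum described in claim 1.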
4. The hot content search method according to any one of claims 1 to 3, characterized in that, before the step of sorting the pieces of text data according to the content hot value, the method further comprises:
obtaining a matching value of each piece of text data according to the degree of matching between the search keyword and the words and sentences in the preset index library;
the step of sorting the pieces of text data to obtain the sorted text data comprises:
summing the content hot value and the matching value to obtain a final score;
sorting the pieces of text data in descending order of the final score to obtain the sorted text data.
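The ranking step of claim 4 can be sketched as follows, assuming the matching value and content hot value have already been computed for each piece of text data. The `Result` class, its field names and the sample scores are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    doc_id: str
    matching_value: float      # match between the keyword and the indexed words
    content_hot_value: float   # summed hot values of the dimensional parameters

    @property
    def final_score(self) -> float:
        # Claim 4: final score = content hot value + matching value.
        return self.content_hot_value + self.matching_value


def rank(results: List[Result]) -> List[Result]:
    """Return the results sorted by final score, largest first."""
    return sorted(results, key=lambda r: r.final_score, reverse=True)


results = [
    Result("a", matching_value=1.0, content_hot_value=2.0),   # final score 3.0
    Result("b", matching_value=2.5, content_hot_value=3.0),   # final score 5.5
    Result("c", matching_value=4.0, content_hot_value=0.5),   # final score 4.5
]
print([r.doc_id for r in rank(results)])  # ['b', 'c', 'a']
```

Because the two values are added directly, their relative scales determine how much popularity can override textual relevance; the claims do not specify any normalization, so none is applied here.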
5. The hot content search method according to claim 4, characterized in that, before the step of obtaining the search keyword, the method further comprises the steps of:
crawling content information of websites according to preset crawling rules to obtain the text data of the content information;
performing word segmentation on the text data to obtain the segmented words and sentences;
building an inverted index from the segmented words and sentences, and constructing the preset index library according to the inverted index.
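A minimal sketch of the index-building steps of claim 5 is given below. Crawling is omitted and the documents are supplied as a plain dictionary. The `tokenize` function is a stand-in that splits on whitespace; real Chinese text would need a proper word segmenter, which the claim does not name. All function names and sample documents are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    # Stand-in for word segmentation: lowercase and split on whitespace.
    return text.lower().split()


def build_inverted_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each term to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return dict(index)


def search(index: Dict[str, Set[str]], keyword: str) -> Set[str]:
    """Look up a single keyword; unknown keywords return an empty set."""
    return index.get(keyword.lower(), set())


docs = {"d1": "hot content search", "d2": "content ranking"}
index = build_inverted_index(docs)
print(sorted(search(index, "content")))  # ['d1', 'd2']
```

The lookup returns only which documents contain the keyword; the matching value and content hot value used for ranking would be computed separately.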
6. A hot content search system, characterized by comprising:
a full-text search unit, configured to obtain a search keyword and to retrieve, according to the search keyword, from a preset index library to obtain each piece of text data;
a content hot value acquiring unit, configured to obtain, according to a preset time-fluctuation heat algorithm, a heat increase value corresponding to each dimensional parameter of the text data; to take the product of the heat increase value and a preset attenuation value as the hot value of the dimensional parameter; and to sum the hot values of the dimensional parameters to obtain the content hot value of the text data;
a sorting unit, configured to sort the pieces of text data according to the content hot value to obtain the sorted text data;
a feedback unit, configured to display the sorted text data as the hot content found for the search keyword, or to send it to a corresponding external application.
7. The hot content search system according to claim 6, characterized in that the content hot value acquiring unit obtains, according to the preset time-fluctuation heat algorithm, the heat increase value corresponding to each dimensional parameter of the text data based on the following formula:
heat increase value = parameter value of the dimensional parameter at the current time - parameter value of the dimensional parameter in the previous time period.
8. The hot content search system according to claim 6, characterized in that the dimensional parameters include a like parameter, a comment parameter and a reading parameter;
the content hot value acquiring unit is further configured to obtain the product of each hot value and the heat weight corresponding to its dimensional parameter, and to sum the products to obtain the content hot value.
9. The hot content search system according to any one of claims 6 to 8, characterized by further comprising:
a matching value acquiring unit, configured to obtain a matching value of each piece of text data according to the degree of matching between the search keyword and the words and sentences in the preset index library;
the sorting unit comprises:
a summation module, configured to sum the content hot value and the matching value to obtain a final score;
a sorting module, configured to sort the pieces of text data in descending order of the final score to obtain the sorted text data.
10. The hot content search system according to claim 9, characterized in that the hot content search system further comprises an index library construction unit;
the index library construction unit comprises:
a crawling module, configured to crawl content information of websites according to preset crawling rules to obtain the text data of the content information;
a word segmentation module, configured to perform word segmentation on the text data to obtain the segmented words and sentences;
an index library building module, configured to build an inverted index from the segmented words and sentences, and to construct the preset index library according to the inverted index.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710301979.XA CN107239497B (en) | 2017-05-02 | 2017-05-02 | Hot content search method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107239497A (en) | 2017-10-10 |
CN107239497B (en) | 2020-11-03 |
Family
ID=59984213
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710301979.XA (Active), CN107239497B (en) | 2017-05-02 | 2017-05-02 | Hot content search method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107239497B (en) |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101246499A (en) * | 2008-03-27 | 2008-08-20 | 腾讯科技(深圳)有限公司 | Network information search method and system |
CN101742722A (en) * | 2009-12-22 | 2010-06-16 | 卓望数码技术(深圳)有限公司 | Service searching method and device |
CN103365902A (en) * | 2012-03-31 | 2013-10-23 | 北大方正集团有限公司 | Method and device for evaluating Internet News |
CN102937960A (en) * | 2012-09-06 | 2013-02-20 | 北京邮电大学 | Device and method for identifying and evaluating emergency hot topic |
CN103077190A (en) * | 2012-12-20 | 2013-05-01 | 人民搜索网络股份公司 | Hot event ranking method based on order learning technology |
CN104657496A (en) * | 2015-03-09 | 2015-05-27 | 杭州朗和科技有限公司 | Method and equipment for calculating information hot value |
CN105488196A (en) * | 2015-12-07 | 2016-04-13 | 中国人民大学 | Automatic hot topic mining system based on internet corpora |
CN105653661A (en) * | 2015-12-29 | 2016-06-08 | 云南电网有限责任公司电力科学研究院 | Search result re-ranking method and device |
CN105653705A (en) * | 2015-12-30 | 2016-06-08 | 北京奇艺世纪科技有限公司 | Hot event searching method and device |
CN105653737A (en) * | 2016-03-01 | 2016-06-08 | 广州神马移动信息科技有限公司 | Method, equipment and electronic equipment for content document sorting |
CN105718598A (en) * | 2016-03-07 | 2016-06-29 | 天津大学 | AT based time model construction method and network emergency early warning method |
CN106599181A (en) * | 2016-12-13 | 2017-04-26 | 浙江网新恒天软件有限公司 | Hot news detecting method based on topic model |
Non-Patent Citations (1)
Title |
---|
魏萌 等: "《基于关键词的微博热点话题实时检测方法》", 《计算机与现代化》 * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109145246A (en) * | 2018-07-31 | 2019-01-04 | 成都华栖云科技有限公司 | A kind of news virtual click amount implementation method based on paas media cloud multi-tenant platform |
CN109275031A (en) * | 2018-09-25 | 2019-01-25 | 有米科技股份有限公司 | A kind of temperature appraisal procedure, device and the electronic equipment of video |
CN109275031B (en) * | 2018-09-25 | 2021-09-28 | 有米科技股份有限公司 | Video popularity evaluation method and device and electronic equipment |
CN109582852B (en) * | 2018-12-05 | 2021-04-09 | 中国银行股份有限公司 | Method and system for sorting full-text retrieval results |
CN109582852A (en) * | 2018-12-05 | 2019-04-05 | 中国银行股份有限公司 | A kind of sort method and system of full-text search result |
CN110532419A (en) * | 2019-08-29 | 2019-12-03 | 腾讯科技(深圳)有限公司 | A kind of processing method and processing device of audio |
CN110704436A (en) * | 2019-09-26 | 2020-01-17 | 郑州阿帕斯科技有限公司 | Hbase-based index generation method and device |
CN111026958A (en) * | 2019-11-29 | 2020-04-17 | 微梦创科网络科技(中国)有限公司 | Hot microblog sorting method and device |
CN111159312B (en) * | 2019-12-27 | 2024-04-26 | 东软集团股份有限公司 | Fault related information auxiliary retrieval method and device, storage medium and electronic equipment |
CN114996550A (en) * | 2021-05-24 | 2022-09-02 | 中移互联网有限公司 | Information retrieval method and device |
CN114996550B (en) * | 2021-05-24 | 2024-03-19 | 中移互联网有限公司 | Information retrieval method and device |
CN113886685A (en) * | 2021-09-23 | 2022-01-04 | 北京三快在线科技有限公司 | Searching method, searching device, storage medium and electronic equipment |
CN113886685B (en) * | 2021-09-23 | 2023-01-06 | 北京三快在线科技有限公司 | Searching method, searching device, storage medium and electronic equipment |
CN114154075A (en) * | 2022-02-08 | 2022-03-08 | 北京大氪信息科技有限公司 | Hot information determination method, hot information determination device, computer equipment and medium |
CN114154075B (en) * | 2022-02-08 | 2022-05-17 | 北京大氪信息科技有限公司 | Hot information determination method, device, computer equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN107239497B (en) | 2020-11-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||