CN101963974A - EPG column generating method - Google Patents
- Publication number
- CN101963974A (application number CN2010102726575A)
- Authority
- CN
- China
- Prior art keywords
- epg
- subclauses
- clauses
- similarity
- column
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an EPG (electronic program guide) column generating method comprising the following steps: performing syntax analysis on the items in the EPG, dividing the EPG items into different terms and assigning the terms corresponding weights; performing text clustering on the output of the syntax analysis, calculating the similarity of the EPG items from the terms and their weights, and applying a hierarchical clustering method to the EPG items according to preset clustering parameters and the item similarities to obtain a text clustering result; and finally, outputting the text clustering result as a new EPG column in the form of a column hierarchy. The method can be used in digital television so that a user can locate a program quickly.
Description
Technical field
The present invention relates to methods for generating a digital television EPG, and in particular to an EPG column generating method that lets users find programs quickly.
Background technology
An EPG (Electronic Program Guide) is now commonly used to provide indexing and navigation for the various services of digital television. Using the menus the EPG provides, users can select the channels they like, order programs on demand, look up various information, and so on, so the EPG plays an important role in digital television. In existing EPGs, however, the programs of different channels are not associated with one another, and no column describes programs that belong to the same series. As a result, administrators have no structured overall framework for managing the EPG, and users cannot use the EPG to find the programs they like quickly and effectively: finding a program may take a long time, and the program may even be missed.
Therefore, how to develop an EPG column that associates the EPGs of multiple channels and describes programs belonging to the same series, so as to help administrators manage the programs of multiple channels and help users quickly find the programs they like, has become a technical problem in urgent need of a solution.
Summary of the invention
In prior-art EPGs, the programs of multiple channels are not associated with one another and no EPG column describes programs belonging to the same series, which hinders both management and searching. To solve this technical problem, the present invention provides an EPG column generating method.
The technical solution adopted by the invention is an EPG column generating method comprising the following steps:
Step 1: perform syntax analysis on the items in the EPG, dividing the EPG items into different terms and assigning each term a corresponding weight;
Step 2: perform text clustering on the output of the syntax analysis: calculate the similarity of the EPG items from the terms and their weights, then apply a hierarchical clustering method to the EPG items according to preset clustering parameters and the item similarities, obtaining a text clustering result;
Step 3: finally, output the text clustering result as a new EPG column in the form of a column hierarchy.
The syntax analysis in step 1 comprises lexical analysis and parsing; the lexical analysis is implemented with regular expressions defined in lex, and the parsing with grammar rules defined in yacc.
In step 2, when the similarity of EPG items is calculated, each item is regarded as a vector and the similarity of the vectors is calculated from the weights; the similarity between EPG item vectors is expressed by the cosine distance.
In step 2, the cluster analysis process comprises:
(1) treating each EPG item as a class of its own, which is the initial state;
(2) calculating the distance between EPG items from their similarity, and merging the two closest classes into one;
(3) repeating process (2) until the similarity reaches a threshold.
The term shared by the EPG items merged in the text clustering step serves as the title of the EPG column.
After step 3, the method further comprises a step in which the user adjusts the clustering parameters through the presentation layer and text clustering is performed again with the adjusted parameters.
The invention analyzes the EPG items, divides them into different terms and assigns the terms corresponding weights, and then calculates the similarity of the EPG items from the weights and terms. It applies a hierarchical clustering method to the EPG items according to the preset clustering parameters and the item similarities, and outputs the resulting text clusters as new EPG columns, in the form of a column hierarchy, that describe programs belonging to the same series. This helps administrators manage programs from the same and different TV stations, and helps users quickly locate the programs they want to watch.
Description of drawings
The invention is described in detail below with reference to an embodiment and the accompanying drawing, in which:
Fig. 1 is a schematic diagram of the EPG column generating method of the invention.
Embodiment
The purpose of the EPG column generating method of the invention is to give EPG programs a structure. Its main technical idea is as follows. An analysis of the characteristics of EPG items shows that the program information in an EPG is in fact a summary composed of different words, numbers, and symbols (for example: relay CCTV-1 news hookup (June 23)). On that basis, the EPG is parsed, divided into different terms, and the terms are given corresponding weights. Text clustering is then performed to find the terms with high similarity; these hierarchically organized terms are the columns generated automatically in the EPG.
Refer to Fig. 1. The EPG column generating method of the invention comprises the following steps:
Step one: syntax analysis of EPG items.
The syntax component retrieves the EPGs of multiple TV stations from the EPG store and performs a deliberately loose syntax analysis on them, in order to split the EPG items into different terms and give the terms different weights. Because the programs of multiple TV stations may repeat, generating EPG columns helps administrators manage programs from the same and different TV stations, and helps users quickly locate the programs they want to watch.
The idea behind the grammar is that we regard an item in the EPG as a loosely organized description made up of words, numbers, and symbols, such as "TV play: red light note 5" or "N. B. university (the evening edition)". Syntax analysis proceeds in two stages: lexical analysis first, then parsing. Here we treat words (Chinese characters in the examples below, though they could equally be letters or text in other languages) and numbers as user-defined tokens of different types, while symbols such as ":" and "()" are keywords. The patterns can be expressed as:
super-term ':' term, for example "TV play: red light note";
super-term ' ' term, for example "man and nature Africa leopard";
term '(' sub-term ')', for example "news report (noon edition)";
term number, for example "big gate 21".
Thus an EPG item such as "TV play: red light note 5" is analyzed as: higher-level term: TV play; lower-level term: red light note; numeric qualifier: 5.
Lexical analysis can be implemented with regular expressions defined in lex, and parsing with grammar rules defined in yacc.
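The patent names lex and yacc but gives neither the grammar nor the weight values. The following is only a rough sketch of the splitting step, written in Python with the standard `re` module standing in for lex/yacc. The function name `parse_epg_item`, the `ParsedItem` type, and the numbers in `WEIGHTS` are all hypothetical choices made for illustration. The space-separated pattern is left out because it cannot be split without a vocabulary.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical weights: the source says terms receive weights but gives no values.
WEIGHTS = {"super": 1.0, "term": 2.0, "sub": 0.5}


@dataclass
class ParsedItem:
    terms: Dict[str, float]   # term text -> weight
    number: Optional[str]     # numeric qualifier, kept out of the term vector


def parse_epg_item(text: str) -> ParsedItem:
    rest = text.strip()
    terms: Dict[str, float] = {}

    # Pattern: super-term ':' term (ASCII or full-width colon)
    m = re.match(r"^(?P<sup>[^:：]+)[:：]\s*(?P<rest>.+)$", rest)
    if m:
        terms[m.group("sup").strip()] = WEIGHTS["super"]
        rest = m.group("rest").strip()

    # Pattern: term '(' sub-term ')'
    m = re.match(r"^(?P<head>[^(（]+)[(（](?P<sub>[^)）]+)[)）]$", rest)
    if m:
        terms[m.group("sub").strip()] = WEIGHTS["sub"]
        rest = m.group("head").strip()

    # Pattern: term number (trailing digits are the numeric qualifier)
    number = None
    m = re.match(r"^(?P<head>.*?\D)\s*(?P<num>\d+)$", rest)
    if m:
        number = m.group("num")
        rest = m.group("head").strip()

    if rest:
        terms[rest] = WEIGHTS["term"]
    return ParsedItem(terms, number)


item = parse_epg_item("TV play: red light note 5")
print(item.terms, item.number)
```

For the example item this yields the higher-level term "TV play", the lower-level term "red light note", and the numeric qualifier "5", matching the breakdown given in the text. A real lex/yacc grammar would replace the ordered regular expressions with token and production rules.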
Step two: text clustering.
The similarity of EPG items is calculated from the weights and the terms. Then, according to the preset clustering parameters and the item similarities, EPG items whose similarity falls within the range of the clustering parameters are merged, giving the text clustering result.
The clustering engine treats each EPG item as a vector and computes similarity over the index terms in these vectors, that is, the terms split out by the syntax analysis component. Because the index terms carry weights, the distance between EPG items is finally obtained as a cosine distance. The EPG items are then hierarchically clustered by distance until a preset distance threshold is reached or the user is satisfied.
1. Similarity calculation for EPG items
The characteristics of the EPG determine that the text similarity algorithm used for text clustering should be based on edit distance and should be computed by matching identical terms in the EPG. This is because similar EPG items are similar by sharing identical terms, for example "N. B. university (version at noon)" and "N. B. university (the evening edition)". In addition, numeric qualifiers do not take part in clustering.
EPG item analysis separates the different terms by means of the keywords and gives them different weights. In the text clustering field, a term's weight is generally determined by its frequency in the text; here, owing to the characteristics of the EPG, it is determined by the EPG analysis instead. When the similarity of EPG items is calculated, each EPG item is regarded as a vector, and the similarity of the vectors is calculated from the weights. The similarity between EPG item vectors is expressed by the cosine distance, which is not described further here.
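Since the text leaves the cosine computation unstated, here is a minimal sketch of it for weighted term vectors. It assumes each EPG item is represented as a dictionary mapping term to weight; that representation, the function names, and the example weights are illustrative assumptions, not taken from the patent.

```python
import math
from typing import Dict


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse term -> weight vectors."""
    # Only terms present in both items contribute to the dot product,
    # so items are similar exactly when they share identical terms.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Distance used for clustering: 0 for identical direction, 1 for no shared term."""
    return 1.0 - cosine_similarity(a, b)


# Hypothetical weights: 2.0 for the main term, 0.5 for the bracketed sub-term.
noon = {"N. B. university": 2.0, "version at noon": 0.5}
evening = {"N. B. university": 2.0, "the evening edition": 0.5}
print(cosine_similarity(noon, evening))  # 4 / 4.25, roughly 0.94
```

With these assumed weights the two editions share the heavily weighted main term, so their similarity is high even though the bracketed sub-terms differ. Numeric qualifiers are simply never placed in the vectors.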
2. Clustering of EPG items
We use a hierarchical clustering method for cluster analysis of the EPG items. The hierarchical clustering algorithm calculates the similarity between every two items using the similarity calculation given in 1 above, then builds a cluster hierarchy according to the similarity values. The clustering process can be described as follows:
1) Each EPG item is treated as a class of its own; this is the initial state.
2) Using the item distances obtained from the similarity calculation described in 1, the two closest classes are merged into one.
3) Process 2) is repeated until the similarity reaches the preset clustering parameter threshold.
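The three steps above can be sketched as a simple agglomerative loop. The patent does not say how the similarity between two multi-item classes is measured, so this sketch assumes single linkage (the most similar pair of members). It also reads "until the similarity reaches the threshold" as "stop once the best available merge falls below the threshold". Both are assumptions, as are the function names and example weights.

```python
import math
from itertools import combinations
from typing import Dict, List


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def class_similarity(c1: List[int], c2: List[int], vectors) -> float:
    # Single linkage (assumption): similarity of the closest pair of members.
    return max(cosine_similarity(vectors[i], vectors[j]) for i in c1 for j in c2)


def agglomerate(vectors, threshold: float) -> List[List[int]]:
    # 1) Each EPG item starts as a class of its own.
    classes = [[i] for i in range(len(vectors))]
    while len(classes) > 1:
        # 2) Find the two closest (most similar) classes.
        (x, y), best = max(
            (((x, y), class_similarity(classes[x], classes[y], vectors))
             for x, y in combinations(range(len(classes)), 2)),
            key=lambda pair: pair[1],
        )
        # 3) Stop once the best merge no longer meets the threshold.
        if best < threshold:
            break
        classes[x] = classes[x] + classes[y]
        del classes[y]  # safe: combinations guarantees x < y
    return classes


vectors = [
    {"N. B. university": 2.0, "version at noon": 0.5},
    {"N. B. university": 2.0, "the evening edition": 0.5},
    {"TV play": 1.0, "red light note": 2.0},
]
print(agglomerate(vectors, threshold=0.5))  # → [[0, 1], [2]]
```

The loop returns the flat partition at the chosen threshold rather than the full merge tree; recording each merge as it happens would give the cluster hierarchy the text describes. The pairwise search is quadratic per merge, which is acceptable for a sketch but not tuned for large EPG stores.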
Step three: generation of the new EPG column.
The text clustering result is the column output: the presentation layer shows it to the user in the form of EPG columns. For the EPG items in the same category, their shared term is extracted as the title of the EPG column. Adjusting the similarity threshold of the clustering algorithm yields column output under different constraints. A user who is dissatisfied with the column division can adjust a clustering parameter, such as the distance threshold, through the presentation layer and run text clustering again, repeating until satisfied.
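As an illustration of the title-extraction step, the sketch below takes a clustering result (lists of item indices) and names each column after a term shared by all of its items. The patent does not say what to do when several terms are shared or when none is; picking the highest-weighted shared term and falling back to the first item's text are assumptions made here, and the function names are hypothetical.

```python
from typing import Dict, List, Optional


def column_title(item_vectors: List[Dict[str, float]]) -> Optional[str]:
    """Return a term shared by every item in the class, or None if there is none."""
    common = set(item_vectors[0])
    for terms in item_vectors[1:]:
        common &= set(terms)
    if not common:
        return None
    # Assumption: when several terms are shared, use the highest-weighted one.
    return max(sorted(common), key=lambda t: item_vectors[0][t])


def build_columns(classes: List[List[int]], texts: List[str],
                  vectors: List[Dict[str, float]]) -> Dict[str, List[str]]:
    """Map each column title to the EPG item texts grouped under it."""
    columns: Dict[str, List[str]] = {}
    for members in classes:
        title = column_title([vectors[i] for i in members])
        if title is None:
            title = texts[members[0]]  # fallback (assumption)
        columns.setdefault(title, []).extend(texts[i] for i in members)
    return columns


texts = [
    "N. B. university (version at noon)",
    "N. B. university (the evening edition)",
    "TV play: red light note 5",
]
vectors = [
    {"N. B. university": 2.0, "version at noon": 0.5},
    {"N. B. university": 2.0, "the evening edition": 0.5},
    {"TV play": 1.0, "red light note": 2.0},
]
for title, items in build_columns([[0, 1], [2]], texts, vectors).items():
    print(title, "->", items)
```

Re-running the clustering step with a different threshold and passing the new classes to `build_columns` corresponds to the user adjusting the clustering parameter through the presentation layer.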
Based on EPG item analysis and text clustering, the method of the invention generates new EPG columns. It can be used in digital television, helps administrators manage programs from the same and different TV stations, and helps users quickly locate the programs they want to watch.
Claims (6)
1. An EPG column generating method, characterized by comprising the following steps:
Step 1: performing syntax analysis on the items in the EPG, dividing the EPG items into different terms and assigning each term a corresponding weight;
Step 2: performing text clustering on the output of the syntax analysis, calculating the similarity of the EPG items from the terms and their weights, and applying a hierarchical clustering method to the EPG items according to preset clustering parameters and the item similarities to obtain a text clustering result;
Step 3: finally, outputting the text clustering result as a new EPG column in the form of a column hierarchy.
2. The EPG column generating method according to claim 1, characterized in that: the syntax analysis in step 1 comprises lexical analysis and parsing, the lexical analysis being implemented with regular expressions defined in lex and the parsing with grammar rules defined in yacc.
3. The EPG column generating method according to claim 1, characterized in that: in step 2, when the similarity of EPG items is calculated, each item is regarded as a vector and the similarity of the vectors is calculated from the weights, the similarity between EPG item vectors being expressed by the cosine distance.
4. The EPG column generating method according to claim 1, characterized in that: in step 2, the cluster analysis process comprises:
(1) treating each EPG item as a class of its own, which is the initial state;
(2) calculating the distance between EPG items from their similarity, and merging the two closest classes into one;
(3) repeating process (2) until the similarity reaches a threshold.
5. The EPG column generating method according to claim 4, characterized in that: the term shared by the EPG items merged in the text clustering step serves as the title of the EPG column.
6. The EPG column generating method according to claim 1, characterized in that: after step 3, the method further comprises a step in which the user adjusts the clustering parameters through the presentation layer and text clustering is performed again with the adjusted parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010102726575A CN101963974A (en) | 2010-09-03 | 2010-09-03 | EPG column generating method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010102726575A CN101963974A (en) | 2010-09-03 | 2010-09-03 | EPG column generating method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101963974A (en) | 2011-02-02 |
Family
ID=43516847
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2010102726575A Pending CN101963974A (en) | 2010-09-03 | 2010-09-03 | EPG column generating method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101963974A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102291604A (en) * | 2011-08-31 | 2011-12-21 | 华南理工大学 | Making method of electronic program guide (EPG) for time-shifting network television |
CN103218372A (en) * | 2012-01-20 | 2013-07-24 | 华为终端有限公司 | Method and device for aggregating information |
CN106686460A (en) * | 2016-12-22 | 2017-05-17 | Ut斯达康(深圳)技术有限公司 | Video program recommendation method and device |
CN112348123A (en) * | 2020-12-08 | 2021-02-09 | 武汉卓尔数字传媒科技有限公司 | User clustering method and device and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1609859A (en) * | 2004-11-26 | 2005-04-27 | 孙斌 | Search result clustering method |
CN101094335A (en) * | 2006-06-20 | 2007-12-26 | 株式会社日立制作所 | TV program recommender and method thereof |
CN101452463A (en) * | 2007-12-05 | 2009-06-10 | 浙江大学 | Method and apparatus for directionally grabbing page resource |
CN101620608A (en) * | 2008-07-04 | 2010-01-06 | 全国组织机构代码管理中心 | Information collection method and system |
US20100011020A1 (en) * | 2008-07-11 | 2010-01-14 | Motorola, Inc. | Recommender system |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102291604A (en) * | 2011-08-31 | 2011-12-21 | 华南理工大学 | Making method of electronic program guide (EPG) for time-shifting network television |
CN102291604B (en) * | 2011-08-31 | 2014-02-26 | 华南理工大学 | Making method of electronic program guide (EPG) for time-shifting network television |
CN103218372A (en) * | 2012-01-20 | 2013-07-24 | 华为终端有限公司 | Method and device for aggregating information |
CN103218372B (en) * | 2012-01-20 | 2017-04-26 | 华为终端有限公司 | Method and device for aggregating information |
CN106686460A (en) * | 2016-12-22 | 2017-05-17 | Ut斯达康(深圳)技术有限公司 | Video program recommendation method and device |
CN106686460B (en) * | 2016-12-22 | 2020-03-13 | 优地网络有限公司 | Video program recommendation method and video program recommendation device |
CN112348123A (en) * | 2020-12-08 | 2021-02-09 | 武汉卓尔数字传媒科技有限公司 | User clustering method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20110202 |