WO2004012101A1 - Device and method for automatic keyword extraction, recording medium, and program - Google Patents

Device and method for automatic keyword extraction, recording medium, and program

Info

Publication number
WO2004012101A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
character string
dictionary
registered
extracting
Prior art date
Application number
PCT/JP2003/009678
Other languages
English (en)
Japanese (ja)
Inventor
Hitoshi Kimura
Kensuke Ohnuma
Hidetoshi Ichioka
Original Assignee
Sony Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corporation
Priority to EP03771430A (EP1544751A4)
Priority to US10/523,332 (US7577972B2)
Publication of WO2004012101A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H60/00 Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/35 Arrangements for identifying or recognising characteristics with a direct linkage to broadcast information or to broadcast space-time, e.g. for identifying broadcast stations or for identifying users
    • H04H60/48 Arrangements for identifying or recognising characteristics with a direct linkage to broadcast information or to broadcast space-time, e.g. for identifying broadcast stations or for identifying users for recognising items expressed in broadcast information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H60/00 Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/56 Arrangements characterised by components specially adapted for monitoring, identification or recognition covered by groups H04H60/29-H04H60/54
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/482 End-user interface for program selection
    • H04N21/4828 End-user interface for program selection for searching program descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/8405 Generation or processing of descriptive data, e.g. content descriptors represented by keywords

Definitions

  • The present invention relates to a device and method for automatically extracting keywords from the title character string information and detailed character string information of content, such as EPG (Electronic Program Guide) information, and to a recording medium and a program.
  • EPG information, including information indicating the genre of each program, is transmitted from the broadcasting station.
  • An electronic program guide can be displayed on a screen based on the EPG information.
  • The user can use this electronic program guide to select a broad genre (for example, sports or drama) and then search by title or by reading the detailed character string information.
  • An object of the present invention is to make it possible, even in a home electric appliance whose CPU processing capacity and memory capacity are not very large, to automatically and efficiently extract keywords with which a user searches for content from the title character string information and detailed character string information of the content, such as EPG information.
  • Disclosure of the invention
  • To this end, the present applicant proposes an automatic keyword extraction device comprising first extraction means for extracting keywords from the title character string information of content using a first keyword dictionary in which character strings indicating sub-genres are registered, and second extraction means for extracting keywords from the detailed character string information of this content using a second keyword dictionary in which personal names are registered and for extracting keywords using a character type separation method.
  • In this device, keywords are extracted from the title character string information of the content (for example, in TV broadcasting, the title character string information in the EPG information) using the first keyword dictionary, in which character strings indicating sub-genres are registered.
  • From the detailed character string information, keywords are extracted using the second keyword dictionary, in which personal names are registered, and keywords are also extracted using the character type separation method.
  • Because personal names are registered in the second keyword dictionary, names whose surname is written in kanji and whose given name is written in hiragana or katakana are also extracted as keywords.
  • Even a person's name that is not registered in the second keyword dictionary is extracted as a keyword by the character type separation method.
  • Keyword extraction from the title character string information and from the detailed character string information is thus performed with a keyword dictionary and a rule (the character type separation method) suited to each type of information.
  • As a result, keywords can be extracted accurately with a small program and small dictionaries.
  • Accordingly, even in home appliances without large CPU processing capacity and memory capacity, keywords with which users search for content can be extracted automatically, efficiently, and accurately from the title character string information and detailed character string information of content such as EPG information.
  • Preferably, the first extraction means extracts keywords from a title character string that includes a character string registered in the first keyword dictionary, after excluding the parts that match character strings registered in a predetermined exclusion character string dictionary.
  • Preferably, from the title character strings that include character strings registered in the first keyword dictionary, the first extraction means extracts as keywords the character strings delimited by special characters, that is, characters other than hiragana, katakana, kanji, numbers, and alphabetic letters.
  • A title that is not delimited by these special characters is extracted whole. The individual character strings in such a title are too broad in meaning to be useful as keywords for content search (the search results would be too numerous), and often only the title itself serves as a keyword for efficient content search. The user can therefore search the content more efficiently using the extracted keyword (the title itself).
  • For a title that is delimited by special characters, the individual character strings separated by the special characters are extracted as keywords.
  • Preferably, the second extraction means applies the character type separation method to the portion of the detailed character string information that remains after the keywords have been extracted using the second keyword dictionary and after the character strings registered in a predetermined exclusion character string dictionary have been excluded. This prevents character strings that may appear in the detailed character string information but are inappropriate for content search from being included in the keywords, so the user can search for content more efficiently using the extracted keywords.
  • Preferably, in the character type separation method, the second extraction means treats katakana and alphabetic letters as the same character type, and treats a '・' (middle dot) as katakana or an alphabetic letter when the character immediately preceding it is katakana or an alphabetic letter, respectively.
  • Preferably, means for downloading the second keyword dictionary via a network is further provided, and the second extraction means uses the downloaded second keyword dictionary.
  • Keywords can then be extracted using the latest dictionary (a dictionary in which the names of people who have only recently become famous are registered) as the second keyword dictionary.
  • Next, the present applicant proposes an automatic keyword extraction method comprising a first step of extracting keywords from the title character string information of content using a first keyword dictionary in which character strings indicating sub-genres are registered, and a second step of extracting keywords from the detailed character string information of this content using a second keyword dictionary in which personal names are registered and extracting keywords using the character type separation method.
  • The present applicant also proposes a recording medium on which a computer-readable program for an automatic keyword extraction device is recorded, the program including a first extraction step of extracting keywords from the title character string information of content using a first keyword dictionary in which character strings indicating sub-genres are registered, and a second extraction step of extracting keywords from the detailed character string information of the content using a second keyword dictionary in which personal names are registered and extracting keywords using the character type separation method.
  • The present applicant further proposes a program that causes the computer controlling an automatic keyword extraction device to execute a first extraction step of extracting keywords from the title character string information of content using the first keyword dictionary, in which character strings indicating sub-genres are registered, and a second extraction step of extracting keywords from the detailed character string information of this content using a second keyword dictionary in which personal names are registered and extracting keywords using the character type separation method.
  • The method, recording medium, and program are likewise aimed at a household electrical appliance whose CPU processing capacity and memory capacity are not very large.
  • FIG. 1 is a diagram showing an outline of a digital television broadcast receiving system including a program recording / reproducing apparatus to which the present invention is applied.
  • FIG. 2 is a block diagram showing a hardware configuration of the program recording / reproducing apparatus of FIG.
  • FIG. 3 is a flowchart showing an automatic keyword extraction process executed by the CPU of FIG.
  • FIG. 4 is a flowchart showing automatic keyword extraction processing executed by the CPU of FIG.
  • FIG. 5 is a diagram showing rules for keyword extraction in the processing of FIG.
  • FIG. 6 is a diagram showing rules for keyword extraction in the processing of FIG.
  • FIG. 7 is a block diagram showing a hardware configuration of a program recording / reproducing apparatus for analog television broadcasting to which the present invention is applied.
  • FIG. 1 is a diagram showing an outline of a digital television broadcast receiving system including a program recording / reproducing apparatus to which the present invention is applied.
  • a digital broadcast signal transmitted from a television broadcasting station is received by an antenna 1 and input to a program recording / reproducing apparatus 2.
  • the program recording / reproducing device 2 is connected to a display device 3 including a display and a speaker, and is also connected to the Internet 4.
  • FIG. 2 is a block diagram showing a hardware configuration of the program recording / reproducing apparatus 2.
  • A tuner 11, a demodulator 12, a descrambler 13, and a demultiplexer 14 are connected in that order. A video decoder 15 and a video signal processing circuit 17 are connected in that order to the demultiplexer 14, and an audio decoder 16 and a D/A converter 18 are likewise connected in that order to the demultiplexer 14.
  • An interface 23 for the remote controller, an interface 24 for the HDD (hard disk drive), and a communication interface 25 for connecting to the Internet are connected to the system bus 26.
  • The interface 24 is connected to a hard disk drive (HDD) 27 for recording television programs.
  • A remote controller 28 attached to the program recording / reproducing apparatus 2 is provided with the same kinds of operation buttons as the remote controller of an ordinary digital broadcast television receiver (power button, channel selection buttons, recording reservation button, playback button, direction keys and an enter key for making selections on the EPG screen, and so on).
  • The digital broadcast signal input to the program recording / reproducing device 2 has its frequency band selected by the tuner 11 based on a channel selection operation on the remote controller 28, is demodulated by the demodulator 12 and descrambled by the descrambler 13, and is then separated by the demultiplexer 14 into packets of video and audio data of programs for a plurality of channels and packets of EPG information.
  • The packets of video data and audio data for the one channel selected by the channel selection operation on the remote controller 28 are decoded as MPEG-2 Video and MPEG-2 Audio by the video decoder 15 and the audio decoder 16, respectively. The packets of EPG information are sent to the CPU 19.
  • The video signal decoded by the video decoder 15 and the video signal for electronic program guide display created by the CPU 19 using the EPG information are converted into the NTSC format by the video signal processing circuit 17, output from the video output terminal 29, and sent to the display device 3.
  • The audio signal decoded by the audio decoder 16 is converted into an analog signal by the D/A converter 18, output from the audio output terminal 30, and sent to the display device 3.
  • The CPU 19 controls the entire program recording / reproducing apparatus 2, using the main memory 21 as working memory, based on programs and data stored in the ROM 20.
  • In addition to processing during viewing of a TV program based on a channel selection operation on the remote controller 28 and processing for recording a television program on the HDD 27 based on a recording reservation operation on the remote controller 28, the processing performed by the CPU 19 includes an automatic keyword extraction process.
  • The ROM 20 stores a title keyword dictionary, a title exclusion character string dictionary, a detailed information keyword dictionary, and a detailed information exclusion character string dictionary.
  • The title keyword dictionary registers character strings that are often included in titles and are valid and important for searching for programs: character strings indicating sub-genres such as 'professional baseball', 'golf', 'soccer', 'hot spring', 'go', 'shogi', and 'movie' (sub-genres being finer than the general genres, such as 'sports', that are based on the genre information in the EPG information), character strings such as 'love', and character strings of professional baseball team names.
  • The title exclusion character string dictionary registers character strings that are often included in program titles but are too general as keywords for searching for programs, such as 'movie', 'BS', and symbols specific to program listings (for example, the symbol indicating a news program, an N enclosed in a square frame).
  • The detailed information keyword dictionary registers the names of celebrities who often appear in television programs (entertainers, athletes, politicians, cultural figures, and so on), limited to names written only in hiragana, names combining hiragana and kanji, names combining hiragana and katakana, names combining kanji and katakana, kanji-only names of two or fewer characters, and kanji-only names of six or more characters.
  • The detailed information keyword dictionary also registers character strings other than personal names that are often included in the detailed character string information in EPG information and are appropriate as keywords for searching for programs, such as 'hot spring'.
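The name-registration criteria above can be read as "register only the names that the character type separation rule would miss or break apart". The following is a minimal sketch of that reading, not the patent's implementation. The function names, the Unicode ranges, and the generalization from the listed script combinations to "any mix of scripts" are assumptions made for illustration.

```python
def script_of(ch):
    """Coarse script class of one character (assumed Unicode ranges).

    Katakana and ASCII letters share one class, mirroring the rule that
    the two are treated as the same character type.
    """
    if "\u3041" <= ch <= "\u309f":
        return "hiragana"
    if "\u30a0" <= ch <= "\u30ff" or (ch.isascii() and ch.isalpha()):
        return "kana_alpha"  # the katakana block also contains the middle dot
    if "\u4e00" <= ch <= "\u9fff":
        return "kanji"
    return "other"


def needs_dictionary_entry(name):
    """True if a name would have to be registered to be extracted whole."""
    scripts = {script_of(ch) for ch in name}
    if len(scripts) > 1:
        return True  # mixed scripts (assumed generalization of the listed pairs)
    if scripts == {"hiragana"}:
        return True  # hiragana-only strings are discarded by the rule
    if scripts == {"kanji"}:
        return len(name) <= 2 or len(name) >= 6  # too short or too long
    return False  # katakana/alphabet names survive the rule unaided
```

Under these assumptions a four-character kanji name needs no entry, while a one-character kanji name or a kanji-plus-hiragana name does. The sample names used in the checks are invented.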
  • The detailed information exclusion character string dictionary registers character strings that are often included in the detailed character string information in EPG information but are inappropriate as keywords for searching for programs, such as 'guest', 'over', and 'director'.
  • For the detailed information keyword dictionary, the CPU 19 also downloads the latest dictionary (one in which the names of people who have only recently become famous are registered) from a dedicated site via the Internet and stores it in the flash memory 22.
  • As a precondition for the automatic keyword extraction process, the CPU 19 also stores in the flash memory 22 the packets of EPG information sent from the demultiplexer 14 at the time of the user's channel selection operation or of recording based on the user's recording reservation operation.
  • FIGS. 3 and 4 are flowcharts showing an automatic keyword extraction process executed by the CPU 19.
  • FIG. 3 shows a process for extracting a keyword from the title character string information.
  • First, the title character string information is extracted from the EPG information stored in the flash memory 22 (step S1).
  • Next, the titles of the plurality of programs indicated by the title character string information are searched for character strings registered in the title keyword dictionary (character strings indicating sub-genres, such as 'golf', 'soccer', 'hot spring', 'go', 'shogi', and 'movie'). Of the titles of those programs, the entire character string of each title that includes a character string registered in the title keyword dictionary is extracted for keyword extraction (step S2).
  • Then, in the title character strings extracted in step S2, the character strings registered in the title exclusion character string dictionary ('movie', 'BS', and so on) are replaced with spaces (step S3).
  • Next, keywords are extracted from the title character strings that have undergone step S3 according to the extraction rule for titles shown in Fig. 5 (step S4).
  • Here, '・' (middle dot) is not treated as a special character. If a character string extracted as a keyword has a '・' (middle dot) at its beginning or end, the portion excluding that '・' (middle dot) is used as the keyword.
  • The keywords extracted in step S4 are then stored in the flash memory 22 as a list of keywords from the title character string information (step S5).
  • FIG. 4 shows a process of extracting a keyword from the detailed character string information.
  • First, the detailed character string information is extracted from the EPG information stored in the flash memory 22 (step S11).
  • Next, the detailed character string information is searched for character strings registered in the detailed information keyword dictionary (such as the names of famous people). The registered character strings found in the detailed character string information are extracted as keywords, and those parts are replaced with single-byte spaces (step S12).
  • Then, in the detailed character string information that has undergone step S12, the parts matching character strings registered in the detailed information exclusion character string dictionary ('guest', 'over', 'director', and so on) are replaced with spaces (step S13).
  • Next, keywords are extracted from the detailed character string information that has undergone step S13, using the extraction rule for detailed character string information shown in FIG. 6 (step S14).
  • The extraction rule for detailed character string information basically uses a character type separation method that separates hiragana, katakana, kanji, numbers, alphabetic letters, and other characters.
  • However, katakana and alphabetic letters are treated as the same character type (not separated). In addition, a '・' (middle dot) is treated as katakana or an alphabetic letter if the character immediately preceding it is katakana or an alphabetic letter, respectively (not separated).
  • Of the character strings separated in this way, those other than hiragana-only strings, kanji-only strings of two or fewer characters, and kanji-only strings of six or more characters are extracted as keywords. However, if a '・' (middle dot) exists at the beginning or end of a character string extracted as a keyword, the portion excluding that '・' (middle dot) is used as the keyword.
  • The keywords extracted in step S12 and the keywords extracted in step S14 are then stored in the flash memory 22 as a list of keywords from the detailed character string information (step S15).
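Steps S12 to S14 can be illustrated with a small run-based splitter. This is a sketch under assumptions, not the rule of FIG. 6 itself: the dictionary contents, sample names, function names, and Unicode ranges are hypothetical, only ASCII letters and digits are recognized, and runs of "other" characters are assumed to act purely as separators.

```python
# Hypothetical dictionary contents for illustration only.
DETAIL_KEYWORDS = ["山田はなこ", "温泉"]   # registered names and useful words
DETAIL_EXCLUDED = ["ゲスト", "監督"]       # 'guest', 'director'

MIDDLE_DOT = "\u30fb"


def char_type(ch, prev_type):
    """Character type of ch; katakana and ASCII letters share one type."""
    if ch == MIDDLE_DOT:
        # A middle dot joins the preceding katakana/alphabet run.
        return "kana_alpha" if prev_type == "kana_alpha" else "other"
    if "\u3041" <= ch <= "\u309f":
        return "hiragana"
    if "\u30a0" <= ch <= "\u30ff" or (ch.isascii() and ch.isalpha()):
        return "kana_alpha"
    if ch.isascii() and ch.isdigit():
        return "digit"
    if "\u4e00" <= ch <= "\u9fff":
        return "kanji"
    return "other"


def split_runs(text):
    """Split text into [type, string] runs of one character type each."""
    runs, prev = [], None
    for ch in text:
        kind = char_type(ch, prev)
        if runs and kind == prev:
            runs[-1][1] += ch
        else:
            runs.append([kind, ch])
        prev = kind
    return runs


def extract_detail_keywords(text):
    keywords = []
    for entry in DETAIL_KEYWORDS:            # S12: dictionary hits
        if entry in text:
            keywords.append(entry)
            text = text.replace(entry, " ")
    for entry in DETAIL_EXCLUDED:            # S13: blank out excluded strings
        text = text.replace(entry, " ")
    for kind, run in split_runs(text):       # S14: character type separation
        run = run.strip(MIDDLE_DOT)
        if not run or kind in ("hiragana", "other"):
            continue
        if kind == "kanji" and (len(run) <= 2 or len(run) >= 6):
            continue
        if run not in keywords:
            keywords.append(run)
    return keywords
```

In this sketch a registered mixed-script name is caught by the dictionary, a katakana name containing a middle dot survives as one run, a four-character kanji name passes the length filter, and a two-character kanji word is dropped.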
  • As an example, assume that the title character string information in the EPG information sent from the demultiplexer 14 and stored in the flash memory 22 at the time of the user's channel selection operation or of recording based on the user's recording reservation operation includes titles such as the following (where the placeholder symbols stand for the names of professional baseball teams).
  • Since the title 'the noise of love' is not delimited by special characters, the title itself is extracted as a keyword in step S4. Therefore, in step S5, the following character strings are stored in the flash memory 22 as keywords for program search (as above, the placeholder symbols stand for professional baseball team names).
  • A title that is not delimited by such special characters is kept whole because the individual character strings it contains, such as 'love' and 'space', are too broad in meaning to be useful as keywords for program search (the search results would be very numerous), and often only the title itself serves as a keyword for efficient program search. The user can therefore search for programs efficiently using the extracted keywords (the titles themselves).
  • For the title of the movie Space Wars, the character strings 'BS' and 'movie', which were attached to this title in the title character string information and are too general for searching for programs, are not included in the keyword.
  • The quotation marks that enclose the title in the title character string information are not included in the keyword either. Users can therefore search for programs efficiently.
  • From the detailed character string information, the names of the host and guests of the show 'the noise of love' and the names of the actors appearing in the movie Space Wars are extracted as keywords in step S12.
  • The parts containing such celebrity names, and the character strings registered in the detailed information exclusion character string dictionary ('guest', 'over', 'director', and so on), are replaced with single-byte spaces in steps S12 and S13.
  • In step S14, keywords are extracted from the space-replaced detailed character string information according to the rule shown in FIG. 6.
  • Under this rule, katakana and alphabetic letters are treated as the same character type, and a '・' (middle dot) is treated as katakana or an alphabetic letter if the character immediately before it is katakana or an alphabetic letter, respectively. A foreign name with a '・' (middle dot) inserted between the given name and the surname (for example, ⁇ ⁇ Dooly) is therefore also extracted as a keyword.
  • The names of people who are not yet registered even in the latest detailed information keyword dictionary are also extracted as keywords, unless the name consists only of hiragana, only of two or fewer kanji, or only of six or more kanji (that is, a form unlikely to be a personal name).
  • Character strings that are inappropriate for program search, such as 'guest', 'over', and 'director', are not extracted as keywords because they have been replaced with spaces.
  • Accordingly, in step S15, the names of celebrities whose surname is written in kanji and whose given name is written in hiragana or katakana, the names of people who have only recently become famous, the names of foreigners whose given name is written in alphabetic letters and whose surname is written in katakana,
  • and the names of foreigners with a '・' (middle dot) between their given name and surname are also stored in the flash memory 22 as keywords for program search. The user can therefore efficiently search for programs using the extracted keywords.
  • The keywords stored in the flash memory 22 by the processing of FIGS. 3 and 4 are used by the user for program search.
  • Any appropriate method may be used for this. For example, the CPU 19 may create a video signal for a program search screen (a screen that displays a list of the keywords and on which the user selects a desired keyword and instructs a search)
  • and send it to the display device 3 via the video signal processing circuit 17 and the video output terminal 29.
  • As described above, in the program recording / reproducing apparatus 2, keywords are extracted from the title character string information in the EPG information and from the detailed character string information using different keyword dictionaries and rules according to the respective information.
  • As a result, keywords can be extracted accurately with a small program and small dictionaries.
  • In the above example, the present invention is applied to an apparatus that records and reproduces digital television broadcast programs.
  • However, the present invention is not limited to this, and it goes without saying that the present invention may be applied to a program recording / reproducing apparatus that records and reproduces analog television broadcast programs.
  • FIG. 7 is a block diagram showing a hardware configuration of a program recording / reproducing apparatus for analog television broadcasting to which the present invention is applied.
  • The video and audio signals in the analog broadcast signal received by the antenna 31 and input to the program recording / reproducing device 41 have their frequency band selected by the tuner 42 and are encoded by the MPEG encoder 43.
  • The encoded video / audio data is decoded by an MPEG decoder 47 and sent from the program recording / reproducing device 41 to the display device 61.
  • The video / audio data encoded by the MPEG encoder 43 is also sent to the main storage device 45 via the bus 44 and recorded on the main storage device 45.
  • The video and audio data read from the main storage device 45 is sent to the MPEG decoder 47 via the bus 44, decoded by the MPEG decoder 47, and sent from the program recording / reproducing device 41 to the display device 61.
  • EPG information is acquired by the EPG acquisition module 46 from the analog broadcast signal whose frequency band is selected by the tuner 42. This EPG information is also sent to the main storage device 45 via the bus 44 and stored in the main storage device 45.
  • A communication interface 48 for connecting to the Internet 71, a ROM 49, a main storage device 50, an auxiliary storage device 51, and the MPEG decoder 47 are connected to one another by a bus 52.
  • The title keyword dictionary, title exclusion character string dictionary, detailed information keyword dictionary, and detailed information exclusion character string dictionary described above are stored in the ROM 49.
  • As for the detailed information keyword dictionary, the latest one is downloaded from a dedicated site via the Internet and stored in the auxiliary storage device 51.
  • The CPU 53, which controls the whole device, performs the same automatic keyword extraction processing as shown in FIGS. 3 and 4 using these dictionaries and the EPG information in the main storage device 45, and stores the extracted keywords in the auxiliary storage device 51.
  • In this program recording / reproducing device 41 as well, in exactly the same way as described for the program recording / reproducing device 2, keywords are extracted from the title character string information in the EPG information and from the detailed character string information using different keyword dictionaries and rules according to the respective information, so keywords can be extracted with high accuracy using a small program and small dictionaries.
  • the title character string information and the detailed character string information in the EPG information can be used.
  • the present invention is applied to a program recording / reproducing device separate from the display device.
  • the present invention is not limited to this, and the present invention is also applied to a television receiver in which the program recording / reproducing device and the display device are integrated or a television receiver having no program recording / reproducing function. Good.
  • the present invention is applied to search for a keyword from the title character string information and detailed character string information of the program in the EPG information.
  • the present invention is not limited to this, and may be used to search for a keyword from the title character string information and detailed character string information of content other than television programs (for example, contents distributed via the Internet). May be applied.
  • the present invention is not limited to the above examples, and may take various other configurations without departing from the gist of the present invention.
  • According to the present invention, even in a home electric appliance in which the processing capacity of the CPU and the capacity of the memory are not large, keywords with which the user searches for programs can be extracted automatically, efficiently, and with high accuracy from the title character string information and the detailed character string information of programs, such as EPG information.
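
The idea of applying a different keyword dictionary to the title string and to the detailed string can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation described in the patent: the dictionary contents, the function names (`find_dictionary_terms`, `extract_title_keywords`, `extract_detail_keywords`), and the simple case-insensitive substring matching are all hypothetical choices made for the example.

```python
# Hypothetical dictionaries; the actual dictionary contents are not given in the text.
SUBGENRE_DICT = {"news", "drama", "baseball", "soccer"}   # for title strings
PERSON_DICT = {"Taro Yamada", "Hanako Suzuki"}            # for detailed strings


def find_dictionary_terms(text, dictionary):
    """Return the dictionary entries that occur in text, in order of appearance."""
    lowered = text.lower()
    hits = []
    for term in dictionary:
        pos = lowered.find(term.lower())
        if pos >= 0:
            hits.append((pos, term))
    return [term for _, term in sorted(hits)]


def extract_title_keywords(title):
    """Title strings are matched against the sub-genre dictionary only."""
    return find_dictionary_terms(title, SUBGENRE_DICT)


def extract_detail_keywords(detail):
    """Detailed strings are matched against the person-name dictionary only."""
    return find_dictionary_terms(detail, PERSON_DICT)


if __name__ == "__main__":
    print(extract_title_keywords("Evening News and Baseball Digest"))
    # ['news', 'baseball']
    print(extract_detail_keywords("Starring Hanako Suzuki with guest Taro Yamada"))
    # ['Hanako Suzuki', 'Taro Yamada']
```

Keeping each dictionary restricted to the kind of term expected in its field is what lets both the dictionaries and the matching code stay small, which is the point made above about appliances with limited CPU and memory.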

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to an automatic keyword extraction device comprising first extraction means (19) for extracting a keyword from content title character string information using a first keyword dictionary containing character strings indicating sub-genres, and second extraction means (19) for extracting a keyword from content detailed character string information using a second keyword dictionary containing names of persons and for extracting a keyword by the zisyukiri method (a method of delimiting text by character type). Thus, in a home electric appliance with limited processing power and limited memory capacity, keywords with which a user searches for content can be extracted efficiently, accurately, and automatically from content title character string information and detailed character string information such as electronic program guide information.
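
As a rough illustration of the zisyukiri method named in the abstract, the sketch below splits a Japanese string wherever the character type changes and keeps some of the resulting runs as keyword candidates. The Unicode ranges, the function names (`char_type`, `split_by_char_type`, `candidate_keywords`), and the rule of keeping only katakana, kanji, and alphanumeric runs of at least two characters are assumptions made for this example; the abstract does not specify the patent's actual rules.

```python
from itertools import groupby


def char_type(ch):
    """Classify a character by Unicode block (simplified, assumed ranges)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    if ch.isascii() and ch.isalnum():
        return "alnum"
    return "other"


def split_by_char_type(text):
    """Cut the text at every point where the character type changes."""
    return [(kind, "".join(run)) for kind, run in groupby(text, key=char_type)]


def candidate_keywords(text, keep=("katakana", "kanji", "alnum"), min_len=2):
    """Keep runs of the selected character types that are long enough."""
    return [s for kind, s in split_by_char_type(text)
            if kind in keep and len(s) >= min_len]


if __name__ == "__main__":
    print(split_by_char_type("東京の"))
    # [('kanji', '東京'), ('hiragana', 'の')]
    print(candidate_keywords("東京のニュースを見る"))
    # ['東京', 'ニュース']
```

Because this needs no dictionary at all, it can pick up words that a small dictionary would miss, at the cost of occasionally producing candidates that are not meaningful keywords.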
PCT/JP2003/009678 2002-07-30 2003-07-30 Dispositif et procede d'extraction automatique de mot-cle, support d'enregistrement et programme WO2004012101A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP03771430A EP1544751A4 (fr) 2002-07-30 2003-07-30 Dispositif et procede d'extraction automatique de mot-cle, support d'enregistrement et programme
US10/523,332 US7577972B2 (en) 2002-07-30 2003-07-30 Extracting keywords from multilingual alphabetic and glyph scripts in an electronic programming guide

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2002221698A JP4040382B2 (ja) 2002-07-30 2002-07-30 キーワードの自動抽出装置及び方法、記録媒体、並びにプログラム
JP2002-221698 2002-07-30

Publications (1)

Publication Number Publication Date
WO2004012101A1 true WO2004012101A1 (fr) 2004-02-05

Family

ID=31184873

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2003/009678 WO2004012101A1 (fr) 2002-07-30 2003-07-30 Dispositif et procede d'extraction automatique de mot-cle, support d'enregistrement et programme

Country Status (6)

Country Link
US (1) US7577972B2 (fr)
EP (1) EP1544751A4 (fr)
JP (1) JP4040382B2 (fr)
KR (1) KR100993957B1 (fr)
CN (1) CN100530174C (fr)
WO (1) WO2004012101A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1610240A1 (fr) * 2004-06-22 2005-12-28 Pioneer Corporation Dispositif de traitement de données, son procédé, son programme, et moyèn d'enregistrement pour stocker le programme
CN105554519A (zh) * 2015-12-24 2016-05-04 北京酷云互动科技有限公司 Epg信息解析方法及系统

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006041978A (ja) * 2004-07-28 2006-02-09 Matsushita Electric Ind Co Ltd 放送受信装置
JP4498903B2 (ja) * 2004-11-30 2010-07-07 シャープ株式会社 番組情報抽出装置、番組情報表示装置、番組情報抽出方法、プログラム、および、プログラムを記録したコンピュータ読み取り可能な記録媒体
JP2007074169A (ja) * 2005-09-05 2007-03-22 Sharp Corp 番組抽出装置
US7461093B2 (en) 2005-09-12 2008-12-02 Sharp Kabushiki Kaisha Network connecting device, server device, terminal device, system, receiving method, character input method, transmission method, program, and computer-readable storage medium
JP2007079745A (ja) * 2005-09-12 2007-03-29 Sharp Corp ネットワーク接続装置、サーバ装置、端末装置、システム、受信方法、文字入力方法、送信方法、プログラムおよびコンピュータ読み取り可能な記録媒体
CN100444591C (zh) * 2006-08-18 2008-12-17 北京金山软件有限公司 获取网页关键字的方法及其应用系统
EP1901187A3 (fr) 2006-09-16 2009-02-04 LOEWE OPTA GmbH Procédé destiné à la recherche de données utiles dans des bases de données d'appareils électroniques de loisir
TW200836564A (en) * 2007-02-16 2008-09-01 Mstar Semiconductor Inc Control circuit of a display with program searching function, and method for controlling the display to receive program information and select program
JP5178109B2 (ja) * 2007-09-25 2013-04-10 株式会社東芝 検索装置、方法及びプログラム
JP2009094658A (ja) * 2007-10-05 2009-04-30 Hitachi Ltd 関連情報提供装置、及び関連情報提供方法
JP2010003383A (ja) * 2008-06-23 2010-01-07 Victor Co Of Japan Ltd 放送番組記録再生装置
JP5392227B2 (ja) * 2010-10-14 2014-01-22 株式会社Jvcケンウッド フィルタリング装置およびフィルタリング方法
US8606788B2 (en) * 2011-06-15 2013-12-10 Microsoft Corporation Dictionary for hierarchical attributes from catalog items
JP5516641B2 (ja) * 2012-04-27 2014-06-11 株式会社Jvcケンウッド 放送番組記録再生装置
CN106933799A (zh) * 2015-12-31 2017-07-07 北京四维图新科技股份有限公司 一种兴趣点poi名称的中文分词方法及装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0810452B2 (ja) * 1988-04-18 1996-01-31 日本電信電話株式会社 日本語対象文固有用語抽出処理装置
JPH10198667A (ja) * 1996-12-28 1998-07-31 Casio Comput Co Ltd 文字列変換装置およびそのプログラム記録媒体
JP2001075959A (ja) * 1999-08-31 2001-03-23 Matsushita Electric Ind Co Ltd 文書処理装置

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9286294B2 (en) * 1992-12-09 2016-03-15 Comcast Ip Holdings I, Llc Video and digital multimedia aggregator content suggestion engine
US5870084A (en) * 1996-11-12 1999-02-09 Thomson Consumer Electronics, Inc. System and method for efficiently storing and quickly retrieving glyphs for large character set languages in a set top box
JP3880116B2 (ja) * 1996-12-27 2007-02-14 キヤノン株式会社 電子ファイリングシステム、電子ファイリング方法及び記録媒体
KR100686622B1 (ko) * 1998-05-22 2007-02-23 코닌클리케 필립스 일렉트로닉스 엔.브이. 키워드 검출수단을 구비한 기록장치
JP3645720B2 (ja) * 1998-10-02 2005-05-11 松下電器産業株式会社 Epg情報表示方法、及びプログラム記録媒体
US7209942B1 (en) 1998-12-28 2007-04-24 Kabushiki Kaisha Toshiba Information providing method and apparatus, and information reception apparatus
US6449766B1 (en) * 1999-12-23 2002-09-10 Webtv Networks, Inc. System and method for consolidating television rating systems
CA2362416C (fr) * 2000-01-05 2009-08-04 Mitsubishi Denki Kabushiki Kaisha Dispositif d'extraction d'un mot-cle
US6463428B1 (en) * 2000-03-29 2002-10-08 Koninklijke Philips Electronics N.V. User interface providing automatic generation and ergonomic presentation of keyword search criteria
JP2001337980A (ja) * 2000-05-29 2001-12-07 Sony Corp 電子番組ガイド検索方法及び電子番組ガイド検索装置
US6925650B1 (en) * 2000-08-21 2005-08-02 Hughes Electronics Corporation Method and apparatus for automated creation of linking information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KEN'ICHI HINATSU: "JICST ni okeru keyword jido chushutsu system no shiyo", THE JOURNAL OF INFORMATION SCIENCE AND TECHNOLOGY ASSOCIATION, vol. 42, no. 11, 1 November 1992 (1992-11-01), pages 1051 - 1057, XP002973372 *
See also references of EP1544751A4 *
YUICHIRO AOKI ET AL.: "information retrieval system data-710", NEC GIHO, vol. 41, no. 12, 31 October 1998 (1998-10-31), pages 33 - 39, XP002973333 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1610240A1 (fr) * 2004-06-22 2005-12-28 Pioneer Corporation Dispositif de traitement de données, son procédé, son programme, et moyèn d'enregistrement pour stocker le programme
US7506811B2 (en) 2004-06-22 2009-03-24 Pioneer Corporation Data processing device, method thereof, program thereof, and recording medium recording the program
CN105554519A (zh) * 2015-12-24 2016-05-04 北京酷云互动科技有限公司 Epg信息解析方法及系统

Also Published As

Publication number Publication date
JP4040382B2 (ja) 2008-01-30
KR20050025999A (ko) 2005-03-14
EP1544751A1 (fr) 2005-06-22
EP1544751A4 (fr) 2007-12-26
US20060116869A1 (en) 2006-06-01
CN1682220A (zh) 2005-10-12
KR100993957B1 (ko) 2010-11-11
CN100530174C (zh) 2009-08-19
JP2004062639A (ja) 2004-02-26
US7577972B2 (en) 2009-08-18

Similar Documents

Publication Publication Date Title
US7890490B1 (en) Systems and methods for providing advanced information searching in an interactive media guidance application
JP4198786B2 (ja) 情報フィルタリングシステム、情報フィルタリング装置、映像機器および情報フィルタリング方法
WO2004012101A1 (fr) Dispositif et procede d'extraction automatique de mot-cle, support d'enregistrement et programme
US8381249B2 (en) Systems and methods for acquiring, categorizing and delivering media in interactive media guidance applications
US7587673B2 (en) Information processing apparatus, method and program
US20150007234A1 (en) Systems and methods for acquiring, categorizing and delivering media in interactive media guidance applications
US20060167859A1 (en) System and method for personalized searching of television content using a reduced keypad
EP2080117A2 (fr) Systèmes et procédés permettant d'acquérir, de catégoriser et de délivrer du multimédia dans des applications de guidage multimédia interactives
US8195687B2 (en) Program retrieval support device for accumulating and searching pieces of program information and corresponding programs and a method for performing the same
JP4200393B2 (ja) 情報処理装置および情報処理方法
JP2004343320A (ja) 情報処理装置および方法、プログラム、並びに記録媒体
WO2004004323A1 (fr) Appareil, procede et programme de traitement d'information
JPWO2008078717A1 (ja) 番組データ管理サーバ、識別子割当装置、番組データ管理方法及びプログラム
JP4461354B2 (ja) 情報検索システムおよび方法、情報処理装置および方法、プログラム、並びに記録媒体
US20040193592A1 (en) Recording and reproduction apparatus
WO2007060968A1 (fr) Récepteur d’émissions, appareil d’enregistrement et de reproduction d’informations et méthode d’extraction d’informations
CN101605011B (zh) 信息处理装置、信息处理方法
JP2008027186A (ja) 情報検索装置および情報検索方法
JP6029530B2 (ja) 情報処理装置及び情報処理方法
JP4623070B2 (ja) キーワードの自動抽出装置及び方法、記録媒体、並びにプログラム
WO2009107708A1 (fr) Dispositif de reproduction de contenus, système de reproduction de contenus, procédé de reproduction de contenus, programme de reproduction de contenus et support d'enregistrement
JP2009159475A (ja) 番組検索装置および番組検索方法
JP4403717B2 (ja) 番組受信装置、番組受信方法、番組記録装置、情報処理装置及び情報提供システム
AU2018241142B2 (en) Systems and Methods for Acquiring, Categorizing and Delivering Media in Interactive Media Guidance Applications
JP2005057523A (ja) 番組付加情報抽出装置、番組表示装置および番組記録装置

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CN KR US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1020057001427

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2003771430

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1020057001427

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 20038223856

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 2003771430

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2006116869

Country of ref document: US

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 10523332

Country of ref document: US

WWP Wipo information: published in national office

Ref document number: 10523332

Country of ref document: US