WO2011039773A2 - TV news analysis system for multilingual broadcast channels - Google Patents
TV news analysis system for multilingual broadcast channels
- Publication number
- WO2011039773A2 (PCT/IN2010/000617)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- news
- news programs
- programs
- identified
- repeat
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/16—Analogue secrecy systems; Analogue subscription systems
- H04N7/162—Authorising the user terminal, e.g. by paying; Registering the use of a subscription channel, e.g. billing
- H04N7/163—Authorising the user terminal, e.g. by paying; Registering the use of a subscription channel, e.g. billing by receiver means only
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4331—Caching operations, e.g. of an advertisement for later insertion during playback
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44016—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/454—Content or additional data filtering, e.g. blocking advertisements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8106—Monomedia components thereof involving special audio data, e.g. different tracks for different languages
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/812—Monomedia components thereof involving advertisement data
Definitions
- the present invention relates to the field of computer vision and audio processing techniques.
- the present invention relates to analysis of television (TV) news channels.
- this invention relates to a TV news analysis system for multilingual broadcast channels.
- US2004189873 discloses a VIDEO DETECTION AND INSERTION SYSTEM.
- the system of US2004189873 includes means that detect defined segments in a video stream.
- the defined segments may be advertisements, as mentioned.
- US6614987 discloses a TELEVISION PROGRAM RECORDING WITH USER PREFERENCE DETERMINATION.
- the system includes a module which is responsive to attribute information in accordance with categorization (classification) parameters or viewing preferences of the user.
- US2002162118 discloses an EFFICIENT INTERACTIVE TV.
- This system includes content identifier means to identify content or a subset of content. This identification not only helps to identify news stories and advertisements, but can also identify repetitive news stories.
- US6608930 discloses a METHOD AND SYSTEM FOR ANALYZING VIDEO CONTENT USING DETECTED TEXT IN VIDEO FRAMES. This system detects video streams based on user-selected image text attributes. A selected attribute may be news stories or advertisements or both. So, both may be individually identified for segregation purposes. Further, recognition of persons featuring in the detected video is also disclosed. However, all these features are enabled due to the (image) text that is available in each video stream.
- ARTIFICIAL TV NEWS PROGRAMS. It includes means to translate the language of a newscaster into a language of the user's choice; it combines automatic speech recognition (speech-to-text processing), automatic machine translation, and audio-visual text-to-speech (TTS) synthesis techniques for automatically personalizing TV news programs.
- this patent application does not provide a solution for identifying news programs and classification of said programs, or even identifying programs based on metadata.
- An object of the invention is to provide an integrated and complete solution for news video analysis.
- Another object of the invention is to provide a system wherein TV newscasts in different languages can be processed.
- Yet another object of the invention is to automatically identify news programs in a broadcast stream and separate it out from other programs.
- Still another object of the invention is to provide a system wherein advertisements in TV newscasts are automatically identified and removed from the news program.
- Still an additional object of the invention is to provide a system for news analysis wherein similar stories (pertaining to the same event) on different channels are identified and clustered.
- Another additional object of the invention is to provide a system for news analysis wherein each news story is indexed with keywords identified in the speech and visual text as well as other metadata, such as a recognized face.
- Another object of the invention is to provide a system where news stories in languages for which speech and OCR technologies are not mature are indexed based on their similarity with stories in other languages for which speech and OCR technologies are mature.
- Yet another additional object of the invention is to provide a system for news analysis wherein the stories are classified and can be retrieved.
- a system for identification, classification, storage, and analysis of news programs containing an audio channel, video channel, and metadata relating to it, broadcasted/relayed on a television (TV) channel by means of a plurality of TV broadcast streams, said system comprising:
- - recording module adapted to record said captured streams on a physical storage
- - news program identification module adapted to identify news programs in said stored broadcast streams
- - news program clipping module adapted to separate said identified news programs from other programs
- - advertisement clipping module adapted for removal of said identified advertisements
- - seam detection module adapted to detect and identify seams of said news programs in order to demarcate individual stories in a news program
- - keyword generation module adapted to generate a list of keywords
- - text-keyword identification module adapted to identify said created keywords from visual text of identified said news programs
- - speech-keyword identification module adapted to identify the created keywords from the speech of said identified news programs
- - repeat-identification module adapted to identify similar/repeat news programs from said plurality of TV broadcast streams
- - clustering module adapted to cluster said repeat news programs into one news programs, in order to avoid duplication or multiplication
- - removal module adapted to remove said repeat-identified news programs
- - logical interconnection module adapted to logically interconnect each of said modules for determining the sequence of steps for a multilingual news video analysis system.
- said keyword generation module is a multilingual keyword generation module adapted to generate keywords in multiple languages.
- said text-keyword identification module is a multilingual text-keyword identification module adapted to identify said created keywords from visual text of identified said news programs, in different languages, in the visual channel of the news program.
- said speech-keyword identification module is a multilingual speech-keyword identification module adapted to identify said created keywords from the speech in different languages, in the audio channel of the news program.
- said system includes a multilingual lexicon database for generating multilingual synonymous keywords for said created keywords.
- an acquisition module adapted to capture a TV broadcast stream and further includes a recording module adapted to record said captured stream on a physical storage, typically on disk, in chunks of manageable size.
- a news program identification module adapted to identify news programs in the broadcast stream and to separate them from other programs.
- an advertisement identification module for identification of advertisements from said news programs, and further including an advertisement clipping module adapted for removal of said identified advertisement breaks.
- a keyword generation module to create a list of desired keywords of contemporary interest in different languages.
- a text-keyword identification module adapted to identify the desired created keywords from the visual text of said news stories, in different languages, typically appearing in form of ticker text on the screen.
- a speech-keyword identification module adapted to identify the desired keywords from the speech, in different languages, in the audio channel of the news.
- a seam detection module adapted to detect and identify seams i.e. story boundaries and demarcate the individual stories in a news program.
- a repeat-identification module adapted to identify similar/repeat stories from multiple channels and further including a clustering module adapted to cluster said repeat stories into one story, to avoid duplication or multiplication.
- a removal module adapted to identify duplicate stories (repeat telecasts) and remove them from the selected stories.
- a repository adapted for storing the news contents, content description of the news videos, various indexes and links as discovered in the previously described modules.
- said system includes a retrieval module adapted for retrieving a news program from said repository.
- said system includes a navigation means adapted for navigation in said repository for retrieving a news program.
- a logical interconnection module adapted to logically interconnect all the said modules for determining the sequence of steps for a multilingual news video analysis system.
- repeat-identification module includes visual matching means adapted to use visual cues in order to identify repeat news programs, said visual matching means comprises:
- - key frame identification means adapted to identify at least a key frame in a plurality of news program
- - key frame visual feature extruding means adapted to extrude visual features relating to pre-defined parameters of said identified key frames
- - processing means adapted to process said extruded features based on pre-defined tests in order to obtain a processed similarity score in relation to plurality of news program
- - identification means adapted to identify repeat news programs based on pre-defined criteria of said similarity score
- - deletion means adapted to delete said identified repeat news programs.
- repeat-identification module includes audio matching means adapted to use audio cues in order to identify repeat news programs, said audio matching means comprises:
- - window determination means adapted to determine a window of frames in a plurality of news programs
- - fingerprint detection means adapted to detect audio fingerprint based on pre-defined processing criteria on said determined window of frames
- - processing means adapted to process said detected audio fingerprint based on pre-defined tests in order to obtain a processed similarity score in relation to plurality of news program
- - identification means adapted to identify repeat news programs based on pre-defined criteria of said similarity score
- - deletion means adapted to delete said identified repeat news programs.
- a method for identification, classification, storage, and analysis of news programs containing an audio channel, video channel, and metadata relating to it, broadcasted/relayed on a television (TV) channel by means of a plurality of TV broadcast streams, said method comprises the steps of:
- said step of identifying said created keywords from visual text of identified said news programs includes the step of generating keywords in multiple languages.
- said step of identifying created text keywords includes the step of identifying created keywords from visual text of identified said news programs, in different languages, in the visual channel of the news program.
- said step of identifying speech-keywords includes the step of identifying said created keywords from the speech in different languages, in the audio channel of the news program.
- said method includes a step of retrieving a news program from said repository.
- said method includes a step of navigating in said repository for retrieving a news program.
- said step of removing repeat-identified news programs includes a method of using visual cues in order to identify repeat news programs, said method comprises the steps of:
- said step of removing repeat-identified news programs includes a method of using audio cues in order to identify repeat news programs, said method comprises the steps of:
- Figure 1 illustrates a schematic block diagram of the multilingual news video analysis system in accordance with the present invention.
- the Telecast Acquisition Module (10) captures telecasts from several possible sources, e.g. a DTH dish, cable TV, etc., tunes to a particular channel, decodes the TV signals and converts the transmission into a standard digital video format, e.g. MPEG-4 or the like. This module is replicated for every channel to be monitored.
- the video streams captured by the Telecast Acquisition modules (10) are stored in a Recording Module (20) in chunks of manageable size with unique file names.
- Video Description Module (40).
- Advertisement breaks within a news program are now detected using absence of specific ticker-text bands and marked in the video in Advertisement Identification Module (50).
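The ticker-absence rule for advertisement breaks can be illustrated with a minimal sketch. It assumes that an upstream detector (not shown, and not specified in the text) has already produced one boolean per sampled frame saying whether a ticker-text band is visible; the function name `find_ad_breaks` and the parameters `fps` and `min_break_seconds` are hypothetical.

```python
from typing import List, Tuple


def find_ad_breaks(ticker_present: List[bool],
                   fps: float = 1.0,
                   min_break_seconds: float = 10.0) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices, end exclusive, of long ticker-free runs.

    Assumption: a run without any ticker band that lasts at least
    `min_break_seconds` is treated as an advertisement break.
    """
    min_len = int(min_break_seconds * fps)
    breaks = []
    run_start = None
    for i, present in enumerate(ticker_present):
        if not present and run_start is None:
            run_start = i                      # a ticker-free run begins
        elif present and run_start is not None:
            if i - run_start >= min_len:       # run was long enough to count
                breaks.append((run_start, i))
            run_start = None
    # Close a run that reaches the end of the recording.
    if run_start is not None and len(ticker_present) - run_start >= min_len:
        breaks.append((run_start, len(ticker_present)))
    return breaks


# One flag per second: 12 s without ticker is a break, 2 s is ignored.
flags = [True] * 5 + [False] * 12 + [True] * 3 + [False] * 2 + [True] * 4
print(find_ad_breaks(flags, fps=1.0, min_break_seconds=10.0))  # [(5, 17)]
```

The minimum duration is there so that brief ticker dropouts (for example during a full-screen graphic) are not marked as advertisements; the actual value would have to be tuned per channel.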
- the video is decomposed into constituent shots and several visual and audio parameters are extracted. The additional information accumulates in Video Description Module (40).
- a set of keywords of contemporary interest are selected by analysis of RSS feeds by a Keyword Generation Module (60).
- the video segments representing news programs are now processed to detect these keywords.
- a Keyword Recognition Module (70) analyzes the visual text and speech to spot the identified keywords.
- the visual keywords are classified into 'global' and 'local' categories, depending on the ticker-text band where they appear. While the 'local' keywords pertain to the current story being telecast, the 'global' keywords are not tied to the current story and may pertain to a story appearing anywhere in the news program.
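As a rough illustration of the band-based classification, the sketch below groups spotted keywords by the ticker band they were read from. The band labels (`"headline"`, `"scroll"`) and their mapping to 'local' and 'global' are assumptions for the example; the text does not say which bands a given channel uses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpottedKeyword:
    text: str
    band: str  # hypothetical band label produced by the OCR stage


# Assumed mapping: the band tied to the on-air story yields 'local' keywords,
# the continuously scrolling band yields 'global' keywords.
BAND_CATEGORY = {"headline": "local", "scroll": "global"}


def classify_keywords(keywords: List[SpottedKeyword]) -> Dict[str, List[str]]:
    """Split spotted keywords into 'local' and 'global' lists by band."""
    result: Dict[str, List[str]] = {"local": [], "global": []}
    for kw in keywords:
        category = BAND_CATEGORY.get(kw.band)
        if category is not None:           # ignore text from unknown bands
            result[category].append(kw.text)
    return result


spotted = [
    SpottedKeyword("flood", "headline"),
    SpottedKeyword("election", "scroll"),
    SpottedKeyword("Mumbai", "headline"),
]
print(classify_keywords(spotted))
# {'local': ['flood', 'Mumbai'], 'global': ['election']}
```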
- Speaker identification Module (80) identifies the speaker using face recognition and speaker identification (speech) technologies in the scenes containing one dominant speaker, for example in speeches made by important personalities. The additional information further augments the description in Video Description Module (40).
- the keyword generation module is a multilingual keyword generation module.
- the system provides the ability to process multilingual news programs, according to this invention.
- a keyword list in multiple languages is created in order to enable keyword spotting in multilingual TV news broadcast channels, in both spoken and visual forms.
- the multilingual keyword list helps to automatically map the spotted keywords in different languages to their equivalents in a primary language (say, English) for uniform indexing across multiple channels. Restricting the keyword list to a small number of entries helps to improve the accuracy of the system, especially for keyword spotting in speech.
- a sample multilingual keyword list is shown below:
- the method for creating a multilingual keyword list is fueled by RSS feeds, maintained by some website systems.
- RSS feeds capture contemporary news in a semi-structured XML format and contain hyperlinks to the full-text news stories, usually in English.
- the system of this invention identifies the common nouns (using statistical language processing) and proper nouns (using named entity detection) in the RSS feed text and the associated stories as the keywords.
- the keywords in the language of the RSS feed (usually English) form a set of concepts, which need to be identified in the audio-visual broadcasts of telecasts in different languages.
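A minimal sketch of keyword harvesting from an RSS feed follows. It parses an inline sample feed with the standard library and uses a deliberately crude heuristic in place of the statistical language processing and named entity detection mentioned above: a capitalized word that is not sentence-initial is treated as a proper-noun candidate, and other non-stopwords as common-word candidates. The sample feed, the stopword list and the function name are all assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample feed; a real system would fetch live RSS documents.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Floods hit Mumbai as monsoon intensifies</title>
<description>Heavy rain caused floods across Mumbai on Monday.</description></item>
<item><title>Election results expected in Delhi</title>
<description>Counting for the Delhi election begins today.</description></item>
</channel></rss>"""

STOPWORDS = {"the", "a", "an", "in", "on", "as", "for", "across",
             "of", "and", "to", "today"}


def extract_keywords(rss_xml: str, top_n: int = 5):
    """Return (proper_noun_candidates, common_word_candidates) by frequency."""
    root = ET.fromstring(rss_xml)
    proper, common = Counter(), Counter()
    for item in root.iter("item"):
        for tag in ("title", "description"):
            text = item.findtext(tag) or ""
            for sentence in re.split(r"[.!?]\s*", text):
                tokens = re.findall(r"[A-Za-z]+", sentence)
                for pos, tok in enumerate(tokens):
                    if tok.lower() in STOPWORDS:
                        continue
                    # Crude stand-in for named entity detection.
                    if tok[0].isupper() and pos > 0:
                        proper[tok] += 1
                    else:
                        common[tok.lower()] += 1
    return ([w for w, _ in proper.most_common(top_n)],
            [w for w, _ in common.most_common(top_n)])


proper_nouns, common_words = extract_keywords(SAMPLE_RSS)
print(proper_nouns)
print(common_words)
```

Limiting the result to `top_n` entries mirrors the point made in the text that a small keyword list improves spotting accuracy.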
- the equivalent keywords in other languages can be derived from the English keywords using a word-level dictionary from English to that language (for common noun keywords) and a pronunciation lexicon for transliterating proper names in a semi-automatic manner. A lexicon is an association of words and their phonetic transcriptions: a special kind of dictionary that maps a word to all of its possible phonemic representations.
- the keywords in multilingual form are held in a dynamic keyword list structure in XML format. This becomes the active keyword list for the news video channels and is used for keyword spotting in both the audio and the visual parts of the news telecast.
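The XML keyword list could take many shapes; the sketch below shows one possible layout and how a keyword spotted in any language can be mapped back to its primary-language concept for uniform indexing. The element and attribute names (`keywordList`, `concept`, `keyword`, `primary`, `lang`) are invented for this example and are not taken from the source.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_keyword_list(concepts: Dict[str, Dict[str, str]]) -> ET.Element:
    """Build an XML tree with one <concept> per primary-language keyword."""
    root = ET.Element("keywordList")
    for primary, variants in concepts.items():
        concept = ET.SubElement(root, "concept", {"primary": primary})
        for lang, word in variants.items():
            ET.SubElement(concept, "keyword", {"lang": lang}).text = word
    return root


def to_primary(root: ET.Element, spotted: str, lang: str) -> Optional[str]:
    """Map a keyword spotted in language `lang` to its primary-language concept."""
    for concept in root.iter("concept"):
        for kw in concept.iter("keyword"):
            if kw.get("lang") == lang and kw.text == spotted:
                return concept.get("primary")
    return None


# English is the primary language; Hindi equivalents are given as examples.
concepts = {
    "flood": {"en": "flood", "hi": "बाढ़"},
    "election": {"en": "election", "hi": "चुनाव"},
}
keyword_list = build_keyword_list(concepts)
print(ET.tostring(keyword_list, encoding="unicode"))
print(to_primary(keyword_list, "चुनाव", "hi"))  # election
```

Keeping the list as a tree that is rebuilt whenever the feeds change matches the "dynamic" nature of the list described above; a linear scan is adequate because the list is intentionally small.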
- One of the novelties of this invention is the use of keyword spotting, instead of a full transcription of the news telecast, to annotate multilingual news broadcasts.
- This serves three purposes: (a) one need not determine the language of the telecast a priori; (b) one need not have language-specific speech recognition engines; and (c) keyword spotting is easier than full-text transcription because the search space of the speech-to-text (speech recognition) engine is constrained.
- it is sufficient to annotate the news telecast with the keywords because a news broadcast is all about places and people (proper nouns) and a set of common nouns; additionally, keyword annotation of the news broadcast occupies much less space than an error-prone full-text transcription.
- the Seam Detection Module (90) uses the video descriptors available in Video Description Module (40) to identify story boundaries.
- the Repeat-Identification Module (100) identifies similar and duplicate stories from multiple channels.
- the OCR and speech technology for many Indian languages are not mature enough for reliable keyword extraction. Similar shot detection helps in classification of news stories in these languages.
- the additional information further augments the description in Video Description Module (40) and is used to create a Repository Knowledge Base (110).
- the knowledge base enables semantic search for news clusters by semantic analysis of the various metadata associated with the news videos in the earlier stages of processing.
- both audio and visual cues are identified and used, from a plurality of news programs.
- the recorded news videos are segregated into news programs for further processing.
- shot detection technique is used where the news stories are logically segmented into distinct shots wherein each shot is represented by a key frame or representative frame.
- the similar story detection module finds a similarity score, using visual matching techniques, between two news stories in the range [0, 1], where '0' means no match and '1' means complete match.
- the duplicate story detection module finds whether two news stories are duplicates of each other.
- the shots are detected on the basis of difference in visual features of the successive frames in a video.
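Shot detection from frame-to-frame differences can be sketched as follows. The text does not fix a particular visual feature, so this example assumes a coarse grey-level histogram per frame and an L1 distance with a hand-picked threshold; frames are represented as flat lists of pixel intensities to keep the sketch dependency-free.

```python
from typing import List


def histogram(frame: List[int], bins: int = 4, max_value: int = 256) -> List[float]:
    """Normalized intensity histogram of a frame given as a flat pixel list."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def detect_shot_boundaries(frames: List[List[int]], threshold: float = 0.5) -> List[int]:
    """Return indices i where frame i starts a new shot.

    A boundary is declared when the L1 distance between the histograms of
    successive frames exceeds `threshold` (an assumed, tunable value).
    """
    if not frames:
        return []
    boundaries = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries


dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, bright, bright, dark]
print(detect_shot_boundaries(frames))  # [2, 4]
```

The frame in the middle of each detected shot could then serve as its key frame.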
- a key frame or representative frame and its corresponding visual features such as colour, texture, edges, etc are extracted for each shot.
- the shots are clustered based on the visual similarity of their representative frames. This is calculated by distance measures such as Absolute Image Difference, Histogram Intersection, Hausdorff Distance, Color Moments, SIFT, and the like.
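Of the measures listed, histogram intersection is the simplest to show. The sketch below pairs it with a greedy "leader" clustering, which is an assumption made for brevity: each shot joins the first cluster whose representative key frame is similar enough, otherwise it starts a new cluster. The source does not state which clustering procedure is used.

```python
from typing import List


def histogram_intersection(h1: List[float], h2: List[float]) -> float:
    """Similarity of two normalized histograms: 0 is disjoint, 1 is identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def cluster_shots(key_frame_hists: List[List[float]],
                  min_similarity: float = 0.7) -> List[List[int]]:
    """Group shot indices whose key-frame histograms are visually similar.

    The first shot placed in a cluster acts as its representative frame.
    """
    clusters: List[List[int]] = []
    for idx, hist in enumerate(key_frame_hists):
        for cluster in clusters:
            representative = key_frame_hists[cluster[0]]
            if histogram_intersection(hist, representative) >= min_similarity:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])             # no similar cluster: start one
    return clusters


hists = [
    [0.8, 0.2, 0.0, 0.0],
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.1, 0.4, 0.5],
]
print(cluster_shots(hists))  # [[0, 1], [2, 3]]
```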
- Each cluster in a story is now compared with every cluster in the other story by comparing the central representative frames in the clusters using a visual comparator.
- Let $\{c_{11}, c_{12}, \ldots, c_{1m}\}$ be the clusters in story $s_1$
- and $\{c_{21}, c_{22}, \ldots, c_{2n}\}$ be the clusters in story $s_2$.
- Let $k_{ij}$ be the number of shots in the $j$-th cluster of story $i$. The process is repeated with every pair of candidate similar stories and clusters of similar stories are discovered.
- If $SIM_{12}$ is greater than a certain threshold, the news programs are designated to be similar.
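The formula for the similarity score between two stories is not reproduced in this text, so the sketch below uses an assumed definition purely for illustration: the score is the fraction of all shots, over both stories, that belong to a cluster with a visually matching cluster in the other story. Each cluster is given as a pair (representative histogram, shot count), and the match threshold is hypothetical. The score lies in [0, 1], with 0 for no match and 1 for a complete match.

```python
from typing import List, Tuple

Cluster = Tuple[List[float], int]  # (representative histogram, number of shots)


def histogram_intersection(h1: List[float], h2: List[float]) -> float:
    return sum(min(a, b) for a, b in zip(h1, h2))


def story_similarity(story1: List[Cluster], story2: List[Cluster],
                     match_threshold: float = 0.7) -> float:
    """Assumed similarity: share of shots lying in cross-matched clusters."""

    def matched_shots(a: List[Cluster], b: List[Cluster]) -> int:
        # Count shots of every cluster in `a` that matches some cluster in `b`.
        return sum(k for rep, k in a
                   if any(histogram_intersection(rep, other) >= match_threshold
                          for other, _ in b))

    total = sum(k for _, k in story1) + sum(k for _, k in story2)
    if total == 0:
        return 0.0
    return (matched_shots(story1, story2) + matched_shots(story2, story1)) / total


s1 = [([1.0, 0.0, 0.0], 3), ([0.0, 1.0, 0.0], 1)]
s2 = [([1.0, 0.0, 0.0], 2), ([0.0, 0.0, 1.0], 2)]
print(story_similarity(s1, s2))  # 0.625
```

Weighting by shot count means that a match between two long clusters contributes more than a match between single-shot clusters; stories whose score exceeds a chosen threshold would be designated similar.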
- Duplicate stories are a subset of similar stories. Two stories are said to be duplicates of each other only if their audio-visual patterns are same.
- Let $T_{s_1}$ and $T_{s_2}$ be the total durations of the two stories, $m$ and $n$ the total numbers of shots in the two stories respectively, and consider the durations of the individual shots in stories $s_1$ and $s_2$. Then the criteria for (visually) duplicate videos are:
- the two stories are not visual duplicates of each other if any of the above conditions fails.
- the audio patterns, or audio fingerprints, that are used are based on perceptual features of audio that are invariant, at least to a certain degree, to signal degradation. Thus even severely degraded audio still leads to very similar audio fingerprints.
- These fingerprints are matched for each frame block, which is a group of frames, from two streams. The two streams are duplicates if the fingerprints of all the frame blocks match. As the streams can be from different channels, they may not match exactly at the desired points; the match may occur a few samples before or after the desired point. Thus the audio frames are matched within a window of some predetermined size.
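Windowed fingerprint matching can be sketched as below. The fingerprint representation is an assumption: each frame block is summarized by an integer whose bits encode perceptual features, and two blocks match when their Hamming distance is within a small tolerance, which is what makes the comparison robust to degradation. The window size and bit-error tolerance are hypothetical parameters.

```python
from typing import List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def is_audio_duplicate(fp_a: List[int], fp_b: List[int],
                       window: int = 2, max_bit_errors: int = 2) -> bool:
    """True if every frame block of stream A matches a block of stream B
    located at most `window` positions before or after the same index."""
    if not fp_a or not fp_b:
        return False
    for i, block in enumerate(fp_a):
        lo = max(0, i - window)
        hi = min(len(fp_b), i + window + 1)
        if not any(hamming(block, fp_b[j]) <= max_bit_errors
                   for j in range(lo, hi)):
            return False                       # one unmatched block is enough
    return True


stream_a = [0x0F0F, 0xF0F0, 0x00FF, 0xFF00]
# Same audio on another channel: shifted by one block, one bit flipped.
stream_b = [0xAAAA, 0x0F0F, 0xF0F1, 0x00FF, 0xFF00]
print(is_audio_duplicate(stream_a, stream_b))            # True
print(is_audio_duplicate(stream_a, stream_b, window=0))  # False
```

With `window=0` the one-block offset between the two channels is enough to break the match, which is the situation the predetermined window is meant to handle.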
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Business, Economics & Management (AREA)
- Marketing (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The invention relates to a system for identification, classification, storage, and analysis of news programs containing an audio channel, a video channel, and related metadata, broadcast/relayed on a TV channel by means of a plurality of TV broadcast streams. The system comprises: - an acquisition module adapted to capture said streams; - a recording module adapted to record the captured streams on a physical storage; - a news program identification module adapted to identify news programs in the stored streams; - a news program clipping module adapted to separate the identified news programs from other programs; - an advertisement identification module for identifying advertisements in the identified news programs; - an advertisement clipping module adapted to remove the identified advertisements; - a seam detection module adapted to detect and identify seams of the news programs in order to demarcate individual stories in a news program; - a keyword generation module adapted to generate a list of keywords; - a text-keyword identification module adapted to identify the created keywords from visual text in the identified news programs; - a speech-keyword identification module for identifying the created keywords from speech in the identified news programs; - a repeat-identification module adapted to identify similar/repeat news programs in the plurality of TV broadcast streams; - a clustering module adapted to cluster the repeat news programs into one news program in order to avoid duplication or multiplication; - a removal module for removing the repeat-identified news programs; - a repository for storing said news programs and the metadata embedded in them; and - a logical interconnection module adapted to logically interconnect each of the modules in order to determine the sequence of steps for a multilingual news video analysis system.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN2092/MUM/2009 | 2009-09-14 | ||
IN2092MU2009 | 2009-09-14 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2011039773A2 | 2011-04-07 |
WO2011039773A3 | 2011-06-16 |
Family
ID=43826738
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IN2010/000617 (WO2011039773A2) | 2009-09-14 | 2010-09-14 | TV news analysis system for multilingual broadcast channels |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2011039773A2 |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015102245A1 * | 2014-01-02 | 2015-07-09 | Samsung Electronics Co., Ltd. | Display device, server device, voice input system and methods thereof |
WO2017023719A1 * | 2015-07-31 | 2017-02-09 | Rovi Guides, Inc. | Method for enhancing a user viewing experience when consuming a sequence of media |
KR102005112B1 * | 2018-10-16 | 2019-07-29 | (주) 씨이랩 | Method for providing an advertisement service in content streaming |
CN112565820A * | 2020-12-24 | 2021-03-26 | 新奥特(北京)视频技术有限公司 | Video news splitting method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2000039707A1 * | 1998-12-23 | 2000-07-06 | Koninklijke Philips Electronics N.V. | Personalized video classification and retrieval system |
WO2002031814A1 * | 2000-10-10 | 2002-04-18 | Intel Corporation | Language-independent voice-based search system |
WO2007053112A1 * | 2005-11-07 | 2007-05-10 | Agency For Science, Technology And Research | Identification of a repeated sequence in video data |
CN101315631A * | 2008-06-25 | 2008-12-03 | 中国人民解放军国防科学技术大学 | News video story unit association method |
- 2010-09-14: WO application PCT/IN2010/000617 (WO2011039773A2), status: Application Filing (active)
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015102245A1 * | 2014-01-02 | 2015-07-09 | Samsung Electronics Co., Ltd. | Display device, server device, voice input system and methods thereof |
US9749699B2 (en) | 2014-01-02 | 2017-08-29 | Samsung Electronics Co., Ltd. | Display device, server device, voice input system and methods thereof |
WO2017023719A1 * | 2015-07-31 | 2017-02-09 | Rovi Guides, Inc. | Method for enhancing a user viewing experience when consuming a sequence of media |
EP3448049A1 * | 2015-07-31 | 2019-02-27 | Rovi Guides, Inc. | Method for enhancing a user viewing experience when consuming a sequence of media |
US10375443B2 (en) | 2015-07-31 | 2019-08-06 | Rovi Guides, Inc. | Method for enhancing a user viewing experience when consuming a sequence of media |
US11032611B2 (en) | 2015-07-31 | 2021-06-08 | Rovi Guides, Inc. | Method for enhancing a user viewing experience when consuming a sequence of media |
EP3926966A1 * | 2015-07-31 | 2021-12-22 | Rovi Guides, Inc. | Method for enhancing a user viewing experience when consuming a sequence of media |
US11523182B2 (en) | 2015-07-31 | 2022-12-06 | Rovi Guides, Inc. | Method for enhancing a user viewing experience when consuming a sequence of media |
US11849182B2 (en) | 2015-07-31 | 2023-12-19 | Rovi Guides, Inc. | Method for providing identifying portions for playback at user-selected playback rate |
KR102005112B1 * | 2018-10-16 | 2019-07-29 | (주) 씨이랩 | Method for providing an advertisement service in content streaming |
CN112565820A * | 2020-12-24 | 2021-03-26 | 新奥特(北京)视频技术有限公司 | Video news splitting method and device |
CN112565820B * | 2020-12-24 | 2023-03-28 | 新奥特(北京)视频技术有限公司 | Video news splitting method and device |
Also Published As
Publication number | Publication date |
---|---|
WO2011039773A3 | 2011-06-16 |
Similar Documents
Publication | Title |
---|---|
US20030065655A1 (en) | Method and apparatus for detecting query-driven topical events using textual phrases on foils as indication of topic |
US20040143434A1 (en) | Audio-assisted segmentation and browsing of news videos |
US20180039859A1 (en) | Joint acoustic and visual processing |
CN103761261A | Media search method and device based on speech recognition |
CN112004164B | Automatic video poster generation method |
US7349477B2 (en) | Audio-assisted video segmentation and summarization |
CN114880496A | Multimedia information topic analysis method, apparatus, device and storage medium |
Dufour et al. | Characterizing and detecting spontaneous speech: Application to speaker role recognition |
WO2011039773A2 | TV news analysis system for multilingual broadcast channels |
CN114996506A | Corpus generation method, apparatus, electronic device and computer-readable storage medium |
Rouvier et al. | Audio-based video genre identification |
Ariki et al. | Highlight scene extraction in real time from baseball live video |
Poignant et al. | Towards a better integration of written names for unsupervised speakers identification in videos |
Bechet et al. | Detecting person presence in tv shows with linguistic and structural features |
Hayashi et al. | Speech-based and video-supported indexing of multimedia broadcast news |
Xu et al. | Affective content detection in sitcom using subtitle and audio |
Stein et al. | From raw data to semantically enriched hyperlinking: Recent advances in the LinkedTV analysis workflow |
Ghosh et al. | Multimodal indexing of multilingual news video |
Palivela et al. | Dense Video Captioning Using Video-Audio Features and Topic Modeling Based on Caption |
Nouza et al. | A system for information retrieval from large records of Czech spoken data |
Zhu et al. | Video browsing and retrieval based on multimodal integration |
JP4305921B2 | Video topic segmentation method |
Amaral et al. | The development of a portuguese version of a media watch system |
Kothawade et al. | Retrieving instructional video content from speech and text information |
Papageorgiou et al. | Multimedia Indexing and Retrieval Using Natural Language, Speech and Image Processing Methods |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10820017; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase in: | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 10820017; Country of ref document: EP; Kind code of ref document: A2 |