WO2011039773A2 - TV news analysis system for multilingual broadcast channels - Google Patents

TV news analysis system for multilingual broadcast channels

Info

Publication number
WO2011039773A2
Authority
WO
WIPO (PCT)
Prior art keywords
news
news programs
programs
identified
repeat
Prior art date
Application number
PCT/IN2010/000617
Other languages
English (en)
Other versions
WO2011039773A3 (fr)
Inventor
Ghosh Hiranmay
Kopparapu Sunilkumar
Original Assignee
Tata Consultancy Services Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tata Consultancy Services Ltd.
Publication of WO2011039773A2
Publication of WO2011039773A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/16 Analogue secrecy systems; Analogue subscription systems
    • H04N7/162 Authorising the user terminal, e.g. by paying; Registering the use of a subscription channel, e.g. billing
    • H04N7/163 Authorising the user terminal, e.g. by paying; Registering the use of a subscription channel, e.g. billing by receiver means only
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433 Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4331 Caching operations, e.g. of an advertisement for later insertion during playback
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/454 Content or additional data filtering, e.g. blocking advertisements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8106 Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/812 Monomedia components thereof involving advertisement data

Definitions

  • the present invention relates to the field of computer vision and audio processing techniques.
  • the present invention relates to analysis of television (TV) news channels.
  • this invention relates to a TV news analysis system for multilingual broadcast channels.
  • US2004189873 discloses a VIDEO DETECTION AND INSERTION SYSTEM.
  • US2004189873 system includes means which detects defined segments in a video stream.
  • the defined segments may be advertisements, as mentioned.
  • US6614987 discloses a TELEVISION PROGRAM RECORDING WITH USER PREFERENCE DETERMINATION.
  • the system includes a module which is responsive to attribute information in accordance with categorization (classification) parameters or viewing preferences of the user.
  • US2002162118 discloses an EFFICIENT INTERACTIVE TV.
  • This system includes content identifier means to identify content or a subset of content. This identification not only helps in identifying news stories and advertisements, but can also identify repetitive news stories.
  • US6608930 discloses a METHOD AND SYSTEM FOR ANALYZING VIDEO CONTENT USING DETECTED TEXT IN VIDEO FRAMES. This system detects video streams based on user-selected image text attributes. A selected attribute may be news stories or advertisements or both. So, both may be individually identified for segregation purposes. Further, recognition of persons featuring in the detected video is also disclosed. However, all these features are enabled due to the (image) text that is available in each video stream.
  • ARTIFICIAL TV NEWS PROGRAMS. It includes means to translate the language of a newscaster into a language of the user's choice, and combines automatic speech recognition (Speech-To-Text processing), automatic machine translation, and audio-visual Text-To-Speech (TTS) synthesis techniques for automatically personalizing TV news programs.
  • this patent application does not provide a solution for identifying news programs and classification of said programs, or even identifying programs based on metadata.
  • An object of the invention is to provide an integrated and complete solution for news video analysis.
  • Another object of the invention is to provide a system wherein TV newscasts in different languages can be processed.
  • Yet another object of the invention is to automatically identify news programs in a broadcast stream and separate them out from other programs.
  • Still another object of the invention is to provide a system wherein advertisements in TV newscasts are automatically identified and removed from the news program.
  • Still an additional object of the invention is to provide a system for news analysis wherein similar stories (pertaining to the same event) on different channels are identified and clustered.
  • Another additional object of the invention is to provide a system for news analysis wherein each news story is indexed with keywords identified in the speech and visual text as well as other metadata, such as a recognized face.
  • Another object of the invention is to provide a system where news stories in languages for which speech and OCR technologies are not mature are indexed based on their similarity with stories in other languages for which speech and OCR technologies are mature.
  • Yet another additional object of the invention is to provide a system for news analysis wherein the stories are classified and can be retrieved.
  • a system for identification, classification, storage, and analysis of news programs containing an audio channel, video channel, and metadata relating to it, broadcasted/relayed on a television (TV) channel by means of a plurality of TV broadcast streams, said system comprising:
  • - recording module adapted to record said captured streams on a physical storage
  • - news program identification module adapted to identify news programs in said stored broadcast streams
  • - news program clipping module adapted to separate said identified news programs from other programs
  • - advertisement clipping module adapted for removal of said identified advertisements
  • - seam detection module adapted to detect and identify seams of said news programs in order to demarcate individual stories in a news program
  • - keyword generation module adapted to generate a list of keywords
  • - text-keyword identification module adapted to identify said created keywords from visual text of identified said news programs
  • - speech-keyword identification module adapted to identify the created keywords from the speech of said identified news programs
  • - repeat-identification module adapted to identify similar/repeat news programs from said plurality of TV broadcast streams
  • - clustering module adapted to cluster said repeat news programs into one news programs, in order to avoid duplication or multiplication
  • - removal module adapted to remove said repeat-identified news programs
  • - logical interconnection module adapted to logically interconnect each of said modules for determining the sequence of steps for a multilingual news video analysis system.
  • said keyword generation module is a multilingual keyword generation module adapted to generate keywords in multiple languages.
  • said text-keyword identification module is a multilingual text-keyword identification module adapted to identify said created keywords from visual text of identified said news programs, in different languages, in the visual channel of the news program.
  • said speech-keyword identification module is a multilingual speech-keyword identification module adapted to identify said created keywords from the speech in different languages, in the audio channel of the news program.
  • said system includes a multilingual lexicon database for generating multilingual synonymous keywords for said created keywords.
  • an acquisition module adapted to capture a TV broadcast stream and further includes a recording module adapted to record said captured stream on a physical storage, typically on disk, in chunks of manageable size.
  • a news program identification module adapted to identify news programs in the broadcast stream and to separate them from other programs.
  • an advertisement identification module for identification of advertisements from said news programs, and further including an advertisement clipping module adapted for removal of said identified advertisement breaks.
  • a keyword generation module to create a list of desired keywords of contemporary interest in different languages.
  • a text-keyword identification module adapted to identify the desired created keywords from the visual text of said news stories, in different languages, typically appearing in form of ticker text on the screen.
  • a speech-keyword identification module adapted to identify the desired keywords from the speech, in different languages, in the audio channel of the news.
  • a seam detection module adapted to detect and identify seams i.e. story boundaries and demarcate the individual stories in a news program.
  • a repeat-identification module adapted to identify similar/repeat stories from multiple channels and further including a clustering module adapted to cluster said repeat stories into one story, to avoid duplication or multiplication.
  • a removal module adapted to identify duplicate stories (repeat telecasts) and remove them from the selected stories.
  • a repository adapted for storing the news contents, content description of the news videos, various indexes and links as discovered in the previously described modules.
  • said system includes a retrieval module adapted for retrieving a news program from said repository.
  • said system includes a navigation means adapted for navigation in said repository for retrieving a news program.
  • a logical interconnection module adapted to logically interconnect all the said modules for determining the sequence of steps for a multilingual news video analysis system.
  • repeat-identification module includes visual matching means adapted to use visual cues in order to identify repeat news programs, said visual matching means comprises:
  • - key frame identification means adapted to identify at least a key frame in a plurality of news program
  • - key frame visual feature extracting means adapted to extract visual features relating to pre-defined parameters of said identified key frames
  • - processing means adapted to process said extracted features based on pre-defined tests in order to obtain a processed similarity score in relation to plurality of news program
  • - identification means adapted to identify repeat news programs based on pre-defined criteria of said similarity score
  • - deletion means adapted to delete said identified repeat news programs.
  • repeat-identification module includes audio matching means adapted to use audio cues in order to identify repeat news programs, said audio matching means comprises:
  • - window determination means adapted to determine a window of frames in a plurality of news programs
  • - fingerprint detection means adapted to detect audio fingerprint based on pre-defined processing criteria on said determined window of frames;
  • - processing means adapted to process said detected audio fingerprint based on pre-defined tests in order to obtain a processed similarity score in relation to plurality of news program;
  • - identification means adapted to identify repeat news programs based on pre-defined criteria of said similarity score
  • - deletion means adapted to delete said identified repeat news programs.
  • a method for identification, classification, storage, and analysis of news programs containing an audio channel, video channel, and metadata relating to it, broadcasted/relayed on a television (TV) channel by means of a plurality of TV broadcast streams, said method comprises the steps of:
  • said step of identifying said created keywords from visual text of identified said news programs includes the step of generating keywords in multiple languages.
  • said step of identifying created text keywords includes the step of identifying created keywords from visual text of identified said news programs, in different languages, in the visual channel of the news program.
  • said step of identifying speech-keywords includes the step of identifying said created keywords from the speech in different languages, in the audio channel of the news program.
  • said method includes a step of retrieving a news program from said repository.
  • said method includes a step of navigating in said repository for retrieving a news program.
  • said step of removing repeat-identified news programs includes a method of using visual cues in order to identify repeat news programs, said method comprises the steps of:
  • said step of removing repeat-identified news programs includes a method of using audio cues in order to identify repeat news programs, said method comprises the steps of:
  • Figure 1 illustrates a schematic block diagram of the multilingual news video analysis system in accordance with the present invention.
  • the Telecast Acquisition Module (10) captures telecast from several possible sources, e.g. a DTH dish, cable TV, etc., tunes to a particular channel, decodes the TV signals and converts the transmission in standard digital video format, e.g. MPEG-4 or the like. This module is replicated for every channel to be monitored.
  • the video streams captured by the Telecast Acquisition modules (10) are stored in a Recording Module (20) in chunks of manageable size with unique file names.
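  • The chunked recording step can be illustrated with a minimal sketch. The naming scheme (channel name plus chunk start time), the 10-minute chunk length, and the `.mp4` extension are assumptions made for illustration; the source only states that chunks are of manageable size and carry unique file names.

```python
from datetime import datetime, timedelta

def chunk_file_names(channel, start, total_seconds, chunk_seconds=600):
    """Yield (file_name, chunk_start, chunk_length_s) for each chunk of a recording.

    The file name combines the channel and the chunk start time, so names are
    unique as long as one channel is not recorded twice at the same instant.
    """
    offset = 0
    while offset < total_seconds:
        length = min(chunk_seconds, total_seconds - offset)
        chunk_start = start + timedelta(seconds=offset)
        name = f"{channel}_{chunk_start:%Y%m%dT%H%M%S}.mp4"  # hypothetical scheme
        yield name, chunk_start, length
        offset += length
```

  • For example, a 25-minute recording with the default chunk length yields two 600-second chunks followed by one 300-second chunk.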
  • Video Description Module (40).
  • Advertisement breaks within a news program are now detected using the absence of specific ticker-text bands and are marked in the video in the Advertisement Identification Module (50).
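  • A minimal sketch of advertisement-break detection from ticker absence follows. It assumes that an upstream detector (not shown) already supplies one boolean per frame saying whether the ticker-text band is visible; the frame rate and the minimum break length are hypothetical parameters, not values from the source.

```python
def find_ad_breaks(ticker_present, fps=25.0, min_break_seconds=10.0):
    """Return (start_s, end_s) spans in which the ticker band is absent.

    Only absences lasting at least min_break_seconds are reported, so that a
    brief ticker dropout is not mistaken for an advertisement break.
    """
    breaks = []
    run_start = None
    # A trailing sentinel True closes a run that reaches the end of the video.
    for i, present in enumerate(list(ticker_present) + [True]):
        if not present and run_start is None:
            run_start = i
        elif present and run_start is not None:
            if (i - run_start) / fps >= min_break_seconds:
                breaks.append((run_start / fps, i / fps))
            run_start = None
    return breaks
```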
  • the video is decomposed into constituent shots and several visual and audio parameters are extracted. The additional information accumulates in Video Description Module (40).
  • a set of keywords of contemporary interest is selected by analysis of RSS feeds by a Keyword Generation Module (60).
  • the video segments representing news programs are now processed to detect these keywords.
  • a Keyword Recognition Module (70) analyzes the visual text and speech to spot the identified keywords.
  • the visual keywords are classified into 'global' and 'local' categories, depending on the ticker-text band where they appear. The 'local' keywords pertain to the current story being telecast, while the 'global' keywords are not tied to the current story and may pertain to a story appearing anywhere in the news program.
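  • The band-based classification can be sketched as below. The band layout (vertical extents given as fractions of the frame height) is purely hypothetical; real channels place their ticker bands differently, and the source does not specify any geometry.

```python
# Hypothetical band layout: (top, bottom) as fractions of the frame height.
BANDS = {"local": (0.75, 0.88), "global": (0.88, 1.00)}

def classify_keyword(y_center, frame_height, bands=BANDS):
    """Return 'local', 'global', or None for a text box centred at y_center pixels."""
    rel = y_center / frame_height
    for category, (top, bottom) in bands.items():
        if top <= rel <= bottom:
            return category
    return None  # text outside every ticker band is not a ticker keyword
```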
  • Speaker Identification Module (80) identifies the speaker using face recognition and speaker identification (speech) technologies in scenes containing one dominant speaker, for example in speeches made by important personalities. The additional information further augments the description in Video Description Module (40).
  • the keyword generation module is a multilingual keyword generation module.
  • the system provides the ability to process multilingual news programs, according to this invention.
  • a keyword list in multiple languages is created in order to enable keyword spotting in multilingual TV news broadcast channels, in both spoken and visual forms.
  • the multilingual keyword list helps to automatically map the spotted keywords in different languages to their equivalents in a primary language (say, English) for uniform indexing across multiple channels. Restricting the keyword list to a small number of entries helps improve the accuracy of the system, especially for keyword spotting in speech.
  • a sample multilingual keyword list is shown below:
  • the method for creating a multilingual keyword list is driven by RSS feeds maintained by some websites.
  • RSS feeds capture contemporary news in a semi-structured XML format and contain hyperlinks to the full-text news stories, usually in English.
  • the system of this invention identifies the common nouns (using statistical language processing) and proper nouns (using named entity detection) in the RSS feed text and the associated stories as the keywords.
  • the keywords in the language of the RSS feed (usually English) form a set of concepts, which need to be identified in the audio-visual broadcasts of telecasts in different languages.
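  • A rough sketch of keyword selection from an RSS feed is given below. It does not implement statistical language processing or named entity detection: as a stand-in, capitalised words that do not start a sentence are treated as proper-noun candidates, and the most frequent remaining non-stopwords as common-noun candidates. The stopword list and the feed layout (RSS 2.0 `item` elements with `title` and `description`) are assumptions.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

STOPWORDS = {"the", "a", "an", "in", "on", "of", "to", "and", "for",
             "is", "are", "be", "will", "as", "at", "by", "with"}

def rss_keywords(rss_xml, top_n=5):
    """Return (proper_noun_candidates, common_word_candidates) from an RSS string."""
    root = ET.fromstring(rss_xml)
    proper, common = Counter(), Counter()
    for item in root.iter("item"):
        parts = (item.findtext("title"), item.findtext("description"))
        text = ". ".join(p for p in parts if p)
        for sentence in re.split(r"[.!?]\s+", text):
            for pos, word in enumerate(re.findall(r"[A-Za-z]+", sentence)):
                if word.lower() in STOPWORDS:
                    continue
                # Heuristic stand-in for named entity detection.
                if pos > 0 and word[0].isupper():
                    proper[word] += 1
                else:
                    common[word.lower()] += 1
    return ([w for w, _ in proper.most_common(top_n)],
            [w for w, _ in common.most_common(top_n)])
```

  • A production system would replace both heuristics with a part-of-speech tagger and a named entity recognizer, and would also follow the hyperlinks to the full-text stories.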
  • the equivalent keywords in other languages can be derived from the English keywords using a word-level English-to-target-language dictionary (for common noun keywords) and a pronunciation lexicon for transliterating proper names in a semi-automatic manner. A lexicon is an association of words and their phonetic transcriptions; it is a special kind of dictionary that maps a word to all of its possible phonemic representations.
  • the multilingual keywords are held in a dynamic keyword list structure in XML format. This becomes the active keyword list for the news video channels and is used for keyword spotting in both the audio and visual channels of news telecasts.
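  • To make the XML keyword list concrete, the sketch below parses one possible layout and builds a lookup from any surface form to its primary-language keyword. The element and attribute names (`keywords`, `keyword`, `form`, `lang`, `primary`) and the two example entries are hypothetical; the source does not give the actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema and illustrative entries (English and Hindi).
KEYWORD_XML = """<keywords primary="en">
  <keyword id="election">
    <form lang="en">election</form>
    <form lang="hi">चुनाव</form>
  </keyword>
  <keyword id="flood">
    <form lang="en">flood</form>
    <form lang="hi">बाढ़</form>
  </keyword>
</keywords>"""

def build_lookup(xml_text):
    """Map every surface form, in any language, to its primary-language keyword."""
    root = ET.fromstring(xml_text)
    primary = root.get("primary")
    lookup = {}
    for keyword in root.iter("keyword"):
        forms = {f.get("lang"): f.text.strip() for f in keyword.iter("form")}
        for form in forms.values():
            lookup[form] = forms[primary]
    return lookup
```

  • With such a lookup, a keyword spotted in a Hindi ticker or audio track can be indexed under the same English entry as its counterpart on an English channel.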
  • One of the novelties of this invention is the use of keyword spotting, instead of a full transcription of news telecasts, to annotate multilingual news broadcasts.
  • This serves three purposes: (a) one need not determine the language of the telecast a priori; (b) one need not have language-specific speech recognition engines; and (c) keyword spotting is easier than full-text transcription because the search space of the speech-to-text (speech recognition) engine is constrained.
  • it is sufficient to annotate the news telecast with keywords because news broadcasts are largely about places and people (proper nouns) and a set of common nouns; additionally, keyword annotation of a news broadcast occupies much less space than an error-prone full-text transcription.
  • the Seam Detection Module (90) uses the video descriptors available in Video Description Module (40) to identify story boundaries.
  • Repeat-Identification Module (100) identifies similar and duplicate stories from multiple channels.
  • OCR and speech technologies for many Indian languages are not mature enough for reliable keyword extraction. Similar shot detection helps in the classification of news stories in these languages.
  • the additional information further augments the description in Video Description Module (40) and is used to create a Repository Knowledge Base (110).
  • the knowledge base enables semantic search for news clusters by semantic analysis of the various metadata associated with the news videos in the earlier stages of processing.
  • both audio and visual cues are identified and used, from a plurality of news programs.
  • the recorded news videos are segregated into news programs for further processing.
  • shot detection technique is used where the news stories are logically segmented into distinct shots wherein each shot is represented by a key frame or representative frame.
  • the similar story detection module computes a similarity score between two news stories, using visual matching techniques, in the range [0, 1], where '0' means no match and '1' means complete match.
  • the duplicate story detection module finds whether two news stories are duplicates of each other.
  • the shots are detected on the basis of difference in visual features of the successive frames in a video.
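  • Shot detection from successive-frame differences can be sketched as follows. The choice of a grey-level histogram as the visual feature, the L1 distance, and the threshold are assumptions for illustration; frames are represented as flat lists of 0..255 intensities to keep the example self-contained.

```python
def histogram(frame, bins=16):
    """Normalised grey-level histogram of a frame given as a flat list of 0..255 values."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    total = float(len(frame))
    return [c / total for c in counts]

def detect_shot_boundaries(frames, threshold=0.5):
    """Return the indices i at which a new shot starts at frames[i]."""
    boundaries = []
    previous = None
    for i, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            # L1 distance between successive histograms, in the range [0, 2].
            distance = sum(abs(a - b) for a, b in zip(previous, current))
            if distance > threshold:
                boundaries.append(i)
        previous = current
    return boundaries
```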
  • a key frame or representative frame and its corresponding visual features such as colour, texture, edges, etc are extracted for each shot.
  • the shots are clustered based on the visual similarity of their representative frames. This is calculated by distance measures such as Absolute Image Difference, Histogram Intersection, Hausdorff Distance, Color Moments, SIFT, and the like.
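  • As one example of the listed measures, Histogram Intersection between two normalised histograms is the sum of their bin-wise minima. The sketch below uses it in a simple greedy clustering of shots; the greedy strategy, the use of the first shot as a cluster's representative, and the similarity threshold are assumptions, not the method stated in the source.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two histograms that each sum to 1."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def cluster_shots(key_histograms, min_similarity=0.8):
    """Greedily group shots whose key-frame histograms are similar.

    Each cluster is a list of shot indices; its first member serves as the
    representative against which later shots are compared.
    """
    clusters = []
    for idx, hist in enumerate(key_histograms):
        for cluster in clusters:
            representative = key_histograms[cluster[0]]
            if histogram_intersection(representative, hist) >= min_similarity:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])  # no similar cluster found: start a new one
    return clusters
```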
  • Each cluster in a story is now compared with every cluster in the other story by comparing the central representative frames in the clusters using a visual comparator.
  • Let $\{c_{11}, c_{12}, \ldots, c_{1m}\}$ be the clusters in story $s_1$.
  • Let $\{c_{21}, c_{22}, \ldots, c_{2n}\}$ be the clusters in story $s_2$.
  • Let $k_{ij}$ be the number of shots in the $j^{th}$ cluster of story $i$. The process is repeated with every pair of candidate similar stories and clusters of similar stories are discovered.
  • If $SIM_{12}$ is greater than a certain threshold, the news programs are designated to be similar.
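  • The exact expression for the story similarity score is not reproduced in this text, so the sketch below shows only one plausible form: each cluster is matched to its best counterpart in the other story, the matches are weighted by the number of shots in the cluster, and the two directions are averaged. The function name, the weighting, and the averaging are assumptions; `compare` stands for any visual comparator returning a score in [0, 1].

```python
def story_similarity(clusters_1, clusters_2, compare):
    """Assumed similarity between two stories, in [0, 1].

    clusters_x is a list of (representative_frame, shot_count) pairs and
    compare(rep_a, rep_b) is a visual comparator returning a score in [0, 1].
    """
    def directed(src, dst):
        total_shots = sum(count for _, count in src)
        if total_shots == 0 or not dst:
            return 0.0
        weighted = sum(count * max(compare(rep, other) for other, _ in dst)
                       for rep, count in src)
        return weighted / total_shots

    # Average both directions so that the score is symmetric.
    return 0.5 * (directed(clusters_1, clusters_2) + directed(clusters_2, clusters_1))
```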
  • Duplicate stories are a subset of similar stories. Two stories are said to be duplicates of each other only if their audio-visual patterns are same.
  • Let $T_{s_1}$ and $T_{s_2}$ be the total durations of the two stories, let $m$ and $n$ be the total numbers of shots in the two stories respectively, and consider the durations of the individual shots in stories $s_1$ and $s_2$. Then the criteria for (visually) duplicate videos are:
  • the two stories are not visual duplicates of each other if any of the above conditions fails.
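  • The individual criteria are not reproduced in this text. The sketch below therefore assumes the most direct reading of the quantities defined above: the shot counts must be equal, the total durations must agree within a tolerance, and corresponding shots must have matching durations within a tolerance. Both tolerances are hypothetical.

```python
def visually_duplicate(shots_1, shots_2, total_tol=1.0, shot_tol=0.5):
    """Assumed duplicate test on two lists of shot durations (in seconds).

    All three conditions must hold; the stories are not visual duplicates
    if any one of them fails.
    """
    if len(shots_1) != len(shots_2):                    # same number of shots
        return False
    if abs(sum(shots_1) - sum(shots_2)) > total_tol:    # same total duration
        return False
    return all(abs(a - b) <= shot_tol                   # same shot-by-shot durations
               for a, b in zip(shots_1, shots_2))
```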
  • the audio patterns, or audio fingerprints, that are used are based on perceptual features of audio that are invariant, at least to a certain degree, with respect to signal degradations. Thus severely degraded audio still leads to very similar audio fingerprints.
  • These fingerprints are matched for each frame block, which is a group of frames, from two streams. The two streams are duplicates if the fingerprints of all the frame blocks are matched. As the streams can be from different channels, they may not match exactly at the desired points. The match may occur a few samples before or after the desired point. Thus the audio frames are matched in a window of some predetermined size.
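  • The windowed matching can be sketched as below. Fingerprint extraction itself is out of scope here: each frame block is assumed to be already summarised as an integer bit pattern, and two blocks are taken to match when their Hamming distance is small. The integer representation, the bit-error tolerance, and the window size are all assumptions.

```python
def hamming(a, b):
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")

def streams_match(fp_a, fp_b, window=2, max_bit_errors=2):
    """True if every block of fp_a matches a block of fp_b within +/- window positions.

    fp_a and fp_b are sequences of per-frame-block integer fingerprints.
    """
    for i, block in enumerate(fp_a):
        lo = max(0, i - window)
        hi = min(len(fp_b), i + window + 1)
        if not any(hamming(block, fp_b[j]) <= max_bit_errors for j in range(lo, hi)):
            return False  # one unmatched block is enough to reject the pair
    return True
```

  • The window absorbs small misalignments between channels, while the bit-error tolerance absorbs the residual differences that signal degradation leaves in otherwise similar fingerprints.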

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to a system for identification, classification, storage, and analysis of news programs containing an audio channel, a video channel, and metadata relating to them, broadcast/relayed on a TV channel by means of a plurality of TV broadcast streams. The system comprises: - an acquisition module adapted to capture said streams; - a recording module adapted to record the captured streams on a physical storage; - a news program identification module adapted to identify news programs in the stored streams; - a news program clipping module adapted to separate the identified news programs from other programs; - an advertisement identification module for identifying advertisements in the identified news programs; - an advertisement clipping module adapted to remove the identified advertisements; - a seam detection module adapted to detect and identify seams of the news programs in order to demarcate individual stories in a news program; - a keyword generation module adapted to generate a list of keywords; - a text-keyword identification module adapted to identify the created keywords from visual text in the identified news programs; - a speech-keyword identification module for identifying the created keywords from speech in the identified news programs; - a repeat-identification module adapted to identify similar/repeat news programs in the plurality of TV broadcast streams; - a clustering module adapted to cluster the repeat news programs into one news program in order to avoid duplication or multiplication; - a removal module for removing the repeat-identified news programs; - a repository for storing said news programs and the metadata embedded in them; and - a logical interconnection module adapted to logically interconnect each of the modules in order to determine the sequence of steps for a multilingual news video analysis system.
PCT/IN2010/000617 2009-09-14 2010-09-14 TV news analysis system for multilingual broadcast channels WO2011039773A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN2092/MUM/2009 2009-09-14
IN2092MU2009 2009-09-14

Publications (2)

Publication Number Publication Date
WO2011039773A2 (fr) 2011-04-07
WO2011039773A3 (fr) 2011-06-16

Family

ID=43826738

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2010/000617 WO2011039773A2 (en) 2009-09-14 2010-09-14 TV news analysis system for multilingual broadcast channels

Country Status (1)

Country Link
WO (1) WO2011039773A2 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015102245A1 (en) * 2014-01-02 2015-07-09 Samsung Electronics Co., Ltd. Display device, server device, voice input system and methods thereof
WO2017023719A1 (en) * 2015-07-31 2017-02-09 Rovi Guides, Inc. Method for enhancing a user viewing experience when consuming a sequence of media
KR102005112B1 (en) * 2018-10-16 2019-07-29 (주) 씨이랩 Method for providing an advertisement service within content streaming
CN112565820A (en) * 2020-12-24 2021-03-26 新奥特(北京)视频技术有限公司 Video news splitting method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000039707A1 (fr) * 1998-12-23 2000-07-06 Koninklijke Philips Electronics N.V. Personalized video classification and retrieval system
WO2002031814A1 (fr) * 2000-10-10 2002-04-18 Intel Corporation Language independent voice-based search system
WO2007053112A1 (fr) * 2005-11-07 2007-05-10 Agency For Science, Technology And Research Repeat clip identification in video data
CN101315631A (zh) * 2008-06-25 2008-12-03 中国人民解放军国防科学技术大学 News video story unit association method

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015102245A1 (fr) * 2014-01-02 2015-07-09 Samsung Electronics Co., Ltd. Display device, server device, voice input system and methods thereof
US9749699B2 (en) 2014-01-02 2017-08-29 Samsung Electronics Co., Ltd. Display device, server device, voice input system and methods thereof
WO2017023719A1 (fr) * 2015-07-31 2017-02-09 Rovi Guides, Inc. Method for enhancing a user viewing experience when consuming a sequence of media
EP3448049A1 (fr) * 2015-07-31 2019-02-27 Rovi Guides, Inc. Method for enhancing a user viewing experience when consuming a sequence of media
US10375443B2 (en) 2015-07-31 2019-08-06 Rovi Guides, Inc. Method for enhancing a user viewing experience when consuming a sequence of media
US11032611B2 (en) 2015-07-31 2021-06-08 Rovi Guides, Inc. Method for enhancing a user viewing experience when consuming a sequence of media
EP3926966A1 (fr) * 2015-07-31 2021-12-22 Rovi Guides, Inc. Method for enhancing a user viewing experience when consuming a sequence of media
US11523182B2 (en) 2015-07-31 2022-12-06 Rovi Guides, Inc. Method for enhancing a user viewing experience when consuming a sequence of media
US11849182B2 (en) 2015-07-31 2023-12-19 Rovi Guides, Inc. Method for providing identifying portions for playback at user-selected playback rate
KR102005112B1 (ko) * 2018-10-16 2019-07-29 (주) 씨이랩 Method for providing an advertisement service within content streaming
CN112565820A (zh) * 2020-12-24 2021-03-26 新奥特(北京)视频技术有限公司 Video news splitting method and device
CN112565820B (zh) * 2020-12-24 2023-03-28 新奥特(北京)视频技术有限公司 Video news splitting method and device

Also Published As

Publication number Publication date
WO2011039773A3 (fr) 2011-06-16

Similar Documents

Publication Publication Date Title
US20030065655A1 (en) Method and apparatus for detecting query-driven topical events using textual phrases on foils as indication of topic
US20040143434A1 (en) Audio-Assisted segmentation and browsing of news videos
US20180039859A1 (en) Joint acoustic and visual processing
CN103761261A (zh) Media search method and device based on speech recognition
CN112004164B (zh) Automatic video poster generation method
US7349477B2 (en) Audio-assisted video segmentation and summarization
CN114880496A (zh) Multimedia information topic analysis method, apparatus, device and storage medium
Dufour et al. Characterizing and detecting spontaneous speech: Application to speaker role recognition
WO2011039773A2 (fr) TV news analysis system for multilingual broadcast channels
CN114996506A (zh) Corpus generation method, apparatus, electronic device and computer-readable storage medium
Rouvier et al. Audio-based video genre identification
Ariki et al. Highlight scene extraction in real time from baseball live video
Poignant et al. Towards a better integration of written names for unsupervised speakers identification in videos
Bechet et al. Detecting person presence in tv shows with linguistic and structural features
Hayashi et al. Speech-based and video-supported indexing of multimedia broadcast news
Xu et al. Affective content detection in sitcom using subtitle and audio
Stein et al. From raw data to semantically enriched hyperlinking: Recent advances in the LinkedTV analysis workflow
Ghosh et al. Multimodal indexing of multilingual news video
Palivela et al. Dense Video Captioning Using Video-Audio Features and Topic Modeling Based on Caption
Nouza et al. A system for information retrieval from large records of Czech spoken data
Zhu et al. Video browsing and retrieval based on multimodal integration
JP4305921B2 (ja) Video topic segmentation method
Amaral et al. The development of a portuguese version of a media watch system
Kothawade et al. Retrieving instructional video content from speech and text information
Papageorgiou et al. Multimedia Indexing and Retrieval Using Natural Language, Speech and Image Processing Methods

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10820017

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10820017

Country of ref document: EP

Kind code of ref document: A2