WO2008145679A2 - Method to convert a sequence of electronic documents and relative apparatus - Google Patents

Method to convert a sequence of electronic documents and relative apparatus

Info

Publication number
WO2008145679A2
Authority
WO
WIPO (PCT)
Prior art keywords
audio
sequence
video
task
stream
Application number
PCT/EP2008/056570
Other languages
French (fr)
Other versions
WO2008145679A3 (en)
Inventor
Federico Pinna
Original Assignee
Reitek Spa
Application filed by Reitek Spa
Publication of WO2008145679A2 publication Critical patent/WO2008145679A2/en
Publication of WO2008145679A3 publication Critical patent/WO2008145679A3/en

Classifications

    • H04N21/234318 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into objects, e.g. MPEG-4 objects
    • H04N21/23412 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • H04N21/2358 Processing of additional data, e.g. scrambling of additional data or processing content descriptors involving reformatting operations of additional data, e.g. HTML pages for generating different versions, e.g. for different recipient devices
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/4314 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for fitting data in a restricted space on the screen, e.g. EPG data in a rectangular grid
    • H04N21/44012 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/47205 End-user interface for interacting with content, for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N21/85406 Content authoring involving a specific file format, e.g. MP4 format
    • H04N7/17318 Direct or substantially direct transmission and handling of requests

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)
  • Facsimiles In General (AREA)
  • Television Systems (AREA)

Abstract

A method (10) to convert a sequence of electronic documents (12), defined according to SMIL (Synchronized Multimedia Integration Language) or similar language and consisting of at least a heading (13) and a body (14), into a stream of data. The stream of data can be reproduced on at least a client terminal, and consists of one or more audio streams and/or one or more video streams comprising audio-visual contents that can be reproduced simultaneously. Furthermore, the stream of data is not predefined at the beginning of the reproduction, but is modified and/or integrated during reproduction, with no actual and/or perceptible interruption, increasing and/or modifying the sequence of electronic documents with one or more sequences deriving from a request from the client terminal.

Description

"METHOD TO CONVERT A SEQUENCE OF ELECTRONIC DOCUMENTS
AND RELATIVE APPARATUS"
* * * * *
FIELD OF THE INVENTION
The present invention concerns a method and an apparatus to convert a sequence of electronic documents defined according to SMIL (Synchronized Multimedia Integration Language), or similar language, into a stream of audio/video data.
The present invention is preferably, but not only, applied in the field of distribution and creation of audio files or multimedia video files comprising audio-visual contents of a complex and possibly interactive nature.
Hereafter video file is intended as a file, for example compressed, containing a sequence of images, defined according to IT standards, such as MPEG, AVI,
MOV or similar; audio file is intended as a file, for example compressed, containing a sequence of audio samples, defined according to IT standards, such as WAV, MP3.
Video stream, on the contrary, is intended as a transmission of a sequence of data, for example compressed, containing a sequence of images, defined according to telecommunication standards, such as H.263, YUV420P or other, and audio stream is intended as a transmission of a sequence of data, for example compressed, containing a sequence of audio samples, defined according to telecommunication standards, such as PCM16, AMR-NB.
BACKGROUND OF THE INVENTION
Distribution methods for presentations or audio/video files, regarding audio-visual content, even complex, are known. Said audio-visual contents are generally accessible by means of remote devices connected to the Internet or to other communication networks, even local ones, such as electronic processors, videophones on a fixed or mobile network, television decoders. More generally, known distribution methods normally comprise two steps: a first step in which a request is made for specific audio-visual contents, performed by an entity called client, which wants to make use of said contents. The request is directed to an entity called server, which acts as a distributor of said contents. The second step consists of the reply, from server to client, comprising said contents. A frequently used method to perform said distribution of presentations or audio/visual files is based on SMIL or similar language. This language belongs to the family of markup languages and is derived from the better known language XML (Extensible Markup Language). One of the most used applications based on SMIL is the MMS (Multimedia Messaging System) messaging service of mobile phones.
When the distribution of audio/visual contents occurs by means of SMIL type files, the SMIL language is interpreted, processed and reproduced once it is received by a client. This language therefore makes it possible to generate files that comprise all the information of the audio/visual contents and the further information required for a correct reproduction of said contents.
One disadvantage of using this language is the fact that the client is obliged to perform complex interpretation and processing operations on this type of file, preparatory to the presentation of the audio-visual contents. This characteristic entails the use of user terminals, or generally client terminals, with ever greater processing power, causing an increase both in the cost of the terminal and in its power consumption and, when such terminals have an autonomous energy supply, for example a battery, a decrease in operational autonomy. A further disadvantage of the distribution of audio-visual contents by means of
SMIL occurs in interactive answering and menu management applications, for example in telephone or videophone use. These applications, better known as IVR (Interactive Voice Response) or IVVR (Interactive Voice and Video Response), function by means of a request method to receive this type of file. In fact, in said applications, in the step when a new menu is presented following the specific user selection request, the music and/or video contents functioning as a background are suddenly interrupted. This interruption can be substantially attributed to the interval of time needed for the request of this type of file with new contents, to the reception of the file, to its interpretation, processing and reproduction. Therefore, the interaction of said menu systems is inevitably slowed down, and the quality of the service deteriorates.
The purpose of the present invention is to perfect a method and achieve an apparatus for the conversion of electronic documents into a sequence of data that conforms to an audio stream or a video stream, starting from the processing of a series of such documents defined according to SMIL or similar language, which solves the problems indicated above. In particular, a primary purpose of the invention is to allow the interconnection or the reproduction of new audio/video sequences with sequences already being performed, without interrupting the performance of the sequences already being reproduced.
The Applicant has devised, tested and embodied the present invention to overcome the shortcomings of the state of the art and to obtain these and other purposes and advantages.
SUMMARY OF THE INVENTION
The present invention is set forth and characterized in the independent claims, while the dependent claims describe other innovative characteristics of the invention.
In accordance with the above purpose, a method according to the present invention can be used to convert into a stream of data one or more electronic documents, and/or one or more sequences thereof, defined according to SMIL or similar language. The stream of data consists of one or more audio streams and/or one or more video streams, comprising one or more contents that can be reproduced even simultaneously, such as for example a heading superimposed over a video content, intended for the distribution of presentations of audio-visual contents, but not only.
An electronic document defined according to SMIL or similar language, on which the present invention is applied, consists of at least two elements. A first element consists of a heading, which defines the number and type of the audio-visual contents contained in the document. A second element of said document consists of a body, comprising the audio-visual contents.
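By way of illustration only, the sketch below builds a minimal SMIL-like document and separates its heading from its body with Python's standard XML parser. The region names, file names and attribute values are invented for the example; the documents actually handled by the method can be considerably richer.

```python
import xml.etree.ElementTree as ET

# Minimal SMIL-like document: a <head> (the heading) declaring the layout and
# a <body> holding the audio-visual contents. All names and sources below are
# purely illustrative.
SMIL_DOC = """
<smil>
  <head>
    <layout>
      <root-layout width="176" height="144"/>
      <region id="main"    top="0"   left="0" width="176" height="120"/>
      <region id="caption" top="120" left="0" width="176" height="24"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="menu_background.3gp" region="main"/>
      <audio src="menu_music.amr"/>
      <text src="menu_title.txt" region="caption"/>
    </par>
  </body>
</smil>
"""

root = ET.fromstring(SMIL_DOC)
heading = root.find("head")   # defines number and type of the contents
body = root.find("body")      # contains the audio-visual contents themselves

regions = [r.get("id") for r in heading.iter("region")]
media = [(el.tag, el.get("src")) for el in body.iter() if el.get("src")]

print("regions:", regions)            # ['main', 'caption']
print("media contributions:", media)  # video, audio and text reproduced together
```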
According to a characteristic of the method according to the present invention, the audio streams and/or the video streams, intended for the presentation of audio-visual contents, are not pre-defined in a rigid manner at the beginning of the reproduction thereof, but consist of pre-ordered sequences that together form the audio/video file. The streams can be modified, for example following a specific request, in the course of the reproduction, thus rendering the presentation of the audio-visual contents dynamic, for example increasing and/or modifying, during the course of the reproduction, the conversion sequence of the electronic documents according to the insertion of a new audio/video sequence that interconnects to the sequence already being executed. This characteristic makes it possible to transform the audio stream or the video stream during their reproduction, without a perceptible interruption in the reproduction, in conformity with the specific audio-visual contents defined in the sequence of electronic documents subjected to increase and/or modification.
The method according to the present invention, in order to obtain the above, comprises at least a first step, in which a request is generated for the distribution of a presentation of audio-visual contents contained in an electronic document, defined according to SMIL or similar language. In this first step, furthermore, the presentation of audio-visual contents is structured according to at least one session comprising a plurality of functioning states, and is preferably set to an initial inactive state. According to a variant, the request for the presentation is sent by a requesting entity, called client, to a distributing entity, called server.
According to a further characteristic of said method, the session indicated above also comprises a plurality of tasks suitable to regulate the reproduction modes of the audio-visual contents. Each of the tasks is activated when at least one event takes place, and at the end of the task at least one further event is generated. The above events are suitable to modify the state of one or more specific tasks and, therefore, the state of the presentation.
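Read schematically, the session described above behaves like a small state machine driven by the steps of the method. The sketch below is only that schematic reading; the state names anticipate the four states listed in the detailed description, while the method names are shorthand of mine, not the implementation.

```python
from enum import Enum, auto

class SessionState(Enum):
    INACTIVE = auto()      # session created, nothing prepared yet
    PROCESSING = auto()    # document analysed, filters and table being built
    REPRODUCTION = auto()  # audio/video stream being generated
    END = auto()           # session finished

class Session:
    """Schematic session: tasks regulate how contents are reproduced, and the
    events they exchange move the presentation between states."""

    def __init__(self):
        self.state = SessionState.INACTIVE

    def prepare(self):      # second step: preparation of the reproduction
        self.state = SessionState.PROCESSING

    def reproduce(self):    # third step: generation of the audio/video stream
        self.state = SessionState.REPRODUCTION

    def modify(self):       # a new document arrives during reproduction:
        self.state = SessionState.PROCESSING   # back to processing, no teardown

    def finish(self):
        self.state = SessionState.END

session = Session()
session.prepare(); session.reproduce()
session.modify(); session.reproduce()   # the stream is updated, not stopped
session.finish()
print(session.state)                    # SessionState.END
```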
The method according to the present invention also comprises the following steps: - a second step in which the session indicated above is made to transit in a state of preparation for the reproduction of the aforesaid audio-visual contents; in the second step, following an operation to verify the syntactic correctness of the electronic document, through an analysis of the heading of the document, the audio-visual contents and the relative reproduction parameters are identified. In this second step, following the aforesaid analysis, a specific entity sequence is also configured, also called sequence of filters, intended to process the audio-visual contents comprised in the body of the document. Also in this second step, by means of an analysis of the body of the document, a table is generated, the content of which defines the set of transitions allowed for the session. Therefore, said table defines a list of specific tasks and, for each of said tasks, the list of the specific events admissible for the session.
- a third step in which the session is made to transit in a state of reproduction, activating an initial task, defined during the second step. The initial task, according to the information contained in said table, activates the specific tasks suitable for the generation of an audio stream or a video stream, according to the sequence defined in the list of tasks and the relative events contained in said table. The audio or video streams are generated by processing the audio-visual contents, comprised in the document, by means of said sequence of filters.
According to a further variant of the invention, all the above activities are managed at a server level, thus making it possible to reduce the computing capacity required at a client level.
According to a variant of the present invention, the audio stream or the video stream, obtained by converting the electronic document, can be distributed according to a real time mode, as in the distribution by means of RTP or RTSP (Real Time Protocol or Real Time Streaming Protocol) to a client, or according to a local mode, as for the generation of a video file or of an audio file.
According to a variant of the present invention the sequence of filters consists of at least three filters. A first filter is suitable for operations to extract audio or video or audio/video data from an audio file or from a video file, and to transform them by means of decoding, re-sampling or re-sizing. A second filter, consisting of an audio mixer filter or a video mixer filter, is suitable for receiving an input of different audio or video streams, producing an output of a single audio or video stream. Finally, a third filter transforms the output audio or video stream of the second filter into a stream which can be accepted by the client of the presentation of multimedia audio-visual contents.
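A rough picture of this three-filter chain as a Python sketch; the class names, the dictionary "frames" and the format string are invented placeholders, since the patent does not specify the filters at this level of detail.

```python
class SourceFilter:
    """First filter: extracts audio/video data from a file and normalises it
    (decoding, re-sampling and re-sizing are only hinted at here)."""
    def __init__(self, path):
        self.path = path
    def frames(self):
        for i in range(3):                     # stand-in for decoded frames
            yield {"src": self.path, "n": i}

class MixerFilter:
    """Second filter: merges several input streams into a single stream."""
    def __init__(self, *sources):
        self.sources = sources
    def frames(self):
        for group in zip(*(s.frames() for s in self.sources)):
            yield {"mixed": group}             # e.g. a heading over a video

class OutputFilter:
    """Third filter: re-encodes the mixed stream into whatever the client of
    the presentation can accept (here just a tagged dictionary)."""
    def __init__(self, upstream, target_format="H.263/AMR-NB"):
        self.upstream = upstream
        self.target_format = target_format
    def frames(self):
        for frame in self.upstream.frames():
            yield {"format": self.target_format, **frame}

chain = OutputFilter(MixerFilter(SourceFilter("video.avi"),
                                 SourceFilter("music.wav")))
for packet in chain.frames():
    print(packet)
```

The point of the middle stage is that several normalised inputs collapse into one stream before the final adaptation for the client.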
According to a variant of the present invention the set of the said filters, used to generate the sequence of filters, consists of at least three sub-sets. The first sub-set consists of filters suitable substantially for operations to extract audio or video or audio/video data from an audio or video file. The first sub-set is also suitable for synchronizing operations of two or more audio or video streams and for operations to transmit and receive audio or video streams, using protocols of the RTP/UDP or RTP/TCP (Real Time Protocol/User Data Protocol or Real Time Protocol/Transmission Control Protocol) type.
A second sub-set of filters is suitable for specific operations on video streams, such as for example compression and decompression. These operations are also intended to transform the video streams into different formats, or to transform into video streams a series of images encoded according to common electronic formats, such as for example files with JPG or BMP extensions.
Finally, a third sub-set of filters is suitable for operations on audio streams, such as for example compression or decompression. These operations are also intended to transform said audio sequences into different formats.
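Purely as an organizational aid, the three sub-sets can be pictured as a registry of filter roles from which a sequence of filters is assembled; every name below is an invented placeholder, not a component disclosed by the patent.

```python
# The three sub-sets of filters, grouped by role.
FILTER_SETS = {
    # 1) extraction, synchronization and RTP/UDP or RTP/TCP transport of streams
    "source_and_transport": ["file_reader", "stream_synchronizer",
                             "rtp_receiver", "rtp_sender"],
    # 2) video-specific operations: (de)compression, format conversion,
    #    turning series of JPG/BMP images into a video stream, re-sizing
    "video": ["video_decoder", "video_encoder", "image_sequence_to_video",
              "resizer"],
    # 3) audio-specific operations: (de)compression and format conversion
    "audio": ["audio_decoder", "audio_encoder", "resampler"],
}

def build_chain(*names):
    """Pick filters by name, in order, to assemble a sequence of filters."""
    index = {name: group for group, members in FILTER_SETS.items()
             for name in members}
    return [(name, index[name]) for name in names]

print(build_chain("file_reader", "video_decoder", "resizer", "video_encoder"))
```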
In accordance with the above purpose, the present invention also concerns an apparatus for converting into a stream of audio/video data an electronic document, defined according to SMIL or similar language, and consisting of a heading, which defines the number and type of the audio-visual contents in said document, and of a body, which comprises said audio-visual contents.
According to the present invention, the apparatus comprises first electronic means able to receive a request to distribute audio-visual contents contained in the electronic document.
According to a further variant of the present invention, the apparatus also comprises:
- second electronic means able to retrieve and analyze the document memorized in electronic memorizing means and to generate a session representing said presentation of audio-visual contents. The second electronic means are also able to generate a specific sequence of entities, also called sequence of filters, suitable to process the audio-visual contents comprised in the body of said document and to generate a table that defines the states of transition of said presentation of audio-visual contents and the relative tasks, together with the connected events;
- third electronic means able to make the session transit in a state of reproduction, transforming the tasks and the relative events, defined in the table, into commands given to the sequence of filters so as to convert the electronic document into a stream of audio or video data, and
- fourth electronic means able to distribute the stream of audio or video data.
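Putting the four "electronic means" together, a minimal wiring sketch of the hand-offs between them; all class names and data shapes below are invented placeholders used only for illustration.

```python
class RequestReceiver:            # first electronic means: receives the request
    def receive(self, request):
        return request["document_id"]

class DocumentAnalyzer:           # second electronic means: builds the session
    def __init__(self, storage):
        self.storage = storage    # stands in for the electronic memorizing means
    def build_session(self, document_id):
        document = self.storage[document_id]
        # ... syntactic check, heading/body analysis, filter sequence, table ...
        return {"document": document, "table": [], "filters": []}

class Renderer:                   # third electronic means: table -> filter commands
    def run(self, session):
        return ["audio/video frame 0", "audio/video frame 1"]

class Distributor:                # fourth electronic means: sends the stream out
    def send(self, stream):
        for chunk in stream:
            print("sending", chunk)

storage = {"menu.smil": "<smil>...</smil>"}
doc_id = RequestReceiver().receive({"document_id": "menu.smil"})
session = DocumentAnalyzer(storage).build_session(doc_id)
Distributor().send(Renderer().run(session))
```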
BRIEF DESCRIPTION OF THE DRAWINGS
These and other characteristics of the present invention will become apparent from the following description of some preferential forms of embodiment, given as a non-restrictive example with reference to the attached drawings wherein:
- fig. 1 shows a flow chart of a method for distributing audio-visual contents according to the state of the art;
- fig. 2 shows a flow chart of the method for converting electronic documents into a stream of audio/video data according to the present invention;
- fig. 3 shows schematically an apparatus for converting electronic documents into a stream of audio/video data according to the present invention.
DETAILED DESCRIPTION OF A PREFERENTIAL FORM OF EMBODIMENT
With reference to the attached drawings, a method 10 and an apparatus 11 according to the present invention can be used to convert into a stream of audio/video data a sequence of one or more electronic documents 12, defined according to SMIL (Synchronized Multimedia Integration Language) or similar language, comprising a heading 13 and a body 14.
According to one form of embodiment, the method 10 according to the present invention has been implemented on an apparatus 11, for example a server, with hardware architecture x86 (with Intel and AMD processors), on Windows and Linux operating systems, but it is clear that it may also be implemented with different architectures and on other operating systems.
According to the preferential form of embodiment of the method 10 according to the present invention, the conversion of an electronic document 12 into a stream of audio/video data is achieved through a session 30 defined by at least four states 31:
- an inactive state, in which the session 30 has been created;
- a processing state, in which an electronic document 12 is processed;
- a reproduction state, in which the audio/video stream is reproduced;
- an end state, in which the session 30 comes to an end. A session 30 also comprises, according to the illustrated form of embodiment, a series of tasks 33, which can substantially be summarized as follows:
- reproduction task 33a, used to reproduce an audio-visual content; the reproduction task 33a substantially consists of a video file or an audio file;
- Stop task 33b, used to terminate another task 33 being performed;
- Clear task 33c, used to terminate any other task 33 being performed;
- Sequence task 33d, used to define a list of tasks to be performed in sequence;
- Parallel task 33e, used to define a list of tasks to be performed simultaneously. The sequence 33d and parallel 33e tasks are also called father tasks; the tasks defined inside the respective lists of said sequence 33d and parallel 33e tasks are called son tasks.
Each of the tasks 33 can be modified by means of events 34 which can be summarized substantially as follows: - Start event 34a, used to begin reproducing a task 33;
- Stop event 34b, used to request the termination of a task 33;
- Immediate Stop event 34c, used to terminate a task 33 immediately, without waiting for the end of its execution, irrespective of other Stop events 34b provided for the specific task to be terminated.
The events 34 associated with each task 33 are divided according to four lists of events 35, summarized as follows (and restated schematically after the list):
- list of the Active Start Events 35a, consisting of the events 34 that the specific task 33 generates during the execution thereof;
- list of the Active End Events 35b, consisting of the events 34 that the specific task generates at the end of the execution thereof;
- list of the Passive Start Events 35c, consisting of the list of the events 34 that the specific task receives before beginning the execution thereof;
- list of the Passive End Events 35d, consisting of the events that require the termination, during execution, of the specific task 33; the specific task terminates its execution when it receives all the Stop events 34b provided, or when it receives an immediate Stop event 34c.
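The sketch below merely restates these definitions as Python data structures. The names follow the reference numerals of the description (33a-33e, 34a-34c, 35a-35d); the field layout and defaults are shorthand of mine, not the patent's implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class TaskKind(Enum):
    REPRODUCTION = "33a"   # reproduces an audio or video file
    STOP = "33b"           # terminates another task
    CLEAR = "33c"          # terminates any other task
    SEQUENCE = "33d"       # son tasks run one after the other
    PARALLEL = "33e"       # son tasks run simultaneously

class EventKind(Enum):
    START = "34a"
    STOP = "34b"
    IMMEDIATE_STOP = "34c"

@dataclass
class Event:
    kind: EventKind
    target: "Task"                 # task whose state the event modifies
    source: Optional["Task"] = None

@dataclass
class Task:
    kind: TaskKind
    name: str
    children: List["Task"] = field(default_factory=list)
    active_start: List[Event] = field(default_factory=list)   # 35a: generated while running
    active_end: List[Event] = field(default_factory=list)     # 35b: generated when it ends
    passive_start: List[Event] = field(default_factory=list)  # 35c: must arrive before it starts
    passive_end: List[Event] = field(default_factory=list)    # 35d: requests that terminate it

root_task = Task(TaskKind.SEQUENCE, "body")           # main task of the session
root_task.children.append(Task(TaskKind.REPRODUCTION, "intro.avi"))
```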
The method 10 according to the present invention comprises the following steps:
- a first step in which, by means of first electronic means 20, the request is acquired for the distribution of audio-visual contents included in an electronic document 12 defined according to SMIL or similar language, and in which a session 30 is generated;
- a second step in which, retrieving the document 12 from electronic memorization means 24, a syntactic analysis of the document 12 is performed by second electronic means 21, in order to verify the validity thereof and hence its conformity to the desired standard;
- a third step in which, always by means of the second electronic means 21, an analysis of the heading 13 of the electronic document 12 is performed in order to define the specific audio or video contents and the relative reproduction parameters; in this third step one or more sequences of filters 15 are also defined, to be used for the reproduction of the specific audio or video contents;
- a fourth step in which, by means of said second electronic means 21, an analysis of the body 14 of the electronic document 12 is performed, so as to define a list of tasks 33 of the session 30 and of events 34, and to define the sequence of filters 15 to be applied in order to convert the audio-visual contents into an audio/video stream; in the fourth step the list of the tasks 33 and of the events 34 admissible for the session 30 is summarized in the form of a table. In this fourth step the body of the electronic document 12 is translated into a sequence task 33d, which represents the main task of the session 30. Furthermore, each element of the body 14 that identifies audio, video, text or timed reproduction contributions and that is separated by specific XML tags according to SMIL language, is translated into a task 33 according to rules described hereafter and according to the hierarchy of the electronic document 12 (a schematic example of this translation follows the list of rules):
- if the element of the electronic document 12 is of the "seq", or sequence, type, it is translated into a sequence task 33 d;
- if the element of the electronic document 12 is of the "par", or parallel, type, it is translated into a parallel task 33e; - if the element of the electronic document 12 is of the "msa:Stop" type, it is translated into a Stop task 33b;
- if the element of the electronic document 12 is of the "msa:Clear" type, it is translated into a Clear task 33;.
- all other elements of the electronic document 12, traceable to elements recognized as "ref", "audio", "video" or "text", are translated into Reproduction tasks 33a, while non-recognized elements are ignored.
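A schematic, literal reading of these translation rules as a recursive walk over the body of the document; the tuple representation of the tasks and the handling of the "msa:" namespace are simplifications made for the example.

```python
import xml.etree.ElementTree as ET

MEDIA_TAGS = {"ref", "audio", "video", "text"}

def local_name(tag):
    # Tags may arrive as "msa:Stop" or "{namespace}Stop"; keep the local part.
    return tag.split("}")[-1].split(":")[-1]

def translate(element):
    """Return (task_kind, src, children) for one body element,
    or None when the element is not recognized (and is ignored)."""
    name = local_name(element.tag)
    if name == "seq":
        kind = "sequence task 33d"
    elif name == "par":
        kind = "parallel task 33e"
    elif name == "Stop":
        kind = "Stop task 33b"
    elif name == "Clear":
        kind = "Clear task 33c"
    elif name in MEDIA_TAGS:
        kind = "reproduction task 33a"
    else:
        return None
    children = [t for t in (translate(child) for child in element) if t]
    return (kind, element.get("src"), children)

body = ET.fromstring("""
<body>
  <par endsync="last">
    <video src="background.3gp"/>
    <audio src="music.amr"/>
    <seq>
      <text src="title.txt" dur="3s"/>
      <text src="menu.txt"/>
    </seq>
  </par>
</body>
""")

# The body as a whole becomes the main sequence task of the session.
main_task = ("sequence task 33d", None, [t for t in map(translate, body) if t])
print(main_task)
```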
Subsequently, for each of the tasks 33, still analyzing the body 14 of the electronic document 12, one or more of the following events 34 are defined (a partial sketch of this wiring follows the list of rules):
- a Start event 34a from the list of Active End events 35b, addressed to the following son task 33 of the father task 33, if this is a sequence task 33d;
- a Stop event 34b from the list of active End events 35b, addressed to the father task 33, if this is a Parallel task 33e and has an "endsync" attribute with no value or with a value equal to "last";
- a Start event 34a from the list of Passive Start events 35c addressed respectively from the father task 33, if this is a parallel task 33e, or from the immediately preceding son task, if the father task 33 is a sequence task 33d.
For each of the sequence tasks 33d one or more of the following events 34 are defined:
- a Start event 34a, from the list of Active Start events 35a, intended for the first son task 33 of the specific sequence task 33d;
- an Immediate Stop event 34c from the list of Active End events 35b intended for each son task 33 of the sequence task 33d;
- a Stop event 34b, from the list of Passive End events 35d, intended for the specific sequence task 33d and having as a source the last son task 33 of the specific sequence task 33d.
For each of the parallel tasks 33e one or more of the following events 34 are defined: - a Start event 34a, from the list of active Start events 35a, addressed to each of the tasks 33 comprised in the list of the specific parallel task 33e;
- an Immediate Stop event 34c, from the list of active End events 35b, addressed to each of the son tasks 33 of the specific parallel task 33e;
- an Immediate Stop event 34c, from the list of passive End events 35d addressed to each of the son tasks, if the parallel task 33e has as "endsync" attribute a value equal to "first";
- a Stop event 34b, from the list of passive End events 35d, addressed to each of the son tasks, if the parallel task 33e has as "endsync" attribute a value equal to "last", or not specified.
For each of the Reproduction tasks 33a or Clear tasks 33c one or more of the following events 34 are defined:
- if the attribute "begin" is specified and this contains a SMIL event of the start type or of the timed type, then a Start event 34a is generated from the list of the Passive Start events 35c having as a source the task specified in the SMIL event; for the source task, instead, a start event 34a is generated, from the list of active end events 35b, intended for the task 33 with the "begin" attribute;
- if the attribute "begin" is specified and this contains a SMIL event of the end type, then a Start event 34a is generated from the list of the Passive Start events
35c having as a source the task specified in the SMIL event; for the source task, instead, a start event 34a is generated, from the list of active end events 35b, intended for the task 33 with the "begin" attribute.
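The partial sketch below wires only the sequence-task rules above; parallel tasks and the "begin" attribute are left out, and the dictionary layout with tuple events is illustrative shorthand rather than the patent's data structures.

```python
# The four lists correspond to 35a-35d in the text.
def new_task(name, kind, children=()):
    return {"name": name, "kind": kind, "children": list(children),
            "active_start": [], "active_end": [],
            "passive_start": [], "passive_end": []}

def wire_sequence(seq):
    children = seq["children"]
    # Start event (34a) in the Active Start list (35a) aimed at the first son.
    seq["active_start"].append(("start", children[0]["name"]))
    for i, child in enumerate(children):
        # Immediate Stop (34c) in the Active End list (35b) for every son.
        seq["active_end"].append(("immediate_stop", child["name"]))
        if i + 1 < len(children):
            # On ending, each son starts the following son (34a in its 35b list).
            child["active_end"].append(("start", children[i + 1]["name"]))
        if i > 0:
            # Each son after the first waits for its predecessor (34a in 35c).
            child["passive_start"].append(("start_from", children[i - 1]["name"]))
    # The sequence ends when its last son signals a Stop (34b in 35d).
    seq["passive_end"].append(("stop_from", children[-1]["name"]))

intro = new_task("intro.avi", "reproduction")
menu = new_task("menu.avi", "reproduction")
main = new_task("body", "sequence", [intro, menu])
wire_sequence(main)
print(main["active_start"])    # [('start', 'intro.avi')]
print(intro["active_end"])     # [('start', 'menu.avi')]
```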
The method 10 also comprises a fifth step in which, by means of third electronic means 22, the session is made to transit in a state of reproduction and the audio-visual contents are processed through the sequence of filters, producing an audio/video stream.
In a sixth step of the method 10, by means of fourth electronic means 24, the audio/video stream is reproduced and said table is dynamically updated, erasing the tasks 33 and the relative events 34 that have been performed, until the end of the reproduction has been reached, in the absence of requests for modification of the reproduction of the audio/video stream.
In the sixth step the performance of the tasks 33, and the recursive management of the events 34, follow the rules summarized hereafter:
- an initial task 33 is performed, identified amongst those that have no event 34 in the list of Passive Start events 35c;
- when a task 33 generates an event of the Start type 34a, intended for a specific task 33, then the corresponding Start event 34a is eliminated from the list of Passive Start events 35c of the specific destination task 33; furthermore, if the list of Passive Start events 35c of the destination task 33 is not empty, the management of the Start event 34a is interrupted; furthermore, if the specific task 33 is a reproduction task 33a or a Clear task 33c and the sequence of filters 15, to which the specific task 33 has to be connected in order to generate the stream of output data, is currently occupied by another task 33, then an Immediate Stop event 34c is generated towards the task 33 that is occupying the sequence of filters 15; the Immediate Stop event 34c is generated recursively until the specific sequence of filters 15 is freed; furthermore, if the specific task 33 is a reproduction task 33a, a specific sequence of filters 15 is generated for its reproduction; finally, all the events 34 contained in the list of Active Start events 35a of the specific task 33 are managed recursively;
- when a task 33 generates a Stop event 34b, intended for a specific task 33, the corresponding event 34 is eliminated from the list of Passive End events 35d of the specific destination task 33; furthermore, if the list of Passive End events 35d of the specific destination task 33 is not empty, the management of the Stop event 34b is interrupted; furthermore, if the specific task 33 is a reproduction task 33a, it is interrupted and the sequence of filters 15 currently being used is freed; finally, all the events 34 contained in the list of Active End events 35b of the specific task 33 are managed in a recursive manner;
- when a task 33 generates an Immediate Stop event 34c, intended for a specific task 33, if the latter is a reproduction task 33a, then it is interrupted and the sequence of filters 15, currently engaged by the destination task 33, is freed; furthermore, all the events 34 contained in the list of Active End events 35b of the specific destination task 33 are managed in a recursive manner;
- the timed activation events 34 contained in the list of Active Start events 35a of a specific task 33 are activated at the reproduction time indicated by the specific timed event 34;
- at the end of the performance of a specific task 33, all the events of the list of Active End events 35b of the specific task 33 are generated;
- during the performance of a task 33, all the events of the list of Active End events 35b of the specific task 33 are generated at the specific instant of reproduction indicated in the SMIL attribute "dur" or in the SMIL attribute "end".
In this sixth step, each request to modify the reproduction, generated for example by a client that requests the presentation of audio-visual contents comprised in a series of electronic documents 12 not inserted in the original sequence, is managed by commuting the session 30 to said processing state. Thanks to this, the new sequence can be interconnected to the original sequence without needing to stop the reproduction of the original sequence itself, and in any case without the user perceiving the overlapping of the new sequence over the original one. The method 10 then starts again from the second step, which leads to a new processing of the modified sequence of electronic documents 12 and to an update of the table, erasing the tasks 33 and events 34 that are no longer useful and inserting new tasks 33 and events 34.
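By way of non-limiting illustration, the recursive management of the events 34 summarized in the rules above, including the way the sequence of filters 15 is taken over when a new reproduction task 33a must replace the one currently engaging it, can be sketched as follows; the tasks 33 are modelled as plain dictionaries, a single shared variable stands for the sequence of filters 15, timed activation is omitted, and every name is hypothetical.

```python
# Standalone, non-limiting sketch of the recursive event handling of the
# sixth step: tasks 33 are plain dictionaries, the sequence of filters 15 is
# modelled as a single shared resource, and all names are illustrative.
from typing import Dict, Optional

def make_task(name: str, kind: str) -> Dict:
    # kind: "play" (33a), "clear" (33c), "seq" (33d), "par" (33e)
    return {"name": name, "kind": kind,
            "active_start": [], "active_end": [],      # lists 35a, 35b
            "passive_start": [], "passive_end": []}    # lists 35c, 35d

filters_owner: Optional[Dict] = None   # task currently engaging the sequence of filters 15


def handle_start(task: Dict, source: Optional[Dict]) -> None:
    global filters_owner
    # Eliminate the corresponding entry from the Passive Start list 35c.
    task["passive_start"] = [e for e in task["passive_start"] if e["source"] is not source]
    if task["passive_start"]:                       # still waiting for other Start events 34a
        return
    if (task["kind"] in ("play", "clear")
            and filters_owner is not None and filters_owner is not task):
        handle_immediate_stop(filters_owner)        # free the filters 15 recursively
    if task["kind"] == "play":
        filters_owner = task                        # a sequence of filters 15 is engaged
    for e in list(task["active_start"]):            # list 35a, managed recursively
        dispatch(e, task)


def handle_stop(task: Dict, source: Optional[Dict]) -> None:
    global filters_owner
    task["passive_end"] = [e for e in task["passive_end"] if e["source"] is not source]
    if task["passive_end"]:                         # still waiting for other Stop events 34b
        return
    if task["kind"] == "play" and filters_owner is task:
        filters_owner = None                        # the filters 15 are freed
    for e in list(task["active_end"]):              # list 35b, managed recursively
        dispatch(e, task)


def handle_immediate_stop(task: Dict) -> None:
    global filters_owner
    if task["kind"] == "play" and filters_owner is task:
        filters_owner = None
    for e in list(task["active_end"]):
        dispatch(e, task)


def dispatch(event: Dict, source: Optional[Dict]) -> None:
    if event["kind"] == "start":
        handle_start(event["target"], source)
    elif event["kind"] == "stop":
        handle_stop(event["target"], source)
    else:                                           # "immediate_stop" 34c
        handle_immediate_stop(event["target"])
```

Under the first rule above, a reproduction would be started by dispatching a Start event 34a towards an initial task whose Passive Start list 35c is empty.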
If, for example, during the sixth reproduction step of an electronic document 12 comprising an audio contribution a1 and a video contribution v1, the presentation of a new electronic document 12 is requested, comprising an audio contribution a2 and a text contribution t1, the method 10 interrupts, by means of the events 34 provided for this purpose, only the reproduction of the contributions that differ in the new electronic document 12. In particular, the method returns to the step of analyzing the new document, updating the content of the table with the new tasks 33 and the relative events 34, and activating a reproduction that leaves the reproduction of the video contribution v1 unchanged, interconnecting into the stream of outgoing data also the audio a2 and text t1 contents, without perceptible interruption in the playback of the video contribution v1.
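By way of non-limiting illustration, the choice of which contributions to interrupt, start or leave untouched when the sequence is modified during reproduction can be reduced to a comparison between the contributions of the current document and those of the requested document 12, as in the following toy snippet (variable names are illustrative).

```python
# Non-limiting sketch: the contributions of the document currently being
# reproduced are compared with those of the newly requested document 12 to
# decide what must be interrupted, started or left untouched.
current = {"audio": "a1", "video": "v1"}        # contributions a1, v1 being reproduced
requested = {"audio": "a2", "text": "t1"}       # contributions a2, t1 of the new document

stop = {k: v for k, v in current.items()
        if k in requested and requested[k] != v}            # audio a1 is interrupted
start = {k: v for k, v in requested.items()
         if current.get(k) != v}                            # audio a2 and text t1 are started
unchanged = {k: v for k, v in current.items()
             if k not in requested}                         # video v1 keeps playing

print(stop, start, unchanged)
# {'audio': 'a1'} {'audio': 'a2', 'text': 't1'} {'video': 'v1'}
```

In this simplified model, only the entries in stop and start would give rise to new events 34 in the table, while the unchanged video contribution continues to feed the stream of outgoing data.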
It is clear that modifications and/or additions may be made to the method 10 and to the apparatus 11 according to the present invention, without departing from the field and scope of the present invention.
It is also clear that, although the present invention has been described with reference to specific examples, a person of skill in the art shall certainly be able to achieve many other equivalent forms of device and method for converting electronic documents defined according to SMIL or other similar language into a stream of audio/video data, all coming within the field of protection of the present invention.

Claims

1. Method for converting a sequence of electronic documents (12), defined according to SMIL (Synchronized Multimedia Integration Language) or similar language and consisting of at least a heading (13) and a body (14), into a stream of data that can be reproduced on at least a client terminal, consisting of one or more audio streams and/or one or more video streams, comprising audio-visual contents which can be simultaneously reproduced, characterized in that said stream of data, not predefined at the beginning of said reproduction, is modified and/or integrated during said reproduction, without actual and/or perceptible interruption, increasing and/or modifying said sequence of electronic documents with one or more sequences deriving from a request from said client terminal.
2. Method as in claim 1, characterized in that it comprises at least a first step in which a request is generated for a distribution of a presentation of audio-visual contents comprised in said electronic document (12), and in that said presentation of audio-visual contents is structured according to a session (30) comprising a plurality of functioning states (31) and a plurality of tasks (33), able to regulate the reproduction modes of said audio-visual contents, said tasks (33) being activated by a specific event (34) and generating a further specific event (34) at the end.
3. Method as in claim 2, characterized in that it also comprises a second step in which, by analyzing the heading (13) of said electronic document (12), said audio-video contents and the relative reproduction parameters are identified and a specific sequence of filters (15) is generated, able to process said audio-video contents and in which, by analyzing the body (14) of said electronic document (12), a table is generated the content of which defines the set of transitions admissible for said session (30), in the form of a list of said tasks (33) and of relative events (34).
4. Method as in claim 3, characterized in that it also comprises a third step in which said tasks (33) are activated, suitable for the generation of an audio stream or of a video stream, according to said list of tasks (33) and relative events (34) contained in the table, processing said audio-video contents by means of a sequence of processing filters (15).
5. Method as in any claim hereinbefore, characterized in that said request for distribution of audio-visual contents is made by a requesting entity, called client, to a distributing entity, called server.
6. Method as in claim 4, characterized in that said tasks (33) comprise a reproduction task able to reproduce an audio or video contribution, a Stop task able to terminate any other task being performed, a Sequential task that defines a set of tasks (33) to be performed according to a pre-established order and a parallel task that defines a set of tasks (33) to be performed simultaneously.
7. Method as in claim 6, characterized in that said events (34) comprise a Start event able to begin the reproduction of a specific task, a Stop event able to request the termination of a specific task (33) and an Immediate Stop event able to request the immediate termination of a specific task (33).
8. Method as in any claim hereinbefore, characterized in that the audio stream or the video stream is distributed according to Internet protocols RTP, RTSP or protocols dedicated to real time distribution.
9. Method as in any claim hereinbefore, characterized in that the audio stream or the video stream is distributed in local mode in order to generate an audio file or a video file.
10. Method as in any claim hereinbefore, characterized in that the sequence of filters (15) comprises at least a filter suitable for operations to extract audio data or video data or audio/video data from an audio file or from a video file, in that it also comprises an audio or video mixer filter suitable for synchronizing operations of two or more audio or video streams and in that it also comprises a transmission filter of audio or video streams by means of Internet protocols RTP or RTSP.
11. Method as in claim 10, characterized in that said sequence of filters (15) comprises at least a filter suitable for receiving audio or video streams through Internet protocols RTP or RTSP.
12. Method as in claim 10, characterized in that said sequence of filters (15) comprises at least a filter suitable for compressing and decompressing audio and video streams.
13. Method as in claim 10, characterized in that said sequence of filters (15) comprises at least a filter suitable for converting audio or video streams into audio or video streams in a different format.
14. Method as in claim 10, characterized in that said sequence of filters (15) comprises at least a filter suitable for transforming a series of images encoded according to common electronic formats into a video stream.
15. Apparatus for converting a sequence of electronic documents (12), defined according to said SMIL (Synchronized Multimedia Integration Language) or similar language and consisting of at least a heading (13) and a body (14), into a stream of data, not pre-defined at the beginning of the reproduction, consisting of one or more audio streams and/or one or more video streams, comprising audiovisual contents that can be reproduced simultaneously, characterized in that it comprises electronic means able to modify said stream of data, with no interruption during reproduction, by means of increasing and/or modifying the sequence of said electronic documents (12).
16. Apparatus as in claim 15, characterized in that it comprises at least first electronic means (20) able to receive a request for distribution of said audio-visual contents contained in said document (12) and in that it comprises second electronic means (21) able to retrieve said document (12) from electronic memorizing means (24), to analyze the heading (13) of said document (12), to generate a session (30) representing said presentation, to generate said sequence of filters (15), and to generate a table that defines the states (31) of transition of the presentation of audio-visual contents and the relative tasks (33) together with the connected events (34).
17. Apparatus as in claim 16, characterized in that it comprises third electronic means (22) able to make said session (30) transit in a state (31) of reproduction transforming said tasks (33) and the relative events (34) defined in said table into commands given to said sequence of filters (15) so as to convert said electronic document (12) by means of said sequence of filters (15) into an audio stream and/or a video stream.
18. Apparatus as in claim 17, characterized in that it comprises fourth electronic means (23) able to distribute said audio stream and/or said video stream.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ITUD2007A000095 2007-05-30
ITUD20070095 ITUD20070095A1 (en) 2007-05-30 2007-05-30 PROCEDURE TO CONVERT A SEQUENCE OF ELECTRONIC DOCUMENTS AND ITS APPARATUS

Publications (2)

Publication Number Publication Date
WO2008145679A2 true WO2008145679A2 (en) 2008-12-04
WO2008145679A3 WO2008145679A3 (en) 2009-02-12

Family

ID=38477356

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/056570 WO2008145679A2 (en) 2007-05-30 2008-05-28 Method to convert a sequence of electronic documents and relative apparatus

Country Status (2)

Country Link
IT (1) ITUD20070095A1 (en)
WO (1) WO2008145679A2 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006114412A1 (en) * 2005-04-27 2006-11-02 International Business Machines Corporation Web based unified communication system and method, and web communication manager
WO2006114413A1 (en) * 2005-04-27 2006-11-02 International Business Machines Corporation System, method and engine for playing smil based multimedia contents

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Beilu Shao et al., "SMIL to MPEG-4 BIFS Conversion", Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS '06), Second International Conference on, IEEE, December 2006 (2006-12), pages 77-84, XP031033709, ISBN 0-7695-2625-X *
Yoshimura T. et al., "Content Delivery Network Architecture for Mobile Streaming Service Enabled by SMIL Modification", IEICE Transactions on Communications, Communications Society, Tokyo, JP, vol. E86-B, no. 6, June 2003 (2003-06), pages 1778-1787, XP008036766, ISSN 0916-8516 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008539627A (en) * 2005-04-27 2008-11-13 インターナショナル・ビジネス・マシーンズ・コーポレーション Web-based integrated communication system and method, and web communication manager
US8565267B2 (en) 2005-04-27 2013-10-22 International Business Machines Corporation Web based unified communication system and method, and web communication manager
WO2010027397A3 (en) * 2008-09-05 2010-07-01 Thomson Licensing Method and system for dynamic play list modification
JP2012502351A (en) * 2008-09-05 2012-01-26 トムソン ライセンシング Method and system for dynamically changing play list
US9355076B2 (en) 2008-09-05 2016-05-31 Thomson Licensing Method and system for dynamic play list modification
EP2469851A4 (en) * 2009-10-27 2014-09-10 Zte Corp System and method for generating interactive voice and video response menu

Also Published As

Publication number Publication date
ITUD20070095A1 (en) 2008-11-30
WO2008145679A3 (en) 2009-02-12

Legal Events

121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 08760163; Country of ref document: EP; Kind code of ref document: A2)
NENP Non-entry into the national phase in: Ref country code: DE
122 Ep: pct application non-entry in european phase (Ref document number: 08760163; Country of ref document: EP; Kind code of ref document: A2)