DE102005045628B3 - Apparatus and method for determining a location in a film having film information applied in a temporal sequence - Google Patents

Apparatus and method for determining a location in a film having film information applied in a temporal sequence

Info

Publication number
DE102005045628B3
DE102005045628B3 DE200510045628 DE102005045628A DE102005045628B3 DE 102005045628 B3 DE102005045628 B3 DE 102005045628B3 DE 200510045628 DE200510045628 DE 200510045628 DE 102005045628 A DE102005045628 A DE 102005045628A DE 102005045628 B3 DE102005045628 B3 DE 102005045628B3
Authority
DE
Germany
Prior art keywords
film
signal
time
device
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
DE200510045628
Other languages
German (de)
Inventor
Michael Dipl.-Ing. Beckinger
Thomas Dr.-Ing. Sporer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Original Assignee
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to DE102005028978.9
Priority to DE102005028978
Application filed by Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority to DE200510045628 (patent DE102005045628B3)
Priority claimed from CN 200680024917 (patent CN101218648B)
Application granted
Publication of DE102005045628B3
Application status is Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G03 PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03B APPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B31/00 Associated working of cameras or projectors with sound-recording or sound-reproducing means
    • G03B31/04 Associated working of cameras or projectors with sound-recording or sound-reproducing means in which sound track is not on, but is synchronised with, a moving-picture film
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel

Abstract

A device for determining a location in a film (110) that has film information (112, 114) applied in a temporal sequence comprises a memory (320) for storing a reference fingerprint representation of the film information (112, 114). The fingerprint representation is formed such that its time course depends on the time course of the film information, and a time scale is assigned to the stored reference fingerprint representation. The device further comprises means (340) for receiving a portion read from the film (110), means (350) for extracting a test fingerprint representation from the read portion, and means (360) for comparing the test fingerprint representation with the reference fingerprint representation in order to determine the location in the film (110) on the basis of the comparison and the time scale.

Description

  • The present invention relates to a device and a method for determining a location in a film that has film information applied in a temporal sequence, for example in order to synchronize film events with a picture playback.
  • Audio-video data are stored on data carriers, e.g. film or tape, or transmitted over channels, e.g. broadcast or telephone, in a fixed format that does not allow an extension to novel audio formats or to other sound-synchronous or image-synchronous additional services, such as subtitles. When new audio formats are introduced, for example, new data carriers or film copies carrying the new audio formats therefore have to be produced.
  • Fig. 8 shows an exemplary film 110. Film information is applied to the film in a spatial sequence, and accordingly in a temporal sequence during playback, for example video information or images 112, also referred to as "frames" or "video frames", and audio information in the form of one or more analog or digital audio tracks 114, which in the digital case have "audio frames". The film 110 also has, by way of example, feed perforations 116, with the help of which the film is transported during playback.
  • In principle, two methods are known for synchronizing additional services.
  • The first method involves storing a time code on the data carrier, as e.g. in DTS (DTS = Digital Theater System) for cinema sound, or in an additional channel that is linked to the audio signal. Examples are ancillary data in DAB and mp3. The time code is then used to play additional information synchronously from an external data carrier, in the case of DTS e.g. a CD. The disadvantage of this method, however, is that every additional format needs further space on the data carrier or transmission channel, which under certain circumstances is no longer available. In the case of film, this space is taken up e.g. by the tracks for analog sound, Dolby Digital, DTS and SDDS (SDDS = Sony Dynamic Digital Sound). Proprietary formats prevent the time code of one extension from being used by other extensions. Mutual interference between extensions cannot always be prevented; an example of this is the use of ancillary data in mp3 for additional information and for bandwidth extension by different manufacturers.
  • The second method is based on the misuse of analog sound tracks for storing time code, as is done e.g. in a prototype cinema equipped with an IOSONO system. The disadvantage of this method, however, is that the analog track is present in all systems and is often used as a fallback in case the other systems fail; misusing the analog track therefore removes the fallback option. The automatic switchover to the analog track, which is installed in most cinemas, causes the time code to be played as an analog signal when no signal is present on the "more modern" tracks for Dolby Digital or DTS. In the prototype cinema, the redundant analog reproduction must therefore be switched off manually during pure wave field synthesis playback, which is explained below, because otherwise the time code can be heard over the redundant additional speakers.
  • Acoustic wave field synthesis, WFS for short, goes beyond the surround approaches of the Dolby, SDDS or DTS formats. WFS attempts to reproduce, over an entire room, the air vibrations that make up the sound of a real situation. In contrast to conventional reproduction over two or more speakers, in which the representation of the position of the original sound sources is limited to a line between the speakers, wave field synthesis is intended to transfer the entire sound field faithfully to the room. This means that virtual sound sources can be localized exactly in space and may even appear to exist in the middle of the room, so that listeners can walk around them. Systems with up to 200 speakers in cinema systems and up to 900 speakers in theater sound systems have already been realized.
  • Wave field synthesis is based on the Huygens principle, which states that every point on a wavefront can be regarded as the starting point of an elementary spherical wave. The interference of all elementary waves creates a new wavefront that is identical to the original wave.
  • Such a sound system has been developed by the Fraunhofer Institute for Digital Media Technology under the name IOSONO and is in use in the Ilmenau cinema.
  • The Ilmenau cinema, in which wave field synthesis is operated in two modes, is therefore given as a practical example.
  • In the first mode, the cinema is operated as a "real" wave field synthesis system, in which the analog track of the 35 mm film stores the time code, as explained previously with regard to the second, "misuse" method, and the WFS sound is fed in from an external medium, e.g. a hard disk or DVD.
  • In the second mode, "compatible playback", the sound stored on each 35 mm film is read and decoded by a Dolby processor (DTS or SDDS could equally be used), the Dolby processor switching automatically to the analog track if necessary, and the resulting multichannel signal is mapped via WFS onto virtual speakers.
  • Since different signal paths are necessary for the two modes, the signal coming from the read head for the analog signal has to be split, which entails additional technical effort.
  • In summary, today's film reels have no space left for a further synchronization track, for example for external sound systems or subtitling systems. All cinema sound systems available so far, analog and digital, obtain their sound either directly from one or more soundtracks on the film reel or by way of a manufacturer-specific time code signal on the film reel. This means that for both known approaches, as explained above, new copies of the films have to be created, mostly at substantial cost. Audio formats such as Dolby Digital and SDDS do make modern audio experiences possible, but they still carry no time codes for synchronizing, for example, subtitles or other language versions of the film.
  • Frank Jordan and Jesper Dannow therefore propose, in their publication "Generating Timecode Information from Analog Sources", 118th Convention of the Audio Engineering Society, 28 to 31 May 2005, Barcelona, Spain, Convention Paper 6473, to generate a time code on the basis of the analog audio track. The publication describes a system called "Soundtitles" that is connected to the analog soundtrack of the projector. Time information, or a time code, is determined by cross-correlation between a processed digital copy of the soundtrack and the analog signal of the film projector. The "Soundtitles" system consists of three components. The core module, the "Sync Tracker", generates the time code signal. The second module, the "Sync Player", creates subtitles, which are projected with, for example, a projector. The third module, the "Clip Player", plays synchronized audio clips that are transmitted to the cinema audience over wireless headphones.
  • A disadvantage of the prior art described above is that the synchronization and time determination within the film, as described in the publication, is limited to a search window of, for example, one minute. Especially in the initial phase of the film, it is difficult to define or determine the right window for a successful synchronization. If the portion read or scanned from the film does not lie within the section of the stored film information used for synchronization, the synchronization fails or a wrong synchronization results. The moviegoer then hears no sound, or the wrong sound, for the film.
  • DE 103 22 722 A1 describes an apparatus for synchronizing an audio signal with a film having frames, each frame carrying an exposed time code. The apparatus comprises means for capturing the time codes exposed for the sequence of frames in order to obtain a detected sequence of time codes. A time code generator is provided which is configured to generate a sequence of synthesis time codes starting from a start value. A decoder is further provided to decode a time code of the detected sequence of time codes in order to provide the start value for the time code generator. A detected time code and a corresponding synthesis time code are compared and, when a phase deviation above a deviation threshold has been detected, the synthesis time code for that frame is manipulated by changing its length in time. This synthesis time code is then provided to an audio processor configured to time-control the samples of the audio signal associated with that frame in response to detection of the synthesis time code for a frame.
  • US 2005/0022252 A1 deals with the detection, processing and indexing of multimedia data using known image processing methods. A synchronous control of audiovisual and textual media is described. For this purpose, so-called "tags" are generated as metadata, stored and compared with references.
  • The publication by Kashino, Kunio; Kurozumi, Takayuki; Murase, Hiroshi: A Quick Search Method for Audio and Video Signals Based on Histogram Pruning. IEEE Transactions on Multimedia, Vol. 3, Sept. 2003, pp. 348-357 describes a quick search method based on similarity-based signal searching to detect and locate a specific audio or video signal in a stored long audio or video signal. The key to speeding up the process is an effective pruning algorithm that is introduced in the feature comparison stage by means of feature histograms. A histogram here denotes a frequency distribution of feature vectors over a window, and the window length of the considered section of the long signal corresponds to the duration of the searched short signal.
  • US 2004/0073916 A1 relates to the so-called "monitoring" of media, such as audio and audiovisual content, in order to obtain objective data regarding the use of specific media content recordings within broadcast audio and audiovisual content. Methods are described for using content identification technologies to obtain monitoring data for the examined transmissions efficiently and automatically.
  • WO 94/1644 A1 describes a method and devices for eliminating television commercials. For this purpose, features of video and audio signals are extracted within a sampling window.
  • The object of the present invention is to provide an efficient concept for identifying a location in a film.
  • This object is achieved by a device for determining a location in a film according to claim 1, a method for determining a location in a film according to claim 5 and a computer program according to claim 6.
  • The present invention is based on the recognition that each location in a film generally has film information specific to that location, so that in a feature extraction different locations of a film exhibit different, specific feature values. In other words, different locations in a film have different "fingerprints". These fingerprints can in turn be used to find a location in a film.
  • Therefore, according to the invention, a device for determining a location in a film having film information applied in a temporal sequence is provided, comprising: a memory for storing a reference fingerprint representation (FAD) of the film information, the fingerprint representation being designed such that its temporal course depends on the temporal course of the film information, and a time scale being assigned to the stored reference fingerprint representation; means for receiving a portion read from the film; means for extracting a test fingerprint representation from the read portion; and means for comparing the test fingerprint representation with the reference fingerprint representation in order to determine the location in the film on the basis of the comparison and the time scale.
  • The apparatus and method for determining a location in a film make it possible to determine any location in a film at any time without having to prepare or change the film itself. The relevant time information, the time scale, is stored together with a stored version of the film. The film is stored in the form of a reference fingerprint representation, which corresponds to a feature extraction. This can reduce the required memory space as well as the computing power and/or the time needed to determine the location. With a suitable choice of the fingerprint representation, preferred embodiments have the further advantage of enabling an unambiguous identification of the location.
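  • The interplay of reference fingerprint representation, time scale and comparison can be illustrated with a minimal sketch. It assumes a deliberately simple fingerprint (the mean absolute amplitude per block of samples), a synthetic sound signal, and a read-in section that starts on a block boundary. The names `fingerprint` and `locate` and all numeric values are hypothetical and are not taken from the patent.

```python
import math
import random

def fingerprint(samples, block):
    """Reduce a signal to one feature per block (mean absolute amplitude).

    This feature is an assumption chosen for brevity, not the patent's method.
    """
    return [
        sum(abs(x) for x in samples[i:i + block]) / block
        for i in range(0, len(samples) - block + 1, block)
    ]

def locate(test_fp, ref_fp, time_scale):
    """Slide the test fingerprint over the reference fingerprint and return
    the time-scale value of the offset with the smallest squared distance."""
    best_offset, best_cost = 0, math.inf
    for offset in range(len(ref_fp) - len(test_fp) + 1):
        cost = sum((ref_fp[offset + k] - t) ** 2 for k, t in enumerate(test_fp))
        if cost < best_cost:
            best_offset, best_cost = offset, cost
    return time_scale[best_offset]

# Synthetic "film sound": 60 s at 1000 samples/s, loudness changes per block.
rng = random.Random(0)
rate, block = 1000, 100
reference = []
for _ in range(60 * rate // block):
    gain = rng.random()
    reference.extend(gain * rng.uniform(-1.0, 1.0) for _ in range(block))

ref_fp = fingerprint(reference, block)
# Time scale: one time stamp (in seconds) per fingerprint value.
time_scale = [i * block / rate for i in range(len(ref_fp))]

# A 2 s section read from the film, starting at 23.4 s, with slight noise.
start = 23400
section = [x + rng.gauss(0.0, 0.01) for x in reference[start:start + 2000]]
position = locate(fingerprint(section, block), ref_fp, time_scale)
print(f"section starts at {position:.1f} s")
```

  • Because only one value per block is stored and compared, the search works on 600 fingerprint values rather than 60,000 samples, which mirrors the memory and computing savings mentioned above. A real system would need a feature that tolerates sections not aligned to block boundaries.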
  • The apparatus and method for determining a location in a film may be used, for example, in a device for generating a control signal for a film event system that synchronizes film events with a picture playback. Examples of film events are audio, subtitles and special effects, where special effects are e.g. air currents, shaking of the cinema seats, odors or lighting effects on the side and rear walls. With regard to the audio event, both different languages, such as simultaneous playback of the original version and translations into other languages, and various audio techniques are possible, e.g. the synchronization of digital surround systems such as wave field synthesis. The device and the method for determining a location serve in particular for synchronization in the initial phase of the film, but also provide a higher tolerance towards, for example, jumps in the middle of the film, and thus permit optimal synchronization or determination of a location in a film even under adverse circumstances.
  • Although the examples described above and below speak of a moviegoer or a movie, the invention is not limited to films for a cinema audience but generally relates to films or audio-video signals, regardless of whether the film information is stored on film or on other media and storage media, such as magnetic tapes or hard drives. In addition, the invention can also be used for pure sound systems without video or, for example by means of a video ID, for synchronizing pure video material, i.e. without sound, with arbitrary events.
  • Preferred embodiments of the present invention are explained in detail below with reference to the accompanying drawings. The drawings show:
  • Fig. 1 a schematic block diagram of a preferred embodiment of an apparatus for generating a control signal for a film event system;
  • Fig. 2a a schematic block diagram of an embodiment of an apparatus for performing a correlation;
  • Fig. 2b a schematic block diagram of a preferred embodiment of an apparatus for performing a correlation;
  • Fig. 2c.1 an exemplary section of a film;
  • Fig. 2c.2 an exemplary course of a sound signal of the portion of the film represented in Fig. 2c.1 at a variable, first playback speed and a constant test sample rate;
  • Fig. 2c.3 an exemplary course of a sound signal of the portion of the film represented in Fig. 2c.1 at a variable, second playback speed and a constant test sample rate;
  • Fig. 2c.4 an exemplary course of a sound signal of the portion of the film represented in Fig. 2c.1 at a variable, third playback speed and a constant test sample rate;
  • Fig. 2d.1 two exemplary sections of a film;
  • Fig. 2d.2 an exemplary course of a reference sound signal of the film;
  • Fig. 2d.3 an exemplary course of a test sound signal, based on a first playback speed and a constant test sample rate, for a portion of the film;
  • Fig. 2d.4 an exemplary first correlation result from the correlation of the reference sound signal according to Fig. 2d.2 and the test sound signal according to Fig. 2d.3;
  • Fig. 2d.5 two exemplary sections of a film according to Fig. 2d.1;
  • Fig. 2d.6 an exemplary course of a reference sound signal of the film according to Fig. 2d.2;
  • Fig. 2d.7 an exemplary course of a test sound signal, based on a second playback speed and a constant test sample rate, for a portion of the film;
  • Fig. 2d.8 an exemplary second correlation result from the correlation of the reference sound signal according to Fig. 2d.6 and the test sound signal according to Fig. 2d.7;
  • Fig. 3a a schematic block diagram of a preferred embodiment of an apparatus for determining a location in a film by means of a fingerprint representation;
  • Fig. 3b.1 two sections of a film;
  • Fig. 3b.2 an exemplary course of the reference sound signal for the two sections according to Fig. 3b.1;
  • Fig. 4 a schematic block diagram of a preferred embodiment of an apparatus for determining a location in a film by means of a coarse and a subsequent fine determination of the location;
  • Fig. 5a a schematic block diagram of a preferred embodiment of an apparatus for generating a control signal for a film event system;
  • Fig. 5b.1 two sections of a film;
  • Fig. 5b.2 an exemplary course of a reference sound signal for a first portion of the film;
  • Fig. 5b.3 an exemplary course of a test sound signal for a second portion of the film;
  • Fig. 5b.4 an exemplary correlation result from the correlation of the reference sound signal according to Fig. 5b.2 and the test sound signal according to Fig. 5b.3;
  • Fig. 6a a schematic block diagram of an exemplary film presentation system with a device for generating a control signal for a film event system and a film event system;
  • Fig. 6b a schematic block diagram of an exemplary film presentation system with a device for generating a control signal with an exemplary audio film event system;
  • Fig. 7 a schematic representation of an exemplary assignment of a time scale to film information;
  • Fig. 8 a schematic representation of an exemplary film with applied film information.
  • In the following description of the invention and the preferred embodiments, the same reference numerals are used for identical or similar elements.
  • The invention is explained in more detail below with reference to exemplary embodiments that use the sound signal applied to the film as film information. However, this is not intended to limit the invention but serves for illustration only.
  • Fig. 1 shows a schematic block diagram of an apparatus for generating a control signal for a film event system, together with an exemplary film 110 as explained previously with regard to Fig. 8. The apparatus for generating a control signal comprises means 120 for storing the film information, means 140 for receiving a portion read from the film, means 160 for comparing the read portion with the stored film information 112, 114, and means 180 for determining the control signal on the basis of the comparison and the time scale.
  • The stored film information 112, 114 includes, for example, the sound or audio signals, the images or video signals, or marks that are already found on films and that determine, for example, where the aperture opens, from when sound is played, or when the film ends. The stored audio and/or video signals are present, for example, in digitized form, preferably in compressed form, in order to reduce the memory requirement.
  • One advantage of digitized storage lies in the simple and, in particular, error-free duplication of the stored image of the film information.
  • Unlike in traditional systems, the film remains unchanged, as described above; a stored image of the film information is generated only once, e.g. during production of the film.
  • When the film is played by means of a film player, for example a film projector, the sound signal contained on the soundtrack 114 is, for example, received by the means 140 for receiving, sampled at a given sampling rate, and passed to the means 160 as a portion of a given length or number of samples.
  • The means 160 for comparing is configured to compare this portion read from the film with the stored film information. It may be designed to compare the read portion with the entire stored film information, but preferably compares the read portion with only a section of the stored film information in order to minimize the computational effort. The comparison can be carried out, for example, by cross-correlation, but also by calculating a difference or, for example, by calculating a compressed hash sum and searching for it in a database. The comparison may be based on the audio signal alone, on the video signal alone, on both the audio signal and the video signal, or on a combination with an evaluation of the aforementioned features. On the basis of the result of the comparison by the means 160 and of the time scale, the means 180 determines the control signal 190. The control signal 190 controls a film event system which, on the basis of the control signal 190, generates for example WFS audio signals or subtitles time-synchronously with the film 110 being played. The apparatus for generating a control signal, and in particular the means 180 for determining the control signal, may be designed such that the control signal is in any time code format, proprietary or standardized, such as the LTC time code format (LTC = Longitudinal Time Code) standardized by the SMPTE (Society of Motion Picture and Television Engineers).
  • Time-synchronous means that, on the basis of the control signal 190, the film event system generates, for the location of the film currently being played, which is associated with a time on the time scale of the stored film information, the simultaneous event that corresponds to that time on the time scale.
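  • The chain from comparison to control signal can be sketched as follows. This is a minimal illustration under stated assumptions: a synthetic noise-like reference, a brute-force cross-correlation, a time scale that simply maps sample index to seconds, and a control signal reduced to an HH:MM:SS:FF time address at an assumed 24 frames per second. The function names `best_lag` and `to_timecode` are hypothetical. An actual LTC signal is a modulated audio signal carrying such addresses; the sketch only formats the address.

```python
import random

def best_lag(reference, section):
    """Return the lag (in samples) with the highest cross-correlation value."""
    n = len(section)
    best, best_score = 0, float("-inf")
    for lag in range(len(reference) - n + 1):
        score = sum(reference[lag + k] * section[k] for k in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best

def to_timecode(seconds, fps=24):
    """Format a time as an HH:MM:SS:FF address (fps=24 is an assumption)."""
    total_frames = round(seconds * fps)
    secs, frames = divmod(total_frames, fps)
    return f"{secs // 3600:02d}:{secs // 60 % 60:02d}:{secs % 60:02d}:{frames:02d}"

# Hypothetical stored reference: 20 s of noise-like sound at 200 samples/s.
rng = random.Random(1)
rate = 200
reference = [rng.uniform(-1.0, 1.0) for _ in range(20 * rate)]

# Portion read from the film: 1 s starting 12.5 s into the reference.
section = reference[2500:2700]

lag = best_lag(reference, section)
film_time = lag / rate            # time scale: sample index -> seconds
control_signal = to_timecode(film_time)
print(control_signal)
```

  • Restricting the range of lags searched in `best_lag` corresponds to comparing the read portion with only a section of the stored film information, which reduces the computational effort as described above.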
  • Unlike in the illustrated embodiment, any film player may be used instead of the film projector, and any film format, for example silent films (e.g. on the basis of the video information), films with analog or digital soundtracks, or films with one or several parallel soundtracks. Instead of a film, any other storage medium may also be used, such as cassettes or hard disks, whose format cannot or should not be changed, for example in order to remain compatible with the player, but with which other film events are nevertheless to be synchronized.
  • In a preferred embodiment, the audio signal is used as film information for synchronization. The portion read from the film is sampled at a given sampling rate, hereinafter referred to as the test sampling rate, to produce a test sound signal; the stored film information is stored in digital form and is hereinafter referred to as the reference sound signal; and the test sound signal and the reference sound signal are compared by cross-correlation in the means 160 for comparing.
  • In one embodiment, the test sampling rate and the reference sampling rate are fixed, i.e. constant. The means 160 for comparing may then be configured, for example, to generate a first correlation result at a first time on the basis of a first test sound signal and a first reference sound signal in order to determine a first time on the time scale, and to generate a second correlation result at a second time on the basis of a second test sound signal and a second reference sound signal in order to determine a second time on the time scale, and to determine from these, for example, a time difference or playback speed, or a speed difference relative to a desired or reference playback speed. On this basis, the means 180 for determining the control signal determines the control signal, for example in order to synchronize the film event system.
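  • How a playback speed follows from two correlation results can be shown with a short calculation. The sketch assumes that each correlation yields one pair of values: the time at which it was evaluated and the time on the time scale that it found. The function name `playback_speed` and the example values are hypothetical.

```python
def playback_speed(t_first, film_first, t_second, film_second):
    """Ratio of film time elapsed to playback time elapsed between two fixes.

    t_*    : times at which the correlations were evaluated (seconds)
    film_* : times on the time scale found by the correlations (seconds)
    """
    return (film_second - film_first) / (t_second - t_first)

# Hypothetical fixes: 10 s of playback covered 10.4 s of film.
speed = playback_speed(100.0, 250.0, 110.0, 260.4)
deviation_percent = (speed - 1.0) * 100.0
print(f"relative speed {speed:.3f}, deviation {deviation_percent:+.1f} %")
```

  • A value above 1 means the film runs faster than the reference playback speed, so the film event system would have to advance its own playback accordingly.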
  • A disadvantage of constant sampling rates, however, is that the correlation result deteriorates when the test playback speed changes, so that the determination of the time or of the location in the film becomes less accurate and the synchronization becomes worse. This disadvantage can be reduced by varying the sampling rates, i.e. the test sampling rate and/or the reference sampling rate.
  • Fig. 2a shows a schematic block diagram of an apparatus for performing a correlation between a test sound signal that can be played at a variable playback speed and a reference sound signal that is a digitally stored version of the test sound signal. The apparatus for performing a correlation comprises means 210 for determining a measure of a test playback speed, means 230 for varying a test sampling rate or a reference sampling rate, and means 250 for comparing. The means 230 is designed to vary a test sampling rate at which the test sound signal 270 is sampled in order to produce a modified test sound signal 272, or to vary a reference sampling rate in order to produce a modified reference sound signal 276 on the basis of a reference sound signal 274. The means 230 for varying is further configured to vary the test sampling rate or the reference sampling rate such that a deviation is reduced between a test playback speed associated with the test sound signal and a reference playback speed associated with the modified reference sound signal 276, or between a test playback speed associated with the modified test sound signal 272 and a reference playback speed associated with the reference sound signal 274, or between a test playback speed associated with the modified test sound signal 272 and a reference playback speed associated with the modified reference sound signal 276. The term playback speed and the problem of a variable playback speed are explained in more detail below.
  • The means 250 for comparing is designed to compare the modified test sound signal 272 with the reference sound signal 274, or the test sound signal 270 with the modified reference sound signal 276, or the modified test sound signal 272 with the modified reference sound signal 276, in order to determine a result 278 of the correlation.
  • The apparatus shown in Fig. 2a can be used, for example, as the means 160 for comparing in an apparatus for generating a control signal for a film event system such as that shown in Fig. 1.
  • Fig. 2b shows a schematic block diagram of a preferred embodiment of an apparatus for performing a correlation between a test sound signal and a reference sound signal.
  • Fig. 2b shows means 280 for storing a reference sound signal 274, which is a digital version of the test sound signal 270, the reference sound signal 274 having been generated once on the basis of a given memory reference playback speed and a memory reference sampling rate.
  • The sound signal is played back at a variable test playback speed and sampled at a test sampling rate to produce the test sound signal 270.
  • On the basis of the measure of the test playback speed, the means 210 for determining the measure of the test playback speed of the test sound signal 270 controls the means 230 for varying. The means 230 for varying in turn controls a reference sample rate converter 232 and a variable sampler 234. The sample rate converter 232 is designed to convert the reference sound signal, which is based on the memory reference playback speed and a memory reference sampling rate, into a modified reference sound signal 276 that corresponds to a reference sound signal based on a different memory reference playback speed and/or memory reference sampling rate. The variable sampler 234 is designed to sample the test sound signal at a varied sampling rate, i.e. one that deviates from the standard or basic sampling rate, in order to produce a modified test sound signal 272.
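  • The effect of a reference sample rate converter can be illustrated with a minimal sketch. It assumes that the measure of the test playback speed is already known (here 4 % too fast), uses plain linear interpolation as the rate conversion, and compares signals at lag zero only. The functions `sound`, `resample` and `correlation` and all values are hypothetical illustrations, not the converter of the patent.

```python
import math

def sound(t):
    """Hypothetical smooth sound signal as a function of film time t (s)."""
    return math.sin(2 * math.pi * 3.0 * t) + 0.5 * math.sin(2 * math.pi * 7.3 * t + 1.0)

def resample(signal, ratio, n_out):
    """Sample rate conversion by linear interpolation: output sample i is
    read at input position i * ratio."""
    out = []
    for i in range(n_out):
        pos = i * ratio
        j = int(pos)
        frac = pos - j
        out.append(signal[j] * (1.0 - frac) + signal[j + 1] * frac)
    return out

def correlation(a, b):
    """Normalized correlation coefficient at lag zero."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den

rate = 1000            # samples per second, same for test and stored reference
speed = 1.04           # measured test playback speed relative to the reference
n = 5 * rate           # 5 s of test samples

# Stored reference: recorded at reference speed (slightly longer than needed).
reference = [sound(i / rate) for i in range(n + 300)]
# Test signal: film runs 4 % fast, sampler keeps its constant rate.
test = [sound(speed * i / rate) for i in range(n)]

plain = correlation(test, reference[:n])
matched = correlation(test, resample(reference, speed, n))
print(f"unmodified reference: {plain:.2f}, rate-converted reference: {matched:.2f}")
```

  • With the unmodified reference the two signals drift apart over the window and the correlation collapses, whereas the rate-converted reference stays aligned sample for sample. Varying the test sampling rate instead would achieve the same alignment from the other side.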
  • Deviating from 2 B For example, the device for performing a correlation may also be designed such that the test tone signal 270 always via the variable scanner 234 the device 250 for comparison, wherein the variable sampler 234 , is then configured such that one of the variable test sample rates corresponds to the standard or basic sample rate, and is further configured to receive the reference audio signal 274 always via the reference sample rate converter 232 the device 250 for comparison, the reference sample rate converter 232 is designed such that it with appropriate control by the device 230 for varying the reference sound signal 274 unmodified to the facility 250 to pass on to others.
  • In the 2 B selected representation of the separate supply of the test sound signal 270 opposite to the modified test tone signal 272 and the reference sound signal relative to the modified reference sound signal 276 to the device 250 to compare, serves to represent the alternative execution options or implementation options.
  • For example, in one embodiment where the device is 250 configured for comparison, the modified test tone signal 272 with the unmodified reference tone signal 274 compare, no reference sample rate converter 232 necessary or instructs the device to perform a correlation according to 2 B no reference sample rate converter 232 on. Likewise, a facility has 250 for comparison, which is formed, the unmodified test sound signal 270 with the modified reference sound signal 246 compare, no variable sampler 234 on.
  • In another embodiment, the device 280 for storing is a device for storing film information, the stored film information being associated with a time scale, and the test sound signal 270 is, for example, a film sound signal. The device for performing a correlation according to FIG. 2b can then be used, for example, as the device 160 for comparing according to FIG. 1.
  • FIG. 2c.1 shows a section of an exemplary film 110 with a soundtrack 114, as previously described with reference to FIG. 1. Two locations on the film 110 are marked in FIG. 2c.1: a first location, hereinafter referred to as location L_1, and a second location, hereinafter referred to as location L_2. The two locations L_1 and L_2 define a section of the film 110 with a length of ΔL = L_2 - L_1.
  • FIG. 2c.2 shows an exemplary course of the test sound signal assigned to the section between locations L_1 and L_2 described in FIG. 2c.1. The time at which location L_1 of the film is played back is referred to as time T_1, and the time at which location L_2 is played back as time T_2. The duration ΔT = T_2 - T_1 depends on the length of the section and on the playback speed v of the film: ΔT = ΔL / v, or T_2 - T_1 = (L_2 - L_1) / v.
  • When the test sound signal is sampled at the sampling rate f = 1 / Δt, where Δt is the sampling period and ΔT = n · Δt, the test sound signal is represented as a sequence of n + 1 samples, as shown by way of example in FIG. 2c.2 with n = 10.
  • When the film is played back at a playback speed v and sampled at a sampling rate f = 1 / Δt, the film section between L_1 and L_2, or between T_1 and T_2, is divided into n time segments and represented by n + 1 samples, where n = ΔL / (Δt · v), or n = ΔL · f / v. That is, the number of sampling periods or samples for a given film section ΔL is proportional to the sampling rate f, i.e. inversely proportional to the sampling period Δt, and inversely proportional to the playback speed v. In other words, for a section of constant length ΔL, the quotient "f / v", or equivalently the product "Δt · v", must be constant if n, and thus the number of samples n + 1, is to remain constant. If, under this condition, the first sample is also the same, the individual samples are equal as well.
  • Accordingly, when the stored film information or the reference sound signal is generated at a memory sampling rate f_memory and a memory playback speed v_memory, the stored section of the film information or of the sound signal is represented and stored as, for example, n_memory + 1 reference samples.
  • To illustrate this, FIGS. 2c.2 to 2c.4 show exemplary samplings or storages of the film section between location L_1 and location L_2 for a constant sampling rate f, i.e. a constant sampling period Δt, and a variable playback speed. FIG. 2c.2 shows an exemplary sampling or storage for a first playback speed v_1, FIG. 2c.3 shows a sampling or storage of the same film section at a second playback speed v_2, and FIG. 2c.4 shows a sampling of the same film section at a third playback speed v_3. In this example, v_1 is half as large as v_2 and twice as large as v_3:
    v_1 = v_2 / 2 and v_1 = 2 · v_3.
  • All three sound signals shown in FIGS. 2c.2 to 2c.4 have the same first sample at location L_1, i.e. at the corresponding time T_1. Accordingly, as shown by way of example in FIGS. 2c.2 to 2c.4, the stored film information or the reference sound signal is represented in FIG. 2c.2 by n_1 + 1 = 11 samples, the same film section is represented in FIG. 2c.3 by n_2 + 1 = 6 samples, and in FIG. 2c.4 by n_3 + 1 = 21 samples.
  • As can be seen from FIGS. 2c.2 to 2c.4, at a constant sampling rate an increase in the playback speed v corresponds to a temporal compression of the audio signal: doubling the playback speed v_1 of FIG. 2c.2 leads, as shown in FIG. 2c.3, to a halving of T_2 - T_1 and of n. Conversely, a reduction of the playback speed v corresponds to a temporal stretching of the audio signal: halving the playback speed v_1 of FIG. 2c.2 leads, as shown in FIG. 2c.4, to a doubling of T_2 - T_1 and of n.
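  • The relation n = ΔL · f / v can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sample_count` and the numeric values (a 0.5 m section, 20 samples per second, v_1 = 1 m/s) are assumptions chosen so that n = 10, as in the figures; they do not come from the patent.

```python
def sample_count(section_length, sampling_rate, playback_speed):
    """Return n + 1, the number of samples that represent a film section.

    n = delta_L * f / v is the number of sampling periods in the section.
    """
    n = section_length * sampling_rate / playback_speed
    return round(n) + 1


# Hypothetical values: delta_L = 0.5 m, f = 20 samples/s, v_1 = 1 m/s.
delta_l, f, v1 = 0.5, 20.0, 1.0

# Same section at v_1, v_2 = 2 * v_1 and v_3 = v_1 / 2, constant sampling rate.
counts = [sample_count(delta_l, f, v) for v in (v1, 2 * v1, v1 / 2)]
print(counts)  # [11, 6, 21]

# Doubling f together with v keeps f / v, and therefore n + 1, constant.
print(sample_count(delta_l, 2 * f, 2 * v1))  # 11
```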
  • FIGS. 2d.1 and 2d.2 essentially correspond to FIGS. 2c.1 and 2c.2. Compared with FIG. 2c.1, FIG. 2d.1 shows two additional locations that define a search section or search window with respect to the film and the film information applied to it. The first location of the search window is denoted L_0 and the second location L_3. The section between location L_0 and location L_3 is larger than the section defined by locations L_1 and L_2, i.e. ΔL_window > ΔL with ΔL_window = L_3 - L_0 and ΔL = L_2 - L_1. Accordingly, FIG. 2d.2 shows, in addition to FIG. 2c.2, the time T_0 associated with location L_0 and the time T_3 associated with location L_3, each based on a given playback speed.
  • Applied to the generation of the stored film information or the reference sound signal and the additionally stored time scale, this means that, for example, T_0 is the time on the time scale assigned to location L_0, T_1 the time assigned to location L_1, T_2 the time assigned to location L_2, and T_3 the time assigned to location L_3 on the film.
  • FIG. 2d.3 corresponds to FIG. 2c.2.
  • In the following, the basic course of a comparison of two signals by means of correlation, and the problem that a variable playback speed poses for such a comparison, are explained by way of example with reference to FIGS. 2d.2 to 2d.4.
  • FIG. 2d.3 represents film information currently read from the film, i.e. the test sound signal 270, and FIG. 2d.2 represents stored film information, i.e. a reference sound signal. FIGS. 2d.2 and 2d.3 show the optimal case: the memory playback speed and the memory sampling rate with which the reference sound signal was generated coincide with the playback speed and the sampling rate of the test sound signal, and thus the quotient of memory sampling rate f_memory and memory playback speed v_memory coincides with the quotient of the sampling rate f and the playback speed v of the test sound signal. In this case, the section of the reference sound signal defined by T_1 and T_2 matches the test sound signal representing the section between T_1 and T_2 exactly, or more precisely their sample sequences match, and the correlation yields a clear local maximum or correlation peak, as shown by way of example in FIG. 2d.4.
  • The position of the peak in turn gives the time shift of the test sound signal relative to the reference sound signal or the search window. On this basis, the current time on the stored time scale can then be determined.
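  • A minimal sketch of this peak search, assuming the optimal case described above (equal f / v for both signals). The function names, the normalization of the correlation, and the parameters `t0` (the time T_0 of the first sample of the search window) and `dt` (the sampling period of the reference) are illustrative assumptions, not the patent's implementation.

```python
import math


def normalized_correlation(window, segment):
    """Correlate `segment` against every position (lag) inside `window`.

    Each value lies in [-1, 1]; 1.0 means an exact match up to a
    positive scale factor.
    """
    m = len(segment)
    seg_energy = sum(s * s for s in segment)
    result = []
    for lag in range(len(window) - m + 1):
        part = window[lag:lag + m]
        part_energy = sum(p * p for p in part)
        dot = sum(p * s for p, s in zip(part, segment))
        norm = math.sqrt(part_energy * seg_energy)
        result.append(dot / norm if norm > 0.0 else 0.0)
    return result


def locate(window, segment, t0, dt):
    """Return (lag, time) of the correlation peak.

    The lag is the shift of the test segment relative to the start of
    the search window; the time is read off the stored time scale.
    """
    corr = normalized_correlation(window, segment)
    lag = max(range(len(corr)), key=lambda i: corr[i])
    return lag, t0 + lag * dt
```

  • For a test segment that starts 37 samples into the search window, `locate(window, segment, t0, dt)` returns the lag 37 and the time `t0 + 37 * dt`.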
  • In contrast to FIGS. 2d.1 to 2d.4, FIGS. 2d.5 to 2d.8 show an example in which the playback speed of the test sound signal, shown in FIG. 2d.7, is reduced compared with the playback speed underlying the signal shown in FIG. 2d.2.
  • FIG. 2d.5 corresponds to FIG. 2d.1, and FIG. 2d.6 corresponds to FIG. 2d.2; that is, FIG. 2d.6 shows an example of a reference sound signal based on a memory sampling rate f_memory and a memory playback speed v_memory. FIG. 2d.7 shows an exemplary course of the test sound signal based on a test sampling rate f that is unchanged relative to FIG. 2d.3 or FIG. 2d.6, but on a changed, reduced playback speed v' of the test sound signal.
  • For a given time interval ΔT, this means that at the reduced speed v' only a shorter section of the film, of length ΔL' = v' · ΔT, is played back in the same interval. After the period ΔT, the currently playing film has therefore only reached a location L'_2 that lies before location L_2, as shown in FIG. 2d.5. With respect to the reference sound signal and its associated time scale, location L'_2 is assigned the time T'_2 of the time scale, as shown in FIG. 2d.7.
  • For the individual samples of the test sound signal, this means the following: the "spatial" course of the test sound signal is given by the soundtrack of the film and is invariable, so at a lower playback speed v' a sampling period Δt corresponds to a spatial sampling step Δl' that is smaller than Δl. As a result, as shown in FIG. 2d.7 compared with FIG. 2d.6, the samples of the test sound signal "wander" to the left along the "spatial" waveform.
  • In the opposite case, where the changed playback speed v' is greater than the memory playback speed v_memory, the opposite occurs: in the same sampling period Δt a larger spatial section Δl is played back, so that the samples of the test sound signal "wander" to the right along the "spatial" waveform.
  • At an altered playback speed, regardless of whether it is higher or lower than the memory playback speed, the result of the comparison is therefore degraded, since even under otherwise optimal conditions the test sound signal and the reference sound signal represent two different spatial sections of the film. The result of the comparison becomes worse the more the test playback speed deviates from the memory playback speed. In a comparison by means of correlation, the magnitude of the local maximum or peak decreases, and the maximum itself becomes, for example, wider and flatter, so that the determination of the time on the time scale becomes increasingly inaccurate until it is no longer possible.
  • Under real conditions, the playback speed of the test sound signal varies not only between different film playback devices, for example, but can also vary during a film. Exact readjustment is therefore essential to ensure synchronicity throughout an entire film.
  • The device for performing a correlation therefore varies the sampling rate of the test sound signal or the sampling rate of the reference sound signal in order to minimize the adverse effect of a variable playback speed of the test sound signal described above. It does so in accordance with the condition stated earlier: the quotient of sampling rate and playback speed must be equal for the test sound signal and the reference sound signal if both are to represent the same film section with the same samples.
  • For a digital reference sound signal previously generated at a memory sampling rate, the change in playback speed is brought about by sample rate conversion, the stored reference sound signal 274 being interpolated accordingly, for example, to generate a reference sound signal at the sampling rate corresponding to the changed playback speed.
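  • The following sketch illustrates such a conversion. It uses plain linear interpolation as a deliberately simple stand-in; the patent only states that the stored signal is interpolated, and a practical converter would normally use band-limited interpolation. The function name `resample_linear` and the parameter `ratio` (new sampling rate divided by the memory sampling rate) are assumptions for illustration.

```python
def resample_linear(signal, ratio):
    """Resample `signal` so that it spans `ratio` times as many sampling periods.

    ratio = f_new / f_old. Values between stored samples are linearly
    interpolated; the first and last sample of the section are preserved.
    """
    if len(signal) < 2 or ratio <= 0:
        raise ValueError("need at least two samples and a positive ratio")
    periods_in = len(signal) - 1
    periods_out = max(1, round(periods_in * ratio))
    out = []
    for k in range(periods_out + 1):
        pos = k * periods_in / periods_out   # position in units of input samples
        i = min(int(pos), periods_in - 1)    # index of the left neighbour
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
    return out


# An 11-sample section (n = 10) converted to half and to double the rate.
section = [float(i) for i in range(11)]
print(len(resample_linear(section, 0.5)))  # 6
print(len(resample_linear(section, 2.0)))  # 21
```

  • Halving the rate reproduces the 6-sample representation of the doubled playback speed, and doubling the rate the 21-sample representation of the halved playback speed, since only the quotient f / v matters.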
  • FIGS. 2d.1 to 2d.8 illustrate simplified examples in which, for the sake of clarity, it has been assumed that the memory playback speed v_memory corresponds to a normal or standard playback speed of a playback device for generating a test sound signal. As discussed above, however, it is the quotient of sampling rate f and playback speed v that must be the same for the reference sound signal and the test sound signal in order to represent the same section of the film with the same samples. For example, a doubled playback speed can also be used when generating the reference sound signal, provided the sampling rate is doubled at the same time.
  • In an embodiment according to FIG. 2b, the device 210 for determining can determine a measure of a test playback speed based on the result 278 of the correlation.
  • One possibility is to use a single correlation result to determine a measure of the playback speed, for example by comparing the amplitude of a peak with a predetermined threshold in order to determine whether the deviation between the playback speed of a test sound signal and that of a reference sound signal lies within a predetermined range.
  • In a preferred embodiment, at least two different reference sound signals, based on different reference sampling rates and/or different reference playback speeds, are compared with the test sound signal. From the results of the correlation, the most similar reference sound signal is determined, for example by means of a quality assessment that is explained in more detail with reference to FIG. 5, and thus, based on its known sampling rate and known memory playback speed, a measure of the playback speed of the test sound signal. The different reference sound signals can be formed and compared with the test sound signal either one after the other or simultaneously.
  • A particularly preferred embodiment of the device for performing a correlation generates three reference sound signals based on different reference sampling rates. The middle of the three sampling rates is the reference sampling rate of the reference sound signal that, in a previous comparison, had the best quality or the closest match with the test sound signal; the two other reference sound signals have a reference sampling rate that is higher and lower, respectively, than that of the middle reference sound signal. This is controlled by the device 230 for varying on the basis of an output signal of the device 210 for determining the measure of the test playback speed. This ensures that the reference sampling rate or the reference playback speed of the reference sound signal tracks the playback speed or sampling rate of the test sound signal.
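  • One comparison round with three candidate rates might look as follows. This is a sketch under stated assumptions: linear interpolation stands in for the sample rate converter, the maximum of a normalized correlation stands in for the quality measure, and the names `best_of_three`, `centre_ratio` and `step` are hypothetical.

```python
import math


def resample_linear(signal, ratio):
    """Linear-interpolation stand-in for the sample rate converter."""
    periods_in = len(signal) - 1
    periods_out = max(1, round(periods_in * ratio))
    out = []
    for k in range(periods_out + 1):
        pos = k * periods_in / periods_out
        i = min(int(pos), periods_in - 1)
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
    return out


def peak_correlation(window, segment):
    """Highest normalized correlation of `segment` over all lags in `window`."""
    m = len(segment)
    seg_energy = sum(s * s for s in segment)
    best = 0.0
    for lag in range(len(window) - m + 1):
        part = window[lag:lag + m]
        norm = math.sqrt(sum(p * p for p in part) * seg_energy)
        if norm > 0.0:
            best = max(best, sum(p * s for p, s in zip(part, segment)) / norm)
    return best


def best_of_three(reference, test, centre_ratio, step):
    """Compare the test signal with the reference at three sampling rates.

    The rates are centre_ratio - step, centre_ratio and centre_ratio + step,
    relative to the memory sampling rate. Returns the best ratio and all
    three scores; the best ratio becomes the centre of the next round.
    """
    ratios = (centre_ratio - step, centre_ratio, centre_ratio + step)
    scores = [peak_correlation(resample_linear(reference, r), test) for r in ratios]
    best = max(range(3), key=lambda i: scores[i])
    return ratios[best], scores
```

  • Because f / v must match, the winning ratio is itself a measure of the test playback speed relative to the memory playback speed.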
  • FIG. 3a shows an exemplary film, as in FIG. 8, and a basic block diagram of a device for determining a location in the film.
  • The exemplary embodiment of the device for determining a location in a film shown in FIG. 3a can be used, for example, in a device for generating a control signal for a film event system, as shown, for example, in FIG. 1, as the device 180 for determining the control signal.
  • The device for determining a location in a film has a memory 320 for storing a reference fingerprint representation of the film information, the fingerprint representation being designed such that its time course depends on the time course of the film information, and the stored reference fingerprint representation being assigned a time scale; a device 340 for receiving a section read from the film; a device 350 for extracting a test fingerprint representation from the read-in section; and a device 360 for comparing the test fingerprint representation with the reference fingerprint representation in order to determine the location in the film based on the comparison and the time scale.
  • In a preferred embodiment, the fingerprint representation is a representation in the form of a spectral flatness, a time course of the fingerprint representation comprising a time course of the spectral flatness.
  • FIG. 3b.1 shows an exemplary film 110, as shown in FIG. 8. Here, for example, when the film is played at a given playback speed, location L_100 of the film corresponds to time T_100 of the time scale, location L_103 to time T_103, location L_113 to time T_113, and location L_116 to time T_116.
  • In the step of generating the reference fingerprint representation of the film information, in one embodiment a fingerprint is determined for particular spatial or temporal sections of the film.
  • FIG. 3b.2 shows, for example, a first section, which extends from location L_100 to location L_113 or from time T_100 to time T_113, and a second section, which extends from location L_103 to location L_116 or from time T_103 to time T_116. For each of these sections, a fingerprint associated with the section is generated, for example by spectral analysis, Fourier transformation, or other feature extraction methods. In a particularly preferred embodiment, the fingerprint comprises the spectral flatness γ_x^2, which is calculated from the variation of the power density spectrum. The value of the spectral flatness is determined for each section, so that the time course of the film information, for example of the sound signal, yields a sequence of spectral flatness values, which is stored in the memory 320 together with the assigned time scale.
  • The sampling rate, the length or duration of a section, and the distance between two consecutive sections are chosen according to the requirements, for example regarding the uniqueness or accuracy of the determination of the location in the film. In general, the longer the section, the more distinct the feature; the higher the sampling rate and/or the smaller the distance between two sections, the more precisely the location in the film can be determined. The higher the sampling rate, the longer the sections, and the smaller the distances between sections, the higher the memory required for the reference signal and the demands on the computing power of the signal processing.
  • A significant advantage of the fingerprint representation in the form of the spectral flatness is its small storage requirement compared with, for example, storing the complete power density spectrum for the same section. A course or sequence of spectral flatness values is preferably used as the fingerprint for a section.
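  • A sketch of such a flatness sequence. The patent gives no formula at this point, so the code assumes the common definition of spectral flatness as the geometric mean of the power spectrum divided by its arithmetic mean. The naive DFT, the small constant `eps`, and the names `frame_len` and `hop` (section length and distance between section starts, in samples) are illustrative assumptions.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of DFT bins 0..N/2 of a real-valued frame (naive O(N^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean / arithmetic mean of the power spectrum, in (0, 1].

    Values near 1 indicate a noise-like (flat) spectrum, values near 0
    a tonal one. `eps` avoids log(0) for empty bins.
    """
    power = [p + eps for p in power_spectrum(frame)]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def flatness_sequence(signal, frame_len, hop):
    """One flatness value per section; sections start every `hop` samples."""
    return [spectral_flatness(signal[start:start + frame_len])
            for start in range(0, len(signal) - frame_len + 1, hop)]
```

  • With `hop` smaller than `frame_len` the sections overlap, as the sections L_100 to L_113 and L_103 to L_116 do. Each section is reduced to a single number, which is what keeps the storage requirement small.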
  • FIG. 4a shows an exemplary film 110, as in FIG. 8, and a device for determining a location in a film having film information applied in a temporal sequence.
  • The exemplary embodiment of the device for determining a location in a film shown in FIG. 4a can be used, for example, in a device for generating a control signal for a film event system, as shown, for example, in FIG. 1, as the device 180 for determining the control signal.
  • The device for determining a location has a memory 420 for storing film information applied to a film in a temporal sequence, the stored film information being assigned a time scale; a device 440 for receiving a section read from the film; and a synchronization device 460. The synchronization device 460 is designed to compare a sequence of samples of the read-in section based on a first sampling rate with a first search window of the stored film information to obtain a coarse result, and to compare a sequence of samples of the read-in section based on a second sampling rate with a second search window of the stored film information to obtain a fine result that indicates the location in the film. The position of the second search window in the stored film information depends on the coarse result, the first search window is longer in time than the second search window, and the first sampling rate is lower than the second sampling rate.
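  • The two-stage search can be sketched as follows. The sketch assumes that the low first sampling rate is obtained by simply keeping every `factor`-th sample (without the anti-aliasing filter a real system would need) and that a normalized correlation is used for both comparisons; `factor` and `margin` are hypothetical parameters.

```python
import math


def best_lag(window, segment):
    """Lag at which the normalized correlation of `segment` in `window` peaks."""
    m = len(segment)
    seg_energy = sum(s * s for s in segment)
    best, best_value = 0, -2.0
    for lag in range(len(window) - m + 1):
        part = window[lag:lag + m]
        norm = math.sqrt(sum(p * p for p in part) * seg_energy)
        value = sum(p * s for p, s in zip(part, segment)) / norm if norm > 0.0 else 0.0
        if value > best_value:
            best, best_value = lag, value
    return best


def coarse_to_fine(reference, segment, factor, margin):
    """Return the lag of `segment` in `reference`, in full-rate samples.

    Coarse pass: low sampling rate, search window spans the whole reference.
    Fine pass: full sampling rate, short window placed around the coarse result.
    """
    coarse = best_lag(reference[::factor], segment[::factor]) * factor
    start = max(0, coarse - margin)
    stop = min(len(reference), coarse + margin + len(segment))
    return start + best_lag(reference[start:stop], segment)
```

  • The coarse pass evaluates far fewer and shorter products than a full-rate search over the long window; the fine pass then restores full resolution, but only over `2 * margin + 1` lags.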
  • FIG. 5a shows an exemplary film 110, as in FIG. 8, and a preferred embodiment of a device for generating a control signal for a film event system. Based on an analog soundtrack applied to the film, the device works with a read-in section of the sound signal, the test sound signal, and a stored digital version of the sound signal, hereinafter referred to as the reference sound signal, which is associated with a time scale. It is designed to determine the control signal by comparing the test sound signal with the reference sound signal and using the time scale.
  • FIG. 5a shows a preferred embodiment of a device for generating a control signal for a film event system, which has a first film sound scanner 542 connected to a first A/D converter 544 (A/D = analog/digital). The first A/D converter 544 is connected to a first feature extractor 552, to a first device 562 for correlation with a first reference sound signal based on a first sampling rate, to a second device 564 for correlation with a second reference sound signal based on a second sampling rate, and to a third device 566 for correlation with a third reference sound signal based on a third sampling rate (the sampling rate is also referred to as the sample rate). An input of the first device 562 for a correlation, an input of the second device 564 for a correlation and an input of the third device 566 for a correlation are connected to an output of a sample rate converter 232 (SRC).
  • An output of the first device 562 for a correlation, an output of the second device 564 for a correlation and an output of the third device 566 for a correlation are connected to an input of a first device 568 for quality assessment. The device 568 for quality assessment is in turn coupled to the sample rate converter 232 and to a device 570 for scanner selection, an output of the device 570 for scanner selection being connected to an input of a timer 582. The timer 582 is in turn connected to the stored soundtrack, i.e. to a device 522 for storing the soundtrack, an output of the device 522 for storing the soundtrack being connected to an input of the sample rate converter 232.
  • An output of the first feature extractor 552 is connected to an input of a device 554 for comparing a feature, which comprises, for example, a feature classifier and a database of features; an output of the device 554 for comparing a feature is connected to an input of the timer 582.
  • An output of the timer 582 is coupled to an input of a device 584 for time code generation, which comprises a time code database or is coupled to one. An output of the device 584 for time code generation is connected to an input of a device 586 for time code smoothing, which is designed to output a time code 592. An output of the device 586 for time code smoothing is further connected to an input of a word clock generator 588, which in turn is designed to output a word clock signal 594.
  • The device for generating a control signal for a film event system optionally further has a second film sound scanner 542', which is connected to a second A/D converter 544'. The second A/D converter 544' is connected to a second feature extractor 552', to a fourth device 562' for correlation with a fourth reference sound signal based on the first sampling rate, to a fifth device 564' for correlation with a fifth reference sound signal based on the second sampling rate, and to a sixth device 566' for correlation with a sixth reference sound signal based on the third sampling rate.
  • An output of the fourth device 562' for a correlation, an output of the fifth device 564' for a correlation and an output of the sixth device 566' for a correlation are connected to an input of a second device 568' for quality assessment. One output of the second device 568' for quality assessment is connected to a device 569 for offset compensation and another output to an input of the sample rate converter 232; the device 569 for offset compensation is in turn connected to the device 570 for scanner selection.
  • The first film sound scanner 542, also referred to as the main scanner, is positioned so that the device for generating a control signal has enough time to synchronize itself. The first film sound scanner 542 thus supplies a signal that leads in time (pre-delay). In addition to the synchronization time, the correlation window width, i.e. the width of the section of the test sound signal, must be allowed for. Using the perforations on the film roll, the time difference for the pre-delay can be set exactly. As a first guideline, three seconds is recommended.
  • The mode of operation of the exemplary embodiment of the device for generating a control signal for a film event system is explained in more detail below. The principle is explained using the test sound signal generated by the first film sound scanner 542 and its signal processing chain. Since the second, optional signal processing chain, which processes the test sound signal generated by the second film sound scanner 542', corresponds to the first, only the device 569 for offset compensation is discussed specifically.
  • The first film sound scanner 542 reads or samples the sound signal from the soundtrack of the film and passes this signal on to the first A/D converter 544. The first A/D converter 544 is designed to generate a digital sound signal, the test sound signal, based on the sampling rate of the first film sound scanner 542 and the playback speed of the film from which the soundtrack or film information is read.
  • On the basis of the test sound signal 270, one or more features are extracted, or a test fingerprint representation is formed. For example, the spectral flatness is used as the feature or fingerprint for the feature extraction or fingerprint representation. The test fingerprint representation is then compared by the device 554 for comparing a feature or fingerprint representation with a reference fingerprint representation. As stated above, the fingerprint representation is designed such that its time course depends on the time course of the film information, and a reference fingerprint representation stored in the device 554 for comparing a feature is associated with a time scale. The device 554 for comparing is designed to determine a location in the film, or to generate a time code signal 554Z, based on the comparison of the test fingerprint representation with the reference fingerprint representation and on the time scale.
  • Based on the stored reference sound signal 274, the sample rate converter generates the same signal at slightly different sampling rates, i.e. modified reference sound signals, for the correlations to be calculated in parallel. This includes the case in which a modified reference sound signal has the same sampling rate as the original reference sound signal, so that in the further discussion of FIG. 5 the general term reference sound signals is used.
  • In other words, the sample rate converter 232 generates three reference sound signals 276, or modified reference sound signals 276: a first reference sound signal based on a first sampling rate, which is supplied to the first device 562 for a correlation; a second reference sound signal 276 based on a second sampling rate, which is supplied to the second device 564 for a correlation; and a third reference sound signal 276 based on a third sampling rate, which is supplied to the third device 566 for a correlation. The sample rate converter 232 thus supplies the devices 562, 564, 566 for a correlation with signals whose sampling rates differ in small steps, the sampling rate always being set as a function of the previously measured maximum peak-to-noise value from the correlation. One correlation receives a modified reference sound signal with this sampling rate, another receives one with a sampling rate one level lower, and a third receives one with a sampling rate one level higher. This ensures that the sample rate converter can, for example, adjust or synchronize to a change in the speed of the analog sound signal.
  • The device 522 for storing the soundtrack and the sample rate converter 232 are preferably designed to use a window width of 2^n, so that large correlation windows can be calculated by means of the fast Fourier transform (FFT). More than three correlations can also be calculated in parallel to compensate for sudden jumps in the soundtrack. The correlation window is chosen to be large in order to obtain a clear correlation peak. To achieve a recognition accuracy for the correlation peak finer than one sample or sampling period, oversampling of the input signal or test sound signal can be used.
  • Depending on the time code signal 582Z supplied by the timer 582, the device 522 for storing the soundtrack outputs the reference sound signal in the length of the correlation window, the correlation window being the search window in which the test sound signal is searched for.
  • The first device 568 for quality assessment is designed to perform a maximum search in the cross-correlation functions of the signals, or in their magnitudes, and to assess the quality of each cross-correlation as a function of the height of the correlation peak relative to other peaks in the cross-correlation function, or to determine the quality of each correlation based on the peak-to-noise ratio.
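  • One possible quality measure of this kind is sketched below. The patent does not define the peak-to-noise ratio numerically, so the definition used here (peak magnitude divided by the mean magnitude outside a small guard zone around the peak) and the names `peak_to_noise`, `guard` and `best_quality` are assumptions for illustration.

```python
def peak_to_noise(correlation, guard=2):
    """Return (peak_index, ratio) for a list of correlation values.

    ratio = |peak| / mean(|value|) over all positions further than
    `guard` samples away from the peak.
    """
    mags = [abs(c) for c in correlation]
    peak = max(range(len(mags)), key=lambda i: mags[i])
    noise = [m for i, m in enumerate(mags) if abs(i - peak) > guard]
    if not noise:
        return peak, float("inf")
    floor = sum(noise) / len(noise)
    return peak, (mags[peak] / floor if floor > 0.0 else float("inf"))


def best_quality(correlations):
    """Index of the correlation result with the highest peak-to-noise ratio."""
    ratios = [peak_to_noise(c)[1] for c in correlations]
    return max(range(len(ratios)), key=lambda i: ratios[i]), ratios
```

  • A narrow, high peak yields a large ratio, while the wider and flatter maximum caused by a speed mismatch yields a small one, so the ratio can rank the three parallel correlations.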
  • Based on the quality assessment, the reference sound signal with the best quality is determined. From the position of the peak for that reference sound signal, the shift of the peak relative to the search window is determined and output, for example, as a time code difference between the measured and the currently valid time code, or as a relative time code.
  • Depending on the result of the quality assessment, the first device 568 for quality assessment sends a control signal 568A to the sample rate converter 232. This signal distinguishes, for example, only the three values "0", "+1" and "-1". At "0", the sampling rates of the last sample rate conversion or correlation are maintained, because the correlation result for the modified reference sound signal with the middle sampling rate was found to have the highest quality. At "+1", the sampling rates are increased by one level relative to the last sample rate conversion or correlation, because the correlation result for the modified reference sound signal with the highest sampling rate was found to have the highest quality. At "-1", the sampling rates are reduced by one level relative to the previous sample rate conversion or correlation, because the correlation of the test sound signal with the modified reference sound signal with the lowest reference sampling rate gave the best correlation result and the best peak-to-noise ratio.
  • In other words, depending on which sampling rate (first, second, or third) yielded the best correlation peak, the sample rate converter is driven so that its sampling rates are, for example, increased or decreased by a sample rate delta value, or are left unchanged.
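  • The three-valued control signal can be sketched as a small update rule. The function names, the tie handling (the middle rate is kept unless another rate is strictly better) and the numeric rates used in the usage note are assumptions; the patent only specifies the meaning of "0", "+1" and "-1".

```python
def control_signal(qualities):
    """Map the qualities of the (lower, middle, higher) rate correlations to -1, 0 or +1."""
    lower, middle, higher = qualities
    if middle >= lower and middle >= higher:
        return 0                      # keep the current sampling rates
    return 1 if higher > lower else -1


def update_rates(centre_rate, step, qualities):
    """Shift all three reference sampling rates by one level if required.

    `step` is the sample rate delta value between adjacent levels.
    Returns the new (lower, middle, higher) rates.
    """
    centre_rate += control_signal(qualities) * step
    return (centre_rate - step, centre_rate, centre_rate + step)
```

  • For example, with a hypothetical middle rate of 48000, a step of 10 and the best quality at the higher rate, `update_rates(48000, 10, (3, 4, 9))` returns `(48000, 48010, 48020)`, so the previously highest rate becomes the new middle rate.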
  • The correlation thus serves two important purposes. First, it determines the location or the time in the film based on the time code difference from the correlation. Second, it determines the measure of the playback speed in order to find the optimal reference sampling rate or the optimal sample rate conversion of the reference sampling rate. Adapting the sampling rates, i.e. emulating adapted playback speeds, in turn enables better correlation results, which again improves the determination of the time or location in the film and thus the synchronization and the prediction.
  • A preferred embodiment according to FIG. 5 is designed to detect, by means of signal analysis, signal parts with specific characteristics in order to mask them out during synchronization and thus prevent false detections or false synchronization and avoid random fluctuations of the time axis.
  • Such characteristics can be, for example, the loudness of the signal part or how "problematic" a signal is. The signal analysis or detection of problematic parts can be based on the SNR (signal-to-noise ratio), the PNR (peak-to-noise ratio), the spectral power or line density spectrum, the spectral flatness, or an average over a time sequence.
  • For example, if the peak-to-noise ratio falls below a threshold, the time code difference can be treated as invalid. Likewise, if several peaks with a similar peak-to-noise ratio are detected, the time code difference can also be treated as invalid.
  • As a further example, the quality of correlations of quiet signal parts, i.e. signal parts with low amplitude, is lower than that of correlations with loud signals because of the higher quantization noise of the digital samples. Quiet signal parts are therefore masked out, based on thresholds or adaptively, to avoid random fluctuations of the time axis. In addition, the signal energy can serve as a further quality feature.
  • Another example is the hiding of signal parts that are problematic because they recur, in order to avoid ambiguities and thus, for example, false synchronization.
  • Problematic signal parts or sections can also be signaled as metadata, for example, so that those signal parts are hidden independently of the quality of the current correlation.
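  • The hiding of quiet or pre-flagged sections can be sketched as a simple gate in front of the correlation. The RMS threshold, the representation of the metadata as a list of (start, end) time ranges, and the function name are hypothetical choices for this example only.

```python
import math


def usable_for_sync(samples, start_time, min_rms=0.01, masked_ranges=()):
    """Decide whether a signal section should take part in synchronization.

    samples: the section of the test sound signal (floats in [-1, 1]).
    start_time: start of the section on the film time scale, in seconds.
    masked_ranges: (start, end) ranges signaled as problematic via metadata
                   (assumed format, not defined in the source).
    """
    # Sections flagged in the metadata are hidden regardless of their level.
    for lo, hi in masked_ranges:
        if lo <= start_time < hi:
            return False
    if not samples:
        return False
    # Quiet sections are hidden because quantization noise dominates them.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms >= min_rms
```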
  • The device 584 for time code generation is configured to convert the time code signal 582Z of the timer 582, which may be based, for example, on an internal or proprietary time code, into a standardized time code or into a time code signal based on a standardized time code.
  • The timer 582 is controlled by an internal clock (interval or frequency of the correlations), by a coarse audio-ID fingerprint, such as the time code signal 554Z from the feature determination or fingerprint representation, and by the difference determined from the correlation, for example the time code difference signal 570Z of the device 570 for scanner selection. The timer must prioritize these inputs: the correlation signal has the highest priority, followed by the time code from the feature determination, and the internal clock has the lowest priority.
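  • The prioritization in the timer can be illustrated with the following sketch. It assumes, as a simplification, that the correlation difference is applied relative to the last stored time code and that all values are integers in milliseconds; the function and parameter names are hypothetical.

```python
def next_time_code(last_tc, clock_step, corr_difference=None, fingerprint_tc=None):
    """Update the timer from the best available source.

    last_tc: last stored absolute time code (ms).
    clock_step: interval of the internal clock (ms).
    corr_difference: time code difference from the correlation, or None.
    fingerprint_tc: coarse absolute time code from the fingerprint, or None.
    """
    if corr_difference is not None:
        # Highest priority: fine, relative result of the correlation.
        return last_tc + corr_difference
    if fingerprint_tc is not None:
        # Medium priority: coarse absolute time code from feature determination.
        return fingerprint_tc
    # Lowest priority: free-running internal clock.
    return last_tc + clock_step
```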
  • The device 586 for time code smoothing is designed to smooth the time code signal 584Z, for example to avoid a strongly jumping time code or, if time codes from the correlation are missing, to find meaningful intermediate values, for example to bridge pauses in the analog sound. The time code signal 592 generated by the device 586 for time code smoothing is preferably a standard time code with which the film event system is synchronized. However, if the connected audio reproduction system is digital, the time code signal 592 can also be used to generate a corresponding sample clock via a very slowly regulating phase-locked loop (PLL). Such phase-locked loops are available as off-the-shelf devices and are not the subject of this patent.
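  • One simple way to realize such smoothing is to predict each time code from the previous one, fill gaps with the prediction, and limit how far a measured value may pull the output per step. The sketch below shows this under assumed integer time units; the clamping strategy and all names are illustrative and are not the method prescribed by the patent.

```python
def smooth_time_codes(start, raw, nominal_step, max_jump):
    """Smooth a sequence of time codes.

    start: last known good time code before the sequence.
    raw: measured time codes; None marks a missing value.
    nominal_step: expected increment between consecutive time codes.
    max_jump: largest correction accepted per step.
    """
    smoothed = []
    previous = start
    for tc in raw:
        predicted = previous + nominal_step
        if tc is None:
            # No measurement: insert the predicted intermediate value.
            value = predicted
        else:
            # Limit the correction so the output does not jump strongly.
            error = max(-max_jump, min(max_jump, tc - predicted))
            value = predicted + error
        smoothed.append(value)
        previous = value
    return smoothed
```

  • With a nominal step of 40 and a maximum jump of 5, a measured value that lies 10 units ahead of the prediction is followed only partially and the remaining error is corrected in later steps.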
  • Optionally, more than one film scanner, each with a different time offset from the projection lens, can be used to improve robustness against film damage or against sections that are poorly suited for synchronization.
  • A second film sound scanner 542' can then be used, for example, since such a second film sound scanner 542' already exists in conventional cinema systems. Pauses in the analog sound can be bridged by the film sound scanners 542, 542' mounted at different points along the film, because for short pauses in the film sound the probability rises that at least one scanner, the first film sound scanner 542 or the second film sound scanner 542', provides enough signal for a correlation and the associated synchronization.
  • Furthermore, different scanners, e.g. for analog sound, Dolby Digital sound (including decoder), DTS digital sound (including DTS decoder) or another sound format, or a combination of the above, can optionally be used as the source of the reference soundtrack and/or the test soundtrack.
  • Individual tracks can be used for the comparison, with the generated time information being combined by averaging, majority decision or prioritization, either automatically or controlled by metadata; a downmix to mono can also be used.
  • Generally speaking, different scanners for different sound formats and/or different film scanners with different time offsets can be used.
  • Using a downmix to mono has the advantage that, when the mono track is used as the stored soundtrack, less data has to be stored than, for example, when storing five channels.
  • Storing different soundtracks, i.e. more than one soundtrack and therefore no downmix, means that all channels are stored independently. Then, for example, the comparisons or majority decisions explained above are made, and the synchronization is performed using a particular channel of the actual soundtrack and the corresponding channel of the stored soundtrack.
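  • A majority decision over per-channel results could look like the sketch below. It assumes that each channel delivers a time code difference as an integer number of samples (or None if its correlation was rejected) and that a strict majority of the valid channels is required; both assumptions and the function name are illustrative.

```python
from collections import Counter


def majority_time_code_difference(per_channel_diffs):
    """Combine per-channel time code differences by majority decision.

    per_channel_diffs: one value per channel, None for invalid results.
    Returns the value shared by more than half of the valid channels,
    or None if no such majority exists.
    """
    valid = [d for d in per_channel_diffs if d is not None]
    if not valid:
        return None
    value, count = Counter(valid).most_common(1)[0]
    return value if count > len(valid) / 2 else None
```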
  • The initialization phase, or first synchronization, and the resynchronization after an interruption are two critical phases during a film screening or during the synchronization of a film event system.
  • Preferred embodiments therefore compute more than three parallel correlations at the beginning, when no synchronization has yet taken place; that is, more than three reference sound signals of different sampling rates are compared or correlated with the test sound signal in order to determine the correct sampling rate, or playback speed, of the test sound signal as quickly as possible. Alternatively, different sampling rates can be tried in sequence until one of the correlations shows the best signal-to-noise ratio.
  • Alternatively or additionally, the first feature extractor 552 and the device 554 for feature classification, in conjunction with the database, provide a coarse absolute time code value that defines a coarse location in the film, so that a fine determination of the location in the film, or a fine time code determination, can be performed in a second step, for example by the correlation. Once synchronization has been achieved, three correlations, for example, can be used to track changes in the playback speed of the test sound signal during the film screening.
  • The accuracy with which a location in a film, or a time on a time scale (time code) assigned to that location, can be determined depends on the sampling rate of the reference sound signal and the sampling rate of the test sound signal: the higher the sampling rate, the more accurately the location in the film can be determined. However, a lower sampling rate has the advantage that, with the same number of samples, a longer portion of the reference sound signal or the test sound signal can be represented. A preferred embodiment is therefore designed to perform, in a first step, a coarse determination of a location in the film, in which a longer section of the film is represented by a reference sound signal with a lower sampling rate and the test sound signal is likewise obtained by sampling at a lower sampling rate. Based on the coarse location in the film, a reference sound signal with a higher sampling rate and a test sound signal with a higher sampling rate are then used in a second step for a fine determination of the location in the film.
  • In other words, the window length used for the correlation is adapted. At the beginning of the search, windows that are long in time are used, but with a reduced sampling rate of the signals; once the time has been found approximately and only needs to be tracked, short windows can be used, possibly even with oversampling of the signals, in order to achieve a higher temporal accuracy.
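  • The two-step search can be sketched as follows. The example reduces the sampling rate by block averaging, finds a coarse lag on the reduced signals, and then refines the lag at the full rate within a small margin around the coarse result. The block-average decimation, the unnormalized sliding dot product, and all names are assumptions made for brevity; for simplicity the same test section is used in both steps, whereas the text above allows a shorter window in the fine step.

```python
def block_average(signal, factor):
    """Reduce the sampling rate by averaging non-overlapping blocks."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]


def best_lag(reference, test):
    """Lag at which the sliding dot product of test against reference peaks."""
    best, best_score = 0, float("-inf")
    for lag in range(len(reference) - len(test) + 1):
        score = sum(r * t for r, t in zip(reference[lag:], test))
        if score > best_score:
            best, best_score = lag, score
    return best


def coarse_to_fine(reference, test, factor, margin):
    """Locate test inside reference: coarse search first, then refinement."""
    # Step 1: coarse search at the reduced sampling rate.
    coarse = best_lag(block_average(reference, factor),
                      block_average(test, factor)) * factor
    # Step 2: fine search at the full rate, only near the coarse result.
    lo = max(0, coarse - margin)
    hi = min(len(reference), coarse + len(test) + margin)
    return lo + best_lag(reference[lo:hi], test)
```

  • The fine step only evaluates a few lags around the coarse estimate, which is where the saving in computation comes from.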
  • In the initialization phase, for example, a "compatible" playback of the "old" audio format can take place until the exact position has been determined.
  • Likewise, a "compatible" playback of the "old" audio format can take place if the synchronization is clearly lost, until the exact position has been determined again.
  • The device 570 for scanner selection and the device 569 for offset compensation are only necessary in embodiments with more than one film sound scanner. For example, the device 570 for scanner selection decides whether the result, i.e. the time code difference 568Z, of the first device 568 for quality assessment or the result, i.e. the time code difference 568Z', of the second device 568' for quality assessment is forwarded to the timer 582 for determining a location in the film or a time code 582Z. Because the second film sound scanner 542' samples the test sound signal at a different location on the film, the difference (offset) between the location where the first film sound scanner 542 scans the film and the location where the second film sound scanner 542' scans the film is compensated by the device 569 for offset compensation, so that the timer 582 receives the correct time code difference 570Z, relative to the last time or last position of the film stored in the timer, regardless of whether the time code difference 568Z or the time code difference 568Z' is selected.
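  • The interplay of scanner selection and offset compensation can be illustrated as below. The sign convention of the offset, the use of a scalar quality value, and the function name are assumptions for this sketch; the source only states that the offset between the two scanning locations is compensated.

```python
def select_time_code_difference(diff_1, quality_1, diff_2, quality_2, scanner_offset):
    """Choose between the time code differences of two scanners.

    diff_1, diff_2: time code differences (None if rejected as invalid).
    quality_1, quality_2: quality measures, larger is better.
    scanner_offset: film travel time between the two scanner positions,
                    expressed so that adding it to diff_2 refers the value
                    to the position of the first scanner (assumed convention).
    """
    candidates = []
    if diff_1 is not None:
        candidates.append((quality_1, diff_1))
    if diff_2 is not None:
        # Offset compensation for the second scanner.
        candidates.append((quality_2, diff_2 + scanner_offset))
    if not candidates:
        return None
    # Forward the difference that came with the better quality.
    return max(candidates)[1]
```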
  • In contrast to the embodiment shown in FIG. 5a, the different reference sound signals of different reference sampling rates can also be generated successively and compared with the test sound signal in order to determine the measure of the playback speed of the test sound signal or the optimum reference sampling rate. Alternatively, more than three modified reference sound signals may be compared with the test sound signal, in parallel or serially, to enable fast synchronization not only in the initial phase but also during the film screening, so that after larger tears in the film, caused e.g. by cuts or missing film sections, the film event system synchronizes more quickly to the current location in the film.
  • In contrast to the exemplary embodiment illustrated in FIG. 5a, a film event system can also be synchronized on the basis of the images applied to the film, both by evaluating features or fingerprints and by correlating a test image signal with one or more reference image signals.
  • As shown above, the correlation of audio and/or video signals can be used to determine the temporal position in an audio and/or video stream, and a synchronous playback can be controlled on the basis of this time determination.
  • Alternatively, the determination of an audio and/or video signature of the source material in the form of an audio ID / video ID (ID = identification) can be used to roughly determine the time in a long AV stream, so that synchronization is possible at any point.
  • The basic approach of the invention is, for example, to store the already existing analog sound again in digital form and then to synchronize to the film by correlation and further feature determination against the analog soundtrack. The output signal or control signal of the device for generating a control signal, or of the synchronization device, can be in any time code format. Preferably, the SMPTE-standardized LTC time code format, for example, is used. For each film, a data set must be created for the device for generating a control signal or for the synchronization device.
  • For each film, an additional data carrier is produced for the previously described device for generating a control signal or synchronization device. The data carrier contains the digitized analog soundtrack, e.g. in Dolby Stereo format, as found on the film reel, feature data about the soundtrack, and matching time codes.
  • In the following, an exemplary determination of a time code difference is described with reference to FIGS. 5b.1 to 5b.4.
  • FIG. 5b.1 shows an exemplary film 110 with a soundtrack 114, as already described for FIG. 8.
  • Based on the time code signal 582Z of the timer 582, a reference sound signal 274 is read out of the device 522 for storing a soundtrack, and a modified reference sound signal according to FIG. 5b.2 is generated by means of the device 232 for sample rate conversion. This signal represents a film section from the location L0 to the location L3, with the time T0 (or a corresponding time code) associated with the location L0 and the time T3 (or time code) associated with the location L3.
  • FIG. 5b.3 shows an exemplary test sound signal, or section of a test sound signal, which is defined by the start time T1 and the end time T2 and has been generated with the sampling rate f = 1/Δt.
  • FIG. 5b.4 shows the result of the correlation of the modified reference sound signal according to FIG. 5b.2 with the section of the test sound signal according to FIG. 5b.3. The time difference ΔT = T1 − T0 between the start time T0 of the search window, i.e. of the modified reference sound signal of FIG. 5b.2, and the start time T1 of the test sound signal is the time shift on the basis of which the time code difference, or the relative time code, is formed. Here, T1 is the time, or the time shift of the test sound signal, at which a section of the reference sound signal coincides maximally with the N = 11 samples long test sound signal, i.e. at which the correlation of the reference sound signal with the N = 11 samples long test sound signal yields a maximum as the correlation result.
  • For the quality assessment 568, knowledge of the absolute time T0 or of the time T1 is not necessary, because the timer 582, for example, knows the last absolute time or absolute time code and only needs the time code difference 570Z to determine the updated absolute time or time code. The difference can be expressed, for example, as the position of the peak relative to the beginning of the search window. In FIG. 5b.4, for example, the peak is at the fourth sample, i.e. the test sound signal of FIG. 5b.3 is shifted by 3·Δt relative to the reference sound signal of FIG. 5b.2, where Δt is the sampling period corresponding to the modified sampling rate.
  • The time code difference 570Z can thus consist, for example, of the value n = 3. This is where matching the sampling rate, or playback speed, of the reference sound signals to the variable playback speed of the test sound signal pays off: because Δt is adapted to the playback speed, the location in the film, or the shift relative to the search window, can be determined more accurately than with a fixed sampling rate of the reference sound signal, for which only multiples of the fixed sampling period are available for determining the location in the film.
  • In this case, for example, the time T0 of the search window or reference sound signal can be equal to the time T1 of the previous correlation, since the film is only played forward.
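  • The worked example of FIGS. 5b.2 to 5b.4 can be reproduced numerically with the sketch below: an N = 11 samples long test signal is placed three samples after the start of the search window, and the correlation peak is read off as n = 3, i.e. ΔT = 3·Δt. The sample values, the sampling rate of 48000 Hz, and the function names are made up for illustration; the correlation is a plain, unnormalized sliding dot product.

```python
def correlate(reference, test):
    """Sliding dot product of test against reference, one value per lag."""
    return [sum(r * t for r, t in zip(reference[lag:], test))
            for lag in range(len(reference) - len(test) + 1)]


def time_code_difference(reference, test, sample_rate):
    """Return (n, n * dt): the peak lag in samples and in seconds."""
    corr = correlate(reference, test)
    n = max(range(len(corr)), key=corr.__getitem__)
    return n, n / sample_rate


# Hypothetical N = 11 samples long test sound signal.
test_signal = [2, -5, 9, 1, -7, 4, 8, -3, -6, 5, 1]
# Search window: the same section starts three samples after T0.
search_window = [0, 0, 0] + test_signal + [0] * 6

n, delta_t = time_code_difference(search_window, test_signal, 48000)
print(n, delta_t)
```

  • Because Δt = 1/f follows the adapted sampling rate, the same n corresponds to a slightly different ΔT when the playback speed, and with it the reference sampling rate, changes.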
  • FIG. 6a shows an embodiment of a film system in which a device 100 for generating a control signal 190 is coupled to a film event system 600. Based on the film 110, as shown in FIG. 8, the device 100 generates the control signal 190, for example a time code, with which the film event system 600 is synchronized.
  • FIG. 6b shows a film system comprising a device 100 for generating a control signal 190 and a wave field synthesis system 610 as an exemplary film event system. The embodiment of the wave field synthesis system 610 comprises a device 620 for controlling the wave field synthesis system, a digital memory 622 for the wave field synthesis audio signals, and a plurality of loudspeakers 624 for the wave field synthesis system. Based on the film 110, or on an exemplary analog film soundtrack 114, the device 100 for generating a control signal generates the control signal 190 in order to synchronize a wave field synthesis audio experience lip-synchronously with a film originally provided with analog sound.
  • As an alternative to the wave field synthesis system 610, other audio systems, such as digital audio systems or digital surround audio systems, can of course also be synchronized lip-synchronously by means of the device 100 for generating a control signal.
  • FIG. 7 shows an exemplary film, as shown in FIG. 8, an exemplary digitally stored reference sound signal 720, and an assignment of a time scale.
  • For example, when generating the stored film information or reference sound signal, the analog sound signal is played at a given playback speed and sampled at a given sampling rate, for example 44.1 kHz, and audio sections of, for example, 10 ms are stored as so-called audio frames; the digital reference sound signal is thus present in the memory as a sequence of audio frames. The assigned time of a time scale can then consist, for example, of numbering the audio frames in ascending order starting from 0 or 1 as the time code or time scale, so that time code TC1 corresponds to audio frame AF1 in FIG. 7. Alternatively, the start time or end time of an audio frame can be used as the time code, e.g. either 0 ms or 10 ms for the first audio frame if an audio frame has a duration of 10 ms.
  • Time codes usually have formats such as hour:minute:second:frame, where the frame usually refers to video frames at, e.g., 24 frames per second (motion picture film). A time scale or time code can therefore, for example, assign a plurality of audio frames to a video frame or define an audio frame as the smallest time scale unit. Accordingly, the time code or time scale can, for example, assign four audio frames to one time code, see TC1' in FIG. 7, which comprises the four audio frames AF1 to AF4, or assign a single audio frame to a time code, see TC1 in FIG. 7, to which the audio frame AF1 is assigned. Depending on the audio format, the audio frames may also represent temporally overlapping sections of the audio signal.
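  • The mapping from a numbered audio frame to an hour:minute:second:frame time code can be sketched as follows, using the example values from the text (10 ms audio frames, 24 video frames per second). Numbering the audio frames from 0, taking the start time of the audio frame, and rounding down to the enclosing video frame are assumptions of this example; the function name is hypothetical.

```python
def audio_frame_to_time_code(frame_index, frame_ms=10, fps=24):
    """Convert an audio frame number (counted from 0) to hh:mm:ss:ff."""
    start_ms = frame_index * frame_ms              # start time of the audio frame
    total_video_frames = start_ms * fps // 1000    # enclosing video frame
    ff = total_video_frames % fps
    total_seconds = total_video_frames // fps
    ss = total_seconds % 60
    mm = total_seconds // 60 % 60
    hh = total_seconds // 3600
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


print(audio_frame_to_time_code(0))       # 00:00:00:00
print(audio_frame_to_time_code(366150))  # 01:01:01:12
```

  • Since a 10 ms audio frame is shorter than a video frame at 24 frames per second, several consecutive audio frames map to the same time code, which corresponds to the case in which a time code comprises a plurality of audio frames.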
  • The control signal 190 may, for example, be formed as a time code, but also as a sequence of pulses in which, for example, each pulse corresponds to one time scale unit and, similar to a relative time code, the film event system accumulates the pulses in order to synchronize with the film.
  • A further embodiment, which keeps, for example, an analog audio signal available as a fallback while also providing a time code for realizing additional services, is to embed a watermark in the audio and/or video signal. The advantage of this solution is that even with "difficult" audio signals, e.g. very quiet passages or very similar, "monotonous" sounds, a clean clock recovery is possible. For this variant, in principle the complete set of relevant watermark claims is useful, in particular those concerning the search for the correct clock rate or the readjustment of the sampling rate. The decisive disadvantage of this approach, however, is that the actual film is changed, or that a new version or copy of the film has to be created, in order to embed the watermark in the audio and/or video signal of the film.
  • Depending on the circumstances, the method according to the invention can be implemented in hardware or in software. The implementation can be on a digital storage medium, in particular a floppy disk or CD with electronically readable control signals, which can interact with a programmable computer system such that the method is performed. In general, the invention thus also consists in a computer program product with a program code, stored on a machine-readable carrier, for carrying out the method according to the invention when the computer program product runs on a computer. In other words, the invention can thus be realized as a computer program with a program code for carrying out the method when the computer program runs on a computer.

Claims (6)

  1. Device for determining a location in a film (110) that has film information (112, 114) applied in a temporal sequence, comprising: a memory (320) for storing a reference fingerprint representation of the film information (112, 114), wherein the fingerprint representation is formed such that a time course of the fingerprint representation depends on a time course of the film information, and wherein the stored reference fingerprint representation is associated with a time scale; a device (340) for receiving a section read from the film (110); a device (350) for extracting a test fingerprint representation from the read section; and a device (360) for comparing the test fingerprint representation with the reference fingerprint representation in order to determine the location in the film (110) based on the comparison and the time scale.
  2. Device according to claim 1, in which the film information is applied to the film in an analog soundtrack, and in which the device (340) for receiving is designed to receive the analog sound information from the analog soundtrack.
  3. Device according to claim 1 or 2, in which the device (350) for extracting is designed to calculate a representation with a spectral flatness as the fingerprint representation, so that a time course of the fingerprint representation comprises a time course of the spectral flatness.
  4. Device according to any one of claims 1 to 3, further comprising a further device for receiving a section read from the film, the section being different from the section received by the device (140) for receiving.
  5. Method for determining a location in a film (110) that has film information (112, 114) applied in a temporal sequence, comprising the following steps: receiving a section read from the film (110); extracting a test fingerprint representation from the read section; and comparing the test fingerprint representation with the reference fingerprint representation, wherein the fingerprint representation is formed such that a time course of the fingerprint representation depends on a time course of the film information (112, 114), and wherein the stored reference fingerprint representation is associated with a time scale, in order to determine the location in the film (110) based on the comparison and the time scale.
  6. Computer program with a program code for carrying out a method according to claim 5 when the computer program runs on a computer.
DE200510045628 2005-06-22 2005-09-23 Apparatus and method for determining a location in a film having film information applied in a temporal sequence Active DE102005045628B3 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
DE102005028978.9 2005-06-22
DE102005028978 2005-06-22
DE200510045628 DE102005045628B3 (en) 2005-06-22 2005-09-23 Apparatus and method for determining a location in a film having film information applied in a temporal sequence

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
DE200510045628 DE102005045628B3 (en) 2005-06-22 2005-09-23 Apparatus and method for determining a location in a film having film information applied in a temporal sequence
JP2008517365A JP5137826B2 (en) 2005-06-22 2006-06-09 Apparatus and method for determining a position in a film having film information applied in a time sequence
CN 200680024917 CN101218648B (en) 2005-06-22 2006-06-09 Device and method for determining a point with film information in a film
EP06754259A EP1894199A1 (en) 2005-06-22 2006-06-09 Device and method for determining a point in a film comprising film data applied in chronological order
PCT/EP2006/005553 WO2006136300A1 (en) 2005-06-22 2006-06-09 Device and method for determining a point in a film comprising film data applied in chronological order

Publications (1)

Publication Number Publication Date
DE102005045628B3 true DE102005045628B3 (en) 2007-01-11

Family

ID=36716607

Family Applications (1)

Application Number Title Priority Date Filing Date
DE200510045628 Active DE102005045628B3 (en) 2005-06-22 2005-09-23 Apparatus and method for determining a location in a film having film information applied in a temporal sequence

Country Status (4)

Country Link
EP (1) EP1894199A1 (en)
JP (1) JP5137826B2 (en)
DE (1) DE102005045628B3 (en)
WO (1) WO2006136300A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101290621B (en) 2007-04-17 2011-06-15 上海申瑞电力科技股份有限公司 Safe digital card memory search method
JP5750167B2 (en) * 2010-12-07 2015-07-15 エンパイア テクノロジー ディベロップメント エルエルシー Audio fingerprint difference for measuring quality of experience between devices
JP2013178216A (en) * 2012-02-28 2013-09-09 Koichi Ono Time-code history update type loudness meter

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1994016442A1 (en) * 1993-01-08 1994-07-21 Arthur D. Little Enterprises, Inc. Method and apparatus for eliminating television commercial messages
US20040073916A1 (en) * 2002-10-15 2004-04-15 Verance Corporation Media monitoring, management and information system
DE10322722A1 (en) * 2003-05-20 2004-12-23 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Device and method for synchronizing an audio signal with a film
US20050022252A1 (en) * 2002-06-04 2005-01-27 Tong Shen System for multimedia recognition, analysis, and indexing, using text, audio, and digital video

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5040081A (en) * 1986-09-23 1991-08-13 Mccutchen David Audiovisual synchronization signal generator using audio signature comparison
US5055939A (en) * 1987-12-15 1991-10-08 Karamon John J Method system & apparatus for synchronizing an auxiliary sound source containing multiple language channels with motion picture film video tape or other picture source containing a sound track
IL109649A (en) * 1994-05-12 1997-03-18 Electro Optics Ind Ltd Movie processing system
JPH1020420A (en) * 1996-06-25 1998-01-23 Sony Cinema Prod Corp Movie film
DE10134471C2 (en) * 2001-02-28 2003-05-22 Fraunhofer Ges Forschung Method and device for characterizing a signal and method and device for generating an indexed signal
KR100934460B1 (en) * 2003-02-14 2009-12-30 톰슨 라이센싱 The method and apparatus for automatically synchronizing the playback of media services between first and second media service
CN100521781C (en) * 2003-07-25 2009-07-29 皇家飞利浦电子股份有限公司 Method and device for generating and detecting fingerprints for synchronizing audio and video

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Jordan, Frank; Dannow, Jesper: Generating Time Code Information from Analog Sources. AES 118th Convention, Audio Engineering Society, 28-31 May 2005, Barcelona, Spain, Convention Paper 6473, pp. 1-3 *
Kashino, Kunio; Kurozumi, Takayuki; Murase, Hiroshi: A Quick Search Method for Audio and Video Signals Based on Histogram Pruning. IEEE Transactions on Multimedia, Vol. 5, No. 3, September 2003, pp. 348-357 *

Also Published As

Publication number Publication date
WO2006136300A1 (en) 2006-12-28
JP5137826B2 (en) 2013-02-06
JP2008547145A (en) 2008-12-25
EP1894199A1 (en) 2008-03-05

Legal Events

Date Code Title Description
8100 Publication of the examined application without publication of unexamined application
8364 No opposition during term of opposition