WO1998038642A1 - Sound synchronizing - Google Patents

Publication number
WO1998038642A1
Authority
WO
WIPO (PCT)
Prior art keywords
computer system
sound event
symbol
symbols
medium
Prior art date
Application number
PCT/NZ1998/000027
Other languages
French (fr)
Inventor
Rhonda Violet Marion Rollinson
Timothy Morton Foreman
Original Assignee
Tall Poppy Records Limited
Priority date
Filing date
Publication date
Priority to AU63134/98A
Application filed by Tall Poppy Records Limited
Publication of WO1998038642A1

Classifications

    • G - PHYSICS
    • G03 - PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03B - APPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B31/00 - Associated working of cameras or projectors with sound-recording or sound-reproducing means
    • G03B31/02 - Associated working of cameras or projectors with sound-recording or sound-reproducing means in which sound track is on a moving-picture film
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 - Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/022 - Electronic editing of analogue information signals, e.g. audio or video signals
    • G11B27/028 - Electronic editing of analogue information signals, e.g. audio or video signals with computer assistance
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 - Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G11B27/034 - Electronic editing of digitised analogue information signals, e.g. audio or video signals on discs
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/34 - Indicating arrangements
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00 - Record carriers by type
    • G11B2220/20 - Disc-shaped record carriers
    • G11B2220/21 - Disc-shaped record carriers characterised in that the disc is of read-only, rewritable, or recordable type
    • G11B2220/213 - Read-only discs
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00 - Record carriers by type
    • G11B2220/20 - Disc-shaped record carriers
    • G11B2220/25 - Disc-shaped record carriers characterised in that the disc is based on a specific recording technology
    • G11B2220/2537 - Optical discs
    • G11B2220/2545 - CDs
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00 - Record carriers by type
    • G11B2220/90 - Tape-like record carriers
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G11B27/30 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording
    • G11B27/3027 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording used signal is digitally coded
    • G11B27/3036 - Time code signal
    • G11B27/3054 - Vertical Interval Time code [VITC]
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G11B27/32 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on separate auxiliary tracks of the same or an auxiliary record carrier
    • G11B27/322 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on separate auxiliary tracks of the same or an auxiliary record carrier used signal is digitally coded
    • G11B27/323 - Time code signal, e.g. on a cue track as SMPTE- or EBU-time code
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/36 - Monitoring, i.e. supervising the progress of recording or reproducing

Definitions

  • This invention relates to improvements to display media which have been referred to throughout as films.
  • This expression includes conventional cinematograph films in which a changeable visual image is carried on the film as a succession of pictures, one per frame of the film. Sound signals to accompany the pictures and to be synchronised therewith are typically carried on the film as a variable density optical strip or more recently as a longitudinal stripe of magnetisable material carried on the film.
  • The expression "film" also includes magnetic tape (video tape) carrying along the length thereof a succession of frames each of which carries magnetised zones representing a picture in analogue or digital form.
  • The expression also includes other appropriate recording media such as ROM compact discs or other forms of digital recordings.
  • variable sound signal is usually reproduced as audible sound, it may also, or instead be recorded on the film in a form which is to be reproduced visually e.g. as sub-titles, sur-titles or even as a full alphanumeric representation (printing) of the speech, synchronised appropriately with the changeable visual image.
  • the recordist may have overloaded and distorted his/her recording device, the performance (dialogue wise) may not have been up to scratch or the film may be desired for viewing in multiple countries and thus need dialogue in the corresponding languages and/or dialects/accents.
  • a ribbon is wound from reel-to-reel right to left on a machine in which the linear movement of the ribbon is synchronised with the linear movement of the first magnetic tape.
  • an operator can write by hand on the ribbon, and the ribbon is then super imposed on the lower part of the picture either physically or via a projection system.
  • the writing is so placed such that when it coincides with a marker bar it is in time with the words or sentence seen spoken on the pictures.
  • both images are transferred to a second film or tape creating a permanent record of the two images.
  • This film or tape is then played to an actor in a recording studio, the actor can read and speak the written words shown on the ribbon and they will be in sync with the lips or actions seen in the visual pictures.
  • This system is an improvement on the previous audio beep/cue system but, due to the difficulty of handwriting the script and the difficulty of reading handwriting, is not ideal.
  • Handwriting text for such a system takes a relatively long time, which may slow the re-recording process substantially. Further, a performer may have problems determining which lines they are to perform and also what the lines are if the handwriting is untidy. Where a film includes a piece of fast dialogue, handwriting must be squashed and compacted together to ensure the required text is placed on the appropriate position relative to the original film recording. Again this can cause legibility problems as the handwriting will need to be squashed into a restricted space.
  • a further object of the invention is to address the foregoing problems or at least provide the public with a useful choice.
  • medium used in conjunction with the present invention as being a film or magnetic video tape. It should be appreciated however, that other types of medium may be used in conjunction with the present invention in alternative embodiments.
  • other mediums which may be used can include computer based memory adapted to record audio visual information, or laser discs and CD-ROMs adapted to record and play back audio visual information.
  • a sound event may include any audio signal recorded onto the medium used.
  • sound events may be vocal dialogue or language audio tracks.
  • the present invention may be specifically adapted in some embodiments to be used as an aid to language translations for a film. Dialogue in the required language may be inserted into the final translated film easily with use of the present invention.
  • Alternative embodiments of the present invention may not deal only with language sounds as sound events.
  • other sound events such as performers blowing raspberries or making other unintelligible noises may be synchronised with symbols in accordance with the present invention, and reference to dialogue or language sound events only should in no way be seen as limiting.
  • the symbols used to represent sound events may be any type of symbol which can be clearly understood to represent the particular sound event involved.
  • textual symbols may be used to represent the particular language sounds.
  • a computer system as discussed above may include any type of processing device which is capable of acting on a set of instructions to perform the method of the present invention.
  • the computer system used may be a personal computer or work station.
  • other computer systems such as micro processors or custom designed integrated circuits may be used as a computer system and reference to a personal computer or work station only should in no way be seen as limiting.
  • symbols representing sound events may be added by the computer system to an outputted audio visual signal which presents the combined synchronised audio visual symbols and corresponding sound events. This combined signal can then be recorded and then replayed to display symbols representing sound events at the same time as the sound events occur.
  • a time code may be applied to a symbol by the computer system tagging or matching the appropriate time code to the symbol recorded in a computer file.
  • the computer system may determine at what time position in an audio visual recording or output signal a sound event occurs and may insert a symbol representing that sound event to the same place or location in the record or signal.
  • a method of modifying an existing film, video tape or such like which has existing sound or an independent script comprising determining a beginning point in time using visuals or the sound, determining an end point in time for a word, phrase or sentence, transposing the text of the word, phrase or sentence between said points in time so that when the film is replayed the text will scroll across the screen from right to left with the beginning of the text corresponding to beginning point in time and the end of the text corresponding to the end point in time.
  • This invention allows for text to be entered or scanned into a computer, and then to be saved permanently in the computer. Electrical signals representing one or more frames of the visual picture or film are then fed to the computer and displayed on the computer screen as they would be on a normal television monitor.
  • electrical signals representing a continuous code usually called longitudinal time code, or LTC are fed from the film medium via a code converter to the computer.
  • Electrical signals representing a non-continuous code, usually called vertical interval time code or VITC, may also be fed from the film medium to the computer.
  • Such time code signals can be electronically applied to film or video tape, and may also be electronically read by the computer system.
  • the invention allows for a particular start frame time code value to be stored in the computer memory, and subsequently for a particular end frame time code value to be stored in the computer.
  • a particular start point can be matched to a specific word, phrase or sentence of the text stored in the computer and subsequently a particular end point can also be matched to the same specific word, phrase or sentence of the text stored in the computer.
  • the present invention may apply a time code to a text symbol by inserting the relevant time code into a computer file which records symbols, and linking the time code and symbol together.
  • the software program of the invention may now position the word(s) phrase or sentence of the text on a portion of the monitor screen of the computer designated by an operator of the computer, usually either in the lower 1/5 or upper 1/5 of the computer monitor.
  • the software program of the invention may transpose the word(s) phrase or sentence of the text between the start and end time code values relating to the start and end frames of the visual signal. In this way the time codes show at which position on a film or video tape a text symbol should be located.
  • This process can then be repeated so that all the text is synchronised to particular start and end time codes values relating to start and end frames of the visual signal.
  • the invention allows for the synchronised text to scroll from right to left across the lower or upper 1/5 of the computer monitor and maintain synchronisation with the moving pictures, such that when a particular series of frames of the moving pictures depicts a person or creature synonymous with those of speech, the scrolling text is displaying the word(s) phrase or sentence synonymous with the actions of the person depicted on the monitor, as the text scrolls past a fixed line also positioned vertically on the lower or upper 1/5 of the monitor.
  • the invention further allows for adjustments, corrections or replacements to be made to the actual word(s), phrase or sentence of the text.
  • the invention further allows for adjustments to be made to the position within the specified time codes, of particular letters or words of the text, such that letters or words may be 'tabbed' earlier or later so as to improve synchronisation between the text and the scene depicted on the computer monitor.
  • the position of symbols transferred to the film or video tape may be varied by the computer system within the time code positions specified for a symbol or symbols.
  • the invention also allows for particular words or letters to be highlighted, underlined, coloured or otherwise distinguished so as to differentiate certain words, letters or phrases from other words, letters or phrases.
  • the invention also allows that when all text has been positioned correctly and in synchronisation with visual images the combined images can be permanently stored on a device such as a video recorder or other picture and sound recording device.
  • This recording can then be played in a recording studio to a performer.
  • the performer may read the scrolling text and speak the words such that the words are in synchronisation with the images.
  • a video tape player 10 is connected through an amplifier 11 to a speaker 12, so that audio signals recorded on a video tape being played by a player 10 can be heard.
  • the video tape player 10 is also connected by lines 14,15 to a computer 16, which is typically a personal computer of known kind.
  • the computer 16 is connected to a key board 17 and a mouse 18, whereby the computer 16 can be controlled and further input can be made.
  • a monitor 19 is also connected to the computer 16 so that output from the computer 16 representing visual images can be seen.
  • the computer 16 is also connected by lines 20, 21 to a video tape recorder 22, so that output from the computer 16 representing visual and/or audio signals can be recorded on the same video tape.
  • a video signal representing the pictures on the first tape passes along the line 14 into the computer 16.
  • a timing signal LTC from the first video tape passes along line 15 to the computer 16.
  • a code converter (not shown) may be incorporated into line 15 to allow the LTC signal to be read by the computer 16.
  • An additional timing signal VITC which is recorded on the video channel of the first tape may also be received by the computer via line 14.
  • the vertical interval time code, or VITC, is particularly useful when the first tape is being advanced or reversed frame by frame, or at low speeds.
  • a known kind of video card in the computer 16 enables the computer 16 to accept the video signal from the tape player 10 and display it on the monitor 19.
  • This signal may include a visual alphanumeric display of the time code identical to the LTC and VITC time codes on lines 14 and 15.
  • the typed script has been entered into the memory of the computer 16. This can be done by using the keyboard 17 and, if necessary, the mouse 18 in a normal word processing operation.
  • the typed script may also be entered into the computer via a known scanning device, thus eliminating the need to re-type the script.
  • the script can be typed into another computer and transferred into the computer 16 on a floppy disc.
  • the software in the computer 16 enables the script held in the memory thereof to be displayed in a window or otherwise as if on a continuous ribbon, on a panel 24 on the monitor 19.
  • a typical use for the equipment described above would be to prepare a master tape that at some later time could be used in a recording studio to facilitate the replacement of unwanted dialogue with new dialogue.
  • Such a procedure would start with the first magnetic video tape which carries the pictures and may carry an audio channel.
  • the audio channel on the first video tape can be heard through the amplifier 11 and speaker 12.
  • the operator typically moves the first video tape backwards and forwards, using the controls on the tape player 10 until the lips or actions of the speaker seen on the monitor 19 indicate the beginning of a word(s) phrase or sentence.
  • the operator then records the time code shown on the panel 23 by pressing the appropriate key or keys on the keyboard 17.
  • the operator may use known sound editing and signal processing software with wave form representation to locate the beginning and end of a word, phrase or sentence.
  • Graphic wave form representation may facilitate the location of the start and end points. The operator would then, as above, match a point on the script to be the beginning of the correct words phrases or sentences and match a point at the end of the correct word phrase or sentence.
  • the software in the computer 16 is arranged to recognise the time code of the beginning of the word(s) phrase or sentence from the first tape and from the typed script and to bring them into coincidence.
  • the computer then lengthens or shortens the typed word(s) phrase or sentence, as appropriate so that the time code for the end of the word(s) phrase or sentence coincides with the marked end point of the word(s) phrase or sentence shown on the monitor 19.
  • Upon playing the video, the computer will automatically scroll the word(s), phrase or sentence from right to left.
  • As the word passes marker bar 25 it will be synchronised with the lip and facial movements of the speaker on monitor 19. Thereby, the lip and facial movements of the speaker on the monitor 19 and the typed script on the panel 24 are in synchronism. The above steps may be repeated so that all text is synchronised with all lip and facial movements.
  • the video signal comprising the pictures from the video tape player 10, combined with the synchronised script from the panel 24, is fed along the line 20 and recorded on a second video tape in the video tape recorder 22.
  • the software also allows specific lines of text to be colour coded, underlined or otherwise distinguished so as to differentiate a particular set of words or phrases. This may be used to provide different actors a method of identifying their own lines.
  • the invention also allows symbols to be displayed to indicate additional lip/mouth movements, such as kisses, breaths, raspberries etc.
  • the second tape can be removed from the recorder 22 and can be projected or otherwise displayed in a recording studio.
  • an actor can watch the lip and facial movements of the speaker in the pictures, and at the same time read the typed script which will be scrolling in synchronism along a display, similar to that shown previously on the monitor 19.
  • the actor can speak the typed script which is then recorded onto an appropriate recording medium, also containing a time code.
  • the special software in the computer 16 serves three main functions i.e. (a) to sense time codes (b) to scroll the typed text across the panel 24 and (c) to stretch or shrink the text, as required to match the lip movements of the person seen speaking on the monitor.
  • the actor making the new or replacement recording for the final video tape has the pictures displayed, and a very accurate indication of when each syllable, word, phrase or sentence should be started and finished, so as to obtain the most realistic and precise synchronisation of visual and audio effects. It is clearly easier for an actor to make a recording when they can see the actions in the pictures and also the script which the actor has to speak on the same screen at the same time, rather than having a separate script.
  • the word(s) phrases or sentences of the script can be entered into the computer 16 by use of known voice recognition software. It is further conceivable that start and end points of words phrases or sentences may be sensed by known voice recognition software and matched to start and end points of the text, thus further streamlining the process of synchronising text to picture.
  • the software used in the present invention includes provision for extracting information from a prepared film that may be used to prepare computer generated reports. Such reports may be useful for the management, scheduling, charging, estimating and/or quoting for either the preparation of the used film, or for the dialogue re-recording session.
  • the computer software used may compile and assemble information associated with the method of the present invention to provide detailed reports regarding the work done. Charges may be made to a film production company or costs estimated dependent on for example the number of words replaced on a film or alternatively by the time length of recorded sounds which need to be replaced.
  • the software used may be interfaced with language translation software to allow for language translation consistent with facial lip movement on screen, prior to the synchronising of the new language text.
  • Translation software may be used to compile a text file of a film script in the language into which the film's dialogue is to be translated.
  • language translation software and voice recognition or signal processing software may be used to automate the synchronisation process.
  • a text file may be compiled by language translation software in the appropriate language, then have the required time codes marked into same which are identified by voice recognition or signal processing software.
  • the invention may be used to facilitate deaf persons watching and understanding a film or television program by virtue of the text scrolling in synchronism with the pictures.
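
The scrolling behaviour described in the definitions above (text moving from right to left past a fixed marker line, stretched or shrunk so that it fits between its start and end time codes) can be sketched as follows. This is a minimal illustration only, not the software of the invention. The names `Cue`, `char_positions` and `char_at_marker`, the use of frame numbers in place of time codes, the pixel units and the even spacing of characters are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cue:
    """Hypothetical record: a piece of text tied to a start and end frame."""
    text: str
    start_frame: int  # frame at which the first character reaches the marker
    end_frame: int    # frame at which the end of the text passes the marker

    def __post_init__(self) -> None:
        if not self.text:
            raise ValueError("cue text must not be empty")
        if self.end_frame <= self.start_frame:
            raise ValueError("end_frame must be later than start_frame")


def char_positions(cue: Cue, frame: int, marker_x: float,
                   px_per_frame: float) -> List[float]:
    """Return the x position of each character at the given frame.

    The ribbon moves left by px_per_frame for every frame played, so the
    text occupies (end - start) * px_per_frame pixels. Characters are spread
    evenly over that width, which stretches slow speech and shrinks fast speech.
    """
    left_edge = marker_x + (cue.start_frame - frame) * px_per_frame
    width = (cue.end_frame - cue.start_frame) * px_per_frame
    step = width / len(cue.text)
    return [left_edge + i * step for i in range(len(cue.text))]


def char_at_marker(cue: Cue, frame: int) -> Optional[str]:
    """Return the character crossing the marker at this frame, if any."""
    if not cue.start_frame <= frame < cue.end_frame:
        return None
    fraction = (frame - cue.start_frame) / (cue.end_frame - cue.start_frame)
    return cue.text[int(fraction * len(cue.text))]


cue = Cue("hello", start_frame=100, end_frame=150)
print(char_positions(cue, frame=125, marker_x=200.0, px_per_frame=4.0))
print(char_at_marker(cue, frame=125))
```

Because the scroll speed is fixed and only the character spacing changes with the cue's duration, every cue on the ribbon stays aligned to the same marker. 'Tabbing' an individual word earlier or later could be modelled by giving that word its own start and end frames within the cue.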

Abstract

The present invention incorporates a new method and apparatus for synchronizing a sound event recorded on a medium with at least one symbol representing said sound event using a computer system, characterised by the steps of: a) sensing the start of a sound event, and b) sensing at least one time code associated with said medium which corresponds to the beginning of the sound event, c) the computer system applying said time code to at least one symbol representing said sound event to indicate a start time for said symbol, and d) sensing the end of the sound event, and e) sensing at least one further time code associated with said medium which corresponds to the end of the sound event, and f) the computer system applying said further time code to the symbol or symbols representing said sound event to indicate an end time for said symbol.
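
As an illustration of what applying a time code to a symbol might look like in a computer file, the sketch below stores a start and end time code alongside a piece of text and converts between time code strings and frame counts. It is a hedged example, not the applicants' file format. The `HH:MM:SS:FF` notation, the rate of 25 frames per second without drop-frame counting, and the names `TaggedSymbol`, `timecode_to_frames` and `frames_to_timecode` are assumptions introduced here.

```python
from dataclasses import dataclass

FPS = 25  # assumption: 25 frames per second, non-drop-frame


def timecode_to_frames(timecode: str, fps: int = FPS) -> int:
    """Convert 'HH:MM:SS:FF' into a frame count from 00:00:00:00."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60 and 0 <= frames < fps):
        raise ValueError(f"invalid time code: {timecode}")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frames_to_timecode(total_frames: int, fps: int = FPS) -> str:
    """Convert a frame count back into 'HH:MM:SS:FF'."""
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


@dataclass
class TaggedSymbol:
    """Hypothetical record linking a symbol (here, text) to its time codes."""
    symbol: str
    start: str  # time code sensed at the beginning of the sound event
    end: str    # further time code sensed at the end of the sound event

    def duration_frames(self, fps: int = FPS) -> int:
        return timecode_to_frames(self.end, fps) - timecode_to_frames(self.start, fps)


line = TaggedSymbol("Good morning", start="01:00:10:12", end="01:00:11:20")
print(line.duration_frames())
```

Storing the time codes as text keeps the record readable, while the frame count gives a single number that is convenient for comparing positions and measuring durations.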

Description

SOUND SYNCHRONIZING
TECHNICAL FIELD
This invention relates to improvements to display media which have been referred to throughout as films. This expression includes conventional cinematograph films in which a changeable visual image is carried on the film as a succession of pictures, one per frame of the film. Sound signals to accompany the pictures and to be synchronised therewith are typically carried on the film as a variable density optical strip or more recently as a longitudinal stripe of magnetisable material carried on the film. The expression "film" also includes magnetic tape (video tape) carrying along the length thereof a succession of frames each of which carries magnetised zones representing a picture in analogue or digital form. The expression also includes other appropriate recording media such as ROM compact discs or other forms of digital recordings.
With any of the above types of film, it is important that the changing visual image and the varying sounds accompanying the images be strictly synchronised in order to maintain realism. This synchronism is not particularly difficult to achieve if the visual images and the accompanying sounds are recorded on the film at the same time. However, if the sound is to be added to the film after the visual images have been recorded thereon, great care has to be taken to match the starting time and duration of each element of the sound.
This is particularly difficult and time consuming if the varying sounds are speech, to accompany lip movements of a person in the visual images. The problem is even more critical where the speaker in the visual images speaks in a first language, the sound is recorded in that language at the same time and subsequently the first language sound is deleted and a second language is to be recorded in its place.
Although the variable sound signal is usually reproduced as audible sound, it may also, or instead be recorded on the film in a form which is to be reproduced visually e.g. as sub-titles, sur-titles or even as a full alphanumeric representation (printing) of the speech, synchronised appropriately with the changeable visual image.
Background Art
Often in the process of making a film or video it is necessary to replace dialogue after the shooting of the film. This may be due to a number of reasons.
There may have been extraneous noise occurring at the time of the shoot, the recordist may have overloaded and distorted his/her recording device, the performance (dialogue wise) may not have been up to scratch or the film may be desired for viewing in multiple countries and thus need dialogue in the corresponding languages and/or dialects/accents.
Conventionally this has been done in an audio recording studio whereby the 'new' dialogue is recorded as the performer is shown the pictures. To enable the performer to achieve synchronisation with the movements of the on-screen actor's lips, a series of audio 'beeps' is fed to the performer's headphones immediately prior to the moment in which they are required to deliver their line.
This process is often laborious and all too often the focus is on achieving synchronisation and the performance is neglected. In a film of 90 minutes running time should all dialogue be replaced, it could take in the order of 8-10 eight hour days to re-record.
In a known system, the applicants believe the visual images are recorded typically on a first film or magnetic tape and are projected onto a screen for viewing. A ribbon is wound from reel-to-reel right to left on a machine in which the linear movement of the ribbon is synchronised with the linear movement of the first magnetic tape.
As a word or sentence is seen to be spoken an operator can write by hand on the ribbon, and the ribbon is then superimposed on the lower part of the picture either physically or via a projection system. The writing is so placed that when it coincides with a marker bar it is in time with the words or sentence seen spoken on the pictures.
This process is often lengthy as words may need to be erased and rewritten often so as to achieve exact synchronisation of pictures and written words.
Once all written words are synchronised with the moving pictures both images are transferred to a second film or tape creating a permanent record of the two images. This film or tape is then played to an actor in a recording studio, the actor can read and speak the written words shown on the ribbon and they will be in sync with the lips or actions seen in the visual pictures.
This system is an improvement on the previous audio beep/cue system but, due to the difficulty of handwriting the script and the difficulty of reading handwriting, is not ideal.
Handwriting text for such a system takes a relatively long time, which may slow the re-recording process substantially. Further, a performer may have problems determining which lines they are to perform and also what the lines are if the handwriting is untidy. Where a film includes a piece of fast dialogue, handwriting must be squashed and compacted together to ensure the required text is placed on the appropriate position relative to the original film recording. Again this can cause legibility problems as the handwriting will need to be squashed into a restricted space.
Object
It is the object of the present invention to reduce the above difficulties related to adding or replacing sound to existing pictures on film, and particularly to achieve recording of the sound signal onto the film faster and with better synchronisation.
A further object of the invention is to address the foregoing problems or at least provide the public with a useful choice.
Further aspects and advantages of the present invention will become apparent from the ensuing description, which is given by way of example only.
Disclosure of Invention
According to one aspect of the present invention there is provided a method of synchronising a sound event recorded on a medium with at least one symbol representing said sound event using a computer system,
the method being characterised by the steps of:
a) sensing the start of a sound event, and
b) sensing at least one time code associated with said medium which corresponds to the beginning of the sound event, and
c) the computer system applying said time code to at least one symbol representing said sound event to indicate a start time for said symbol, and
d) sensing the end of the sound event, and
e) sensing at least one further time code associated with said medium which corresponds with the end time of the sound event, and
f) the computer system applying said further time code to the symbol or symbols representing said sound event to indicate an end time for said symbol.
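Steps (a) to (f) can be pictured as filling in one record per sound event. The following is a minimal sketch only: the `Cue` class, its field names and the use of integer frame counts as time codes are illustrative assumptions and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cue:
    """One sound event and the symbols (here, text) that represent it."""
    symbols: str
    start_frame: Optional[int] = None  # applied in step (c)
    end_frame: Optional[int] = None    # applied in step (f)

    def mark_start(self, frame: int) -> None:
        """Steps (a)-(c): apply the time code sensed at the start of the event."""
        self.start_frame = frame

    def mark_end(self, frame: int) -> None:
        """Steps (d)-(f): apply the time code sensed at the end of the event."""
        if self.start_frame is None:
            raise ValueError("mark the start before the end")
        if frame <= self.start_frame:
            raise ValueError("end time code must follow the start time code")
        self.end_frame = frame

    def duration_frames(self) -> int:
        """Length of the sound event in frames, once both ends are marked."""
        if self.start_frame is None or self.end_frame is None:
            raise ValueError("cue has not been fully marked")
        return self.end_frame - self.start_frame


cue = Cue("Hello there")
cue.mark_start(90262)
cue.mark_end(90310)
print(cue.duration_frames())  # 48
```

Rejecting an end time code that does not follow the start is a design choice of this sketch; the method as stated only requires that both time codes be applied to the symbol.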
According to a further aspect of the present invention there is provided a method of synchronising a sound event with a symbol representing said sound event substantially as described above and further characterised by the additional step of the computer outputting an audio visual signal which combines the information recorded on said medium, with symbols representing sound events recorded on said medium so that sound events may be heard at the same time that corresponding synchronised symbols are displayed.
Reference throughout the specification shall now be made to the medium used in conjunction with the present invention as being a film or magnetic video tape. It should be appreciated, however, that other types of medium may be used in conjunction with the present invention in alternative embodiments. For example, other mediums which may be used include computer based memory adapted to record audio visual information, or laser discs and CD-ROMs adapted to record and play back audio visual information.
In a preferred embodiment a sound event may include any audio signal recorded onto the medium used. In a further preferred embodiment sound events may be vocal dialogue or language audio tracks.
The present invention may be specifically adapted in some embodiments to be used as an aid to language translations for a film. Dialogue in the required language may be inserted into the final translated film easily with use of the present invention.
Alternative embodiments of the present invention may not deal only with language sounds as sound events. For example, in other embodiments other sound events such as performers blowing raspberries or making other unintelligible noises may be synchronised with symbols in accordance with the present invention, and reference to dialogue or language sound events only should in no way be seen as limiting.
In a preferred embodiment the symbols used to represent sound events may be any type of symbol which can be clearly understood to represent the particular sound event involved. In a further preferred embodiment where the sound event is a section of dialogue, textual symbols may be used to represent the particular language sounds.
A computer system as discussed above may include any type of processing device which is capable of acting on a set of instructions to perform the method of the present invention. In a preferred embodiment the computer system used may be a personal computer or work station. However in alternative embodiments other computer systems such as micro processors or custom designed integrated circuits may be used as a computer system and reference to a personal computer or work station only should in no way be seen as limiting.
In a preferred embodiment symbols representing sound events may be added by the computer system to an outputted audio visual signal which presents the combined synchronised audio visual symbols and corresponding sound events. This combined signal can then be recorded and then replayed to display symbols representing sound events at the same time as the sound events occur.
In a preferred embodiment a time code may be applied to a symbol by the computer system tagging or matching the appropriate time code to the symbol recorded in a computer file. In this way the computer system may determine at what time position in an audio visual recording or output signal a sound event occurs and may insert a symbol representing that sound event to the same place or location in the record or signal.
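One way to picture the tagging of time codes to symbols in a computer file is a list of records, each holding the text and its start and end frame. The sketch below is an assumption for illustration: the JSON layout, the key names `text`, `start` and `end`, and the helper names are hypothetical and do not come from the specification.

```python
import json
from typing import Optional


def cues_to_json(cues: list) -> str:
    """Serialise cue records (text plus start/end frame) for storage in a file."""
    return json.dumps(cues, indent=2)


def cues_from_json(text: str) -> list:
    """Load cue records and order them by start frame."""
    return sorted(json.loads(text), key=lambda cue: cue["start"])


def cue_at(cues: list, frame: int) -> Optional[dict]:
    """Return the cue whose time codes enclose the given frame, if any."""
    for cue in cues:
        if cue["start"] <= frame < cue["end"]:
            return cue
    return None


stored = cues_to_json([
    {"text": "Where are you going?", "start": 200, "end": 275},
    {"text": "Hello there", "start": 100, "end": 150},
])
cues = cues_from_json(stored)
print(cue_at(cues, 120)["text"])  # Hello there
```

Looking a cue up by frame is how such a file could tell the system which symbol belongs at a given position in the output signal.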
According to yet another aspect of the present invention there is provided a method of modifying an existing film, video tape or such like which has existing sound or an independent script, comprising determining a beginning point in time using visuals or the sound, determining an end point in time for a word, phrase or sentence, and transposing the text of the word, phrase or sentence between said points in time so that when the film is replayed the text will scroll across the screen from right to left, with the beginning of the text corresponding to the beginning point in time and the end of the text corresponding to the end point in time.
This invention allows for text to be entered or scanned into a computer, and then to be saved permanently in the computer. Electrical signals representing one or more frames of the visual picture or film are then fed to the computer and displayed on the computer screen as they would be on a normal television monitor.
In addition, electrical signals representing a continuous code, usually called longitudinal time code or LTC, are fed from the film medium via a code converter to the computer. Electrical signals representing a non-continuous code, usually called vertical interval time code or VITC, may also be fed from the film medium to the computer.
Such time code signals can be electronically applied to film or video tape, and may also be electronically read by the computer system.
The invention allows for a particular start frame time code value to be stored in the computer memory, and subsequently for a particular end frame time code value to be stored in the computer.
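Stored start and end values are easier to compare and subtract when held as frame counts. The following sketch assumes the time code is presented as `HH:MM:SS:FF` text and that the material runs at 25 frames per second without drop-frame counting; the format, the frame rate and the function names are all assumptions for illustration.

```python
def timecode_to_frames(timecode: str, fps: int = 25) -> int:
    """Convert 'HH:MM:SS:FF' into a frame count from 00:00:00:00."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60 and 0 <= frames < fps):
        raise ValueError(f"invalid time code: {timecode}")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frames_to_timecode(total_frames: int, fps: int = 25) -> str:
    """Convert a frame count back into 'HH:MM:SS:FF'."""
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


start = timecode_to_frames("01:00:10:12")
end = timecode_to_frames("01:00:12:10")
print(start, end, end - start)  # 90262 90310 48
```

Material recorded at another frame rate, or with drop-frame time code, would need a different conversion.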
In addition, a particular start point can be matched to a specific word, phrase or sentence of the text stored in the computer and subsequently a particular end point can also be matched to the same specific word, phrase or sentence of the text stored in the computer.
The present invention may apply a time code to a text symbol by inserting the relevant time code into a computer file which records symbols, and linking the time code and symbol together. The software program of the invention may now position the word(s) phrase or sentence of the text on a portion of the monitor screen of the computer designated by an operator of the computer, usually either in the lower 1/5 or upper 1/5 of the computer monitor.
In addition the software program of the invention may transpose the word(s) phrase or sentence of the text between the start and end time code values relating to the start and end frames of the visual signal. In this way the time codes show at which position on a film or video tape a text symbol should be located.
This process can then be repeated so that all the text is synchronised to particular start and end time codes values relating to start and end frames of the visual signal.
Further, the invention allows for the synchronised text to scroll from right to left across the lower or upper 1/5 of the computer monitor and maintain synchronisation with the moving pictures, such that when a particular series of frames of the moving pictures depicts a person or creature making movements synonymous with those of speech, the scrolling text displays the word(s), phrase or sentence synonymous with the actions of the person depicted on the monitor as the text scrolls past a fixed line, also positioned vertically on the lower or upper 1/5 of the monitor. The invention further allows for adjustments, corrections or replacements to be made to the actual word(s), phrase or sentence of the text.
The invention further allows for adjustments to be made to the position, within the specified time codes, of particular letters or words of the text, such that letters or words may be 'tabbed' earlier or later so as to improve synchronisation between the text and the scene depicted on the computer monitor. In this way the position of symbols transferred to the film or video tape may be varied by the computer system within the time code positions specified for a symbol or symbols.
The invention also allows for particular words or letters to be highlighted, underlined, coloured or otherwise distinguished so as to differentiate certain words, letters or phrases from other words, letters or phrases.
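The scrolling behaviour can be illustrated with a little arithmetic. The sketch below assumes the text band moves at a constant number of pixels per frame, so that the leading edge of a cue reaches the fixed line at its start time code and its trailing edge reaches the line at its end time code. The function name, the pixel units and the constant-speed model are assumptions for illustration.

```python
def cue_span_on_screen(frame: int, start: int, end: int,
                       marker_x: float, px_per_frame: float) -> tuple:
    """Return (left, right) x-coordinates of a cue's text at the given frame.

    The text moves right to left: before the start time code it lies to the
    right of the fixed line, and after the end time code it lies to the left.
    """
    left = marker_x + (start - frame) * px_per_frame
    right = marker_x + (end - frame) * px_per_frame
    return left, right


def is_at_marker(frame: int, start: int, end: int) -> bool:
    """True while the fixed line lies within the cue's text."""
    return start <= frame <= end


# A cue from frame 100 to 150, fixed line at x = 200, band moving 8 px per frame.
print(cue_span_on_screen(90, 100, 150, 200, 8))   # (280, 680)
print(cue_span_on_screen(100, 100, 150, 200, 8))  # (200, 600)
print(cue_span_on_screen(150, 100, 150, 200, 8))  # (-200, 200)
```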
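The 'tabbing' of words can be sketched as a per-word offset applied to a default placement. In the example below the default placement spreads words between the start and end time codes in proportion to their length in characters, and each nudge is clamped so that a word never leaves the cue. The proportional default, the `nudges` mapping and the function name are assumptions for illustration, not the method of the specification.

```python
from typing import Optional


def word_start_frames(words: list, start: int, end: int,
                      nudges: Optional[dict] = None) -> list:
    """Return the frame at which each word begins to cross the marker.

    nudges maps a word index to an offset in frames (negative = earlier).
    """
    nudges = nudges or {}
    total_chars = sum(len(word) for word in words)
    if total_chars == 0:
        raise ValueError("no text to place")
    frames = []
    consumed = 0
    for index, word in enumerate(words):
        default = start + (end - start) * consumed / total_chars
        adjusted = default + nudges.get(index, 0)
        # Keep every word inside the time codes specified for the cue.
        frames.append(round(max(start, min(end, adjusted))))
        consumed += len(word)
    return frames


print(word_start_frames(["Hi", "there"], 100, 170))           # [100, 120]
print(word_start_frames(["Hi", "there"], 100, 170, {1: -5}))  # [100, 115]
```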
The invention also allows that when all text has been positioned correctly and in synchronisation with visual images the combined images can be permanently stored on a device such as a video recorder or other picture and sound recording device.
This recording can then be played in a recording studio to a performer. The performer may read the scrolling text and speak the words such that the words are in synchronisation with the images.
Brief Description of Drawings
One embodiment of the invention is described, by way of example only, with reference to the accompanying Figure, which is of a diagrammatic nature only.
Best Modes for carrying out the Invention
In the drawing a video tape player 10 is connected through an amplifier 11 to a speaker 12, so that audio signals recorded on a video tape being played by the player 10 can be heard. The video tape player 10 is also connected by lines 14, 15 to a computer 16, which is typically a personal computer of known kind. The computer 16 is connected to a keyboard 17 and a mouse 18, whereby the computer 16 can be controlled and further input can be made. A monitor 19 is also connected to the computer 16 so that output from the computer 16 representing visual images can be seen. The computer 16 is also connected by lines 20, 21 to a video tape recorder 22, so that output from the computer 16 representing visual and/or audio signals can be recorded on the same video tape.
When the first video tape is played on the tape player 10, a video signal representing the pictures on the first tape passes along the line 14 into the computer 16. In synchronism, a timing signal LTC from the first video tape passes along line 15 to the computer 16.
A code converter (not shown) may be incorporated into line 15 to allow the LTC signal to be read by the computer 16. An additional timing signal VITC, which is recorded on the video channel of the first tape, may also be received by the computer via line 14. The vertical interval time code, or VITC, is particularly useful when the first tape is being advanced or reversed frame by frame, or at low speeds.
A known kind of video card in the computer 16 enables the computer 16 to accept the video signal from the tape player 10 and display it on the monitor 19. This signal may include a visual alphanumeric display of the time code identical to the LTC and VITC time codes on lines 14 and 15.
At some prior convenient time the typed script has been entered into the memory of the computer 16. This can be done by using the keyboard 17 and, if necessary, the mouse 18 in a normal word processing operation. The typed script may also be entered into the computer via a known scanning device, thus eliminating the need to re-type the script. Alternatively, the script can be typed into another computer and transferred into the computer 16 on a floppy disc. The software in the computer 16 enables the script held in the memory thereof to be displayed in a window or otherwise as if on a continuous ribbon, on a panel 24 on the monitor 19.
A typical use for the equipment described above would be to prepare a master tape that at some time later could be used in a recording studio to facilitate the replacement of unwanted dialogue with new dialogue.
Such a procedure would start with the first magnetic video tape, which carries the pictures and may carry an audio channel. The audio channel on the first video tape can be heard through the amplifier 11 and speaker 12.
To match the moving picture on the monitor 19 with the typed script shown in the panel 24, the operator typically moves the first video tape backwards and forwards, using the controls on the tape player 10 until the lips or actions of the speaker seen on the monitor 19 indicate the beginning of a word(s) phrase or sentence. The operator then records the time code shown on the panel 23 by pressing the appropriate key or keys on the keyboard 17.
These keystrokes will record the time code in a computer file which matches the time code to the appropriate text symbol or symbols. The operator then advances the first video tape in the player 10 until the speaker's lips indicate the end of the particular word(s) phrase or sentence, and again records the time code into the computer 16.
In addition the operator may use known sound editing and signal processing software with wave form representation to locate the beginning and end of a word, phrase or sentence. Graphic wave form representation may facilitate the location of the start and end points. The operator would then, as above, match a point on the script to be the beginning of the correct words phrases or sentences and match a point at the end of the correct word phrase or sentence.
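Locating start and end points from a wave form can be pictured as finding the first and last samples whose level exceeds a threshold. The sketch below is a deliberately simple stand-in for the known sound editing software referred to above: the threshold test, the 48000 Hz sample rate, the 25 frames per second rate and the function names are all assumptions.

```python
from typing import Optional


def find_sound_event(samples: list, threshold: float) -> Optional[tuple]:
    """Return (first, last) sample indices at or above the threshold, or None."""
    loud = [i for i, level in enumerate(samples) if abs(level) >= threshold]
    if not loud:
        return None
    return loud[0], loud[-1]


def sample_to_frame(sample_index: int, sample_rate: int = 48000, fps: int = 25) -> int:
    """Map an audio sample index to the picture frame that contains it."""
    return sample_index * fps // sample_rate


samples = [0.0, 0.0, 0.5, -0.8, 0.1, 0.6, 0.0, 0.0]
print(find_sound_event(samples, 0.3))  # (2, 5)
print(sample_to_frame(96000))          # 50
```

A single fixed threshold will be misled by background noise and by pauses within a phrase, which is one reason the operator's judgement remains part of the procedure described.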
The software in the computer 16 is arranged to recognise the time code of the beginning of the word(s), phrase or sentence from the first tape and from the typed script and to bring them into coincidence. The computer then lengthens or shortens the typed word(s), phrase or sentence, as appropriate, so that the time code for the end of the word(s), phrase or sentence coincides with the marked end point of the word(s), phrase or sentence shown on the monitor 19. Upon playing the video the computer will automatically scroll the word(s), phrase or sentence from right to left.
As the word passes marker bar 25 it will be synchronised with the lip and facial movements of the speaker on monitor 19. Thereby, the lip and facial movements of the speaker on the monitor 19 and the typed script on the panel 24 are in synchronism. The above steps may be repeated so that all text is synchronised with all lip and facial movements. The video signal, comprising the pictures from the video tape player 10 combined with the synchronised script from the panel 24, is fed along the line 20 and recorded on a second video tape in the video tape recorder 22.
The software also allows specific lines of text to be colour coded, underlined or otherwise distinguished so as to differentiate a particular set of words or phrases. This may be used to provide different actors with a method of identifying their own lines. In addition to alphanumeric text, the invention also allows symbols to be displayed to indicate additional lip/mouth movements, such as kisses, breaths, raspberries etc.
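The lengthening or shortening of the text can be expressed as a horizontal scale factor. The sketch below assumes the ribbon scrolls at a constant number of pixels per frame, so the width a cue must occupy is fixed by its duration. The function name, the pixel units and the constant-speed assumption are illustrative only.

```python
def stretch_factor(natural_width_px: float, start: int, end: int,
                   px_per_frame: float) -> float:
    """Scale to apply to the text so that it spans exactly start..end.

    A result above 1 lengthens the text; a result below 1 shortens it.
    """
    if end <= start:
        raise ValueError("end time code must follow the start time code")
    if natural_width_px <= 0:
        raise ValueError("text must have a positive width")
    target_width_px = (end - start) * px_per_frame
    return target_width_px / natural_width_px


# A 50-frame cue on a ribbon moving 8 px per frame must occupy 400 px.
print(stretch_factor(200, 100, 150, 8))  # 2.0
print(stretch_factor(800, 100, 150, 8))  # 0.5
```

Slowly spoken lines therefore appear stretched, and fast dialogue appears compressed while remaining typed rather than handwritten.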
In this way the appearance of symbols applied to the film or video tape may be varied depending on the sounds they represent.
When a significant part, or all, of the video pictures from the first tape have been recorded on the second tape, the second tape can be removed from the recorder 22 and can be projected or otherwise displayed in a recording studio. There, an actor can watch the lip and facial movements of the speaker in the pictures, and at the same time read the typed script, which will be scrolling in synchronism along a display similar to that shown previously on the monitor 19. The actor can speak the typed script, which is then recorded onto an appropriate recording medium, also containing a time code.
Thus, it can be seen that the special software in the computer 16 serves three main functions i.e. (a) to sense time codes (b) to scroll the typed text across the panel 24 and (c) to stretch or shrink the text, as required to match the lip movements of the person seen speaking on the monitor.
When the original recording was made in a different language, not only will the individual words, or even syllables, need to be stretched or shortened to match the lip movements, but the time between individual word(s), phrases or sentences may need to be adjusted. This can be achieved, for example, by use of the "tab" keys on the keyboard 17.
By use of the above techniques, the actor making the new or replacement recording for the final video tape, has the pictures displayed, and a very accurate indication of when each syllable, word, phrase or sentence should be started and finished, so as to obtain the most realistic and precise synchronisation of visual and audio effects. It is clearly easier for an actor to make a recording when they can see the actions in the pictures and also the script which the actor has to speak on the same screen at the same time, rather than having a separate script.
In appropriate circumstances, the word(s) phrases or sentences of the script can be entered into the computer 16 by use of known voice recognition software. It is further conceivable that start and end points of words phrases or sentences may be sensed by known voice recognition software and matched to start and end points of the text, thus further streamlining the process of synchronising text to picture.
It is also conceivable that the software used in the present invention may include provision for extracting information from a prepared film that may be used to prepare computer generated reports. Such reports may be useful for the management, scheduling, charging, estimating and/or quoting for either the preparation of the film used, or for the dialogue re-recording session.
The computer software used may compile and assemble information associated with the method of the present invention to provide detailed reports regarding the work done. Charges may be made to a film production company or costs estimated dependent on for example the number of words replaced on a film or alternatively by the time length of recorded sounds which need to be replaced.
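A report of this kind can be derived directly from the synchronised cues. The sketch below counts words and sums the duration of the marked sound events, then applies a per-word rate. The record layout, the 25 frames per second rate, the charging rule and the function name are assumptions for illustration.

```python
def session_report(cues: list, fps: int = 25, rate_per_word: float = 0.0) -> dict:
    """Summarise the work represented by a list of synchronised cues."""
    words = sum(len(cue["text"].split()) for cue in cues)
    frames = sum(cue["end"] - cue["start"] for cue in cues)
    return {
        "cues": len(cues),
        "words": words,
        "seconds": frames / fps,
        "charge": words * rate_per_word,
    }


cues = [
    {"text": "Hello there", "start": 100, "end": 150},
    {"text": "Where are you going", "start": 200, "end": 275},
]
print(session_report(cues, rate_per_word=2.0))
# {'cues': 2, 'words': 6, 'seconds': 5.0, 'charge': 12.0}
```

Charging by the time length of the replaced sound instead would use the `seconds` figure in place of the word count.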
Furthermore, it is also conceivable that the software used may be interfaced with language translation software to allow for language translation consistent with facial and lip movement on screen, prior to the synchronising of the new language text. Translation software may be used to compile a text file of a film script in the language into which the film's dialogue is to be translated.
In another embodiment language translation software, and voice recognition or signal processing software may be used to automate the synchronisation process. A text file may be compiled by language translation software in the appropriate language, then have the required time codes marked into same which are identified by voice recognition or signal processing software.
Although the invention has been described with reference to the synchronisation of speech with the pictures, the techniques and apparatus can also be used for adding sound effects or music at the correct timing.
It is also possible that the invention may be used to facilitate deaf persons watching and understanding a film or television program by virtue of the text scrolling in synchronism with the pictures.
Aspects of the present invention have been described by way of example only and it should be appreciated that modifications and additions may be made thereto without departing from the scope thereof as defined in the appended claims.

Claims

THE CLAIMS DEFINING THE INVENTION ARE:
1. A method of synchronizing a sound event recorded on a medium with at least one symbol representing said sound event, using a computer system, characterised by the steps of:
a) sensing the start of a sound event, and
b) sensing at least one time code associated with said medium which corresponds to the beginning of the sound event,
c) the computer system applying said time code to at least one symbol representing said sound event to indicate a start time for said symbol, and
d) sensing the end of the sound event, and
e) sensing at least one further time code associated with said medium which corresponds to the end of the sound event, and
f) the computer system applying said further time code to the symbol or symbols representing said sound event to indicate an end time for a said symbol.
2. A method of synchronizing a sound event with at least one symbol as claimed in claim 1, further characterised by the additional step of the computer outputting an audio visual signal which combines the information recorded on said medium with symbols representing sound events recorded on said medium so that sound events can be heard at the same time that the corresponding synchronized symbols are displayed.
3. A method as claimed in claim 2 wherein the computer system is connected to recording apparatus to record the outputted audio visual signal.
4. A method as claimed in any one of claims 1-3 wherein the medium is an audio visual film or video recording.
5. A method as claimed in any one of claims 1-4 wherein the symbols applied to said medium are firstly recorded in said computer system.
6. A method as claimed in any previous claim wherein a time code is applied to a symbol by matching said time code to a symbol or symbols recorded in a computer file of said computer system.
7. A method as claimed in any previous claim wherein a symbol applied to the medium is a text symbol.
8. A method as claimed in any one of claims 1-7 wherein the start or end of a sound event is sensed by use of an audio output.
9. A method as claimed in any one of claims 1-7 wherein the start or end of a sound event is sensed using a visual display of information recorded on said medium.
10. A method as claimed in any one of claims 1-7 wherein the start or end point of a sound event is sensed using voice recognition software loaded into said computer system.
11. A method as claimed in any previous claim wherein a time code sensed has been electronically applied to said medium and may be electronically read by said computer system.
12. A method as claimed in claim 11 wherein the computer system may sense and apply a longitudinal time code to a symbol.
13. A method as claimed in claim 11 wherein the computer system may sense and apply a vertical integration time code to a symbol.
14. A method as claimed in any one of claims 2-13 wherein symbols combined into the computer system audio visual output scroll along the visual display.
15. A method as claimed in claim 14 wherein audio visual signal outputted from the computer system includes a marker so when the audio visual signal is displayed, the intersection of the marker bar and symbols scrolling across said display indicates that the sound event corresponding to said symbols is occurring.
16. A method as claimed in any one of claims 1-15 wherein the computer system is connected to recording apparatus to record the audio visual signal outputted by the computer system.
17. A method as claimed in any one of claims 1-16 wherein the appearance of symbols in said audio visual signal may be varied depending on the sound event the symbols represent.
18. A method as claimed in claim 17 wherein the appearance of symbols may vary by varying the colour of symbols applied to said medium.
19. A method as claimed in any one of the claims 1-18 wherein the position of symbols within the audio visual signal outputted by the computer system may be varied by the computer system within the time code positions specified for the symbol.
20. A method as claimed in any one of the claims 1-19 wherein symbols representing sound events may be entered into said computer system using a keyboard.
21. A method as claimed in any one of the claims 1-19 wherein symbols representing sound events may be entered into said computer system using voice recognition software and an audio input.
22. A method as claimed in any one of the claims 1-19 wherein symbols representing sound events may be entered into said computer system by transferring a computer file into said computer system.
23. A computer system programmed to perform a method of synchronizing a sound event with symbols representing said sound event as claimed in any previous claim.
24. A method of synchronizing a sound event with a symbol representing said sound event substantially as herein described with reference to and as illustrated by the accompanying examples or drawing.
PCT/NZ1998/000027 1997-02-26 1998-02-27 Sound synchronizing WO1998038642A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU63134/98A AU6313498A (en) 1997-02-26 1998-02-26 Sound synchronizing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ314313 1997-02-26
NZ31431397 1997-02-26

Publications (1)

Publication Number Publication Date
WO1998038642A1 (en) 1998-09-03

Family

ID=19926161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NZ1998/000027 WO1998038642A1 (en) 1997-02-26 1998-02-27 Sound synchronizing

Country Status (2)

Country Link
AU (1) AU6313498A (en)
WO (1) WO1998038642A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2101795A (en) * 1981-07-07 1983-01-19 Cross John Lyndon Dubbing translating of soundtracks on films
EP0649144A1 (en) * 1993-10-18 1995-04-19 International Business Machines Corporation Automatic indexing of audio using speech recognition
FR2730582A1 (en) * 1995-02-14 1996-08-14 Regis Dubos Audio-visual method for dubbing sound onto projected film esp. for video screen use

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2814888A1 (en) * 2000-10-04 2002-04-05 Cedric Denooz Cinema film sound/vision synchronisation having portable computer transport housed managing/restoring/synchronising sound image signals.
WO2003023765A1 (en) * 2001-09-12 2003-03-20 Ryshco Media Inc. Method and device for processing audiovisual data using speech recognition
US7343082B2 (en) 2001-09-12 2008-03-11 Ryshco Media Inc. Universal guide track
WO2004040576A1 (en) * 2002-11-01 2004-05-13 Synchro Arts Limited Methods and apparatus for use in sound replacement with automatic synchronization to images
GB2401985A (en) * 2002-11-01 2004-11-24 Synchro Arts Ltd Methods and apparatus for use in sound replacement with automatic synchronization to images
US8009966B2 (en) 2002-11-01 2011-08-30 Synchro Arts Limited Methods and apparatus for use in sound replacement with automatic synchronization to images

Also Published As

Publication number Publication date
AU6313498A (en) 1998-09-18

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1998537552

Format of ref document f/p: F

122 Ep: pct application non-entry in european phase