WO2009031979A1 - A method for incorporating a soundtrack into an edited video-with-audio recording and an audio tag

Publication number
WO2009031979A1
Authority
WO
WIPO (PCT)
Prior art keywords
soundtrack
audio
video
content
creator
Prior art date
Application number
PCT/SG2008/000332
Other languages
French (fr)
Inventor
Wong Hoo Sim
Original Assignee
Creative Technology Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Creative Technology Ltd.
Priority to EP20080829376 (EP2208344A4)
Priority to US12/676,882 (US20100226620A1)
Priority to CN200880105676XA (CN101796829B)
Publication of WO2009031979A1
Priority to HK11100830.1A (HK1146775A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G11B27/034 Electronic editing of digitised analogue information signals, e.g. audio or video signals on discs
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8455 Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/44 Receiver circuitry for the reception of television signals according to analogue transmission standards
    • H04N5/60 Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals
    • H04N5/602 Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals for digital sound signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/44 Receiver circuitry for the reception of television signals according to analogue transmission standards
    • H04N5/60 Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals
    • H04N5/607 Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals for more than one sound signal, e.g. stereo, multilanguages
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/79 Processing of colour television signals in connection with recording
    • H04N9/80 Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
    • H04N9/804 Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components
    • H04N9/806 Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components with processing of the sound signal
    • H04N9/8063 Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components with processing of the sound signal using time division multiplex of the PCM audio and PCM video signals

Definitions

  • the present invention relates to a method for incorporating a soundtrack into a video recording, primarily to that of an edited video recording.
  • the present invention also relates to an audio tag used in the aforementioned method.
  • the aforementioned software may also require training over time for a user to gain a degree of competence in using the software, time and effort which users may not be willing to commit.
  • the method includes recording the video-with-audio for subsequent playback. Subsequently, at least one soundtrack is incorporated as an audio tag, the at least one soundtrack being selected from a plurality of soundtracks generated from a soundtrack creator dependent on conditions such as, for example, parameters of content of the recorded video-with-audio, user-defined characteristics for the at least one soundtrack, or a combination of both of the aforementioned conditions. Content of the video-with-audio incorporated with the selected soundtrack is reviewed.
  • the user may edit the content of video-with-audio incorporated with the selected soundtrack by removing image frames of the video-with-audio; and rectify the selected soundtrack using the soundtrack creator such that the edited video-with-audio has a fluent soundtrack similar to the selected soundtrack.
  • the edited video-with-audio is thus enhanced in relation to aural effects in a desired manner by the user.
  • the method may also include prevention of tampering to the selected soundtrack incorporated with the video-with-audio.
  • the soundtrack creator is able to perform tasks aiding in a creation of the audio tag, the tasks being, for example, outputting an original composition for use as a soundtrack using stored sound samples, outputting at least one stored digitized musical file as a soundtrack, or a combination of the aforementioned tasks.
  • the digitized musical files may include file types such as, for example, MP3, WMA, OGG, MID, WAV and AAC. Incorporating the selected soundtrack may involve either replacement of all audio in recorded content of the video-with-audio or combining audio in recorded content of the video-with-audio with the selected soundtrack.
  • the parameters of content include, for example, level of lighting in content of recorded video-with-audio, volume level of audio in content of recorded video-with-audio, density of audio in content of recorded video-with-audio, movement of subjects in content of recorded video-with-audio, or any combination of the aforementioned parameters.
  • Rectification of the selected soundtrack may involve the soundtrack creator performing the steps of identifying gaps in the selected soundtrack and adapting/assessing fluency of the selected soundtrack.
  • the duration of the rectified selected soundtrack may preferably be similar to a duration of the edited video-with-audio. It is preferable that adapting the soundtrack involves the soundtrack creator performing tasks like stretching/compressing the soundtrack, increasing/decreasing the number of loops of the soundtrack, and re-combination of sound samples. It is advantageous that the aural effects emphasise a mood and ambience of the content of the edited video-with-audio during playback.
  • the audio tag may be stored separately from audio content of the video-with-audio to facilitate editing of the content of the video-with-audio.
  • the audio tag may preferably be defined by a series of alphanumeric codes that represent characteristics of the selected soundtrack, characteristics such as, for example, mood, ambience, or beat frequency.
  • the characteristics may preferably be input using a user interface which converts the characteristics into the series of alphanumeric codes.
  • the alphanumeric code may preferably be readable by the soundtrack creator.
  • a recipient may be able to vary the selected soundtrack when the recipient has either remote or local access to the soundtrack creator.
  • an audio tag used to generate a soundtrack in a video-with-audio using a soundtrack creator.
  • Figure 1 shows a process flow of a preferred embodiment of the method of the present invention.
  • Figure 2 shows a process flow of a soundtrack creator as employed in several aspects in the preferred embodiment.
  • a method 100 for incorporating a soundtrack into an edited video-with-audio recording using a video recording apparatus includes recording the video-with-audio for subsequent playback (102).
  • video-with-audio also includes capturing a series of moving images with sound being muted during recording.
  • the video-with-audio may be captured using multiple streams, with separate streams for images and audio.
  • the video-with-audio may be captured as a single stream which includes both images and audio.
  • the method 100 may be applied to both of the aforementioned forms of video-with-audio.
  • a soundtrack may be incorporated into the video-with-audio (104).
  • the soundtrack may aid in emphasising a mood and/or ambience of content of the video-with-audio after incorporation into the video-with-audio.
  • a plurality of soundtracks may be generated by a soundtrack creator.
  • the soundtrack creator assesses content on the recorded video-with-audio (200).
  • the soundtrack creator may be able to assess both single and multiple stream videos.
  • the recorded video-with-audio is assessed using parameters such as, for example, level of lighting in content of recorded video-with-audio.
  • the soundtrack creator may generate a soundtrack with an upbeat rhythm.
  • the soundtrack creator may generate a soundtrack with a sombre rhythm.
  • the soundtrack creator may also utilize digital imaging technology such as, for example, face recognition technology, pixel agglomeration technology and the like to detect movement of subjects in the content of recorded video-with-audio. Generally, if vigorous movements are detected, the soundtrack creator may generate a soundtrack with an upbeat rhythm. Correspondingly, if few movements of subjects in the content are detected, the soundtrack creator may generate a soundtrack with a sombre rhythm.
  • the soundtrack creator is not only able to generate soundtracks with upbeat and sombre rhythms.
  • the aforementioned are merely examples used to aid in understanding the preferred embodiment.
  • Other rhythms such as, for example, frantic, relaxed, inspirational and so forth may also be generated by the soundtrack creator.
  • the soundtrack creator may have a capability to output an original composition as a soundtrack using stored sound samples (202), output at least one stored digitized musical file as a soundtrack (204), or output an original composition which combines stored sound samples and at least one stored digitized musical file (206).
  • the digitized musical files may include file types such as, for example, MP3, WMA, OGG, MID, WAV, AAC and so forth.
  • the digitized musical files may include short musical loops which may be repeated, stretched, and compressed.
  • Each soundtrack may be stored as an audio tag (230).
  • the soundtrack creator may provide several soundtracks for selection, each of which may be previewed on the video recording apparatus by a user subsequent to incorporation with the video-with-audio before a soundtrack is selected for definitive incorporation with the video-with-audio (106). Incorporating the selected soundtrack may involve either replacement of all audio in the recorded video-with-audio or combining (mixing) audio in the recorded video-with-audio with the selected soundtrack.
  • the selected soundtrack may be stored separately from audio content of the video-with-audio to facilitate editing of the content of the video-with-audio at a later time. Storing the selected soundtrack separately may also enable removal of the selected soundtrack. This may depend on a preference of the user.
  • the audio tag may be defined by a series of alphanumeric codes that represent characteristics of the selected soundtrack, such as, for example, mood, ambience, beat frequency and the like.
  • the alphanumeric code may be a nine-character arrangement of letters and numerals, such as "abc 456 t5i", where the first three characters represent the mood, the middle three characters represent the ambience, and the final three characters represent the beat frequency.
  • the alphanumeric code of the audio tag is readable by the soundtrack creator to aid in the generation of a soundtrack. This ensures that anyone with access to the soundtrack creator will have a capability to generate a soundtrack for a video-with-audio with an audio tag.
  • the user need not know the alphanumeric code or a form of conversion per se, as the alphanumeric code represents a means to quantify non-quantifiable objects like mood and ambience.
  • "abc" may mean "sleepy mood"
  • "456" may mean "celebratory occasion" and so forth.
  • the conversion may be performed using a user interface which the user interacts with.
  • the user may be able to use the user interface to input the terms "sleepy mood" and "celebratory occasion" and the conversion into alphanumeric code is correspondingly performed for input into the soundtrack creator. It should be noted that this nine-character arrangement is not meant to be limiting as it is merely an illustrative example.
  • the user may input soundtrack characteristics (103) into the soundtrack creator such that output from the soundtrack creator is similar to what the user desires.
  • a presence of the audio tag may aid the user in ensuring that any particular characteristic which is mandatory for the soundtrack of the video-with-audio would be taken into account by the soundtrack creator.
  • More than one audio tag may be incorporated into the content of the video-with-audio as the mood and/or ambience of the content may vary, so a single audio tag may not be suitable for an entire content.
  • Each audio tag may be defined to be invoked at a specific point in time during playback.
  • each audio tag may include more than one soundtrack, and each soundtrack may be defined to be invoked at a specific point in time during playback.
  • the user may edit the content of the video-with-audio incorporated with the selected soundtrack by removing portions such as bad takes or unwanted scenes from the content of the video-with-audio (108).
  • the soundtrack creator may rectify the selected soundtrack such that the edited video-with-audio has a fluent soundtrack similar to the selected soundtrack (110).
  • rectifying the selected soundtrack may involve the soundtrack creator identifying gaps in the selected soundtrack (208) caused by removal of portions of content of the video-with-audio. It is likely that the removal of portions of content would cause a loss in fluency of the selected soundtrack.
  • the soundtrack creator may perform adaptation tasks (210) such as, for example, stretching/compressing the soundtrack, increasing/decreasing the number of loops of the soundtrack, re-combining sound samples and so forth.
  • a duration of the rectified selected soundtrack may be similar to a duration of the edited video-with-audio, and this correspondingly also maintains mood and/or ambience of the content. Rectification of the selected soundtrack may also be reflected in the audio tag (232).
  • the use of the audio tag allows the edited video-with-audio to be customized by either a maker of the video-with-audio (the user of the video recording apparatus) or a receiver/recipient of the edited video-with-audio.
  • the receiver/recipient would be able to customize the selected soundtrack to their preference if there was a presence of the soundtrack creator on the device which is used to consume the edited video-with-audio.
  • the soundtrack creator may be accessed remotely by the device and need not be installed in the device.
  • the presence of the audio tag which contains information from the maker pertaining to any particular characteristic which is mandatory for the soundtrack of the video-with-audio may ensure that the soundtrack creator would not generate unsuitable soundtracks for selection.
  • the maker may also have an option of restricting rights of the receiver/recipient to tamper with any aspect of the edited video-with-audio, regardless of whether the receiver/recipient has access to the soundtrack creator. This measure may be implemented to preserve a style, ambience and mood as originally intended by the maker.
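The rectification step listed above (adapting the selected soundtrack so that its duration is similar to that of the edited video-with-audio) can be illustrated with a minimal Python sketch. It assumes the soundtrack is built from a single repeating loop of known length; the function name `fit_soundtrack` and the strategy of choosing the nearest whole number of loops and then stretching or compressing to close the remaining gap are illustrative assumptions, not the method disclosed in the patent.

```python
def fit_soundtrack(loop_duration: float, edited_duration: float) -> tuple[int, float]:
    """Return (loop_count, stretch_factor) for a looped soundtrack.

    Hypothetical helper: picks the whole number of loops closest to the
    edited duration, then a stretch factor (> 1 stretches, < 1 compresses)
    so that loop_count * loop_duration * stretch_factor == edited_duration.
    """
    if loop_duration <= 0 or edited_duration <= 0:
        raise ValueError("durations must be positive")
    # Increase or decrease the number of loops first (at least one loop).
    loop_count = max(1, round(edited_duration / loop_duration))
    # Then stretch or compress to remove the residual mismatch.
    stretch_factor = edited_duration / (loop_count * loop_duration)
    return loop_count, stretch_factor


if __name__ == "__main__":
    # An 8-second loop fitted to a 50-second edited recording.
    print(fit_soundtrack(8.0, 50.0))
```

Changing the loop count before stretching keeps the stretch factor close to 1, which is one plausible way to preserve the perceived tempo; other orderings are equally possible.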

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

There is provided a method for incorporating a soundtrack into an edited video-with-audio recording using a video recording apparatus. The method includes recording the video-with-audio for subsequent playback. Subsequently, at least one soundtrack is incorporated as an audio tag, the at least one soundtrack being selected from a plurality of soundtracks generated from a soundtrack creator dependent on conditions such as, for example, parameters of content of the recorded video-with-audio, user-defined characteristics for the at least one soundtrack, or a combination of both of the aforementioned conditions. Content of the video-with-audio incorporated with the selected soundtrack is reviewed. The user may edit the content of video-with-audio incorporated with the selected soundtrack by removing image frames of the video-with-audio; and rectify the selected soundtrack using the soundtrack creator such that the edited video-with-audio has a fluent soundtrack similar to the selected soundtrack. The edited video-with-audio is thus enhanced in relation to aural effects in a desired manner by the user. In another aspect, there is provided an audio tag used to generate a soundtrack in a video-with-audio using a soundtrack creator.

Description

A METHOD FOR INCORPORATING A SOUNDTRACK INTO AN EDITED VIDEO-WITH-AUDIO RECORDING AND AN AUDIO TAG
FIELD OF INVENTION
The present invention relates to a method for incorporating a soundtrack into a video recording, primarily to that of an edited video recording. The present invention also relates to an audio tag used in the aforementioned method.
BACKGROUND
In this age of the information highway where sharing of information is prevalent, the popularity of devices capable of recording video has steadily increased. It is likely that the popularity of such devices will increase at an even faster rate once data transmission bandwidths are able to cope with transmissions of high volumes of video data.
While these devices possess one or several audio inputs which permit mixing or replacement of the soundtrack that was recorded originally during the recording of the image with an external audio source, editing the recorded video generally requires specialized software such as, for example, Videostudio from ULead, Win DVD Creator from Intervideo, Powerdirector from Cyberlink and the like. The aforementioned software may also require training over time for a user to gain a degree of competence in using the software, time and effort which users may not be willing to commit.
As such, users who prefer to use as little time and effort as possible to record and edit certain aspects of their video content have few options available to them. This is especially critical for users who wish to post "live" near instantaneous video-blog entries.
SUMMARY
There is provided a method for incorporating a soundtrack into an edited video-with-audio recording using a video recording apparatus. The method includes recording the video-with-audio for subsequent playback. Subsequently, at least one soundtrack is incorporated as an audio tag, the at least one soundtrack being selected from a plurality of soundtracks generated from a soundtrack creator dependent on conditions such as, for example, parameters of content of the recorded video-with-audio, user-defined characteristics for the at least one soundtrack, or a combination of both of the aforementioned conditions. Content of the video-with-audio incorporated with the selected soundtrack is reviewed. The user may edit the content of video-with-audio incorporated with the selected soundtrack by removing image frames of the video-with-audio; and rectify the selected soundtrack using the soundtrack creator such that the edited video-with-audio has a fluent soundtrack similar to the selected soundtrack. The edited video-with-audio is thus enhanced in relation to aural effects in a desired manner by the user.
The method may also include prevention of tampering to the selected soundtrack incorporated with the video-with-audio.
It is advantageous that the soundtrack creator is able to perform tasks aiding in a creation of the audio tag, the tasks being, for example, outputting an original composition for use as a soundtrack using stored sound samples, outputting at least one stored digitized musical file as a soundtrack, or a combination of the aforementioned tasks. The digitized musical files may include file types such as, for example, MP3, WMA, OGG, MID, WAV and AAC. Incorporating the selected soundtrack may involve either replacement of all audio in recorded content of the video-with-audio or combining audio in recorded content of the video-with-audio with the selected soundtrack.
The parameters of content include, for example, level of lighting in content of recorded video-with-audio, volume level of audio in content of recorded video-with-audio, density of audio in content of recorded video-with-audio, movement of subjects in content of recorded video-with-audio, or any combination of the aforementioned parameters.
Rectification of the selected soundtrack may involve the soundtrack creator performing the steps of identifying gaps in the selected soundtrack and adapting/assessing fluency of the selected soundtrack. The duration of the rectified selected soundtrack may preferably be similar to a duration of the edited video-with-audio. It is preferable that adapting the soundtrack involves the soundtrack creator performing tasks like stretching/compressing the soundtrack, increasing/decreasing the number of loops of the soundtrack, and re-combination of sound samples. It is advantageous that the aural effects emphasise a mood and ambience of the content of the edited video-with-audio during playback.
The audio tag may be stored separately from audio content of the video-with-audio to facilitate editing of the content of the video-with-audio. The audio tag may preferably be defined by a series of alphanumeric codes that represent characteristics of the selected soundtrack, characteristics such as, for example, mood, ambience, or beat frequency. The characteristics may preferably be input using a user interface which converts the characteristics into the series of alphanumeric codes. The alphanumeric code may preferably be readable by the soundtrack creator.
It is advantageous that a recipient may be able to vary the selected soundtrack when the recipient has either remote or local access to the soundtrack creator.
In another aspect, there is provided an audio tag used to generate a soundtrack in a video-with-audio using a soundtrack creator.
DESCRIPTION OF DRAWINGS
In order that the present invention may be fully understood and readily put into practical effect, there shall now be described by way of non-limitative example only preferred embodiments of the present invention, the description being with reference to the accompanying illustrative drawings.
Figure 1 shows a process flow of a preferred embodiment of the method of the present invention.
Figure 2 shows a process flow of a soundtrack creator as employed in several aspects in the preferred embodiment.
DESCRIPTION OF PREFERRED EMBODIMENTS
Referring to Figure 1, there is provided a method 100 for incorporating a soundtrack into an edited video-with-audio recording using a video recording apparatus. It should be noted that the editing of the video-with-audio recording may also be done on a PC. The method 100 includes recording the video-with-audio for subsequent playback (102). It should be noted that video-with-audio also includes capturing a series of moving images with sound being muted during recording. The video-with-audio may be captured using multiple streams, with separate streams for images and audio. Alternatively, the video-with-audio may be captured as a single stream which includes both images and audio. The method 100 may be applied to both of the aforementioned forms of video-with-audio.
Subsequently, a soundtrack may be incorporated into the video-with-audio (104). The soundtrack may aid in emphasising a mood and/or ambience of content of the video-with-audio after incorporation into the video-with-audio. A plurality of soundtracks may be generated by a soundtrack creator.
Referring to Figure 2, there is shown a process in relation to how the plurality of soundtracks is generated from the soundtrack creator when assessing parameters of the content of the video-with-audio. Firstly, the soundtrack creator assesses content on the recorded video-with-audio (200). The soundtrack creator may be able to assess both single and multiple stream videos. The recorded video-with-audio is assessed using parameters such as, for example:
• level of lighting in content of recorded video-with-audio;
• volume level of audio in content of recorded video-with-audio;
• density of audio in content of recorded video-with-audio;
• movement of subjects in content of recorded video-with-audio; and
• any combination of the aforementioned parameters.
Generally, if the level of lighting detected in the content is higher than a pre-determined level, the soundtrack creator may generate a soundtrack with an upbeat rhythm. Correspondingly, if the level of lighting detected in the content is lower than the pre-determined level, the soundtrack creator may generate a soundtrack with a sombre rhythm.
Similarly, if the volume level/density of audio in the content is higher than a pre-determined level, the soundtrack creator may generate a soundtrack with an upbeat rhythm. Correspondingly, if the volume level/density of audio detected in the content is lower than the pre-determined level, the soundtrack creator may generate a soundtrack with a sombre rhythm.
The soundtrack creator may also utilize digital imaging technology such as, for example, face recognition technology, pixel agglomeration technology and the like to detect movement of subjects in the content of recorded video-with-audio. Generally, if vigorous movements are detected, the soundtrack creator may generate a soundtrack with an upbeat rhythm. Correspondingly, if few movements of subjects in the content are detected, the soundtrack creator may generate a soundtrack with a sombre rhythm.
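The threshold rules in the preceding three paragraphs can be sketched in Python as follows. This is only an illustration under stated assumptions: the measurements are taken to be normalised to the range 0.0 to 1.0, the threshold values are invented placeholders for the "pre-determined level", and combining the three parameters by majority vote is an assumption, since the text does not say how conflicting parameters are reconciled. A value exactly equal to the threshold is treated here as not higher.

```python
from dataclasses import dataclass

# Hypothetical thresholds standing in for the "pre-determined level".
LIGHTING_THRESHOLD = 0.5
AUDIO_THRESHOLD = 0.5
MOVEMENT_THRESHOLD = 0.5


@dataclass
class ContentParameters:
    """Normalised measurements (0.0 to 1.0) of the recorded content."""
    lighting: float     # level of lighting
    audio_level: float  # volume level or density of audio
    movement: float     # amount of subject movement detected


def suggest_rhythm(params: ContentParameters) -> str:
    """Return "upbeat" or "sombre" by majority vote (an assumed rule)."""
    votes = [
        params.lighting > LIGHTING_THRESHOLD,
        params.audio_level > AUDIO_THRESHOLD,
        params.movement > MOVEMENT_THRESHOLD,
    ]
    return "upbeat" if sum(votes) >= 2 else "sombre"


if __name__ == "__main__":
    bright_and_loud = ContentParameters(lighting=0.9, audio_level=0.8, movement=0.1)
    print(suggest_rhythm(bright_and_loud))
```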
It should be noted that the soundtrack creator is not only able to generate soundtracks with upbeat and sombre rhythms. The aforementioned are merely examples used to aid in understanding the preferred embodiment. Other rhythms such as, for example, frantic, relaxed, inspirational and so forth may also be generated by the soundtrack creator.
The soundtrack creator may have a capability to output an original composition as a soundtrack using stored sound samples (202), output at least one stored digitized musical file as a soundtrack (204), or output an original composition which combines stored sound samples and at least one stored digitized musical file (206). The digitized musical files may include file types such as, for example, MP3, WMA, OGG, MID, WAV, AAC and so forth. The digitized musical files may include short musical loops which may be repeated, stretched, and compressed. Each soundtrack may be stored as an audio tag (230).
The soundtrack creator may provide several soundtracks for selection, each of which may be previewed on the video recording apparatus by a user subsequent to incorporation with the video-with-audio before a soundtrack is selected for definitive incorporation with the video-with-audio (106). Incorporating the selected soundtrack may involve either replacement of all audio in the recorded video-with-audio or combining (mixing) audio in the recorded video-with-audio with the selected soundtrack. The selected soundtrack may be stored separately from audio content of the video-with-audio to facilitate editing of the content of the video-with-audio at a later time. Storing the selected soundtrack separately may also enable removal of the selected soundtrack. This may depend on a preference of the user.
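The two ways of incorporating the selected soundtrack described above (replacing all recorded audio, or mixing with it) can be sketched as below. The sketch assumes both signals are plain lists of float samples in the range -1.0 to 1.0 at the same sample rate; the function name `incorporate_soundtrack`, the `gain` parameter and the linear cross-mix are illustrative assumptions rather than anything specified in the text.

```python
def incorporate_soundtrack(recorded, soundtrack, mode="mix", gain=0.5):
    """Return output samples with the same length as the recorded audio.

    mode="replace": output is the soundtrack alone (padded with silence).
    mode="mix":     output is (1 - gain) * recorded + gain * soundtrack.
    """
    length = len(recorded)
    # Trim or pad the soundtrack so it matches the recorded audio length.
    padded = list(soundtrack[:length]) + [0.0] * max(0, length - len(soundtrack))
    if mode == "replace":
        return padded
    if mode == "mix":
        mixed = [(1.0 - gain) * r + gain * s for r, s in zip(recorded, padded)]
        # Clamp to the valid sample range to avoid clipping artefacts.
        return [max(-1.0, min(1.0, x)) for x in mixed]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(incorporate_soundtrack([1.0, 0.0], [0.0, 1.0], mode="mix"))
```

Because the function returns a new list and leaves the recorded samples untouched, the original audio remains available, which mirrors the idea of storing the soundtrack separately so that it can later be removed.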
The audio tag may be defined by a series of alphanumeric codes that represent characteristics of the selected soundtrack, such as, for example, mood, ambience, beat frequency and the like. The alphanumeric code may be a nine-character arrangement of letters and numerals, such as "abc 456 t5i", where the first three characters represent the mood, the middle three characters represent the ambience, and the final three characters represent the beat frequency. The alphanumeric code of the audio tag is readable by the soundtrack creator to aid in the generation of a soundtrack. This ensures that anyone with access to the soundtrack creator will have a capability to generate a soundtrack for a video-with-audio with an audio tag. The user need not know the alphanumeric code or the form of conversion per se, as the alphanumeric code represents a means to quantify non-quantifiable qualities like mood and ambience. For example, "abc" may mean "sleepy mood", "456" may mean "celebratory occasion" and so forth. The conversion may be performed using a user interface with which the user interacts. The user may be able to use the user interface to input the terms "sleepy mood" and "celebratory occasion", and the conversion into alphanumeric code is correspondingly performed for input into the soundtrack creator. It should be noted that this nine-character arrangement is not meant to be limiting as it is merely an illustrative example.
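The conversion between user-facing terms and the nine-character code can be sketched as a pair of lookup tables. The entries "abc" for "sleepy mood" and "456" for "celebratory occasion" come from the example above. The meaning assigned to "t5i", the table names and the `AudioTag` class are hypothetical, since the description does not state what "t5i" denotes.

```python
from dataclasses import dataclass

MOOD_CODES = {"sleepy mood": "abc"}
AMBIENCE_CODES = {"celebratory occasion": "456"}
BEAT_CODES = {"slow beat": "t5i"}  # assumed meaning, for illustration only


@dataclass(frozen=True)
class AudioTag:
    mood: str
    ambience: str
    beat: str


def encode(tag: AudioTag) -> str:
    """Convert user-facing terms into the grouped nine-character code."""
    return " ".join([MOOD_CODES[tag.mood],
                     AMBIENCE_CODES[tag.ambience],
                     BEAT_CODES[tag.beat]])


def decode(code: str) -> AudioTag:
    """Convert a nine-character code back into user-facing terms."""
    compact = code.replace(" ", "")
    if len(compact) != 9 or not compact.isalnum():
        raise ValueError("expected nine alphanumeric characters")

    def term(table: dict, part: str) -> str:
        # Reverse lookup: find the term whose code matches this segment.
        return {v: k for k, v in table.items()}[part]

    return AudioTag(mood=term(MOOD_CODES, compact[0:3]),
                    ambience=term(AMBIENCE_CODES, compact[3:6]),
                    beat=term(BEAT_CODES, compact[6:9]))
```

A user interface would call `encode` after the user picks terms, and a soundtrack creator would call `decode` when reading a tag, so the user never handles the code directly.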
The user may input soundtrack characteristics (103) into the soundtrack creator such that output from the soundtrack creator is similar to what the user desires. The presence of the audio tag may aid the user in ensuring that any particular characteristic which is mandatory for the soundtrack of the video-with-audio is taken into account by the soundtrack creator. More than one audio tag may be incorporated into the content of the video-with-audio as the mood and/or ambience of the content may vary, so a single audio tag may not be suitable for the entire content. Each audio tag may be defined to be invoked at a specific point in time during playback. Alternatively, each audio tag may include more than one soundtrack, and each soundtrack may be defined to be invoked at a specific point in time during playback.
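Invoking different audio tags at specific points during playback can be sketched as a lookup on a sorted timeline. The timeline layout (start time in seconds paired with a tag code), the function name `active_tag` and the second code used in the usage note are assumptions made for illustration.

```python
import bisect
from typing import List, Tuple


def active_tag(timeline: List[Tuple[float, str]], t: float) -> str:
    """Return the tag code in force at playback time t (seconds).

    `timeline` must be sorted by start time; each tag stays active
    until the next tag's start time.
    """
    starts = [start for start, _ in timeline]
    i = bisect.bisect_right(starts, t) - 1
    if i < 0:
        raise ValueError("no audio tag starts at or before this time")
    return timeline[i][1]
```

For example, with `[(0.0, "abc 456 t5i"), (30.0, "xyz 123 q2r")]` the first tag applies for the first thirty seconds and the second tag applies from then on.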
Subsequently, the user may edit the content of the video-with-audio incorporated with the selected soundtrack by removing portions such as bad takes or unwanted scenes from the content of the video-with-audio (108).
Consequently, the soundtrack creator may rectify the selected soundtrack such that the edited video-with-audio has a fluent soundtrack similar to the selected soundtrack (110). Referring back to Figure 2, rectifying the selected soundtrack may involve the soundtrack creator identifying gaps in the selected soundtrack (208) caused by removal of portions of content of the video-with-audio. It is likely that the removal of portions of content would cause a loss in fluency of the selected soundtrack. In order to maintain fluency of the selected soundtrack, the soundtrack creator may perform adaptation tasks (210) such as, for example, stretching or compressing the soundtrack, increasing or decreasing the number of loops of the soundtrack, recombining sound samples and so forth. A duration of the rectified selected soundtrack may be similar to a duration of the edited video-with-audio, and this correspondingly also maintains mood and/or ambience of the content. Rectification of the selected soundtrack may also be reflected in the audio tag (232).
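The duration-matching part of rectification can be sketched numerically: work out how long the edited recording is once the cut portions are removed, then choose a loop count and a stretch factor so that the looped soundtrack spans exactly that duration. The function names and the choice of rounding to the nearest whole loop are assumptions; the description does not prescribe a particular procedure.

```python
from typing import List, Tuple


def remaining_duration(total: float, cuts: List[Tuple[float, float]]) -> float:
    """Duration left after removing the (start, end) portions.

    Assumes the cuts do not overlap and lie within [0, total].
    """
    return total - sum(end - start for start, end in cuts)


def fit_loops(loop_length: float, target: float) -> Tuple[int, float]:
    """Return (loop count, stretch factor) such that
    loop count * loop_length * stretch factor equals target.

    A factor above 1.0 stretches each loop; below 1.0 compresses it.
    """
    if loop_length <= 0 or target <= 0:
        raise ValueError("durations must be positive")
    loops = max(1, round(target / loop_length))
    stretch = target / (loops * loop_length)
    return loops, stretch
```

For a 60-second recording with cuts at 10 to 15 seconds and 40 to 47 seconds, 48 seconds remain, which an 8-second loop covers with six repetitions and no stretching. Choosing the nearest whole number of loops keeps the stretch factor close to 1.0, which limits audible tempo change.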
The use of the audio tag allows the edited video-with-audio to be customized by either a maker of the video-with-audio (the user of the video recording apparatus) or a receiver/recipient of the edited video-with-audio. The receiver/recipient would be able to customize the selected soundtrack to their preference if there was a presence of the soundtrack creator on the device which is used to consume the edited video-with-audio. It should be noted that the soundtrack creator may be accessed remotely by the device and need not be installed in the device. The presence of the audio tag which contains information from the maker pertaining to any particular characteristic which is mandatory for the soundtrack of the video-with-audio may ensure that the soundtrack creator would not generate unsuitable soundtracks for selection.
The maker may also have an option of restricting rights of the receiver/recipient to tamper with any aspect of the edited video-with-audio, regardless of whether the receiver/recipient has access to the soundtrack creator. This measure may be implemented to preserve a style, ambience and mood as originally intended by the maker.
It should be noted that the audio tag mentioned in the preceding paragraphs is another aspect of the present invention.
Whilst there has been described in the foregoing description preferred embodiments of the present invention, it will be understood by those skilled in the technology concerned that many variations or modifications in details of design or construction may be made without departing from the present invention.

Claims

1. A method for incorporating a soundtrack into an edited video-with-audio recording using a video recording apparatus, the method including: recording the video-with-audio for subsequent playback; incorporating at least one soundtrack as an audio tag, the at least one soundtrack selected from a plurality of soundtracks, the plurality of soundtracks being generated from a soundtrack creator that depends on conditions selected from a group consisting of: parameters of content of the recorded video-with-audio, user-defined characteristics for the at least one soundtrack, and a combination of both of the aforementioned conditions; reviewing content of the video-with-audio which is incorporated with the selected soundtrack; editing the content of video-with-audio incorporated with the selected soundtrack by removing image frames of the video-with-audio; and rectifying the selected soundtrack using the soundtrack creator such that the edited video-with-audio has a fluent soundtrack similar to the selected soundtrack, wherein the edited video-with-audio is enhanced in relation to aural effects in a desired manner by the user.
2. The method as claimed in claim 1, wherein the soundtrack creator is able to perform tasks aiding in a creation of the audio tag, the tasks selected from the group consisting of: output an original composition for use as a soundtrack using stored sound samples, output at least one stored digitized musical file as a soundtrack, and a combination of the aforementioned tasks.
3. The method as claimed in claim 2, wherein the digitized musical files include file types selected from the group consisting of: MP3, WMA, OGG, MID, WAV and AAC.
4. The method as claimed in claim 1, wherein the parameters are selected from the group consisting of: level of lighting in content of recorded video-with-audio, volume level of audio in content of recorded video-with-audio, density of audio in content of recorded video-with-audio, movement of subjects in content of recorded video-with-audio, and any combination of the aforementioned parameters.
5. The method as claimed in claim 1, wherein incorporating the selected soundtrack involves either replacement of all audio in recorded content of the video-with-audio or combining audio in recorded content of the video-with-audio with the selected soundtrack.
6. The method as claimed in claim 1, wherein rectifying the selected soundtrack involves the soundtrack creator performing the steps of: identifying gaps in the selected soundtrack; and adapting and assessing fluency of the selected soundtrack; wherein a duration of the rectified selected soundtrack is similar to a duration of the edited video-with-audio.
7. The method as claimed in claim 1, wherein the aural effects emphasise a mood and ambience of the content of the edited video-with-audio during playback.
8. The method as claimed in claim 6, wherein the adapting of the soundtrack involves the soundtrack creator performing tasks selected from the group consisting of: stretch/compress the soundtrack, increase/decrease the number of loops of the soundtrack, and re-combination of sound samples.
9. The method as claimed in claim 1, wherein the audio tag is stored separately from audio content of the video-with-audio to facilitate editing of the content of the video-with-audio.
10. The method as claimed in claim 9, wherein the audio tag is defined by a series of alphanumeric code that represent characteristics of the selected soundtrack, the characteristics selected from the group consisting of: mood, ambience, and beat frequency.
11. The method as claimed in claim 1, further including preventing tampering to the selected soundtrack incorporated with the video-with-audio.
12. The method as claimed in claim 10, wherein the characteristics are input using a user interface which converts the characteristics into the series of alphanumeric code.
13. The method as claimed in claim 9, wherein the alphanumeric code is readable by the soundtrack creator.
14. The method as claimed in claim 1, wherein a recipient is able to vary the selected soundtrack when the recipient has access to the soundtrack creator.
15. The method as claimed in claim 14, wherein the access is either remote or local.
16. An audio tag used to generate a soundtrack in a video-with-audio using a soundtrack creator of the method of claim 1.
PCT/SG2008/000332 2007-09-05 2008-09-05 A method for incorporating a soundtrack into an edited video-with-audio recording and an audio tag WO2009031979A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP20080829376 EP2208344A4 (en) 2007-09-05 2008-09-05 A method for incorporating a soundtrack into an edited video-with-audio recording and an audio tag
US12/676,882 US20100226620A1 (en) 2007-09-05 2008-09-05 Method For Incorporating A Soundtrack Into An Edited Video-With-Audio Recording And An Audio Tag
CN200880105676XA CN101796829B (en) 2007-09-05 2008-09-05 A method for incorporating a soundtrack into an edited video-with-audio recording and an audio tag
HK11100830.1A HK1146775A1 (en) 2007-09-05 2011-01-27 A method for incorporating a soundtrack into an edited video-with-audio recording and an audio tag

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG200706527-9 2007-09-05
SG200706527-9A SG150415A1 (en) 2007-09-05 2007-09-05 A method for incorporating a soundtrack into an edited video-with-audio recording and an audio tag

Publications (1)

Publication Number Publication Date
WO2009031979A1 true WO2009031979A1 (en) 2009-03-12

Family

ID=40429140

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2008/000332 WO2009031979A1 (en) 2007-09-05 2008-09-05 A method for incorporating a soundtrack into an edited video-with-audio recording and an audio tag

Country Status (7)

Country Link
US (1) US20100226620A1 (en)
EP (1) EP2208344A4 (en)
CN (1) CN101796829B (en)
HK (1) HK1146775A1 (en)
SG (1) SG150415A1 (en)
TW (1) TWI519157B (en)
WO (1) WO2009031979A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4719250B2 (en) * 2008-06-20 2011-07-06 株式会社ソニー・コンピュータエンタテインメント Screen recording apparatus, screen recording method, screen recording program, and information storage medium
JP5887941B2 (en) * 2012-01-12 2016-03-16 ティアック株式会社 Electronic equipment with faders
US9858899B2 (en) 2013-06-13 2018-01-02 Microsoft Technology Licensing, Llc Managing transitions of adaptive display rates for different video playback scenarios
US9620169B1 (en) * 2013-07-26 2017-04-11 Dreamtek, Inc. Systems and methods for creating a processed video output
CN104347096A (en) * 2013-08-09 2015-02-11 上海证大喜马拉雅网络科技有限公司 Recording system and method integrating audio cutting, continuous recording and combination
US9767846B2 (en) * 2014-04-29 2017-09-19 Frederick Mwangaguhunga Systems and methods for analyzing audio characteristics and generating a uniform soundtrack from multiple sources
EP3161829B1 (en) * 2014-06-30 2019-12-04 Mario Amura Audio/video editing device, movie production method starting from still images and audio tracks and associated computer program
US9990349B2 (en) 2015-11-02 2018-06-05 Microsoft Technology Licensing, Llc Streaming data associated with cells in spreadsheets
US10713428B2 (en) 2015-11-02 2020-07-14 Microsoft Technology Licensing, Llc Images associated with cells in spreadsheets
EP3387523A4 (en) * 2015-12-07 2019-08-21 Creative Technology Ltd. An audio system
CN105872727A (en) * 2016-03-31 2016-08-17 乐视控股(北京)有限公司 Video stream transcoding method and device
CN106371797A (en) * 2016-08-31 2017-02-01 腾讯科技(深圳)有限公司 Method and device for configuring sound effect
US10734026B2 (en) * 2016-09-01 2020-08-04 Facebook, Inc. Systems and methods for dynamically providing video content based on declarative instructions
US10991379B2 (en) * 2018-06-22 2021-04-27 Babblelabs Llc Data driven audio enhancement
CN109034011A (en) * 2018-07-06 2018-12-18 成都小时代科技有限公司 It is a kind of that Emotional Design is applied to the method and system identified in label in car owner

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997036297A1 (en) * 1996-03-25 1997-10-02 Interval Research Corporation Automated synchronization of video image sequences to new soundtracks
US5999906A (en) * 1997-09-24 1999-12-07 Sony Corporation Sample accurate audio state update
US6289165B1 (en) 1998-11-12 2001-09-11 Max Abecassis System for and a method of playing interleaved presentation segments
US20050228663A1 (en) * 2004-03-31 2005-10-13 Robert Boman Media production system using time alignment to scripts
US20060147185A1 (en) * 2005-01-05 2006-07-06 Creative Technology Ltd. Combined audio/video/USB device
US7225405B1 (en) * 1999-09-28 2007-05-29 Ricoh Company, Ltd. System and method for audio creation and editing in a multimedia messaging environment

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09507941A (en) * 1995-04-18 1997-08-12 インターナシヨナル・ビジネス・マシーンズ・コーポレーシヨン Block normalization without wait cycles in a multi-add floating point sequence
US6396874B1 (en) * 1997-11-12 2002-05-28 Sony Corporation Decoding method and apparatus and recording method and apparatus for moving picture data
US20030227473A1 (en) * 2001-05-02 2003-12-11 Andy Shih Real time incorporation of personalized audio into video game
US7236960B2 (en) * 2002-06-25 2007-06-26 Eastman Kodak Company Software and system for customizing a presentation of digital images
US20040064702A1 (en) * 2002-09-27 2004-04-01 Yu Hong Heather Methods and apparatus for digital watermarking and watermark decoding
EP1666967B1 (en) * 2004-12-03 2013-05-08 Magix AG System and method of creating an emotional controlled soundtrack
US20060204214A1 (en) * 2005-03-14 2006-09-14 Microsoft Corporation Picture line audio augmentation
WO2007068090A1 (en) * 2005-12-12 2007-06-21 Audiokinetic Inc. System and method for authoring media content
JP4229127B2 (en) * 2006-02-14 2009-02-25 ソニー株式会社 Video processing apparatus and time code adding method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2208344A4 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013079763A1 (en) 2011-11-30 2013-06-06 Nokia Corporation Quality enhancement in multimedia capturing
EP2786373A4 (en) * 2011-11-30 2015-10-14 Nokia Technologies Oy Quality enhancement in multimedia capturing
US9282279B2 (en) 2011-11-30 2016-03-08 Nokia Technologies Oy Quality enhancement in multimedia capturing
EP2991076A1 (en) * 2014-08-28 2016-03-02 Thomson Licensing Method for selecting a sound track for a target video clip and corresponding device
CN105227763A (en) * 2015-08-31 2016-01-06 武汉工程大学 A kind of instrumental audio real time method for segmenting realized on Intelligent mobile equipment
CN113038258A (en) * 2021-03-04 2021-06-25 重庆电子工程职业学院 Digital multimedia audio transfer method and device

Also Published As

Publication number Publication date
EP2208344A4 (en) 2011-03-02
HK1146775A1 (en) 2011-07-08
TW200920115A (en) 2009-05-01
CN101796829A (en) 2010-08-04
SG150415A1 (en) 2009-03-30
US20100226620A1 (en) 2010-09-09
CN101796829B (en) 2012-07-11
TWI519157B (en) 2016-01-21
EP2208344A1 (en) 2010-07-21

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200880105676.X

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08829376

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 12676882

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2008829376

Country of ref document: EP