WO2007132427A1 - Ringtone customization for portable telecommunication applications - Google Patents

Ringtone customization for portable telecommunication applications

Info

Publication number
WO2007132427A1
Authority
WO
WIPO (PCT)
Prior art keywords
ringtone
user
audio data
electronic device
synthesizer
Prior art date
Application number
PCT/IB2007/051836
Other languages
French (fr)
Inventor
Laurent Lucat
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Publication of WO2007132427A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M19/00 Current supply arrangements for telephone systems
    • H04M19/02 Current supply arrangements for telephone systems providing ringing current or supervisory tones, e.g. dialling tone or busy tone
    • H04M19/04 Current supply arrangements for telephone systems providing ringing current or supervisory tones, e.g. dialling tone or busy tone the ringing-current being generated at the substations
    • H04M19/041 Encoding the ringing signal, i.e. providing distinctive or selective ringing capability
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/36 Accompaniment arrangements
    • G10H1/361 Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
    • G10H1/366 Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems with means for modifying or correcting the external signal, e.g. pitch correction, reverberation, changing a singer's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2230/00 General physical, ergonomic or hardware implementation of electrophonic musical tools or instruments, e.g. shape or architecture
    • G10H2230/005 Device type or category
    • G10H2230/021 Mobile ringtone, i.e. generation, transmission, conversion or downloading of ringing tones or other sounds for mobile telephony; Special musical data formats or protocols herefor
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/011 Files or data streams containing coded musical information, e.g. for transmission
    • G10H2240/031 File merging MIDI, i.e. merging or mixing a MIDI-like file or stream with a non-MIDI file or stream, e.g. audio or video
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/011 Files or data streams containing coded musical information, e.g. for transmission
    • G10H2240/046 File format, i.e. specific or non-standard musical file format used in or adapted for electrophonic musical instruments, e.g. in wavetables
    • G10H2240/056 MIDI or other note-oriented file format
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/471 General musical sound synthesis principles, i.e. sound category-independent synthesis methods
    • G10H2250/481 Formant synthesis, i.e. simulating the human speech production mechanism by exciting formant resonators, e.g. mimicking vocal tract filtering as in LPC synthesis vocoders, wherein musical instruments may be used as excitation signal to the time-varying filter estimated from a singer's speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/541 Details of musical waveform synthesis, i.e. audio waveshape processing from individual wavetable samples, independently of their origin or of the sound they represent
    • G10H2250/615 Waveform editing, i.e. setting or modifying parameters for waveform synthesis.

Definitions

  • This invention relates generally to ringtone customization for portable telecommunication applications and, more specifically, to a system and method for use with a polyphonic synthesizer in a portable electronic device for ringtone customization, wherein the portable electronic device has a microphone for receiving audio data input by a user and a stored bank of pre-stored ringtones.
  • GM Level 1 specifies that the synthesizer can address the 128 predefined melodic GM instruments, as well as a 47-sound drum kit.
  • Such devices are also increasingly becoming compliant with the well-known XMF (eXtensible Music Format), which basically consists of the combination of MIDI score data together with DLS (DownLoadable Sounds), the latter consisting of customized sound waveforms (sounds not belonging to the GM database).
  • XMF eXtensible Music Format
  • DLS DownLoadable Sounds
  • XMF, like proprietary formats such as Yamaha SMAF (Synthetic music Mobile Application Format)
  • These sounds are designed by the XMF (or SMAF, ...) content creator, who would typically be a person experienced in music and audio applications, with the help of content creation software running on a PC (Personal Computer).
  • US Patent Application Publication No. US2004/0120506A1 describes a system wherein an embedded microphone in a portable electronic device can be used to receive audio data input by a user, such as a cough or other inconspicuous sound, and then play back this audio data as a ringtone for event notification.
  • a ringtone customization system for use with a polyphonic synthesizer in a portable electronic device, said portable electronic device further comprising a microphone for receiving audio data input by a user, and a stored bank of pre-stored ringtones formatted for playback by said polyphonic synthesizer, the system comprising means for formatting the audio data input by a user for playback by said polyphonic synthesizer, and means for combining said audio data with at least a portion of a pre-stored ringtone to form a customized ringtone for playback by said polyphonic synthesizer.
  • a selected pre-recorded ringtone can be customized and personalized using audio data input via an embedded microphone, rather than requiring the use of an external computing device to create a customized ringtone for download onto the portable electronic device.
  • a pre-recorded ringtone comprises a plurality of sound channels, one or more of which is substituted with said audio data input by the user to form said customized ringtone.
  • Means are beneficially provided for extracting audio data input by the user from background noise received simultaneously at said microphone.
  • means may be provided for extracting and storing audio parameters from said audio data input by the user.
  • the analogue audio data is first digitized (e.g. by converting it to PCM for Pulse Code Modulation) before audio parameter extraction.
  • the pre-recorded ringtones are in the known MIDI format and, in one exemplary embodiment, the polyphonic synthesizer may be wavetable-based and XMF compliant.
  • Figure 1 is a schematic block diagram illustrating the principal components of a system according to a first exemplary embodiment of the invention
  • Figure 2 is a schematic block diagram illustrating the principal components of a system according to a second exemplary embodiment of the present invention
  • Figure 3 is a schematic flow diagram illustrating the principal steps in building a DLS file in respect of audio data input by a user to the system of Figure 2;
  • Figure 4 is a schematic flow diagram illustrating the principal steps in determining and storing user-sound timbre parameters in respect of audio data input by a user to the system of Figure 1.
  • the wavetable approach means that real instrument notes are captured and analyzed (in labs, at the design stage) in order to extract stationary fragments. In this way, it is not necessary to store, for example, four seconds of a trumpet sound in order to play back four seconds of the trumpet sound. Instead, by determining the start and end of a loop inside the recorded sound, it is possible to play the note by iterating between the start and end loop points. It is then only necessary to store the samples up to the end-loop point. Depending on the stationary nature of the instrument sound, and on the targeted sound quality, this enables a significant memory saving.
  • a time-evolving slope can be extracted from the signal, based on the evolution of its power over time. Such information can be easily captured, and can afterwards drive the synthesis process, for instance according to an ADSR (Attack-Decay-Sustain-Release) parameter model.
  • ADSR Attack-Decay-Sustain-Release
  • the synthesizer is not wavetable-based or is not XMF compliant.
  • the implementation architecture of the invention inside the mobile phone is summarized in the schematic drawing of Figure 1.
  • Input sound 100 (e.g. user singing voice) is captured by the microphone 102 embedded in a mobile phone 104 and converted into PCM by the analogue-to-digital converter (ADC) 106.
  • a dedicated module 108 extracts audio parameters from the PCMs by means of a known technique of user sound (or voice) analysis.
  • in the case of a sinusoidal-parametric synthesizer, harmonic frequencies, amplitudes and phases are extracted, as is done for coding purposes.
  • Looping points start and end of loops
  • the user-sound parameters are converted at module 110 to the synthesizer internal format (as for other GM devices) and stored in a user sound bank 112 (i.e. storage of user-sound timbre parameters: loop start, loop end, harmonic frequencies, amplitudes and phases, temporal envelope parameters such as pitch and amplitude modulation parameters) within the synthesizer module, in the same way as the other GM instruments.
  • a user sound bank 112 i.e. storage of user-sound timbre parameters: loop start, loop end, harmonic frequencies, amplitudes and phases, temporal envelope parameters such as pitch and amplitude modulation parameters
  • the MIDI synthesizer substitutes one or more channels, or one or more GM instruments, of the MIDI melody with the "user-instrument" retrieved from the user sound bank 112.
  • the required MIDI file is retrieved from a MIDI file database 114 and the required user sound is used as a substitute for one or more of the GM channels of the ringtone (as defined by the GM sound bank 113).
  • the resultant file generated by the MIDI polyphonic synthesis kernel 115 is converted to analogue format by the DAC 116, amplified at 118 and a user-customized ringtone 120 is played back by a loudspeaker 122.
  • the synthesizer is wavetable-based and is XMF compliant.
  • the implementation architecture of the invention inside the mobile phone is summarized by the schematic drawing of Figure 2.
  • Audio input data 200 (e.g. user singing voice) is captured by the microphone 202 embedded in the mobile phone 204 and converted into PCM by the analogue-to-digital converter (ADC) 206.
  • a dedicated module 208 extracts basic waveforms from the signal, by determining looping points (start and end of loop).
  • a temporal slope can also be extracted (e.g. using signal energy computation), leading e.g. to ADSR parameters, or it can be modeled based on a-priori knowledge about the signal (for instance, based on the fact that the user sound is assumed to be a singing-voice sound).
  • the waveforms are converted at module 210 into suitable DLS format. This closes the "initialization process".
  • the MIDI file is converted into XMF format at module 224, involving the inclusion of the "user-DLS" and the replacement of one or more GM instrument allocations (defined by the GM sound bank 213) by the user-instrument inside the XMF file, and the obtained XMF file is sent to the XMF synthesizer 215.
  • the resultant file is then converted to analogue at 216, amplified at 218 and a user-customized ringtone 220 is played back by loudspeaker 222 as before.
  • signal calibration 300 basically consists of isolating the real audio signal content by cutting the first samples where the sound is not present (i.e. where there is only background noise), as well as the last background-noise samples after the sound. This can be enhanced by a known process, such as equalization or noise reduction.
  • Amplitude normalization 302 is then applied to the signal in order to obtain suitable signal dynamics.
  • Pitch extraction 304 is a state-of-the-art process as will be known to a person skilled in the art.
  • Pitch normalization 306 is based on the pitch detected in the previous stage. It aims at shifting the signal pitch to a desired standard value (one of the notes of the musical scale) by any state-of-the-art process such as sample interpolation or resampling.
  • Loop points (start and end) are determined at 308 using, for example, a signal autocorrelation process. An optimal loop will correspond to a maximal value of the autocorrelation. Furthermore, since the signal has been normalized in pitch, one may exploit the fact that the loop length is expected to match the pitch period of the sample, which will help the loop search. Signal truncation to the end-loop point 310 is trivial, as will be apparent to a person skilled in the art, and the result can be stored in PCM format.
  • Envelope parameter extraction 312 consists of extracting the slopes and durations of the different phases of the signal.
  • a typical ADSR (attack, decay, sustain, release) model may be used.
  • The process can use signal local energy computation (state-of-the-art).
  • Extraction 314 of pitch and amplitude modulation parameters (typically corresponding to vibrato and tremolo) is optional; a default (typical) value can be used instead. If real extraction is desired, the parameters can be based on the temporal evolution of pitch and amplitude respectively (state-of-the-art).
  • the outputs from modules 308, 310, 312 and 314 are used to build a DLS file at 316, and the DLS file 318 is output.
  • the presented description of the first configuration in relation to Figure 1 takes the example of a parametric sinusoidal MIDI synthesizer, such as the LifeVibes JingleBlaster developed by Philips.
  • signal calibration 400, amplitude normalization 402 and pitch extraction 406 are performed as before, but pitch normalization is unnecessary.
  • Loop points (start and end) determination 408, envelope parameter extraction 412 and pitch and amplitude modulation parameter extraction 414 are performed as before, and extraction 420 of harmonic frequencies, amplitudes (and, optionally, phases) is essentially the same as in the parametric sinusoidal audio encoding process.
  • Phase extraction is optional, since phases can be computed at the synthesis stage.
  • pitch and amplitude modulation parameter extraction is also optional.
  • the parameter storage format is dependent on the synthesizer implementation. Basically, this can be the same as for the known GM instruments.
  • the DLS level 1 specification allows up to 16 different samples for the same instrument. DLS also specifies how to manage these different samples.
  • prior art MIDI polyphonic ringers do not allow the end-users to personalize the sound rendering of their melodies without the help of an external computer.
  • sound-rendering personalization is useful for portable devices such as mobile phones, in the way that it allows better user-ringer identification for an incoming call/incoming SMS notification, and, according to the tuning, it can produce distinctive sounding ringtones, which are commonly appreciated from a young-user experience point of view.
  • the proposed solution performs the sound rendering personalization through the use of the embedded microphone, which allows capturing sounds like the user singing voice.
  • the vocal samples are not compressed in the proposed solution, but are instead analyzed to extract voice parameters, in order to build an instrument that can afterwards be used by the synthesizer kernel to substitute one or more voices (one or more instruments) when playing any incoming MIDI file.
  • the user-sound e.g. user-voice
  • the microphone can be considered as a synthesizer initialization, which has to be done only once (or only each time the user wants to change the personalization), that is, it does not need to be done when the incoming MIDI file is changed.
  • the proposed invention combines already existing elements of mobile phone devices (microphone, MIDI/XMF player) together with a new module, preferably implemented as a Software module, which aims to convert input sounds into instrument-characteristic data that can be used by the MIDI/XMF player. In this way, a "user" instrument is created.
  • the synthesizer will be able to play input MIDI-music with some channels producing music notes having the sounding of the microphone-recorded sound (e.g. the user singing voice).
  • Embedded solution in mobile or home phones equipped with a MIDI or XMF synthesizer, for a user-customized ringing (incoming call, SMS or any other notification alert) feature.
  • Embedded solution in mobile devices such as mobile or cordless phones, or Personal Digital Assistants (PDAs), for entertainment purposes.
  • mobile devices such as mobile or cordless phones, or Personal Digital Assistants (PDAs)
  • PDAs Personal Digital Assistants

Abstract

A ringtone customization system for use with a polyphonic synthesizer (115) of a portable electronic device (104) having a microphone (102) embedded therein. A user inputs audio (e.g. voice) data (100) via the microphone (102). The audio input data (100) is then formatted and used to replace one or more channels of a pre-recorded MIDI format ringtone. Thus, a customized ringtone (120) can be generated for playback by the synthesizer (115) without the need for an external computing device.

Description

Ringtone customization for portable telecommunication applications.
FIELD OF THE INVENTION
This invention relates generally to ringtone customization for portable telecommunication applications and, more specifically, to a system and method for use with a polyphonic synthesizer in a portable electronic device for ringtone customization, wherein the portable electronic device has a microphone for receiving audio data input by a user and a stored bank of pre-stored ringtones.
BACKGROUND OF THE INVENTION
Nowadays, most mobile phone handsets include a software or hardware polyphonic synthesizer for their ringtone feature. The typical achievable polyphony level is 16, 32 or 64 voices, depending on the platform capabilities. Most (if not all) of these synthesizers are compliant with the well-known MIDI (Musical Instrument Digital Interface) file format, as well as GM (General MIDI) Level 1 or Level 2 capabilities. For instance, GM Level 1 capability specifies that the synthesizer can address the 128 predefined melodic GM instruments, as well as a 47-sound drum kit.
Such devices are also increasingly becoming compliant with the well-known XMF (eXtensible Music Format), which basically consists of the combination of MIDI score data together with DLS (DownLoadable Sounds), the latter consisting of customized sound waveforms (sounds not belonging to the GM database). XMF (like proprietary formats such as Yamaha SMAF, the Synthetic music Mobile Application Format) allows one to play music with sounds other than the conventional GM ones. These sounds are designed by the XMF (or SMAF, ...) content creator, who would typically be a person experienced in music and audio applications, with the help of content creation software running on a PC (Personal Computer). As a result, the end-user of the mobile device cannot personalize a ringtone by any means (the only option in this case is to select another ringing file).
US Patent Application Publication No. US2004/0120506A1 describes a system wherein an embedded microphone in a portable electronic device can be used to receive audio data input by a user, such as a cough or other inconspicuous sound, and then play back this audio data as a ringtone for event notification. However, there is no facility for customizing existing pre-recorded ringtones using audio data input by the user without the need for an external computing device on which to build the customized ringtone and from which the customized ringtone can then be downloaded to the portable electronic device.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a system wherein user-recorded audio data can be used to customize the audio output of a portable electronic device, without the need for an external computing device.
In accordance with the present invention, there is provided a ringtone customization system for use with a polyphonic synthesizer in a portable electronic device, said portable electronic device further comprising a microphone for receiving audio data input by a user, and a stored bank of pre-stored ringtones formatted for playback by said polyphonic synthesizer, the system comprising means for formatting the audio data input by a user for playback by said polyphonic synthesizer, and means for combining said audio data with at least a portion of a pre-stored ringtone to form a customized ringtone for playback by said polyphonic synthesizer.
Thus, a selected pre-recorded ringtone can be customized and personalized using audio data input via an embedded microphone, rather than requiring the use of an external computing device to create a customized ringtone for download onto the portable electronic device. In a preferred embodiment, a pre-recorded ringtone comprises a plurality of sound channels, one or more of which is substituted with said audio data input by the user to form said customized ringtone.
Means are beneficially provided for extracting audio data input by the user from background noise received simultaneously at said microphone. In a first exemplary embodiment, means may be provided for extracting and storing audio parameters from said audio data input by the user. Preferably, the analogue audio data is first digitized (e.g. by converting it to PCM for Pulse Code Modulation) before audio parameter extraction. In a preferred embodiment, the pre-recorded ringtones are in the known MIDI format and, in one exemplary embodiment, the polyphonic synthesizer may be wavetable-based and XMF compliant.
These and other aspects of the invention will be apparent from, and elucidated with reference to, the embodiments described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the present invention will now be described by way of example only and with reference to the accompanying drawings, in which:
Figure 1 is a schematic block diagram illustrating the principal components of a system according to a first exemplary embodiment of the invention;
Figure 2 is a schematic block diagram illustrating the principal components of a system according to a second exemplary embodiment of the present invention;
Figure 3 is a schematic flow diagram illustrating the principal steps in building a DLS file in respect of audio data input by a user to the system of Figure 2; and
Figure 4 is a schematic flow diagram illustrating the principal steps in determining and storing user-sound timbre parameters in respect of audio data input by a user to the system of Figure 1.
DETAILED DESCRIPTION OF THE INVENTION
By way of background, the first polyphonic synthesizers embedded in highly constrained mobile devices such as mobile phones (before 2000) were hardware-implemented and frequency-modulation (FM) based (see e.g. the Yamaha MA-1 and MA-2 chipsets), whereas most embedded synthesizers are nowadays hardware- or software-implemented and based on wavetable technology.
Basically, the wavetable approach means that real instrument notes are captured and analyzed (in labs, at the design stage) in order to extract stationary fragments. In this way, it is not necessary to store, for example, four seconds of a trumpet sound in order to play back four seconds of the trumpet sound. Instead, by determining the start and end of a loop inside the recorded sound, it is possible to play the note by iterating between the start and end loop points. It is then only necessary to store the samples up to the end-loop point. Depending on the stationary nature of the instrument sound, and on the targeted sound quality, this enables a significant memory saving.
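The looping idea described above can be illustrated with a minimal Python sketch. The function name `render_looped_note` and the toy sample values are assumptions made purely for illustration; they are not taken from the patent or from any particular synthesizer.

```python
def render_looped_note(samples, loop_start, loop_end, n_out):
    """Render n_out samples from a waveform stored only up to loop_end.

    samples[0:loop_start] (the attack) is played once; afterwards the
    segment samples[loop_start:loop_end] repeats for as long as needed.
    """
    if not 0 <= loop_start < loop_end <= len(samples):
        raise ValueError("need 0 <= loop_start < loop_end <= len(samples)")
    loop_len = loop_end - loop_start
    out = []
    for n in range(n_out):
        if n < loop_end:
            out.append(samples[n])  # first pass through the stored data
        else:
            # wrap back into the loop segment
            out.append(samples[loop_start + (n - loop_start) % loop_len])
    return out


# Hypothetical stored data: a 3-sample attack followed by a 4-sample loop.
stored = [0, 3, 6, 9, 5, -5, -9]
print(render_looped_note(stored, loop_start=3, loop_end=7, n_out=12))
# prints [0, 3, 6, 9, 5, -5, -9, 9, 5, -5, -9, 9]
```

Seven stored samples are enough here to render a note of any length, which is the memory saving the paragraph refers to.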
Furthermore, it is not required to store the data for all notes of the instrument, since the spectral content does not change significantly between two adjacent notes (in the chromatic scale), thereby further reducing the storage required. In addition, a time-evolving slope can be extracted from the signal, based on the evolution of its power over time. Such information can be easily captured, and can afterwards drive the synthesis process, for instance according to an ADSR (Attack-Decay-Sustain-Release) parameter model.
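As a rough illustration of how ADSR parameters can drive synthesis, the sketch below builds a piecewise-linear gain curve that would be multiplied sample by sample with the looped waveform. The linear segment shapes, the parameter names and the use of sample counts for durations are assumptions for illustration; the patent does not prescribe them.

```python
def adsr_envelope(n_samples, attack, decay, sustain_level, release):
    """Piecewise-linear ADSR gain curve; attack, decay, release are in samples."""
    sustain = n_samples - attack - decay - release
    if sustain < 0:
        raise ValueError("note too short for the requested attack/decay/release")
    env = []
    env += [i / attack for i in range(attack)]                              # 0 -> 1
    env += [1.0 - (1.0 - sustain_level) * i / decay for i in range(decay)]  # 1 -> sustain
    env += [sustain_level] * sustain                                        # hold
    env += [sustain_level * (1.0 - i / release) for i in range(release)]    # sustain -> 0
    return env


envelope = adsr_envelope(n_samples=10, attack=2, decay=2, sustain_level=0.5, release=2)
print(envelope)
# prints [0.0, 0.5, 1.0, 0.75, 0.5, 0.5, 0.5, 0.5, 0.5, 0.25]
```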
Some alternative approaches to the wavetable are also currently being investigated or developed, such as a parametric sinusoidal synthesizer by Philips. Basically, at the synthesizer design stage, instrument sounds are analyzed and modeled as a combination of time-evolving sinusoids (with variable frequencies and amplitudes) plus noise. As was the case with the wavetable approach, and for the same reasons, start and end looping points are determined, as well as temporal slope parameters. Only the model parameters have to be stored (e.g. frequencies, amplitudes, phases, loop start, loop end, ADSR slope parameters). Instrument sound synthesis can be obtained using only these stored parameters.
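The parametric idea can be sketched as additive synthesis from a short list of stored partials. This is a simplified sketch only: it assumes constant frequencies and amplitudes and omits the noise component, whereas the text above describes time-evolving sinusoids plus noise. The function name and tuple layout are hypothetical.

```python
import math


def synthesize_partials(partials, sample_rate, n_samples):
    """Sum stationary sinusoids given as (frequency_hz, amplitude, phase_rad)."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        out.append(sum(a * math.sin(2.0 * math.pi * f * t + p)
                       for f, a, p in partials))
    return out


# Hypothetical instrument: a 220 Hz fundamental with two weaker harmonics.
voice = [(220.0, 1.0, 0.0), (440.0, 0.5, 0.0), (660.0, 0.25, 0.0)]
signal = synthesize_partials(voice, sample_rate=8000, n_samples=80)
```

Only the nine numbers in `voice` need to be stored, rather than the waveform itself.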
It will be appreciated that the detailed implementation of the proposed invention will depend on the configuration and capabilities of the components already embedded in the mobile terminal. More precisely, it will depend on the synthesis technology used by the polyphonic synthesizer and its compliance to file formats such as XMF.
It is therefore useful to distinguish herein between two main exemplary configurations and the difference between the two configurations will lead to slightly different implementations, which will be discussed hereafter.
In a first configuration, the synthesizer is not wavetable-based or is not XMF compliant. The implementation architecture of the invention inside the mobile phone is summarized in the schematic drawing of Figure 1.
Input sound 100 (e.g. the user's singing voice) is captured by the microphone 102 embedded in a mobile phone 104 and converted into PCM by the analogue-to-digital converter (ADC) 106. A dedicated module 108 extracts audio parameters from the PCM samples by means of a known technique of user sound (or voice) analysis. By way of example, in the case of a sinusoidal-parametric synthesizer, harmonic frequencies, amplitudes and phases are extracted, as is done for coding purposes. Looping points (start and end of loops) are also determined, based, for example, on detection of (near-)cyclic periodicity in the signal, as are temporal slope parameters (e.g. ADSR), based for instance on signal energy computation or on a-priori knowledge about the signal (for example, the user sound is assumed to be a singing-voice sound).
Then, the user-sound parameters are converted at module 110 to the synthesizer internal format (as for other GM devices) and stored in a user sound bank 112 (i.e. storage of user-sound timbre parameters: loop start, loop end, harmonic frequencies, amplitudes and phases, temporal envelope parameters such as pitch and amplitude modulation parameters) within the synthesizer module, in the same way as the other GM instruments. This closes a so-called "initialization process". Afterwards, for each MIDI file coming from the MIDI file database, the MIDI synthesizer substitutes one or more channels, or one or more GM instruments, of the MIDI melody with the "user-instrument" retrieved from the user sound bank 112.
When a ringtone is required to be played back, the required MIDI file is retrieved from a MIDI file database 114 and the required user sound is used as a substitute for one or more of the GM channels of the ringtone (as defined by the GM sound bank 113). The resultant file generated by the MIDI polyphonic synthesis kernel 115 is converted to analogue format by the DAC 116 and amplified at 118, and a user-customized ringtone 120 is played back by a loudspeaker 122.
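The substitution step can be pictured with a small sketch that operates on a simplified, hypothetical event list rather than on a real MIDI file parser. The dictionary layout, the function name and the value `USER_PROGRAM = 128` (an internal bank index just past the 128 GM melodic programs, not a valid 7-bit MIDI program number) are all assumptions for illustration.

```python
USER_PROGRAM = 128  # hypothetical internal index for the user instrument


def substitute_instrument(events, gm_program, user_program=USER_PROGRAM):
    """Return a copy of events in which program changes that select
    gm_program select the user instrument instead."""
    result = []
    for ev in events:
        if ev["type"] == "program_change" and ev["program"] == gm_program:
            ev = dict(ev, program=user_program)  # copy, leave the input intact
        result.append(ev)
    return result


melody = [
    {"type": "program_change", "channel": 0, "program": 56},  # GM trumpet (0-based)
    {"type": "note_on", "channel": 0, "note": 60, "velocity": 100},
    {"type": "program_change", "channel": 1, "program": 0},   # GM piano
]
customized = substitute_instrument(melody, gm_program=56)
print(customized[0]["program"], customized[2]["program"])
# prints 128 0
```

Because only the instrument selection changes, the same stored user sound can be applied to any MIDI file without repeating the initialization process.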
In a second configuration, the synthesizer is wavetable-based and is XMF compliant. The implementation architecture of the invention inside the mobile phone is summarized by the schematic drawing of Figure 2.
Audio input data 200 (e.g. user singing voice) is captured by the microphone 202 embedded in the mobile phone 204 and converted into PCM by the analogue-to-digital converter (ADC) 206. A dedicated module 208 extracts basic waveforms from the signal, by determining looping points (start and end of loop). A temporal slope can also be extracted (e.g. using signal energy computation), leading e.g. to ADSR parameters, or it can be modeled based on a-priori knowledge about the signal (for instance, based on the fact that the user sound is assumed to be a singing-voice sound). Then, the waveforms are converted at module 210 into suitable DLS format. This closes the "initialization process". Afterwards, for each MIDI file coming from the MIDI file database 214, the MIDI file is converted into XMF format at module 224, involving the inclusion of the "user-DLS" and the replacement of one or more GM instrument allocations (defined by the GM sound bank 213) by the user-instrument inside the XMF file, and the obtained XMF file is sent to the XMF synthesizer 215. The resultant file is then converted to analogue at 216, amplified at 218 and a user-customized ringtone 220 is played back by loudspeaker 222 as before.
Referring to Figure 3 of the drawings, in the sound parameter extraction module 208 and DLS conversion module 210, signal calibration 300 consists essentially in isolating the real audio signal content by cutting the first samples, where the sound is not yet present (i.e. there is only background noise), as well as the last background-noise samples after the sound. This can be enhanced by a known process, such as equalization, noise reduction etc. Amplitude normalization 302 is then applied to the signal in order to obtain suitable signal dynamics.
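By way of illustration, signal calibration and amplitude normalization might be sketched as below. The fixed amplitude threshold and the peak-based normalization are assumptions made for this example; the description leaves the exact criteria open, and the function names are hypothetical.

```python
from typing import List, Sequence


def trim_background(samples: Sequence[float], threshold: float) -> List[float]:
    """Drop leading and trailing samples whose magnitude stays below threshold."""
    start = 0
    while start < len(samples) and abs(samples[start]) < threshold:
        start += 1
    end = len(samples)
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return list(samples[start:end])


def normalize_peak(samples: Sequence[float], target_peak: float = 0.9) -> List[float]:
    """Scale the signal so that its largest magnitude equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]
```

A practical implementation would more likely compare short-term energy against an estimate of the noise floor rather than test single samples, but the structure of the two steps is the same.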
Pitch extraction 304 is a state-of-the-art process, as will be known to a person skilled in the art. Pitch normalization 306 is based on the pitch detected in the previous stage. It aims at shifting the signal pitch to a desired standard value (one of the notes of the musical scale) by any state-of-the-art process, such as sample interpolation or resampling.
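A minimal sketch of pitch normalization by resampling is given below. It assumes an equal-tempered scale referenced to A4 = 440 Hz and uses linear interpolation; both are choices made for this example, not requirements of the description, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def nearest_note_frequency(freq_hz: float) -> float:
    """Frequency of the equal-tempered note closest to freq_hz (A4 = 440 Hz)."""
    midi_note = round(69 + 12 * math.log2(freq_hz / 440.0))
    return 440.0 * 2 ** ((midi_note - 69) / 12)


def resample_linear(samples: Sequence[float], step: float) -> List[float]:
    """Read the input at a fractional step, interpolating between neighbours."""
    out = []
    pos = 0.0
    while pos <= len(samples) - 1:
        i = int(pos)
        frac = pos - i
        nxt = samples[i + 1] if i + 1 < len(samples) else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += step
    return out


def normalize_pitch(samples: Sequence[float],
                    detected_hz: float) -> Tuple[List[float], float]:
    """Shift the sample to the nearest scale note; return it with the target pitch."""
    target_hz = nearest_note_frequency(detected_hz)
    # A step above 1 shortens the signal and raises its pitch at fixed sample rate.
    return resample_linear(samples, target_hz / detected_hz), target_hz
```

Linear interpolation is the simplest option; a higher-order interpolator would reduce aliasing at the cost of more computation.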
After that, a "clean" input sample is obtained: normalized in amplitude and pitch, and isolated from background noise. Loop points (start and end) are determined at 308 using, for example, a signal autocorrelation process. An optimal loop will correspond to a maximal value of the autocorrelation. Furthermore, since the signal has been normalized in pitch, one may exploit the fact that the loop length is expected to match the sample period (pitch period), which helps the loop search. Signal truncation to the end-loop point 310 is trivial, as will be apparent to a person skilled in the art, and the result can be stored in PCM format.
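The autocorrelation-based loop search can be sketched as follows. The sketch assumes that a loop start has already been chosen (for instance inside the sustained part of the sound) and that the lag search range is narrowed around the known pitch period; the function names and the use of a normalized autocorrelation are assumptions of this example.

```python
import math
from typing import Sequence, Tuple


def best_loop_lag(samples: Sequence[float], min_lag: int, max_lag: int) -> Tuple[int, float]:
    """Return the lag in [min_lag, max_lag] with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(samples) - lag
        if n <= 0:
            break
        num = sum(samples[i] * samples[i + lag] for i in range(n))
        energy_a = sum(samples[i] ** 2 for i in range(n))
        energy_b = sum(samples[i + lag] ** 2 for i in range(n))
        den = math.sqrt(energy_a * energy_b)
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


def find_loop_points(samples: Sequence[float], loop_start: int,
                     min_lag: int, max_lag: int) -> Tuple[int, int]:
    """Loop end is placed one best-matching lag after the chosen loop start."""
    lag, _ = best_loop_lag(samples[loop_start:], min_lag, max_lag)
    return loop_start, loop_start + lag
```

Because the sample has been pitch-normalized, `min_lag` and `max_lag` can be set close to the expected pitch period (or a multiple of it), which keeps the search short.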
Envelope parameter extraction 312 consists in extracting the slopes and durations of the different phases of the signal. A typical ADSR (attack, decay, sustain, release) model may be used. The process can use a (state-of-the-art) local signal energy computation. Extraction 314 of pitch and amplitude modulation parameters (typically corresponding to vibrato and tremolo) is optional; a default (typical) value can be used instead. If real extraction is desired, the parameters can be based on the temporal evolution of pitch and amplitude respectively (state-of-the-art). The outputs from modules 308, 310, 312 and 314 are used to build a DLS file at 316, and the DLS file 318 is output.
The presented description of the first configuration in relation to Figure 1 takes the example of a parametric sinusoidal MIDI synthesizer, such as the LifeVibes JingleBlaster developed by Philips.
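As an illustration of envelope parameter extraction from local energy, the sketch below computes a frame-wise RMS envelope and derives rough ADSR values from it. The heuristic used (sustain level taken as the median of the envelope after its peak) is an assumption made for this example and is not taken from the description; the function names are hypothetical.

```python
import math
import statistics
from typing import Dict, List, Sequence


def local_energy_envelope(samples: Sequence[float], frame_size: int) -> List[float]:
    """RMS value of each full, non-overlapping frame of the signal."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    return [math.sqrt(sum(s * s for s in samples[i:i + frame_size]) / frame_size)
            for i in starts]


def estimate_adsr(envelope: Sequence[float], frame_seconds: float) -> Dict[str, float]:
    """Rough ADSR estimate from an energy envelope (heuristic, see lead-in)."""
    if not envelope:
        raise ValueError("envelope must not be empty")
    peak_idx = max(range(len(envelope)), key=envelope.__getitem__)
    # Assumed heuristic: the sustain level is the median level after the peak.
    sustain_level = statistics.median(envelope[peak_idx:])
    tail = range(peak_idx, len(envelope))
    decay_end = next(i for i in tail if envelope[i] <= sustain_level)
    release_start = max(i for i in tail if envelope[i] >= sustain_level)
    return {
        "attack": peak_idx * frame_seconds,
        "decay": (decay_end - peak_idx) * frame_seconds,
        "sustain_level": sustain_level,
        "release": (len(envelope) - 1 - release_start) * frame_seconds,
    }
```

Where the user sound is known to be a singing voice, default ADSR values could be used instead of this analysis, as the description notes for the modulation parameters.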
Referring to Figure 4 in respect of the first configuration, signal calibration 400, amplitude normalization 402 and pitch extraction 406 are performed as before, but pitch normalization is unnecessary. Loop point (start and end) determination 408, envelope parameter extraction 412 and pitch and amplitude modulation parameter extraction 414 are performed as before, and extraction 420 of harmonic frequencies, amplitudes (and, optionally, phases) is essentially the same as in the parametric sinusoidal audio encoding process. Phase extraction is optional, since phases can be computed at the synthesis stage. As in the previous case, pitch and amplitude modulation parameter extraction is also optional. The parameter storage format depends on the synthesizer implementation; it can be the same as that used for the known GM instruments.
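Extraction of harmonic amplitudes and phases can be illustrated with the simplified sketch below, which projects the signal onto a cosine and a sine at each multiple of the detected pitch. This is a stand-in for a full parametric sinusoidal analysis, which the description refers to but does not detail; it assumes a stationary segment and an already known fundamental, and the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def extract_harmonics(samples: Sequence[float], sample_rate: float,
                      fundamental_hz: float, count: int) -> List[Tuple[float, float, float]]:
    """Return (frequency, amplitude, phase) for the first `count` harmonics."""
    n = len(samples)
    harmonics = []
    for k in range(1, count + 1):
        freq = k * fundamental_hz
        omega = 2.0 * math.pi * freq / sample_rate
        # Single-frequency discrete Fourier projection of the segment.
        re = sum(s * math.cos(omega * i) for i, s in enumerate(samples))
        im = -sum(s * math.sin(omega * i) for i, s in enumerate(samples))
        amplitude = 2.0 * math.hypot(re, im) / n
        phase = math.atan2(im, re)  # phase relative to a cosine
        harmonics.append((freq, amplitude, phase))
    return harmonics
```

The estimate is most accurate when the analysed segment spans a whole number of pitch periods, which the loop points determined at 408 can provide.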
For both configurations, in an improved version, it is possible to perform the user-sound analysis on several samples, each one having a different pitch. Because the sound characteristics of a given source (instrument, user singing voice etc.) are not constant over the whole musical note range, such a frequency-split analysis is desirable in order to achieve a more natural sound. Considering the second configuration, the DLS level 1 specification allows up to 16 different samples for the same instrument. DLS also specifies how to manage these different samples.
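The frequency-split arrangement can be sketched as a mapping from note ranges to recorded samples. In the sketch below, each sample is identified by the MIDI note number of its normalized pitch, and the range boundaries are placed midway between neighbouring samples; the midpoint rule and the function names are assumptions of this example, while the limit of 16 follows the DLS level 1 figure quoted above.

```python
from typing import Iterable, List, Tuple

Region = Tuple[int, int, int]  # (lowest note, highest note, root note of sample)


def build_key_regions(root_notes: Iterable[int], max_regions: int = 16) -> List[Region]:
    """Split the MIDI note range 0-127 between the available samples."""
    roots = sorted(set(root_notes))
    if len(roots) > max_regions:
        raise ValueError("too many samples for one instrument")
    regions = []
    low = 0
    for idx, root in enumerate(roots):
        # Boundary halfway to the next sample; the last region runs to note 127.
        high = (root + roots[idx + 1]) // 2 if idx + 1 < len(roots) else 127
        regions.append((low, high, root))
        low = high + 1
    return regions


def root_for_note(regions: List[Region], note: int) -> int:
    """Root note of the sample that should play the given MIDI note."""
    for low, high, root in regions:
        if low <= note <= high:
            return root
    raise ValueError("note outside all regions")
```

With samples recorded at notes 48, 60 and 72, for example, note 64 would be played from the sample recorded at note 60, transposed by four semitones.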
Thus, as explained above, prior art MIDI polyphonic ringers do not allow end users to personalize the sound rendering of their melodies without the help of an external computer. However, sound-rendering personalization is useful for portable devices such as mobile phones, in that it allows better user-ringer identification for an incoming call or incoming SMS notification and, depending on the tuning, can produce distinctive-sounding ringtones, which are commonly appreciated from a young-user experience point of view. The proposed solution performs the sound-rendering personalization through the use of the embedded microphone, which allows sounds such as the user's singing voice to be captured.
The vocal samples are not compressed in the proposed solution, but are instead analyzed in order to extract voice parameters, so as to build an instrument that can afterwards be used by the synthesizer kernel to substitute one or more voices (one or more instruments) when playing any incoming MIDI file. As a consequence, capturing the user sound (e.g. the user's voice) with the microphone can be considered as a synthesizer initialization, which has to be done only once (or only each time the user wants to change the personalization); that is, it does not need to be repeated when the incoming MIDI file is changed.
The proposed invention combines already existing elements of mobile phone devices (microphone, MIDI/XMF player) with a new module, preferably implemented as a software module, which aims to convert input sounds into instrument-characteristic data that can be used by the MIDI/XMF player. In this way, a "user" instrument is created.
Afterwards, by substituting one or more MIDI channels/instruments with the created "user-instrument", either inside the melody file in the case of an XMF-like file or inside the synthesizer in the case of a "simple" MIDI configuration, the synthesizer will be able to play input MIDI music with some channels producing musical notes that have the sound of the microphone-recorded input (e.g. the user's singing voice).
Applications of the present invention include:
Embedded solution in mobile or home phones equipped with a MIDI or XMF synthesizer, providing a user-customized ringing feature (incoming call, SMS or any other notification alert).
Embedded solution in mobile devices, such as mobile or cordless phones, or Personal Digital Assistants (PDAs), for entertainment purposes.
Integration in PC software, for use with a microphone or any other audio input, for entertainment purposes.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be capable of designing many alternative embodiments without departing from the scope of the invention as defined by the appended claims. In the claims, any reference signs placed in parentheses shall not be construed as limiting the claims. The words "comprising" and "comprises", and the like, do not exclude the presence of elements or steps other than those listed in any claim or the specification as a whole. The singular reference of an element does not exclude the plural reference of such elements and vice versa. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims

1. A portable electronic device (104, 204) comprising: a polyphonic synthesizer (115, 215); a microphone (102, 202) for receiving audio data (100, 200) input by a user; a stored bank (114, 214) of pre-stored ringtones formatted for playback by said polyphonic synthesizer (115, 215); a ringtone customization system for use with the polyphonic synthesizer (115, 215), said ringtone customization system comprising means (110, 210) for formatting the audio data (100, 200) input by a user for playback by said polyphonic synthesizer (115, 215), and means for combining said audio data with at least a portion of a pre-recorded ringtone to form a customized ringtone (120, 220) for playback by said polyphonic synthesizer (115, 215).
2. A portable electronic device according to claim 1, wherein a pre-recorded ringtone comprises a plurality of sound channels, one or more of which is substituted with said audio data (100, 200) input by the user to form said customized ringtone (120, 220).
3. A portable electronic device according to claim 1, wherein the ringtone customization system comprises means (108, 208) for extracting audio data (100, 200) input by the user from background noise received simultaneously at said microphone (102, 202).
4. A portable electronic device according to claim 1, wherein the ringtone customization system comprises means (110, 210) for extracting and storing audio parameters from said audio data (100, 200) input by the user.
5. A portable electronic device according to claim 4, wherein the analogue audio data is first digitized before audio parameter extraction.
6. A portable electronic device according to claim 1, wherein the pre-stored ringtones are in the MIDI format.
7. A portable electronic device according to claim 1, wherein said polyphonic synthesizer (115, 215) is wavetable-based and XMF compliant.
8. A ringtone customization system for use with a polyphonic synthesizer (115, 215) in a portable electronic device (104, 204), said portable electronic device further comprising a microphone (102, 202) for receiving audio data (100, 200) input by a user, and a stored bank (114, 214) of pre-stored ringtones formatted for playback by said polyphonic synthesizer (115, 215), the ringtone customization system comprising means (110, 210) for formatting the audio data (100, 200) input by a user for playback by said polyphonic synthesizer (115, 215), and means for combining said audio data with at least a portion of a pre-recorded ringtone to form a customized ringtone (120, 220) for playback by said polyphonic synthesizer (115, 215).
9. A method for ringtone customization in a polyphonic synthesizer (115, 215) of a portable electronic device (104, 204), said portable electronic device (104, 204) further comprising a microphone (102, 202) for receiving audio data (100, 200) input by a user, and a stored bank of pre-stored ringtones (114, 214) formatted for playback by said polyphonic synthesizer (115, 215), the method comprising formatting the audio data input by a user for playback by said polyphonic synthesizer (115, 215), and combining said audio data with at least a portion of a pre-stored ringtone to form a customized ringtone (120, 220) for playback by said polyphonic synthesizer (115, 215).
PCT/IB2007/051836 2006-05-17 2007-05-15 Ringtone customization for portable telecommunication applications WO2007132427A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP06300481 2006-05-17
EP06300481.6 2006-05-17

Publications (1)

Publication Number Publication Date
WO2007132427A1 true WO2007132427A1 (en) 2007-11-22

Family

ID=38462075

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2007/051836 WO2007132427A1 (en) 2006-05-17 2007-05-15 Ringtone customization for portable telecommunication applications

Country Status (1)

Country Link
WO (1) WO2007132427A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999065221A1 (en) * 1998-06-09 1999-12-16 Telefonaktiebolaget Lm Ericsson A telecommunication device with acoustically programmable ringtone generating means and a method for the programming thereof
WO2004072944A1 (en) * 2003-02-14 2004-08-26 Koninklijke Philips Electronics N.V. Mobile telecommunication apparatus comprising a melody generator
WO2006043929A1 (en) * 2004-10-12 2006-04-27 Madwaves (Uk) Limited Systems and methods for music remixing

Similar Documents

Publication Publication Date Title
JP6645956B2 (en) System and method for portable speech synthesis
EP1736961B1 (en) System and method for automatic creation of digitally enhanced ringtones for cellphones
US8816180B2 (en) Systems and methods for portable audio synthesis
KR100634572B1 (en) Method for generating audio data and user terminal and record medium using the same
KR100664677B1 (en) Method for generating music contents using handheld terminal
US20060266201A1 (en) Method for synchronizing at least one multimedia peripheral of a portable communication device, and corresponding portable communication device
KR100619826B1 (en) Music and audio synthesize apparatus and method for mobile communication device
TW529018B (en) Terminal apparatus, guide voice reproducing method, and storage medium
KR100731232B1 (en) Musical data editing and reproduction apparatus, and portable information terminal therefor
WO2007132427A1 (en) Ringtone customization for portable telecommunication applications
KR100509126B1 (en) Audio melody tune generation device and portable terminal device using it
KR100884225B1 (en) Generating percussive sounds in embedded devices
KR100574808B1 (en) Musical tone signal generating apparatus
KR20060116229A (en) Device and method for providing a signal melody
JP4012410B2 (en) Musical sound generation apparatus and musical sound generation method
KR100862126B1 (en) Portable communication terminal
TW583865B (en) Mobile communication terminal and server
KR20060076638A (en) Midi file synthesizer and synthesis method
JP3675361B2 (en) Communication terminal
KR20030083655A (en) Composition and Conversion method of Ring tone file and Music file of Mobile Handset
KR100598207B1 (en) MIDI playback equipment and method
KR20080080013A (en) Mobile terminal apparatus
JP2002341872A (en) Communication terminal
JP2001045103A (en) Telephone system
JP2006091460A (en) Determining device for waveform data for sound source

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07735906

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07735906

Country of ref document: EP

Kind code of ref document: A1