WO2015028715A1 - Directional audio apparatus
- Publication number
- WO2015028715A1 (PCT/FI2014/050653)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
- H04S3/004—For headphones
Definitions
- the present application relates to apparatus for processing of audio and additionally audio-video signals to enable directional audio encoding and decoding over a mono channel audio signal.
- the invention further relates to, but is not limited to, apparatus for processing of audio and additionally audio-video signals to enable directional audio encoding and decoding over a mono channel audio signal between mobile devices.
- Stereo recording is a well-known feature and widely supported in all kinds of audio devices.
- there are multi-microphone recorders that enable good quality multi-channel audio recordings and which, for example, can be played back by home theatre systems.
- binaural microphones can be used to capture the true spatial audio environment at the user's ears.
- the recorder device requires capturing two or more channels of audio.
- many mobile recording devices, especially user equipment or phones and headsets, support only single channel recording.
- two channel capture support for Bluetooth is very rare.
- the transmission encoding is configured for single or mono channel audio signals and is therefore unable to convey multichannel information.
Summary
- Aspects of this application thus provide suitable audio capture and audio playback for multichannel audio signals using single channel storage or recording to permit a better audio listening experience.
- an apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to with the at least one processor cause the apparatus to at least: generate a single channel audio signal; generate at least one channel of data associated with the single channel audio signal; and encode the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
- Generating the single channel audio signal may cause the apparatus to perform at least one of: receive a single channel audio signal from a single microphone; generate a single channel audio signal from at least two channel audio signals; generate a single channel audio signal from at least two channel audio signals received from at least one microphone; receive the single channel audio signal from a memory; and receive the single channel audio signal from a further apparatus separate from the apparatus.
- Generating the at least one channel of data associated with the single channel audio signal may cause the apparatus to perform at least one of: generate at least one spatial parameter associated with the single channel audio signal; generate an azimuth parameter associated with the single channel audio signal; generate an elevation parameter associated with the single channel audio signal; generate a range parameter associated with the single channel audio signal; generate at least one velocity parameter associated with the single channel audio signal; generate at least one recording configuration parameter associated with the single channel audio signal; generate an azimuth parameter associated with at least one microphone recording the single channel audio signal; generate an elevation parameter associated with at least one microphone recording the single channel audio signal; and generate at least one processing parameter associated with the generation of the single channel audio signal.
- Encoding the at least one channel of data within the single channel audio signal may cause the apparatus to perform at least one of: quantize the at least one channel of data; modulate the quantized at least one channel of data; and combine the modulated quantized at least one channel of data with the single channel audio signal.
- the apparatus may be further caused to band-pass filter the single channel audio signal, and wherein modulating the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
- Band-pass filtering the single channel audio signal may cause the apparatus to perform at least one of: low pass filter the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to a frequency range above the range of the low pass filter; and high pass filter the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to a frequency range below the range of the high pass filter.
- the apparatus may be further caused to band-stop filter the single channel audio signal, and wherein modulating the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to at least one frequency of the band-stop filter. Modulating the quantized at least one channel of data may cause the apparatus to: apply a defined pseudo-random frequency shifting to the at least one channel of data; and determine the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
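As a rough illustration of the inaudibility constraint described above, the following Python sketch compares the level of the modulated data against the host audio and attenuates it until it sits below an assumed masking margin. The helper name, frame-based processing and the fixed margin are assumptions for illustration only, not part of the claimed apparatus, and a real system would use a proper psychoacoustic masking model.

```python
import numpy as np

def embed_below_masking(audio_frame, data_band, margin_db=20.0):
    """Attenuate the modulated data band so its energy stays a fixed
    margin below the host audio frame (a crude stand-in for a real
    psychoacoustic masking model)."""
    audio_frame = np.asarray(audio_frame, dtype=float)
    data_band = np.asarray(data_band, dtype=float)
    audio_rms = np.sqrt(np.mean(audio_frame ** 2) + 1e-12)
    data_rms = np.sqrt(np.mean(data_band ** 2) + 1e-12)
    # Target data level: margin_db below the audio level.
    target_rms = audio_rms * 10.0 ** (-margin_db / 20.0)
    gain = min(1.0, target_rms / data_rms)
    return audio_frame + gain * data_band
```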
- an apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to with the at least one processor cause the apparatus to at least: receive a single channel audio signal, the single channel audio signal comprising at least one channel of data; separate the at least one channel of data from the single channel audio signal; and control at least one operation of the apparatus based on the at least one channel of data. Separating the at least one channel of data from the single channel audio signal may cause the apparatus to: apply a first band pass filter to extract the at least one channel of data; and apply a second band pass filter to extract an audio signal from the single channel audio signal.
- the first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
- the first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
- the first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter.
- Controlling at least one operation of the apparatus based on the at least one channel of data may cause the apparatus to perform at least one of: process the single channel audio signal based on the at least one channel of data; generate at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; generate at least one user interface input based on the at least one channel of data; generate at least one gaming control input based on the at least one channel of data; and generate at least one visual effect to be displayed based on the at least one channel of data.
- Processing the single channel audio signal based on the at least one channel of data may cause the apparatus to perform at least one of: generate at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; generate a spatial audio signal based on the single channel audio signal and the at least one channel of data.
- a method comprising: generating a single channel audio signal; generating at least one channel of data associated with the single channel audio signal; and encoding the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
- Generating the single channel audio signal may comprise at least one of: receiving a single channel audio signal from a single microphone; generating a single channel audio signal from at least two channel audio signals; generating a single channel audio signal from at least two channel audio signals received from at least one microphone; receiving the single channel audio signal from a memory; and receiving the single channel audio signal from a further apparatus separate from the apparatus.
- Generating the at least one channel of data associated with the single channel audio signal may comprise at least one of: generating at least one spatial parameter associated with the single channel audio signal; generating an azimuth parameter associated with the single channel audio signal; generating an elevation parameter associated with the single channel audio signal; generating a range parameter associated with the single channel audio signal; generating at least one velocity parameter associated with the single channel audio signal; generating at least one recording configuration parameter associated with the single channel audio signal; generating an azimuth parameter associated with at least one microphone recording the single channel audio signal; generating an elevation parameter associated with at least one microphone recording the single channel audio signal; and generating at least one processing parameter associated with the generation of the single channel audio signal.
- Encoding the at least one channel of data within the single channel audio signal may comprise at least one of: quantizing the at least one channel of data; modulating the quantized at least one channel of data; and combining the modulated quantized at least one channel of data with the single channel audio signal.
- the method may further comprise band-pass filtering the single channel audio signal, and wherein modulating the quantized at least one channel of data may comprise frequency shifting the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
- Band-pass filtering the single channel audio signal may comprise at least one of: low pass filtering the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may comprise frequency shifting the quantized at least one channel of data to a frequency range above the range of the low pass filter; and high pass filtering the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may comprise frequency shifting the quantized at least one channel of data to a frequency range below the range of the high pass filter.
- the method may further comprise band-stop filtering the single channel audio signal, and wherein modulating the quantized at least one channel of data may comprise frequency shifting the quantized at least one channel of data to at least one frequency of the band-stop filter.
- Modulating the quantized at least one channel of data may comprise: applying a defined pseudo-random frequency shifting to the at least one channel of data; and determining the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
- a method comprising: receiving a single channel audio signal, the single channel audio signal comprising at least one channel of data; separating the at least one channel of data from the single channel audio signal; and controlling at least one operation of the apparatus based on the at least one channel of data.
- Separating the at least one channel of data from the single channel audio signal may comprise: applying a first band pass filter to extract the at least one channel of data; and applying a second band pass filter to extract an audio signal from the single channel audio signal.
- the first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
- the first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
- the first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter.
- Separating the at least one channel of data from the single channel audio signal may comprise: applying a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and demodulating the decoded at least one channel of data to regenerate the at least one channel of data.
- Controlling at least one operation of the apparatus based on the at least one channel of data may comprise at least one of: processing the single channel audio signal based on the at least one channel of data; generating at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; generating at least one user interface input based on the at least one channel of data; generating at least one gaming control input based on the at least one channel of data; and generating at least one visual effect to be displayed based on the at least one channel of data.
- Processing the single channel audio signal based on the at least one channel of data may comprise at least one of: generating at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; generating a spatial audio signal based on the single channel audio signal and the at least one channel of data.
- an apparatus comprising: means for generating a single channel audio signal; means for generating at least one channel of data associated with the single channel audio signal; and means for encoding the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
- the means for generating the single channel audio signal may comprise at least one of: means for receiving a single channel audio signal from a single microphone; means for generating a single channel audio signal from at least two channel audio signals; means for generating a single channel audio signal from at least two channel audio signals received from at least one microphone; means for receiving the single channel audio signal from a memory; and receiving the single channel audio signal from a further apparatus separate from the apparatus.
- the means for generating the at least one channel of data associated with the single channel audio signal may comprise at least one of: means for generating at least one spatial parameter associated with the single channel audio signal; means for generating an azimuth parameter associated with the single channel audio signal; generating an elevation parameter associated with the single channel audio signal; means for generating a range parameter associated with the single channel audio signal; means for generating at least one velocity parameter associated with the single channel audio signal; means for generating at least one recording configuration parameter associated with the single channel audio signal; means for generating an azimuth parameter associated with at least one microphone recording the single channel audio signal; means for generating an elevation parameter associated with at least one microphone recording the single channel audio signal; and means for generating at least one processing parameter associated with the generation of the single channel audio signal.
- the means for encoding the at least one channel of data within the single channel audio signal may comprise at least one of: means for quantizing the at least one channel of data; means for modulating the quantized at least one channel of data; and means for combining the modulated quantized at least one channel of data with the single channel audio signal.
- the apparatus may further comprise means for band-pass filtering the single channel audio signal, and wherein the means for modulating the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
- the means for band-pass filtering the single channel audio signal may comprise at least one of: means for low pass filtering the single channel audio signal, wherein the means for frequency shifting the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to a frequency range above the range of the low pass filter; and means for high pass filtering the single channel audio signal, wherein the means for frequency shifting the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to a frequency range below the range of the high pass filter.
- the apparatus may further comprise means for band-stop filtering the single channel audio signal, and wherein the means for modulating the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to at least one frequency of the band-stop filter.
- the means for modulating the quantized at least one channel of data may comprise: means for applying a defined pseudo-random frequency shifting to the at least one channel of data; and means for determining the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
- an apparatus comprising: means for receiving a single channel audio signal, the single channel audio signal comprising at least one channel of data; means for separating the at least one channel of data from the single channel audio signal; and means for controlling at least one operation of the apparatus based on the at least one channel of data.
- the means for separating the at least one channel of data from the single channel audio signal may comprise: means for applying a first band pass filter to extract the at least one channel of data; and means for applying a second band pass filter to extract an audio signal from the single channel audio signal.
- the first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
- the first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
- the first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter.
- the means for separating the at least one channel of data from the single channel audio signal may comprise: means for applying a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and means for demodulating the decoded at least one channel of data to regenerate the at least one channel of data.
- the means for controlling at least one operation of the apparatus based on the at least one channel of data may comprise at least one of: means for processing the single channel audio signal based on the at least one channel of data; means for generating at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; means for generating at least one user interface input based on the at least one channel of data; means for generating at least one gaming control input based on the at least one channel of data; and means for generating at least one visual effect to be displayed based on the at least one channel of data.
- the means for processing the single channel audio signal based on the at least one channel of data may comprise at least one of: means for generating at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; means for generating a spatial audio signal based on the single channel audio signal and the at least one channel of data.
- an apparatus comprising: an audio signal generator configured to generate a single channel audio signal; a data generator configured to generate at least one channel of data associated with the single channel audio signal; and an encoder configured to encode the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
- the audio signal generator may comprise at least one of: a microphone input configured to receive a single channel audio signal from a single microphone; a downmixer configured to generate a single channel audio signal from at least two channel audio signals; a microphone downmixer configured to generate a single channel audio signal from at least two channel audio signals received from at least one microphone; an input configured to receive the single channel audio signal from a memory; and a receiver configured to receive the single channel audio signal from a further apparatus separate from the apparatus.
- the data generator may comprise at least one of: a spatial parameter generator configured to generate at least one spatial parameter associated with the single channel audio signal; an azimuth generator configured to generate an azimuth parameter associated with the single channel audio signal; an elevation generator configured to generate an elevation parameter associated with the single channel audio signal; a range generator configured to generate a range parameter associated with the single channel audio signal; a velocity determiner configured to generate at least one velocity parameter associated with the single channel audio signal; a configuration determiner configured to generate at least one recording configuration parameter associated with the single channel audio signal; a microphone azimuth generator configured to generate an azimuth parameter associated with at least one microphone recording the single channel audio signal; a microphone elevation generator configured to generate an elevation parameter associated with at least one microphone recording the single channel audio signal; and a processing determiner configured to generate at least one processing parameter associated with the generation of the single channel audio signal.
- the encoder may comprise at least one of: a quantizer configured to quantize the at least one channel of data; a modulator configured to modulate the quantized at least one channel of data; and a combiner configured to combine the modulated quantized at least one channel of data with the single channel audio signal.
- the apparatus may further comprise a band-pass filter configured to band-pass filter the single channel audio signal, and wherein the modulator may be configured to frequency shift the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
- the band-pass filter may comprise at least one of: a low pass filter configured to low-pass filter the single channel audio signal, wherein the modulator may be configured to frequency shift the quantized at least one channel of data to a frequency range above the range of the low pass filter; and a high pass filter configured to high pass filter the single channel audio signal, wherein the modulator may be configured to frequency shift the quantized at least one channel of data to a frequency range below the range of the high pass filter.
- the apparatus may comprise a band-stop filter configured to band-stop filter the single channel audio signal, and wherein the modulator may be configured to frequency shift the quantized at least one channel of data to at least one frequency of the band-stop filter.
- the modulator may comprise: a pseudo-random frequency shifter configured to apply a defined pseudo-random frequency shift to the at least one channel of data; and an audibility determiner configured to determine the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
- an apparatus comprising: a receiver configured to receive a single channel audio signal, the single channel audio signal comprising at least one channel of data; a separator configured to separate the at least one channel of data from the single channel audio signal; and a controller configured to control at least one operation of the apparatus based on the at least one channel of data.
- the separator may comprise: a first band pass filter configured to band pass filter the single channel audio signal to extract the at least one channel of data; and a second band pass filter configured to band pass filter the single channel audio signal to extract an audio signal.
- the first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
- the first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
- the first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter.
- the separator may comprise: a decoder configured to apply a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and a demodulator configured to demodulate the decoded at least one channel of data to regenerate the at least one channel of data.
- the controller may comprise at least one of: a processor configured to process the single channel audio signal based on the at least one channel of data; a signal generator configured to generate at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; a user input generator configured to generate at least one user interface input based on the at least one channel of data; a gaming control generator configured to generate at least one gaming control input based on the at least one channel of data; and a visual effect generator configured to generate at least one visual effect to be displayed based on the at least one channel of data.
- the processor may comprise at least one of: an up-mixer configured to generate at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; a spatial processor configured to generate a spatial audio signal based on the single channel audio signal and the at least one channel of data.
- a computer program product stored on a medium may cause an apparatus to perform the method as described herein.
- An electronic device may comprise apparatus as described herein.
- a chipset may comprise apparatus as described herein.
- Embodiments of the present application aim to address problems associated with the state of the art.
- Figure 1 shows schematically an apparatus suitable for being employed in embodiments of the application;
- Figure 2 shows schematically example recording and playback apparatus according to some embodiments
- Figure 3 shows schematically an example recording by the recording apparatus shown in Figure 2 according to some embodiments
- Figure 4 shows schematically an example playback by the playback apparatus shown in Figure 2 according to some embodiments
- Figure 5 shows a flow diagram of the operation of the example recording and playback apparatus as shown in Figure 2;
- Figure 6 shows schematically an example encoder as shown in Figure 2 according to some embodiments
- Figure 7 shows schematically an example decoder as shown in Figure 2 according to some embodiments
- Figure 8 shows schematically example modulation ranges suitable for use by the encoder/decoder as shown in Figures 6 and 7.
- Figure 9 shows schematically a further example encoder as shown in Figure 2 according to some embodiments.
- Figure 10 shows schematically a further example decoder as shown in Figure 2 according to some embodiments;
- Figure 11 shows a flow diagram of the operation of the example encoder as shown in Figure 6 according to some embodiments;
- Figure 12 shows a flow diagram of the operation of the example decoder as shown in Figure 7 according to some embodiments
- Figure 13 shows a flow diagram of the operation of the further example encoder as shown in Figure 9 according to some embodiments.
- Figure 14 shows a flow diagram of the operation of the further example decoder as shown in Figure 10 according to some embodiments.
- audio signals and audio capture signals are described. However it would be appreciated that in some embodiments the audio signal/audio capture is a part of an audio-video system.
- the concept of this application is related to assisting in the production of immersive communication.
- the concept as described by the embodiments herein is to capture an audio recording of the user's surroundings, capture spatial attributes of the situation (for example the apparatus orientation and microphone configuration or the user's head orientation data), encode the spatial attribute data with the audio for a normal mono audio transfer channel, and decode the spatial attribute data from the audio stream to generate a suitable multi-channel audio signal using the mono audio and spatial attributes.
- the transfer channel (or storage channel) supports only a single channel of audio data.
- the spatial attribute data is recorded, and this spatial metadata is encoded with the mono audio signal for transmission/storage.
- the spatial audio attributes can be extracted (decoded) from the mono audio track.
- the original audio can be enriched with spatial effects to further improve the playback experience.
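To make the capture/encode/transfer/decode/render flow above concrete, the following Python skeleton sketches the overall pipeline. Every object and method name here is a placeholder standing in for the blocks described in the embodiments below; it is not an API defined by this application.

```python
def record_and_transfer(mono_recorder, orientation_sensor, encoder, channel):
    """Capture side: mono audio plus spatial attributes, sent over a
    single-channel transfer or storage path (placeholder objects assumed)."""
    audio = mono_recorder.capture_frame()          # mono audio samples
    angle = orientation_sensor.read_orientation()  # e.g. azimuth in degrees
    combined = encoder.encode(audio, angle)        # data embedded in the audio
    channel.send(combined)

def receive_and_render(channel, decoder, externalizer):
    """Playback side: recover the mono audio and the spatial attributes,
    then enrich the output (for example by amplitude panning)."""
    combined = channel.receive()
    audio, angle = decoder.decode(combined)
    return externalizer.render(audio, angle)       # multi-channel output
```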
- the embodiments as described herein differ from multichannel recording operations such as, for example, the directional audio (DirAC) encoding system in that whereas DirAC requires multi-channel capture the embodiments as described herein feature mono capture. Furthermore, whereas DirAC systems capture the sound environment in relation to the microphone, the embodiments as described herein capture a mono sound and the user or apparatus motion. Also, it would be understood that multi-channel capture methods such as DirAC in reproduction replicate the recorded sound environment with the available sound system, whereas in the embodiments as described herein the apparatus/user motion is used to enrich the listening/playback experience.
- Figure 1 shows a schematic block diagram of an exemplary apparatus or electronic device 10, which may be used to record (or operate as a recording or capturing apparatus) or listen (or operate as a playback apparatus) to the audio signals (and similarly to record or view the audiovisual images and data).
- the electronic device 10 may for example be a mobile terminal or user equipment of a wireless communication system when functioning as the recording device or listening device 113.
- the apparatus can be an audio player or audio recorder, such as an MP3 player, a media recorder/player (also known as an MP4 player), or any suitable portable device suitable for recording audio or audio/video, such as a camcorder or memory audio or video recorder.
- the apparatus 10 can in some embodiments comprise an audio subsystem.
- the audio subsystem for example can comprise in some embodiments a microphone or array of microphones 11 for audio signal capture.
- the microphone or array of microphones can be a solid state microphone, in other words capable of capturing audio signals and outputting a suitable digital format signal.
- the microphone or array of microphones 11 can comprise any suitable microphone or audio capture means, for example a condenser microphone, capacitor microphone, electrostatic microphone, electret condenser microphone, dynamic microphone, ribbon microphone, carbon microphone, piezoelectric microphone, or microelectrical-mechanical system (MEMS) microphone.
- the microphone 11 or array of microphones can in some embodiments output the audio captured signal to an analogue-to-digital converter (ADC) 14.
- the apparatus can further comprise an analogue-to-digital converter (ADC) 14 configured to receive the analogue captured audio signal from the microphones and outputting the audio captured signal in a suitable digital form.
- the analogue-to-digital converter 14 can be any suitable analogue-to-digital conversion or processing means.
- the apparatus 10 audio subsystem further comprises a digital-to-analogue converter 32 for converting digital audio signals from a processor 21 to a suitable analogue format.
- the digital-to-analogue converter (DAC) or signal processing means 32 can in some embodiments be any suitable DAC technology.
- the audio subsystem can comprise in some embodiments a speaker 33.
- the speaker 33 can in some embodiments receive the output from the digital-to-analogue converter 32 and present the analogue audio signal to the user.
- the speaker 33 can be representative of a headset, for example a set of headphones, or cordless headphones.
- the apparatus 10 is shown having both audio capture and audio playback components, it would be understood that in some embodiments the apparatus 10 can comprise one or the other of the audio capture and audio playback parts of the audio subsystem such that in some embodiments of the apparatus the microphone (for audio capture) or the speaker (for audio playback) are present.
- the apparatus 10 comprises a processor 21.
- the processor 21 is coupled to the audio subsystem and specifically in some examples the analogue-to-digital converter 14 for receiving digital signals representing audio signals from the microphone 11, and the digital-to-analogue converter (DAC) 12 configured to output processed digital audio signals.
- the processor 21 can be configured to execute various program codes.
- the implemented program codes can comprise for example audio signal processing routines.
- the apparatus further comprises a memory 22.
- the processor is coupled to memory 22.
- the memory can be any suitable storage means.
- the memory 22 comprises a program code section 23 for storing program codes implementable upon the processor 21.
- the memory 22 can further comprise a stored data section 24 for storing data, for example data that has been processed in accordance with the application or data to be processed via the application embodiments as described later.
- the implemented program code stored within the program code section 23, and the data stored within the stored data section 24 can be retrieved by the processor 21 whenever needed via the memory-processor coupling.
- the apparatus 10 can comprise a user interface 15.
- the user interface 15 can be coupled in some embodiments to the processor 21 .
- the processor can control the operation of the user interface and receive inputs from the user interface 15.
- the user interface 15 can enable a user to input commands to the electronic device or apparatus 10, for example via a keypad, and/or to obtain information from the apparatus 10, for example via a display which is part of the user interface 15.
- the user interface 15 can in some embodiments comprise a touch screen or touch interface capable of both enabling information to be entered to the apparatus 10 and further displaying information to the user of the apparatus 10.
- the apparatus further comprises a transceiver 13, the transceiver in such embodiments can be coupled to the processor and configured to enable a communication with other apparatus or electronic devices, for example via a wireless communications network.
- the transceiver 13 or any suitable transceiver or transmitter and/or receiver means can in some embodiments be configured to communicate with other electronic devices or apparatus via a wire or wired coupling.
- the transceiver 13 can communicate with further devices by any suitable known communications protocol, for example in some embodiments the transceiver 13 or transceiver means can use a suitable universal mobile telecommunications system (UMTS) protocol, a wireless local area network (WLAN) protocol such as for example IEEE 802.X, a suitable short-range radio frequency communication protocol such as Bluetooth, or infrared data communication pathway (IRDA).
- the apparatus comprises a position sensor 16 configured to estimate the position of the apparatus 10.
- the position sensor 16 can in some embodiments be a satellite positioning sensor such as a GPS (Global Positioning System), GLONASS or Galileo receiver.
- the positioning sensor can be a cellular ID system or an assisted GPS system.
- the apparatus 10 further comprises a direction or orientation sensor.
- the orientation/direction sensor can in some embodiments be an electronic compass, accelerometer, a gyroscope or be determined by the motion of the apparatus using the positioning estimate. It is to be understood again that the structure of the electronic device 10 could be supplemented and varied in many ways.
- with respect to Figure 2, an example recording apparatus 91 and playback apparatus 93 are shown according to some embodiments.
- the recording apparatus 91 and playback apparatus 93 are shown coupled by a single channel (Mono) transfer or transmission channel 121.
- Figure 3 shows an example recording by the recording apparatus 91 as shown in Figure 2 according to some embodiments.
- Figure 4 shows schematically an example playback by the playback apparatus 93 shown in Figure 2 according to some embodiments.
- Figure 5 shows a flow diagram of the operation of the example recording apparatus 91 and playback apparatus 93 according to some embodiments.
- the recording apparatus 91 in some embodiments comprises a Mono recorder 100.
- the Mono recorder 100 in some embodiments comprises the microphone(s) 11 and analogue-to-digital converter 14 configured to generate a suitable digital format signal to be passed to the encoder 101.
- the recording apparatus comprises a position sensor/orientation sensor 16 configured to monitor the orientation angle of the recording apparatus 91 .
- the orientation angle sensor 16 is configured to measure the orientation (or angle) of the microphone, where the microphone is detached or separate from the recording apparatus.
- the orientation angle sensor 16 is configured to determine the orientation of the headset comprising the microphone.
- the microphone can be attached to the user's chest or other parts of the body rather than the apparatus and thus as the user rotates himself or herself the orientation of the microphone changes.
- as shown in Figure 3, where the user is wearing a traditional stereo headset with a mono microphone, the user 203 or the apparatus can turn from the left to the right. The user 203 turning through an angle 205 thus records the sound source 201 but also the change in orientation (the angle 205). In other words both the mono audio and the recording orientation data are captured.
- an apparatus or microphone orientation change can be implemented by a multiple mono microphone array where the microphones are arranged with an associated orientation.
- An orientation change can be implemented in such embodiments by the Mono recorder 100 being configured to select individual mono microphone inputs and the orientation angle sensor 16 configured to monitor which of the microphones has been selected and generate the associated orientation of the selected microphone.
- the Mono recorder 91 can be configured to generate a Mono audio signal output from a beam formed microphone array wherein the orientation angle sensor 16 is a beam formed orientation angle of the processed audio signals.
- the recording apparatus further comprises multiple microphones from which a mono audio channel is generated and recorded.
- the apparatus comprises a multiple microphone configuration which is output to a suitable delay network.
- the delay network can in some embodiments be implemented by a digital processing means or mechanical means and the various delays applied to the microphone outputs before combining them in order to form a directional recording audio channel or beam.
- any suitable digital processing can be applied to the microphone(s) audio signals to generate a recording beam. In such a way the direction of the audio signal can be changed digitally (or by mechanical/analogue processing).
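A minimal delay-and-sum sketch of such a delay network follows (Python). The uniform linear array geometry, sample rate and integer-sample delays are illustrative assumptions, not taken from the embodiments.

```python
import numpy as np

def delay_and_sum(mic_signals, mic_positions_m, steer_angle_deg,
                  fs=48000, c=343.0):
    """Form a single directional (mono) channel by delaying each
    microphone signal according to the steering angle and summing."""
    mic_positions_m = np.asarray(mic_positions_m, dtype=float)
    angle = np.deg2rad(steer_angle_deg)
    # Plane-wave delay for each microphone position along one axis.
    delays_s = mic_positions_m * np.sin(angle) / c
    delays_smp = np.round(delays_s * fs).astype(int)
    delays_smp -= delays_smp.min()  # keep all delays non-negative
    out = np.zeros(len(mic_signals[0]) + delays_smp.max())
    for sig, d in zip(mic_signals, delays_smp):
        out[d:d + len(sig)] += np.asarray(sig, dtype=float)
    return out / len(mic_signals)
```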
- mechanically changing the orientation of the microphones to generate a 'beamformed' or directional audio signal is described herein. However, it would be understood that in some embodiments the beamformed or directional audio signals are recorded or captured using the processing means as described above.
- the physical change of the apparatus/microphone/recording orientation by moving the microphone or apparatus or selecting a separate one of several Mono or multichannel microphones with different recording or capture orientations enables the apparatus to generate a mono (or reduced channel) audio signal.
- The operation of rotating or changing the orientation of the microphones/apparatus is shown in Figure 5 by step 400.
- the Mono recorder is configured to capture or record a mono audio signal as the recording angle/orientation is changed (microphone is rotated).
- the Mono recorder can be configured to output this mono audio signal to the encoder 101.
- The operation of capturing or recording the mono audio signal as the microphone is rotated is shown in Figure 5 by step 401.
- the orientation angle sensor 16 can be configured to capture/record the orientation angle of the apparatus (or microphone) as the apparatus (or microphone) is rotated. The orientation angle sensor 16 can then be configured to output this orientation angle to the encoder 101.
- the orientation angle sensor can monitor the orientation of the microphone set separate from the apparatus, or the beamformer orientation, or the configuration orientation of the microphone. The operation of capturing or recording the orientation angle of the microphone as 'rotated' is shown in Figure 5 by step 403.
- the recording apparatus comprises an encoder 101 .
- the encoder 101 is configured to encode the mono audio signal with associated metadata comprising the spatial attributes generated from the orientation angle sensor 16. The encoder 101 can then output this encoded form to be stored or passed to a playback apparatus 93.
- the encoding of the audio signal with associated spatial attribute metadata is shown in Figure 5 by step 405.
- although a single metadata channel defining the direction or orientation of the audio capture is described as being encoded with the audio signal, it would be understood that in some embodiments further metadata channels can be encoded.
- the position sensor/orientation sensor 16 can be configured to detect vertical motion of the apparatus, and the metadata can therefore be configured to comprise multiple channels representing the motion of the apparatus in more than a single direction or axis.
- The operation of transmitting and receiving (or alternatively storing and retrieving) the combined signal comprising the mono audio signal and metadata is shown in Figure 5 by step 407.
- the transfer as described herein is shown in Figure 2 by the transfer channel 121 .
- the transfer can be a wired or wireless coupling between the recording apparatus 91 and the playback apparatus 93.
- the encoder in some embodiments passes the combined signal to a suitable transmitter where the combined signal is encoded for transmission and transmitted.
- the playback apparatus 93 similarly comprises a suitable receiver (not shown) configured to receive the transmitted signal and demodulate it to regenerate the combined signal.
- the transmitter and/or receiver in some embodiments can be components of a suitable transceiver, for example such as described with respect to Figure 1.
- the playback apparatus 93 comprises a decoder 151 .
- the decoder 151 can be configured to receive the combined signal comprising the (single channel) mono audio signal and the spatial attributes metadata. The decoder 151 can then be configured to decode the data stream to generate a separate mono audio output 153 and orientation angle output 155. The decoder 151 can in some embodiments output the mono audio signal as a mono audio output (mono audio o/p) 153 to the externalizer 157 and output the orientation angle output (orientation angle o/p) 155 (or other suitable spatial attributes) to the externalizer 157.
- the playback apparatus 93 comprises an externalizer 157.
- the externalizer 157 can be configured to receive the mono audio signal in the form of the mono audio output 153 as an audio input and the orientation angle or other spatial attributes in the form of the orientation angle output 155 as a control input and further be configured to process the mono audio signal based on the orientation angle to generate a suitable spatial audio output 159.
- where the output playback system comprises a right audio channel speaker 301 and a left audio channel speaker 303 with the user 307 located between these, the externalizer 157 can pan the mono audio signal from right to left (based on the rotation of the recording apparatus) to add a spatial dimension to the playback.
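A simple constant-power amplitude panning sketch of this externalizer behaviour is shown below (Python). The mapping from the decoded orientation angle to the pan position, and the -90 to +90 degree range, are assumptions for illustration.

```python
import numpy as np

def pan_mono_to_stereo(mono, orientation_deg):
    """Constant-power pan of a mono signal between left and right speakers,
    driven by the decoded orientation angle.
    orientation_deg: -90 (full left) to +90 (full right), scalar or per sample."""
    mono = np.asarray(mono, dtype=float)
    pan = np.clip(np.asarray(orientation_deg, dtype=float) / 90.0, -1.0, 1.0)
    theta = (pan + 1.0) * np.pi / 4.0      # 0 .. pi/2
    left = np.cos(theta) * mono
    right = np.sin(theta) * mono
    return np.stack([left, right], axis=0)  # shape (2, n_samples)
```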
- One of the issues addressed by the embodiments as described herein is how to combine the mono audio signal and the spatial attributes (such as the orientation angle) to generate a combined audio signal which can be transmitted and/or stored in a manner similar to typical audio signals.
- the spatial attributes, such as the orientation angle information, are encoded as a subband located next to the mono audio channel.
- the orientation information can be coded on a limited frequency band at a lower or higher frequency band in the spectrum of the mono audio channel.
- the spatial attributes such as the orientation information is encoded under the audio signal masking threshold.
- the spatial attribute data is encoded on a band-limited frequency channel; the encoded signal is then mixed into the audio signal so that the encoded signal is always inaudible, in other words masked by the audio signal.
- the band-limited spatial attributes (such as the orientation angle) are spread over the whole available audio band using a pseudorandom code (PRN), and the encoded signal is then mixed with the audio signal so that the encoding is always masked by the audio signal.
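A direct-sequence spread-spectrum sketch of this idea follows (Python). The chip rate, seed handling and amplitude are assumptions, and a real system would additionally need synchronisation and the masking control discussed above.

```python
import numpy as np

def spread_bits(bits, chips_per_bit=1024, seed=1234, amplitude=0.01):
    """Spread data bits over the whole band with a pseudorandom (PRN)
    chip sequence so they can be mixed inaudibly into the audio."""
    rng = np.random.default_rng(seed)
    prn = rng.choice([-1.0, 1.0], size=len(bits) * chips_per_bit)
    symbols = np.repeat(np.where(np.asarray(bits) > 0, 1.0, -1.0), chips_per_bit)
    return amplitude * prn * symbols

def despread_bits(received, n_bits, chips_per_bit=1024, seed=1234):
    """Correlate with the same PRN sequence to recover the bits."""
    rng = np.random.default_rng(seed)
    prn = rng.choice([-1.0, 1.0], size=n_bits * chips_per_bit)
    corr = (np.asarray(received)[:n_bits * chips_per_bit] * prn)
    corr = corr.reshape(n_bits, chips_per_bit)
    return (corr.sum(axis=1) > 0).astype(int)
```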
- with respect to Figure 6 an example encoder 101 is shown. Furthermore, with respect to Figure 11, the operation of the example encoder 101 shown in Figure 6 is further shown.
- the encoder 101 in some embodiments is configured to receive the orientation signal input 501 .
- the encoder 101 in some embodiments comprises a quantizer 502 configured to receive the orientation input from the orientation angle sensor 16. The quantizer 502 is then configured to quantize the orientation signal and pass it to the modulator 504.
- the quantizer 502 can be any suitable quantizer type such as a static quantizer, a dynamic quantizer, a linear quantizer, and a non-linear quantizer.
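For example, a uniform quantizer over the orientation range could be sketched as follows (Python). The 8-bit resolution and the 0 to 360 degree range are assumptions; the embodiments allow any suitable quantizer.

```python
def quantize_angle(angle_deg, bits=8, full_scale_deg=360.0):
    """Uniformly quantize an orientation angle to an integer code."""
    levels = 2 ** bits
    step = full_scale_deg / levels
    code = int(round((angle_deg % full_scale_deg) / step)) % levels
    return code  # later modulated and combined with the audio

def dequantize_angle(code, bits=8, full_scale_deg=360.0):
    """Map the integer code back to an angle in degrees."""
    return code * full_scale_deg / (2 ** bits)
```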
- the encoder 101 comprises a modulator 504.
- the modulator 504 can in some embodiments be configured to receive the quantized output of the orientation value from the quantizer 502 and modulate the quantized value such that the data output from the modulator occupies a defined frequency range (or a series of spikes). The modulator 504 can then output the modulated quantized orientation signal to a combiner 508.
- The operation of modulating the quantized orientation signal is shown in Figure 11 by step 1105.
- the encoder 101 can be further configured to receive the mono audio input 503 which is passed to a band-pass filter 506.
- the encoder 101 comprises a band-pass filter 506.
- the band-pass filter 506 can be configured to receive the mono audio signal and bandpass filter the mono audio signal to produce a filtered audio signal with a defined frequency range (which is separate from that of the orientation signal from the modulator 504).
- the band-pass filter 506 can then output this filtered signal to a combiner 508.
- the encoder 101 in some embodiments can comprise a combiner 508 configured to receive the modulated quantized orientation signal and the band-pass filtered audio signal and combine these to generate a frequency spectrum suitable for outputting (or recording).
- the combined signal can be further processed for transmission or storage purposes.
- the further processing can be configured to perform processing on the combined audio signal as if the combined signal was a conventional audio signal.
- the operation of outputting the combined signal is shown in Figure 11 by step 1009.
- the encoder 101 generates a frequency spectrum 551 such that the output of the band-pass filter generates an audio signal with a frequency range 533 of between 20 Hz and 17 kHz and the modulated quantized orientation signal is encoded into a frequency range 555 above 17 kHz. It would be understood that the frequency ranges described herein are examples only and that in some embodiments the frequency ranges can be any suitable range.
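A sketch of this band-split combination in Python/SciPy follows. The 48 kHz sample rate, 17 kHz split, 18 kHz carrier and on-off-keyed modulation are illustrative assumptions only; the embodiments cover other modulation schemes and frequency placements.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 48000          # assumed sample rate
SPLIT_HZ = 17000    # audio kept below, data carried above (example values)

def encode_frame(mono_audio, data_bits, carrier_hz=18000, level=0.02):
    """Low-pass the audio below the split frequency and add the data as an
    on-off-keyed tone above it, echoing the example spectrum of Figure 8."""
    mono_audio = np.asarray(mono_audio, dtype=float)
    sos = butter(8, SPLIT_HZ, btype="low", fs=FS, output="sos")
    audio_band = sosfilt(sos, mono_audio)

    samples_per_bit = len(mono_audio) // max(len(data_bits), 1)
    t = np.arange(len(mono_audio)) / FS
    keying = np.repeat(np.asarray(data_bits, dtype=float), samples_per_bit)
    keying = np.pad(keying, (0, len(mono_audio) - len(keying)))
    data_band = level * keying * np.sin(2 * np.pi * carrier_hz * t)

    return audio_band + data_band
```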
- Figure 8 shows a frequency spectrum wherein the audio signal is band-pass filtered to an audible frequency band 751 which is approximately from 20 Hz to 17 kHz.
- the quantized orientation signal can be modulated to a frequency higher than the audible frequency band frequencies, in other words the orientation information is modulated to use the frequency band beyond typical human hearing and therefore above 17 kHz. This higher frequency modulation is shown in Figure 8 by the frequency range of the block 705.
- the modulator 504 can be configured to modulate the orientation signal as a DC level in other words the orientation value is represented by a DC current or voltage value.
- the DC modulation is represented in Figure 8 by the DC frequency spike 701.
- the modulator 504 can be configured to modulate the orientation signal to a frequency lower than the audible frequencies (such as below 20 Hz).
- the lower frequency modulation is shown in Figure 8 by the frequency band 703.
- with respect to Figure 7 an example decoder 151 is shown suitable for decoding the output of the encoder 101 as shown in Figure 6. Furthermore, with respect to Figure 12, the operation of the decoder 151 shown in Figure 7 according to some embodiments is further described.
- the decoder 151 is configured to receive the combined signal (comprising the orientation signal and audio signal). It would be understood that in some embodiments the combined signal is received from a transceiver or receiver which decodes or demodulates a received signal into the combined signal. Furthermore it would be understood that in some embodiments the combined signal is retrieved from memory or other storage media.
- the operation of receiving the combined signal is shown in Figure 12 by step 1101.
- the decoder 151 comprises a first band-pass filter 601 .
- the first band-pass filter 601 comprises a band-pass filter with frequency characteristics similar to the band-pass filter 506 in the encoder 101 and as such configured to output the audio frequency components.
- the operation of band-pass filtering the combined signal to generate the audio signal is shown in Figure 12 by step 1103.
- the band-pass filter can then output the audio signal 611 as the mono audio output 153.
- the operation of outputting the audio signal is shown in Figure 12 by step 1105.
- the decoder 151 comprises a second band-pass filter 603.
- the second band-pass filter 603 is configured to filter the combined signal to separate out the modulated orientation values.
- the second band-pass filter 603 has frequency characteristics which aim to remove the audible frequency components and pass the frequency band or range generated by the output of the modulator 504 in the encoder 101.
- the form of the second band-pass filter 603 depends on the modulator used in the encoder 101: where the modulator 504 modulates the orientation signal as a DC level the second band-pass filter 603 is a DC pass circuit or filter (or AC block circuit or filter); where the modulator produces a lower frequency modulation the second band-pass filter 603 is a lower frequency band-pass filter; where the modulator 504 is a higher frequency modulator the second band-pass filter 603 is a higher frequency band-pass filter; and where the modulator 504 is at least one single frequency component modulator the second band-pass filter 603 comprises a series of single frequency pass filters.
- the second band-pass filter 603 can then output the filtered frequency components to the demodulator 605.
- the decoder 151 comprises a demodulator 605 configured to perform an inverse of the modulation scheme performed by the modulator 504 in the encoder 101 . It would be understood that the modulator 504 and therefore the demodulator 605 can be configured to perform any suitable modulation/demodulation scheme or method to modulate the orientation signal such as for example frequency modulation, amplitude modulation, phase modulation, static or dynamic modulation or variants thereof.
- the operation of demodulating the band-pass filtered modulated orientation signal is shown in Figure 12 by step 1104.
- the demodulator 605 can then output the orientation signal 613 as the orientation angle output 155.
- the operation of outputting the orientation angle output is shown in Figure 12 by step 1106.
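- a corresponding decoder sketch, under the same assumptions as the encoder example above (48 kHz sampling and a 19 kHz amplitude-modulated carrier), separates the two bands and recovers the orientation value by envelope demodulation. The concrete filter designs and the envelope detector are illustrative choices rather than the specific implementation of the first band-pass filter 601, second band-pass filter 603 and demodulator 605.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 48000
CARRIER_HZ = 19000.0

def audio_band_filter(x, fs=FS):
    """First band-pass filter 601: keep the audible 20 Hz - 17 kHz components."""
    sos = butter(4, [20.0 / (fs / 2), 17000.0 / (fs / 2)], btype="band", output="sos")
    return sosfilt(sos, x)

def data_band_filter(x, fs=FS):
    """Second band-pass filter 603: keep a narrow band around the assumed data carrier."""
    sos = butter(4, [(CARRIER_HZ - 500.0) / (fs / 2),
                     (CARRIER_HZ + 500.0) / (fs / 2)], btype="band", output="sos")
    return sosfilt(sos, x)

def demodulate(data_band, fs=FS):
    """Recover the held orientation value from the carrier amplitude (envelope detection)."""
    envelope = np.abs(data_band)
    sos = butter(2, 50.0 / (fs / 2), btype="low", output="sos")
    smoothed = sosfilt(sos, envelope)
    # approximately invert the 0..0.1 amplitude mapping used by the example encoder
    amplitude = smoothed * (np.pi / 2.0)
    return np.clip(amplitude / 0.1 * 360.0, 0.0, 360.0)

def decode(combined):
    """Split the combined signal into the mono audio output 153 and orientation output 155."""
    audio_out = audio_band_filter(combined)
    orientation = demodulate(data_band_filter(combined))
    return audio_out, orientation
```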
- with respect to Figure 9 a further example encoder 101 is shown according to some embodiments.
- with respect to Figure 13 the operation of the example encoder 101 shown in Figure 9 is further described.
- the encoder 101 in some embodiments is configured to receive the orientation signal input 801 .
- the operation of receiving the orientation signal input is shown in Figure 13 by step 1201.
- the encoder 101 in some embodiments comprises a quantizer 802 configured to receive the orientation input from the orientation angle sensor 16. The quantizer 802 is then configured to quantize the orientation signal and pass to the modulator 804.
- the quantizer 802 can be any suitable quantizer type such as a static quantizer, a dynamic quantizer, a linear quantizer, and a non-linear quantizer.
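- as a hedged illustration of the difference between these quantizer types, the sketch below shows a linear (uniform) quantizer alongside a simple non-linear quantizer that spends more resolution near the forward direction; the step size, level count and warping function are arbitrary example choices.

```python
import numpy as np

def linear_quantize(angle_deg, step_deg=5.0):
    """Linear (uniform) quantizer: fixed 5-degree steps, e.g. 47.2 degrees -> 45 degrees."""
    return step_deg * np.round(np.asarray(angle_deg, dtype=float) / step_deg)

def nonlinear_quantize(angle_deg, levels=32):
    """Non-linear quantizer: finer steps near 0 degrees (ahead), coarser towards the sides."""
    a = np.asarray(angle_deg, dtype=float)
    signed = (a + 180.0) % 360.0 - 180.0                        # map to -180..180 degrees
    warped = np.sign(signed) * np.sqrt(np.abs(signed) / 180.0)  # compress large angles
    idx = np.round((warped + 1.0) / 2.0 * (levels - 1))         # quantize the warped value
    back = (idx / (levels - 1)) * 2.0 - 1.0
    return np.sign(back) * (back ** 2) * 180.0                  # invert the warping

print(linear_quantize([3.0, 47.2, 181.0]))     # 3 -> 5, 47.2 -> 45, 181 -> 180
print(nonlinear_quantize([3.0, 47.2, 181.0]))  # finer resolution for the small angle
```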
- the encoder 101 comprises a modulator 804.
- the modulator 804 can in some embodiments be configured to receive the quantized output of the orientation value from the quantizer 802 and modulate the quantized value such that the data output from the modulator occupies a defined frequency range (or a series of spikes).
- the modulator 804 can then output the modulated quantized orientation signal to a pseudo-random number encoder (PRN coder) 806.
- the encoder 101 comprises a pseudo-random number encoder (PRN coder) 806.
- the PRN coder 806 is configured to receive the modulated orientation signals and furthermore a key 805 and encode the modulated orientation signals using the key.
- the PRN coder 806 can then output the PRN coded modulated orientation signals to a psychoacoustic processor 808.
- the operation of PRN coding the modulated quantized orientation signals according to a key is shown in Figure 13 by step 1207.
- the encoder 101 comprises a psychoacoustic processor 808.
- the psychoacoustic processor 808 can be configured to receive the PRN coded modulated quantized orientation signals and furthermore receive the audio input 803.
- the psychoacoustic processor 808 can in some embodiments be configured to check whether the coding noise (the coded modulated quantized orientation signals) is inaudible with respect to the audio signals. Where the coding noise is masked by the audio signal, and therefore inaudible, the psychoacoustic processor 808 can be configured to output the PRN coded modulated quantized orientation signals to a combiner 810. Where the coding noise is not masked by the audio signal, and would therefore be audible, the psychoacoustic processor 808 can be configured to control the PRN coder to recode the modulated quantized orientation signals.
- the encoder 101 comprises a combiner 810.
- the combiner 810 is configured to receive the audio signal 803 and the PRN coded modulated quantized orientation values and combine them in a suitable manner to be output as a combined signal.
- the operation of combining the audio signal and the PRN coded modulated quantized orientation signals is shown in Figure 13 by step 1211.
- the combined signal can be further processed for transmission or storage purposes.
- the further processing can be configured to perform processing on the combined audio signal as if the combined signal was a conventional audio signal.
- the operation of outputting the combined signal is shown in Figure 13 by step 1213.
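- one way such a chain could look is sketched below: each quantized orientation value is turned into bits, spread with a key-seeded pseudo-random chip sequence (standing in for the PRN coder 806), scaled well below the audio level as a crude substitute for the psychoacoustic check of the processor 808, and added to the audio in place of the combiner 810. The bit width, spreading factor, amplitude margin and the use of a simple energy ratio instead of a genuine psychoacoustic model are all assumptions made for the example.

```python
import numpy as np

CHIPS_PER_BIT = 256     # spreading factor (assumption)
BITS_PER_VALUE = 8      # orientation quantized to 8 bits: 0..255 maps to 0..360 degrees
MARGIN = 0.01           # embedded level kept roughly 40 dB below the audio RMS (assumption)

def orientation_to_bits(angle_deg):
    """Quantize one orientation angle to an 8-bit code word, most significant bit first."""
    code = int(np.clip(round(angle_deg / 360.0 * 255.0), 0, 255))
    return [(code >> b) & 1 for b in range(BITS_PER_VALUE - 1, -1, -1)]

def prn_chips(key, n_chips):
    """Key-seeded pseudo-random +/-1 chip sequence (in the role of the PRN coder 806)."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=n_chips)

def embed(audio, angle_deg, key):
    """Spread the orientation bits with the PRN sequence and add them to the audio frame."""
    bits = orientation_to_bits(angle_deg)
    chips = prn_chips(key, CHIPS_PER_BIT * len(bits))
    # BPSK-style spreading: bit 1 -> +chips, bit 0 -> -chips
    symbols = np.repeat([1.0 if b else -1.0 for b in bits], CHIPS_PER_BIT)
    watermark = symbols * chips
    # crude stand-in for the psychoacoustic check: scale relative to the audio energy
    amplitude = MARGIN * np.sqrt(np.mean(audio[:len(watermark)] ** 2) + 1e-12)
    out = np.array(audio, dtype=float, copy=True)
    out[:len(watermark)] += amplitude * watermark
    return out

# usage: embed a 90-degree orientation into one 4096-sample audio frame using key 1234
frame = np.random.randn(4096) * 0.1
combined = embed(frame, 90.0, key=1234)
```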
- with respect to Figure 10 a further example decoder 151 is shown suitable for decoding the output of the encoder 101 as shown in Figure 9. Furthermore with respect to Figure 14 the operation of the decoder 151 shown in Figure 10 according to some embodiments is further described.
- the decoder 151 is configured to receive the combined signal (comprising the orientation signal and audio signal). It would be understood that in some embodiments the combined signal is received from a transceiver or receiver which decodes or demodulates a received signal into the combined signal. Furthermore it would be understood that in some embodiments the combined signal is retrieved from memory or other storage media. The operation of receiving the combined signal is shown in Figure 14 by step 1301 .
- the decoder 151 outputs the combined signal as the audio signal 811 (in other words as the mono audio output 153).
- the decoder 151 comprises a pseudo-random number decoder (PRN decoder) 901 .
- the PRN decoder 901 is configured to receive the combined signal and a key 905.
- the key 905 is the same key used by the PRN coder 806 within the encoder 101 and configured to permit the decoding of the combined signal to output a modulated orientation signal.
- the output of the PRN decoder 901 can then be output to the demodulator 903.
- the decoder 151 comprises a demodulator 903 configured to perform an inverse of the modulation scheme performed by the modulator 804 in the encoder 101 . It would be understood that the modulator 804 and therefore the demodulator 903 can be configured to perform any suitable modulation/demodulation scheme or method to modulate the orientation signal such as for example frequency modulation, amplitude modulation, phase modulation, static or dynamic modulation or variants thereof.
- the operation of demodulating the modulated orientation signal is shown in Figure 14 by step 1305.
- the demodulator 903 can then output the orientation signal 913 as the orientation angle output 155.
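- continuing the spread-spectrum sketch above, a matching decoder regenerates the same key-seeded chip sequence and correlates it against the received signal to recover the bits and hence the orientation value. This assumes the audio is roughly uncorrelated with the chip sequence and is an illustrative stand-in for the PRN decoder 901 and demodulator 903 rather than their specified implementation.

```python
import numpy as np

CHIPS_PER_BIT = 256
BITS_PER_VALUE = 8

def prn_chips(key, n_chips):
    """Must be seeded with the same key 905 as the encoder to regenerate the chip sequence."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=n_chips)

def extract(combined, key):
    """Despread the combined signal and decide each bit from the sign of the correlation."""
    chips = prn_chips(key, CHIPS_PER_BIT * BITS_PER_VALUE)
    despread = np.asarray(combined[:len(chips)], dtype=float) * chips
    code = 0
    for b in range(BITS_PER_VALUE):                       # most significant bit first
        segment = despread[b * CHIPS_PER_BIT:(b + 1) * CHIPS_PER_BIT]
        bit = 1 if segment.sum() > 0.0 else 0
        code = (code << 1) | bit
    return code / 255.0 * 360.0                           # back to an angle in degrees

# usage (with the encoder sketch above): angle = extract(combined, key=1234)
```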
- in the examples described herein the metadata channel has been a single data channel which defines an orientation value.
- the metadata can comprise more than one data channel and/or data or information other than orientation.
- the metadata comprises channels defining orientation (or azimuth) and elevation, and thus is able to define a spherical co-ordinate.
- the metadata comprises range information.
- an audio or sound signal can be encoded with information defining the location of the audio or sound signal such that it can be represented in the playback apparatus.
- the metadata can be used or employed in any suitable manner.
- in the examples described herein the metadata comprises the orientation information.
- the metadata can be employed to produce any suitable output which is based on the metadata.
- the metadata can be used to trigger additional audio effects.
- the decoder can inject a suitable audio signal into the output audio stream when a determined condition is met. For example a sudden orientation change can cause the decoder to inject a swoosh sound effect, while a sudden elevation change can cause the decoder to inject a bouncing or 'boing' sound effect.
- the metadata can be used to generate or modify display or video outputs. For example a rapid change in orientation or elevation can cause the decoder to 'shake' the displayed image.
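- a minimal sketch of this kind of metadata-driven control is shown below: the decoded orientation track is differentiated, and when the rate of change exceeds a threshold an extra effect sample is mixed into the output stream and a 'shake' flag is raised for the display layer. The threshold value, the metadata rate and the availability of a pre-loaded swoosh sample are assumptions for the example only.

```python
import numpy as np

RATE_THRESHOLD_DEG_PER_S = 180.0    # what counts as a "sudden" change (assumed threshold)

def trigger_effects(orientation_deg, metadata_rate_hz, audio_out, swoosh, frame_len):
    """Mix a swoosh into the audio and flag a display 'shake' on rapid orientation changes."""
    rate = np.abs(np.diff(orientation_deg)) * metadata_rate_hz   # degrees per second
    out = np.array(audio_out, dtype=float, copy=True)
    shake_frames = []
    for i, r in enumerate(rate):
        if r > RATE_THRESHOLD_DEG_PER_S:
            start = i * frame_len
            end = min(start + len(swoosh), len(out))
            out[start:end] += swoosh[:end - start]    # inject the effect into the stream
            shake_frames.append(i)                    # tell the display layer to 'shake'
    return out, shake_frames

# usage: 50 Hz metadata over one second of 48 kHz audio, a noise burst as the swoosh
orientation = np.concatenate([np.zeros(25), np.full(25, 90.0)])   # sudden 90-degree turn
audio = np.zeros(48000)
swoosh = np.random.randn(4800) * 0.05
audio_with_fx, shakes = trigger_effects(orientation, 50, audio, swoosh, frame_len=960)
```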
- the metadata can be employed as a control for an apparatus function other than displaying audio and/or video images.
- the control can be gaming function control, and thus in some embodiments can be used as a remote controller controlling the motion of a gaming character.
- any suitable functional control can employ the metadata information.
- metadata can be used to control the user interface on a remote or decoder apparatus.
- embodiments may also be applied to audio-video signals where the audio signal components of the recorded data are processed in terms of determining the base signal and determining the time alignment factors for the remaining signals, and the video signal components may be synchronised using the above embodiments of the invention.
- the video parts may be synchronised using the audio synchronisation information.
- user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers.
- furthermore elements of a public land mobile network (PLMN) may also comprise apparatus as described herein.
- the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
- some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto.
- While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
- the embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware.
- any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions.
- the software may be stored on such physical media as memory chips, or memory blocks implemented within the processor, magnetic media such as hard disk or floppy disks, and optical media such as, for example, DVD and the data variants thereof, and CD.
- the memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory.
- the data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASIC), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.
- Embodiments of the inventions may be practiced in various components such as integrated circuit modules.
- the design of integrated circuits is by and large a highly automated process.
- Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
- Programs such as those provided by Synopsys, Inc. of Mountain View, California and Cadence Design, of San Jose, California automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules.
- the resultant design in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or "fab" for fabrication.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
Abstract
An apparatus comprising: an audio signal generator configured to generate a single channel audio signal; a data generator configured to generate at least one channel of data associated with the single channel audio signal; and an encoder configured to encode the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
Description
DIRECTIONAL AUDIO APPARATUS
Field
The present application relates to apparatus for processing of audio and additionally audio-video signals to enable directional audio encoding and decoding over a mono channel audio signal. The invention further relates to, but is not limited to, apparatus for processing of audio and additionally audio-video signals to enable directional audio encoding and decoding over a mono channel audio signal between mobile devices.
Background
Viewing recorded or streamed audio-video or audio content is well known.
Stereo recording is a well-known feature and widely supported in all kinds of audio devices. There are many kinds of multi-microphone recorders that enable good quality multi-channel audio recordings and which for example can be played back by home theatre systems. For very impressive personal spatial sound recording, binaural microphones can be used to capture the true spatial audio environment at the user's ears.
In the above examples the recorder device requires capturing two or more channels of audio. However, many mobile recording devices, especially user equipment or phones and headsets, support only single channel recording. For example, two channel capture support for Bluetooth is very rare. Similarly in some circumstances the transmission encoding is configured for single or mono channel audio signals and so unable to convey multichannel information.
Summary
Aspects of this application thus provide suitable audio capture and audio playback for multichannel audio signals using single channel storage or recording to permit a better audio listening experience. There is provided according to a first aspect an apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to with the at least one processor cause the apparatus to at least: generate a single channel audio signal; generate at least one channel of data associated with the single channel audio signal; and encode the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding. Generating the single channel audio signal may cause the apparatus to perform at least one of: receive a single channel audio signal from a single microphone; generate a single channel audio signal from at least two channel audio signals; generate a single channel audio signal from at least two channel audio signals received from at least one microphone; receive the single channel audio signal from a memory; and receive the single channel audio signal from a further apparatus separate from the apparatus.
Generating the at least one channel of data associated with the single channel audio signal may cause the apparatus to perform at least one of: generate at least one spatial parameter associated with the single channel audio signal; generate an azimuth parameter associated with the single channel audio signal; generate an elevation parameter associated with the single channel audio signal; generate a range parameter associated with the single channel audio signal; generate at least one velocity parameter associated with the single channel audio signal; generate at least one recording configuration parameter associated with the single channel audio signal; generate an azimuth parameter associated with at least one
microphone recording the single channel audio signal; generate an elevation parameter associated with at least one microphone recording the single channel audio signal; and generate at least one processing parameter associated with the generation of the single channel audio signal.
Encoding the at least one channel of data within the single channel audio signal may cause the apparatus to perform at least one of: quantize the at least one channel of data; modulate the quantized at least one channel of data; and combine the modulated quantized at least one channel of data with the single channel audio signal.
The apparatus may be further caused to band-pass filter the single channel audio signal, and wherein modulating the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
Band-pass filtering the single channel audio signal may cause the apparatus to perform at least one of: low pass filter the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to a frequency range above the range of the low pass filter; and high pass filter the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to a frequency range below the range of the high pass filter.
The apparatus may be further caused to band-stop filter the single channel audio signal, and wherein modulating the quantized at least one channel of data may cause the apparatus to frequency shift the quantized at least one channel of data to at least one frequency of the band-stop filter.
Modulating the quantized at least one channel of data may cause the apparatus to: apply a defined pseudo-random frequency shifting to the at least one channel of data; and determine the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
According to a second aspect there is provided an apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to with the at least one processor cause the apparatus to at least: receive a single channel audio signal, the single channel audio signal comprising at least one channel of data; separate the at least one channel of data from the single channel audio signal; and control at least one operation of the apparatus based on the at least one channel of data. Separating the at least one channel of data from the single channel audio signal may cause the apparatus to: apply a first band pass filter to extract the at least one channel of data; and apply a second band pass filter to extract an audio signal from the single channel audio signal. The first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
The first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
The first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter. Separating the at least one channel of data from the single channel audio signal may cause the apparatus to: apply a pseudo-random decoding of the single
channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and demodulate the decoded at least one channel of data to regenerate the at least one channel of data. Controlling at least one operation of the apparatus based on the at least one channel of data may cause the apparatus to perform at least one of: process the single channel audio signal based on the at least one channel of data; generate at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; generate at least user interface input based on the at least one channel of data; generate at least gaming control input based on the at least one channel of data; and generate at least one visual effect to be displayed based on the at least one channel of data.
Processing the single channel audio signal based on the at least one channel of data may cause the apparatus to perform at least one of: generate at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; generate a spatial audio signal based on the single channel audio signal and the at least one channel of data. According to a third aspect there is provided a method comprising: generating a single channel audio signal; generating at least one channel of data associated with the single channel audio signal; and encoding the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
Generating the single channel audio signal may comprise at least one of: receiving a single channel audio signal from a single microphone; generating a single channel audio signal from at least two channel audio signals; generating a single channel audio signal from at least two channel audio signals received from at least one microphone; receiving the single channel audio signal from a memory; and
receiving the single channel audio signal from a further apparatus separate from the apparatus.
Generating the at least one channel of data associated with the single channel audio signal may comprise at least one of: generating at least one spatial parameter associated with the single channel audio signal; generating an azimuth parameter associated with the single channel audio signal; generating an elevation parameter associated with the single channel audio signal; generating a range parameter associated with the single channel audio signal; generating at least one velocity parameter associated with the single channel audio signal; generating at least one recording configuration parameter associated with the single channel audio signal; generating an azimuth parameter associated with at least one microphone recording the single channel audio signal; generating an elevation parameter associated with at least one microphone recording the single channel audio signal; and generating at least one processing parameter associated with the generation of the single channel audio signal.
Encoding the at least one channel of data within the single channel audio signal may comprise at least one of: quantizing the at least one channel of data; modulating the quantized at least one channel of data; and combining the modulated quantized at least one channel of data with the single channel audio signal.
The method may further comprise band-pass filtering the single channel audio signal, and wherein modulating the quantized at least one channel of data may comprise frequency shifting the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
Band-pass filtering the single channel audio signal may comprise at least one of: low pass filtering the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may comprise frequency shifting the
quantized at least one channel of data to a frequency range above the range of the low pass filter; and high pass filtering the single channel audio signal, wherein frequency shifting the quantized at least one channel of data may comprise frequency shifting the quantized at least one channel of data to a frequency range below the range of the high pass filter.
The method may further comprise band-stop filtering the single channel audio signal, and wherein modulating the quantized at least one channel of data may comprise frequency shifting the quantized at least one channel of data to at least one frequency of the band-stop filter.
Modulating the quantized at least one channel of data may comprise: applying a defined pseudo-random frequency shifting to the at least one channel of data; and determining the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
According to a fourth aspect there is provided a method comprising: receiving a single channel audio signal, the single channel audio signal comprising at least one channel of data; separating the at least one channel of data from the single channel audio signal; and controlling at least one operation of the apparatus based on the at least one channel of data.
Separating the at least one channel of data from the single channel audio signal may comprise: applying a first band pass filter to extract the at least one channel of data; and applying a second band pass filter to extract an audio signal from the single channel audio signal.
The first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
The first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
The first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter.
Separating the at least one channel of data from the single channel audio signal may comprise: applying a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and demodulating the decoded at least one channel of data to regenerate the at least one channel of data.
Controlling at least one operation of the apparatus based on the at least one channel of data may comprise at least one of: processing the single channel audio signal based on the at least one channel of data; generating at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; generating at least user interface input based on the at least one channel of data; generating at least gaming control input based on the at least one channel of data; and generating at least one visual effect to be displayed based on the at least one channel of data.
Processing the single channel audio signal based on the at least one channel of data may comprise at least one of: generating at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; generating a spatial audio signal based on the single channel audio signal and the at least one channel of data.
According to a fifth aspect there is provided an apparatus comprising: means for generating a single channel audio signal; means for generating at least one channel of data associated with the single channel audio signal; and means for
encoding the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding. The means for generating the single channel audio signal may comprise at least one of: means for receiving a single channel audio signal from a single microphone; means for generating a single channel audio signal from at least two channel audio signals; means for generating a single channel audio signal from at least two channel audio signals received from at least one microphone; means for receiving the single channel audio signal from a memory; and receiving the single channel audio signal from a further apparatus separate from the apparatus.
The means for generating the at least one channel of data associated with the single channel audio signal may comprise at least one of: means for generating at least one spatial parameter associated with the single channel audio signal; means for generating an azimuth parameter associated with the single channel audio signal; generating an elevation parameter associated with the single channel audio signal; means for generating a range parameter associated with the single channel audio signal; means for generating at least one velocity parameter associated with the single channel audio signal; means for generating at least one recording configuration parameter associated with the single channel audio signal; means for generating an azimuth parameter associated with at least one microphone recording the single channel audio signal; means for generating an elevation parameter associated with at least one microphone recording the single channel audio signal; and means for generating at least one processing parameter associated with the generation of the single channel audio signal.
The means for encoding the at least one channel of data within the single channel audio signal may comprise at least one of: means for quantizing the at least one channel of data; means for modulating the quantized at least one channel of data;
and means for combining the modulated quantized at least one channel of data with the single channel audio signal.
The apparatus may further comprise means for band-pass filtering the single channel audio signal, and wherein the means for modulating the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter. The means for band-pass filtering the single channel audio signal may comprise at least one of: means for low pass filtering the single channel audio signal, wherein the means for frequency shifting the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to a frequency range above the range of the low pass filter; and means for high pass filtering the single channel audio signal, wherein the means for frequency shifting the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to a frequency range below the range of the high pass filter. The apparatus may further comprise means for band-stop filtering the single channel audio signal, and wherein the means for modulating the quantized at least one channel of data may comprise means for frequency shifting the quantized at least one channel of data to at least one frequency of the band-stop filter. The means for modulating the quantized at least one channel of data may comprise: means for applying a defined pseudo-random frequency shifting to the at least one channel of data; and means for determining the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal. According to a sixth aspect there is provided an apparatus comprising: means for receiving a single channel audio signal, the single channel audio signal comprising
at least one channel of data; means for separating the at least one channel of data from the single channel audio signal; and means for controlling at least one operation of the apparatus based on the at least one channel of data. The means for separating the at least one channel of data from the single channel audio signal may comprise: means for applying a first band pass filter to extract the at least one channel of data; and means for applying a second band pass filter to extract an audio signal from the single channel audio signal. The first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
The first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
The first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter. The means for separating the at least one channel of data from the single channel audio signal may comprise: means for applying a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and means for demodulating the decoded at least one channel of data to regenerate the at least one channel of data.
The means for controlling at least one operation of the apparatus based on the at least one channel of data may comprise at least one of: means for processing the single channel audio signal based on the at least one channel of data; means for generating at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; means for generating at least user interface input based on the at least one channel of data; means for
generating at least gaming control input based on the at least one channel of data; and means for generating at least one visual effect to be displayed based on the at least one channel of data. The means for processing the single channel audio signal based on the at least one channel of data may comprise at least one of: means for generating at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; means for generating a spatial audio signal based on the single channel audio signal and the at least one channel of data.
According to a seventh aspect there is provided an apparatus comprising: an audio signal generator configured to generate a single channel audio signal; a data generator configured to generate at least one channel of data associated with the single channel audio signal; and an encoder configured to encode the at least one channel of data within the single channel audio signal, such that the at least one channel data can be decoded and employed to control at least one operation of an apparatus performing the decoding. The audio signal generator may comprise at least one of: a microphone input configured to receive a single channel audio signal from a single microphone; a downmixer configured to generate a single channel audio signal from at least two channel audio signals; a microphone downmixer configured to generate a single channel audio signal from at least two channel audio signals received from at least one microphone; a input configured to receive the single channel audio signal from a memory; and a receiver configured to receive the single channel audio signal from a further apparatus separate from the apparatus.
The data generator may comprise at least one of: a spatial parameter generator configured to generate at least one spatial parameter associated with the single channel audio signal; a azimuth generator configured to generate an azimuth
parameter associated with the single channel audio signal; an elevation generator configured to generate an elevation parameter associated with the single channel audio signal; a range generator configured to generate a range parameter associated with the single channel audio signal; a velocity determiner configured to generate at least one velocity parameter associated with the single channel audio signal; a configuration determiner configured to generate at least one recording configuration parameter associated with the single channel audio signal; a microphone azimuth generator configured to generate an azimuth parameter associated with at least one microphone recording the single channel audio signal; a microphone elevation generator configured to generate an elevation parameter associated with at least one microphone recording the single channel audio signal; and a processing determiner configured to generate at least one processing parameter associated with the generation of the single channel audio signal. The encoder may comprise at least one of: a quantizer configured to quantize the at least one channel of data; a modulator configured to modulate the quantized at least one channel of data; and a combiner configured to combine the modulated quantized at least one channel of data with the single channel audio signal. The apparatus may further comprise a band-pass filter configured to band-pass filter the single channel audio signal, and wherein the modulator may be configured to frequency shift the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter. The band-pass filter may comprise at least one of: a low pass filter configured to low-pass filter the single channel audio signal, wherein the modulator may be configured to frequency shift the quantized at least one channel of data to a frequency range above the range of the low pass filter; and a high pass filter configured to high pass filter the single channel audio signal, wherein the modulator may be configured to frequency shift the quantized at least one channel of data to a frequency range below the range of the high pass filter.
The apparatus may comprise a band-stop filter configured to band-stop filter the single channel audio signal, and wherein the modulator may be configured to frequency shift the quantized at least one channel of data to at least one frequency of the band-stop filter.
The modulator may comprise: a pseudo-random frequency shifter configured to apply a defined pseudo-random frequency shift to the at least one channel of data; and an audibility determiner configured to determine the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
According to an eighth aspect there is provided an apparatus comprising: a receiver configured to receive a single channel audio signal, the single channel audio signal comprising at least one channel of data; a separator configured to separate the at least one channel of data from the single channel audio signal; and a controller configured to control at least one operation of the apparatus based on the at least one channel of data.
The separator may comprise: a first band pass filter configured to band pass filter the single channel audio signal to extract the at least one channel of data; and a second band pass filter configured to band pass filter the single channel audio signal to extract an audio signal.
The first band pass filter may be a high pass filter, and the second band pass filter may be a low pass filter.
The first band pass filter may be a low pass filter, and the second band pass filter may be a high pass filter.
The first band pass filter may be at least one narrow band filter, and the second band pass filter may be a notch filter comprising stop bands at the frequencies of the at least one narrow band filter. The separator may comprise: a decoder configured to apply a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and a demodulator configured to demodulate the decoded at least one channel of data to regenerate the at least one channel of data.
Controlling at least one operation of the apparatus based on the at least one channel of data may comprise at least one of: a processor configured to process the single channel audio signal based on the at least one channel of data; a signal generator configured to generate at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data; a user input generator configured to generate at least user interface input based on the at least one channel of data; a gaming control generator configured to generate at least gaming control input based on the at least one channel of data; and a visual effect generator configured to generate at least one visual effect to be displayed based on the at least one channel of data.
The processor may comprise at least one of: an up-mixer configured to generate at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data; a spatial processor configured to generate a spatial audio signal based on the single channel audio signal and the at least one channel of data.
A computer program product stored on a medium may cause an apparatus to perform the method as described herein.
An electronic device may comprise apparatus as described herein.
A chipset may comprise apparatus as described herein.
Embodiments of the present application aim to address problems associated with the state of the art.
Summary of the Figures
For better understanding of the present application, reference will now be made by way of example to the accompanying drawings in which:
Figure 1 shows schematically an apparatus suitable for being employed in embodiments of the application;
Figure 2 shows schematically example recording and playback apparatus according to some embodiments;
Figure 3 shows schematically an example recording by the recording apparatus shown in Figure 2 according to some embodiments;
Figure 4 shows schematically an example playback by the playback apparatus shown in Figure 2 according to some embodiments;
Figure 5 shows a flow diagram of the operation of the example recording and playback apparatus as shown in Figure 2;
Figure 6 shows schematically an example encoder as shown in Figure 2 according to some embodiments;
Figure 7 shows schematically an example decoder as shown in Figure 2 according to some embodiments;
Figure 8 shows schematically example modulation ranges suitable for use by the encoder/decoder as shown in Figures 6 and 7.
Figure 9 shows schematically a further example encoder as shown in Figure 2 according to some embodiments;
Figure 10 shows schematically a further example decoder as shown in
Figure 2 according to some embodiments;
Figure 11 shows a flow diagram of the operation of the example encoder as shown in Figure 6 according to some embodiments;
Figure 12 shows a flow diagram of the operation of the example decoder as shown in Figure 7 according to some embodiments;
Figure 13 shows a flow diagram of the operation of the further example encoder as shown in Figure 9 according to some embodiments; and
Figure 14 shows a flow diagram of the operation of the further example decoder as shown in Figure 10 according to some embodiments.
Embodiments of the Application
The following describes in further detail suitable apparatus and possible mechanisms for the provision of effective audio signal capture and playback. In the following examples, audio signals and audio capture signals are described. However it would be appreciated that in some embodiments the audio signal/audio capture is a part of an audio-video system.
The concept of this application is related to assisting in the production of immersive communication. The concept as described by the embodiments herein is to capture an audio recording of the user's surroundings, capture spatial attributes of the situation (for example the apparatus orientation and microphone configuration or the user's head orientation data), encode the spatial attributes data with the audio for a normal mono audio transfer channel, and decode the spatial attributes data from the audio stream to generate a suitable multi-channel audio signal using the mono audio and spatial attributes.
In the examples described herein the transfer channel (or storage channel) supports only a single channel of audio data. Thus in the embodiments described herein the spatial attributes data is recorded, and this spatial meta data is encoded with the mono audio signal for transmission/storage. At reception/recovery, the spatial audio attributes can be extracted (decoded) from the mono audio track.
With the spatial audio attributes meta data, the original audio can be enriched with spatial effects to further improve the playback experience.
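As an illustration of this enrichment step, the following sketch (a minimal example, not the prescribed implementation) takes decoded mono audio and an orientation track and amplitude-pans the mono signal into a stereo pair so that the sound appears to move as the recording orientation moved. The constant-power panning law, the 50 Hz metadata rate and the sample rate are assumed values.

```python
import numpy as np

def enrich_to_stereo(mono, orientation_deg, metadata_rate_hz, fs):
    """Amplitude-pan a mono signal left/right according to the decoded orientation track."""
    n = len(mono)
    # interpolate the low-rate orientation track up to one value per audio sample
    t_meta = np.arange(len(orientation_deg)) / metadata_rate_hz
    t_audio = np.arange(n) / fs
    angle = np.interp(t_audio, t_meta, orientation_deg)
    # map -90..+90 degrees onto a constant-power pan position
    pan = np.clip(angle, -90.0, 90.0) / 90.0          # -1 = full left, +1 = full right
    theta = (pan + 1.0) * np.pi / 4.0                 # 0..pi/2
    left = np.cos(theta) * mono
    right = np.sin(theta) * mono
    return np.stack([left, right], axis=-1)

# usage: a 1 kHz tone that sweeps from left to right as the decoded orientation turns
fs = 48000
mono = 0.2 * np.sin(2 * np.pi * 1000.0 * np.arange(fs) / fs)
orientation = np.linspace(-90.0, 90.0, 50)            # decoded metadata at 50 Hz
stereo = enrich_to_stereo(mono, orientation, 50, fs)
```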
The embodiments as described herein for example differ from multichannel recording operations such as for example the directional audio (DirAC) encoding system in that whereas DirAC requires Multi-channel capture the embodiments as described herein feature mono capture. Furthermore whereas DirAC systems capture the sound environment in relation to the microphone the embodiments as described herein capture a mono sound and the user or apparatus motion. Also it would be understood that multi-channel capture methods such as DirAC in reproduction replicate the recorded sound environment with the available sound system whereas in the embodiments as described herein the apparatus/user motion is used to enrich the listening/playback experience. In this regard reference is first made to Figure 1 which shows a schematic block diagram of an exemplary apparatus or electronic device 10, which may be used to record (or operate as a recording or capturing apparatus) or listen (or operate as a playback apparatus) to the audio signals (and similarly to record or view the audiovisual images and data).
The electronic device 10 may for example be a mobile terminal or user equipment of a wireless communication system when functioning as the recording device or listening device 113. In some embodiments the apparatus can be an audio player or audio recorder, such as an MP3 player, a media recorder/player (also known as an MP4 player), or any suitable portable device suitable for recording audio or audio/video camcorder/memory audio or video recorder.
The apparatus 10 can in some embodiments comprise an audio subsystem. The audio subsystem for example can comprise in some embodiments a microphone or array of microphones 11 for audio signal capture. In some embodiments the microphone or array of microphones can be a solid state microphone, in other
words capable of capturing audio signals and outputting a suitable digital format signal. In some other embodiments the microphone or array of microphones 11 can comprise any suitable microphone or audio capture means, for example a condenser microphone, capacitor microphone, electrostatic microphone, electret condenser microphone, dynamic microphone, ribbon microphone, carbon microphone, piezoelectric microphone, or microelectrical-mechanical system (MEMS) microphone. The microphone 11 or array of microphones can in some embodiments output the audio captured signal to an analogue-to-digital converter (ADC) 14.
In some embodiments the apparatus can further comprise an analogue-to-digital converter (ADC) 14 configured to receive the analogue captured audio signal from the microphones and outputting the audio captured signal in a suitable digital form. The analogue-to-digital converter 14 can be any suitable analogue-to-digital conversion or processing means.
In some embodiments the apparatus 10 audio subsystem further comprises a digital-to-analogue converter 32 for converting digital audio signals from a processor 21 to a suitable analogue format. The digital-to-analogue converter (DAC) or signal processing means 32 can in some embodiments be any suitable DAC technology.
Furthermore the audio subsystem can comprise in some embodiments a speaker 33. The speaker 33 can in some embodiments receive the output from the digital- to-analogue converter 32 and present the analogue audio signal to the user. In some embodiments the speaker 33 can be representative of a headset, for example a set of headphones, or cordless headphones.
Although the apparatus 10 is shown having both audio capture and audio playback components, it would be understood that in some embodiments the apparatus 10 can comprise one or the other of the audio capture and audio playback parts of the
audio subsystem such that in some embodiments of the apparatus the microphone (for audio capture) or the speaker (for audio playback) are present.
In some embodiments the apparatus 10 comprises a processor 21. The processor 21 is coupled to the audio subsystem and specifically in some examples the analogue-to-digital converter 14 for receiving digital signals representing audio signals from the microphone 11, and the digital-to-analogue converter (DAC) 32 configured to output processed digital audio signals. The processor 21 can be configured to execute various program codes. The implemented program codes can comprise for example audio signal processing routines.
In some embodiments the apparatus further comprises a memory 22. In some embodiments the processor is coupled to memory 22. The memory can be any suitable storage means. In some embodiments the memory 22 comprises a program code section 23 for storing program codes implementable upon the processor 21 . Furthermore in some embodiments the memory 22 can further comprise a stored data section 24 for storing data, for example data that has been processed in accordance with the application or data to be processed via the application embodiments as described later. The implemented program code stored within the program code section 23, and the data stored within the stored data section 24 can be retrieved by the processor 21 whenever needed via the memory-processor coupling.
In some further embodiments the apparatus 10 can comprise a user interface 15. The user interface 15 can be coupled in some embodiments to the processor 21 . In some embodiments the processor can control the operation of the user interface and receive inputs from the user interface 15. In some embodiments the user interface 15 can enable a user to input commands to the electronic device or apparatus 10, for example via a keypad, and/or to obtain information from the apparatus 10, for example via a display which is part of the user interface 15. The user interface 15 can in some embodiments comprise a touch screen or touch
interface capable of both enabling information to be entered to the apparatus 10 and further displaying information to the user of the apparatus 10.
In some embodiments the apparatus further comprises a transceiver 13, the transceiver in such embodiments can be coupled to the processor and configured to enable a communication with other apparatus or electronic devices, for example via a wireless communications network. The transceiver 13 or any suitable transceiver or transmitter and/or receiver means can in some embodiments be configured to communicate with other electronic devices or apparatus via a wire or wired coupling.
The transceiver 13 can communicate with further devices by any suitable known communications protocol, for example in some embodiments the transceiver 13 or transceiver means can use a suitable universal mobile telecommunications system (UMTS) protocol, a wireless local area network (WLAN) protocol such as for example IEEE 802.X, a suitable short-range radio frequency communication protocol such as Bluetooth, or infrared data communication pathway (IRDA).
In some embodiments the apparatus comprises a position sensor 16 configured to estimate the position of the apparatus 10. The position sensor 16 can in some embodiments be a satellite positioning sensor such as a GPS (Global Positioning System), GLONASS or Galileo receiver.
In some embodiments the positioning sensor can be a cellular ID system or an assisted GPS system.
In some embodiments the apparatus 10 further comprises a direction or orientation sensor. The orientation/direction sensor can in some embodiments be an electronic compass, accelerometer, a gyroscope or be determined by the motion of the apparatus using the positioning estimate.
It is to be understood again that the structure of the electronic device 10 could be supplemented and varied in many ways.
With respect to Figure 2 an example recording apparatus 91 and playback apparatus 93 are shown according to some embodiments. The recording apparatus 91 and playback apparatus 93 are shown coupled by a single channel (Mono) transfer or transmission channel 121 .
Furthermore Figure 3 shows an example recording by the recording apparatus 91 as shown in Figure 2 according to some embodiments. Figure 4 shows schematically an example playback by the playback apparatus 93 shown in Figure 2 according to some embodiments. Figure 5 shows a flow diagram of the operation of the example recording apparatus 91 and playback apparatus 93 according to some embodiments.
As shown in Figure 3, the recording apparatus 91 in some embodiments comprises a Mono recorder 100. The Mono recorder 100 in some embodiments comprises the microphone(s) 11 and analogue to digital converter 14 configured to generate a suitable digital format signal to be passed to the encoder 101.
Furthermore in some embodiments the recording apparatus comprises a position sensor/orientation sensor 16 configured to monitor the orientation angle of the recording apparatus 91 . In some embodiments the orientation angle sensor 16 is configured to measure the orientation (or angle) of the microphone, where the microphone is detached or separate from the recording apparatus. Thus for example where a user is wearing a stereo headset with a mono microphone the orientation angle sensor 16 is configured to determine the orientation of the headset comprising the microphone. In some embodiments the microphone can be attached to the user's chest or other parts of the body rather than the apparatus and thus as the user rotates himself or herself the orientation of the microphone changes.
For example in Figure 3 where the user is wearing a traditional stereo headset with a mono microphone the user 203 or the apparatus can turn from the left to the right. The user 203 turning through an angle 205 thus records the sound source 201 but also the change in orientation (the angle 205). In other words both the mono audio and the recording orientation data are captured.
In the following examples the rotation of a single microphone or mono recorder is discussed. However it would be understood that a similar rotational or orientation change arrangement can be implemented in any suitable manner. For example in some embodiments an apparatus or microphone orientation change can be implemented by a multiple mono microphone array where the microphones are arranged with an associated orientation. An orientation change can be implemented in such embodiments by the Mono recorder 100 being configured to select individual mono microphone inputs and the orientation angle sensor 16 being configured to monitor which of the microphones has been selected and generate the associated orientation of the selected microphone. Similarly in some embodiments the Mono recorder 91 can be configured to generate a mono audio signal output from a beamformed microphone array, wherein the orientation angle sensor 16 monitors the beamformed orientation angle of the processed audio signals. In some embodiments the recording apparatus further comprises multiple microphones from which a mono audio channel is generated and recorded. For example in some embodiments the apparatus comprises a multiple microphone configuration which is output to a suitable delay network. The delay network can in some embodiments be implemented by digital processing means or mechanical means, with the various delays applied to the microphone outputs before combining them in order to form a directional recording audio channel or beam. It would be understood that in some embodiments any suitable digital processing can be applied to the microphone(s) audio signals to generate a recording beam. In such a way the direction of the audio signal can be changed digitally (or by mechanical/analogue processing). In the following examples, mechanically changing the orientation of the microphones to generate a 'beamformed' or directional audio signal is described. However it would be understood that in some embodiments the beamformed or directional audio signals are recorded or captured using the processing means as described above.
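Purely as an illustration of the delay network idea above (and not the claimed implementation), the following sketch forms a single directional channel from a linear microphone array by delay-and-sum beamforming; the array geometry, sample rate, speed of sound and steering angle used here are assumptions.

```python
import numpy as np

def delay_and_sum(mic_signals, mic_positions_m, steer_angle_rad,
                  fs=48000, c=343.0):
    """Steer a linear microphone array towards steer_angle_rad by
    delaying each microphone signal and summing into one mono beam.

    mic_signals: (num_mics, num_samples) array of time-domain signals.
    mic_positions_m: microphone positions along the array axis (metres).
    """
    num_mics, num_samples = mic_signals.shape
    beam = np.zeros(num_samples)
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
    for m in range(num_mics):
        # Plane-wave delay for this microphone relative to the array origin.
        delay_s = mic_positions_m[m] * np.sin(steer_angle_rad) / c
        # Apply the (fractional) delay in the frequency domain.
        spectrum = np.fft.rfft(mic_signals[m])
        spectrum *= np.exp(-2j * np.pi * freqs * delay_s)
        beam += np.fft.irfft(spectrum, n=num_samples)
    return beam / num_mics

# Example: two microphones 5 cm apart, beam steered 30 degrees off axis.
fs = 48000
t = np.arange(fs) / fs
mics = np.vstack([
    np.sin(2 * np.pi * 440 * t),
    np.sin(2 * np.pi * 440 * (t - 0.05 * np.sin(np.radians(30)) / 343.0)),
])
mono_beam = delay_and_sum(mics, np.array([0.0, 0.05]), np.radians(30), fs=fs)
```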
Similarly the physical change of the apparatus/microphone/recording orientation, by moving the microphone or apparatus or by selecting a separate one of several mono or multichannel microphones with different recording or capture orientations, enables the apparatus to generate a mono (or reduced channel) audio signal.
The operation of rotating or changing the orientation of the microphones/apparatus is shown in Figure 5 by step 400.
In such embodiments the Mono recorder is configured to capture or record a mono audio signal as the recording angle/orientation is changed (microphone is rotated). The Mono recorder can be configured to output this mono audio signal to the encoder 101.
The operation of capturing or recording the mono audio signal as the microphone is rotated is shown in Figure 5 by step 401.
Furthermore the orientation angle sensor 16 can be configured to capture/record the orientation angle of the apparatus (or microphone) as the apparatus (or microphone) is rotated. The orientation angle sensor 16 can then be configured to output this orientation angle to the encoder 101. As described herein, in some embodiments the orientation angle sensor can monitor the orientation of the microphone set separate from the apparatus, or the beamformer orientation, or the configuration orientation of the microphone. The operation of capturing or recording the orientation angle of the microphone as 'rotated' is shown in Figure 5 by step 403.
In some embodiments the recording apparatus comprises an encoder 101. The encoder 101 is configured to encode the mono audio signal with associated metadata comprising the spatial attributes generated from the orientation angle sensor 16. The encoder 101 can then output the encoded signal to be stored or passed to a playback apparatus 93.
The encoding of the audio signal with associated metadata with spatial attributes is shown in Figure 5 by step 405. Although in the following examples a single metadata channel defining the direction or orientation of the audio capture is described as being encoded with the audio signal, it would be understood that in some embodiments further metadata channels can be encoded. For example in some embodiments the position sensor/orientation sensor 16 can be configured to detect vertical motion of the apparatus, and the metadata can therefore be configured to comprise multiple channels representing the motion of the apparatus in more than a single direction or axis.
The operation of transmitting and receiving (or alternatively storing and retrieving) the combined signal comprising the mono audio signal and metadata is shown in Figure 5 by step 407.
The transfer as described herein is shown in Figure 2 by the transfer channel 121. It would be understood that in some embodiments the transfer can be a wired or wireless coupling between the recording apparatus 91 and the playback apparatus 93. For example the encoder in some embodiments passes the combined signal to a suitable transmitter where the combined signal is encoded for transmission and transmitted. The playback apparatus 93 similarly comprises a suitable receiver (not shown) configured to receive the transmitted signal and demodulate it to regenerate the combined signal. The transmitter and/or receiver in some embodiments can be components of a suitable transceiver, for example such as described with respect to Figure 1.
In some embodiments the playback apparatus 93 comprises a decoder 151 . The decoder 151 can be configured to receive the combined signal comprising the (single channel) mono audio signal and the spatial attributes metadata. The decoder 151 can then be configured to decode the data stream to generate a separate mono audio output 153 and orientation angle output 155. The decoder 151 can in some embodiments output the mono audio signal as a mono audio output (mono audio o/p) 153 to the externalizer 157 and output the orientation angle output (orientation angle o/p) 155 (or other suitable spatial attributes) to the externalizer 157.
The operation of decoding the combined signal into mono audio signal and spatial attribute components outputs is shown in Figure 5 by step 409. In some embodiments the playback apparatus 93 comprises an externalizer 157. The externalizer 157 can be configured to receive the mono audio signal in the form of the mono audio output 153 as an audio input and the orientation angle or other spatial attributes in the form of the orientation angle output 155 as a control input and further be configured to process the mono audio signal based on the orientation angle to generate a suitable spatial audio output 159.
For example as shown in Figure 4, where the output playback system comprises a right audio channel speaker 301 and a left audio channel speaker 303 with the user 307 located between these, the externalizer 157 can pan the mono audio signal from right to left (based on the rotation of the recording apparatus) to add a spatial dimension to the playback.
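As a minimal illustrative sketch of such panning (not the actual externalizer 157), the decoded orientation angle can drive a constant-power left/right gain law; the angle range and gain law below are assumptions.

```python
import numpy as np

def pan_mono_to_stereo(mono, orientation_deg, max_angle_deg=90.0):
    """Amplitude-pan a mono signal between left and right channels using a
    per-sample orientation angle (degrees, negative = left, positive = right).

    mono: (num_samples,) audio samples.
    orientation_deg: (num_samples,) recording orientation for each sample.
    """
    # Map the orientation to a pan position in [0, 1] and use a
    # constant-power (sine/cosine) gain law.
    pan = np.clip((orientation_deg + max_angle_deg) / (2 * max_angle_deg), 0.0, 1.0)
    left = np.cos(pan * np.pi / 2) * mono
    right = np.sin(pan * np.pi / 2) * mono
    return np.stack([left, right], axis=0)

# Example: the recording apparatus turns from -45 to +45 degrees over one second.
fs = 48000
t = np.arange(fs) / fs
mono = np.sin(2 * np.pi * 440 * t)
angles = np.linspace(-45.0, 45.0, fs)
stereo = pan_mono_to_stereo(mono, angles)
```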
The operation of externalising the audio signal to generate spatial impressions of the playback is shown in Figure 5 by step 411.
Furthermore the operation of outputting the multichannel audio signals from the externalizer is shown in Figure 5 by step 413.
One of the issues addressed by the embodiments as described herein is how to combine the mono audio signal and the spatial attributes (such as the orientation angle) to generate a combined audio signal which can be transmitted and/or stored in a manner similar to typical audio signals.
In the embodiments described herein some methods and encoder-decoder apparatus are described to achieve this.
For example in some embodiments the spatial attributes, such as the orientation angle information, are encoded as a subband located next to the mono audio channel. Thus in some embodiments the orientation information can be coded on a limited frequency band below or above the spectrum of the mono audio channel.
Furthermore in some embodiments the spatial attributes, such as the orientation information, are encoded under the audio signal masking threshold. For example in some embodiments the spatial attributes (orientation angle) are encoded on a band-limited frequency channel, and the encoded signal is then mixed into the audio signal so that the encoded signal is always inaudible, in other words masked by the audio signal. In some embodiments the band-limited spatial attributes (such as the orientation angle) are spread over the whole available audio band using a pseudo-random (PRN) code, and the encoded signal is then mixed with the audio signal so that the encoding is always masked by the audio signal.
With respect to Figure 6 an example encoder 101 is shown. Furthermore with respect to Figure 11 the operation of the example encoder 101 shown in Figure 6 is further shown.
The encoder 101 in some embodiments is configured to receive the orientation signal input 501 .
The operation of receiving the orientation signal input is shown in Figure 11 by step 1001.
The encoder 101 in some embodiments comprises a quantizer 502 configured to receive the orientation input from the orientation angle sensor 16. The quantizer 502 is then configured to quantize the orientation signal and pass it to the modulator 504. The quantizer 502 can be any suitable quantizer type such as a static quantizer, a dynamic quantizer, a linear quantizer, or a non-linear quantizer.
The operation of quantizing the orientation signal is shown in Figure 11 by step 1003.
In some embodiments the encoder 101 comprises a modulator 504. The modulator 504 can in some embodiments be configured to receive the quantized output of the orientation value from the quantizer 502 and modulate the quantized value such that the data output from the modulator occupies a defined frequency range (or a series of spikes). The modulator 504 can then output the modulated quantized orientation signal to a combiner 508.
The operation of modulating the quantized orientation signal is shown in Figure 11 by step 1005.
The encoder 101 can be further configured to receive the mono audio input 503 which is passed to a band-pass filter 506.
The operation of receiving the mono audio signal input is shown in Figure 11 by step 1002.
In some embodiments the encoder 101 comprises a band-pass filter 506. The band-pass filter 506 can be configured to receive the mono audio signal and band-pass filter the mono audio signal to produce a filtered audio signal with a defined frequency range (which is separate from that of the orientation signal from the modulator 504). The band-pass filter 506 can then output this filtered signal to a combiner 508.
The operation of band-pass filtering the audio signal is shown in Figure 11 by step 1004.
The encoder 101 in some embodiments can comprise a combiner 508 configured to receive the modulated quantized orientation signal and the band-pass filtered audio signal and combine these to generate a frequency spectrum suitable for outputting (or recording).
The operation of combining the band-pass filtered audio signal and the quantized modulated orientation signals is shown in Figure 11 by step 1007.
As described herein in some embodiments the combined signal can be further processed for transmission or storage purposes. However it would be understood that in some embodiments the further processing can be configured to perform processing on the combined audio signal as if the combined signal was a conventional audio signal. The operation of outputting the combined signal is shown in Figure 11 by step 1009.
As shown in Figure 6 the encoder 101 generates a frequency spectrum 551 such that the output of the band-pass filter generates an audio signal with a frequency range 553 of between 20 Hz and 17 kHz and the modulated quantized orientation signal is encoded into a frequency range 555 above 17 kHz. It would be understood that the frequency ranges described herein are examples only and that in some embodiments the frequency ranges can be any suitable range.
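A minimal sketch of this style of encoder is given below, assuming a 48 kHz sample rate, an audio band of 20 Hz to 17 kHz, and the quantized orientation amplitude-modulated onto a 19 kHz carrier above the audio band; the filter design, carrier frequency and modulation depth are assumptions rather than the claimed implementation.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 48000                    # sample rate (assumption)
AUDIO_BAND = (20.0, 17000.0)  # pass-band for the mono audio (as in the example figure)
CARRIER_HZ = 19000.0          # orientation carrier above the audible band (assumption)

def quantize_angle(angle_deg, step_deg=1.0):
    """Simple linear quantizer for the orientation angle."""
    return np.round(angle_deg / step_deg) * step_deg

def encode(mono, angle_deg):
    """Band-limit the mono audio and add the quantized orientation as an
    amplitude-modulated carrier above the audio band."""
    sos = butter(4, AUDIO_BAND, btype='bandpass', fs=FS, output='sos')
    audio = sosfiltfilt(sos, mono)
    t = np.arange(len(mono)) / FS
    # Normalize the quantized angle to [0, 1] and use it as the AM envelope.
    q = quantize_angle(angle_deg)
    envelope = (np.clip(q, -90.0, 90.0) + 90.0) / 180.0
    carrier = 0.05 * envelope * np.sin(2 * np.pi * CARRIER_HZ * t)
    return audio + carrier
```

The combined output of `encode` can then be stored or transmitted as if it were a conventional single-channel audio signal, in line with the description above.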
With respect to Figure 8 example sub-bands available for modulation according to the encoder shown in Figure 6 are shown.
Figure 8 shows a frequency spectrum wherein the audio signal is band-pass filtered to an audible frequency band 751 which is approximately from 20 Hz to 17 kHz. As shown in Figure 6 in some embodiments the quantized orientation signal can be modulated to a frequency higher than the audible frequency band frequencies, in other words the orientation information is modulated to use the frequency band beyond typical human hearing and therefore above 17 kHz. This higher frequency modulation is shown by the frequency range shown by the block 705.
In some embodiments the modulator 504 can be configured to modulate the orientation signal as a DC level, in other words the orientation value is represented by a DC current or voltage value. The DC modulation is represented in Figure 8 by the DC frequency spike 701.
In some embodiments the modulator 504 can be configured to modulate the orientation signal to a frequency lower than the audible frequencies (such as below 20 Hz). The lower frequency modulation is shown in Figure 8 by the frequency band 703.
With respect to Figure 7 an example decoder 151 is shown suitable for decoding the output of the encoder 101 as shown in Figure 6. Furthermore with respect to Figure 12 the operation of the decoder 151 shown in Figure 7 according to some embodiments is further described.
In some embodiments the decoder 151 is configured to receive the combined signal (comprising the orientation signal and audio signal). It would be understood that in some embodiments the combined signal is received from a transceiver or receiver which decodes or demodulates a received signal into the combined signal. Furthermore it would be understood that in some embodiments the combined signal is retrieved from memory or other storage media.
The operation of receiving the combined signal is shown in Figure 12 by step 1101.
In some embodiments the decoder 151 comprises a first band-pass filter 601 . The first band-pass filter 601 comprises a band-pass filter with frequency characteristics similar to the band-pass filter 506 in the encoder 101 and as such configured to output the audio frequency components.
The operation of band-pass filtering the combined signal to generate the audio signal is shown in Figure 12 by step 1103.
The band-pass filter can then output the audio signal 611 as the mono audio output 153.
The operation of outputting the audio signal is shown in Figure 12 by step 1105.
In some embodiments the decoder 151 comprises a second band-pass filter 603. The second band-pass filter 603 is configured to filter the combined signal to separate out the modulated orientation values. Thus in some embodiments the second band-pass filter 603 has frequency characteristics which aim to remove the audible frequency components, or to pass only the frequency band or range generated by the modulator 504 of the encoder 101. Thus for example in some embodiments where a DC level encoding is used by the encoder, the second band-pass filter 603 is a DC pass circuit or filter (or AC block circuit or filter); where the modulator produces a lower frequency modulation, the second band-pass filter 603 is a lower frequency band-pass filter; where the modulator 504 is a higher frequency modulator, the second band-pass filter 603 is a higher frequency band-pass filter; and where the modulator 504 is at least one single frequency component modulator, the second band-pass filter 603 comprises a series of single frequency pass filters.
The second band-pass filter 603 can then output the filtered frequency components to the demodulator 605.
Furthermore the operation of performing a second band-pass filtering on the combined audio signal to generate the modulated orientation signal is shown in Figure 12 by step 1102. In some embodiments the decoder 151 comprises a demodulator 605 configured to perform an inverse of the modulation scheme performed by the modulator 504 in the encoder 101. It would be understood that the modulator 504 and therefore the demodulator 605 can be configured to perform any suitable modulation/demodulation scheme or method to modulate the orientation signal, such as for example frequency modulation, amplitude modulation, phase modulation, static or dynamic modulation or variants thereof.
The operation of demodulating the band-pass filtered modulated orientation signal is shown in Figure 12 by step 1104.
The demodulator 605 can then output the orientation signal 613 as the orientation angle output 155.
The operation of outputting the orientation angle output is shown in Figure 12 by step 1106.
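A matching decoder sketch, under the same assumed parameters as the encoder sketch above (48 kHz sample rate, 20 Hz to 17 kHz audio band, 19 kHz carrier), separates the combined signal with two band-pass filters and recovers the orientation from the carrier envelope; the filter design and envelope demodulation are assumptions, not the claimed implementation.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

FS = 48000
AUDIO_BAND = (20.0, 17000.0)
CARRIER_HZ = 19000.0

def decode(combined):
    """Split the combined signal into the mono audio output and an
    estimated orientation angle per sample."""
    # First band-pass filter: recover the audible mono audio.
    sos_audio = butter(4, AUDIO_BAND, btype='bandpass', fs=FS, output='sos')
    audio = sosfiltfilt(sos_audio, combined)

    # Second band-pass filter: isolate the orientation carrier.
    sos_data = butter(4, (CARRIER_HZ - 500.0, CARRIER_HZ + 500.0),
                      btype='bandpass', fs=FS, output='sos')
    data_band = sosfiltfilt(sos_data, combined)

    # Demodulate: the AM envelope encodes the normalized orientation
    # (inverse of the 0.05 modulation depth used in the encoder sketch).
    envelope = np.abs(hilbert(data_band)) / 0.05
    angle_deg = np.clip(envelope, 0.0, 1.0) * 180.0 - 90.0
    return audio, angle_deg
```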
With respect to Figure 9 a further example encoder 101 is shown according to some embodiments. Furthermore with respect to Figure 13 the operation of the example encoder 101 shown in Figure 9 is further shown. The encoder 101 in some embodiments is configured to receive the orientation signal input 801 .
The operation of receiving the orientation signal input is shown in Figure 13 by step 1201.
The encoder 101 in some embodiments comprises a quantizer 802 configured to receive the orientation input from the orientation angle sensor 16. The quantizer 802 is then configured to quantize the orientation signal and pass it to the modulator 804. The quantizer 802 can be any suitable quantizer type such as a static quantizer, a dynamic quantizer, a linear quantizer, or a non-linear quantizer.
The operation of quantizing the orientation signal is shown in Figure 13 by step 1203. In some embodiments the encoder 101 comprises a modulator 804. The modulator 804 can in some embodiments be configured to receive the quantized output of the orientation value from the quantizer 802 and modulate the quantized value such that the data output from the modulator occupies a defined frequency range (or a series of spikes). The modulator 804 can then output the modulated quantized orientation signal to a pseudo-random number encoder (PRN coder) 806.
In some embodiments the encoder 101 comprises a pseudo-random number encoder (PRN coder) 806. The PRN coder 806 is configured to receive the modulated orientation signals and furthermore a key 805 and encode the modulated orientation signals using the key. The PRN coder 806 can then output the PRN coded modulated orientation signals to a psychoacoustic processor 808.
The operation of PRN coding the modulated quantized orientation signals according to a key is shown in Figure 13 by step 1207. In some embodiments the encoder 101 comprises a psychoacoustic processor 808. The psychoacoustic processor 808 can be configured to receive the PRN coded modulated quantized orientation signals and furthermore receive the audio input 803. The psychoacoustic processor 808 can in some embodiments be configured to check whether the coding noise (the coded modulated quantized orientation signals) is inaudible with respect to the audio signals. Where the coding noise is within the audio signal range then the psychoacoustic processor 808 can be configured to output the PRN coded modulated quantized orientation signals to a combiner 810. Where the coding noise is outside the audio signal range then the psychoacoustic processor 808 can be configured to control the PRN coder to recode the modulated quantized orientation signals.
The operation of psychoacoustically checking/testing the PRN coded modulated quantized orientation signals is shown in Figure 13 by step 1209. Furthermore the failure of the test is shown in Figure 13 by the loop back to the encoding operation step 1207.
In some embodiments the encoder 101 comprises a combiner 810. The combiner 810 is configured to receive the audio signal 803 and the PRN coded modulated quantized orientation values and combine them in a suitable manner to be output as a combined signal.
The operation of combining the audio signal and the PRN coded modulated quantized orientation signals is shown in Figure 13 by step 1211. As described herein in some embodiments the combined signal can be further processed for transmission or storage purposes. However it would be understood
that in some embodiments the further processing can be configured to perform processing on the combined audio signal as if the combined signal was a conventional audio signal. The operation of outputting the combined signal is shown in Figure 13 by step 1213.
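The following sketch illustrates the spread-spectrum idea of Figures 9 and 13 under stated assumptions: the orientation data is reduced to a small number of bits, spread over the whole band with a keyed pseudo-random chip sequence, and mixed in at a level kept well below the audio energy. The spreading factor, embedding level and the simple level check (standing in for a full psychoacoustic model such as processor 808) are assumptions.

```python
import numpy as np

FS = 48000
CHIPS_PER_BIT = 4800   # spreading factor: 10 data bits per second (assumption)

def prn_sequence(num_chips, key):
    """Pseudo-random +/-1 chip sequence derived from a shared key."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=num_chips)

def encode_prn(mono, data_bits, key, start_level=0.01):
    """Spread the data bits over the whole audio band with a keyed PRN
    sequence and mix them under the audio at a level chosen so the added
    component stays well below the audio energy (a crude stand-in for the
    psychoacoustic check described above)."""
    chips = np.repeat(np.where(np.asarray(data_bits) > 0, 1.0, -1.0), CHIPS_PER_BIT)
    chips = chips * prn_sequence(len(chips), key)
    chips = np.pad(chips, (0, max(0, len(mono) - len(chips))))[:len(mono)]

    level = start_level
    audio_rms = np.sqrt(np.mean(mono ** 2)) + 1e-12
    # Reduce the embedding level until the watermark is at least 30 dB
    # below the audio (the "recode if audible" loop of Figure 13).
    while level > audio_rms * 10 ** (-30.0 / 20.0):
        level *= 0.5
    return mono + level * chips

# Example: embed an 8-bit orientation code into one second of audio.
t = np.arange(FS) / FS
mono = 0.5 * np.sin(2 * np.pi * 440 * t)
combined = encode_prn(mono, [1, 0, 1, 1, 0, 0, 1, 0], key=1234)
```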
With respect to Figure 10 an example decoder 151 is shown suitable for decoding the output of the encoder 101 as shown in Figure 9. Furthermore with respect to Figure 14 the operation of the decoder 151 shown in Figure 10 according to some embodiments is further described.
In some embodiments the decoder 151 is configured to receive the combined signal (comprising the orientation signal and audio signal). It would be understood that in some embodiments the combined signal is received from a transceiver or receiver which decodes or demodulates a received signal into the combined signal. Furthermore it would be understood that in some embodiments the combined signal is retrieved from memory or other storage media. The operation of receiving the combined signal is shown in Figure 14 by step 1301.
In some embodiments the decoder 151 outputs the combined signal as the audio signal 811 (in other words as the mono audio output 153).
The operation of outputting the audio signal is shown in Figure 14 by step 1302.
In some embodiments the decoder 151 comprises a pseudo-random number decoder (PRN decoder) 901. The PRN decoder 901 is configured to receive the combined signal and a key 905. The key 905 is the same key used by the PRN coder 806 within the encoder 101 and configured to permit the decoding of the
combined signal to output a modulated orientation signal. The output of the PRN decoder 901 can then be output to the demodulator 903.
The operation of decoding the combined signal to generate a modulated orientation signal is shown in Figure 14 by step 1303.
In some embodiments the decoder 151 comprises a demodulator 903 configured to perform an inverse of the modulation scheme performed by the modulator 804 in the encoder 101 . It would be understood that the modulator 804 and therefore the demodulator 903 can be configured to perform any suitable modulation/demodulation scheme or method to modulate the orientation signal such as for example frequency modulation, amplitude modulation, phase modulation, static or dynamic modulation or variants thereof. The operation of demodulating the modulated orientation signal is shown in Figure 14 by step 1305.
The demodulator 903 can then output the orientation signal 913 as the orientation angle output 155.
The operation of outputting the orientation signal is shown in Figure 14 by step 1307.
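A matching despreading sketch, using the same assumed key and spreading factor as the encoder sketch above, correlates each bit period of the combined signal with the shared chip sequence; a practical decoder would typically whiten or filter out the audio first, which is omitted here for brevity.

```python
import numpy as np

FS = 48000
CHIPS_PER_BIT = 4800

def prn_sequence(num_chips, key):
    """Same keyed +/-1 chip sequence as used by the encoder sketch."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=num_chips)

def decode_prn(combined, num_bits, key):
    """Despread the keyed PRN watermark by correlating each bit period of
    the combined signal with the shared chip sequence."""
    chips = prn_sequence(num_bits * CHIPS_PER_BIT, key)
    bits = []
    for b in range(num_bits):
        start = b * CHIPS_PER_BIT
        segment = combined[start:start + CHIPS_PER_BIT]
        correlation = np.dot(segment, chips[start:start + CHIPS_PER_BIT])
        bits.append(1 if correlation > 0 else 0)
    return bits

# Example (using `combined` from the encoder sketch above):
# recovered = decode_prn(combined, num_bits=8, key=1234)
```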
In the examples described herein the metadata channel has been a single data channel which defines an orientation value. However it would be understood that in some embodiments the metadata can comprise more than one data channel and/or data or information other than orientation. For example as described herein, in some embodiments the metadata comprises channels defining orientation or azimuth, and elevation and thus is able to define a spherical co-ordinate. In some embodiments the metadata comprises range information. In such embodiments an
audio or sound signal can be encoded with information defining the location of the audio or sound signal such that it can be represented in the playback apparatus.
In some embodiments the metadata can be used or employed in any suitable manner. In the examples described herein the metadata, the orientation information, has been employed to control the playback or reproduction of a mono audio signal. However it would be understood that in some embodiments the metadata can be employed to produce any suitable output based on the metadata. For example in some embodiments the metadata can be used to trigger additional audio effects. In such embodiments the decoder can inject a suitable audio signal into the output audio stream when a determined condition is met. For example a sudden orientation change can cause the decoder to inject a swoosh sound effect, and a sudden elevation change can cause the decoder to inject a bouncing or 'boing' sound effect.
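As a minimal sketch of such metadata-triggered effects (not the claimed implementation), a decoder could monitor the rate of change of the decoded orientation and mix a pre-loaded effect clip into the output when a threshold is exceeded; the threshold, trigger handling and mixing below are assumptions.

```python
import numpy as np

def inject_effect_on_turn(audio, orientation_deg, effect, fs=48000,
                          turn_threshold_deg_per_s=180.0):
    """Mix a pre-loaded effect clip into the output whenever the decoded
    orientation changes faster than the threshold (e.g. a 'swoosh' on a
    sudden turn)."""
    out = audio.copy()
    rate = np.abs(np.diff(orientation_deg)) * fs     # degrees per second
    trigger_samples = np.flatnonzero(rate > turn_threshold_deg_per_s)
    last_trigger = -len(effect)
    for n in trigger_samples:
        if n - last_trigger < len(effect):
            continue                                  # avoid overlapping triggers
        end = min(n + len(effect), len(out))
        out[n:end] += effect[:end - n]
        last_trigger = n
    return out
```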
In some embodiments the metadata can be used to generate or modify display or video outputs. For example a rapid change in orientation or elevation can cause the decoder to 'shake' the displayed image. Furthermore in some embodiments the metadata can be employed as a control for an apparatus function other than displaying audio and/or video images. For example the control can be a gaming function control, and thus in some embodiments the metadata can be used as a remote controller controlling the motion of a gaming character. However any suitable functional control can employ the metadata information. For example the metadata can be used to control the user interface on a remote or decoder apparatus.
The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the exemplary embodiment of this invention. However, various modifications and adaptations may become apparent
to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings.
Although the above has been described with regard to audio signals or audio-visual signals, it would be appreciated that embodiments may also be applied to audio-video signals where the audio signal components of the recorded data are processed in terms of the determining of the base signal and the determination of the time alignment factors for the remaining signals, and the video signal components may be synchronised using the above embodiments of the invention. In other words the video parts may be synchronised using the audio synchronisation information.
It shall be appreciated that the term user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers.
Furthermore elements of a public land mobile network (PLMN) may also comprise apparatus as described above. In general, the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
The embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware. Further in this regard it should be noted that any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions. The software may be stored on such physical media as memory chips, or memory blocks implemented within the processor, magnetic media such as hard disk or floppy disks, and optical media such as for example DVD and the data variants thereof, CD.
The memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASIC), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.
Embodiments of the inventions may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
Programs, such as those provided by Synopsys, Inc. of Mountain View, California and Cadence Design, of San Jose, California automatically route conductors and locate components on a semiconductor chip using well established rules of design
as well as libraries of pre-stored design modules. Once the design for a semiconductor circuit has been completed, the resultant design, in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or "fab" for fabrication.
The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the exemplary embodiment of this invention. However, various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims. However, all such and similar modifications of the teachings of this invention will still fall within the scope of this invention as defined in the appended claims.
Claims
1. Apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to with the at least one processor cause the apparatus to at least:
generate a single channel audio signal;
generate at least one channel of data associated with the single channel audio signal; and
encode the at least one channel of data within the single channel audio signal, such that the at least one channel of data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
2. The apparatus as claimed in claim 1 , wherein generating the single channel audio signal causes the apparatus to perform at least one of:
receive a single channel audio signal from a single microphone;
generate a single channel audio signal from at least two channel audio signals;
generate a single channel audio signal from at least two channel audio signals received from at least one microphone;
receive the single channel audio signal from a memory; and
receive the single channel audio signal from a further apparatus separate from the apparatus.
3. The apparatus as claimed in any of claims 1 and 2, wherein generating the at least one channel of data associated with the single channel audio signal causes the apparatus to perform at least one of:
generate at least one spatial parameter associated with the single channel audio signal;
generate an azimuth parameter associated with the single channel audio signal;
generate an elevation parameter associated with the single channel audio signal;
generate a range parameter associated with the single channel audio signal;
generate at least one velocity parameter associated with the single channel audio signal;
generate at least one recording configuration parameter associated with the single channel audio signal;
generate an azimuth parameter associated with at least one microphone recording the single channel audio signal;
generate an elevation parameter associated with at least one microphone recording the single channel audio signal; and
generate at least one processing parameter associated with the generation of the single channel audio signal.
4. The apparatus as claimed in any of claims 1 to 3, wherein encoding the at least one channel of data within the single channel audio signal causes the apparatus to perform at least one of:
quantize the at least one channel of data;
modulate the quantized at least one channel of data; and
combine the modulated quantized at least one channel of data with the single channel audio signal.
5. The apparatus as claimed in claim 4, further caused to band-pass filter the single channel audio signal, and wherein modulating the quantized at least one channel of data causes the apparatus to frequency shift the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
6. The apparatus as claimed in claim 5, wherein band-pass filtering the single channel audio signal causes the apparatus to perform at least one of:
low pass filter the single channel audio signal, wherein frequency shifting the quantized at least one channel of data causes the apparatus to frequency shift the quantized at least one channel of data to a frequency range above the range of the low pass filter; and
high pass filter the single channel audio signal, wherein frequency shifting the quantized at least one channel of data causes the apparatus to frequency shift the quantized at least one channel of data to a frequency range below the range of the high pass filter.
7. The apparatus as claimed in claim 4, further caused to band-stop filter the single channel audio signal, and wherein modulating the quantized at least one channel of data causes the apparatus to frequency shift the quantized at least one channel of data to at least one frequency of the band-stop filter.
8. The apparatus as claimed in claim 4, wherein modulating the quantized at least one channel of data causes the apparatus to:
apply a defined pseudo-random frequency shifting to the at least one channel of data; and
determine the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
9. Apparatus comprising at least one processor and at least one memory including computer code for one or more programs, the at least one memory and the computer code configured to with the at least one processor cause the apparatus to at least:
receive a single channel audio signal, the single channel audio signal comprising at least one channel of data;
separate the at least one channel of data from the single channel audio signal; and
control at least one operation of the apparatus based on the at least one channel of data.
10. The apparatus as claimed in claim 9, wherein separating the at least one channel of data from the single channel audio signal causes the apparatus to:
apply a first band pass filter to extract the at least one channel of data; and
apply a second band pass filter to extract an audio signal from the single channel audio signal.
11. The apparatus as claimed in claim 10, wherein the first band pass filter is a high pass filter, and the second band pass filter is a low pass filter.
12. The apparatus as claimed in claim 10, wherein the first band pass filter is a low pass filter, and the second band pass filter is a high pass filter.
13. The apparatus as claimed in claim 10, wherein the first band pass filter is at least one narrow band filter, and the second band pass filter is a notch filter comprising stop bands at the frequencies of the at least one narrow band filter.
14. The apparatus as claimed in claim 9, wherein separating the at least one channel of data from the single channel audio signal causes the apparatus to:
apply a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data; and
demodulate the decoded at least one channel of data to regenerate the at least one channel of data.
15. The apparatus as claimed in any of claims 9 to 14, wherein controlling at least one operation of the apparatus based on the at least one channel of data causes the apparatus to perform at least one of:
process the single channel audio signal based on the at least one channel of data;
generate at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data;
generate at least user interface input based on the at least one channel of data;
generate at least gaming control input based on the at least one channel of data; and
generate at least one visual effect to be displayed based on the at least one channel of data.
16. The apparatus as claimed in claim 15, wherein processing the single channel audio signal based on the at least one channel of data causes the apparatus to perform at least one of:
generate at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data;
generate a spatial audio signal based on the single channel audio signal and the at least one channel of data.
17. A method comprising:
generating a single channel audio signal;
generating at least one channel of data associated with the single channel audio signal; and
encoding the at least one channel of data within the single channel audio signal, such that the at least one channel of data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
18. The method as claimed in claim 17, wherein generating the single channel audio signal comprises at least one of:
receiving a single channel audio signal from a single microphone;
generating a single channel audio signal from at least two channel audio signals;
generating a single channel audio signal from at least two channel audio signals received from at least one microphone;
receiving the single channel audio signal from a memory; and
receiving the single channel audio signal from a further apparatus separate from the apparatus.
19. The method as claimed in claim 17 or claim 18, wherein generating the at least one channel of data associated with the single channel audio signal comprises at least one of:
generating at least one spatial parameter associated with the single channel audio signal;
generating an azimuth parameter associated with the single channel audio signal;
generating an elevation parameter associated with the single channel audio signal;
generating a range parameter associated with the single channel audio signal;
generating at least one velocity parameter associated with the single channel audio signal;
generating at least one recording configuration parameter associated with the single channel audio signal;
generating an azimuth parameter associated with at least one microphone recording the single channel audio signal;
generating an elevation parameter associated with at least one microphone recording the single channel audio signal;
and generating at least one processing parameter associated with the generation of the single channel audio signal.
20. The method as claimed in any of claims 17 to 19, wherein encoding the at least one channel of data within the single channel audio signal comprises at least one of:
quantizing the at least one channel of data;
modulating the quantized at least one channel of data;
and combining the modulated quantized at least one channel of data with the single channel audio signal.
21. The method as claimed in claim 20, further comprising band-pass filtering the single channel audio signal, and wherein modulating the quantized at least one channel of data comprises frequency shifting the quantized at least one channel of data to a frequency outside of at least one frequency of the band-pass filter.
22. The method as claimed in claim 21 , wherein band-pass filtering the single channel audio signal comprises at least one of:
low pass filtering the single channel audio signal, wherein frequency shifting the quantized at least one channel of data comprises frequency shifting the quantized at least one channel of data to a frequency range above the range of the low pass filter; and
high pass filtering the single channel audio signal, wherein frequency shifting the quantized at least one channel of data comprises frequency shifting the quantized at least one channel of data to a frequency range below the range of the high pass filter.
23. The method as claimed in claim 22, further comprising band-stop filtering the single channel audio signal, and wherein modulating the quantized at least one channel of data comprises frequency shifting the quantized at least one channel of data to at least one frequency of the band-stop filter.
24. The method as claimed in claim 20, wherein modulating the quantized at least one channel of data comprises:
applying a defined pseudo-random frequency shifting to the at least one channel of data; and
determining the frequency shifted at least one channel of data is inaudible compared to the single channel audio signal.
25. A method comprising:
receiving a single channel audio signal, the single channel audio signal comprising at least one channel of data;
separating the at least one channel of data from the single channel audio signal; and
controlling at least one operation of the apparatus based on the at least one channel of data.
26. The method as claimed in claim 25, wherein separating the at least one channel of data from the single channel audio signal comprises:
applying a first band pass filter to extract the at least one channel of data; and
applying a second band pass filter to extract an audio signal from the single channel audio signal.
27. The method as claimed in claim 26, wherein the first band pass filter comprises a high pass filter, and the second band pass filter comprises a low pass filter.
28. The method as claimed in claim 26, wherein the first band pass filter comprises a low pass filter, and the second band pass filter comprises a high pass filter.
29. The method as claimed in any of claims 26 to 28, wherein the first band pass filter comprises at least one narrow band filter, and the second band pass
filter comprises a notch filter comprising stop bands at the frequencies of the at least one narrow band filter.
30. The method as claimed in any of claims 25 to 29, wherein separating the at least one channel of data from the single channel audio signal comprises:
applying a pseudo-random decoding of the single channel audio signal comprising the at least one channel of data to extract the at least one channel of data;
and demodulating the decoded at least one channel of data to regenerate the at least one channel of data.
31 . The method as claimed in any of claims 25 to 30, wherein controlling at least one operation of the apparatus based on the at least one channel of data comprises at least one of:
processing the single channel audio signal based on the at least one channel of data;
generating at least one further audio signal to be combined with the single channel audio signal based on the at least one channel of data;
generating at least user interface input based on the at least one channel of data;
generating at least gaming control input based on the at least one channel of data; and
generating at least one visual effect to be displayed based on the at least one channel of data.
32. The method as claimed in claim 31, wherein processing the single channel audio signal based on the at least one channel of data comprises at least one of:
generating at least two channels of audio signals from the single channel audio signal being amplitude panned based on the at least one channel of data;
generating a spatial audio signal based on the single channel audio signal and the at least one channel of data.
33. An apparatus comprising:
means for generating a single channel audio signal;
means for generating at least one channel of data associated with the single channel audio signal; and
means for encoding the at least one channel of data within the single channel audio signal, such that the at least one channel of data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
34. An apparatus comprising:
means for receiving a single channel audio signal, the single channel audio signal comprising at least one channel of data;
means for separating the at least one channel of data from the single channel audio signal; and
means for controlling at least one operation of the apparatus based on the at least one channel of data.
35. An apparatus comprising:
an audio signal generator configured to generate a single channel audio signal;
a data generator configured to generate at least one channel of data associated with the single channel audio signal; and
an encoder configured to encode the at least one channel of data within the single channel audio signal, such that the at least one channel of data can be decoded and employed to control at least one operation of an apparatus performing the decoding.
36. An apparatus comprising:
a receiver configured to receive a single channel audio signal, the single channel audio signal comprising at least one channel of data;
a separator configured to separate the at least one channel of data from the single channel audio signal; and
a controller configured to control at least one operation of the apparatus based on the at least one channel of data.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1315524.7 | 2013-08-30 | ||
GBGB1315524.7A GB201315524D0 (en) | 2013-08-30 | 2013-08-30 | Directional audio apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015028715A1 true WO2015028715A1 (en) | 2015-03-05 |
Family
ID=49397108
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FI2014/050653 WO2015028715A1 (en) | 2013-08-30 | 2014-08-28 | Directional audio apparatus |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB201315524D0 (en) |
WO (1) | WO2015028715A1 (en) |
-
2013
- 2013-08-30 GB GBGB1315524.7A patent/GB201315524D0/en not_active Ceased
-
2014
- 2014-08-28 WO PCT/FI2014/050653 patent/WO2015028715A1/en active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060287748A1 (en) * | 2000-01-28 | 2006-12-21 | Leonard Layton | Sonic landscape system |
US20030035553A1 (en) * | 2001-08-10 | 2003-02-20 | Frank Baumgarte | Backwards-compatible perceptual coding of spatial cues |
JP2005184621A (en) * | 2003-12-22 | 2005-07-07 | Yamaha Corp | Speech device |
US20080008342A1 (en) * | 2006-07-07 | 2008-01-10 | Harris Corporation | Method and apparatus for creating a multi-dimensional communication space for use in a binaural audio system |
US20090265023A1 (en) * | 2008-04-16 | 2009-10-22 | Oh Hyen O | Method and an apparatus for processing an audio signal |
US20130096931A1 (en) * | 2011-10-12 | 2013-04-18 | Jens Kristian Poulsen | Systems and methods for reducing audio disturbance associated with control messages in a bitstream |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3301673A1 (en) * | 2016-09-30 | 2018-04-04 | Nxp B.V. | Audio communication method and apparatus |
US10964332B2 (en) | 2016-09-30 | 2021-03-30 | Nxp B.V. | Audio communication method and apparatus for watermarking an audio signal with spatial information |
CN107968981A (en) * | 2016-10-05 | 2018-04-27 | 奥迪康有限公司 | Ears Beam-former filter unit, hearing system and hearing devices |
CN107968981B (en) * | 2016-10-05 | 2021-10-29 | 奥迪康有限公司 | Hearing device |
Also Published As
Publication number | Publication date |
---|---|
GB201315524D0 (en) | 2013-10-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10932075B2 (en) | Spatial audio processing apparatus | |
US9820037B2 (en) | Audio capture apparatus | |
US10924850B2 (en) | Apparatus and method for audio processing based on directional ranges | |
US10674262B2 (en) | Merging audio signals with spatial metadata | |
KR102214205B1 (en) | 2-stage audio focus for spatial audio processing | |
US10097943B2 (en) | Apparatus and method for reproducing recorded audio with correct spatial directionality | |
US10142759B2 (en) | Method and apparatus for processing audio with determined trajectory | |
US20130226324A1 (en) | Audio scene apparatuses and methods | |
WO2014188231A1 (en) | A shared audio scene apparatus | |
US9392363B2 (en) | Audio scene mapping apparatus | |
WO2015028715A1 (en) | Directional audio apparatus | |
WO2012171584A1 (en) | An audio scene mapping apparatus | |
US20130226322A1 (en) | Audio scene apparatus | |
WO2014016645A1 (en) | A shared audio scene apparatus | |
WO2015086894A1 (en) | An audio scene capturing apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14841320 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 14841320 Country of ref document: EP Kind code of ref document: A1 |