EP3729832A1 - Processing of a monophonic signal in a 3D audio decoder delivering binaural content - Google Patents

Processing of a monophonic signal in a 3D audio decoder delivering binaural content

Info

Publication number
EP3729832A1
Authority
EP
European Patent Office
Prior art keywords
signal
processing
channel
position information
channels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP18833274.6A
Other languages
English (en)
French (fr)
Inventor
Grégory PALLONE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
Orange SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Orange SA filed Critical Orange SA
Priority to EP22197901.6A (patent EP4135350A1)
Publication of EP3729832A1
Legal status: Pending

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30Control circuits for electronic adaptation of the sound field
    • H04S7/302Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303Tracking of listener position or orientation
    • H04S7/304For headphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2400/00Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/01Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2400/00Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/11Positioning of individual sound objects, e.g. moving airplane, within a sound field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2420/00Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]

Definitions

  • The present invention relates to the processing of an audio signal in a 3D audio decoding system compliant with the MPEG-H 3D Audio standard.
  • The invention relates more particularly to the processing of a monophonic signal intended to be reproduced on a headset that also receives binaural audio signals.
  • The term "binaural" refers to the reproduction, on headphones or a pair of earphones, of a sound signal that nevertheless carries spatialization effects.
  • Binaural processing of audio signals, subsequently called binauralization or binauralization processing, uses HRTF ("Head Related Transfer Function") filters in the frequency domain, or HRIR ("Head Related Impulse Response") and BRIR ("Binaural Room Impulse Response") filters in the time domain, which reproduce the acoustic transfer functions between the sound sources and the listener's ears.
  • The right-ear signal is obtained by filtering a monophonic signal by the transfer function (HRTF) of the right ear, and the left-ear signal is obtained by filtering this same monophonic signal by the transfer function of the left ear.
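As an illustration only (the standard specifies its own filtering structure), the time-domain filtering described above can be sketched as one convolution per ear; the function and argument names are hypothetical:

```python
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Filter one monophonic signal with left- and right-ear impulse
    responses (HRIR/BRIR) to produce a two-channel binaural signal."""
    left = np.convolve(mono, hrir_left)    # left-ear acoustic path
    right = np.convolve(mono, hrir_right)  # right-ear acoustic path
    return left, right
```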
  • In Next Generation Audio (NGA) codecs, such as MPEG-H 3D Audio, described in the document referenced ISO/IEC 23008-3: "High efficiency coding and media delivery in heterogeneous environments - Part 3: 3D audio", published on 25/07/2014, or AC-4, described in the document referenced ETSI TS 103 190: "Digital Audio Compression Standard", published in April 2014, the signals received at the decoder are first decoded and then undergo binauralization processing as described above before being rendered on a headset.
  • These codecs therefore provide the possibility of rendering on several virtual loudspeakers, through listening to a binaural signal on headphones, but also the possibility of spatialized sound reproduction on several real loudspeakers.
  • These codecs may also implement head-tracking processing. This processing makes it possible to take the movement of the listener's head into account and modify the sound reproduced at each ear, in order to keep the rendering of the sound stage stable.
  • In other words, the listener perceives the sound sources at the same place in physical space whether or not he moves his head. This can be important when viewing and listening to video content.
  • However, a content producer may wish a sound signal to be reproduced independently of the sound scene, that is to say, perceived as a sound apart from the sound scene, for example a voice-over.
  • This type of reproduction may, for example, serve to give explanations about the sound scene being rendered.
  • The content producer may also wish the sound to be reproduced at one ear, to obtain a deliberate "earpiece" effect, that is to say, a sound heard in one ear only.
  • This sound remains permanently at that ear only, even if the listener moves his head, as in the previous example.
  • The content producer may also wish this sound to be rendered at a precise position in the sound space, relative to one of the listener's ears (and not only within a single ear), even if the listener moves his head.
  • Such a monophonic signal, decoded and fed to the rendering system of an MPEG-H 3D Audio or AC-4 type codec, will be binauralized.
  • The sound will then be spread over both ears (although quieter in the contralateral ear), and if the listener moves his head he will no longer perceive the sound in the same way at his ear, since head-tracking processing, if implemented, will ensure that the position of the sound source remains the same as in the initial sound stage: depending on the position of the head, the sound will appear louder in one or the other ear.
  • In the state of the art, a "Dichotic" identification is associated with content that should not be processed by binauralization.
  • Alternatively, an information bit indicates that a signal is already virtualized. This bit allows the post-processing to be deactivated.
  • The content thus identified is content already formatted for audio headphones, that is to say already binaural, comprising two channels. These methods do not address the case of a monophonic signal that the producer of the sound stage does not wish to have binauralized.
  • the present invention improves the situation.
  • The invention proposes a method of processing a monophonic audio signal in a 3D audio decoder comprising a binauralization processing step for decoded signals intended to be spatially rendered on an audio headset.
  • The method is such that, upon detection, in a data stream representative of the monophonic signal, of a non-binauralization indication associated with restitution spatial position information, the decoded monophonic signal is directed to a stereophonic rendering engine that takes the position information into account to construct two rendering channels, processed by a direct mixing step summing these two channels with a binauralized signal resulting from the binauralization processing, for rendering on the headphones.
  • Thus, monophonic content can be rendered at a precise spatial position with respect to one of the listener's ears without undergoing binauralization processing, so that the rendered signal produces an "ear" effect: it is heard by the listener at a specific position with respect to an ear, inside the head, in the same way as a stereophonic signal, even if the listener's head moves.
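The routing decision described above can be sketched as follows; all names (`route_mono`, `stereo_render`, `binauralize`) are illustrative, not taken from the MPEG-H 3D Audio API:

```python
def route_mono(mono, non_binaural_flag, position, binauralize, stereo_render):
    """Route a decoded monophonic signal: when the non-binauralization
    indication and a restitution position are both present, bypass the
    binauralization processing and build two plain stereo channels."""
    if non_binaural_flag and position is not None:
        return stereo_render(mono, position)  # direct stereo path
    return binauralize(mono)                  # usual binaural path
```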
  • The stereophonic and binaural signals are similar in that they both consist of two channels, left and right, and are distinguished by the content of those two channels.
  • This monaural (monophonic) signal, once rendered, is then superimposed on the other rendered signals that form the 3D sound scene.
  • The bit rate required to signal this type of content is kept low, since it suffices to code only a position indication in the sound scene, in addition to the non-binauralization indication, to inform the decoder of the processing to be performed, unlike a method that would require encoding, transmitting and decoding a stereo signal taking this spatial position into account.
  • The restitution spatial position information may be a binary datum indicating a single channel of the playback audio headset.
  • This information requires only one coding bit, which further restricts the necessary bit rate.
  • In this case, only the rendering channel corresponding to the channel indicated by the binary datum is summed with the corresponding channel of the binauralized signal in the direct mixing step, the other rendering channel being of zero value.
  • In one embodiment, the monophonic signal is a channel-type signal directed to the stereophonic rendering engine together with the restitution spatial position information.
  • Thus, the monophonic signal does not undergo a binauralization processing step and is not treated like the channel-type signals usually processed by state-of-the-art methods.
  • This signal is processed by a stereophonic rendering engine different from the one existing for channel-type signals.
  • This rendering engine duplicates the monophonic signal on the two channels, applying to each channel gain factors that are a function of the restitution spatial position information.
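A minimal sketch of this duplication, assuming the position information has already been converted into two gain factors (the function and argument names are hypothetical):

```python
import numpy as np

def stereo_render(mono, g_left, g_right):
    """Duplicate the monophonic signal on two channels, weighting each
    copy by the gain factor derived from the position information."""
    mono = np.asarray(mono, dtype=float)
    return g_left * mono, g_right * mono
```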
  • This stereophonic rendering engine can also be integrated into the channel rendering engine, with differentiated processing according to the detection made for the signal at the input of that rendering engine, or into the direct mixing module summing the channels resulting from this stereophonic rendering engine with the binauralized signal from the binauralization processing module.
  • The restitution spatial position information may be interaural level difference (ILD) data or, more generally, level-ratio information between the left and right channels.
  • In another embodiment, the monophonic signal is an object-type signal associated with a set of rendering parameters including the non-binauralization indication and the restitution position information, the signal being directed to the stereophonic rendering engine with the restitution spatial position information.
  • The restitution spatial position information is, for example, an azimuth angle datum.
  • This information makes it possible to specify a rendering position with respect to one ear of the headset wearer, so that this sound is superimposed on the sound stage.
  • Thus, the monophonic signal does not undergo a binauralization processing step and is not treated like the object-type signals usually processed by state-of-the-art methods.
  • This signal is processed by a stereophonic rendering engine different from the one existing for object-type signals.
  • The non-binauralization indication and the restitution position information are included in the rendering parameters (metadata) associated with the object-type signal.
  • This rendering engine can also be integrated into the object rendering engine, or into the direct mixing module summing the channels resulting from this stereophonic rendering engine with the binauralized signal from the binauralization processing module.
  • The present invention also relates to a device for processing a monophonic audio signal, comprising a binauralization processing module for decoded signals intended to be spatially rendered on an audio headset.
  • This device is such that it comprises:
  • a detection module adapted to detect, in a data stream representative of the monophonic signal, a non-binauralization indication associated with restitution spatial position information;
  • a redirection module able, in the case of a positive detection by the detection module, to direct the monophonic signal to a stereophonic rendering engine;
  • a stereophonic rendering engine adapted to take the position information into account to construct two rendering channels;
  • a direct mixing module able to process the two rendering channels directly, by summing them with a binauralized signal from the binauralization processing module, for rendering on the headphones.
  • the stereophonic rendering engine is integrated in the direct mixing module.
  • This signal may be of the channel type or of the object type.
  • the monophonic signal is a channel-type signal and the stereophonic rendering engine is integrated with a channel rendering engine that also builds rendering channels for multi-channel signals.
  • the monophonic signal is an object type signal and the stereophonic rendering engine is integrated with an object rendering engine that also builds rendering channels for monophonic signals associated with sets of rendering parameters.
  • the present invention relates to an audio decoder comprising a processing device as described and a computer program comprising code instructions for implementing the steps of the processing method as described, when these instructions are executed by a processor.
  • The invention also relates to a storage medium, readable by a processor, integrated or not into the processing device, possibly removable, storing a computer program comprising instructions for executing the processing method as described above.
  • FIG. 1 illustrates a decoder of the MPEG-H 3D audio type as it exists in the state of the art
  • FIG. 2 illustrates the steps of a processing method according to one embodiment of the invention
  • FIG. 3 illustrates a decoder comprising a processing device according to a first embodiment of the invention
  • FIG. 4 illustrates a decoder comprising a processing device according to a second embodiment of the invention.
  • FIG. 5 illustrates a hardware representation of a processing device according to one embodiment of the invention.
  • FIG. 1 schematically illustrates a decoder as standardized in the MPEG-H 3D audio standard according to the document referenced above.
  • Block 101 is a core decoding module which decodes channel-type multichannel audio signals (Ch.), object-type monophonic audio signals (Obj.) associated with spatialization metadata (Obj.MeDa.), and audio signals in Higher Order Ambisonics (HOA) format.
  • A channel-type signal is decoded and processed by a channel rendering engine 102 ("Channel Renderer", also called "Format Converter" in MPEG-H 3D Audio) in order to adapt this channel signal to the audio rendering system.
  • The channel rendering engine knows the characteristics of the rendering system and thus provides one signal per rendering channel (Rdr.Ch.) to feed either real loudspeakers or virtual loudspeakers (which are then binauralized for headphone rendering).
  • These rendering channels are mixed by the mixing module 110 with other rendering channels from the object rendering engine 103 and the HOA rendering engine 105, described later.
  • Object-type signals are monophonic signals associated with data ("metadata") such as spatialization parameters (azimuth and elevation angles), which make it possible to position the monophonic signal in the spatialized sound scene, priority parameters, or sound volume parameters.
  • The object signals are decoded, together with their associated parameters, by the decoding module 101 and are processed by an object rendering engine 103 ("Object Renderer") which, knowing the characteristics of the rendering system, adapts these monophonic signals to those characteristics.
  • The various rendering channels (Rdr.Obj.) thus created are mixed with the other rendering channels from the channel and HOA rendering engines by the mixing module 110.
  • Similarly, the HOA (Higher Order Ambisonics) signals are processed by the HOA rendering engine 105; the rendering channels (Rdr.HOA) created by this rendering engine are mixed at 110 with the rendering channels created by the other rendering engines 102 and 103.
  • The signals at the output of the mixing module 110 can be rendered on real loudspeakers (HP) located in a playback room.
  • In this case, the signals at the output of the mixing module can directly feed these real loudspeakers, one channel per loudspeaker.
  • If the signals at the output of the mixing module are to be reproduced on an audio headset (AC), they are processed by a binauralization processing module 120 according to binauralization techniques described, for example, in the document cited above for the MPEG-H 3D Audio standard.
  • FIG. 2 now describes the steps of a method of processing according to one embodiment of the invention.
  • This method relates to the processing of a monophonic signal in a 3D audio decoder.
  • A step E200 detects whether the data stream (SMo) representative of the monophonic signal (for example, the bitstream at the input of the audio decoder) includes a non-binauralization indication associated with restitution spatial position information.
  • If not, the signal must be binauralized: it is processed by binauralization processing in step E210 before being rendered in E240 on a playback headset.
  • This binauralized signal can be mixed with stereophonic signals from step E220, described below.
  • If the indication is detected, the decoded monophonic signal is directed to a stereophonic rendering engine to be processed in step E220.
  • This non-binauralization indication may be, for example, as in the state of the art, a "Dichotic" identification given to the monophonic signal, or any other identification understood as an instruction not to process the signal by a binauralization process.
  • The restitution spatial position information can be, for example, an azimuth angle indicating the rendering position of the sound with respect to an ear, right or left; an interaural level difference (ILD) indication between the left and right channels, used to distribute the energy of the monophonic signal between them; or simply the indication of a single rendering channel, corresponding to the right or left ear. In the latter case, this information is binary and requires very little bit rate (1 bit of information).
  • In step E220, the position information is taken into account to build two rendering channels for the two earpieces of the headphones. These two rendering channels are then processed directly by a direct mixing step E230, which sums these two stereo channels with the two channels of the binauralized signal from the binauralization processing E210.
  • Each of the stereophonic reproduction channels is then summed with the corresponding channel of the binauralized signal.
  • In the case where the restitution spatial position information is a binary datum indicating a single channel of the playback headset, the two rendering channels constructed in step E220 by the stereophonic rendering engine consist of one channel carrying the monophonic signal, the other channel being zero and therefore possibly absent.
  • A single channel is then summed with the corresponding channel of the binauralized signal, the other channel being zero; this simplifies the mixing step.
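The direct mixing step E230 reduces to per-channel additions; a sketch under the assumption that all four channels have equal length (the function name is hypothetical):

```python
import numpy as np

def direct_mix(stereo_left, stereo_right, bin_left, bin_right):
    """Sum each stereophonic rendering channel with the corresponding
    channel of the binauralized signal (direct mixing step)."""
    out_left = np.asarray(stereo_left, dtype=float) + np.asarray(bin_left, dtype=float)
    out_right = np.asarray(stereo_right, dtype=float) + np.asarray(bin_right, dtype=float)
    return out_left, out_right
```

In the 1-bit case described above, one of the two stereo channels is all zeros, so the corresponding addition can simply be skipped.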
  • Thus, the listener wearing the audio headset hears, on the one hand, a spatialized sound scene from the binauralized signal, heard at the same physical place even if he moves his head in the case of dynamic rendering, and, on the other hand, a sound positioned inside the head, between one ear and the center of the head, which is superimposed on the sound stage independently of it: if the listener moves his head, this sound is still heard at the same position relative to that ear.
  • This sound is perceived as superimposed on the other, binauralized sounds of the sound stage, and acts as a voice-over for that sound scene.
  • FIG. 3 illustrates a first embodiment of a decoder comprising a processing device implementing the processing method described with reference to FIG. 2.
  • In this embodiment, the monophonic signal processed by the method is a channel-type signal (Ch.).
  • The object-type (Obj.) and HOA-type (HOA) signals are processed by the respective blocks 303, 304 and 305 in the same way as by the blocks 103, 104 and 105 described with reference to FIG. 1. Likewise, the mixing block 310 performs mixing as described for block 110 of FIG. 1.
  • The block 330 receiving the channel-type signals treats a monophonic signal having a non-binauralization indication (Di.) associated with restitution spatial position information (Pos.) differently from a signal that does not include this information, in particular a multichannel signal. Signals without this information are processed by block 302 in the same way as by block 102 described with reference to FIG. 1.
  • the block 330 acts as a router or switch and directs the decoded monophonic signal (Mo.) to a stereophonic rendering engine 331.
  • The stereophonic rendering engine also receives, from the decoding module, the restitution spatial position information (Pos.). With this information, it builds two rendering channels (2 Vo.), corresponding to the left and right channels of the playback headphones, so that these channels can be output to the AC headphones.
  • the restitution spatial position information is interaural sound level difference information between the left and right channels. This information makes it possible to define a factor to be applied to each of the rendering channels in order to respect this restitution spatial position.
  • The definition of these factors can be done as in the document referenced MPEG-2 AAC: ISO/IEC 13818-4:2004/DCOR 2, AAC, whose section 7.2 describes intensity stereo.
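One possible mapping from an ILD value to the two channel factors is shown below; this is an illustrative power-preserving rule, not the exact formula of the cited standard:

```python
import math

def ild_to_gains(ild_db):
    """Convert an interaural level difference in dB (positive = louder
    left channel) into left/right gain factors of unit total power."""
    ratio = 10.0 ** (ild_db / 20.0)              # left/right amplitude ratio
    g_right = 1.0 / math.sqrt(1.0 + ratio ** 2)  # normalize g_l^2 + g_r^2 = 1
    g_left = ratio * g_right
    return g_left, g_right
```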
  • These rendering channels are then added to the channels of a binauralized signal from the binauralization module 320, which performs binauralization processing in the same way as block 120 of FIG. 1.
  • This channel-summing step is performed by the direct mixing module 340, which adds the left channel from the stereophonic rendering engine 331 to the left channel of the binauralized signal from the binauralization processing module 320, and the right channel from the engine 331 to the right channel of that binauralized signal, before playback on the AC headset.
  • Thus, the monophonic signal does not pass through the binauralization processing module 320; it is transmitted directly to the stereophonic rendering engine 331 before being mixed directly with a binauralized signal.
  • This signal also does not undergo head-tracking processing.
  • The rendered sound will therefore be at a rendering position relative to one of the listener's ears and will remain at that position even if the listener moves his head.
  • the stereophonic rendering engine 331 can be integrated with the channel rendering engine 302.
  • In this case, the channel rendering engine implements both the adaptation of conventional channel-type signals, as described with reference to FIG. 1, and the construction of the two rendering channels of the rendering engine 331 as explained above, by receiving the restitution spatial position information (Pos). Only the two rendering channels are then redirected to the direct mixing module 340 before playback on the AC headphones.
  • the stereophonic rendering engine 331 is integrated with the direct mixing module 340.
  • In that case, the routing module 330 directs the decoded monophonic signal (for which the non-binauralization indication and the restitution spatial position information have been detected) to the direct mixing module 340.
  • the decoded spatial position information (Pos) is also transmitted to the direct mixing module 340.
  • This direct mixing module, which then comprises the stereophonic rendering engine, implements both the construction of the two rendering channels taking the restitution spatial position information into account and the mixing of these two rendering channels with the channels of a binauralized signal from the binauralization processing module 320.
  • FIG. 4 illustrates a second embodiment of a decoder comprising a processing device implementing the processing method described with reference to FIG. 2.
  • In this embodiment, the monophonic signal processed by the method is an object-type signal (Obj.).
  • The channel-type (Ch.) and HOA-type (HOA) signals are processed by the respective blocks 402 and 405 in the same way as by the blocks 102 and 105 described with reference to FIG. 1.
  • The mixing block 410 performs mixing as described for block 110 of FIG. 1.
  • The block 430 receiving the object-type signals (Obj.) treats a monophonic signal for which it has detected a non-binauralization indication (Di.) associated with restitution spatial position information (Pos.) differently from another monophonic signal for which this information has not been detected.
  • the block 430 acts as a router or switch and directs the decoded monophonic signal (Mo.) to a stereophonic rendering engine 431.
  • the non-binauralization indication (Di.) as well as the restitution spatial position information (Pos) are decoded by the decoding block 404 of the metadata or parameters associated with the object type signals.
  • the non-binauralization indication (Di.) is transmitted to the routing block 430 and the restitution spatial position information is transmitted to the stereophonic rendering engine 431.
  • This stereophonic rendering engine, receiving the restitution position information (Pos.), builds two rendering channels corresponding to the left and right channels of the playback headphones, so that these channels can be reproduced on the AC headphones.
  • the restitution spatial position information is an azimuth angle information defining an angle between the desired restitution position and the center of the listener's head.
  • This information makes it possible to define a factor to be applied to each of the rendering channels in order to respect this restitution spatial position.
  • The gain factors for the left and right channels can be calculated as presented in Ville Pulkki's "Virtual Sound Source Positioning Using Vector Base Amplitude Panning", J. Audio Eng. Soc., Vol. 45, No. 6, June 1997.
  • For example, the gain factors of the stereophonic rendering engine can be given by:
  • g1 = (cos O · sin H + sin O · cos H) / (2 · cos H · sin H)
  • g2 = (cos O · sin H - sin O · cos H) / (2 · cos H · sin H)
  • where O is the angle between the frontal direction and the object (called azimuth), and H is the angle between the frontal direction and the position of the virtual loudspeaker (corresponding to the half-angle between the loudspeakers), fixed for example at 45°.
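These panning gains can be evaluated directly; this sketch uses the document's notation (O for the azimuth, H for the half-angle between the virtual loudspeakers), with a hypothetical function name:

```python
import math

def pan_gains(azimuth_deg, half_angle_deg=45.0):
    """Stereophonic gain factors g1, g2 for a source at azimuth O
    between two virtual loudspeakers at +/-H (amplitude panning)."""
    o = math.radians(azimuth_deg)
    h = math.radians(half_angle_deg)
    denom = 2.0 * math.cos(h) * math.sin(h)
    g1 = (math.cos(o) * math.sin(h) + math.sin(o) * math.cos(h)) / denom
    g2 = (math.cos(o) * math.sin(h) - math.sin(o) * math.cos(h)) / denom
    return g1, g2
```

For O = 0 the two gains are equal (centered source); for O = H the signal goes entirely to one channel.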
  • These rendering channels are then added to the channels of a binauralized signal from the binauralization module 420, which performs binauralization processing in the same way as block 120 of FIG. 1.
  • This channel-summing step is performed by the direct mixing module 440, which adds the left channel from the stereophonic rendering engine 431 to the left channel of the binauralized signal from the binauralization processing module 420, and the right channel from the engine 431 to the right channel of that binauralized signal, before playback on the AC headset.
  • Thus, the monophonic signal does not pass through the binauralization processing module 420; it is transmitted directly to the stereophonic rendering engine 431 before being mixed directly with a binauralized signal.
  • This signal also does not undergo head-tracking processing.
  • The rendered sound will therefore be at a rendering position relative to one of the listener's ears and will remain at that position even if the listener moves his head.
  • the stereophonic rendering engine 431 can be integrated with the object rendering engine 403.
  • In this case, the object rendering engine implements both the adaptation of conventional object-type signals, as described with reference to FIG. 1, and the construction of the two rendering channels of the rendering engine 431 as explained above, by receiving the restitution spatial position information (Pos) from the parameter decoding module 404. Only the two rendering channels (2Vo.) are then redirected to the direct mixing module 440 before playback on the AC headphones.
  • the stereophonic rendering engine 431 is integrated with the direct mixing module 440.
  • In that case, the routing module 430 directs the decoded monophonic signal (Mo.) (for which the non-binauralization indication and the restitution spatial position information have been detected) to the direct mixing module 440.
  • the decoded spatial position information (Pos) is also transmitted to the direct mixing module 440 by the parameter decoding module 404.
  • This direct mixing module, which then includes the stereophonic rendering engine, implements both the construction of the two rendering channels taking the restitution spatial position information into account and the mixing of these two rendering channels with the channels of a binauralized signal from the binauralization processing module 420.
  • FIG. 5 now illustrates an example of a hardware embodiment of a processing device adapted to implement the processing method according to the invention.
  • The device DIS comprises a storage space 530, for example a memory MEM, and a processing unit 520 comprising a processor PROC, driven by a computer program Pg stored in the memory 530 and implementing the processing method according to the invention.
  • The computer program Pg comprises code instructions for implementing the steps of the processing method within the meaning of the invention when these instructions are executed by the processor PROC, and in particular, on detection, in a data stream representative of a monophonic signal, of an indication of non-binauralization associated with playback spatial position information, a step of directing the decoded monophonic signal to a stereophonic rendering engine which takes the position information into account to construct two playback channels, processed directly by a direct mixing step that sums these two channels with a binauralized signal coming from binauralization processing, for playback on the headphones.
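The detection-and-routing step described above can be sketched as a small decision function. The field names (`no_binauralization`, `position`) and the dict-like decoded frame are hypothetical placeholders; the actual bitstream syntax carrying the indication and the position information is defined by the codec, not by this sketch.

```python
def route_mono_signal(frame):
    """Illustrative routing decision for a decoded mono signal.

    If the decoded frame carries a non-binauralization indication
    together with playback position information, the mono signal
    bypasses binauralization and is directed to the stereophonic
    rendering engine; otherwise it follows the binauralization path.
    `frame` is a hypothetical dict-like decoded frame.
    """
    if frame.get("no_binauralization") and "position" in frame:
        return ("stereo_pan", frame["position"])
    return ("binauralize", None)
```

The point of the routing is that a flagged signal is never filtered by the HRTF-based binauralizer, so it reaches the headphones only through the position-controlled pan gains.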
  • FIG. 2 typically sets out the steps of an algorithm of such a computer program.
  • the code instructions of the program Pg are for example loaded into a RAM (not shown) before being executed by the processor PROC of the processing unit 520.
  • the program instructions can be stored on a storage medium such as a memory card, a flash memory, a hard disk, or other non-transitory storage media.
  • the device DIS comprises a reception module 510 suitable for receiving a data stream SMo representative of a monophonic signal. It comprises a detection module 540 able to detect, in this data stream, an indication of non-binauralization associated with playback spatial position information. It comprises a directing module 550 which, in the case of a positive detection by the detection module 540, directs the decoded monophonic signal to a stereophonic rendering engine 560, the stereophonic rendering engine 560 being able to take the position information into account to construct two playback channels.
  • the device DIS also comprises a direct mixing module 570 able to process the two playback channels directly by summing them with the two channels of a binauralized signal coming from a binauralization processing module.
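The direct mixing performed by module 570 reduces to a sample-wise sum of the two pan-rendered channels with the two channels of the binauralized signal. A minimal sketch, assuming equal-length channel buffers and leaving out any gain staging or clipping protection the real module might apply:

```python
def direct_mix(pan_lr, binaural_lr):
    """Sum the two pan-rendered channels with the two binauralized
    channels, sample by sample, to form the two headphone channels.

    pan_lr and binaural_lr are (left, right) pairs of equal-length
    sample lists; names and structure are illustrative only.
    """
    (pan_l, pan_r), (bin_l, bin_r) = pan_lr, binaural_lr
    out_l = [a + b for a, b in zip(pan_l, bin_l)]
    out_r = [a + b for a, b in zip(pan_r, bin_r)]
    return out_l, out_r
```

Because the flagged signal is added after binauralization, it keeps its plain stereo image while the rest of the scene remains binauralized.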
  • the playback channels thus obtained are transmitted to the headphones AC via an output module 560, to be played back.
  • The term "module" may correspond to a software component as well as to a hardware component or a set of hardware and software components, a software component itself corresponding to one or more computer programs or subprograms, or more generally to any element of a program capable of implementing a function or a set of functions as described for the modules concerned.
  • Similarly, a hardware component corresponds to any element of a hardware assembly able to implement a function or a set of functions for the module concerned (integrated circuit, smart card, memory card, etc.).
  • the device can be integrated into an audio decoder as described in FIG. 3 or FIG. 4 and can itself be integrated, for example, into multimedia equipment such as a set-top box or an audio or video content player. It can also be integrated into communication equipment such as a mobile telephone or a communication gateway.
EP18833274.6A 2017-12-19 2018-12-07 Verarbeitung eines monophonen signals in einem 3d-audiodecodierer, der binauralen inhalt liefert Pending EP3729832A1 (de)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22197901.6A EP4135350A1 (de) 2017-12-19 2018-12-07 Verarbeitung eines monophonen signals in einem 3d-audiodekoder zur darstellung binauraler inhalte

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR1762478A FR3075443A1 (fr) 2017-12-19 2017-12-19 Traitement d'un signal monophonique dans un decodeur audio 3d restituant un contenu binaural
PCT/FR2018/053161 WO2019122580A1 (fr) 2017-12-19 2018-12-07 Traitement d'un signal monophonique dans un décodeur audio 3d restituant un contenu binaural

Related Child Applications (1)

Application Number Title Priority Date Filing Date
EP22197901.6A Division EP4135350A1 (de) 2017-12-19 2018-12-07 Verarbeitung eines monophonen signals in einem 3d-audiodekoder zur darstellung binauraler inhalte

Publications (1)

Publication Number Publication Date
EP3729832A1 true EP3729832A1 (de) 2020-10-28

Family

ID=62222744

Family Applications (2)

Application Number Title Priority Date Filing Date
EP18833274.6A Pending EP3729832A1 (de) 2017-12-19 2018-12-07 Verarbeitung eines monophonen signals in einem 3d-audiodecodierer, der binauralen inhalt liefert
EP22197901.6A Pending EP4135350A1 (de) 2017-12-19 2018-12-07 Verarbeitung eines monophonen signals in einem 3d-audiodekoder zur darstellung binauraler inhalte

Family Applications After (1)

Application Number Title Priority Date Filing Date
EP22197901.6A Pending EP4135350A1 (de) 2017-12-19 2018-12-07 Verarbeitung eines monophonen signals in einem 3d-audiodekoder zur darstellung binauraler inhalte

Country Status (8)

Country Link
US (1) US11176951B2 (de)
EP (2) EP3729832A1 (de)
JP (2) JP7279049B2 (de)
KR (1) KR102555789B1 (de)
CN (1) CN111492674B (de)
BR (1) BR112020012071A2 (de)
FR (1) FR3075443A1 (de)
WO (1) WO2019122580A1 (de)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2022544795A (ja) 2019-08-19 2022-10-21 ドルビー ラボラトリーズ ライセンシング コーポレイション オーディオのバイノーラル化のステアリング
TW202348047A (zh) * 2022-03-31 2023-12-01 瑞典商都比國際公司 用於沉浸式3自由度/6自由度音訊呈現的方法和系統

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09327100A (ja) * 1996-06-06 1997-12-16 Matsushita Electric Ind Co Ltd ヘッドホン再生装置
US20090299756A1 (en) * 2004-03-01 2009-12-03 Dolby Laboratories Licensing Corporation Ratio of speech to non-speech audio such as for elderly or hearing-impaired listeners
US7634092B2 (en) * 2004-10-14 2009-12-15 Dolby Laboratories Licensing Corporation Head related transfer functions for panned stereo audio content
KR100754220B1 (ko) * 2006-03-07 2007-09-03 삼성전자주식회사 Mpeg 서라운드를 위한 바이노럴 디코더 및 그 디코딩방법
CN101690269A (zh) * 2007-06-26 2010-03-31 皇家飞利浦电子股份有限公司 双耳的面向对象的音频解码器
PT2146344T (pt) * 2008-07-17 2016-10-13 Fraunhofer Ges Forschung Esquema de codificação/descodificação de áudio com uma derivação comutável
TWI475896B (zh) * 2008-09-25 2015-03-01 Dolby Lab Licensing Corp 單音相容性及揚聲器相容性之立體聲濾波器
US8620008B2 (en) * 2009-01-20 2013-12-31 Lg Electronics Inc. Method and an apparatus for processing an audio signal
KR20120006060A (ko) * 2009-04-21 2012-01-17 코닌클리케 필립스 일렉트로닉스 엔.브이. 오디오 신호 합성
MY154078A (en) * 2009-06-24 2015-04-30 Fraunhofer Ges Forschung Audio signal decoder, method for decoding an audio signal and computer program using cascaded audio object processing stages
ES2755349T3 (es) * 2013-10-31 2020-04-22 Dolby Laboratories Licensing Corp Renderización binaural para auriculares utilizando procesamiento de metadatos
CN106162500B (zh) * 2015-04-08 2020-06-16 杜比实验室特许公司 音频内容的呈现

Also Published As

Publication number Publication date
FR3075443A1 (fr) 2019-06-21
KR20200100664A (ko) 2020-08-26
WO2019122580A1 (fr) 2019-06-27
US11176951B2 (en) 2021-11-16
BR112020012071A2 (pt) 2020-11-24
JP2021508195A (ja) 2021-02-25
US20210012782A1 (en) 2021-01-14
CN111492674A (zh) 2020-08-04
EP4135350A1 (de) 2023-02-15
KR102555789B1 (ko) 2023-07-13
RU2020121890A (ru) 2022-01-04
JP7279049B2 (ja) 2023-05-22
CN111492674B (zh) 2022-03-15
JP2023099599A (ja) 2023-07-13

Similar Documents

Publication Publication Date Title
CN107533843B (zh) 用于捕获、编码、分布和解码沉浸式音频的系统和方法
US9794686B2 (en) Controllable playback system offering hierarchical playback options
EP2042001B1 (de) Binaurale spatialisierung kompressionsverschlüsselter tondaten
EP2489206A1 (de) Verarbeitung von in einer subbanddomäne codierten schalldaten
US11570569B2 (en) Associated spatial audio playback
JP2023099599A (ja) バイノーラルコンテンツを配信する3d音声デコーダにおけるモノラル信号の処理
US20230232182A1 (en) Spatial Audio Capture, Transmission and Reproduction
EP3603076B1 (de) Verfahren zur auswahl von mindestens einem bildabschnitt, der zum rendern eines audiovisuellen streams vorausschauend heruntergeladen wird
FR3011373A1 (fr) Terminal portable d'ecoute haute-fidelite personnalisee
US11430451B2 (en) Layered coding of audio with discrete objects
EP4055840A1 (de) Signalisierung von audioeffekt-metadaten in einem bitstrom
RU2779295C2 (ru) Обработка монофонического сигнала в декодере 3d-аудио, предоставляющая бинауральный информационный материал
KR100598602B1 (ko) 가상 입체 음향 생성 장치 및 그 방법
WO2006075079A1 (fr) Procede d’encodage de pistes audio d’un contenu multimedia destine a une diffusion sur terminaux mobiles
US20240114310A1 (en) Method and System For Efficiently Encoding Scene Positions
FR3040253B1 (fr) Procede de mesure de filtres phrtf d'un auditeur, cabine pour la mise en oeuvre du procede, et procedes permettant d'aboutir a la restitution d'une bande sonore multicanal personnalisee
CN117768832A (zh) 用于高效编码场景位置的方法和系统
WO2024012805A1 (en) Transporting audio signals inside spatial audio signal

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20200703

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ORANGE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20220824

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

INTG Intention to grant announced

Effective date: 20240311