WO2022170716A1 - Audio processing method and apparatus, and device, medium and program product - Google Patents

Info

Publication number
WO2022170716A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
directional
signal
channel
information
Prior art date
Application number
PCT/CN2021/100382
Other languages
English (en)
Chinese (zh)
Inventor
潘兴德
吴超刚
Original Assignee
北京全景声信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京全景声信息科技有限公司
Publication of WO2022170716A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/323 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only for loudspeakers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303 Tracking of listener position or orientation

Definitions

  • the present application relates to the field of electronic technology, and in particular, to an audio processing method, apparatus, device, medium and program product.
  • audio-visual media has evolved through silent video media, monophonic sound media, stereo media, surround sound media (4-channel, 5.1-channel, 7.1-channel) and panoramic sound media.
  • the sound technology applied by traditional media generally enables all users to hear the sound synchronized with the picture. For example, movie audiences in a theater can hear the same movie sound no matter where they sit in the theater.
  • as users' demands for audio-visual media have gradually diversified, this existing audio-visual technology not only limits the creative space of audio-visual producers, but also cannot satisfy the different sound needs of different audiences of audio-visual media in the same space.
  • the existing audio-visual media technologies have the technical problem that they cannot meet the different demands for sound of users in different locations.
  • the present application provides an audio processing method, apparatus, device, medium and program product to solve the technical problem that the existing audio-visual media technology cannot meet the different demands of users on different sound characteristics in different locations.
  • the present application provides an audio processing method, comprising:
  • acquiring media information, where the media information includes sound orientation information configured for each target location within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information;
  • a sound signal is transmitted to each of the target positions, so that the sound signal received by each target position conforms to the corresponding target feature.
  • the sound signals received by each of the target positions are different or not identical.
  • the sound signals received at each of the target locations have the same attribute characteristics.
  • the transmitting sound signals to each of the target positions according to the sound orientation information includes:
  • the sound signal is sent to each of the target positions according to the sound orientation information.
  • the sound signal includes a directional sound signal, and before loading the sound signal into the ultrasonic signal, it further includes:
  • the directional sound channel is a channel other than a conventional channel
  • the directional sound signal is sent to each of the target positions according to the sound directional information.
  • the conventional channels include: stereo, 5.1 channels, 7.1 channels, 5.1.4 channels, 7.1.2 channels, 7.1.4 channels, and 13.1 channels.
  • the sound signal includes a directional sound signal, and before loading the sound signal into the ultrasonic signal, it further includes:
  • the directional sound signal is sent to each of the target positions according to the sound directional information.
  • before acquiring the media information, the method further includes:
  • the directional sound channel and the regular channel are added to the audio data file corresponding to the media information.
  • before acquiring the media information, the method further includes:
  • the sound object is used to carry the directional sound signal.
  • before loading the sound signal into the ultrasonic signal, the method further includes:
  • a preset separation model is used to analyze the conventional channel information in the media information, so as to divide the conventional channel information into a first multi-channel signal and a second multi-channel signal, where the first multi-channel signal includes one or more conventional channel signals, and the second multi-channel signal consists of the conventional channel signals other than the first multi-channel signal;
  • the directional acoustic signal is loaded into the ultrasonic signal
  • the directional sound signal is sent to each target position.
  • the directional sound signal is extracted from the sound object of the media information, including:
  • one or more conventional objects are selected from the sound objects as objects to be processed, and the sound objects include at least one conventional object;
  • the directional sound signal is determined according to the object to be processed.
  • an audio processing device comprising:
  • an acquisition module configured to acquire media information, where the media information includes sound orientation information configured for each target position within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information;
  • the directional transmission module is configured to transmit sound signals to each of the target positions according to the sound directional information, so that the sound signals received by each target position conform to the corresponding target characteristics.
  • the sound signals received by each of the target positions are different or not identical.
  • the sound signals received at each of the target locations have the same attribute characteristics.
  • the directional transmission module is specifically used for:
  • the sound signal is sent to each of the target positions according to the sound orientation information.
  • the directional transmission module is further configured to extract the directional sound signal from a directional sound channel of the media information, where the directional sound channel is a channel other than a conventional channel;
  • the directional transmission module is used to load the directional sound signal into the ultrasonic signal and, using the ultrasonic signal, to send the directional sound signal to each of the target positions according to the sound directional information.
  • the conventional channels include: stereo, 5.1 channels, 7.1 channels, 5.1.4 channels, 7.1.2 channels, 7.1.4 channels, and 13.1 channels.
  • the sound signal includes a directional sound signal
  • the directional transmission module is further configured to extract the directional sound signal from the sound object of the media information
  • the directional transmission module is used to load the directional sound signal into the ultrasonic signal and, using the ultrasonic signal, to send the directional sound signal to each of the target positions according to the sound directional information.
  • the acquisition module is further configured to acquire the directional sound signal
  • a media information production module configured to create the directional sound channel and add the directional sound signal to the directional sound channel
  • the media information production module is further configured to add the directional sound channel and the normal channel together to the audio data file corresponding to the media information when the media information is produced.
  • the acquisition module is further configured to acquire the directional sound signal
  • the media information production module is configured to use the sound object to carry the directional sound signal when the media information is produced.
  • the media information production module may be included in the audio processing apparatus, or may be an independent media information production apparatus.
  • a directional transmission module for:
  • a preset separation model is used to analyze the conventional channel information in the media information, so as to divide the conventional channel information into a first multi-channel signal and a second multi-channel signal, where the first multi-channel signal includes one or more conventional channel signals, and the second multi-channel signal consists of the conventional channel signals other than the first multi-channel signal;
  • the directional acoustic signal is loaded into the ultrasonic signal
  • the directional sound signal is sent to each target position.
  • a directional transmission module for:
  • one or more conventional objects are selected from the sound objects as objects to be processed, and the sound objects include at least one conventional object;
  • the directional sound signal is determined according to the object to be processed.
  • the application provides an electronic device, comprising:
  • a memory for storing a computer program for the processor
  • the processor is configured to implement any one of the possible audio processing methods provided by the first aspect by executing the computer program.
  • the present application provides an audio processing system, including: a directional speaker and the electronic device provided in the third aspect; wherein,
  • the directional speaker is used to implement the step of transmitting the sound signal to the target position in any one of the possible audio processing methods provided in the first aspect.
  • the present application further provides a storage medium, where a computer program is stored in the readable storage medium, and the computer program is used to execute any one of the possible audio processing methods provided in the first aspect.
  • the present application further provides a computer program product, including a computer program, which implements any one of the possible audio processing methods provided in the first aspect when the computer program is executed by a processor.
  • the present application provides an audio processing method, apparatus, device, medium and program product.
  • the media information includes sound orientation information configured for each target position within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information; then, according to the sound orientation information, the sound signal is transmitted to each target position, so that the sound signal received by each target position conforms to the corresponding target characteristics.
  • FIG. 1 is a schematic diagram of an application scenario of an audio processing method according to an exemplary embodiment of the present application
  • FIG. 2 is a schematic flowchart of an audio processing method according to an exemplary embodiment of the present application
  • FIG. 3 is a schematic flowchart of an audio processing method according to another exemplary embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a third audio processing method provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a fourth audio processing method provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of an audio processing apparatus provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by the present application.
  • the audio processing technology of traditional audio-visual media has developed from silent media through mono media, stereo media and surround sound media (4-channel, 5.1-channel, 7.1-channel) to panoramic sound media (such as Dolby Atmos).
  • in the step from silent to monophonic sound, sound is recorded in the media and reproduced for the audience during playback.
  • the step from mono to stereo and surround sound is mainly reflected in the increase in the number of channels and the emergence of the concept of a sound field.
  • the latest panoramic sound adds the motion characteristics of the sound objects in the displayed content to the attribute characteristics of the sound expression, so that the audience or users can better experience the sense of motion of the sound and the real-world sound scene is reproduced faithfully.
  • the inventors of the present application found that, throughout the development of audio processing technology, the audience or users of media information, although ostensibly the direct recipients of sound effects, have rarely had their individual requirements taken into account. Processing has instead been centered on the media information, such as the sound source objects in a movie, so that the audience or user perceives the live sound of the sound source object as it was recorded.
  • this habitual thinking discards the most important "user-centered" concept and assumes that all users want the same viewing experience and want to hear the same audio content.
  • this inertia not only limits the creative ideas of audio-visual media, but also makes it impossible to effectively meet the different needs of users.
  • for example, some users are particularly sensitive to sounds of a specific frequency, such as the friction of metal when a train brakes. Overly realistic reproduction of such sounds spoils the experience for these users, whereas other users actively want this kind of detailed sound. For users who watch racing videos, for instance, the roar of the engine and the sound of mechanical operation are exactly the sounds that enthusiasts want to focus on. These two contradictory demands obviously cannot both be met by existing audio processing technology in the same viewing environment.
  • FIG. 1 is a schematic diagram of an application scenario of an audio processing method according to an exemplary embodiment of the present application.
  • the scene is a movie theater viewing scene.
  • a plurality of target positions 110 are set within a preset range 100, and at least one seat can be arranged on each target position 110.
  • multiple ordinary speakers and multiple directional speakers are arranged.
  • the ordinary speakers are used to play the existing basic audio, such as: stereo, surround sound, and panoramic sound, and the directional speakers are used to transmit directional sound information to the target position.
  • the sound signals received by each target position are made to conform to the corresponding target characteristics, so as to meet the different needs of users at different target positions.
  • the steps of the audio processing method provided by the present application are described in detail below.
  • FIG. 2 is a schematic flowchart of an audio processing method according to an exemplary embodiment of the present application. As shown in FIG. 2 , the audio processing method provided by this embodiment specifically includes:
  • the sound orientation information is used to determine the transmission direction of the sound signal in the media information.
  • the media information includes a DCP (Digital Cinema Package) digital movie package.
  • the player of the theater obtains the DCP digital movie package, and then decodes it through a decoding program to obtain a directional sound signal.
  • the attribute feature includes sound orientation information, and the sound orientation information may be the coordinates of the target position.
  • the geometric center of the preset range 100 is set as the origin of a rectangular or polar coordinate system, and the coordinates of the target position are then given by the coordinates of the center point of the target position relative to the origin.
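The coordinate convention described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `TargetPosition`, the helper `center_of`, the use of metres, and taking the center as the mean of four corner points are all hypothetical and do not come from the application.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetPosition:
    """Center of one target position relative to the origin, which is the
    geometric center of the preset range (units assumed to be metres)."""
    x: float
    y: float

    def to_polar(self):
        """Return the same point as (radius, angle in degrees)."""
        radius = math.hypot(self.x, self.y)
        angle = math.degrees(math.atan2(self.y, self.x))
        return radius, angle


def center_of(corners):
    """Center of a target position, taken here as the mean of its corner
    points (an assumption of this sketch)."""
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return TargetPosition(sum(xs) / len(xs), sum(ys) / len(ys))


# A hypothetical block of seats, given by its four corners.
seat_block = [(2.0, 3.0), (4.0, 3.0), (4.0, 5.0), (2.0, 5.0)]
target = center_of(seat_block)
radius, angle = target.to_polar()
```

Either representation identifies the same target position; which one is stored as sound orientation information is a design choice the text leaves open.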
  • the sound signals in the media information are divided into two categories, one is the basic sound signal, and the other is the directional sound signal.
  • the player renders the sound field of the basic sound signal, amplifies it, and outputs it to the ordinary speaker array of the theater, so as to realize the sound reproduction of the theater.
  • the basic sound signal is transmitted to the common speaker, and the basic sound signal is played to the entire preset range 100 by the common speaker, and the basic sound signal includes at least one of: mono signal, stereo signal, surround sound signal, and panoramic sound signal.
  • the basic sound signal is used to transmit sound signals common to all users, such as background music, dialogues of main characters, etc., to each position in the preset range 100 .
  • the player sends the directional sound signal to the directional sound speaker. The directional sound speaker loads the directional sound signal into the ultrasonic signal, generates ultrasonic waves through the ultrasonic transducer, and sends the ultrasonic waves to the target position indicated by the sound directional information to generate a sub-sound field, so that only the audience in the area covered by the directional sound speakers can hear the directional sound; that is, the sound signals received by each target position are different or not identical.
  • due to the nonlinear effect of the air medium, the ultrasonic signal generates a beat signal, so that the directional sound signal loaded in the ultrasonic wave is restored to audible sound through the beat signal and is perceived by the user.
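The idea of loading an audible signal onto an ultrasonic carrier and recovering it through a nonlinearity can be illustrated numerically. The sketch below assumes plain amplitude modulation and uses squaring followed by averaging as a crude stand-in for the nonlinear behaviour of air; it is not a physical model of a parametric loudspeaker, and the carrier frequency, sample rate and modulation depth are all assumed values.

```python
import math

SAMPLE_RATE = 400_000  # Hz; simulation rate (assumed)
CARRIER_HZ = 40_000    # Hz; ultrasonic carrier (assumed value)
AUDIO_HZ = 1_000       # Hz; test tone standing in for the directional sound
MOD_DEPTH = 0.5        # modulation depth (assumed)


def modulate(n_samples):
    """Amplitude-modulate the test tone onto the ultrasonic carrier."""
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        audio = math.sin(2 * math.pi * AUDIO_HZ * t)
        carrier = math.sin(2 * math.pi * CARRIER_HZ * t)
        samples.append((1.0 + MOD_DEPTH * audio) * carrier)
    return samples


def square_law_demodulate(signal):
    """Square each sample, then average over one carrier period so that
    the ultrasonic components cancel and the audible envelope remains."""
    window = SAMPLE_RATE // CARRIER_HZ  # 10 samples per carrier period
    squared = [s * s for s in signal]
    return [
        sum(squared[i:i + window]) / window
        for i in range(len(squared) - window + 1)
    ]


ultrasound = modulate(800)  # 2 ms of modulated carrier
recovered = square_law_demodulate(ultrasound)
```

In this toy setup `recovered` follows roughly (1 + 0.5·audio)² / 2, so it rises and falls at the 1 kHz rate of the test tone even though the transmitted signal contains only components near 40 kHz.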
  • the basic sound field generated within the preset range 100 in the prior art cannot enable each target position to generate the same sound effect characteristics.
  • the audio processing method provided in this embodiment can solve this problem by using the low attenuation of ultrasonic directional sound technology (that is, the attenuation is small and the sound propagates far), which gives the listener, as with headphones, the effect that the sound is right next to the ears. In a large theater or a large music venue, the directional speakers can give the audience better sound for the same power consumption, or the same sound for lower power consumption. They can also let the audience at every position perceive the same sound effect, that is, the sound signal received by each target position has the same attribute characteristics.
  • an audio processing system is used to implement each step of this embodiment, and the audio processing system includes a server and a directional speaker, wherein the server is used to acquire media information, the media information includes sound orientation information configured for each target position within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information;
  • the server further configured to transmit the media information to the directional sound speaker
  • the directional sound speaker is used for transmitting sound signals to each of the target positions according to the sound directional information, so that the sound signals received by each target position conform to the corresponding target characteristics.
  • the server may be a player of the theater, or may be a central controller integrated in the directional speaker, that is, the server and the directional speaker are integrated into one as a smart directional speaker.
  • the smart directional speaker can be applied to scenarios such as home theaters, amusement parks, and concerts.
  • This embodiment provides an audio processing method, by acquiring media information, the media information includes sound orientation information configured for each target position within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information; Then, according to the sound orientation information, the sound signal is transmitted to each target position, so that the sound signal received by each target position conforms to the corresponding target feature. It solves the technical problem that the existing audio-visual media technology cannot meet the different needs of users in different positions for sound characteristics, and realizes the technical effect of directional transmission of different sounds to users in a specific position to meet the different sound requirements of users.
  • FIG. 3 is a schematic flowchart of an audio processing method according to another exemplary embodiment of the present application. As shown in FIG. 3 , the audio processing method provided by this embodiment specifically includes:
  • the directional sound signal is used to provide a preset target user with a specific sound signal to meet the special needs of the user.
  • for example, the audio-visual media is a movie; with the audio processing method provided in this embodiment, visually impaired users can also walk into the cinema and watch movies like everyone else.
  • narration audio for plot commentary can be added during film production, and this narration audio can be sent to a viewing area specially set up in the theater for visually impaired persons, that is, the target position. In this way, visually impaired people can watch movies in theaters like everyone else without wearing special headphones.
  • This step is to first obtain directional sound information that needs to be added to the movie, such as narration audio, during movie production.
  • the conventional channels include: 5.1 channels, 7.1 channels, 7.1.2 channels, 13.1 channels, and so on.
  • the conventional channels may also include other channel configurations in the prior art, which is not limited in this application; the list only indicates that the directional sound channel in the embodiment of the present application is a new channel specially used to carry directional sound and is different from the conventional channels.
  • those skilled in the art can also use one or several channels among the conventional channels as the directional sound channel specially carrying the directional sound.
  • when the directional sound is placed in a channel, it is placed in channels other than the 5.1, 7.1, 7.1.2, 13.1 and other conventional channels, generally in the last 4 of the 16 channels (Sch1, Sch2, ..., Sch16). That is, Sch1 to Sch12 carry basic sound signals and Sch13 to Sch16 carry directional sound signals, and each movie program can use one or more of the channels Sch13 to Sch16 for directional sound signals.
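A minimal sketch of the 16-channel layout described above, assuming each audio frame arrives as a list of 16 samples with index 0 holding Sch1. The function name `split_frame` and the `active_directional` parameter are hypothetical; the application does not specify how a program signals which of Sch13 to Sch16 are in use.

```python
BASIC_CHANNELS = range(1, 13)        # Sch1..Sch12 carry basic sound
DIRECTIONAL_CHANNELS = range(13, 17)  # Sch13..Sch16 may carry directional sound


def split_frame(frame, active_directional=(13,)):
    """Split one 16-sample frame into basic samples and the directional
    samples of the channels this program actually uses."""
    if len(frame) != 16:
        raise ValueError("expected one sample for each of Sch1..Sch16")
    for number in active_directional:
        if number not in DIRECTIONAL_CHANNELS:
            raise ValueError(f"Sch{number} is not a directional channel")
    basic = [frame[n - 1] for n in BASIC_CHANNELS]
    directional = {f"Sch{n}": frame[n - 1] for n in active_directional}
    return basic, directional


# Dummy frame whose sample values equal their channel numbers.
frame = list(range(1, 17))
basic, directional = split_frame(frame, active_directional=(13, 15))
```

Keeping the directional samples keyed by channel name makes it straightforward to pair each one with its own sound orientation information later.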
  • the metadata of the sound object is used to carry the directional sound signal.
  • the metadata of the sound object may store various attribute information of the directional sound, such as direction, position, frequency band, and the like.
  • the directional sound is placed in the sound object, and the object type in the metadata of the directional sound object is marked as directional sound.
  • the coding of the basic sound object and the directional sound object may adopt audio coding schemes such as WANOS and AVS2.
  • the metadata of the directional sound object also includes the information required for directional sound rendering, including but not limited to the position coordinate information (x, y, z) of the sound object. The coordinate system takes the center of the cinema viewing area, that is, the preset range, as its origin, and the coordinates of the center point of the target position are used as the position coordinate information.
  • the position where the directional speakers are installed can also be used as the origin of the coordinate system, and those skilled in the art can select the number and positions of the directional speakers according to the actual situation, which is not limited in this application.
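The object-based variant can be pictured with a small data structure. The field names, the `"directional"` and `"basic"` type labels, and the example values below are hypothetical; the application only states that the metadata marks the object type as directional sound and carries position coordinates (x, y, z).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundObjectMetadata:
    """Metadata of one sound object (field names are illustrative only)."""
    name: str
    object_type: str  # "directional" or "basic" (assumed labels)
    position: tuple   # (x, y, z) relative to the chosen origin


def directional_objects(objects):
    """Keep only the objects whose metadata marks them as directional."""
    return [o for o in objects if o.object_type == "directional"]


objects = [
    SoundObjectMetadata("dialogue", "basic", (0.0, 0.0, 0.0)),
    SoundObjectMetadata("narration", "directional", (-4.0, 6.0, 1.2)),
]
selected = directional_objects(objects)
```

A playback system built this way would hand `selected` to the directional speakers and everything else to the ordinary speaker array.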
  • the basic sound signal and the directional sound signal are jointly output as a movie sound stream (or file) containing the directional sound
  • the movie sound file (stream) containing the directional sound and the movie video file (stream) are jointly synthesized into a movie
  • the DCP package is distributed to the theater through the network, hard disk, etc.
  • the sound orientation information is used to determine the transmission direction of the sound signal in the media information.
  • the player of the theater reads the movie DCP package, that is, the media information, through a network or a hard disk.
  • in step S302, there are also two extraction methods.
  • the cinema sound decoder receives the movie sound stream containing the directional sound output from the movie player (server), and decodes two types of sound signals, namely the basic sound signal and the directional sound signal. And send these two kinds of sound signals to the ordinary speaker array of the theater and the directional sound speakers respectively.
  • the directional sound speaker uses the ultrasonic directional sound technology to load the directional sound signal into the ultrasonic signal, and utilizes the characteristics of the ultrasonic high-frequency signal to realize directional transmission.
  • the attenuation range is small, and the sound propagation distance is long;
  • when the sound meets a flat surface it is reflected, which makes it possible to create virtual sound sources and stereo effects.
  • the sound orientation information includes the coordinates of the target position, and the sound orientation information is included in the attribute information of the directional sound signal.
  • the directional sound speaker will send the directional sound signal to each target position, so that the sound signal received by each target position conforms to the corresponding target characteristics.
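One way to turn a target coordinate into a transmission direction is to compute the azimuth and elevation from the speaker to the target. The sketch below assumes both points are expressed in the same rectangular coordinate system in metres; the function name `aim_angles` is hypothetical, and the application does not say whether the beam is steered mechanically or electronically.

```python
import math


def aim_angles(speaker_xyz, target_xyz):
    """Return (azimuth, elevation, distance) from speaker to target.

    Azimuth is measured in the x-y plane from the +x axis, elevation is
    measured from the horizontal plane; both are in degrees.
    """
    dx, dy, dz = (t - s for s, t in zip(speaker_xyz, target_xyz))
    horizontal = math.hypot(dx, dy)
    azimuth = math.degrees(math.atan2(dy, dx))
    elevation = math.degrees(math.atan2(dz, horizontal))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    return azimuth, elevation, distance


# Hypothetical ceiling-mounted speaker 3 m above the origin, aimed at a
# target position 3 m away at floor level.
azimuth, elevation, distance = aim_angles((0.0, 0.0, 3.0), (3.0, 0.0, 0.0))
```

Here the speaker would point 45 degrees below the horizontal along the +x axis; the distance can be used to scale output level if needed.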
  • the sound signals received at each target position conform to the corresponding target characteristics, including:
  • the sound signals received by each of the target positions are different or not identical;
  • the sound signals received at each of the target locations have the same attribute characteristics.
  • the sound signal received at the corresponding target position includes a specially customized directional sound signal in addition to the basic sound signal.
  • a narration signal for the plot of the movie can be included, so that visually impaired users can watch movies together with other viewers without additional equipment such as earphones, which greatly improves the experience of visually impaired users.
  • the theater sound decoder receives the movie sound stream containing directional sound output from the movie player (server), and decodes two types of sound signals, namely the basic sound signal and the directional sound signal.
  • the basic sound signal is rendered by the sound field, amplified and output to the ordinary speaker array of the theater to realize the sound reproduction of the theater.
  • the cinema sound decoder applies ultrasonic frequency modulation to the directional sound signal according to the metadata of the directional sound object, the theater space size (height, width, length) and other information, and outputs the modulated signal to the directional sound speaker for presentation.
  • the audio processing method provided in this embodiment is also applicable to similar scenarios such as a home theater, a living room, and the like.
  • the above process can also be implemented by an audio processing system, the audio processing system includes: a media information production server, a playback server and a directional speaker; wherein,
  • a media information production server for acquiring the directional sound signal
  • the media information production server is further configured to create the directional sound channel and add the directional sound signal to the directional sound channel; or,
  • the media information production server is also used to add the directional sound signal to the sound object
  • the media information production server is further configured to add the directional sound channel and the regular sound channel to the audio data file corresponding to the media information when the media information is produced; or,
  • the media information production server is further configured to use the metadata of the sound object to carry the directional sound signal when the media information is produced;
  • the playback server configured to acquire media information, where the media information includes sound orientation information configured for each target location within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information;
  • the playback server is further configured to transmit the media information to the directional sound speaker;
  • the directional sound speaker is used for transmitting sound signals to each of the target positions according to the sound directional information, so that the sound signals received by each target position conform to the corresponding target characteristics.
  • the sound signal includes a directional sound signal
  • the directional sound speaker is configured to transmit sound signals to each of the target positions according to the sound directional information, including:
  • the directional sound speaker for loading the sound signal into the ultrasonic signal
  • the directional sound speaker is further configured to use the ultrasonic signal to send the sound signal to each of the target positions according to the sound directional information.
  • the sound signal includes a directional sound signal
  • the playback server is further configured to transmit the media information to the directional sound speaker, including:
  • the playback server is further configured to extract the directional sound signal from a directional sound channel of the media information, where the directional sound channel is a channel other than a conventional channel;
  • the directional sound speaker is used for loading the directional sound signal into the ultrasonic signal and, using the ultrasonic signal, sending the directional sound signal to each of the target locations according to the sound directional information.
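  • as a minimal sketch of what "loading the sound signal into the ultrasonic signal" could look like, the example below amplitude-modulates an audible signal onto an ultrasonic carrier. The text does not specify the modulation scheme; double-sideband amplitude modulation, the 40 kHz carrier, the modulation depth and the function name modulate_onto_ultrasound are all assumptions made for illustration.

```python
import numpy as np

def modulate_onto_ultrasound(audio, sample_rate, carrier_hz=40_000.0, depth=0.8):
    """Amplitude-modulate an audible signal onto an ultrasonic carrier (assumed scheme)."""
    audio = np.asarray(audio, dtype=float)
    if carrier_hz >= sample_rate / 2:
        raise ValueError("sample_rate must exceed twice the carrier frequency")
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak  # normalize to [-1, 1] so the envelope stays positive
    t = np.arange(audio.size) / sample_rate
    carrier = np.sin(2 * np.pi * carrier_hz * t)
    # The envelope (1 + depth * audio) carries the audible content.
    return (1.0 + depth * audio) * carrier

# Hypothetical usage: a 10 ms, 1 kHz tone sampled at 192 kHz.
fs = 192_000
time_axis = np.arange(fs // 100) / fs
tone = np.sin(2 * np.pi * 1000 * time_axis)
ultra = modulate_onto_ultrasound(tone, fs)
```

  • in this sketch the steering of the modulated beam toward a target position is left out; only the modulation step is shown.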
  • the playback server is integrated into the directional speaker to form a smart directional speaker playback terminal.
  • the media production server and the playback server are integrated into one server, forming an intelligent directional media center server integrating production and playback functions.
  • the present embodiment provides an audio processing method that acquires media information, where the media information includes sound orientation information configured for each target position within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information; the sound signal is then transmitted to each target position according to the sound orientation information, so that the sound signal received at each target position conforms to the corresponding target feature.
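  • the relationship between media information, sound orientation information and target positions can be sketched as a small data structure. The class names, the field names (azimuth, elevation, signal identifiers) and the seat labels below are hypothetical and are not taken from the application; the sketch only illustrates that each target position carries its own direction and may receive its own signal.

```python
from dataclasses import dataclass, field

@dataclass
class TargetOrientation:
    target_id: str        # hypothetical label of a target position, e.g. a seat
    azimuth_deg: float    # assumed representation of the transmission direction
    elevation_deg: float
    signal_id: str        # which sound signal this target position should receive

@dataclass
class MediaInfo:
    signals: dict                                   # signal_id -> list of samples
    orientations: list = field(default_factory=list)

def plan_transmissions(media):
    """Return one (target_id, direction, samples) tuple per configured target position."""
    plan = []
    for o in media.orientations:
        samples = media.signals[o.signal_id]
        plan.append((o.target_id, (o.azimuth_deg, o.elevation_deg), samples))
    return plan

# Hypothetical usage: two seats receive different signals from the same media.
media = MediaInfo(
    signals={"dialogue_a": [0.1, 0.2], "dialogue_b": [0.3, 0.4]},
    orientations=[
        TargetOrientation("seat_A1", -15.0, 5.0, "dialogue_a"),
        TargetOrientation("seat_B4", 20.0, 5.0, "dialogue_b"),
    ],
)
plan = plan_transmissions(media)
```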
  • the audio processing method provided by the present application can also perform directional transmission processing on the audio in traditional video media, which will be described below with two specific embodiments.
  • FIG. 4 is a schematic flowchart of a third audio processing method provided by an embodiment of the present application. As shown in FIG. 4, the audio processing method provided by this embodiment specifically includes:
  • the media information is traditional media information, which does not contain directional sound information.
  • the media information is a DCP digital movie package without directional sound
  • the player in the theater obtains the DCP digital movie package, and then decodes it through a decoding program.
  • S402. Use a preset separation model to parse the regular channel information in the media information to divide the regular channel information into a first multi-channel signal and a second multi-channel signal.
  • the first multi-channel signal includes one or more conventional channel signals for constructing the directional sound signal
  • the second multi-channel signal is a conventional channel signal other than the first multi-channel signal
  • the player in the theater obtains a multi-channel audio signal, that is, conventional channel information.
  • for example, audio signals in channel layouts such as 5.1, 7.1, 7.1.2 and 13.1.
  • the preset separation model is used to analyze the conventional channel information in the media information in order to extract the directional sound components more effectively and accurately; the extracted signal serves as the first multi-channel signal.
  • the first embodiment analyzes the time-domain data of the multi-channel signal (that is, the conventional channel information) to extract the directional sound component.
  • the mapped channel information, that is, the multi-channel signal z, is divided into a directional sound component zd and a remaining conventional component zc.
  • the mapping matrix W is a 4×4 matrix with 16 coefficients
  • W can be represented by the following formula:
  • W is a unit orthogonal matrix that satisfies:
  • zi is the directional sound component signal
  • ai is the directional sound directional information
  • zi is the regular component.
  • the directional sound component zd and the remaining conventional component zc are mapped back to the time domain of the conventional channel information, that is, the multi-channel signal domain, to obtain the directional sound multi-channel signal xd and the remaining conventional multi-channel signal xc.
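  • the forward mapping, the split into zd and zc, and the inverse mapping can be sketched as follows. The sketch assumes that z is obtained as z = W·x and that, because W is a unit orthogonal matrix, its inverse is its transpose; which rows of z count as directional is not stated in the text, so the directional_rows argument and the random W used here are purely illustrative.

```python
import numpy as np

def split_by_mapping(x, W, directional_rows):
    """x: (channels, samples). W: orthonormal mapping matrix of shape (channels, channels)."""
    z = W @ x                                  # map to the analysis domain
    mask = np.zeros(W.shape[0], dtype=bool)
    mask[list(directional_rows)] = True
    zd = np.where(mask[:, None], z, 0.0)       # directional sound component
    zc = np.where(mask[:, None], 0.0, z)       # remaining conventional component
    xd = W.T @ zd                              # inverse mapping: W^-1 equals W^T here
    xc = W.T @ zc
    return xd, xc

# Hypothetical usage with a random orthonormal 4x4 matrix standing in for W.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 1000))
W, _ = np.linalg.qr(rng.standard_normal((4, 4)))
xd, xc = split_by_mapping(x, W, directional_rows=[0])
```

  • because W is orthonormal, the two outputs add back up to the original multi-channel signal, so nothing is lost by the split.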
  • the second embodiment analyzes the frequency domain data of the multi-channel signal (that is, the conventional channel information) to extract the directional sound component.
  • the conventional channel information, that is, the original multi-channel sound signal, is mapped to the conventional channel frequency domain signal.
  • the initial form of the original multi-channel sound signal is the time-domain signal u(m, t).
  • the multi-channel frequency-domain signal x(m, k), that is, the first frequency-domain signal, can be obtained, where m is the channel sequence number, t is the frame (or subframe) sequence number, and k is the frequency sequence number.
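  • a minimal sketch of the time-to-frequency mapping is given below. The document mentions the MDCT elsewhere, so the sketch uses a direct MDCT with a sine window and 50% overlapped frames; the frame length, the window and the function name mdct_frames are assumptions, and a production codec would use a fast algorithm instead of the explicit cosine matrix.

```python
import numpy as np

def mdct_frames(u, half_len=256):
    """u: (channels, samples). Returns coefficients of shape (channels, frames, half_len)."""
    u = np.asarray(u, dtype=float)
    N = half_len
    if u.shape[1] < 2 * N:
        raise ValueError("signal is shorter than one frame")
    n = np.arange(2 * N)
    k = np.arange(N)
    window = np.sin(np.pi * (n + 0.5) / (2 * N))                     # sine window
    basis = np.cos(np.pi / N * np.outer(n + 0.5 + N / 2, k + 0.5))   # (2N, N) MDCT kernel
    n_frames = (u.shape[1] - 2 * N) // N + 1                         # hop size N
    frames = np.stack([u[:, t * N: t * N + 2 * N] for t in range(n_frames)], axis=1)
    return (frames * window) @ basis                                 # indexed (m, t, k)

# Hypothetical usage: two channels of noise.
rng = np.random.default_rng(0)
u = rng.standard_normal((2, 2048))
x = mdct_frames(u, half_len=256)
```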
  • the regular channel frequency domain signal is divided into a plurality of different time-frequency subbands.
  • the conventional channel frequency domain signal obtained in the first step is x(m,k), and x(m,k) can be divided into different time-frequency subbands xi(t,k), where m is the channel sequence number, i is the sequence number of the time-frequency subband, t is the frame (or subframe) sequence number, and k is the frequency sequence number.
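  • one possible way to divide the frequency-domain signal into time-frequency subbands is sketched below. The text does not give the band edges or the number of frames per subband, so band_edges, frames_per_tile and the function name are hypothetical parameters chosen for illustration.

```python
import numpy as np

def split_time_frequency_subbands(x, band_edges, frames_per_tile):
    """x: (channels, frames, bins). Returns a list of tiles (channels, tile_frames, tile_bins)."""
    tiles = []
    n_frames = x.shape[1]
    for t0 in range(0, n_frames, frames_per_tile):
        for lo, hi in zip(band_edges[:-1], band_edges[1:]):
            # Each tile keeps all channels for one block of frames and one frequency band.
            tiles.append(x[:, t0:t0 + frames_per_tile, lo:hi])
    return tiles

# Hypothetical usage: 2 channels, 8 frames, 16 frequency bins.
x = np.arange(2 * 8 * 16, dtype=float).reshape(2, 8, 16)
tiles = split_time_frequency_subbands(x, band_edges=[0, 4, 8, 16], frames_per_tile=4)
```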
  • the first frequency domain feature of the frequency domain signal of the conventional channel is calculated, and a PCA mapping model is constructed according to the first frequency domain feature.
  • the first frequency domain feature includes the statistical properties of the conventional channel frequency domain signal in each time-frequency subband; these may be first-order statistics (mean value), second-order statistics (variance and correlation coefficient), higher-order statistics (higher-order moments) or their transformed forms, and second-order statistics are the most commonly used.
  • a second-order statistic, for example a covariance matrix, may be used as the first frequency domain feature.
  • the first frequency domain feature of the conventional channel frequency domain signal xi(t, k) can be calculated, and the optimized subspace mapping model Wi(t, k) can be determined; this mapping model maps the conventional channel frequency domain signal to a new subspace to obtain a new frequency domain signal zi(t, k), that is, the mapping domain frequency domain signal.
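  • building a PCA mapping from the covariance matrix of one subband can be sketched as follows. The sketch treats the coefficients of a subband as real-valued observations per channel and orders the components by variance; treating the strongest component as the directional candidate is an assumption, since the text does not state how the components are assigned.

```python
import numpy as np

def pca_mapping(x_sub):
    """x_sub: (channels, observations) for one time-frequency subband.
    Returns (W, z): rows of W are principal directions, z = W @ centered x_sub."""
    centered = x_sub - x_sub.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / centered.shape[1]   # second-order statistic
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]                 # strongest component first
    W = eigvecs[:, order].T
    return W, W @ centered

# Hypothetical usage: one dominant source spread over four channels plus weak noise.
rng = np.random.default_rng(1)
source = rng.standard_normal(2000)
x_sub = np.outer([1.0, 0.8, 0.6, 0.4], source) + 0.05 * rng.standard_normal((4, 2000))
W, z = pca_mapping(x_sub)
```

  • after the mapping, the components of z are mutually uncorrelated, which is what makes it possible to treat one of them separately from the rest.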
  • the frequency domain signal obtained by decoding in the audio decoder can also be directly used as the normal channel frequency domain signal, without performing MDCT transformation.
  • S404. Load the directional acoustic signal into the ultrasonic signal.
  • for the specific implementation manner of steps S404-S405, reference may be made to S305-S306, which will not be repeated here.
  • FIG. 5 is a schematic flowchart of a fourth audio processing method provided by an embodiment of the present application. As shown in FIG. 5 , the audio processing method provided by this embodiment specifically includes:
  • the media information is traditional media information, which does not contain directional sound information.
  • specific implementations of steps S502-S503 include:
  • a part of the sound objects can be selected from multiple sound objects as directional sound signals for processing.
  • the selection method and judgment standard are not limited here, and those skilled in the art can make specific selections as needed.
  • the spatial coordinates of the sound object are defined as follows: the origin is the center of the horizontal section, at the height of the sound engineer's ears during monitoring; the x-axis points to the right side (wall), the y-axis points to the front (usually the screen), and the z-axis points vertically upward (roof).
  • the sound field space is represented by normalized coordinates.
  • the maximum absolute coordinate value of the x-axis, y-axis and z-axis is 1.
  • the shorter side of the z-axis is the ground, and its normalized absolute coordinate value is a (a ≤ 1).
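  • the normalized coordinate system can be illustrated with a small conversion from room dimensions. The linear scaling used below, the parameter names and the example room size are assumptions; the text only states that the maximum absolute coordinate on each axis is 1 and that the shorter side of the z-axis has the value a.

```python
def normalize_position(x_m, y_m, z_m, width_m, length_m, height_m, ear_height_m):
    """Convert metres measured from the listening origin into normalized coordinates."""
    up = height_m - ear_height_m       # distance from ear level to the roof
    down = ear_height_m                # distance from ear level to the ground
    z_scale = max(up, down)            # the longer side of the z-axis maps to |z| = 1
    return (x_m / (width_m / 2), y_m / (length_m / 2), z_m / z_scale)

def shorter_side_value(height_m, ear_height_m):
    """Normalized absolute coordinate of the shorter side of the z-axis (the value a)."""
    up = height_m - ear_height_m
    down = ear_height_m
    return min(up, down) / max(up, down)

# Hypothetical room: 10 m wide, 20 m long, 7.5 m high, ears 1.5 m above the ground.
corner = normalize_position(5.0, 10.0, 6.0, 10.0, 20.0, 7.5, 1.5)
floor_z = normalize_position(0.0, 0.0, -1.5, 10.0, 20.0, 7.5, 1.5)[2]
a = shorter_side_value(7.5, 1.5)
```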
  • when the sound object is close to the center position, the sound object is processed as a directional sound signal, that is, when the distance between the position coordinates (x, y, z) in the metadata and the center position (0, 0, 0) is less than the threshold Dthreshold. Alternatively, another judgment criterion may be adopted: when the position of the sound object is close to the horizontal plane of the listening position, the sound object is processed as a directional sound signal, that is, when y in the position coordinates (x, y, z) in the metadata is less than the threshold Ythreshold.
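  • the two judgment criteria can be sketched as a simple filter over sound object metadata. The SoundObject class, the use of Euclidean distance and the example threshold values are assumptions; the second criterion compares y with the threshold exactly as worded in the text.

```python
import math
from dataclasses import dataclass

@dataclass
class SoundObject:
    name: str   # hypothetical label
    x: float    # normalized position coordinates from the metadata
    y: float
    z: float

def select_directional_objects(objects, d_threshold=None, y_threshold=None):
    """Select objects to be processed as directional sound, using one of the two criteria."""
    selected = []
    for obj in objects:
        if d_threshold is not None:
            # Criterion 1: distance to the center position (0, 0, 0) below Dthreshold.
            distance = math.sqrt(obj.x ** 2 + obj.y ** 2 + obj.z ** 2)
            if distance < d_threshold:
                selected.append(obj)
        elif y_threshold is not None and obj.y < y_threshold:
            # Criterion 2: y coordinate below Ythreshold.
            selected.append(obj)
    return selected

# Hypothetical usage with two objects.
objects = [SoundObject("whisper", 0.1, 0.1, 0.0), SoundObject("explosion", 0.9, -0.8, 0.5)]
near_center = select_directional_objects(objects, d_threshold=0.3)
below_y = select_directional_objects(objects, y_threshold=0.0)
```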
  • for the specific implementation manner of steps S504-S505, reference may be made to S305-S306, which will not be repeated here.
  • the presentation effect of existing audio-visual media can be improved, that is, some components are extracted from the basic sound signal (stereo, surround sound, panoramic sound, etc.) and presented with a directional loudspeaker, which expands the application range of the directional sound technology provided in this application.
  • FIG. 6 is a schematic structural diagram of an audio processing apparatus according to an embodiment of the present application. As shown in FIG. 6 , the audio processing apparatus 600 provided in this embodiment includes:
  • an acquisition module 601 configured to acquire media information, where the media information includes sound orientation information configured for each target position within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information;
  • the directional transmission module 602 is configured to transmit sound signals to each of the target positions according to the sound directional information, so that the sound signals received by each target position conform to the corresponding target characteristics.
  • the sound signals received at the target locations may differ from one another or may be identical.
  • the sound signals received at each of the target locations have the same attribute characteristics.
  • the directional transmission module 602 is specifically used for:
  • the sound signal is sent to each of the target positions according to the sound orientation information.
  • the sound signal includes a directional sound signal
  • the directional transmission module 602 is further configured to extract the directional sound signal from the directional sound channel of the media information, where the directional sound channel is a channel other than the conventional channel
  • the directional transmission module 602 is configured to load the directional sound signal into the ultrasonic signal and, using the ultrasonic signal, send the directional sound signal to each of the target locations according to the sound directional information.
  • the conventional channels include: stereo, 5.1 channels, 7.1 channels, 5.1.4 channels, 7.1.2 channels, 7.1.4 channels, and 13.1 channels.
  • the sound signal includes a directional sound signal
  • the directional transmission module 602 is further configured to extract the directional sound signal from the sound object of the media information
  • the directional transmission module 602 is configured to load the directional sound signal into the ultrasonic signal and, using the ultrasonic signal, send the directional sound signal to each of the target locations according to the sound directional information.
  • the obtaining module 601 is further configured to obtain the directional sound signal
  • a media information production module configured to create the directional sound channel and add the directional sound signal to the directional sound channel
  • the media information production module is further configured to add the directional sound channel and the normal channel together to the audio data file corresponding to the media information when the media information is produced.
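  • adding a directional sound channel alongside the regular channels, and extracting it again at playback, can be sketched with a plain dictionary standing in for the audio data file. The container layout, the 5.1 channel names and the function names are hypothetical; the application does not define a file format in this passage.

```python
REGULAR_LAYOUT_5_1 = ["L", "R", "C", "LFE", "Ls", "Rs"]  # assumed regular channel names

def build_audio_file(regular, directional, orientation):
    """Production side: bundle the regular channels with one extra directional sound channel."""
    missing = [name for name in REGULAR_LAYOUT_5_1 if name not in regular]
    if missing:
        raise ValueError(f"missing regular channels: {missing}")
    return {
        "regular_channels": {name: list(regular[name]) for name in REGULAR_LAYOUT_5_1},
        "directional_channel": {"samples": list(directional), "orientation": orientation},
    }

def extract_directional(audio_file):
    """Playback side: read the directional signal without touching the regular channels."""
    channel = audio_file["directional_channel"]
    return channel["samples"], channel["orientation"]

# Hypothetical usage with two-sample channels.
regular = {name: [0.0, 0.0] for name in REGULAR_LAYOUT_5_1}
audio_file = build_audio_file(regular, [0.5, -0.5], {"target": "seat_A1"})
samples, orientation = extract_directional(audio_file)
```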
  • before the obtaining module 601 is used to obtain media information, the apparatus further includes:
  • the obtaining module 601 is further configured to obtain the directional sound signal
  • the media information production module is configured to use the sound object to carry the directional sound signal when the media information is produced.
  • the audio processing device 600 is one of the constituent devices of a media production system with directional sound, and the media production system with directional sound further includes:
  • a media information production module configured to create the directional sound channel when the media information is produced and add the directional sound signal to the directional sound channel; and/or configured to use the sound object to carry the directional sound signal when the media information is produced.
  • a media information production module, an acquisition module, and a directional transmission module are integrated in a media production system including directional sound.
  • the audio effect with directional sound can be monitored at the same time.
  • a media information production module used for creating the directional sound channel when the media information is produced and adding the directional sound signal to the directional sound channel; and/or used for carrying the directional sound signal with the sound object when the media information is produced.
  • an acquisition module configured to acquire media information, where the media information includes sound orientation information configured for each target position within a preset range, and the sound orientation information is used to determine the transmission direction of the sound signal in the media information;
  • the directional transmission module is configured to transmit sound signals to each of the target positions according to the sound directional information, so that the sound signals received by each target position conform to the corresponding target characteristics.
  • the directional transmission module 602 is used to:
  • a preset separation model is used to analyze the conventional channel information in the media information, so as to divide the conventional channel information into a first multi-channel signal and a second multi-channel signal, where the first multi-channel signal includes one or more conventional channel signals for constructing the directional sound signal, and the second multi-channel signal is a conventional channel signal other than the first multi-channel signal;
  • the directional acoustic signal is loaded into the ultrasonic signal
  • the directional sound signal is sent to each target position.
  • the directional transmission module 602 is used to:
  • one or more conventional objects are selected from the sound objects as objects to be processed, and the sound objects include at least one conventional object;
  • the directional sound signal is determined according to the object to be processed.
  • the audio processing apparatus provided by the embodiment shown in FIG. 6 can execute the method provided by any of the above method embodiments; its specific implementation principles, technical features, technical terms and technical effects are similar and will not be repeated here.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by the present application. As shown in FIG. 7, the electronic device 700 may include: at least one processor 701 and a memory 702. FIG. 7 shows an electronic device with one processor as an example.
  • the memory 702 is used to store programs.
  • the program may include program code, and the program code includes computer operation instructions.
  • Memory 702 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
  • the processor 701 is configured to execute the computer-executable instructions stored in the memory 702 to implement the methods described in the above method embodiments.
  • the processor 701 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • the memory 702 may be independent or integrated with the processor 701 .
  • the electronic device 700 may further include:
  • a bus 703 is used to connect the processor 701 and the memory 702 .
  • the bus may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses can be divided into address buses, data buses, control buses, etc.; this does not mean that there is only one bus or one type of bus.
  • the memory 702 and the processor 701 can communicate through an internal interface.
  • the electronic device 700 may correspond to the servers in the embodiments shown in FIGS. 2 , 4 and 5 , or the media production server and/or the playback server in the embodiment shown in FIG. 3 .
  • the present application also provides an audio processing system, including: a directional speaker and the electronic device shown in FIG. 7 ; wherein,
  • the directional loudspeaker is used to implement the step of transmitting the sound signal to the target position in any one of the possible audio processing methods provided by the above method embodiments.
  • the audio processing system introduces directional sound in the production and playback of media information, which can effectively improve the immersion and interactivity of media information such as sound in movies, or provide help for visually impaired audiences.
  • the visually impaired auxiliary channel is also defined. If there is a visually impaired auxiliary channel signal in the movie, it can also be provided to the visually impaired through directional sound speakers.
  • the present application also provides a computer-readable storage medium
  • the computer-readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media that can store program code.
  • the computer-readable storage medium stores program instructions, and the program instructions are used to implement the methods in the above embodiments.
  • the present application further provides a computer program product, including a computer program; when the computer program is executed by a processor, the methods in the foregoing method embodiments are implemented.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Stereophonic System (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)

Abstract

The present application relates to an audio processing method and apparatus, and a device, a medium and a program product. The audio processing method comprises: acquiring media information, the media information comprising sound orientation information configured for each target position within a preset range, and the sound orientation information being used to determine the transmission direction of a sound signal in the media information; and then transmitting the sound signal to each target position according to the sound orientation information, such that the sound signal received at each target position conforms to a corresponding target characteristic. The technical problem that existing audio/video media technology cannot satisfy the different sound characteristic requirements of users at different positions is solved; and different sounds are transmitted, in a directional manner, to users at specific positions, thereby achieving the technical effect of satisfying the different sound requirements of users.
PCT/CN2021/100382 2021-02-10 2021-06-16 Procédé et appareil de traitement audio, et dispositif, support et produit programme WO2022170716A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110182857.XA CN114915874B (zh) 2021-02-10 2021-02-10 音频处理方法、装置、设备及介质
CN202110182857.X 2021-02-10

Publications (1)

Publication Number Publication Date
WO2022170716A1 true WO2022170716A1 (fr) 2022-08-18

Family

ID=82761462

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/100382 WO2022170716A1 (fr) 2021-02-10 2021-06-16 Procédé et appareil de traitement audio, et dispositif, support et produit programme

Country Status (2)

Country Link
CN (1) CN114915874B (fr)
WO (1) WO2022170716A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115695902A (zh) * 2022-11-07 2023-02-03 百视通网络电视技术发展有限责任公司 盲人无障碍电影音频处理方法、装置及存储介质
CN115604647B (zh) * 2022-11-28 2023-03-10 北京天图万境科技有限公司 一种超声波感知全景的方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140254840A1 (en) * 2012-08-16 2014-09-11 Parametric Sound Corporation Hearing enhancement systems and methods
CN104469595A (zh) * 2014-10-30 2015-03-25 苏州上声电子有限公司 一种基于误差模型的多区域声重放方法和装置
CN104902388A (zh) * 2015-05-06 2015-09-09 苏州上声电子有限公司 用于实现多区域音量差异的声重放方法及系统
CN106231503A (zh) * 2016-09-19 2016-12-14 清华大学 一种用于车内分区域控制的音频系统及控制方法
CN107592588A (zh) * 2017-07-18 2018-01-16 科大讯飞股份有限公司 声场调节方法及装置、存储介质、电子设备

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3157717A1 (fr) * 2011-07-01 2013-01-10 Dolby Laboratories Licensing Corporation Systeme et procede pour generation, codage et rendu de signal audio adaptatif
US20150078595A1 (en) * 2013-09-13 2015-03-19 Sony Corporation Audio accessibility
JP6905824B2 (ja) * 2016-01-04 2021-07-21 ハーマン ベッカー オートモーティブ システムズ ゲーエムベーハー 非常に多数のリスナのための音響再生
CN109218859A (zh) * 2017-06-29 2019-01-15 长城汽车股份有限公司 车载定向音响系统、控制方法及车辆
US10063972B1 (en) * 2017-12-30 2018-08-28 Wipro Limited Method and personalized audio space generation system for generating personalized audio space in a vehicle
CN209487146U (zh) * 2018-01-19 2019-10-11 深圳市编际智能科技有限公司 一种声频定向家庭影院系统
CN109040907A (zh) * 2018-09-11 2018-12-18 戴姆勒股份公司 车内声音分区控制系统
CN110234048B (zh) * 2019-02-02 2021-07-27 上海蔚来汽车有限公司 车内声音分区控制装置和方法、控制器及介质

Also Published As

Publication number Publication date
CN114915874A (zh) 2022-08-16
CN114915874B (zh) 2023-07-25

Similar Documents

Publication Publication Date Title
US10952009B2 (en) Audio parallax for virtual reality, augmented reality, and mixed reality
US9124966B2 (en) Image generation for collaborative sound systems
US9622007B2 (en) Method and apparatus for reproducing three-dimensional sound
CN104869335B (zh) 用于局域化感知音频的技术
RU2661775C2 (ru) Передача сигнальной информации рендеринга аудио в битовом потоке
JP5865899B2 (ja) 立体音響の再生方法及び装置
JP6246922B2 (ja) 音響信号処理方法
WO2022170716A1 (fr) Procédé et appareil de traitement audio, et dispositif, support et produit programme
JPWO2019078035A1 (ja) 信号処理装置および方法、並びにプログラム
US11221820B2 (en) System and method for processing audio between multiple audio spaces
US20190007782A1 (en) Speaker arranged position presenting apparatus
US11122386B2 (en) Audio rendering for low frequency effects
CN114128312B (zh) 用于低频效果的音频渲染
Ando Preface to the Special Issue on High-reality Audio: From High-fidelity Audio to High-reality Audio
WO2024014390A1 (fr) Procédé de traitement de signal acoustique, procédé de génération d'informations, programme informatique et dispositif de traitement de signal acoustique
O’Dwyer Sound Source Localization and Virtual Testing of Binaural Audio
Stewart Spatial auditory display for acoustics and music collections
JP2022128177A (ja) 音声生成装置、音声再生装置、音声再生方法、及び音声信号処理プログラム
CN115167803A (zh) 一种音效的调节方法、装置、电子设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21925370

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21925370

Country of ref document: EP

Kind code of ref document: A1