CN109246580B - 3D sound effect processing method and related product - Google Patents


Info

Publication number
CN109246580B
CN109246580B · CN201811118269A · CN201811118269.4A
Authority
CN
China
Prior art keywords
determining
sound
target
channel data
reverberation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811118269.4A
Other languages
Chinese (zh)
Other versions
CN109246580A (en)
Inventor
严锋贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201811118269.4A priority Critical patent/CN109246580B/en
Publication of CN109246580A publication Critical patent/CN109246580A/en
Priority to PCT/CN2019/090380 priority patent/WO2020062922A1/en
Application granted granted Critical
Publication of CN109246580B publication Critical patent/CN109246580B/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04S - STEREOPHONIC SYSTEMS
    • H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
    • H04S 7/30 - Control circuits for electronic adaptation of the sound field
    • H04S 7/302 - Electronic adaptation of stereophonic sound system to listener position or orientation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04S - STEREOPHONIC SYSTEMS
    • H04S 2400/00 - Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/01 - Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

The embodiment of the application discloses a 3D sound effect processing method and a related product. The method comprises the following steps: determining a three-dimensional coordinate of each sound source in a plurality of sound sources corresponding to the electronic device and the mono data generated by each sound source, to obtain a plurality of first three-dimensional coordinates and a plurality of pieces of mono data; determining a second three-dimensional coordinate of a target object corresponding to the electronic device; and synthesizing the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target two-channel data. By adopting the method and apparatus, the playing effect of audio data can be improved.

Description

3D sound effect processing method and related product
Technical Field
The application relates to the technical field of audio playing, in particular to a 3D sound effect processing method and a related product.
Background
With the diversification of electronic device functions and their increasing portability, more and more people enjoy entertainment through electronic devices; in particular, users can listen to songs and watch videos anytime and anywhere as needed.
In the existing audio playing mode, the volume set by the user is usually taken as the basis, and the sounding body plays audio at the constant power corresponding to that volume, so that the played sound meets the user's loudness requirement. However, such a playing mode is too uniform in form and tends to cause sensory fatigue for the user.
Disclosure of Invention
The embodiment of the application provides a 3D sound effect processing method and a related product, which can improve the playing effect of audio data.
In a first aspect, an embodiment of the present application provides a 3D sound effect processing method, including:
determining a three-dimensional coordinate of each sound source in a plurality of sound sources corresponding to the electronic equipment and single channel data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of single channel data;
determining a second three-dimensional coordinate of a target object corresponding to the electronic equipment;
and synthesizing the plurality of pieces of single-channel data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target two-channel data.
In a second aspect, an embodiment of the present application provides a 3D sound effect processing apparatus, including:
a determining unit, configured to determine a three-dimensional coordinate of each sound source in a plurality of sound sources corresponding to the electronic device and the single-channel data generated by each sound source, to obtain a plurality of first three-dimensional coordinates and a plurality of pieces of single-channel data; and to determine a second three-dimensional coordinate of a target object corresponding to the electronic device;
and a synthesizing unit, configured to synthesize the plurality of pieces of single-channel data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target two-channel data.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for performing some or all of the steps described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program that causes a computer to perform some or all of the steps described in the first aspect of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product, where the computer program product comprises a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform some or all of the steps as described in the first aspect of embodiments of the present application. The computer program product may be a software installation package.
The embodiment of the application has the following beneficial effects:
the method comprises the steps of determining a three-dimensional coordinate of each sound source in a plurality of sound sources corresponding to the electronic equipment and mono data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of mono data, determining a second three-dimensional coordinate of a target object corresponding to the electronic equipment, and synthesizing the plurality of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target double-channel data. Therefore, after the first three-dimensional coordinates of the sound sources and the second three-dimensional coordinates of the target object are determined, the target two-channel data corresponding to the sound sources are generated, the playing effect of the audio data is improved, the immersion feeling can be generated, and the user experience is improved conveniently.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings according to these drawings without creative effort.
Wherein:
fig. 1A is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
fig. 1B is a scene schematic diagram of coordinate axes of an electronic device according to an embodiment of the present disclosure;
fig. 2A is a schematic flowchart of a 3D sound effect processing method according to an embodiment of the present disclosure;
fig. 2B is a scene diagram of multi-channel binaural data according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a 3D sound effect processing device according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of another electronic device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The electronic device according to the embodiment of the present application may include various handheld devices (e.g., smart phones), vehicle-mounted devices, Virtual Reality (VR)/Augmented Reality (AR) devices, wearable devices, computing devices or other processing devices connected to wireless modems, and various forms of User Equipment (UE), Mobile Stations (MSs), terminal devices (terminal devices), development/test platforms, servers, and so on, which have wireless communication functions. For convenience of description, the above-mentioned devices are collectively referred to as electronic devices.
Referring to fig. 1A, fig. 1A is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, where the electronic device includes a control circuit and an input-output circuit, and the input-output circuit is connected to the control circuit.
The control circuitry may include, among other things, storage and processing circuitry. The storage circuit in the storage and processing circuit may be a memory, such as a hard disk drive memory, a non-volatile memory (e.g., a flash memory or other electronically programmable read only memory used to form a solid state drive, etc.), a volatile memory (e.g., a static or dynamic random access memory, etc.), etc., and the embodiments of the present application are not limited thereto. Processing circuitry in the storage and processing circuitry may be used to control the operation of the electronic device. The processing circuitry may be implemented based on one or more microprocessors, microcontrollers, digital signal processors, baseband processors, power management units, audio codec chips, application specific integrated circuits, display driver integrated circuits, and the like.
The storage and processing circuitry may be used to run software in the electronic device, such as an application for playing an incoming-call alert ring, an application for playing a short-message alert ring, an application for playing an alarm alert ring, an application for playing media files, a Voice over Internet Protocol (VoIP) telephone call application, operating system functions, and so forth. The software may be used to perform some control operations, such as playing an incoming-call alert ring, playing a short-message alert ring, playing an alarm alert ring, playing a media file, making a voice telephone call, and performing other functions in the electronic device; the embodiments of the present application are not limited in this respect.
The input-output circuit can be used to enable the electronic device to input and output data, that is, to allow the electronic device to receive data from an external device and to output data from the electronic device to an external device.
The input-output circuit may further include a sensor. The sensors may include ambient light sensors, optical and capacitive based infrared proximity sensors, ultrasonic sensors, touch sensors (e.g., optical based touch sensors and/or capacitive touch sensors, where the touch sensors may be part of a touch display screen or may be used independently as a touch sensor structure), acceleration sensors, gravity sensors, and other sensors, etc. The input-output circuit may further include audio components that may be used to provide audio input and output functionality for the electronic device. The audio components may also include a tone generator and other components for generating and detecting sound.
In the embodiment of the present application, the sensors further include a three-axis acceleration sensor, which measures spatial acceleration and is used to measure the attitude and tilt angle of the electronic device. Besides automatically switching between landscape and portrait display orientations, it can be used for motion offset compensation calculation when the Global Positioning System (GPS) signal is poor, and can comprehensively and accurately reflect the motion of the device.
Referring to fig. 1B, fig. 1B is a schematic view of a scene in which the three-axis acceleration sensor determines the coordinate axes of the electronic device. As shown in fig. 1B, the x-axis, the y-axis, and the z-axis are all defined relative to the position of the electronic device body: generally the y-axis points upward along the body, the x-axis points to the right of the body, and the z-axis is perpendicular to the front of the body and aligned with the direction of gravity. The lateral, longitudinal and vertical components are generally projections of a unit of gravitational force (magnitude 1 g, about 9.8 m/s², directed vertically downward) onto the respective axes. The lateral component corresponds to the value on the x-axis, the longitudinal component corresponds to the value on the y-axis, and the vertical component corresponds to the value on the z-axis; the lateral tilt angle is the included angle between the x-axis and the horizontal plane, and the longitudinal tilt angle is the included angle between the y-axis and the horizontal plane.
For example: when the electronic device lies flat on a desktop, the x-axis defaults to 0, the y-axis defaults to 0, and the z-axis defaults to 9.81; when the electronic device is placed face down on the desktop, the z-axis is -9.81; when the electronic device is tilted to the left, the x-axis is a positive value; when tilted to the right, the x-axis is a negative value; when tilted upwards, the y-axis is a negative value; when tilted downwards, the y-axis is a positive value; and a z-axis value smaller than -3 is regarded as the touch display screen of the electronic device facing downwards.
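For illustration only (this sketch is not part of the patent text), the sign conventions and the "z smaller than -3 means screen facing down" rule above can be expressed as a small Python check; the +/-1.0 tilt thresholds and the function name are assumptions.

```python
# Illustrative sketch: infer a coarse device orientation from three-axis
# accelerometer readings (in m/s^2), following the sign conventions and the
# z < -3 "screen facing down" rule described above. Thresholds are assumptions.

def coarse_orientation(ax: float, ay: float, az: float) -> str:
    """Return a rough orientation label from one accelerometer sample."""
    if az < -3.0:
        return "screen facing down"       # per the rule described above
    if ax > 1.0:
        return "tilted left"              # positive x component
    if ax < -1.0:
        return "tilted right"             # negative x component
    if ay < -1.0:
        return "tilted up"
    if ay > 1.0:
        return "tilted down"
    return "lying flat, screen up"        # x ~ 0, y ~ 0, z ~ +9.81

print(coarse_orientation(0.0, 0.0, 9.81))   # -> "lying flat, screen up"
```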
The input-output circuitry may also include one or more display screens. The display screen can comprise one or a combination of a liquid crystal display screen, an organic light emitting diode display screen, an electronic ink display screen, a plasma display screen and a display screen using other display technologies. The display screen may include an array of touch sensors (i.e., the display screen may be a touch display screen). The touch sensor may be a capacitive touch sensor formed by a transparent touch sensor electrode (e.g., an Indium Tin Oxide (ITO) electrode) array, or may be a touch sensor formed using other touch technologies, such as acoustic wave touch, pressure sensitive touch, resistive touch, optical touch, and the like, and the embodiments of the present application are not limited thereto.
The input-output circuitry may further include communications circuitry that may be used to provide the electronic device with the ability to communicate with external devices. The communication circuitry may include analog and digital input-output interface circuitry, and wireless communication circuitry based on radio frequency signals and/or optical signals. The wireless communication circuitry in the communication circuitry may include radio frequency transceiver circuitry, power amplifier circuitry, low noise amplifiers, switches, filters, and antennas. For example, the wireless communication circuitry in the communication circuitry may include circuitry to support Near Field Communication (NFC) by transmitting and receiving near field coupled electromagnetic signals. For example, the communication circuit may include a near field communication antenna and a near field communication transceiver. The communications circuitry may also include cellular telephone transceiver and antennas, wireless local area network transceiver circuitry and antennas, and so forth.
The input-output circuit may further include an input-output unit. Input-output units may include buttons, joysticks, click wheels, scroll wheels, touch pads, keypads, keyboards, cameras, light emitting diodes and other status indicators, and the like.
The electronic device may further include a battery (not shown) for supplying power to the electronic device.
The following describes embodiments of the present application in detail.
The embodiment of the application provides a 3D sound effect processing method and a related product, which can improve the playing effect of audio data.
Referring to fig. 2A, an embodiment of the present application provides a flow diagram of a 3D sound effect processing method, which is applied to an electronic device. Specifically, as shown in fig. 2A, a 3D sound effect processing method includes:
s201: the method comprises the steps of determining the three-dimensional coordinates of each sound source in a plurality of sound sources corresponding to the electronic equipment and the single sound channel data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of single sound channel data.
The embodiment of the application can be applied to virtual reality/augmented reality scenes or 3D recording scenes. In the embodiment of the present application, the sound source may be a sounding body in a virtual scene, for example, an airplane in a game scene, and the sound source may be a fixed sound source or a mobile sound source. As shown in the coordinate axis of the electronic device in fig. 1B, the sound source corresponding to the electronic device may use the coordinate axis as a reference, determine a first three-dimensional coordinate corresponding to the sound source, and when the sound source makes a sound, may obtain monaural data generated by the original source.
The electronic device may include a plurality of sound sources, for example, the sound source of the game scene includes an airplane, a gun, river water, etc., and the corresponding monaural data is the glide sound of the airplane, the sound emitted by the gun when the gun is loaded, and the stream sound of the river water; the monophonic data may also include a game player, and the corresponding monophonic data is a footstep sound, a voice sound, etc. of the game player, which is not limited herein.
In one possible example, the determining three-dimensional coordinates of each of a plurality of sound sources corresponding to the electronic device and mono data generated by each of the sound sources to obtain a plurality of first three-dimensional coordinates and a plurality of mono data includes: determining a plurality of reference objects corresponding to the electronic equipment; determining behavior information of each reference object in the plurality of reference objects to obtain a plurality of behavior information; selecting the plurality of sound sources from the plurality of reference objects according to the plurality of behavior information; determining a coordinate position corresponding to each sound source in the plurality of sound sources to obtain a plurality of first three-dimensional coordinates; and determining the single-channel data of each sound source according to the behavior information corresponding to the sound source to obtain the multiple single-channel data.
Wherein, the reference object may be an object presented in a display page of the electronic device, such as: displaying houses in the page, cars in which game players ride, guns held; or may be an object not presented on the display page, such as: nearby players, firearms, vehicles, etc.
The behavior information is dynamic information of the reference object. It will be appreciated that different reference objects correspond to different types of behavior; for example, a gun may produce a loading sound or a gunshot, but not a water-flow sound or speech. Each reference object also produces different sounds under different behavior information, such as: a house makes no sound under normal conditions but makes an impact sound when shot; a car produces a starting sound when started and a running sound while driving; and a game player produces a voice sound when speaking, footstep sounds when walking, and so on.
That is to say, a plurality of reference objects corresponding to the electronic device are determined, then the behavior information of each reference object is obtained, and whether the reference object is a sound source is determined according to each behavior information, so that the accuracy of determining the sound source is improved. And then, further determining the coordinate position of each sound source to obtain a plurality of first three-dimensional coordinates, and determining the monaural data of the sound source according to the behavior information corresponding to each sound source, so that the accuracy of determining the monaural data can be improved according to the behavior information.
The coordinate position corresponding to a sound source is not limited in the present application. For example, if the application corresponds to a game scene and the game scene corresponds to a three-dimensional map, the coordinate positions corresponding to different sound sources can be determined according to the map, that is, the first three-dimensional coordinates are determined according to the specific positions of the characters. This improves the accuracy of determining the first three-dimensional coordinates, which in turn improves the 3D sound effect of the target two-channel data, so that the user feels present in the scene when playing a game and the game world appears more vivid.
How the mono data is determined according to the behavior information is also not limited in this application. If the plurality of behavior information includes target behavior information corresponding to a target sound source in the plurality of sound sources, then in a possible example, the determining the mono data of each sound source according to the behavior information corresponding to the sound source to obtain the plurality of pieces of mono data includes: determining a sound type and playing parameters corresponding to the target behavior information; and generating the mono data of the target sound source according to the sound type and the playing parameters.
The sound type is the type of sound corresponding to the target behavior information. For example, a gun corresponds to sound types such as loading, shooting, and being hit; the playing parameters include loudness, frequency, tone, and the like.
It can be understood that, taking a target sound source as an example, the sound type and the playing parameter of the target sound source are determined according to the target behavior information corresponding to the target sound source, and then the monaural data of the target sound source is generated according to the sound type and the playing parameter.
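As a concrete illustration of the behavior-to-sound mapping just described, the following Python sketch is not part of the patent; the lookup table, parameter values, and the placeholder tone synthesis are assumptions standing in for real audio assets.

```python
import numpy as np

# Illustrative sketch: map a sound source's behavior information to a sound
# type and playing parameters, then generate mono data. Table contents and
# the synthesized tone are assumptions for illustration only.
BEHAVIOR_TABLE = {
    "gun_loading":  {"sound_type": "loading",  "loudness": 0.6, "freq_hz": 600.0},
    "gun_shooting": {"sound_type": "shooting", "loudness": 1.0, "freq_hz": 200.0},
    "player_walk":  {"sound_type": "footstep", "loudness": 0.3, "freq_hz": 120.0},
}

def mono_from_behavior(behavior: str, sr: int = 48000, dur_s: float = 0.5) -> np.ndarray:
    """Generate placeholder mono samples for a behavior (stand-in for real assets)."""
    params = BEHAVIOR_TABLE[behavior]
    t = np.arange(int(sr * dur_s)) / sr
    return params["loudness"] * np.sin(2 * np.pi * params["freq_hz"] * t)

mono = mono_from_behavior("gun_shooting")   # mono data for the target sound source
```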
S202: and determining a second three-dimensional coordinate of the target object corresponding to the electronic equipment.
In this embodiment of the application, the target object may be the game player corresponding to the electronic device in a game, virtual reality, or augmented reality scene, or may be the target user corresponding to the electronic device in a 3D recording scene, and so on. The target object also corresponds to a three-dimensional coordinate, namely the second three-dimensional coordinate; of course, the first three-dimensional coordinates and the second three-dimensional coordinate are different positions. The method for determining the second three-dimensional coordinate may refer to the method for determining the first three-dimensional coordinates, that is, the second three-dimensional coordinate of the target object is determined with the coordinate axes shown in fig. 1B as a reference, which is not described here again.
S203: synthesizing the plurality of pieces of single-channel data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target two-channel data.
Specifically, the plurality of first three-dimensional coordinates, the second three-dimensional coordinates, and the plurality of monaural data may be input to the HRTF model to obtain the target binaural data.
In a specific implementation, the electronic device in the embodiment of the present application may filter the audio data (the sound emitted by a sound source) with a Head Related Transfer Function (HRTF) filter to obtain virtual surround sound, also called surround sound or panoramic sound, so as to achieve a three-dimensional stereo effect. In the time domain, the HRTF corresponds to the Head Related Impulse Response (HRIR); alternatively, the audio data may be convolved with a Binaural Room Impulse Response (BRIR), which consists of three parts: direct sound, early reflections, and reverberation.
For example, the electronic device may generate the left and right channels according to the spatial three-dimensional coordinate position (X, Y, Z) of the sound source, which may be any coordinate, and the mono data generated by the sound source. The underlying principle is that binaural sound is produced from the distance between the sound source at (X, Y, Z) and the listener, namely the time difference and the pressure difference between the data transmitted from a single point to the left ear and the data transmitted to the right ear.
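The following Python sketch is not part of the patent; it only illustrates the principle just described, generating left and right channels from the source-to-ear distances via a time difference (delay) and a pressure difference (attenuation). A production implementation would instead convolve the mono data with measured HRIR/BRIR filters; the constant and function names here are assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed

def render_binaural(mono, src, left_ear, right_ear, sr=48000):
    """Very simplified binaural rendering: delay and attenuate the mono signal
    separately for each ear based on the source-to-ear distance (time and
    pressure differences only; no HRIR/BRIR convolution)."""
    mono = np.asarray(mono, float)

    def one_ear(ear):
        d = np.linalg.norm(np.asarray(src, float) - np.asarray(ear, float))
        delay = int(round(d / SPEED_OF_SOUND * sr))   # time of flight in samples
        gain = 1.0 / max(d, 1.0)                      # crude distance attenuation
        out = np.zeros(len(mono) + delay)
        out[delay:] = gain * mono
        return out

    left, right = one_ear(left_ear), one_ear(right_ear)
    n = max(len(left), len(right))
    return np.vstack([np.pad(left, (0, n - len(left))),
                      np.pad(right, (0, n - len(right)))])   # shape (2, n)
```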
In one possible example, the synthesizing the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain the target two-channel data includes: determining a left ear three-dimensional coordinate and a right ear three-dimensional coordinate corresponding to the second three-dimensional coordinate; determining transmission paths between the sound sources and the target object according to the first three-dimensional coordinates, the left ear three-dimensional coordinate and the right ear three-dimensional coordinate to obtain a plurality of transmission paths; determining the time and pressure at which each piece of mono data in the plurality of pieces of mono data is transmitted to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate according to the plurality of transmission paths, to obtain a plurality of times and a plurality of pressures; determining a time difference corresponding to each piece of mono data according to the plurality of times to obtain a plurality of time differences, and determining a pressure difference corresponding to each piece of mono data according to the plurality of pressures to obtain a plurality of pressure differences; determining a delay parameter corresponding to each piece of mono data according to the plurality of time differences and the plurality of pressure differences to obtain a plurality of delay parameters; processing each piece of mono data according to the plurality of delay parameters to obtain corresponding left channel parameters and right channel parameters; and synthesizing the plurality of pieces of mono data according to the left channel parameter and the right channel parameter corresponding to each piece of mono data to obtain the target two-channel data.
The target object corresponds to a left ear three-dimensional coordinate and a right ear three-dimensional coordinate. How these are determined is not limited; they can be determined according to a 3D character model of the target object, that is, according to association relations, preset in the 3D character model, between the second three-dimensional coordinate of the target object and the left ear three-dimensional coordinate and the right ear three-dimensional coordinate.
The time difference and the pressure difference are respectively the time difference and the pressure difference transmitted to the three-dimensional coordinates of the left ear and the three-dimensional coordinates of the right ear, that is, the time difference and the pressure difference transmitted to the left ear and the right ear of the target object by the monaural data corresponding to the sound source.
It can be understood that, because there is a certain distance between the left ear and the right ear, and a pressure difference arises when sound propagates in the air, the left ear three-dimensional coordinate and the right ear three-dimensional coordinate corresponding to the second three-dimensional coordinate are determined first. Then the transmission paths between the plurality of sound sources and the target object are determined according to the first three-dimensional coordinates, the left ear three-dimensional coordinate and the right ear three-dimensional coordinate, to obtain a plurality of transmission paths, and the time and pressure at which each piece of mono data is transmitted from its first three-dimensional coordinate to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate are determined from these transmission paths, to obtain a plurality of times and a plurality of pressures. Next, the time difference of each piece of mono data is determined from the plurality of times, the pressure difference of each piece of mono data is determined from the plurality of pressures, and the delay parameter with which the corresponding mono data reaches the left ear and the right ear of the target object is determined from each time difference and each pressure difference. The left channel parameter and the right channel parameter corresponding to each piece of mono data are then determined according to the delay parameters, each piece of mono data is processed according to its left channel parameter and right channel parameter to obtain processed two-channel data, and the resulting two-channel data are synthesized to obtain the target two-channel data. That is to say, the delay parameters corresponding to each sound source are determined from the first three-dimensional coordinates and the second three-dimensional coordinate, the left channel parameters and right channel parameters of the mono data are determined from the delay parameters, and the target two-channel data is then obtained by synthesis, thereby improving the playing effect of the audio data, producing a sense of immersion, and helping to improve the user experience.
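Continuing the earlier sketch (and reusing its render_binaural helper, which is itself an assumption), the per-source two-channel signals can be summed into the target two-channel data roughly as follows; the peak normalization at the end is an illustrative choice, not part of the described method.

```python
import numpy as np

def mix_sources_to_binaural(rendered):
    """Sum per-source two-channel signals (each shaped (2, n_i)) into the
    target two-channel data, padding to the longest source."""
    n = max(sig.shape[1] for sig in rendered)
    out = np.zeros((2, n))
    for sig in rendered:
        out[:, :sig.shape[1]] += sig
    peak = np.max(np.abs(out))
    return out / peak if peak > 1.0 else out   # simple peak normalization (assumption)
```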
The method for determining the time and the pressure is not limited in the present application. The following description takes as an example the time and pressure at which the mono data of the target sound source is transmitted to the left ear three-dimensional coordinate; the time and pressure at which the target mono data reaches the right ear three-dimensional coordinate, and the times and pressures at which the mono data of the other sound sources reach the left ear and right ear three-dimensional coordinates, can be determined with reference to this example.
Specifically, if the plurality of sound sources include a target sound source, the plurality of first three-dimensional coordinates include a target first three-dimensional coordinate corresponding to the target sound source, and the plurality of monaural data include target monaural data corresponding to the target sound source; in one possible example, the determining, according to the plurality of transmission paths, a time and a pressure at which each of the plurality of monaural data is transmitted to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate to obtain a plurality of times and a plurality of pressures comprises: obtaining a cross section by taking the target first three-dimensional coordinate and the left ear three-dimensional coordinate as axes; determining an occluding object between the target first three-dimensional coordinate and the left ear three-dimensional coordinate; determining a plurality of reference transmission paths for transmitting the target single-channel data to the three-dimensional coordinates of the left ear according to the cross section and the shielding object; and determining the time and pressure for transmitting the target single-channel data to the three-dimensional coordinates of the left ear according to the plurality of reference transmission paths.
Since sound propagates in all directions in a real-world environment, and reflection, refraction, interference, diffraction, and so on also occur during propagation, the propagation of the target mono data may include multiple reference transmission paths. As shown in fig. 2B, when a cross section is formed by taking the target first three-dimensional coordinate and the left ear three-dimensional coordinate as the axis, the propagation paths have a certain symmetry about that axis because the sound propagation direction is fixed, and a plurality of transmission paths can be obtained. In addition, sound is scattered when it encounters an occluding object, so the corresponding reference transmission paths are determined according to the cross section and the occluding objects in the application scene.
It can be understood that a cross section of the sound source data transmission is determined by taking the target first three-dimensional coordinate and the left ear three-dimensional coordinate as the axis, a plurality of reference transmission paths corresponding to the target mono data are determined according to the cross section and the occluding object, and the time and pressure at which the target sound source is transmitted to the left ear of the target object are determined from the plurality of reference transmission paths. That is to say, the possible reference transmission paths are determined according to the cross section defined by the target first three-dimensional coordinate and the left ear three-dimensional coordinate, and the time and pressure are then determined according to the plurality of reference transmission paths, which improves the accuracy of determining the time and pressure.
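The sketch below is only a loose geometric illustration of deriving several reference transmission paths from the source-to-left-ear cross section and an occluding object; modeling the occluder as a sphere and taking two detour points around it are assumptions, not the patent's construction.

```python
import numpy as np

def reference_path_lengths(src, ear, occluder_center, occluder_radius):
    """Return candidate ("reference") transmission path lengths from the source
    to one ear: the direct path when it is not blocked by the occluder
    (modeled as a sphere, an assumption), otherwise two detour paths that
    skirt the occluder within the source-ear cross section."""
    src, ear, c = (np.asarray(p, float) for p in (src, ear, occluder_center))
    d = ear - src
    t = np.clip(np.dot(c - src, d) / np.dot(d, d), 0.0, 1.0)
    closest = src + t * d                    # nearest point on the segment to the occluder
    blocked = np.linalg.norm(closest - c) < occluder_radius
    if not blocked:
        return [np.linalg.norm(d)]
    # Detour via two edge points of the occluder, along the in-plane normal
    # from the occluder center toward the segment (simplified geometry).
    normal = closest - c
    normal = normal / (np.linalg.norm(normal) + 1e-9)
    paths = []
    for sign in (+1.0, -1.0):
        edge = c + sign * occluder_radius * normal
        paths.append(np.linalg.norm(edge - src) + np.linalg.norm(ear - edge))
    return paths
```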
In one possible example, the determining the time and pressure at which the target mono data is transmitted to the left ear three-dimensional coordinates according to the plurality of reference transmission paths comprises: determining the sound intensity and the sound pressure corresponding to each transmission path in the plurality of reference transmission paths to obtain a plurality of sound intensities and a plurality of sound pressures; and determining the time and the pressure for transmitting the target single-channel data to the three-dimensional coordinates of the left ear according to the sound intensities and the sound pressures.
Sound intensity refers to the sound energy passing per unit time through a unit area perpendicular to the propagation direction, in W/m². Sound pressure is the increase in pressure due to the presence of sound waves, in Pa.
The method for determining the pressure from the multiple sound pressures is not limited. For example, a preset weight value may be determined according to the relative distance of the corresponding reference transmission path, and the multiple sound pressures are then weighted by the corresponding preset weight values to obtain the pressure.
It can be understood that each sound source data has a corresponding sound pressure, so that the sound pressure corresponding to each reference transmission path is determined to obtain a plurality of sound pressures, and then the pressure of the target monaural data transmitted to the three-dimensional coordinate of the left ear is determined according to the plurality of sound pressures, so that the accuracy of determining the pressure can be improved.
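A minimal sketch of combining the candidate reference transmission paths into a single arrival time and pressure for one ear, assuming inverse-distance preset weights and free-field 1/r pressure decay; the specific weighting scheme is an assumption consistent with, but not identical to, the description above.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed

def time_and_pressure(path_lengths, source_pressure_pa=1.0):
    """Combine candidate reference transmission paths into one arrival time and
    pressure: per-path travel times and 1/r pressures, averaged with preset
    weights derived from the relative path lengths (shorter paths weigh more)."""
    lengths = np.asarray(path_lengths, float)
    times = lengths / SPEED_OF_SOUND                            # per-path travel times
    pressures = source_pressure_pa / np.maximum(lengths, 1e-3)  # 1/r decay (assumption)
    weights = 1.0 / lengths
    weights = weights / weights.sum()                           # preset weights by relative distance
    return float(np.dot(weights, times)), float(np.dot(weights, pressures))
```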
In the 3D sound effect processing method as shown in fig. 2A, a three-dimensional coordinate of each of a plurality of sound sources corresponding to an electronic device and mono data generated by each sound source are determined to obtain a plurality of first three-dimensional coordinates and a plurality of mono data, a second three-dimensional coordinate of a target object corresponding to the electronic device is determined, and then the plurality of mono data are synthesized according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target binaural data. Therefore, after the first three-dimensional coordinates of the sound sources and the second three-dimensional coordinates of the target object are determined, the target two-channel data corresponding to the sound sources are generated, the playing effect of the audio data is improved, the immersion feeling can be generated, and the user experience is improved conveniently.
In one possible example, the method further comprises: determining a preference reverberation parameter corresponding to the target object; and processing the target binaural data according to the preference reverberation parameter to obtain reverberation binaural data.
The preference reverberation parameter includes an input volume, a low-frequency cut, a high-frequency cut, an early reflection time, a spatial extent, a diffusion degree, a low mixing ratio, a frequency dividing point, a reverberation time, a high-frequency attenuation point, a dry sound adjustment, a reverberation volume, an early reflection sound volume, a sound field width, an output sound field, a tail sound, and the like, which is not limited herein.
It can be understood that the preference reverberation parameter corresponding to the target object is determined, and the target binaural data is then processed according to the preference reverberation parameter to obtain the reverberation binaural data. In this way the target binaural data is processed according to the target object, which can further improve the playing effect of the audio data and helps improve the user experience.
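For illustration, here is a toy reverberation pass driven by two of the preference parameters listed above (reverberation time and the reverberation volume, expressed as a wet mix); this simple feedback-comb effect is an assumption standing in for whatever reverberation engine an implementation would actually use.

```python
import numpy as np

def apply_preference_reverb(binaural, sr=48000, reverb_time_s=1.2, wet=0.3,
                            delay_s=0.03):
    """Apply a very simple feedback-comb reverberation to each channel of the
    target two-channel data. Toy effect for illustration only; not the full
    preference-parameter set described in the text."""
    delay = int(sr * delay_s)
    # Feedback gain chosen so the comb decays by 60 dB over reverb_time_s.
    g = 10 ** (-3.0 * delay_s / reverb_time_s)
    dry = np.asarray(binaural, float)
    out = dry.copy()
    for ch in range(out.shape[0]):
        for n in range(delay, out.shape[1]):
            out[ch, n] += g * out[ch, n - delay]
    return (1.0 - wet) * dry + wet * out
```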
The application does not limit how to determine the preference reverberation parameter, and in a possible example, the determining the preference reverberation parameter corresponding to the target object includes: acquiring a plurality of pre-stored historical reverberation play records corresponding to the target object; obtaining a plurality of listening parameters corresponding to each historical reverberation playing record in the plurality of historical reverberation playing records; determining the preferred reverberation parameter from the plurality of listening parameters.
The listening parameters comprise audio type, playing time, playing adjustment times, user mood parameters and the like. It can be understood that a plurality of prestored historical reverberation play records corresponding to the target object are obtained, the listening parameters corresponding to the historical reverberation play records are obtained, and then the preference reverberation parameters are determined according to the listening parameters, so that the accuracy of determining the preference reverberation parameters is improved, and the playing effect is improved conveniently.
The present application does not limit how to determine the preferred reverberation parameter according to the plurality of listening parameters, and in one possible example, the determining the preferred reverberation parameter according to the plurality of listening parameters includes: determining an evaluation value corresponding to each historical reverberation play record in the plurality of historical reverberation play records according to the plurality of listening parameters to obtain a plurality of evaluation values; taking the historical reverberation record corresponding to the maximum value in the plurality of evaluation values as a target historical reverberation record; and taking the reverberation parameter corresponding to the target historical reverberation record as the preference reverberation parameter.
If the plurality of historical reverberation play records include a target historical reverberation play record, then taking the target historical reverberation play record as an example: an application scene of the electronic device is determined; a preset mood parameter corresponding to the application scene is determined; and an evaluation value corresponding to the target historical reverberation record is determined according to the difference between the preset mood parameter and the user mood parameter.
It can be understood that the preset mood parameters corresponding to the target object are different in different application scenes, so that the preset mood parameters of the target object are determined according to the application scenes corresponding to the electronic device, and then the evaluation value corresponding to the target historical reverberation record is determined according to the difference value between the preset mood parameters and the user mood parameters. In the application, the evaluation value corresponding to each historical reverberation record is determined according to the listening parameter of each historical reverberation record, then the historical reverberation record corresponding to the largest evaluation value is selected as the target historical reverberation record, and the preference reverberation parameter is determined according to the listening parameter of the target historical reverberation record, so that the accuracy of determining the preference reverberation parameter is improved.
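A minimal sketch of the record-scoring idea described above, assuming a per-scene preset mood table and an evaluation value based on the gap between the preset and recorded user mood parameters; the field names and scene table are hypothetical.

```python
# Illustrative sketch of selecting the preference reverberation parameter from
# historical reverberation play records. Record fields, the scene table, and
# the evaluation formula are assumptions for illustration.
PRESET_MOOD = {"game": 0.8, "music": 0.5, "video": 0.6}   # per application scene

def pick_preference_reverb(history, scene: str):
    """Score each historical record and return the reverberation parameters of
    the highest-scoring (target) record."""
    preset = PRESET_MOOD.get(scene, 0.5)

    def score(record):
        # Smaller gap between preset mood and recorded user mood -> higher score.
        return -abs(preset - record["user_mood"])

    target = max(history, key=score)
    return target["reverb_params"]

history = [
    {"user_mood": 0.7, "reverb_params": {"reverb_time_s": 1.2, "wet": 0.3}},
    {"user_mood": 0.2, "reverb_params": {"reverb_time_s": 0.6, "wet": 0.1}},
]
print(pick_preference_reverb(history, "game"))   # -> the first record's parameters
```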
Referring to fig. 3, fig. 3 is a schematic structural diagram of a 3D sound effect processing apparatus according to an embodiment of the present application, and as shown in fig. 3, the 3D sound effect processing apparatus 300 includes a determining unit 301 and a synthesizing unit 302, wherein:
the determining unit 301 is configured to determine a three-dimensional coordinate of each of a plurality of sound sources corresponding to the electronic device and monaural data generated by each of the sound sources, to obtain a plurality of first three-dimensional coordinates and a plurality of monaural data; determining a second three-dimensional coordinate of a target object corresponding to the electronic equipment;
the synthesizing unit 302 is configured to synthesize the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target binaural data.
It is understood that determining unit 301 determines a three-dimensional coordinate of each of a plurality of sound sources corresponding to the electronic device and mono data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of mono data, and determines a second three-dimensional coordinate of a target object corresponding to the electronic device, and then synthesizing unit 302 synthesizes the plurality of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinates to obtain target binaural data. Therefore, after the first three-dimensional coordinates of the sound sources and the second three-dimensional coordinates of the target object are determined, the target two-channel data corresponding to the sound sources are generated, the playing effect of the audio data is improved, the immersion feeling can be generated, and the user experience is improved conveniently.
In a possible example, in terms of synthesizing the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target binaural data, the determining unit 301 is further configured to: determine a left ear three-dimensional coordinate and a right ear three-dimensional coordinate corresponding to the second three-dimensional coordinate; determine transmission paths between the sound sources and the target object according to the first three-dimensional coordinates, the left ear three-dimensional coordinate and the right ear three-dimensional coordinate to obtain a plurality of transmission paths; determine the time and pressure at which each piece of mono data is transmitted to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate according to the plurality of transmission paths, to obtain a plurality of times and a plurality of pressures; determine a time difference corresponding to each piece of mono data according to the plurality of times to obtain a plurality of time differences, and determine a pressure difference corresponding to each piece of mono data according to the plurality of pressures to obtain a plurality of pressure differences; determine a delay parameter corresponding to each piece of mono data according to the plurality of time differences and the plurality of pressure differences to obtain a plurality of delay parameters; and process each piece of mono data according to the plurality of delay parameters to obtain corresponding left channel parameters and right channel parameters. The synthesizing unit 302 is specifically configured to synthesize the plurality of pieces of mono data according to the left channel parameter and the right channel parameter corresponding to each piece of mono data, so as to obtain the target binaural data.
In one possible example, the plurality of sound sources includes a target sound source, the plurality of first three-dimensional coordinates includes a target first three-dimensional coordinate corresponding to the target sound source, and the plurality of mono data includes target mono data corresponding to the target sound source;
in the aspect of determining, according to the multiple transmission paths, time and pressure for transmitting each piece of monaural data in the multiple pieces of monaural data to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate, so as to obtain multiple times and multiple pressures, the determining unit 301 is specifically configured to obtain a cross section by taking the target first three-dimensional coordinate and the left ear three-dimensional coordinate as axes; determining an occluding object between the target first three-dimensional coordinate and the left ear three-dimensional coordinate; determining a plurality of reference transmission paths for transmitting the target single-channel data to the three-dimensional coordinates of the left ear according to the cross section and the shielding object; and determining the time and pressure for transmitting the target single-channel data to the three-dimensional coordinates of the left ear according to the plurality of reference transmission paths.
In a possible example, in terms of the determining, according to the plurality of reference transmission paths, the time and the pressure at which the target monaural data is transmitted to the three-dimensional coordinates of the left ear, the determining unit 301 is specifically configured to determine the sound intensity and the sound pressure corresponding to each of the plurality of reference transmission paths, so as to obtain a plurality of sound intensities and a plurality of sound pressures; and determining the time for transmitting the target single sound channel data to the three-dimensional coordinate of the left ear according to the sound intensities, and determining the pressure for transmitting the target single sound channel data to the three-dimensional coordinate of the left ear according to the sound pressures.
In one possible example, in the aspect of determining three-dimensional coordinates of each of a plurality of sound sources corresponding to an electronic device and mono data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of mono data, the determining unit 301 is specifically configured to determine a plurality of reference objects corresponding to the electronic device; determining behavior information of each reference object in the plurality of reference objects to obtain a plurality of behavior information; selecting the plurality of sound sources from the plurality of reference objects according to the plurality of behavior information; determining a coordinate position corresponding to each sound source in the plurality of sound sources to obtain a plurality of first three-dimensional coordinates; and determining the single-channel data of each sound source according to the behavior information corresponding to the sound source in the plurality of sound sources to obtain the plurality of single-channel data.
In one possible example, the plurality of behavior information includes target behavior information corresponding to a target sound source; in the aspect that the monaural data of each sound source is determined according to the behavior information corresponding to the sound source in the plurality of sound sources to obtain the plurality of monaural data, the determining unit 301 is specifically configured to determine the sound type and the playing parameter corresponding to the target behavior information; and generating the single-channel data of the target sound source according to the sound type and the playing parameters.
In one possible example, the determining unit 301 is further configured to determine a target reverberation parameter corresponding to the target object; the synthesizing unit 302 is further configured to process the target binaural data according to the target reverberation parameter to obtain reverberation binaural data.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 4, the electronic device 400 includes a processor 410, a memory 420, a communication interface 430, and one or more programs 440, wherein the one or more programs 440 are stored in the memory 420 and configured to be executed by the processor 410, and wherein the programs 440 include instructions for:
determining a three-dimensional coordinate of each sound source in a plurality of sound sources corresponding to the electronic device 400 and mono data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of mono data;
determining a second three-dimensional coordinate of a target object corresponding to the electronic device 400;
and synthesizing the plurality of pieces of single-channel data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target two-channel data.
It can be understood that the three-dimensional coordinates of each of the plurality of sound sources corresponding to the electronic device 400 and the mono data generated by each of the sound sources are determined to obtain a plurality of first three-dimensional coordinates and a plurality of mono data, the second three-dimensional coordinates of the target object corresponding to the electronic device 400 are determined, and then the plurality of mono data are synthesized according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinates to obtain the target binaural data. Therefore, after the first three-dimensional coordinates of the sound sources and the second three-dimensional coordinates of the target object are determined, the target two-channel data corresponding to the sound sources are generated, the playing effect of the audio data is improved, the immersion feeling can be generated, and the user experience is improved conveniently.
In one possible example, in the aspect that the target binaural data is obtained by synthesizing the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate, the instructions in the program 440 are specifically configured to:
determining a left ear three-dimensional coordinate and a right ear three-dimensional coordinate corresponding to the second three-dimensional coordinate;
determining transmission paths between the sound sources and the target object according to the first three-dimensional coordinates, the left ear three-dimensional coordinate and the right ear three-dimensional coordinate to obtain a plurality of transmission paths;
determining the time and pressure for transmitting each piece of mono data to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate according to the plurality of transmission paths to obtain a plurality of times and a plurality of pressures;
determining a time difference corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of times to obtain a plurality of time differences, and determining a pressure difference corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of pressures to obtain a plurality of pressure differences;
determining a delay parameter corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of time differences and the plurality of pressure differences to obtain a plurality of delay parameters;
processing each piece of mono data in the plurality of pieces of mono data according to the plurality of delay parameters to obtain a corresponding left channel parameter and right channel parameter;
and synthesizing the plurality of pieces of mono data according to the left channel parameter and the right channel parameter corresponding to each piece of mono data in the plurality of pieces of mono data to obtain the target binaural data.
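The following is a rough Python sketch of these synthesis steps under simplifying assumptions: straight-line transmission paths, a nominal speed of sound of 343 m/s, a fixed ear spacing, and a 1/r pressure fall-off. The per-ear delays play the role of the delay parameters above; all names are illustrative and are not taken from the patent.

import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
SAMPLE_RATE = 48000      # Hz
EAR_OFFSET = 0.09        # half the assumed ear spacing, in metres

def ear_coordinates(second_coordinate):
    # Left and right ear three-dimensional coordinates derived from the listener position.
    x, y, z = second_coordinate
    return (x - EAR_OFFSET, y, z), (x + EAR_OFFSET, y, z)

def time_and_pressure(first_coordinate, ear_coordinate):
    # Time of flight and a crude pressure estimate for one straight transmission path.
    distance = max(np.linalg.norm(np.subtract(first_coordinate, ear_coordinate)), 1e-3)
    return distance / SPEED_OF_SOUND, 1.0 / distance

def synthesize_binaural(first_coordinates, mono_signals, second_coordinate):
    ears = ear_coordinates(second_coordinate)
    # (time, pressure) for every source and every ear.
    paths = [[time_and_pressure(c, e) for e in ears] for c in first_coordinates]
    max_delay = max(int(round(t * SAMPLE_RATE)) for per_source in paths for t, _ in per_source)
    length = max(len(m) for m in mono_signals) + max_delay
    binaural = np.zeros((2, length))
    for source_index, mono in enumerate(mono_signals):
        for channel, (t, p) in enumerate(paths[source_index]):
            delay = int(round(t * SAMPLE_RATE))                     # per-channel delay parameter
            binaural[channel, delay:delay + len(mono)] += p * mono  # scaled, delayed copy
    return binaural  # row 0: left channel, row 1: right channel

# Example: a source three metres to the listener's right arrives slightly earlier
# and louder at the right ear than at the left ear.
output = synthesize_binaural([(3.0, 0.0, 1.6)], [np.random.randn(48000)], (0.0, 0.0, 1.6))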
In one possible example, the plurality of sound sources includes a target sound source, the plurality of first three-dimensional coordinates includes a target first three-dimensional coordinate corresponding to the target sound source, and the plurality of mono data includes target mono data corresponding to the target sound source;
in terms of determining, according to the plurality of transmission paths, the time and pressure at which each piece of mono data in the plurality of pieces of mono data is transmitted to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate, so as to obtain a plurality of times and a plurality of pressures, the instructions in the program 440 are specifically configured to perform the following operations:
obtaining a cross section by taking the target first three-dimensional coordinate and the left ear three-dimensional coordinate as an axis;
determining an occluding object between the target first three-dimensional coordinate and the left ear three-dimensional coordinate;
determining a plurality of reference transmission paths for transmitting the target mono data to the left ear three-dimensional coordinate according to the cross section and the occluding object;
and determining the time and pressure for transmitting the target mono data to the left ear three-dimensional coordinate according to the plurality of reference transmission paths.
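A simplified Python sketch of this occlusion handling is given below. It assumes the occluding object can be approximated by a sphere and that, when the direct path is blocked, the reference transmission paths bend around the occluder's edge within the cross section spanned by the source-to-ear axis; the geometry and names are illustrative only.

import numpy as np

def blocked_by_sphere(source, ear, center, radius):
    # True if the sphere intersects the straight segment from the source to the ear.
    source, ear, center = map(np.asarray, (source, ear, center))
    direction = ear - source
    t = np.clip(np.dot(center - source, direction) / np.dot(direction, direction), 0.0, 1.0)
    closest = source + t * direction
    return np.linalg.norm(center - closest) < radius

def reference_path_lengths(source, ear, center, radius):
    # Length of the direct path if unobstructed, otherwise of two detour paths
    # that skirt the occluder's edge on either side of the source-to-ear axis.
    source, ear, center = map(np.asarray, (source, ear, center))
    direct = np.linalg.norm(ear - source)
    if not blocked_by_sphere(source, ear, center, radius):
        return [float(direct)]
    axis = (ear - source) / direct
    side = np.cross(axis, [0.0, 0.0, 1.0])
    if np.linalg.norm(side) < 1e-9:    # axis is vertical; pick any perpendicular direction
        side = np.array([1.0, 0.0, 0.0])
    else:
        side /= np.linalg.norm(side)
    lengths = []
    for sign in (1.0, -1.0):
        edge = center + sign * radius * side
        lengths.append(float(np.linalg.norm(edge - source) + np.linalg.norm(ear - edge)))
    return lengths

# Example: an obstacle of radius 1 m halfway between the source and the left ear.
print(reference_path_lengths((0.0, 0.0, 1.6), (6.0, 0.0, 1.6), (3.0, 0.0, 1.6), 1.0))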
In one possible example, in terms of determining the time and pressure at which the target mono data is transmitted to the left ear three-dimensional coordinate according to the plurality of reference transmission paths, the instructions in the program 440 are specifically configured to perform the following operations:
determining the sound intensity and the sound pressure corresponding to each transmission path in the plurality of reference transmission paths to obtain a plurality of sound intensities and a plurality of sound pressures;
and determining the time for transmitting the target mono data to the left ear three-dimensional coordinate according to the plurality of sound intensities, and determining the pressure for transmitting the target mono data to the left ear three-dimensional coordinate according to the plurality of sound pressures.
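One plausible way to carry out this step, not necessarily the patent's exact formula, is sketched below in Python: each reference transmission path contributes a sound intensity (roughly 1/r^2) and a sound pressure (roughly 1/r); the arrival time is taken as the intensity-weighted average of the per-path times and the pressure as the sum of the per-path pressures. The fall-off models and the weighting are assumptions.

SPEED_OF_SOUND = 343.0  # m/s

def combine_reference_paths(path_lengths):
    # Per-path sound intensity, sound pressure and travel time.
    intensities = [1.0 / (d * d) for d in path_lengths]
    pressures = [1.0 / d for d in path_lengths]
    times = [d / SPEED_OF_SOUND for d in path_lengths]
    # Intensity-weighted arrival time and summed pressure at the left ear coordinate.
    total_intensity = sum(intensities)
    arrival_time = sum(t * i for t, i in zip(times, intensities)) / total_intensity
    total_pressure = sum(pressures)
    return arrival_time, total_pressure

# Example: two detour paths of 6.3 m each.
print(combine_reference_paths([6.3, 6.3]))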
In one possible example, in terms of determining three-dimensional coordinates of each of a plurality of sound sources corresponding to the electronic device 400 and mono data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of mono data, the instructions in the program 440 are specifically configured to:
determining a plurality of reference objects corresponding to the electronic device 400;
determining behavior information of each reference object in the plurality of reference objects to obtain a plurality of behavior information;
selecting the plurality of sound sources from the plurality of reference objects according to the plurality of behavior information;
determining a coordinate position corresponding to each sound source in the plurality of sound sources to obtain a plurality of first three-dimensional coordinates;
and determining the mono data of each sound source according to the behavior information corresponding to the sound source in the plurality of sound sources to obtain the plurality of mono data.
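As a concrete illustration of this selection step, the Python sketch below treats each reference object shown on the display page as a candidate and keeps only those whose current behavior information implies a sound; the behavior names and the set of silent behaviors are invented for the example.

from dataclasses import dataclass

@dataclass
class ReferenceObject:
    name: str
    behavior: str      # behavior information, e.g. "running", "shooting", "idle"
    coordinate: tuple  # (x, y, z) position in the virtual scene

SILENT_BEHAVIORS = {"idle", "hidden"}

def select_sound_sources(reference_objects):
    # Reference objects whose behavior produces sound become sound sources; their
    # coordinate positions become the first three-dimensional coordinates.
    sound_sources = [obj for obj in reference_objects if obj.behavior not in SILENT_BEHAVIORS]
    first_coordinates = [obj.coordinate for obj in sound_sources]
    return sound_sources, first_coordinates

# Example scene with three reference objects, two of which emit sound.
scene = [
    ReferenceObject("enemy", "shooting", (3.0, 0.0, 1.5)),
    ReferenceObject("teammate", "running", (-2.0, 4.0, 1.5)),
    ReferenceObject("crate", "idle", (1.0, 1.0, 0.0)),
]
sources, coordinates = select_sound_sources(scene)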
In one possible example, the plurality of behavior information includes target behavior information corresponding to a target sound source; in terms of determining the mono data of each of the plurality of sound sources according to the behavior information corresponding to the sound source, so as to obtain the plurality of mono data, the instructions in the program 440 are specifically configured to perform the following operations:
determining a sound type and a playing parameter corresponding to the target behavior information;
and generating the mono data of the target sound source according to the sound type and the playing parameter.
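The mapping from target behavior information to a sound type and playing parameters, and from those to mono data, could look like the Python sketch below; the lookup table, the parameter names and the synthesized stand-in tone are all assumptions, since a real implementation would more likely play back recorded effect clips.

import numpy as np

SAMPLE_RATE = 48000  # Hz

# behavior information -> (sound type, playing parameters); illustrative values only.
BEHAVIOR_TO_SOUND = {
    "running": ("footsteps", {"volume": 0.6, "duration": 0.5, "pitch_hz": 220.0}),
    "shooting": ("gunshot", {"volume": 1.0, "duration": 0.2, "pitch_hz": 880.0}),
}

def mono_data_for_behavior(behavior):
    sound_type, params = BEHAVIOR_TO_SOUND[behavior]
    n = int(params["duration"] * SAMPLE_RATE)
    t = np.arange(n) / SAMPLE_RATE
    tone = np.sin(2 * np.pi * params["pitch_hz"] * t)  # stand-in for a stored clip
    return sound_type, params["volume"] * tone

# Example: mono data for a target sound source whose behavior is "shooting".
sound_type, mono = mono_data_for_behavior("shooting")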
In one possible example, the instructions in the program 440 are further configured to:
determining a target reverberation parameter corresponding to the target object;
and processing the target binaural data according to the target reverberation parameter to obtain reverberation binaural data.
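A minimal Python sketch of this reverberation step is shown below: each channel of the target binaural data is convolved with a decaying-noise impulse response built from the reverberation parameter and mixed back with the dry signal. The parameter names (decay, wet, length_s) are assumptions rather than terms from the patent.

import numpy as np

SAMPLE_RATE = 48000  # Hz

def apply_reverberation(binaural, decay=0.5, wet=0.3, length_s=0.4):
    # Build a crude room impulse response: white noise with an exponential decay.
    n = int(length_s * SAMPLE_RATE)
    t = np.arange(n) / SAMPLE_RATE
    impulse = np.random.randn(n) * np.exp(-t / decay)
    impulse /= np.max(np.abs(impulse))
    reverberated = np.zeros((2, binaural.shape[1] + n - 1))
    for channel in range(2):
        dry = np.pad(binaural[channel], (0, n - 1))       # align lengths with the wet signal
        wet_signal = np.convolve(binaural[channel], impulse)
        reverberated[channel] = (1.0 - wet) * dry + wet * wet_signal
    return reverberated  # reverberation binaural data

# Example: add reverberation to one second of target binaural data.
reverb_out = apply_reverberation(np.random.randn(2, SAMPLE_RATE))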
Embodiments of the present application also provide a computer storage medium, where the computer storage medium stores a computer program for causing a computer to execute a part or all of the steps of any one of the methods as described in the method embodiments, and the computer includes an electronic device.
Embodiments of the application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as recited in the method embodiments. The computer program product may be a software installation package and the computer comprises the electronic device.
It should be noted that, for simplicity of description, the above method embodiments are described as a series of acts or a combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of the acts described, as some steps may be performed in other orders or concurrently. Further, those skilled in the art will also appreciate that the embodiments described in this specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the present application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative; the division of units is merely a logical function division, and an actual implementation may use another division; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical or in other forms.
Units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software program module.
The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.
The foregoing embodiments have been described in detail to illustrate the principles and implementations of the present application; the above description of the embodiments is only provided to help understand the method and core concept of the present application. Meanwhile, a person skilled in the art may, based on the idea of the present application, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as a limitation on the present application.

Claims (7)

1. A 3D sound effect processing method, applied to an electronic device, characterized by comprising the following steps:
determining a three-dimensional coordinate of each sound source in a plurality of sound sources corresponding to the electronic device and mono data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of mono data; which specifically comprises the following steps: determining a plurality of reference objects corresponding to the electronic device; determining behavior information of each reference object in the plurality of reference objects to obtain a plurality of behavior information; selecting the plurality of sound sources from the plurality of reference objects according to the plurality of behavior information; determining a coordinate position corresponding to each sound source in the plurality of sound sources to obtain the plurality of first three-dimensional coordinates; and determining the mono data of each sound source according to the behavior information corresponding to the sound source in the plurality of sound sources to obtain the plurality of mono data; wherein a reference object is an object presented on a display page of the electronic device; the behavior information is dynamic information of the reference object, and a reference object corresponds to a different sound under different behavior information; and a sound source is a sounding body in a virtual scene;
determining a second three-dimensional coordinate of a target object corresponding to the electronic device;
synthesizing the plurality of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target binaural data;
wherein the synthesizing the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinate to obtain target binaural data comprises:
determining a left ear three-dimensional coordinate and a right ear three-dimensional coordinate corresponding to the second three-dimensional coordinate;
determining transmission paths between the sound sources and the target object according to the first three-dimensional coordinates, the left ear three-dimensional coordinate and the right ear three-dimensional coordinate to obtain a plurality of transmission paths;
determining the time and pressure for transmitting each piece of mono data to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate according to the plurality of transmission paths to obtain a plurality of times and a plurality of pressures;
determining a time difference corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of times to obtain a plurality of time differences, and determining a pressure difference corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of pressures to obtain a plurality of pressure differences;
determining a delay parameter corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of time differences and the plurality of pressure differences to obtain a plurality of delay parameters;
processing each piece of mono data in the plurality of pieces of mono data according to the plurality of delay parameters to obtain a corresponding left channel parameter and right channel parameter;
synthesizing the plurality of pieces of mono data according to the left channel parameter and the right channel parameter corresponding to each piece of mono data in the plurality of pieces of mono data to obtain the target binaural data;
determining a preferred reverberation parameter corresponding to the target object;
processing the target binaural data according to the preferred reverberation parameter to obtain reverberation binaural data;
wherein the determining a preferred reverberation parameter corresponding to the target object comprises: acquiring a plurality of pre-stored historical reverberation play records corresponding to the target object; obtaining a plurality of listening parameters corresponding to each historical reverberation play record in the plurality of historical reverberation play records; and determining the preferred reverberation parameter from the plurality of listening parameters;
and the determining the preferred reverberation parameter from the plurality of listening parameters comprises: determining an evaluation value corresponding to each historical reverberation play record in the plurality of historical reverberation play records according to the plurality of listening parameters to obtain a plurality of evaluation values; taking the historical reverberation play record corresponding to the maximum value in the plurality of evaluation values as a target historical reverberation play record; and taking the reverberation parameter corresponding to the target historical reverberation play record as the preferred reverberation parameter.
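For illustration of the preferred-reverberation selection recited above, the Python sketch below scores each historical reverberation play record from its listening parameters and keeps the reverberation parameter of the highest-scoring record; the particular listening parameters and scoring weights are assumptions, not part of the claim.

from dataclasses import dataclass

@dataclass
class ReverberationPlayRecord:
    reverberation_parameter: float  # e.g. a reverberation time, in seconds
    play_count: int                 # listening parameter: how often this setting was used
    completion_ratio: float         # listening parameter: fraction of playback listened to

def preferred_reverberation_parameter(records):
    # Evaluation value per historical record; the 0.5/0.5 weighting is illustrative.
    def evaluation_value(record):
        return 0.5 * record.play_count + 0.5 * record.completion_ratio
    target_record = max(records, key=evaluation_value)  # record with the maximum evaluation value
    return target_record.reverberation_parameter

# Example: the 0.8 s setting was listened to most often and most completely, so it wins.
history = [
    ReverberationPlayRecord(0.3, play_count=2, completion_ratio=0.4),
    ReverberationPlayRecord(0.8, play_count=9, completion_ratio=0.9),
]
preferred = preferred_reverberation_parameter(history)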
2. The method of claim 1, wherein the plurality of sound sources comprises a target sound source, wherein the plurality of first three-dimensional coordinates comprises a target first three-dimensional coordinate corresponding to the target sound source, and wherein the plurality of mono data comprises target mono data corresponding to the target sound source;
and the determining, according to the plurality of transmission paths, the time and pressure at which each piece of mono data in the plurality of pieces of mono data is transmitted to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate to obtain a plurality of times and a plurality of pressures comprises:
obtaining a cross section by taking the target first three-dimensional coordinate and the left ear three-dimensional coordinate as an axis;
determining an occluding object between the target first three-dimensional coordinate and the left ear three-dimensional coordinate;
determining a plurality of reference transmission paths for transmitting the target mono data to the left ear three-dimensional coordinate according to the cross section and the occluding object;
and determining the time and pressure for transmitting the target mono data to the left ear three-dimensional coordinate according to the plurality of reference transmission paths.
3. The method of claim 2, wherein the determining the time and pressure for transmitting the target mono data to the left ear three-dimensional coordinate according to the plurality of reference transmission paths comprises:
determining the sound intensity and the sound pressure corresponding to each transmission path in the plurality of reference transmission paths to obtain a plurality of sound intensities and a plurality of sound pressures;
and determining the time for transmitting the target mono data to the left ear three-dimensional coordinate according to the plurality of sound intensities, and determining the pressure for transmitting the target mono data to the left ear three-dimensional coordinate according to the plurality of sound pressures.
4. The method of claim 1, wherein the plurality of behavior information includes target behavior information corresponding to a target sound source of the plurality of sound sources; and the determining the mono data of each sound source according to the behavior information corresponding to the sound source in the plurality of sound sources to obtain the plurality of mono data comprises:
determining a sound type and a playing parameter corresponding to the target behavior information;
and generating the mono data of the target sound source according to the sound type and the playing parameter.
5. A 3D sound effect processing apparatus, comprising:
the electronic equipment comprises a determining unit, a processing unit and a processing unit, wherein the determining unit is used for determining three-dimensional coordinates of each sound source in a plurality of sound sources corresponding to the electronic equipment and single sound channel data generated by each sound source to obtain a plurality of first three-dimensional coordinates and a plurality of single sound channel data; determining a second three-dimensional coordinate of a target object corresponding to the electronic equipment; the method specifically comprises the following steps: determining a plurality of reference objects corresponding to the electronic equipment; determining behavior information of each reference object in the plurality of reference objects to obtain a plurality of behavior information; selecting the plurality of sound sources from the plurality of reference objects according to the plurality of behavior information; determining a coordinate position corresponding to each sound source in the plurality of sound sources to obtain a plurality of first three-dimensional coordinates; determining single-channel data of each sound source according to the behavior information corresponding to the sound source in the plurality of sound sources to obtain the plurality of single-channel data; the reference object is an object presented in a display page of the electronic equipment; the behavior information is dynamic information of the reference objects, and each reference object corresponds to different sounds when being in different behavior information; the sound source is a sounding body in a virtual scene;
a synthesizing unit, configured to synthesize the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinates, so as to obtain target binaural data;
wherein the synthesizing the plurality of pieces of mono data according to the plurality of first three-dimensional coordinates and the second three-dimensional coordinates to obtain target binaural data includes:
determining a left ear three-dimensional coordinate and a right ear three-dimensional coordinate corresponding to the second three-dimensional coordinate;
determining transmission paths between the sound sources and the target object according to the first three-dimensional coordinates, the left ear three-dimensional coordinate and the right ear three-dimensional coordinate to obtain a plurality of transmission paths;
determining the time and pressure for transmitting each piece of mono data to the left ear three-dimensional coordinate and the right ear three-dimensional coordinate according to the plurality of transmission paths to obtain a plurality of times and a plurality of pressures;
determining a time difference corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of times to obtain a plurality of time differences, and determining a pressure difference corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of pressures to obtain a plurality of pressure differences;
determining a delay parameter corresponding to each piece of mono data in the plurality of pieces of mono data according to the plurality of time differences and the plurality of pressure differences to obtain a plurality of delay parameters;
processing each piece of mono data in the plurality of pieces of mono data according to the plurality of delay parameters to obtain a corresponding left channel parameter and right channel parameter;
synthesizing the plurality of pieces of mono data according to the left channel parameter and the right channel parameter corresponding to each piece of mono data in the plurality of pieces of mono data to obtain the target binaural data;
the determining unit is further configured to determine a preferred reverberation parameter corresponding to the target object;
the synthesizing unit is further configured to process the target binaural data according to the preferred reverberation parameter to obtain reverberation binaural data;
wherein the determining a preferred reverberation parameter corresponding to the target object comprises: acquiring a plurality of pre-stored historical reverberation play records corresponding to the target object; obtaining a plurality of listening parameters corresponding to each historical reverberation play record in the plurality of historical reverberation play records; and determining the preferred reverberation parameter from the plurality of listening parameters;
and the determining the preferred reverberation parameter from the plurality of listening parameters comprises: determining an evaluation value corresponding to each historical reverberation play record in the plurality of historical reverberation play records according to the plurality of listening parameters to obtain a plurality of evaluation values; taking the historical reverberation play record corresponding to the maximum value in the plurality of evaluation values as a target historical reverberation play record; and taking the reverberation parameter corresponding to the target historical reverberation play record as the preferred reverberation parameter.
6. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps of the method of any of claims 1-4.
7. A computer-readable storage medium for storing a computer program, wherein the computer program causes a computer to perform the method according to any one of claims 1-4.
CN201811118269.4A 2018-09-25 2018-09-25 3D sound effect processing method and related product Active CN109246580B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811118269.4A CN109246580B (en) 2018-09-25 2018-09-25 3D sound effect processing method and related product
PCT/CN2019/090380 WO2020062922A1 (en) 2018-09-25 2019-06-06 Sound effect processing method and related product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811118269.4A CN109246580B (en) 2018-09-25 2018-09-25 3D sound effect processing method and related product

Publications (2)

Publication Number Publication Date
CN109246580A CN109246580A (en) 2019-01-18
CN109246580B true CN109246580B (en) 2022-02-11

Family

ID=65056811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811118269.4A Active CN109246580B (en) 2018-09-25 2018-09-25 3D sound effect processing method and related product

Country Status (2)

Country Link
CN (1) CN109246580B (en)
WO (1) WO2020062922A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108924705B (en) * 2018-09-25 2021-07-02 Oppo广东移动通信有限公司 3D sound effect processing method and related product
CN109246580B (en) * 2018-09-25 2022-02-11 Oppo广东移动通信有限公司 3D sound effect processing method and related product
CN113467603B (en) * 2020-03-31 2024-03-08 抖音视界有限公司 Audio processing method and device, readable medium and electronic equipment
CN114594480A (en) * 2022-03-11 2022-06-07 北京女娲补天科技信息技术有限公司 Throwing item testing method and device based on sound wave positioning
CN116389982A (en) * 2023-05-19 2023-07-04 零束科技有限公司 Audio processing method, device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218198A (en) * 2011-08-12 2013-07-24 索尼电脑娱乐公司 Sound localization for user in motion
CN106448687A (en) * 2016-09-19 2017-02-22 中科超影(北京)传媒科技有限公司 Audio making and decoding method and device
CN107360494A (en) * 2017-08-03 2017-11-17 北京微视酷科技有限责任公司 A kind of 3D sound effect treatment methods, device, system and sound system
CN107608519A (en) * 2017-09-26 2018-01-19 深圳传音通讯有限公司 A kind of sound method of adjustment and virtual reality device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7928311B2 (en) * 2004-12-01 2011-04-19 Creative Technology Ltd System and method for forming and rendering 3D MIDI messages
CN104869524B (en) * 2014-02-26 2018-02-16 腾讯科技(深圳)有限公司 Sound processing method and device in three-dimensional virtual scene
CN105159655B (en) * 2014-05-28 2020-04-24 腾讯科技(深圳)有限公司 Behavior event playing method and device
CN109246580B (en) * 2018-09-25 2022-02-11 Oppo广东移动通信有限公司 3D sound effect processing method and related product

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218198A (en) * 2011-08-12 2013-07-24 索尼电脑娱乐公司 Sound localization for user in motion
CN106448687A (en) * 2016-09-19 2017-02-22 中科超影(北京)传媒科技有限公司 Audio making and decoding method and device
CN107360494A (en) * 2017-08-03 2017-11-17 北京微视酷科技有限责任公司 A kind of 3D sound effect treatment methods, device, system and sound system
CN107608519A (en) * 2017-09-26 2018-01-19 深圳传音通讯有限公司 A kind of sound method of adjustment and virtual reality device

Also Published As

Publication number Publication date
CN109246580A (en) 2019-01-18
WO2020062922A1 (en) 2020-04-02

Similar Documents

Publication Publication Date Title
CN109246580B (en) 3D sound effect processing method and related product
KR102609668B1 (en) Virtual, Augmented, and Mixed Reality
US10911882B2 (en) Methods and systems for generating spatialized audio
US8160265B2 (en) Method and apparatus for enhancing the generation of three-dimensional sound in headphone devices
EP3629145B1 (en) Method for processing 3d audio effect and related products
CN109254752B (en) 3D sound effect processing method and related product
CN109327795B (en) Sound effect processing method and related product
CN110401898B (en) Method, apparatus, device and storage medium for outputting audio data
CN108924705B (en) 3D sound effect processing method and related product
CN109121069B (en) 3D sound effect processing method and related product
CN109104687B (en) Sound effect processing method and related product
CN113115175B (en) 3D sound effect processing method and related product
CN109327794B (en) 3D sound effect processing method and related product
CN114255781A (en) Method, device and system for acquiring multi-channel audio signal
CN109243413B (en) 3D sound effect processing method and related product
CN115834775A (en) Online call management device and storage medium storing online call management program
CN109286841B (en) Movie sound effect processing method and related product
WO2022227921A1 (en) Audio processing method and apparatus, wireless headset, and computer readable medium
CN110428802B (en) Sound reverberation method, device, computer equipment and computer storage medium
KR102519156B1 (en) System and methods for locating mobile devices using wireless headsets
CN117676002A (en) Audio processing method and electronic equipment
CN115061088A (en) Method, device, equipment and storage medium for prompting position of sounding body in game
CN113810817A (en) Volume control method and device of wireless earphone and wireless earphone
CN116421971A (en) Method and device for generating spatial audio signal, storage medium and electronic equipment
CN116249065A (en) Audio signal processing method and device and audio playing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant