CN116156390B - Audio processing method and electronic equipment - Google Patents

Audio processing method and electronic equipment

Info

Publication number
CN116156390B
Authority
CN
China
Prior art keywords
audio
electronic device
transfer function
target
user
Prior art date
Legal status (assumed, not a legal conclusion)
Active
Application number
CN202310415787.7A
Other languages
Chinese (zh)
Other versions
CN116156390A (en)
Inventor
韩欣宇
韩荣
杨昭
夏日升
Current Assignee
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Priority to CN202311136573.2A (published as CN117177135A)
Priority to CN202310415787.7A (granted as CN116156390B)
Publication of CN116156390A
Application granted
Publication of CN116156390B
Legal status: Active
Anticipated expiration

Classifications

    • H04R 3/00: Circuits for transducers, loudspeakers or microphones
    • H04R 2420/03: Connection circuits to selectively connect loudspeakers or headphones to amplifiers (under H04R 2420/00: Details of connection covered by H04R, not provided for in its groups)
    (Both under H: Electricity; H04: Electric communication technique; H04R: Loudspeakers, microphones, gramophone pick-ups or like acoustic electromechanical transducers; deaf-aid sets; public address systems.)

Abstract

The application discloses an audio processing method and an electronic device, relating to the field of audio processing. The method improves the user's listening experience when the electronic device plays audio by improving, on the basis of the device's existing hardware, the transient response of the device within the acoustic system formed when the user wears it. The method comprises the following steps. First, a first transfer function and a target sound effect selected by the user are obtained; the first transfer function indicates the response with which audio played by the speaker of the audio playing device is received back by the device after propagating through the acoustic system, where the acoustic system is formed once the user wears the audio playing device. Then, a second transfer function is obtained from the first transfer function; it indicates the response with which audio played by the speaker of the audio playing device propagates through the acoustic system to the position of the user's eardrum. Finally, the target audio to be played is obtained from the second transfer function, the original audio, and the target sound effect.

Description

Audio processing method and electronic equipment
Technical Field
Embodiments of this application relate to the field of audio processing, and in particular to an audio processing method and an electronic device.
Background
The transient response of an electronic device refers to its ability to accurately reproduce rapid changes in the audio signal. Transient response affects the quality of the audio the device plays, and therefore the user's listening experience. In the prior art, the transient response of an electronic device is improved by changing its hardware structure. However, doing so is not only costly but also limited in how much it improves the listening experience.
Disclosure of Invention
The application provides an audio processing method and an electronic device, which improve the user's listening experience when the electronic device plays audio by improving, on the basis of the device's existing hardware, the transient response of the device within the acoustic system formed when the user wears it.
To achieve the above purpose, the application adopts the following technical solutions:
In a first aspect, an audio processing method is provided. The method comprises: first, obtaining a first transfer function and a target sound effect selected by the user, where the first transfer function indicates the response with which audio played by a speaker of the audio playing device is received back by the device after propagating through the acoustic system, the target sound effect is the effect the user expects when the audio is played, and the acoustic system is formed once the user wears the audio playing device; then, obtaining a second transfer function from the first transfer function, where the second transfer function indicates the response with which audio played by the speaker of the audio playing device propagates through the acoustic system to the position of the user's eardrum; and finally, obtaining the target audio to be played from the second transfer function, the original audio, and the target sound effect.
When the user wears the audio playing device to listen to audio, the user's ear canal and the audio playing device form an acoustic system. With the audio processing method provided by the application, the first transfer function, which indicates the response with which audio played by the speaker of the audio playing device is received back by the device after propagating through the acoustic system, is obtained first; this function can be obtained by measurement. Then the second transfer function, which cannot be measured directly but represents the user's real listening experience (it is also the transient response function of the audio playing device within the acoustic system), is determined from the first transfer function and a mapping relation between the first and second transfer functions. Based on the second transfer function and the target sound effect selected by the user, the original audio is processed into target audio that meets the user's listening needs, so that the user has a better listening experience while wearing the audio playing device. The application therefore does not change the hardware structure of the audio playing device; instead, it processes the original audio in software, combining the characteristics of the user's ear canal, the wearing state of the audio playing device, and the target sound effect selected by the user, to obtain target audio that meets the user's listening needs.
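As a rough, non-authoritative sketch of the software-only pipeline described above: the patent leaves the concrete mapping relation and filter construction to the embodiments, so `map_h1_to_h2` below is a hypothetical placeholder and the regularized spectral inversion is only one plausible way to reshape the response toward the target.

```python
import numpy as np

def map_h1_to_h2(h1):
    """Hypothetical placeholder for the mapping relation between the first
    transfer function (speaker -> feedback mic) and the second transfer
    function (speaker -> eardrum). In practice this would be preset or
    learned; here it is an identity stand-in for illustration only."""
    return h1

def process_audio(original, h1, target_response, reg=1e-3):
    """Sketch of the claimed pipeline: derive H2 from H1, build a
    compensation filter that reshapes H2 toward the target sound effect's
    response, and apply it to the original audio in the frequency domain."""
    n = len(original) + len(h1) - 1
    h2 = map_h1_to_h2(h1)
    H2 = np.fft.rfft(h2, n)
    Ht = np.fft.rfft(target_response, n)
    # Regularized inversion: G = Ht * conj(H2) / (|H2|^2 + reg).
    # reg keeps the filter bounded where H2 is near zero.
    G = Ht * np.conj(H2) / (np.abs(H2) ** 2 + reg)
    return np.fft.irfft(np.fft.rfft(original, n) * G, n)
```

With an ideal (impulse) acoustic path and an impulse target response, the pipeline should leave the audio essentially unchanged, which makes the sketch easy to sanity-check.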
In a possible implementation of the first aspect, the first transfer function is a preset transfer function, or the first transfer function is determined by the audio playing device.
The first transfer function describes the correspondence between the audio output by the speaker of the audio playing device and the audio the device receives after propagation through the acoustic system. The device receives the speaker's output through a built-in feedback microphone, so the hardware structure of the device determines how the first transfer function is obtained. Specifically, if the audio playing device includes a feedback microphone, it may determine the first transfer function itself; if it does not, a preset transfer function may be used as the first transfer function.
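As a hedged illustration of how the first transfer function might be estimated from the two signals (this is a standard spectral estimator, not the patent's actual algorithm): average the cross-spectrum of the played and received audio over segments and divide by the averaged power spectrum of the played audio.

```python
import numpy as np

def estimate_transfer_function(played, received, nperseg=256):
    """H1 spectral estimator of the transfer function from the audio the
    speaker plays to the audio the feedback microphone receives: averaged
    cross-spectrum divided by averaged power spectrum, over non-overlapping
    rectangular-windowed segments."""
    n_seg = len(played) // nperseg
    p_xx = np.zeros(nperseg // 2 + 1)
    p_xy = np.zeros(nperseg // 2 + 1, dtype=complex)
    for k in range(n_seg):
        seg_x = played[k * nperseg:(k + 1) * nperseg]
        seg_y = received[k * nperseg:(k + 1) * nperseg]
        X = np.fft.rfft(seg_x)
        Y = np.fft.rfft(seg_y)
        p_xx += np.abs(X) ** 2      # auto-spectrum of played audio
        p_xy += np.conj(X) * Y      # cross-spectrum played -> received
    return p_xy / p_xx
```

Averaging over segments suppresses noise in any single measurement, which matters here because the feedback-microphone signal on a real device is noisy.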
In another possible implementation of the first aspect, the first transfer function may be determined by an electronic device other than the audio playing device. In that case, obtaining the first transfer function includes: first, the electronic device obtains the audio played by the speaker of the audio playing device and the audio received by the feedback microphone of the audio playing device; then, the electronic device derives the first transfer function from these two signals.
If the audio playing device has no data processing capability but does include a feedback microphone, it may send the audio played by the speaker and the audio received by the feedback microphone to an electronic device that has data processing capability, which then computes the first transfer function.
In another possible implementation of the first aspect, obtaining the target audio to be played according to the second transfer function, the original audio, and the target sound effect includes: first, obtaining, according to the second transfer function and the target sound effect, a target filter or a target neural network for processing the original audio; then, processing the original audio through the target filter or the target neural network to obtain the target audio.
The second transfer function indicates the user's real listening experience when listening to audio through the audio playing device; the target sound effect is the effect the user expects when the audio is played, and a sound effect can be represented by a transient response function. Thus, a target filter or target neural network capable of transforming the second transfer function into the transient response function corresponding to the target sound effect can process the original audio into target audio that meets the user's listening needs.
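The figures mention an adaptive-filter framework (fig. 12). As an illustrative, non-authoritative sketch of how a target filter could be fitted, a least-mean-squares (LMS) adaptive filter learns FIR taps so that filtering a reference signal approximates a desired signal; the same mechanism could in principle fit a filter that maps one response onto another. `lms_fit` and its parameters are assumptions, not the patent's implementation.

```python
import numpy as np

def lms_fit(x, d, n_taps=4, mu=0.01, n_epochs=3):
    """Fit FIR taps w by the LMS rule so that filtering x with w
    approximates the desired signal d.

    x : reference input signal
    d : desired output signal
    mu : step size (must be small enough for stability)
    """
    w = np.zeros(n_taps)
    for _ in range(n_epochs):
        for i in range(n_taps - 1, len(x)):
            u = x[i - n_taps + 1:i + 1][::-1]  # newest sample first
            e = d[i] - w @ u                   # instantaneous error
            w += mu * e * u                    # stochastic gradient step
    return w
```

When the desired signal really is an FIR-filtered copy of the reference, the learned taps converge to that filter, which is the basic system-identification use of LMS.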
In a second aspect, the present application provides an electronic device comprising a processor and a memory; the memory stores instructions which, when executed by the processor, cause the electronic device to perform the method described in the first aspect and any of its implementations.
In a third aspect, there is provided a computer storage medium comprising instructions which, when run on an electronic device, cause the electronic device to perform the method of the first aspect and any implementation thereof.
In a fourth aspect, there is provided a computer program product comprising instructions which, when run on an electronic device as described above, cause the electronic device to perform the method of the first aspect and any of its embodiments.
In a fifth aspect, a chip system is provided. The chip system comprises a processor for supporting an electronic device in implementing the functions referred to in the first aspect and any of its implementations. In one possible design, the chip system may further include interface circuitry that may be used to receive signals from other devices (e.g., a memory) or to send signals to other devices (e.g., a communication interface). The chip system may consist of a chip, or may include a chip and other discrete devices.
For the technical effects of the second to fifth aspects, refer to the technical effects of the first aspect and any of its implementations; details are not repeated here.
Drawings
Fig. 1 is a schematic structural diagram of an audio playing system according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a first electronic device according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a software architecture of a first electronic device according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a second electronic device according to an embodiment of the present application;
Fig. 5 is a waveform diagram of the output audio of a speaker of a second electronic device according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a waveform diagram of a first transient response function of a speaker of a second electronic device, a waveform diagram of the input audio, and a waveform diagram of the output audio according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a waveform diagram of a second transient response function of a speaker of a second electronic device and a waveform diagram of the output audio according to an embodiment of the present application;
Fig. 8 is a schematic diagram of an acoustic system according to an embodiment of the present application;
Fig. 9 is a first schematic diagram of an interface of a first electronic device according to an embodiment of the present application;
Fig. 10 is a schematic flow chart of an audio processing method according to an embodiment of the present application;
Fig. 11 is a second schematic diagram of an interface of a first electronic device according to an embodiment of the present application;
Fig. 12 is a schematic diagram of a frame structure of an adaptive filter according to an embodiment of the present application;
Fig. 13 is a third schematic diagram of an interface of a first electronic device according to an embodiment of the present application;
Fig. 14 is a fourth schematic diagram of an interface of a first electronic device according to an embodiment of the present application;
Fig. 15 is a fifth schematic diagram of an interface of a first electronic device according to an embodiment of the present application;
Fig. 16 is a schematic structural diagram of a first neural network according to an embodiment of the present application;
Fig. 17 is a schematic diagram of obtaining the transient response function of a target sound effect from the second transfer function according to an embodiment of the present application;
Fig. 18 is a schematic structural diagram of a chip system according to an embodiment of the present application.
Detailed Description
In embodiments of the present application, the terms "first," "second," and the like are used solely to distinguish between similar features and do not indicate relative importance, quantity, or order.
The terms "exemplary" or "such as" and the like, as used in relation to embodiments of the present application, are used to denote examples, illustrations, or descriptions. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In embodiments of the present application, the terms "coupled" and "connected" are to be construed broadly; they may refer to a direct physical connection, or to an indirect connection via electrical components such as resistors, inductors, or capacitors.
Some concepts to which the present application relates will be described first.
A transfer function describes the relationship between the input and output of a system with linear characteristics, and is generally expressed as the ratio of the Laplace transform of the output waveform to the Laplace transform of the input waveform.
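In standard notation, for an input x(t) with Laplace transform X(s) and an output y(t) with Laplace transform Y(s), this ratio is:

```latex
H(s) = \frac{Y(s)}{X(s)} = \frac{\mathcal{L}\{y(t)\}}{\mathcal{L}\{x(t)\}},
\qquad \text{assuming zero initial conditions.}
```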
Transient response refers to the process by which the output of a system (or device) changes from an initial state to a steady state under a typical signal input; it is also known as dynamic response. A system (or device) with good transient response starts and stops its output promptly as the input signal starts and stops, without lag or smearing. That is, the better the transient response of a system (or device), the higher the fidelity of the audio it outputs. Thus, improving the transient response of a system (or device) improves the quality of the audio it outputs.
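The settling behavior described above can be made concrete with a toy discrete-time example (illustrative only; the one-pole filter and `settling_time` helper are assumptions, not from the patent): a system whose output tracks a unit step quickly has a short settling time, i.e., a better transient response.

```python
import numpy as np

def settling_time(a, n=200, tol=0.02):
    """Step a one-pole lowpass y[k] = a*y[k-1] + (1-a)*x[k] with a unit
    step input and return the first sample index from which the output
    stays within `tol` of its final value 1.0. Larger `a` means a slower,
    more sluggish system."""
    y, out = 0.0, []
    for _ in range(n):
        y = a * y + (1 - a) * 1.0
        out.append(y)
    out = np.array(out)
    settled = np.abs(out - 1.0) <= tol
    idx = n - 1
    while idx > 0 and settled[idx - 1]:  # walk back over the settled tail
        idx -= 1
    return idx
```

A pole at 0.5 settles within a handful of samples, while a pole at 0.95 takes dozens; the same contrast, in an acoustic system, is what separates crisp from smeared transients.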
The audio processing method provided by the embodiment of the application can be applied to the audio playing system shown in fig. 1. As shown in fig. 1, the audio playing system may include: a first electronic device 101 and a second electronic device 102, the first electronic device 101 being communicatively connected to the second electronic device 102. The user listens to audio by wearing the second electronic device 102.
In one embodiment, the first electronic device 101 is configured to process the pre-stored original audio to obtain a target audio to be played, which meets the listening requirement of the user, and then send the target audio to the second electronic device 102. The second electronic device 102 is used to play the target audio. That is, the first electronic device 101 may be an audio processing device and the second electronic device 102 may be an audio playing device.
In another embodiment, the first electronic device 101 is configured to send the original audio to the second electronic device. The second electronic device 102 is configured to process the original audio to obtain a target audio to be played, which meets the listening requirement of the user, and play the target audio. That is, the second electronic device 102 may be both an audio processing device and an audio playback device.
Optionally, the audio playing system according to the embodiment of the present application may include only the second electronic device 102. In that case, the second electronic device 102 pre-stores the original audio, processes it to obtain target audio to be played that meets the listening needs of the user, and plays the target audio. That is, the second electronic device 102 is both the audio processing device and the audio playing device.
The first electronic device according to the embodiment of the present application may be a device having communication and data processing functions, and may be mobile or fixed. It may be deployed on land (e.g., indoors or outdoors, handheld or vehicle-mounted), on water (e.g., on a ship), or in the air (e.g., on an aircraft or a balloon). The first electronic device may be referred to as user equipment (UE), an access terminal, a terminal unit, a subscriber unit, a terminal station, a mobile station (MS), a terminal agent, a terminal device, or the like. For example, the first electronic device may be a mobile phone, a tablet computer, or a notebook computer. The embodiment of the application does not limit the specific type or structure of the first electronic device. One possible configuration of the first electronic device is described below.
By way of example, using a first electronic device as a mobile phone, fig. 2 shows one possible configuration of the first electronic device 101. As shown in fig. 2, the first electronic device 101 may include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (universal serial bus, USB) interface 230, a power management module 240, a battery 241, a wireless charging coil 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone interface 270D, a sensor module 280, keys 290, a motor 291, an indicator 292, a camera 293, a display 294, a subscriber identity module (subscriber identification module, SIM) card interface 295, and the like.
The sensor module 280 may include, among other things, a pressure sensor, a gyroscope sensor, a barometric pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like.
It should be understood that the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the first electronic device 101. In other embodiments of the application, the first electronic device 101 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 210 may include one or more processing units. For example, the processor 210 may be a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a system on chip (SoC), a central processing unit (CPU), an application processor (AP), a network processor (NP), a digital signal processor (DSP), a micro control unit (MCU), a programmable logic device (PLD), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a baseband processor, a neural-network processing unit (NPU), or the like. The different processing units may be separate devices or may be integrated in one or more processors. For example, the processor 210 may be an application processor (AP). Alternatively, the processor 210 may be integrated in a system on chip (SoC), or in an integrated circuit (IC) chip; the processor 210 may include an analog front end (AFE) and a micro control unit (MCU) in an IC chip.
The controller may be a neural hub and a command center of the first electronic device 101. The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 210 for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory. The memory may hold instructions or data that the processor 210 has just used or recycled. If the processor 210 needs to reuse the instruction or data, it may be called directly from the memory. Repeated accesses are avoided and the latency of the processor 210 is reduced, thereby improving the efficiency of the system.
In some embodiments, processor 210 may include one or more interfaces. The interfaces may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous receiver transmitter (universal asynchronous receiver/transmitter, UART) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (subscriber identity module, SIM) interface, and/or a USB interface, among others.
It should be understood that the interfacing relationship between the modules illustrated in the embodiment of the present application is only illustrative, and is not limited to the structure of the first electronic device 101. In other embodiments of the present application, the first electronic device 101 may also use different interfacing manners, or a combination of multiple interfacing manners in the foregoing embodiments.
The power management module 240 is configured to receive a charging input from a charger. The charger may be a wireless charger (such as a wireless charging base of the first electronic device 101 or other devices that may wirelessly charge the first electronic device 101), or may be a wired charger. For example, the power management module 240 may receive a charging input of a wired charger through the USB interface 230. The power management module 240 may receive wireless charging input through a wireless charging coil 242 of the electronic device.
The power management module 240 may also supply power to the electronic device while charging the battery 241. The power management module 240 receives input from the battery 241 to power the processor 210, the internal memory 221, the external memory interface 220, the display 294, the camera 293, the wireless communication module 260, and the like. The power management module 240 may also be configured to monitor parameters of the battery 241 such as battery capacity, battery cycle times, battery health (leakage, impedance), etc. In other embodiments, the power management module 240 may also be disposed in the processor 210.
The wireless communication function of the first electronic device 101 may be implemented by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, a modem processor, a baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the first electronic device 101 may be used to cover a single or multiple communication bands. Different antennas may also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed into a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 250 may provide a solution for wireless communication including 2G/3G/4G/5G, etc. applied on the first electronic device 101. The wireless communication module 260 may provide solutions for wireless communication including wireless local area network (wireless local area networks, WLAN) (e.g., wireless fidelity (wireless fidelity, wi-Fi) network), bluetooth (BT), frequency modulation (frequency modulation, FM), near field wireless communication technology (near field communication, NFC), infrared technology (IR), etc., applied on the first electronic device 101. In some embodiments, antenna 1 and mobile communication module 250 of first electronic device 101 are coupled, and antenna 2 and wireless communication module 260 are coupled, such that first electronic device 101 may communicate with a network and other devices through wireless communication techniques.
The first electronic device 101 implements display functions by a GPU, a display screen 294, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display screen 294 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 210 may include one or more GPUs that execute program instructions to generate or change display information.
The display 294 is used to display images, videos, and the like. The display 294 includes a display panel. In some embodiments, the first electronic device 101 may include 1 or N display screens 294, N being a positive integer greater than 1.
The first electronic device 101 may implement a photographing function through an ISP, a camera 293, a video codec, a GPU, a display 294, an application processor, and the like. The ISP is used to process the data fed back by the camera 293. In some embodiments, the ISP may be provided in the camera 293. The camera 293 is used to capture still images or video. In some embodiments, the electronic device may include 1 or N cameras 293, N being a positive integer greater than 1. Exemplary cameras of embodiments of the present application include a wide angle camera and a main camera.
The external memory interface 220 may be used to connect an external memory card, such as a micro secure digital (Micro SD) card, to expand the storage capability of the first electronic device 101. The external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function, for example storing files such as music and video on the external memory card.
Internal memory 221 may be used to store computer executable program code that includes instructions. The processor 210 executes various functional applications of the first electronic device 101 and data processing by executing instructions stored in the internal memory 221. In addition, the internal memory 221 may include a high-speed random access memory, and may further include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like.
The memory to which embodiments of the present application relate may be volatile memory or nonvolatile memory, or may include both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). The memory of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
The first electronic device 101 may implement audio functionality, such as music playing and recording, through an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone interface 270D, an application processor, and so on.
Audio module 270 is used to convert digital audio information into an analog audio signal for output, and also to convert an analog audio input into a digital audio signal. In some embodiments, the audio module 270, or some of its functional modules, may be disposed in the processor 210. Speaker 270A, also referred to as a "horn," is used to convert an audio electrical signal into a sound signal. Receiver 270B, also referred to as an "earpiece," is used to convert an audio electrical signal into a sound signal. Microphone 270C, also referred to as a "mic," is used to convert a sound signal into an electrical signal. The first electronic device 101 may be provided with at least one microphone 270C. The earphone interface 270D is used to connect a wired earphone; it may be the USB interface 230, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
Keys 290 include a power key, volume keys, and the like; they may be mechanical keys or touch keys. The first electronic device 101 may receive key inputs and generate key signal inputs related to user settings and function control of the first electronic device 101. The motor 291 may generate a vibration alert, for example for incoming-call vibration alerting or touch vibration feedback. The indicator 292 may be an indicator light, used to indicate the charging state, a change in power, or messages, missed calls, notifications, and the like. The SIM card interface 295 is used to connect a SIM card. A SIM card may be inserted into or removed from the SIM card interface 295 to make contact with or separate from the first electronic device 101. The first electronic device 101 may support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 295 may support a Nano SIM card, a Micro SIM card, a SIM card, and the like. In some embodiments, the first electronic device 101 employs an embedded SIM (eSIM) card, which may be embedded in the first electronic device 101 and cannot be separated from it.
The processor 210 performs the audio processing method provided by the embodiment of the present application by executing programs and instructions stored in the internal memory 221. The program run by the processor 210 may be based on an operating system, such as the Android operating system, the Apple iOS operating system, or the Windows operating system. As shown in fig. 3, taking the Android operating system as an example, the program run by the processor 210 is layered according to functions and may include an application layer, a framework layer, and a kernel layer.
The application layer may include an audio processing program. The audio processing program is used for processing the original audio to obtain target audio to be played. The target audio is audio that meets the listening needs of the user.
The framework layer is used to provide system resource services, application programming interfaces (application programming interface, APIs), to applications in the application layer. For example, the framework layer provides a communication API to the application layer.
The kernel layer includes an Operating System (OS) kernel. The operating system kernel is used for managing the processes, the memory, the driving program, the file system and the network system of the system.
The second electronic device 102 according to the embodiment of the present application may be a terminal device with relatively simple functions and configuration compared to the first electronic device 101. For example, the second electronic device 102 may be a device with audio playback functionality such as a headset, a Virtual Reality (VR) device, an augmented reality (augmented reality, AR) device, or the like.
In one embodiment, when the second electronic device 102 is a headset, the target audio to be played may be determined by the first electronic device 101 and then sent to the second electronic device 102, because the data processing capability of the headset is limited or the headset does not have the data processing capability. The second electronic device 102 is used to play the target audio. The headset may be a bluetooth headset (e.g., a real wireless stereo (true wireless stereo, TWS) headset), a wired headset, or the like.
In another embodiment, when the second electronic device 102 is a virtual reality device or an augmented reality device with greater data processing capabilities, the second electronic device 102 is configured to determine a target audio and play the target audio.
By way of example, taking the second electronic device 102 as a Bluetooth headset, fig. 4 illustrates one possible configuration of the second electronic device 102. As shown in fig. 4, the second electronic device 102 may include: at least one processor 401, at least one memory 402, a wireless communication module 403, an audio module 404, a power module 405, an input/output interface 406, a sensor 407, etc. The processor 401 may include one or more interfaces for connecting with other components of the Bluetooth headset. The Bluetooth headset can be stored in a headset storage box. The following describes the components of the Bluetooth headset in detail with reference to fig. 4.
The sensor 407 may include a distance sensor, a proximity light sensor, a bone conduction sensor, a touch sensor, and the like. The Bluetooth headset may determine whether it is worn by a user via the distance sensor and the proximity light sensor. Through the bone conduction sensor, the Bluetooth headset can acquire the vibration signal of the vibrating bone mass of the human vocal part, parse out a voice signal, and realize a voice function, thereby receiving voice instructions of the user, etc. The Bluetooth headset can detect a touch operation of the user through the touch sensor. The touch operation may include a single click, double click, multiple clicks, long press, heavy press, etc. by the user.
The wireless communication module 403 may be used to support data exchange between the bluetooth headset and other electronic devices or headset boxes via wireless communication technology. The wireless communication module 403 may further include an antenna, and the wireless communication module 403 receives electromagnetic waves via the antenna, modulates the electromagnetic wave signals, filters the electromagnetic wave signals, and sends the processed signals to the processor 401. The wireless communication module 403 may also receive a signal to be sent from the processor 401, frequency modulate and amplify the signal, and convert the signal to electromagnetic waves through an antenna to radiate the electromagnetic waves.
The audio module 404 may be used to manage audio data, and to implement the bluetooth headset to input and output audio data. For example, the audio module 404 may obtain audio data from the wireless communication module 403 or transfer audio data to the wireless communication module 403, to implement a function of playing music through a bluetooth headset, and the like. The audio module 404 may include a speaker (or earpiece, receiver) assembly for outputting audio data. Speakers may be used to convert audio electrical signals into sound signals and play them.
In one embodiment, the audio module 404 may also include a feedback microphone (or "mic"), a microphone pickup circuit that mates with the feedback microphone, and the like. The feedback microphone may be used to convert sound signals into audio electrical signals.
The power module 405 may be used to provide a system power for the bluetooth headset, power the various modules of the bluetooth headset, support the bluetooth headset to receive charging inputs, and so on. The power module 405 may include a power management unit (power management unit, PMU) and a battery. Wherein the power management unit may receive an external charging input; the electric signals input by the charging circuit are transformed and then provided for the battery to charge, and the electric signals provided by the battery can be transformed and then provided for other modules such as an audio module 404, a wireless communication module 403 and the like; and to prevent overcharging, overdischarging, shorting, or overcurrent of the battery, etc. In some embodiments, the power module 405 may also include a wireless charging coil for wirelessly charging the bluetooth headset. In addition, the power management unit can also be used for monitoring parameters such as battery capacity, battery cycle times, battery health status (leakage, impedance) and the like.
A plurality of input/output interfaces 406 may be used to provide a wired connection for charging or communication between the bluetooth headset and the headset case. In some embodiments, the input/output interface may be a USB interface. In other embodiments, the input/output interface 406 may be an electrical connector of the headset, through which the bluetooth headset may establish an electrical connection with an electrical connector in the headset case when the bluetooth headset is placed in the headset case, thereby charging a battery in the bluetooth headset. In other embodiments, the bluetooth headset may also be in data communication with the headset box after the electrical connection is established, e.g., may receive pairing instructions from the headset box.
The memory 402 may be used for storing program codes, such as program codes for physically connecting a bluetooth headset to a plurality of electronic devices, for interfacing with service specifications of the electronic devices, for handling audio services of the electronic devices (e.g., music playing, receiving/making a call, etc.), for charging a bluetooth headset, for wireless pairing of a bluetooth headset with other electronic devices, etc. As another example, program code for determining a first transfer function between a speaker of the bluetooth headset and the feedback microphone when the user is wearing the bluetooth headset, determining a transient response function (i.e., a second transfer function) of the bluetooth headset in the acoustic system described above, determining target audio, and so forth. The memory 402 may also be used to store other information, such as the priority of the electronic device.
Processor 401 may be used to execute the program code stored in memory 402, invoking the associated modules to implement the functionality of the Bluetooth headset. For example, when the user wears the Bluetooth headset, the processor determines a first transfer function between the speaker and the microphone according to the audio data output by the speaker of the Bluetooth headset and the audio data collected by the feedback microphone of the Bluetooth headset. For another example, it determines the transient response function (i.e., the second transfer function) of the Bluetooth headset in the acoustic system described above, determines the target audio, and so forth.
It will be appreciated that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the bluetooth headset. It may have more or fewer components than shown in fig. 4, may combine two or more components, or may have a different configuration of components. For example, the outer surface of the bluetooth headset may further include a button 408, an indicator light (which may indicate a state of power, incoming/outgoing call, pairing mode, etc.), a display screen (which may prompt a user about information), a dust screen (which may be used with the earpiece), etc. The key 408 may be a physical key or a touch key (used with a touch sensor) for triggering operations such as powering on, powering off, pausing, playing, recording, starting pairing, resetting, etc.
During the process of playing the input audio by the second electronic device, as shown in fig. 5, the amplitude of the output audio of the speaker of the second electronic device has a build-up process from 0 to a maximum value. When the second electronic device stops playing the input audio, the vibration of the diaphragm of the speaker does not stop immediately, so the amplitude of the output audio of the speaker does not attenuate immediately. Thus, the amplitude of the output audio also has a tail process from the maximum value to 0. If the build-up process and the tail of the output audio both take a long time, the transient response of the speaker of the second electronic device is considered to be poor. If the transient response of the speaker is poor, it may be difficult for the speaker to accurately reproduce the input audio. That is, a poor transient response may result in a low overlap between the waveform of the output audio and the waveform of the input audio; conversely, the better the transient response, the higher the overlap between the waveform of the output audio and the waveform of the input audio.
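As a rough numerical illustration of this idea (not part of the patent; the function name, the synthetic envelope, and the 10%-90% crossing convention are all assumptions), the build-up time and tail time of a speaker's output envelope could be estimated as follows:

```python
import numpy as np

def rise_and_tail_times(envelope, fs, lo=0.1, hi=0.9):
    """Estimate how long the output amplitude takes to build up
    (first crossing of lo*peak to first crossing of hi*peak) and
    to tail off (last sample above hi*peak to last sample above
    lo*peak). Long times suggest a poor transient response."""
    peak = envelope.max()
    above_lo = envelope >= lo * peak
    above_hi = envelope >= hi * peak
    rise = (np.argmax(above_hi) - np.argmax(above_lo)) / fs
    last_hi = len(envelope) - 1 - np.argmax(above_hi[::-1])
    last_lo = len(envelope) - 1 - np.argmax(above_lo[::-1])
    tail = (last_lo - last_hi) / fs
    return rise, tail

# Synthetic envelope: linear build-up, steady section, slower tail.
fs = 1000
env = np.concatenate([np.linspace(0.0, 1.0, 101),
                      np.ones(99),
                      np.linspace(1.0, 0.0, 201)])
rise, tail = rise_and_tail_times(env, fs)
# For this envelope the tail is twice as long as the build-up.
```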
The relationship between the waveform of the transient response function of the speaker of the second electronic device, the waveform of the input audio, and the waveform of the output audio will be briefly described below with reference to fig. 6 and 7.
Illustratively, assume the input audio is a "clean" audio that can be represented by a time-domain impulse signal x1(t) = δ(t), and that the first transient response function of the loudspeaker of the second electronic device is h1(t). A in fig. 6 shows the waveform diagram of the input audio x1(t), B in fig. 6 shows the waveform diagram of the first transient response function h1(t) of the loudspeaker, and C in fig. 6 shows the waveform diagram of the output audio y1(t) = x1(t) * h1(t). As can be seen by comparing the waveforms in A to C in fig. 6: the waveform of the output audio of the speaker of the second electronic device is identical to the waveform of the first transient response function of the speaker, i.e., y1(t) = h1(t); however, the waveform of the output audio of the speaker of the second electronic device is different from the waveform of the input audio. Thus, the output audio of the speaker of the second electronic device may be described by the transient response function of the speaker of the second electronic device.
When the input audio is continuous audio, the transient response of the speaker of the second electronic device has an even greater influence on its output audio. Specifically, the output audio of the speaker of the second electronic device is related not only to the input audio at the current moment but also to the input audio over a period of time before it.
Illustratively, assume the input audio played by the second electronic device is a time-domain impulse signal x2(t) = δ(t), and the second transient response function of the loudspeaker of the second electronic device is h2(t). A in fig. 7 shows the waveform diagram of the second transient response function h2(t) of the loudspeaker of the second electronic device in the ideal situation, and B in fig. 7 shows the waveform diagram of the output audio y2(t) = x2(t) * h2(t). As can be seen by comparing the waveform diagram in A in fig. 6, the waveform diagram in A in fig. 7, and the waveform diagram in B in fig. 7: the waveform of the output audio of the speaker of the second electronic device is identical to the waveform of the second transient response function of the speaker, i.e., y2(t) = h2(t); and the waveform diagram of the output audio of the speaker of the second electronic device is the same as the waveform diagram of the input audio. It can be seen that the second transient response function h2(t) = δ(t) is an ideal transient response function of the loudspeaker.
In summary, the output audio of the speaker of the second electronic device is related to the transient response function of the speaker of the second electronic device. Therefore, by improving the transient response function of the speaker of the second electronic device, the performance of the speaker may be improved, thereby improving the quality of the output audio of the second electronic device and thus the listening experience of the user. For example, if it is desired to make the output audio of the second electronic device coincide with the input audio of the second electronic device, i.e., to achieve a high-fidelity audio output, the transient response function of the speaker of the second electronic device may be adjusted to the ideal impulse response h2(t) = δ(t).
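The relationship summarized above, namely that the output audio is the input audio convolved with the speaker's transient response function, can be checked with a small numerical sketch (an illustration only; the signals and values below are assumptions, not taken from the patent):

```python
import numpy as np

# Input audio as a time-domain impulse signal x(t) = delta(t).
x = np.zeros(16)
x[0] = 1.0

# A non-ideal transient response h1(t): the output builds up and tails off.
h1 = np.array([0.2, 0.6, 1.0, 0.7, 0.4, 0.2, 0.1])

# Output audio y1(t) = x(t) * h1(t) (linear convolution).
y1 = np.convolve(x, h1)

# With an impulse input, the output reproduces h1, not the input.
assert np.allclose(y1[:len(h1)], h1)

# The ideal transient response h2(t) = delta(t) reproduces any input.
h2 = np.zeros(4)
h2[0] = 1.0
audio = np.sin(2 * np.pi * 0.05 * np.arange(64))   # a continuous input
y2 = np.convolve(audio, h2)[:len(audio)]
assert np.allclose(y2, audio)
```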
When the user wears the second electronic device to listen to audio, the second electronic device forms an acoustic system (or acoustic environment) with the user's ear canal, as shown in fig. 8. The output audio (i.e., the first audio) output by the speaker of the second electronic device is the input signal to the acoustic system and the audio received by the user's eardrum location is the audio heard by the user. The audio heard by the user is an output signal of the acoustic system. Thus, the transient response function of the second electronic device in the acoustic system may affect the listening experience of the user. The transient response function of the acoustic system is related to the ear canal characteristics of the user and the wearing state of the second electronic device when the user is wearing the second electronic device. Therefore, in order to improve the listening experience of the user, based on the original hardware of the second electronic device, the characteristics of the ear canal of the user and the wearing state of the user when wearing the second electronic device need to be considered.
It should be noted that the characteristics of the ear canal may include: the size of the ear canal, the shape of the ear canal, the acoustic impedance of the eardrum, and the like, which can reflect the physiological acoustic characteristics of the human ear. The wearing state refers to the tightness with which the user wears the second electronic device.
Based on the above, the embodiment of the application provides an audio processing method, when a user wears a second electronic device to listen to audio, the first electronic device (or the second electronic device) processes the original audio by combining the characteristics of the auditory canal of the user, the wearing state of the user and the target sound effect selected by the user on the basis of the original hardware of the second electronic device, so that the target audio capable of meeting the listening requirement of the user is obtained, and the user has better listening experience in the process of wearing the second electronic device to listen to audio.
Illustratively, it is assumed that the audio processing method is performed by a first electronic device, and that the first electronic device is a cell phone. As shown in a of fig. 9, the setup interface of the handset includes controls 901 for sound and vibration. In response to the user clicking on the sound and vibration control 901, the handset jumps from the setup interface to the sound and vibration interface 902, as shown at B in fig. 9. The sound and vibration interface 902 includes an audio processing control 903. In response to the user clicking the audio processing control 903, the mobile phone starts an audio processing function, starts to run an audio processing program, and executes the audio processing method provided by the embodiment of the application.
Specifically, as shown in fig. 10, the following description takes as an example the case where the first electronic device provides the original audio and obtains the target audio to be played from the original audio, and the second electronic device plays the target audio. The audio processing method may include:
s1001, the first electronic device acquires a first transfer function and a target sound effect.
The first transfer function is used for indicating a response that the audio played by the loudspeaker of the second electronic device is received by the second electronic device after being transmitted through the acoustic system. The acoustic system is related to the characteristics of the ear canal of the user and the wearing state of the second electronic device when the user wears the second electronic device. The target sound effect is the sound effect selected by the user, and is the effect achieved when the user expects the audio to be played.
The manner in which the first transfer function and the target sound effect are obtained is described below, respectively.
As shown in fig. 8, when the user wears the second electronic device, the second electronic device forms an acoustic system (or acoustic environment) with the user's ear canal. The first transfer function may be a correspondence between an input signal and a first output signal of the acoustic system. The input signal may be a first audio played by a speaker of the second electronic device, and the first output signal may be a second audio received by the second electronic device after the first audio propagates through the acoustic system.
The second electronic device may receive the second audio through a built-in feedback microphone, but not every second electronic device includes a feedback microphone. Likewise, not every second electronic device has a data processing function. Thus, how the first transfer function is obtained depends on whether the second electronic device includes a feedback microphone and whether the second electronic device has a data processing function. Examples follow.
In one embodiment, a built-in feedback microphone is included in the second electronic device, and the second electronic device has a data processing function. At this time, the first audio played by the speaker of the second electronic device may be propagated to the feedback microphone of the second electronic device through the acoustic system, so that the feedback microphone of the second electronic device receives the second audio. And the second electronic equipment processes the first audio and the second audio to obtain a first transfer function. The second electronic device then transmits the first transfer function to the first electronic device. That is, when the feedback microphone is included in the second electronic device and the second electronic device has a data processing function, the first transfer function may be determined by the second electronic device, and the first electronic device may receive the first transfer function transmitted by the second electronic device.
Of course, the second electronic device may also send the first audio and the second audio to the first electronic device. Then, the first electronic device processes the first audio and the second audio, and directly acquires a first transfer function.
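As a hedged illustration of the kind of processing involved, a naive (non-adaptive) estimate of such a transfer function can be computed by frequency-domain division of the recorded second audio by the played first audio. The patent describes an adaptive-filter approach, so the sketch below is only a simplified stand-in; the function name and regularization constant are assumptions:

```python
import numpy as np

def estimate_transfer_function(first_audio, second_audio, eps=1e-12):
    """Estimate H(f) = Y(f) / X(f) by frequency-domain division.
    first_audio: signal played by the speaker; second_audio: signal
    captured by the feedback microphone (same length). eps avoids
    division by zero at frequencies the first audio does not excite."""
    X = np.fft.rfft(first_audio)
    Y = np.fft.rfft(second_audio)
    return Y / (X + eps)

# Sanity check: with a known acoustic path h, the estimate recovers H(f).
n = 64
x = np.zeros(n)
x[0] = 1.0                              # broadband first audio (impulse)
h = np.zeros(n)
h[:4] = [1.0, 0.5, 0.25, 0.125]         # assumed acoustic-path response
y = np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(h), n)   # "second audio"
H_est = estimate_transfer_function(x, y)
assert np.allclose(H_est, np.fft.rfft(h), atol=1e-6)
```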
The first audio may be: an in-ear prompt sound pre-stored in the second electronic device, or downlink audio or test audio sent by the first electronic device to the second electronic device.
For example, assume that the first electronic device is a cell phone and the first audio is test audio. As shown in A in fig. 11, the main interface 1101 of the cell phone displays a settings application icon 1102. In response to the user clicking the settings application icon 1102, the cell phone jumps from the main interface 1101 to the settings interface 1103, as shown in B in fig. 11. The settings interface 1103 displays the ear canal adaptation switch control 1104. In response to the user clicking the ear canal adaptation switch control 1104, as shown in C in fig. 11, the ear canal adaptation function switches to the on state, and at the same time the cell phone sends test audio to the second electronic device.
Optionally, an adaptive filter may be pre-stored in the second electronic device. The adaptive filter may include: an adjustable filter with adjustable filter coefficients and an adaptive filtering algorithm for adjusting the filter coefficients of the adjustable filter. The second electronic device obtains the target filter coefficient of the adjustable filter through the adjustable filter, the adaptive filtering algorithm, the first audio, and the second audio, thereby obtaining the first transfer function.
Illustratively, a schematic diagram of the frame structure of an adaptive filter is shown in fig. 12. As shown in fig. 12, the signals involved in the adaptive filter include: the input signal (the first audio), the desired signal, and the output signal (the second audio). In determining the first transfer function: first, the second electronic device determines the difference between the desired signal and the second audio corresponding to the first audio as the error signal. The second electronic device then adjusts the filter coefficient of the adjustable filter according to the first audio, the error signal, the adjustable filter, and the adaptive filtering algorithm until the error signal is minimized. When the error signal is minimal, the target filter coefficient of the tunable filter is obtained, and thus the target filter. The target filter is the first transfer function.
In the practical application process, once the input signal changes, the adjustable filter can automatically track the change of the input signal and automatically adjust its own filter coefficients until the target filter coefficient is obtained, thereby realizing the adaptation process.
The adaptive filtering algorithm may include: least mean square algorithm (least mean square, LMS), normalized least mean square algorithm (normalized least mean square, NLMS), recursive least square method (recursive least square, RLS), but is not limited thereto.
Illustratively, if the adaptive filtering algorithm is the least mean square algorithm, the filter coefficient of the tunable filter may be determined by the following formula (1):

W(n) = W(n-1) + μ · E(n) · X*(n)    formula (1)

where μ is the step size; W(n) is the filter coefficient of the adjustable filter at the current iteration; W(n-1) is the filter coefficient of the adjustable filter at the previous iteration; X(n) is the frequency-domain sequence of the first audio; Y(n) is the frequency-domain sequence of the second audio; E(n) = D(n) − Y(n) is the error signal, where D(n) is the frequency-domain sequence of the desired signal; X*(n) is the conjugate of the frequency-domain sequence of the first audio; and n is the iteration number.
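As an illustrative sketch only, the iterative coefficient update of formula (1) can be mimicked by a simple time-domain LMS loop that identifies an unknown transfer function from input/output audio. The variable names, tap count, and step size below are assumptions, and the patent's update operates on frequency-domain sequences:

```python
import numpy as np

def lms_identify(x, d, num_taps=8, mu=0.05, num_epochs=200):
    """Identify an unknown transfer function with a time-domain LMS
    adaptive filter: w <- w + mu * e(n) * x(n), where the error
    e(n) = d(n) - y(n) is driven toward its minimum."""
    w = np.zeros(num_taps)                      # adjustable filter coefficients
    for _ in range(num_epochs):
        for n in range(num_taps - 1, len(x)):
            x_vec = x[n - num_taps + 1:n + 1][::-1]   # most recent first
            y = w @ x_vec                             # adjustable-filter output
            e = d[n] - y                              # error signal
            w = w + mu * e * x_vec                    # coefficient update
    return w

# Identify an assumed 4-tap "first transfer function" from the first
# audio (speaker input) and second audio (feedback-microphone signal).
rng = np.random.default_rng(0)
h_true = np.array([1.0, -0.5, 0.25, 0.1])
x = rng.standard_normal(400)                # first audio
d = np.convolve(x, h_true)[:len(x)]         # second audio
w = lms_identify(x, d, num_taps=4)
assert np.allclose(w, h_true, atol=1e-2)
```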
Of course, the adaptive filter may be pre-stored in the first electronic device. After the first electronic device receives the first audio and the second audio sent by the second electronic device, the first electronic device may directly acquire the first transfer function by adopting the above process.
In summary, the acoustic systems of different users are different due to the different ear canal characteristics and wearing states of different users. I.e. the first transfer functions of different users are different. When the second electronic device includes a feedback microphone, the second electronic device (or the first electronic device) may obtain the first transfer function of the different users, i.e. may be able to automatically identify the characteristics of the ear canal of the user, and determine the wearing state of the user when wearing the second electronic device. Thus, for different users, target audio meeting the listening requirements of different users can be obtained in a customized mode, and therefore listening experience of the users is improved. In addition, in the practical application process, the first electronic device can acquire a first transfer function as long as the speaker of the second electronic device plays the first audio. In this way, the second electronic device (or the first electronic device) can adjust in time in the process of obtaining the target audio. Based on the method, any target audio played can meet the listening requirement of the user, and listening experience of the user is further improved.
In another embodiment, the feedback microphone is not included in the second electronic device. If the second electronic device does not include the feedback microphone, the second electronic device cannot acquire the second audio, and the second electronic device cannot acquire the first transfer function. In this case, the first transfer function is a preset transfer function.
The preset transfer function may be a transfer function obtained by having an artificial ear or artificial head wear a second electronic device that includes a feedback microphone, in a standard wearing state and in a test environment, and adopting the above manner of determining the first transfer function. For the specific process, refer to the above; it is not repeated here.
It should be noted that the artificial head is an ergonomically designed device, which includes an artificial ear. The artificial ear is a simulation device for simulating the physical characteristics of the human ear, and has the same acoustic characteristics as the real human ear. In an embodiment of the application, the artificial ear has a standard ear canal (i.e., a target ear canal). The standard ear canal may be designed after a tester has analyzed a large number of ear canal characteristic sample data. The standard wearing state refers to a state in which the second electronic device is properly worn.
Optionally, the first electronic device may determine, according to the model and type of the second electronic device, the configuration parameters of the second electronic device, so as to determine whether the second electronic device includes a feedback microphone, and further determine whether to use the preset transfer function as the first transfer function. If the first electronic device determines that the second electronic device does not include a feedback microphone, the first electronic device directly determines the preset transfer function as the first transfer function. How the first electronic device determines whether the second electronic device includes a feedback microphone is briefly described below.
In one embodiment, the first electronic device may have pre-stored therein configuration parameters for a plurality of models and types of second electronic devices. In the actual application process, the first electronic device can respond to the operation of selecting the second electronic device worn by the user from the prestored multiple second electronic devices to acquire the model and the type of the second electronic device. The first electronic device determines whether the second electronic device of the model and the type comprises a feedback microphone according to the configuration parameters of the second electronic device of the model and the type.
For example, assume that the first electronic device is a cell phone and the second electronic device is a headset. As shown in fig. 13, the cell phone displays an adapted headset model interface 1301. The adapted headset model interface 1301 includes two sections: the type of the headset and the model of the headset. In the type section, a plurality of headset types are displayed; in the model section, a plurality of headset models are displayed. First, in response to the user's selection of the headset type and headset model, the cell phone obtains the model and type of the headset. Then, the cell phone looks up the configuration parameters of the headset according to the model and type, so as to determine whether that model and type of headset includes a feedback microphone.
In another embodiment, the second electronic device worn by the user is not among the second electronic devices pre-stored in the first electronic device. First, the first electronic device may acquire the model and type of the second electronic device worn by the user in response to an operation of the user inputting the model and type of the second electronic device in the first electronic device. The first electronic device may then look up the configuration parameters of that model and type of second electronic device over the Internet to determine whether the second electronic device includes a feedback microphone.
Illustratively, assume the first electronic device is a cell phone. Referring to fig. 13, as shown in A in fig. 14, the adapted headset model interface 1301 of the cell phone further includes: a first add control 1401 and a second add control 1402. The first add control 1401 is located in the type section of the headset, and the second add control 1402 is located in the model section of the headset. In response to the user clicking the first add control 1401, as shown in B in fig. 14, the adapted headset model interface 1301 may display a first drop-down list control 1403. In response to the user clicking the first drop-down list control 1403, as shown in C in fig. 14, the type section of the adapted headset model interface 1301 displays a headset type list 1405. The headset type list includes controls for multiple types of headset (e.g., universal neck-band). In response to the user clicking the universal neck-band control, as shown in D in fig. 14, the type section of the adapted headset model interface 1301 displays the universal neck-band selection control 1407, and the universal neck-band selection control 1407 is displayed in the selected state. In response to the user clicking the second add control 1402, as shown in B in fig. 14, the adapted headset model interface 1301 may display a second drop-down list control 1404. In response to the user clicking the second drop-down list control 1404, as shown in C in fig. 14, the model section of the adapted headset model interface 1301 displays a model list 1406 of the headset. The model list includes controls for various headset models (e.g., Model A Youth Edition). In response to the user clicking the Model A Youth Edition control, as shown in D in fig. 14, the model section of the adapted headset model interface 1301 displays the Model A Youth Edition selection control 1408, and the selection control 1408 is displayed in the selected state. Finally, the cell phone obtains the configuration parameters of the headset from the Internet according to the model and type of headset added by the user, so as to determine whether that model and type of headset includes a feedback microphone.
In another embodiment, when the first electronic device is connected to the second electronic device, the second electronic device actively transmits its own model and type to the first electronic device. The first electronic device first checks, according to the received model and type, whether that second electronic device is pre-stored in the first electronic device. If a second electronic device of that model and type is found, whether the second electronic device includes a feedback microphone is determined directly from its configuration parameters. If it is not found, the first electronic device may acquire the configuration parameters of the second electronic device over the internet and thereby determine whether the second electronic device includes a feedback microphone.
In another embodiment, if the first electronic device cannot acquire the model and the type of the second electronic device, and the first electronic device does not receive the first audio and the second audio sent by the second electronic device or does not receive the first transfer function sent by the second electronic device within a preset period of time (for example, 3 seconds after the first electronic device and the second electronic device are connected), the first electronic device determines that the second electronic device does not include the feedback microphone.
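The decision flow across these embodiments (pre-stored table first, then an internet lookup, then the timeout fallback) can be sketched as follows. The table contents, function names, and the stubbed internet lookup are hypothetical illustrations, not the patent's actual implementation:

```python
PRESTORED = {  # hypothetical pre-stored table of second electronic devices
    ("neck-band", "Model A"): {"feedback_mic": True},
}

def lookup_online(dev_type, model):
    """Stub for the internet lookup of configuration parameters (hypothetical)."""
    return None  # assume nothing is found in this sketch

def has_feedback_mic(dev_type, model, audio_received, timeout_expired):
    # 1) pre-stored configuration parameters
    cfg = PRESTORED.get((dev_type, model))
    # 2) otherwise, look the model and type up over the internet
    if cfg is None and dev_type is not None:
        cfg = lookup_online(dev_type, model)
    if cfg is not None:
        return cfg["feedback_mic"]
    # 3) fallback: if no first/second audio and no first transfer function
    #    arrived within the preset period, assume no feedback microphone
    if timeout_expired and not audio_received:
        return False
    return audio_received
```

The fallback branch mirrors the preset-period rule above: absence of both the model/type and any received audio within the window is treated as "no feedback microphone".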
In the embodiment of the present application, a variety of sound effects may be pre-stored in the first electronic device, and the first electronic device may obtain the target sound effect in response to a selection operation by the user on the first electronic device. The pre-stored sound effects may include, but are not limited to: a high-fidelity sound effect, a stage sound effect (or live sound effect), and an electronic sound effect.
Illustratively, assume that the first electronic device is a cell phone. As shown in fig. 15, the handset displays an audio selection interface 1501. The sound effect selection interface comprises: a high fidelity sound selection control 1502, a stage sound selection control 1503, and an electrical sound selection control 1504. In response to a user clicking on the select control 1502 of the hi-fi effect, the handset determines the hi-fi effect as the target effect.
It should be noted that there may be an association between the sound effect selection interface and the sound and vibration interface. For example, after the user clicks the audio processing control, the first electronic device may jump from the sound and vibration interface to the sound effect selection interface.
S1002, the first electronic device obtains a second transfer function according to the first transfer function.
The second transfer function may be the correspondence between the input signal and the second output signal of the acoustic system described above; that is, the second transfer function is the transient response function of the second electronic device in the acoustic system. The second output signal may be the third audio received at the position of the user's eardrum, that is, the audio actually heard by the user. Thus, the second transfer function may reflect the user's actual listening experience when hearing audio played by the speaker of the second electronic device.
Since the third audio cannot be measured, the second transfer function cannot be determined by measurement in the way the first transfer function is. However, because the first transfer function and the second transfer function are transfer functions at different measurement locations in the same acoustic system, the two are interrelated, and therefore the second transfer function can be determined from the first transfer function.
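For context, a transfer function such as the first transfer function is obtained from an input/output audio pair (the audio played by the speaker and the audio received by the feedback microphone). A minimal frequency-domain estimate, with synthetic signals standing in for the real recordings, might look like this:

```python
import numpy as np

def estimate_transfer_function(played, received, n_fft=256, eps=1e-10):
    """Regularized estimate H(w) = Y(w) X*(w) / (|X(w)|^2 + eps)."""
    X = np.fft.rfft(played, n_fft)
    Y = np.fft.rfft(received, n_fft)
    return Y * np.conj(X) / (np.abs(X) ** 2 + eps)

# synthetic check: pass white noise through a known 3-tap response
rng = np.random.default_rng(0)
x = rng.standard_normal(256)            # stand-in for "audio played by the speaker"
h_true = np.array([1.0, 0.5, 0.25])     # made-up acoustic response
# circular convolution, so the FFT-ratio model is exact for this check
y = np.fft.irfft(np.fft.rfft(x, 256) * np.fft.rfft(h_true, 256), 256)
H_est = estimate_transfer_function(x, y)
```

The regularizer `eps` is an assumption of this sketch; it merely guards against near-zero spectral bins in the excitation.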
A first neural network model is pre-stored in the first electronic device. The first neural network model is used to indicate the mapping relationship between the first transfer function and the second transfer function. The first electronic device inputs the first transfer function into the first neural network model and, after processing by the model, obtains the second transfer function.
Optionally, as shown in fig. 16, the first neural network model according to the embodiment of the present application may include: an input layer, at least one hidden layer, and an output layer. The output layer may be a fully connected layer (fully connected layer, FC); the hidden layer may be a convolutional layer of a convolutional neural network (convolutional neural network, CNN), a long short-term memory (long short term memory, LSTM) layer, or a combination of such layers, so as to implement different functions. The first neural network model may perform a linear operation on the data and the network parameters, and implement nonlinear operations through an activation function. In addition, in order to simplify the neural network model, operations such as taking the maximum, minimum, or average, and pooling may be performed on the output of the linear operation.
The linear operation in the first neural network model can be expressed as y = Wx + b, where y represents the output (i.e., the second transfer function), x represents the input (i.e., the first transfer function), W represents the weights, and b represents the bias. The operation Wx may be referred to as general matrix multiplication (general matrix multiplication, GEMM), and the network parameters of the first neural network include the weights W and the bias b.
One common activation function is the rectified linear unit (rectified linear unit, ReLU). The role of the activation function is to add nonlinearity to the first neural network model so that it can better solve complex problems. Without an activation function, only linear transformations are performed, which amounts to composing linear equations: even if the number of layers is increased, the whole first neural network model remains equivalent to a linear regression model, and its ability to solve complex problems is limited. With an activation function, nonlinear transformations can be realized, so that the first neural network model can learn and perform more complex tasks.
Pooling, i.e., spatial pooling, is a method for extracting features in a CNN. By performing aggregate statistics over different features, a relatively lower dimensionality is obtained while overfitting is avoided. Pooling can preserve most of the important information while reducing the individual feature dimensions.
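The building blocks above (the GEMM linear operation y = Wx + b, the ReLU activation, average pooling, and a fully connected output layer) can be combined into a toy forward pass. The layer sizes and weights below are arbitrary illustrations, not the patent's actual network parameters:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(h1, hidden, output):
    """hidden: list of (W, b) GEMM layers; output: (W, b) fully connected layer."""
    x = np.asarray(h1, dtype=float)
    for W, b in hidden:
        x = relu(W @ x + b)              # linear operation + nonlinear activation
    x = x.reshape(-1, 2).mean(axis=1)    # simple average pooling over pairs
    W_out, b_out = output
    return W_out @ x + b_out             # fully connected output layer

# arbitrary illustrative parameters: identity hidden layer, summing output layer
hidden = [(np.eye(4), np.zeros(4))]
output = (np.array([[1.0, 1.0]]), np.zeros(1))
y = forward([1.0, 2.0, 3.0, 4.0], hidden, output)  # pooled to [1.5, 3.5], summed to 5.0
```

A real deployment would use convolutional or LSTM hidden layers as the description notes; a plain GEMM stack is used here only to keep the sketch self-contained.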
Optionally, the training process of the first neural network model is as follows:
First, a plurality of first sample data sets are acquired. Specifically, a large number of volunteers are recruited. Each volunteer repeatedly wears an earphone that includes a feedback microphone, such as a true wireless stereo (true wireless stereo, TWS) earphone, and a probe microphone is fixed at the eardrum position of each volunteer. The first test audio is then played through the speaker of the earphone. The first test audio is received by the feedback microphone of the earphone via the volunteer's ear canal, resulting in the second test audio; it is also received by the probe microphone via the volunteer's ear canal, resulting in the third test audio. Finally, a first sample transfer function H1(i, j) is obtained from the first test audio and the second test audio, and a second sample transfer function H2(i, j) is obtained from the first test audio and the third test audio. Here, i numbers the volunteers, 1 ≤ i ≤ M, where M is a positive integer greater than 1; j numbers the wearings, 1 ≤ j ≤ N, where N is a positive integer greater than 1. On this basis, a plurality of first sample data sets {H1(i, j), H2(i, j)} for training the first neural network model can be obtained.
The volunteers' ear canals include ear canals of different sizes, ear canals of different shapes, ear canals with different eardrum acoustic impedances, normal ear canals, pathological ear canals, and the like. That is, the collected first sample data sets cover a wide variety of ear canal characteristics. Meanwhile, the wearing state may differ each time a volunteer wears the earphone, so the trained first neural network model is more comprehensive and its prediction accuracy is higher.
Second, a set of first sample data H1(i, j) is input into the neural network model to be trained, obtaining an actually output second transfer function Ĥ2(i, j). Then, the error between the second sample transfer function H2(i, j) and the output second transfer function Ĥ2(i, j) is determined. This error is used to adjust the weights between the hidden layers of the neural network model to be trained. This process is repeated for each first sample data set until the error converges within the control range for all first sample data sets. In this way, the trained first neural network model is obtained.
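The iterative error-driven weight adjustment described above can be sketched with plain gradient descent on a linear (single-GEMM) stand-in for the model; the synthetic sample transfer functions and all dimensions are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
M, dim = 200, 8                      # sample count and transfer-function length (made up)
A_true = rng.standard_normal((dim, dim)) * 0.3
H1 = rng.standard_normal((M, dim))   # synthetic first sample transfer functions
H2 = H1 @ A_true.T                   # synthetic second sample transfer functions

W = np.zeros((dim, dim))             # weights to be trained
losses = []
for step in range(500):
    pred = H1 @ W.T                  # actually output second transfer functions
    err = pred - H2                  # error vs. the second sample transfer functions
    losses.append(float(np.mean(err ** 2)))
    grad = 2.0 * err.T @ H1 / M      # descent direction from the squared error
    W -= 0.05 * grad                 # adjust the weights using the error
```

On this toy problem the loss converges essentially to zero; the real model would instead use backpropagation through its hidden layers until the error falls within the control range.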
S1003, the first electronic device obtains target audio to be played according to the second transfer function, the original audio and the target sound effect.
Optionally, the process of obtaining the target audio by the first electronic device may include: first, the first electronic device obtains a target filter or a target neural network for processing the original audio, according to the second transfer function and the target sound effect; then, the first electronic device processes the original audio through the target filter or the target neural network to obtain the target audio to be played.
The second transfer function may be used to indicate the real listening experience when the user wears the second electronic device to listen to audio; the target sound effect is the effect the user desires the audio to achieve when played; and a sound effect may be represented by a transient response function. Thus, a target filter or target neural network capable of processing the second transfer function into the target sound effect is one that can process the original audio into target audio meeting the listening needs of the user.
In combination with B in fig. 6, as shown in fig. 17, when the target sound effect is the high-fidelity sound effect, the time domain sequence h2(n) of the second transfer function, after being processed by the target filter or the target neural network, yields the transient response function g(n) of the target sound effect. It is noted that a parameter of the transient response function g(n) of the target sound effect is set according to h2(n), so that the acoustic system described above satisfies causality.
Optionally, the process of determining the target filter by the first electronic device is as follows:

A plurality of filters may be pre-stored in the first electronic device, each of which can process the second transfer function into the transient response function corresponding to one sound effect. After the first electronic device acquires the target sound effect, it can determine the target filter according to the correspondence between sound effects and filters.
The filters can be classified into: finite impulse response (finite impulse response, FIR) filters (also called non-recursive filters) and infinite impulse response (infinite impulse response, IIR) filters (also called recursive filters).
The method of determining each type of filter is briefly described as follows:
Illustratively, assume the filter is an FIR filter. First, the order N of the FIR filter is set; then, the FIR filter can be determined by the following formula (2):

w = argmin_w || h2(n) * w(n) − g(n) ||_2^2    formula (2)

where w(n) is the time domain sequence of the frequency response function of the FIR filter; h2(n) is the time domain sequence of the second transfer function (H2(ω) is the frequency domain sequence of the second transfer function, and h2(n) and H2(ω) are a Fourier transform pair); * represents the convolution operation; ||·||_2^2 represents the square of the 2-norm; and g(n) is the time domain sequence of the transient response function of the target sound effect (for example, the transient response corresponding to the high-fidelity sound effect).
For example, the FIR filter can be expressed in the frequency domain by the following formula (3):

W(ω) = H2*(ω) G(ω) / |H2(ω)|²    formula (3)

where W(ω) is the frequency domain sequence of the frequency response function of the FIR filter; H2(ω) is the frequency domain sequence of the second transfer function; H2*(ω) is the conjugate of the frequency domain sequence of the second transfer function; and G(ω) is the frequency domain sequence of the transient response function of the target sound effect.
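Formula (3) can be evaluated numerically. In the sketch below, the second transfer function is a made-up 3-tap response, the target transient response is a pure delayed impulse (a stand-in for a high-fidelity target, delayed for causality), and a small regularization term is added to avoid division by zero; none of these values come from the patent:

```python
import numpy as np

N = 256                                  # FFT length / number of FIR taps (made up)
h2 = np.array([1.0, 0.5, 0.25])          # hypothetical second transfer function (time domain)
g = np.zeros(N)
g[8] = 1.0                               # target transient response: delayed unit impulse

H2 = np.fft.rfft(h2, N)                  # H2(w): frequency domain sequence
G = np.fft.rfft(g, N)                    # G(w): frequency domain target
W_freq = np.conj(H2) * G / (np.abs(H2) ** 2 + 1e-8)   # formula (3), regularized
w = np.fft.irfft(W_freq, N)              # time domain FIR coefficients

# check: filtering h2 with w reproduces the target response (circular convolution)
out = np.fft.irfft(np.fft.rfft(h2, N) * np.fft.rfft(w, N), N)
```

Because H2·W ≈ G by construction, convolving the second transfer function with the designed filter recovers the delayed impulse, i.e. the target sound effect's transient response.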
Illustratively, assume the filter is an IIR filter. The frequency response function of the IIR filter may be determined from the frequency response function of the FIR filter. Specifically, it can be determined by the following formula (4):

{a, b} = argmin_{a, b} || w(n) − w_iir(n; a, b) ||_2^2    formula (4)

where w(n) is the time domain sequence of the frequency response function of the FIR filter; w_iir(n; a, b) is the time domain sequence of the frequency response function of the IIR filter; a and b are the coefficients of the IIR filter; and ||·||_2^2 represents the square of the 2-norm.
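Formula (4) only states a least-squares match between the FIR and IIR responses without fixing a fitting method; a Prony-style linear least-squares fit is one common way to realize it. The sketch below recovers made-up first-order coefficients from their own impulse response:

```python
import numpy as np

def prony_fit(h, p, q):
    """Fit IIR coefficients b (order q) and a (order p, a[0] = 1) so that the
    impulse response of B(z)/A(z) matches h in the least-squares sense."""
    n_range = range(q + 1, len(h))
    # denominator: for n > q the response satisfies h[n] = -sum_k a[k] h[n-k]
    rows = np.array([[h[n - k] for k in range(1, p + 1)] for n in n_range])
    rhs = np.array([-h[n] for n in n_range])
    a_tail = np.linalg.lstsq(rows, rhs, rcond=None)[0]
    a = np.concatenate(([1.0], a_tail))
    # numerator from the first q + 1 samples: b[n] = sum_k a[k] h[n-k]
    b = np.array([sum(a[k] * h[n - k] for k in range(min(n, p) + 1))
                  for n in range(q + 1)])
    return b, a

# made-up target: impulse response of b = [1, 0.2], a = [1, -0.5]
h = np.zeros(32)
h[0] = 1.0                   # b[0]
h[1] = 0.2 + 0.5 * h[0]      # b[1] - a[1] * h[0]
for n in range(2, 32):
    h[n] = 0.5 * h[n - 1]    # -a[1] * h[n-1]
b, a = prony_fit(h, p=1, q=1)
```

Here `h` would in practice be the FIR response w(n) designed above; the orders p and q and the coefficient values are illustrative assumptions.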
Optionally, the process of determining the target neural network by the first electronic device is as follows:

A plurality of second neural network models may be pre-stored in the first electronic device. A second neural network model is used to indicate the mapping relationship between the second transfer function and the transient response function of a sound effect; that is, one second neural network model can process the second transfer function into the transient response function corresponding to one sound effect. After the first electronic device acquires the target sound effect, it can determine the target neural network model according to the correspondence between sound effects and second neural network models.
The structure of the second neural network model may be the same as that of the first neural network model, which is not limited in the embodiment of the present application.
S1004, the first electronic device sends the target audio to the second electronic device.
The first electronic device sends the processed target audio to the second electronic device, and by wearing the second electronic device the user can hear audio that meets his or her own listening needs.
It should be noted that, if the above-mentioned process of obtaining the target audio to be played is performed by the second electronic device, S1004 is no longer required to be performed.
S1005, the second electronic device plays the target audio.
The target audio played by the second electronic device is obtained according to the listening requirement of the user, so that the listening experience of the user when listening to the audio by wearing the second electronic device can be improved.
In summary, in the audio processing method provided by the embodiment of the present application, the first transfer function is first determined through measurement; it indicates the response, received by the second electronic device, of audio played by the speaker of the second electronic device after propagating through the acoustic system formed when the user wears the second electronic device. The second transfer function, that is, the transient response function of the second electronic device in the acoustic system, cannot be determined through measurement, but it can be determined from the first transfer function. Next, a target filter or target neural network model for processing the original audio is obtained by combining the target sound effect selected by the user with the second transfer function. Then, the original audio is processed by the target filter or the target neural network model to obtain target audio meeting the listening needs of the user. Because the second transfer function can represent the real listening experience of the user wearing the second electronic device, and the target sound effect is the effect the user desires the audio to achieve when played, the target filter or target neural network that processes the second transfer function into the target sound effect can process the original audio into target audio that meets the user's listening needs.
In addition, the audio processing method provided by the embodiment of the present application optimizes not only the amplitude-frequency response of the original audio but also its phase. In this way, the quality of the resulting target audio is better.
As shown in fig. 18, the embodiment of the application further provides a chip system. The system on a chip 1800 includes at least one processor 1801 and at least one interface circuit 1802. The at least one processor 1801 and the at least one interface circuit 1802 may be interconnected by wires. The processor 1801 is configured to support the electronic device to implement the steps of the method embodiments described above, e.g., the method illustrated in fig. 10, and the at least one interface circuit 1802 may be configured to receive signals from other devices (e.g., memory) or to transmit signals to other devices (e.g., a communication interface). The system-on-chip may include a chip, and may also include other discrete devices.
Embodiments of the present application also provide a computer storage medium including instructions that, when executed on an electronic device described above, cause the electronic device to perform the steps of the method embodiments described above, for example, performing the method shown in fig. 10.
Embodiments of the present application also provide a computer program product comprising instructions which, when run on an electronic device as described above, cause the electronic device to perform the steps of the method embodiments described above, for example to perform the method shown in fig. 10.
For the technical effects of the chip system, the computer storage medium, and the computer program product, refer to the technical effects of the foregoing method embodiments.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system, apparatus and module may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described device embodiments are merely illustrative, e.g., the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple modules or components may be combined or integrated into another device, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interface, indirect coupling or communication connection of devices or modules, electrical, mechanical, or other form.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physically separate, i.e., may be located in one device, or may be distributed over multiple devices. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present application may be integrated in one device, or each module may exist alone physically, or two or more modules may be integrated in one device.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using a software program, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer storage medium or transmitted from one computer storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital subscriber line (digital subscriber line, DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer storage media may be any available media that can be accessed by a computer or data storage devices including one or more servers, data centers, etc. that can be integrated with the media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (6)

1. A method of audio processing, the method comprising:
acquiring a first transfer function and a target sound effect; the first transfer function is used for indicating a response that audio played by a loudspeaker of the audio playing device is received by the audio playing device after being transmitted by an acoustic system, and the acoustic system is formed after a user wears the audio playing device; the target sound effect is achieved when a user expects audio playing;
obtaining a second transfer function according to the first transfer function; the second transfer function is used for indicating a response that the audio played by the loudspeaker of the audio playing device propagates to the position of the eardrum of the user through the acoustic system;
obtaining target audio to be played according to the second transfer function, the original audio and the target sound effect;
The acquiring a first transfer function includes:
acquiring audio played by a loudspeaker of the audio playing device and audio received by a feedback microphone in the audio playing device;
and acquiring the first transfer function according to the audio played by the loudspeaker and the audio received by the feedback microphone.
2. The method of claim 1, wherein the first transfer function is determined by the audio playback device.
3. The method according to claim 1 or 2, wherein the obtaining the target audio to be played according to the second transfer function, the original audio and the target sound effect comprises:
obtaining a target filter for processing the original audio or a target neural network according to the second transfer function and the target sound effect;
and processing the original audio through the target filter or the target neural network model to obtain the target audio.
4. The method according to claim 1 or 2, characterized in that the method further comprises:
and sending the target audio to the audio playing device to play the target audio.
5. An electronic device comprising a processor and a memory, the memory storing instructions that, when executed by the processor, perform the method of any of claims 1-4.
6. A computer readable storage medium comprising instructions which, when executed on an electronic device, cause the electronic device to perform the method of any of claims 1-4.
CN202310415787.7A 2023-04-18 2023-04-18 Audio processing method and electronic equipment Active CN116156390B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202311136573.2A CN117177135A (en) 2023-04-18 2023-04-18 Audio processing method and electronic equipment
CN202310415787.7A CN116156390B (en) 2023-04-18 2023-04-18 Audio processing method and electronic equipment

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202311136573.2A Division CN117177135A (en) 2023-04-18 2023-04-18 Audio processing method and electronic equipment

Publications (2)

Publication Number Publication Date
CN116156390A CN116156390A (en) 2023-05-23
CN116156390B true CN116156390B (en) 2023-09-12

Family

ID=86350972

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202311136573.2A Pending CN117177135A (en) 2023-04-18 2023-04-18 Audio processing method and electronic equipment
CN202310415787.7A Active CN116156390B (en) 2023-04-18 2023-04-18 Audio processing method and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102905206A (en) * 2011-07-26 2013-01-30 哈曼贝克自动系统股份有限公司 Noise reducing sound reproduction
CN108430003A (en) * 2018-03-30 2018-08-21 广东欧珀移动通信有限公司 Audio compensation method and device, readable storage medium storing program for executing, terminal
CN108702578A (en) * 2016-02-09 2018-10-23 索诺瓦公司 By executing the method for real ear measurement at the desired location for the eardrum that probe member is placed on to the duct away from individual and being configured as executing the measuring system of this method
CN114257910A (en) * 2021-08-17 2022-03-29 北京安声浩朗科技有限公司 Audio processing method and device, computer readable storage medium and electronic equipment
CN114420158A (en) * 2021-08-17 2022-04-29 北京安声浩朗科技有限公司 Model training method and device, and target frequency response information determining method and device

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
US9392366B1 (en) * 2013-11-25 2016-07-12 Meyer Sound Laboratories, Incorporated Magnitude and phase correction of a hearing device
CN106664499B (en) * 2014-08-13 2019-04-23 华为技术有限公司 Audio signal processor
CN105895112A (en) * 2014-10-17 2016-08-24 杜比实验室特许公司 Audio signal processing oriented to user experience
US10187740B2 (en) * 2016-09-23 2019-01-22 Apple Inc. Producing headphone driver signals in a digital audio signal processing binaural rendering environment
EP3687188B1 (en) * 2019-01-25 2022-04-27 ams AG A noise cancellation enabled audio system and method for adjusting a target transfer function of a noise cancellation enabled audio system
CN116368819A (en) * 2021-07-16 2023-06-30 深圳市韶音科技有限公司 Earphone and earphone sound effect adjusting method
CN115714944A (en) * 2022-11-17 2023-02-24 北京小米移动软件有限公司 Audio processing method and device, earphone and storage medium

Non-Patent Citations (1)

Title
Research on Personalized Head-Related Transfer Functions; Yuan Kang; China Master's Thesis Electronic Journals; full text *

Also Published As

Publication number Publication date
CN116156390A (en) 2023-05-23
CN117177135A (en) 2023-12-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant