WO2022267468A1 - A sound processing method and device thereof (一种声音处理方法及其装置) - Google Patents

A sound processing method and device thereof (一种声音处理方法及其装置)

Info

Publication number: WO2022267468A1
Application number: PCT/CN2022/073338
Authority: WIPO (PCT)
Prior art keywords: audio, user, action, electronic device, music
Other languages: English (en), French (fr)
Inventors: 胡贝贝, 许剑峰
Original Assignee: 北京荣耀终端有限公司
Application filed by 北京荣耀终端有限公司
Priority to: EP22826977.5A (EP4203447A4), US 18/030,446 (US20240031766A1)
Publication of WO2022267468A1


Classifications

    • H04S7/307: Control circuits for electronic adaptation of the sound field; frequency adjustment, e.g. tone control
    • H04S7/302, H04S7/303, H04S7/304: Electronic adaptation of stereophonic sound to listener position or orientation; tracking of listener position or orientation; for headphones
    • H04S1/007: Two-channel systems in which the audio signals are in digital form
    • H04S2420/01: Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
    • H04M1/72442: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality, for playing music files
    • H04M1/72448, H04M1/72454: User interfaces with means for adapting the functionality of the device according to specific conditions, e.g. context-related or environment-related conditions
    • H04M1/6066: Portable telephones adapted for handsfree use involving the use of a headset accessory device connected to the portable telephone, including a wireless connection
    • G06F3/165: Management of the audio stream, e.g. setting of volume, audio stream path
    • G06F3/167: Audio in a user interface, e.g. using voice commands for navigating, audio feedback

Definitions

  • the present application relates to the field of terminals, in particular to a sound processing method and device thereof.
  • At present, when a user uses a smart terminal to play audio, the terminal device generally simply performs audio playback. The user cannot process or interact with the audio being played, and therefore cannot obtain an audio-based interactive experience.
  • the application provides a sound processing method.
  • With this method, the electronic device can recognize, through motion detection, actions of one or more electronic devices while the user is playing audio, and determine the music material that matches the action according to a preset association relationship, so as to add an entertaining interactive effect to the audio being played. This increases the fun of the audio playback process and meets the user's need to interact with the audio being played.
  • In a first aspect, the present application provides a sound processing method applied to a first electronic device. The method includes: playing a first audio; detecting a first action of the user; acquiring a second audio in response to the first action, where the second audio has a correspondence with the first action and the correspondence is preset by the user; processing the first audio according to the second audio to obtain a third audio, where the third audio is different from the first audio and is associated with the first audio; and playing the third audio.
  • By implementing the method of the first aspect, the first electronic device can recognize the action of a detected electronic device while the user is playing music, determine the audio that matches that action, add it to the music being played, and play the two together.
  • the second audio is preset audio for adding background sound effects to the first audio.
  • In this way, the first electronic device can add audio with an entertaining interactive effect to the music being played, satisfying the user's need to interact with it. A minimal sketch of this flow follows below.
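  • The following is a minimal sketch, in Python, of the flow described above: play the first audio, detect the first action, look up the preset second audio and superimpose it to obtain and play the third audio. The callbacks detect_action, lookup_second_audio and play are hypothetical placeholders, not part of the patent.

```python
import numpy as np

def sound_processing_flow(first_audio, detect_action, lookup_second_audio, play):
    """Minimal sketch: play the first audio, detect a preset action, look up the
    user-preset second audio, superimpose, and play the resulting third audio."""
    play(first_audio)                                  # play the first audio
    action = detect_action()                           # detect the user's first action
    if action is None:
        return first_audio                             # no preset action: nothing to add
    second_audio = lookup_second_audio(action)         # second audio preset for this action
    n = min(len(first_audio), len(second_audio))
    third_audio = first_audio.astype(np.float32)
    third_audio[:n] += second_audio[:n]                # third audio = first audio + second audio
    play(third_audio)                                  # play the third audio
    return third_audio
```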
  • In one implementation, the method further includes: processing the second audio so that it becomes a second audio with a variable stereo playback effect, where a variable stereo playback effect means that the stereo playback effect changes as the relative position between the user and the first electronic device changes. Processing the first audio according to the second audio to obtain the third audio then specifically includes: superimposing the second audio having the variable stereo playback effect onto the first audio to obtain the third audio.
  • the first electronic device can perform spatial rendering processing on the added audio with entertaining interactive effects, so that the general interactive audio has a variable spatial three-dimensional surround effect.
  • In one implementation, processing the second audio so that it has a variable stereo playback effect specifically includes: acquiring the position of the first electronic device relative to the user; determining a first parameter according to that position, where the first parameter is obtained from a head-related transfer function database and adjusts the playback effect of the left and right channels of the second audio; and multiplying the second audio by the first parameter frequency bin by frequency bin to obtain the second audio with the variable stereo playback effect.
  • In this way, the first electronic device can determine the parameters for spatially rendering the second audio from its position relative to the user, and thereby derive the audio data of the left and right channels of the second audio. The channels heard by the user's left and right ears are different, which forms a stereo playback effect. As the relative position of the first electronic device and the user changes, the spatial rendering parameters also change continuously, so the audio the user hears is three-dimensional and follows that relative position in real time, enhancing the user's immersive experience. A block-wise sketch of this variable rendering follows below.
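  • The sketch below illustrates one way the variable stereo playback effect could be produced: the second audio is rendered block by block and the device-to-user azimuth is re-read for every block, so the left/right filtering follows the changing relative position. The hrtf_db layout and the get_azimuth callback are assumptions made for illustration.

```python
import numpy as np

def render_variable_stereo(second_audio, get_azimuth, hrtf_db, block_len=1024):
    """Render the second audio block by block, re-reading the device-to-user azimuth
    for each block so the stereo image follows the changing relative position.
    Assumptions: hrtf_db maps an azimuth (degrees) to a pair of complex frequency
    responses of length block_len // 2 + 1, and get_azimuth() returns a key of hrtf_db."""
    n_blocks = -(-len(second_audio) // block_len)            # ceiling division
    padded = np.zeros(n_blocks * block_len, dtype=np.float64)
    padded[: len(second_audio)] = second_audio
    out = np.zeros((2, len(padded)))
    for start in range(0, len(padded), block_len):
        block = padded[start : start + block_len]
        h_left, h_right = hrtf_db[get_azimuth()]             # "first parameter" for this block
        spectrum = np.fft.rfft(block)                        # multiply per frequency bin
        out[0, start : start + block_len] = np.fft.irfft(spectrum * h_left, n=block_len)
        out[1, start : start + block_len] = np.fft.irfft(spectrum * h_right, n=block_len)
    return out[:, : len(second_audio)]                       # shape (2, samples): left/right channels
```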
  • In one implementation, processing the first audio to obtain the third audio specifically includes: superimposing the second audio, whose length is a first duration, onto a first interval of the first audio to obtain the third audio, where the duration of the first interval is equal to the first duration.
  • Playing the third audio specifically includes: playing audio in the first interval of the third audio.
  • In this way, after detecting a preset device action, the first electronic device can play the second audio while playing the first audio, so that the user immediately hears the audio being played with the interactive audio added, which increases the entertainment. A sketch of this interval superposition follows below.
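  • A minimal sketch of the interval superposition described above, assuming both audios are one-dimensional sample arrays at the same sampling rate; the first interval spans exactly the first duration.

```python
import numpy as np

def superimpose_on_interval(first_audio, second_audio, start_sample):
    """Overlay the second audio onto the first audio beginning at start_sample.
    The first interval spans exactly len(second_audio) samples (the first duration),
    so the third audio keeps the original content everywhere else."""
    third_audio = np.asarray(first_audio, dtype=np.float32).copy()
    end_sample = start_sample + len(second_audio)
    if end_sample > len(third_audio):
        raise ValueError("the first interval must fit inside the first audio")
    third_audio[start_sample:end_sample] += second_audio
    return third_audio
```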
  • In one implementation, the first action includes multiple second actions, the multiple second actions being a combination of actions performed by multiple second electronic devices at the same time, and the second audio includes a plurality of fourth audios that respectively correspond to the plurality of second actions.
  • the first electronic device may detect an action obtained by combining actions performed by multiple electronic devices. In this way, the diversity of the detected actions can be increased, and more options can be provided for the user.
  • the action formed by the combination of actions of multiple second electronic devices can also more accurately describe the user's body action.
  • In one implementation, before playing the first audio the method further includes: displaying a first user interface on which one or more icons and controls are displayed, the icons including a first icon and the controls including a first control; detecting a first operation performed by the user on the first control; and, in response to the first operation, confirming that the second audio is associated with the first action.
  • the user can pre-set the matching relationship between device actions and audio with entertaining interactive effects in the first electronic device.
  • In one implementation, obtaining the second audio specifically includes: querying a storage table to determine the second audio, where the storage table records one or more audios and the action corresponding to each audio, the one or more audios include the second audio, and the second audio corresponds to the first action; and obtaining the second audio from a local database or from a server.
  • In this way, the local storage of the first electronic device can hold the music materials preset in the storage table, so that when a music material is needed the first electronic device can obtain it directly from the local storage space. The first electronic device can also obtain the music materials preset in the storage table directly from a server over the Internet, which saves the storage space of the first electronic device. A lookup-and-cache sketch follows below.
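  • The sketch below illustrates the storage-table query followed by the local-or-server acquisition. The table contents, directory layout and URL scheme are illustrative assumptions, not the patent's actual data.

```python
import os
import urllib.request

# Hypothetical storage table, in the spirit of Table 1: detected action -> material name.
STORAGE_TABLE = {"move_up": "flute", "move_left": "bass_drum"}

def get_second_audio_path(action, local_dir="materials", server_url=None):
    """Query the storage table for the material matched with the detected action,
    then fetch it from local storage, falling back to the server when it is not
    cached locally. File naming and URL layout are assumptions for illustration."""
    material = STORAGE_TABLE.get(action)
    if material is None:
        return None                                   # "no effect": add nothing
    os.makedirs(local_dir, exist_ok=True)
    path = os.path.join(local_dir, material + ".wav")
    if not os.path.exists(path) and server_url is not None:
        urllib.request.urlretrieve(f"{server_url}/{material}.wav", path)   # download once, then cache
    return path
```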
  • the second audio includes: any one of musical instrument sounds, animal sounds, environmental sounds, or recordings.
  • the first electronic device can add such different sounds to the music being played, such as musical instrument sounds, animal sounds, environmental sounds or recordings.
  • In one implementation, the musical instrument sounds include any one of a snare drum, a bass drum, maracas, a piano, an accordion, a trumpet, a tuba, a flute, a cello or a violin; the animal sounds include any one of birdsong, frog, insect, cat, dog, sheep, cow, pig, horse or chicken sounds; and the environmental sounds include any one of wind, rain, thunder, running water, sea wave or waterfall sounds.
  • the second electronic device includes an earphone connected to the first electronic device, and the first motion includes a user's head motion detected by the earphone.
  • the first electronic device may determine the head movement of the user by detecting the device movement of the earphone.
  • For example, when the user shakes his head, the first electronic device can determine from the movement of the earphone that the user has made that head movement. The head movement includes any one of head displacement or head rotation; the head displacement includes any one of moving left, right, up or down, and the head rotation includes any one of turning left, turning right, looking up or looking down.
  • the second electronic device includes a watch connected to the first electronic device, and the first motion includes a user's hand motion detected by the watch.
  • the first electronic device can determine the user's hand movement by detecting the device movement of the watch.
  • For example, when the user shakes his hand, the first electronic device can determine from the movement of the watch that the user has made that hand movement. The hand motion includes any one of hand displacement or hand rotation; the hand displacement includes any one of moving left, right, up or down, and the hand rotation includes any one of turning left, turning right, raising the hand or lowering the hand.
  • In one implementation, the second electronic device includes an earphone and a watch connected to the first electronic device, and the first action includes a combination of the user's head movement and hand movement detected by the earphone and the watch.
  • the first electronic device can detect the combination of the head movement and the hand movement of the user through the earphone and the watch, thereby increasing the diversity of movement types and providing the user with more choices.
  • the motion formed by the combination of the user's head motion and hand motion can also more accurately describe the user's body motion.
  • In a second aspect, the present application provides an electronic device, which includes one or more processors and one or more memories, where the one or more memories are coupled with the one or more processors and are used to store computer program code. The computer program code includes computer instructions which, when executed by the one or more processors, cause the electronic device to execute the method described in the first aspect and any possible implementation manner of the first aspect.
  • In a third aspect, the present application provides a computer-readable storage medium including instructions. When the instructions are run on an electronic device, the electronic device executes the method described in the first aspect and any possible implementation manner of the first aspect.
  • In a fourth aspect, the present application provides a computer program product containing instructions. When the computer program product is run on an electronic device, the electronic device executes the method described in the first aspect and any possible implementation manner of the first aspect.
  • The electronic device provided in the second aspect, the computer storage medium provided in the third aspect and the computer program product provided in the fourth aspect are all used to execute the method provided in this application; for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method, which are not repeated here.
  • FIG. 1 is a scene diagram of a sound processing method provided by an embodiment of the present application;
  • FIG. 2 is a software structure diagram of a sound processing method provided by an embodiment of the present application;
  • FIG. 3 is a flow chart of a sound processing method provided by an embodiment of the present application;
  • FIG. 4A is a schematic diagram of a master device identifying device actions provided by an embodiment of the present application;
  • FIG. 4B is a schematic diagram of another master device identifying device actions provided by an embodiment of the present application;
  • FIG. 4C is a schematic diagram of the master device identifying the azimuth provided by an embodiment of the present application;
  • FIG. 5A is a flow chart of 3D spatial rendering of audio by the master device provided by an embodiment of the present application;
  • FIG. 5B is a schematic diagram of 3D spatial rendering of a set of frequency-domain audio provided by an embodiment of the present application;
  • FIG. 5C is a schematic diagram of 3D spatial rendering of a set of time-domain audio provided by an embodiment of the present application;
  • FIGS. 6A-6J are a set of user interfaces provided by an embodiment of the present application;
  • FIG. 7 is a hardware structure diagram of an electronic device provided by an embodiment of the present application.
  • At present, a wireless headset can determine the distance between the user's left and right ears and the mobile phone by tracking the user's head movement, and adjust the volume of the audio output to the left and right ears accordingly, giving the user an immersive surround-sound experience. However, such processing is limited to adjusting the intensity of the original audio output at the left and right ears to obtain a stereo surround effect; it does not let the user interact with the audio while it is being played.
  • an embodiment of the present application provides a sound processing method.
  • the method can be applied to electronic devices such as mobile phones.
  • Specifically, an electronic device such as a mobile phone can establish an association between device actions and music materials. After recognizing a preset device action, the electronic device can confirm the music material associated with that device action, fuse the music material, after three-dimensional space rendering, with the audio the user is playing, and then output the result.
  • the aforementioned device actions refer to changes in the position and shape of the electronic device caused by user movements, including displacement actions and/or rotation actions.
  • the displacement action refers to an action generated due to the change of the current position of the electronic device relative to the position at the previous moment, including moving left, moving right, moving up, and moving down.
  • the electronic device can determine whether the electronic device performs any of the above-mentioned displacement actions through the data collected by the acceleration sensor.
  • the turning action refers to an action caused by the change of the direction of the electronic device at the current moment relative to the direction at the previous moment, including turning left, turning right, turning up, and turning down.
  • the electronic device can determine whether the electronic device performs any of the above-mentioned rotation actions through the data collected by the gyroscope sensor. It can be understood that if more detailed classification criteria are adopted, the above-mentioned displacement actions and rotation actions may also include more types.
  • the above device actions also include combination actions.
  • The aforementioned combined action refers to a combination of actions performed by multiple electronic devices at the same time. For example, at the same moment the first detected electronic device moves left and the second detected electronic device turns left; the combination of this left move and left turn is a combined action.
  • the aforementioned music material refers to preset audio data with specific content, including instrument sounds, animal sounds, environmental sounds, user-defined recording files, and the like.
  • musical instrument sounds include a snare drum, a bass drum, a maraca, a piano, an accordion, a trumpet, a tuba, a flute, a cello, and a violin.
  • animal sounds mentioned above are, for example, birdsong, frogs, insects, cats, dogs, sheep, cows, pigs, horses, chickens, etc.
  • the environmental sounds mentioned above are, for example, wind, rain, thunder, running water, ocean waves, waterfalls, and the like.
  • Three-dimensional space rendering refers to processing audio data with a head-related transfer function (HRTF), so that the processed audio data produces a stereo surround effect at the user's left and right ears.
  • the head-related transformation function will be referred to as a head function for short.
  • a module that processes audio data using head functions is called a head function filter.
  • With the sound processing method provided by the embodiment of the present application, the user can drive the electronic device to move through his own movement (such as shaking his head or shaking his hands) while playing audio, thereby adding an entertaining interactive effect to the audio being played, increasing the fun of the audio playback process, and meeting the user's need to interact with the audio being played.
  • FIG. 1 exemplarily shows a system 10 for implementing the above sound processing method.
  • the scenarios involved in implementing the above methods will be introduced below in combination with the system 10 .
  • the system 10 may include a master device 100 and a slave device 200 .
  • the master device 100 can be used to acquire and process audio files.
  • the master device 100 can be connected to the slave device 200 , and play an audio signal on the slave device 200 by using the playback capability of the sound unit provided by the slave device 200 . That is, the audio file parsing task is performed on the master device 100 , and the audio signal playback task is performed on the slave device 200 .
  • the above scenario in which the system 10 includes the master device 100 and the slave device 200 may be referred to as a first scenario.
  • the master device 100 shown in FIG. 1 is an electronic device of mobile phone type as an example, and the slave device 200 is an electronic device of earphone type as an example.
  • Not limited to a mobile phone, the main device 100 may also be a tablet computer, a personal computer (PC), a personal digital assistant (PDA), a smart wearable electronic device, an augmented reality (AR) device, a virtual reality (VR) device, etc.
  • the aforementioned electronic equipment may also be other portable electronic equipment, such as a laptop computer (Laptop).
  • the above-mentioned electronic device may not be a portable electronic device, but a desktop computer or the like.
  • Exemplary embodiments of the electronic device include, but are not limited to, portable electronic devices running Linux or other operating systems.
  • the connection between the master device 100 and the slave device 200 may be a wired connection or a wireless connection.
  • The wireless connection includes, but is not limited to, a wireless fidelity (Wi-Fi) connection, a Bluetooth connection, an NFC connection and a ZigBee connection. If the connection between the master device 100 and the slave device 200 is wired, the slave device 200 can be a wired earphone; if the connection is wireless, the slave device 200 can be a wireless earphone, including head-mounted wireless earphones, neck-mounted wireless earphones and true wireless earphones (TWS). This embodiment of the present application does not limit this.
  • In the embodiment of the present application, the objects detected by the master device 100 include the master device 100 and/or the slave device 200. That is, in the first scenario, the detection object of the master device 100 may include only itself, only the slave device 200, or both the master device 100 and the slave device 200. The specific object detected by the master device 100 may be set by the user.
  • Records in the master device 100 include the association between device actions and music materials.
  • When the master device 100 and the slave device 200 are playing audio, the master device 100 can detect device actions of the detected electronic devices in real time.
  • the main device 100 may determine the music material matching the action according to the above association relationship. Referring to Table 1, Table 1 exemplarily shows the association relationship between the above-mentioned device actions and music materials.
  • For example, when the main device 100 detects that it has moved upward, it may determine that the music material associated with this upward movement is the flute. The main device 100 can then add the music material (flute) corresponding to the device action (move up) to the audio being played, so that the audio file being played is accompanied by the flute effect. This increases the fun of the playback process and meets the user's need to interact with the audio being played.
  • "No effect" indicates that no music material is matched; in this case the master device 100 does not add any interactive music material to the audio being played.
  • As the number of detected electronic devices increases, the device actions and music materials recorded in Table 1 increase correspondingly; they are not listed one by one in this embodiment of the present application.
  • The device actions and music materials recorded in Table 1 do not necessarily all belong to currently detected electronic devices. For example, the association relationships recorded in Table 1 may cover both the master device 100 and the slave device 200, while the actually detected object includes only the master device 100 (or only the slave device 200).
  • Table 1 may also record device actions formed by combining individual actions of multiple detected electronic devices.
  • the master device 100 moves left + the slave device 200 moves right and so on.
  • the embodiment of the present application does not limit the type of actions.
  • the device actions listed in Table 1 are also optional.
  • When the device actions detected by the main device 100 include only displacement actions, Table 1 may record only the associations between displacement actions and music materials; when they include only rotation actions, Table 1 may record only the associations between rotation actions and music materials.
  • the association relationship between device actions and music materials shown in Table 1 is preset.
  • the user can set the music material matching the device action through the user interface provided by the master device 100 .
  • Subsequent embodiments will introduce the above user interface in detail, which is not expanded here; a minimal configuration sketch follows below.
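  • A possible shape of the preset association, sketched as a small table keyed by (device, action); the device, action and material names are illustrative only, and "no_effect" stands for the default of adding nothing.

```python
DEFAULT_MATERIAL = "no_effect"
association_table = {}    # (detected device, device action) -> music material

def set_association(device, action, material=DEFAULT_MATERIAL):
    """Record (or overwrite) the music material matched with a device action."""
    association_table[(device, action)] = material

def lookup_material(device, action):
    """Return the matched material, or the default when nothing was preset."""
    return association_table.get((device, action), DEFAULT_MATERIAL)

# Presets in the spirit of the examples above:
set_association("master_device_100", "move_up", "flute")
set_association("master_device_100", "move_left", "bass_drum")
```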
  • the system 10 may further include a slave device 300 (second scenario).
  • the slave device 300 includes: a smart wearable device (such as a smart watch, a smart bracelet, etc.), a game handheld device (such as a game controller, etc.).
  • the master device 100 may record music material matching the device motion of the slave device 300 . After detecting a certain device action of the slave device 300, the master device 100 may determine the music material that matches the action, and then attach the music material to the audio being played by the master device 100.
  • In this way, the action of the user waving his hand to the music can be captured by the slave device 300; further, the master device 100 can add more interactive music material to the audio file being played according to the device actions of the slave device 300.
  • The combined actions described above may also include actions of the slave device 300, for example the slave device 200 turning up + the slave device 300 moving down, and so on.
  • smart wearable devices such as smart watches and smart bracelets can be used as the slave device 300 .
  • smart wearable devices such as smart watches and smart bracelets can also serve as the main device 100 .
  • the above scenarios are, for example: playing music on a smart watch, playing music on a smart watch connected to a wireless headset, and so on.
  • the embodiment of the present application does not limit this.
  • FIG. 2 exemplarily shows a software structure 20 for implementing the sound processing method provided by the embodiment of the present application.
  • the software structure for implementing the above method will be specifically introduced below in conjunction with FIG. 2 .
  • the software structure 20 includes two parts: an audio playing module 201 and an interactive sound effect processing module 202 .
  • the audio playing module 201 includes: an original audio 211 , a basic sound effect 212 , an output audio 213 and a superposition module 214 .
  • the interactive sound effect processing module 202 may include: a music material library 221 , a personalized setting module 222 , a motion detection module 223 , a header function database 224 and a 3D space rendering module 225 .
  • the original audio 211 may be used to indicate the audio being played by the master device 100 .
  • the main device 100 plays a certain song (song A).
  • the audio data of song A may be referred to as audio being played by the main device 100 .
  • the basic sound effect 212 can be used to add some basic playback effects to the original audio 211 .
  • the basic sound effect 212 can modify the original audio 211 so that the user finally hears a higher quality audio.
  • The above-mentioned additional basic playback effects are, for example: equalization (adjusting the timbre of the music), dynamic range control (adjusting the loudness of the music), limiting (preventing clipping) and low-frequency enhancement (enhancing the low-frequency effect).
  • Output audio 213 may be used to indicate the audio that is actually played from device 200 .
  • the content and effects included in the output audio 213 are what the user can directly hear or feel. For example, when the output audio 213 is rendered in 3D space, the sound heard by the user may have the effect of spatial stereo surround.
  • the audio playing module 201 further includes a superposition module 214 .
  • the overlay module 214 can be used to add entertaining interactive effects to the original audio 211 .
  • The superposition module 214 can receive the music material sent by the interactive sound effect processing module 202 and fuse it with the original audio 211, so that the played fused audio includes both the content of the original audio 211 and the content of the music material; that is, the original audio 211 is accompanied by an entertaining interactive effect.
  • Before the superposition module 214 receives the music material sent by the interactive sound effect processing module 202, the interactive sound effect processing module 202 needs to determine the specific content of the interactive effect, that is, which music materials to add to the original audio 211. At the same time, the interactive sound effect processing module 202 also needs to render the selected music material in 3D space so that it has a spatial surround effect, thereby improving the user experience.
  • a variety of music materials are stored in the music material library 221, including musical instrument sounds, animal sounds, environmental sounds and user-defined recording files introduced in the foregoing embodiments.
  • the music material attached to the original audio 211 comes from the music material library 221 .
  • All the music materials included in the music material library 221 can be stored on the main device 100, or can be stored in the server.
  • the main device 100 can directly obtain the music material from the local storage when using the above music material.
  • the master device 100 can download the required music material from the server to the local storage, and then read the music material from the local storage.
  • the above-mentioned server refers to a device that stores a large amount of music materials and provides services for terminal devices to obtain the above-mentioned music materials
  • the above required music material refers to the music material associated with the device action of the detected electronic device.
  • For example, when the detected objects include only the master device 100, the music materials to be stored in the memory of the master device 100 include the bass drum, cat meowing, ocean waves, flute, dog barking and cello; music materials other than these do not need to be downloaded from the cloud to the local storage in advance, which saves the storage space of the main device 100.
  • the personalized setting module 222 can be used to set the relationship between device actions and music materials.
  • the user can match any device action with any music material through the personalized setting module 222 , for example, the user can match the movement of the main device 100 to the left with the bass drum through the personalized setting module 222 .
  • After the preset is completed through the personalized setting module 222, the main device 100 can obtain a storage table recording the above association relationships (see Table 1). Based on this storage table, the main device 100 can determine at any time the music material corresponding to any device action.
  • the motion detection module 223 can be used to detect whether electronic devices such as the master device 100 , the slave device 200 , and the slave device 300 perform actions recorded in the above storage table.
  • an acceleration sensor and a gyroscope sensor may be installed in the above-mentioned electronic device.
  • the acceleration sensor can be used to detect whether the above-mentioned electronic device has a displacement action
  • the gyro sensor can be used to detect whether the above-mentioned electronic device has a rotation action.
  • When the master device 100 (or the slave device 200) is displaced, the data of the three axes of the acceleration sensor will change.
  • the above three axes refer to the X axis, the Y axis and the Z axis in the spatial rectangular coordinate system.
  • Through these changes, the master device 100 can determine whether the master device 100 (or the slave device 200) has been displaced. Similarly, through changes in the gyroscope data, the master device 100 can determine whether the master device 100 (or the slave device 200) has rotated. For details of the acceleration sensor and the gyroscope sensor, please refer to the subsequent introduction; they are not expanded here.
  • the motion detection module 223 can also detect the change of the azimuth angle of the main device 100 .
  • the above azimuth angle refers to the azimuth angle of the main device 100 relative to the user's head.
  • the motion detection module 223 can set the position of the main device 100 when starting to play the audio as a default value, for example, the azimuth is 0° (that is, the default main device 100 is directly in front of the user). Then, the master device 100 may calculate a new azimuth angle according to the change between the moved position and the position at the previous moment.
  • For the specific calculation method, refer to the introduction in the subsequent embodiments; it is not expanded here.
  • the main device 100 can query the storage table in the personalized setting module 222 to determine the music material matching the above-mentioned device action. After the music material is determined, the master device 100 may acquire the audio data of the music material from the music material library 221 . At the same time, according to the new azimuth angle calculated by the motion detection module 223 , the main device 100 may determine the filter coefficient corresponding to the azimuth angle by querying the head function database 224 .
  • the above-mentioned filter coefficients refer to the parameters for the main device 100 to determine the output audio of the left ear and the right ear by using the head function filter.
  • For example, when the main device 100 detects that it has moved to the left, it can determine that the music material matching the leftward movement is the bass drum. At the same time, the azimuth angle of the main device 100 relative to the user changes from the previous azimuth angle (here assumed to be the initial default value of 0°) to 280° (that is, 80° to the left of straight ahead).
  • The 3D space rendering module 225 can use the head function filter with the specific filter coefficients to spatially render the selected music material so that it has a three-dimensional surround effect; in this way, the music material added to the original audio 211 also has a three-dimensional surround effect. A sketch of looking up the coefficients by azimuth follows below.
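  • A small sketch of how the filter coefficients might be looked up from the head function database by azimuth, assuming the database is a dictionary sampled on a fixed grid of azimuth angles; nearest-neighbour selection is an assumption for illustration, not something specified by the patent.

```python
def query_head_function_database(hrtf_db, azimuth_deg):
    """Pick the filter coefficients for the current azimuth from the head function
    database. hrtf_db is assumed to be a dict {azimuth_in_degrees: (h_left, h_right)}
    sampled on a fixed grid; the nearest recorded azimuth is used when the measured
    angle falls between grid points."""
    azimuth_deg = azimuth_deg % 360.0

    def circular_distance(a):
        d = abs(a - azimuth_deg) % 360.0
        return min(d, 360.0 - d)

    nearest = min(hrtf_db, key=circular_distance)
    return hrtf_db[nearest]

# e.g. with entries every 5 degrees, an azimuth of 280 degrees (80 degrees to the
# left of straight ahead, as in the example above) selects the 280-degree pair.
```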
  • the detection object of the motion detection module 223 in the software structure 20 can be changed accordingly.
  • For example, when the system 10 does not include the slave device 300, the detection object of the motion detection module 223 does not include the slave device 300. If the system 10 includes the master device 100 and the slave device 200 but the detected object only includes the slave device 200, then the detection object of the motion detection module 223 only includes the slave device 200.
  • the master device 100 records an association relationship between an action and a music material.
  • the main device 100 needs to determine the relationship between the device action and the music material, that is, determine what kind of device action corresponds to what kind of music material. Based on the above association relationship, after detecting a device action, the main device 100 can determine the music material corresponding to the action.
  • the main device 100 may display the first user interface.
  • the interface displays the detected electronic device, the type of action (device action) of the detected electronic device, and preset buttons for the user to select music material.
  • the main device 100 may display music materials recorded in the preset music material library 221 in response to user operations acting on the above buttons.
  • the detected electronic devices include: the master device 100 , and/or the slave device 200 , and/or the slave device 300 .
  • the user can also delete the detected electronic devices supported by the main device 100 .
  • For example, the master device 100 may display the slave device 300 on the above first user interface. Upon detecting a user operation confirming that the slave device 300 does not need to be detected, the master device 100 can delete the slave device 300 and subsequently no longer display it.
  • the above detected motion types of the electronic device are preset device motions, including displacement motions and rotation motions.
  • the displacement action may include moving left, moving right, moving up, and moving down.
  • the turning action may include turning left, turning right, turning up, and turning down. It can be understood that, without being limited to the above-mentioned displacement motion and rotation motion, the above-mentioned preset device motion may also be other motions, which are not limited in this embodiment of the present application.
  • the aforementioned multiple music materials that can be selected by the user refer to preset audio with specific content, including musical instrument sounds, animal sounds, environmental sounds, and user-defined recording files, etc., which will not be repeated here.
  • After the first user interface is displayed, the user can set which action of which electronic device matches which music material.
  • the master device 100 may record the association between actions and music materials.
  • the main device 100 may include a music material library 221 and a personalized setting module 222 .
  • the music material library 221 stores a plurality of audio data of different types that can be selected, that is, music materials.
  • the personalized setting module 222 can record preset device actions. First, the personalized setting module 222 can match the above-mentioned device actions to a default music material.
  • The above default music material may be "no effect" or a randomly chosen music material.
  • the personalized setting module 222 may modify the originally recorded music material matching a certain device action to a new user-specified music material.
  • the music material originally recorded by the personalized setting module 222 that matches the movement of the main device 100 to the left is the sound of rain.
  • the music material recorded by the personalized setting module 222 that matches the movement of the main device 100 to the left can be changed to a bass drum.
  • By querying the records in the personalized setting module 222, the main device 100 can confirm the music material matching an action.
  • S102 The main device 100 downloads the music material associated with the device action.
  • the main device 100 may first determine whether the above music material has been stored in the local storage.
  • the aforementioned local storage refers to the storage of the master device 100 .
  • If the above music material has already been stored in the local storage, the main device 100 can obtain it directly from that storage. If it has not been stored locally, the main device 100 needs to obtain it from the server providing the music material and store it in the local storage, so that it can be called at any time.
  • the music material library 221 can include a large number of music materials, and the main device 100 can obtain some music materials according to actual needs, thereby reducing the demand on the storage capacity of the main device 100 . Further, the main device 100 may also download the required music material each time when implementing the sound processing method provided by the embodiment of the present application, and delete the downloaded music material when it is not needed.
  • S102 is optional. If the music materials recorded in the music material library 221 only include the music materials stored in the master device 100, then the master device 100 does not need any music materials from the server. Conversely, if the music materials recorded in the music material library 221 are provided by the server, the local storage of the master device 100 may only include some of the music materials recorded in the music material library 221 . At this time, the main device 100 needs to determine whether the music material specified by the user and matched with the device action can be obtained from the local storage. If not, then the main device 100 needs to download the music materials that have not been downloaded to the local storage to the local storage in advance.
  • For example, if the master device 100 determines that the audio data of the bass drum has not been stored in the local memory, the master device 100 needs to download the audio data of the bass drum from the server that provides it. In this way, when the main device 100 later detects that it makes a leftward movement, it can obtain the audio data of the bass drum directly from the local storage.
  • the main device 100 may detect the user's operation of playing audio, and in response to the above playing operation, the main device 100 may start to play the original audio.
  • The above-mentioned operation of playing audio may be an operation acting on third-party audio software, or an operation acting on the audio software included in the system of the master device 100. The capability of adding entertaining interactive sound effects to the audio being played may be provided by a system application that the system audio software or third-party audio software of the main device 100 can use; the interactive sound effect can also be a functional plug-in provided by the third-party audio software.
  • the main device 100 can add entertaining interactive sound effects to the audio being played.
  • The main device 100 can divide the audio data being played into segments of a preset length, so that the audio data being played consists of several data segments. The data segment currently being played may be referred to as the first data segment, and the data segment to be played after it may be referred to as the second data segment.
  • When the master device 100 plays the first data segment, it can detect a certain device action. After determining the music material corresponding to that device action and processing the material, the master device 100 can fuse the processed audio data of the music material (the additional audio data) with the second data segment, so that the second data segment contains not only the content of the original audio but also the content of the attached music material. It can be understood that the data length of the additional audio data is consistent with the data length of the second data segment. A segment-wise sketch follows below.
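  • The generator below sketches the segment-wise behaviour described above: an action detected while one data segment plays is fused into the next data segment, with the additional audio data padded or trimmed to the segment length. detect_action and get_material are hypothetical callbacks standing in for the motion detection and material lookup.

```python
import numpy as np

def play_with_interactive_effect(original_audio, segment_len, detect_action, get_material):
    """Split the audio being played into fixed-length data segments; an action detected
    while one segment (the first data segment) plays is fused into the following
    segment (the second data segment)."""
    pending = None                                            # additional audio data waiting to be fused
    for start in range(0, len(original_audio), segment_len):
        segment = np.array(original_audio[start:start + segment_len], dtype=np.float32)
        if pending is not None:
            n = min(len(segment), len(pending))
            segment[:n] += pending[:n]                        # fuse the material into this segment
            pending = None
        yield segment                                         # hand the segment to the player
        action = detect_action()                              # detected while this segment played
        if action is not None:
            material = get_material(action)
            if material is not None:
                pending = np.zeros(segment_len, dtype=np.float32)
                m = min(segment_len, len(material))
                pending[:m] = material[:m]                    # length matches the next data segment
```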
  • the master device 100 acquires motion data, and determines a device action, an audio material associated with the action, and an azimuth angle according to the above motion data.
  • the master device 100 may start to acquire motion data of the detected electronic device.
  • the above motion data includes data collected by the acceleration sensor (acceleration data) and data collected by the gyroscope sensor (gyroscope data).
  • the motion data may indicate whether the detected electronic device has performed an action that matches a preset action.
  • the master device 100 can receive its own acceleration data and gyroscope data.
  • the master device 100 may also receive acceleration data and gyroscope data from the slave device 200 .
  • the acceleration data and gyroscope data of the slave device 200 can be sent to the master device 100 through a wired or wireless connection between the master device 100 and the slave device 200 . It can be understood that when the detected electronic devices increase or decrease, the movement data that the master device 100 needs to acquire increases or decreases accordingly.
  • the master device 100 may calculate the device action indicated by the motion data.
  • FIG. 4A shows a schematic diagram of the main device 100 determining device actions according to acceleration data.
  • the acceleration sensor may establish a space rectangular coordinate system with the center point of the main device 100 as the origin.
  • the positive direction of the X-axis of the coordinate system is horizontal to the right; the positive direction of the Y-axis of the coordinate system is vertically upward; and the positive direction of the Z-axis of the coordinate system faces the user forward. Therefore, the above acceleration data specifically includes: X-axis acceleration, Y-axis acceleration and Z-axis acceleration.
  • When the value of the X-axis acceleration is close to the gravitational acceleration g (about 9.81 m/s²), the left side of the main device 100 is facing downward; conversely, a value close to -g indicates that the right side of the main device 100 is facing downward.
  • Similarly, a Y-axis acceleration close to g indicates that the lower side of the main device 100 is facing downward, while a value close to -g indicates that the upper side is facing downward (the device is inverted). A Z-axis acceleration close to g indicates that the screen of the main device 100 is facing upward (the positive Z direction then coincides with the positive Y direction in the figure), while a value close to -g indicates that the screen is facing downward (the positive Z direction then coincides with the negative Y direction in the figure).
  • Combining the acceleration changes on the three axes, the master device 100 may further determine device actions. Specifically, taking the orientation shown in FIG. 4A as an example (Y axis up, X axis to the right): if the X-axis acceleration is positive, the master device 100 can confirm that it has moved to the right; if the X-axis acceleration is negative, it has moved to the left; if the Y-axis acceleration equals A+g, it is moving upward with an acceleration of A m/s²; and if the Y-axis acceleration equals -A+g, it is moving downward with an acceleration of A m/s².
  • When the acceleration data meets a preset condition, the host device 100 may determine that it has made the corresponding device action (a displacement action), and may further determine the music material matching that displacement action. A classification sketch follows below.
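  • A minimal sketch of classifying a displacement action from accelerometer readings under the orientation of FIG. 4A (Y axis up, X axis to the right); the dead-zone threshold is an illustrative assumption, not a value from the patent.

```python
G = 9.81  # gravitational acceleration, m/s^2

def classify_displacement(ax, ay, threshold=1.0):
    """Classify a displacement action from accelerometer readings, assuming the
    orientation of FIG. 4A (Y axis up, X axis to the right). Gravity sits on the
    Y axis, so it is removed before judging up/down motion; the threshold is an
    illustrative dead zone that ignores small jitters."""
    motion_ay = ay - G                 # remove gravity from the vertical reading
    if ax > threshold:
        return "move_right"
    if ax < -threshold:
        return "move_left"
    if motion_ay > threshold:
        return "move_up"
    if motion_ay < -threshold:
        return "move_down"
    return None                        # no preset displacement action detected
```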
  • FIG. 4B shows a schematic diagram of the main device 100 determining device actions according to gyroscope data.
  • the gyroscope sensor can also establish a spatial rectangular coordinate system with the center point of the main device 100 as the origin. Refer to the introduction in FIG. 4A , and details will not be repeated here.
  • the above gyroscope data specifically includes: X-axis angular velocity, Y-axis angular velocity and Z-axis angular velocity.
  • When the user drives the main device 100 to rotate, the spatial rectangular coordinate system established by the gyroscope sensor with the center point of the main device 100 as the origin also changes; according to these changes, the master device 100 can determine that it has made a turning motion.
  • the main device 100 may rotate from right to left with the Y axis as the rotation center.
  • the above actions can correspond to the left turn in Table 1.
  • the positive direction of the X-axis and Z-axis of the space Cartesian coordinate system will change. Specifically, referring to FIG. 4C , before turning left, the positive direction of the X-axis can be expressed as the direction indicated by X1; the positive direction of the Z-axis can be expressed as the direction indicated by Z1. After turning left, the positive direction of the X axis can be expressed as the direction pointed by X2; the positive direction of the Z axis can be expressed as the direction indicated by Z2.
  • The rotation angle between X1 and X2 is recorded as θ (angular velocity θ/s); the rotation angle between Z1 and Z2 is also θ (angular velocity θ/s); and the rotation angle about the Y axis is 0 (angular velocity 0/s).
  • When the gyroscope data meets a preset condition, the host device 100 can determine that it has made the corresponding device action (a rotation action), and may further determine the music material matching that rotation action. A classification sketch follows below.
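  • A companion sketch for rotation actions, integrating gyroscope angular velocities over one sampling interval; the sign conventions and the threshold are illustrative assumptions.

```python
def classify_rotation(wx, wy, dt, threshold_deg=10.0):
    """Classify a rotation action from gyroscope angular velocities (in deg/s)
    integrated over the sampling interval dt (seconds). With the axes of FIG. 4B,
    rotation about the vertical Y axis is a left/right turn and rotation about the
    horizontal X axis is looking up/down; signs and threshold are assumptions."""
    yaw = wy * dt      # accumulated rotation about the Y axis
    pitch = wx * dt    # accumulated rotation about the X axis
    if yaw > threshold_deg:
        return "turn_left"
    if yaw < -threshold_deg:
        return "turn_right"
    if pitch > threshold_deg:
        return "turn_up"
    if pitch < -threshold_deg:
        return "turn_down"
    return None        # no preset rotation action detected
```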
  • While detecting the device actions of the electronic devices, the main device 100 also needs to determine its azimuth relative to the user. Specifically, the master device 100 may determine its azimuth angle after making a specific device movement according to the change between the position before and the position after the movement.
  • FIG. 4C shows a schematic diagram of the master device 100 determining the azimuth after the master device 100 moves left.
  • the icon 41 shows the position of the master device 100 before moving to the left.
  • Icon 42 shows the position of the main device 100 after moving to the left.
  • First, the main device 100 can set the initial azimuth (θ0) to 0° and the distance to d1; that is, by default the main device 100 is directly in front of the user (the position indicated by the icon 41).
  • The distance here refers to the distance from the center point of the device to the midpoint of the line connecting the listener's ears. This is because, when completing the operation of playing audio, the user usually places the mobile phone directly in front of himself at a distance within about 50 cm (roughly arm's length), so that he can face the screen and complete the playback operation on it.
  • The main device 100 can then move left from the position shown by the icon 41 to the position shown by the icon 42. At this time, the master device 100 can determine the distance by which it has moved to the left, denoted d2. The new azimuth angle θ1 of the main device 100 relative to the user can then be determined from d1 and d2, and the master device 100 can also determine its new distance d3 from the user.
  • That is, the main device 100 can determine its position after the movement from the distance and direction of the movement and the position at the previous moment, and thereby determine its azimuth relative to the user. Based on this azimuth angle, the main device 100 can determine the filter coefficient used by the head function filter. A small numerical sketch follows below.
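  • The geometry of FIG. 4C reduces to simple trigonometry: the deviation from straight ahead is arctan(d2/d1) and the new distance is sqrt(d1² + d2²). The sketch below follows the convention of the example above, where 80° to the left of straight ahead is reported as an azimuth of 280°; the numerical values are illustrative.

```python
import math

def azimuth_after_left_move(d1, d2):
    """Update the device-to-user azimuth after the phone moves left by d2 metres,
    starting from the default position directly in front of the user at distance d1
    (FIG. 4C). 0 degrees means directly in front; angles left of front wrap towards
    360 degrees, matching the 280-degree example above."""
    angle_left = math.degrees(math.atan2(d2, d1))   # deviation from straight ahead
    azimuth = (360.0 - angle_left) % 360.0          # new azimuth theta1
    d3 = math.hypot(d1, d2)                         # new device-to-user distance
    return azimuth, d3

# e.g. starting 0.3 m in front and sliding 0.3 m to the left gives an azimuth of
# about 315 degrees (45 degrees left of front) at a distance of about 0.42 m.
print(azimuth_after_left_move(0.3, 0.3))
```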
  • the main device 100 can also directly detect the distance between the main device 100 and the user through the depth-sensing camera.
  • S105 Input the music material matched with the device action into the head function filter for 3D space rendering, so that the audio data of the music material has a spatial three-dimensional surround effect.
  • the head function filter refers to a device for processing audio data using a head-related transform function (HRTF).
  • the head function filter can simulate the propagation of the sound signal in the three-dimensional space, so that the sound heard by the user's ears is different, and it has a spatial three-dimensional surround effect.
  • In the embodiment of the present application, the main device 100 can determine the music material matching the device action through the correspondence recorded in the personalized setting module 222. After acquiring the music material, the main device 100 can first use the head function filter to render its audio data in 3D space and then superimpose the processed audio data onto the original audio, so that the audio the user hears not only carries the interactive sound effect, but that interactive sound effect also has a spatial stereo surround effect.
  • The process of performing 3D space rendering on the audio data of the music material with the head function filter is shown in FIG. 5A:
  • S201 Perform time-frequency domain conversion on the audio data of the music material.
  • the master device 100 may perform time-domain conversion or frequency-domain conversion on the audio data of the above music material to obtain time-domain audio data or frequency-domain audio data.
  • S202 Determine the filter coefficient of the head function filter according to the azimuth angle.
  • Before using the head function filter to render the audio data of the selected music material in 3D space, the main device 100 also needs to determine the filter coefficient of the head function filter. Filter coefficients can affect the rendering effect of 3D space rendering. If the filter coefficient is inappropriate or even wrong, there will be a significant difference between the sound processed by the head function filter and the sound actually transmitted to the user's ears, thereby affecting the user's listening experience.
  • Filter coefficients can be determined by azimuth.
  • the head-related transform function (HRTF) database records the mapping relationship between the azimuth and the filter data.
  • the master device 100 may determine the filter coefficient of the head function filter by querying the HRTF database.
  • the filter coefficients corresponding to the same azimuth are also correspondingly divided into time domain filter coefficients and frequency domain filter coefficients.
  • If it is determined to perform 3D space rendering on the audio data of the music material from the frequency domain, the main device 100 may determine the frequency domain filter coefficient as the filter coefficient of the head function filter. On the contrary, if it is determined to perform 3D space rendering on the audio data of the music material from the time domain, the main device 100 may determine the time domain filter coefficient as the filter coefficient of the head function filter.
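  • A minimal sketch of how such a lookup might work is given below. It assumes a hypothetical in-memory HRTF table keyed by azimuth angle and filled with placeholder impulse responses; the actual database format and angular resolution are not specified in this description.

```python
import numpy as np

# Hypothetical HRTF database: azimuth (degrees) -> (left, right) impulse responses.
# Real databases store measured responses; random data here is only a placeholder.
HRTF_DB = {
    az: (np.random.randn(128), np.random.randn(128))
    for az in range(0, 360, 15)
}

def lookup_hrtf(azimuth_deg: float, domain: str = "freq"):
    """Return the (left, right) filter coefficients nearest to azimuth_deg.

    domain="time" returns the impulse responses themselves (time-domain
    coefficients); domain="freq" returns their FFTs (frequency-domain
    coefficients), matching the two cases described above.
    """
    # Nearest stored azimuth, with wrap-around at 360 degrees.
    nearest = min(HRTF_DB, key=lambda az: abs(((az - azimuth_deg) + 180) % 360 - 180))
    h_left, h_right = HRTF_DB[nearest]
    if domain == "freq":
        return np.fft.rfft(h_left), np.fft.rfft(h_right)
    return h_left, h_right
```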
  • S203 Input the converted audio data of the music material into a head function filter for filtering.
  • the master device 100 may input the above audio data into a head function filter corresponding to the above filter coefficients. Then, the head function filter can multiply the input frequency-domain (or time-domain) audio data by corresponding filter coefficients to obtain rendered frequency-domain (or time-domain) audio data. At this time, the frequency-domain (or time-domain) audio data after rendering can have a spatial stereo surround effect.
  • S204 Perform inverse time-frequency transformation to obtain a 3D space rendering signal processed by a head function filter.
  • Before the audio data is input into the head function filter for filtering (S203), the master device 100 performs time-frequency domain conversion on the audio data (S201). Therefore, after the filtering is completed, the main device 100 also needs to perform an inverse time-frequency domain transformation on the converted audio data, so that it can be restored to a data format that can be processed by an audio player.
  • If the time-domain transformation is performed in S201, the main device 100 uses the inverse time-domain transformation to convert the rendered audio data; otherwise, if the frequency-domain transformation is performed in S201, the main device 100 uses the inverse frequency-domain transformation to convert the rendered audio data.
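  • Steps S201-S204 can be pictured with the following simplified sketch (an assumption-laden, single-block numpy illustration, not the implementation described by this application): the material is transformed to the frequency domain, multiplied by the left and right frequency-domain coefficients for the current azimuth, and transformed back.

```python
import numpy as np

def render_3d(material: np.ndarray, h_left: np.ndarray, h_right: np.ndarray):
    """Frequency-domain 3D rendering of one mono music material.

    material: mono audio samples of the music material
    h_left / h_right: HRTF impulse responses for the current azimuth
    Returns (left, right) rendered time-domain signals.
    """
    n = len(material) + len(h_left) - 1          # linear-convolution length
    x = np.fft.rfft(material, n)                 # S201: time-frequency conversion
    hl = np.fft.rfft(h_left, n)                  # S202: frequency-domain coefficients
    hr = np.fft.rfft(h_right, n)
    yl = x * hl                                  # S203: filtering = multiplication
    yr = x * hr
    return np.fft.irfft(yl, n), np.fft.irfft(yr, n)   # S204: inverse transform
```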
  • FIG. 5B schematically shows a schematic diagram of performing 3D space rendering on a frequency-domain audio signal by a head function filter using frequency-domain filter coefficients.
  • the graph 511 is a frequency domain signal of a certain audio material.
  • the vertical axis is the sample point amplitude (dB)
  • the horizontal axis is the frequency (Hz).
  • the frequency domain signal in the chart 511 can be used as the audio data of the music material after the frequency domain conversion introduced in S201.
  • Chart 512 and chart 513 are the frequency-domain filter coefficients corresponding to a certain azimuth angle in the head function database.
  • the chart 512 is the frequency-domain filter coefficient of the left channel corresponding to the azimuth; the chart 513 is the frequency-domain filter coefficient of the right channel corresponding to the azimuth.
  • the vertical axis is head function amplitude (dB), and the horizontal axis is frequency (Hz).
  • By multiplying the frequency-domain signal shown in chart 511 by the frequency-domain filter coefficients shown in chart 512 and chart 513 respectively, the main device 100 can obtain the rendered left channel frequency domain audio signal and right channel frequency domain audio signal.
  • Graph 514 and graph 515 show a left channel frequency domain audio signal and a right channel frequency domain audio signal, respectively.
  • After performing the inverse frequency-domain transformation on the signals shown in graph 514 and graph 515, the main device 100 can obtain the rendered left channel audio signal and right channel audio signal. Further, the left ear device of the slave device 200 may play the above left channel audio signal; the right ear device of the slave device 200 may play the above right channel audio signal. In this way, the additional music material heard by the user's left and right ears is different and has a spatial three-dimensional surround effect.
  • the head function filter can also use the time domain filter coefficients to render the time domain audio signal in 3D space.
  • graph 521 shows a time-domain signal of a certain audio material.
  • the vertical axis is the sample point amplitude
  • the horizontal axis is the sample point serial number according to time.
  • Chart 522 and chart 523 are time-domain filter coefficients corresponding to a certain azimuth angle in the head function database.
  • chart 522 is the time-domain filter coefficient of the left channel corresponding to the azimuth
  • chart 523 is the time-domain filter coefficient of the right channel corresponding to the azimuth.
  • the vertical axis is the sample point amplitude
  • the horizontal axis is the sample point serial number according to time.
  • the method based on the time domain has a higher computational complexity than the method based on the frequency domain when the filter length is long. Therefore, in the case of a long filter length, the main device 100 may preferentially adopt a method based on the frequency domain to render the audio signal in the frequency domain, so as to reduce time complexity and save computing resources.
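  • The trade-off mentioned above can be illustrated roughly: direct time-domain convolution of N audio samples with a length-L filter costs on the order of N·L operations, while FFT-based frequency-domain filtering costs on the order of N·log N. The sketch below shows one plausible switching policy; the threshold value is purely an assumption.

```python
import numpy as np

def filter_material(x: np.ndarray, h: np.ndarray, length_threshold: int = 64):
    """Choose a filtering method by filter length (illustrative threshold only)."""
    if len(h) <= length_threshold:
        # Short filter: direct time-domain convolution is cheap enough.
        return np.convolve(x, h)
    # Long filter: FFT-based frequency-domain filtering reduces the computation.
    n = len(x) + len(h) - 1
    return np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)
```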
  • S106 Add the spatially rendered music material to the audio being played by the main device 100.
  • the master device 100 may attach the music material to the audio being played by the master device 100 . This allows the user to hear both the audio being played and the attached music material at the same time.
  • the master device 100 can directly attach the music material to the audio being played. If the number of audios superimposed at the same time is too large, it is easy to cause the signal to be too large after superimposition, resulting in clipping. Therefore, in the process of adding music material, the main device 100 can also use a weighting method to avoid the situation that the superimposed signal is too large.
  • Specifically, the weighted superposition can be expressed as: S_output = S_input + Σ_i (w_i · r_i), where: S_output is the superimposed output signal; S_input is the originally played music signal; r_i is the i-th music material; and w_i is the weight of the i-th music material.
  • In some embodiments, the master device 100 may also set different weights for different electronic devices, with the sum of the weights equal to 1. For example, the weight W1 corresponding to the slave device 200 can be 0.3, the weight W2 corresponding to the slave device 300 can be 0.3, and the weight W3 corresponding to the master device 100 can be 0.4.
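  • A minimal mixing sketch following the weighted-superposition idea above is shown below; the function and variable names are illustrative, and the final clipping safeguard is an added assumption rather than part of the original text.

```python
import numpy as np

def mix(original: np.ndarray, materials, weights):
    """Superimpose weighted music materials onto the audio being played.

    original:  S_input, the music signal currently being played
    materials: rendered music-material signals r_i (same length as original)
    weights:   weights w_i; keeping sum(w_i) <= 1 bounds the superimposed level
    """
    out = original.astype(np.float64).copy()
    for r, w in zip(materials, weights):
        out += w * r                    # S_output = S_input + sum_i(w_i * r_i)
    return np.clip(out, -1.0, 1.0)      # safety net against residual clipping (assumption)
```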
  • S107: Perform basic sound effect processing on the audio with the attached music material, and then play it.
  • the master device 100 may also perform basic sound effect processing on the audio after the music material is added.
  • the above-mentioned basic sound effects specifically include: equalization, dynamic range control, limiting, and low-frequency enhancement, etc.
  • the audio after basic sound processing has better quality. Therefore, users can obtain a better listening experience.
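  • As one hedged example of the "limiting" step named above, a simple soft peak limiter could look like the sketch below; the actual basic sound-effect chain is not detailed in this description, so the algorithm and threshold are assumptions.

```python
import numpy as np

def soft_limit(audio: np.ndarray, threshold: float = 0.9):
    """Soft limiter: peaks above the threshold are compressed with tanh."""
    out = np.asarray(audio, dtype=np.float64).copy()
    over = np.abs(out) > threshold
    # Map the portion above the threshold through tanh so peaks approach,
    # but never exceed, full scale (1.0).
    out[over] = np.sign(out[over]) * (
        threshold + (1.0 - threshold) * np.tanh(
            (np.abs(out[over]) - threshold) / (1.0 - threshold)
        )
    )
    return out
```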
  • the main device 100 can play the above-mentioned audio.
  • When the audio is played through the slave device 200, the process of converting the electrical signal into a sound signal is completed by the slave device 200.
  • In this way, the sound that the user hears from the slave device 200 includes not only the audio originally specified by the user, but also the interactive music material generated according to the movement of the device.
  • the main device 100 can detect the motion state of the electronic device when playing audio such as music. When it is detected that the electronic device makes an action that matches the preset action, the main device 100 may add the music material that matches the above action to the music being played. In this way, the user can add interactive effects to the music while listening to the music, thereby improving the fun of the music playing process and satisfying the user's need for interaction with the audio being played.
  • In addition, the main device 100 also renders the additional music material in 3D space according to the position change between the electronic device and the user, so that the additional music material heard by the user also has a three-dimensional spatial surround effect.
  • 6A-6J are a set of user interfaces provided by the embodiment of the present application.
  • a schematic diagram of a user interface for implementing a sound processing method provided by an embodiment of the present application will be introduced below with reference to FIGS. 6A-6J .
  • FIG. 6A shows a schematic diagram of the main device 100 displaying a first user interface.
  • the first user interface includes a status bar 601 , an area 602 , and an area 603 .
  • The status bar 601 specifically includes: one or more signal strength indicators of mobile communication signals (also referred to as cellular signals), one or more signal strength indicators of wireless fidelity (Wi-Fi) signals, a battery status indicator, a time indicator, etc.
  • Area 602 can be used to display some global setting buttons.
  • Area 603 can be used to display specific music materials that match each device action.
  • the master device 100 can detect a user operation acting on a certain electronic device, and in response to the operation, the master device 100 can set not to detect the device action of the electronic device.
  • the foregoing user operation is, for example, a left-swipe delete operation, etc., which is not limited in this embodiment of the present application.
  • a button 611 and a button 612 may be displayed in the area 602 .
  • the main device 100 may randomly match device actions and music materials in response to the above operation. In this way, the user does not need to set the music material matching the actions of each device one by one. At this time, the music material associated with each device action displayed in the area 603 is "random".
  • the host device 100 may display the user interface shown in FIG. 6B in response to the above operation.
  • the user can set music materials matching the actions of each device one by one. For example, the left turning action of "headphone A" shown in the area 603 in FIG. 6B may match snare drum type music material.
  • the first user interface shown in FIG. 6A may further include a button 613 and a button 614 .
  • Button 613 can be used to set the user's mood. According to the aforementioned mood, the master device 100 may filter the music materials provided in the music material library 221 . The main device 100 may not display music materials that obviously do not meet the user's current mood. In this way, the user can filter out some unnecessary music materials through the button 613, thereby reducing the operation complexity of the user's designation of music materials.
  • the main device 100 may detect a user operation acting on the button 613, and in response to the above operation, the main device 100 may display the user interface shown in FIG. 6C. At this time, the main device 100 may display a series of mood types that can be selected by the user, including joy, sadness, anger, fear and so on.
  • the master device 100 can filter all types of music materials provided in the music material library 221 according to the mood type. For example, after the main device 100 detects a user operation on the sad button 631 , the main device 100 may filter out music materials matching the sad mood provided in the music material library 221 according to the sad mood type.
  • the above-mentioned music materials suitable for sad mood are, for example, erhu, sound of rain and so on.
  • the main device 100 may not display the aforementioned music materials that obviously do not fit the sad mood, such as suona, birdsong, etc.
  • the user interface shown in FIG. 6C also includes a random button 632 and a no effect button 633 .
  • the main device 100 may randomly set the user's mood type, and then filter music materials matching the mood type according to the randomly set mood type.
  • the host device 100 may not perform an operation of filtering music materials provided in the music material library 221 from the perspective of mood type in response to the above operation.
  • the above mood may also be automatically sensed by the master device 100, that is, the master device 100 may judge the user's current mood by acquiring the user's physiological data.
  • the user interface shown in FIG. 6C may include a self-aware button 634 .
  • Button 614 may be used to set the musical style of the overall additional musical material. Similarly, according to the selected music style, the main device 100 can filter the music materials provided in the music material library 221 . For music materials that obviously do not conform to the user's current music style, the main device 100 may not display them. In this way, the user can filter out some unnecessary music materials through the button 614, thereby reducing the operation complexity of the user's designation of music materials.
  • the host device 100 may display the user interface shown in FIG. 6D.
  • the main device 100 can display a series of music styles that can be selected by the user, including pop, rock, electronic, folk, classical and so on.
  • the master device 100 may filter out music materials matching the rock genre provided in the music material library 221 .
  • the aforementioned music materials conforming to the rock type include guitar, bass, drum kit and the like.
  • the main device 100 may not display the aforementioned music materials that obviously do not conform to the rock type, such as guzheng, pipa, etc.
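  • The mood and style filtering described above can be pictured as simple tag matching over the material library. The library contents and tags in the sketch below are hypothetical examples, not the actual contents of the music material library 221.

```python
from typing import Optional

# Hypothetical music-material library with style and mood tags.
MATERIAL_LIBRARY = {
    "snare drum": {"styles": {"rock", "pop"},        "moods": {"joy", "anger"}},
    "erhu":       {"styles": {"folk", "classical"},  "moods": {"sadness"}},
    "rain":       {"styles": {"electronic", "folk"}, "moods": {"sadness", "fear"}},
    "suona":      {"styles": {"folk"},               "moods": {"joy"}},
}

def filter_materials(mood: Optional[str] = None, style: Optional[str] = None):
    """Return the materials compatible with the selected mood and/or style."""
    result = []
    for name, tags in MATERIAL_LIBRARY.items():
        if mood and mood not in tags["moods"]:
            continue
        if style and style not in tags["styles"]:
            continue
        result.append(name)
    return result

print(filter_materials(mood="sadness"))   # -> ['erhu', 'rain']
```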
  • the main device 100 may display a user interface including multiple types of music materials.
  • the host device 100 may display the user interface shown in FIG. 6E.
  • the interface may display a plurality of different types of option buttons, such as button 651 , button 652 , button 653 , and button 654 .
  • the button 651 can be used to display the music material of musical instrument type.
  • Button 652 can be used to display music material of animal sound type;
  • button 653 can be used to display music material of environment type;
  • button 654 can be used to display user's recording.
  • the host device 100 may display the user interface shown in FIG. 6F.
  • the user interface may be displayed with a plurality of buttons indicating different instrument types, such as snare drum, bass drum, maracas, piano, accordion, and the like.
  • the master device 100 may detect a user operation acting on any button, and in response to the operation, the master device 100 may match the music material corresponding to the button to the device action (turn left) corresponding to the button 621 . In this way, when the above-mentioned device action is detected, the main device 100 can add the above-mentioned music material to the audio being played.
  • the main device 100 may display the user interface shown in FIG. 6G.
  • the user interface may display a plurality of buttons indicating different types of animal sounds, such as birdsong, frogs, insects, cats, dogs and so on.
  • the host device 100 may display the user interface shown in FIG. 6H.
  • the user interface may display a plurality of buttons indicating different types of environmental sounds, such as wind, rain, thunder, running water and so on.
  • the host device 100 may display the user interface shown in FIG. 6I.
  • the user interface may display a plurality of buttons indicating user-defined recordings, such as "Hello", "Hi", "Come on" and so on.
  • When the user selects two music materials in succession, the main device 100 can set the latter one as the music material selected by the user; that is, one device action matches one type of music material. For example, after the user selects the snare drum among the musical instrument sounds, if the user then selects the rain sound among the environmental sounds, the master device 100 may determine that the rain sound is the music material selected by the user.
  • the user interfaces shown in FIGS. 6F-6G also include a random button and a no effect button.
  • For the random button and the no-effect button, refer to the introduction in FIG. 6C, which will not be repeated here.
  • the master device 100 may also set a random button on the right side of the button 651 , the button 652 , the button 653 , and the button 654 .
  • the user can directly set random music materials on the user interface shown in FIG. 6E , thereby reducing user operations, reducing operation complexity, and improving user experience.
  • the user interface shown in FIG. 6E may also include a button 655. Similar to the random button described above, the button 655 can provide the user with the function of setting no effect directly on the user interface shown in FIG. 6E, thereby reducing user operations, reducing operation complexity, and improving user experience.
  • the user interface shown in FIG. 6E may also include a button 656 for adding a recording.
  • In response to a user operation acting on the button 656, the host device 100 may display the user interface shown in FIG. 6J.
  • the interface may include a button to start recording, a button to listen to a recording, a button to save a recording, and the like.
  • the interface may include a button indicating the user's newly recorded recording file.
  • the interface may include a button named "Welcome”. The user can click the button to select the music material.
  • the user interface shown in FIG. 6I may also include a button for adding a recording. Refer to the introduction of the button 656 shown in FIG. 6E , which will not be repeated here.
  • the user can freely select and set music materials that match the actions of the device.
  • the main device 100 can determine the music material associated with the above-mentioned device action by querying the association relationship preset by the user.
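  • The association relationship can be thought of as a small lookup table keyed by device and action. The entries below are hypothetical examples loosely matching FIG. 6B, used only to illustrate the query.

```python
# Hypothetical user-configured association table: (device, action) -> music material.
ASSOCIATIONS = {
    ("earphone A", "turn left"):  "snare drum",
    ("earphone A", "turn right"): "rain",
    ("watch B",    "raise arm"):  "birdsong",
}

def material_for(device: str, action: str):
    """Look up the music material the user matched to this device action."""
    return ASSOCIATIONS.get((device, action))   # None means no effect is configured
```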
  • The original audio 211 shown in FIG. 2 may be referred to as the first audio; music materials such as wind sound and drums may be referred to as the second audio relative to the original audio 211; the second audio after 3D space rendering can be played with a variable stereo (spatial surround) effect.
  • the action of moving the user's head to the left may be referred to as a first action.
  • the device action of moving the slave device 200 to the left may reflect the above-mentioned action of the user moving the head to the left.
  • the first action may also be a combined action, for example, an action in which the user moves the head and the arm to the left at the same time may be referred to as the first action.
  • moving the head to the left may be called a second action; moving the arm to the left may be called another second action.
  • the music material corresponding to moving the head to the left can be called a fourth audio; the music material corresponding to moving the arm to the left can be called another fourth audio.
  • the second audio includes the above two fourth audio.
  • the output audio 213 shown in FIG. 2 may be referred to as a third audio.
  • the filter coefficient of the head function filter determined according to the azimuth angle in FIG. 5A may be referred to as a first parameter.
  • the master device 100 can obtain several pieces of audio data by dividing the audio being played, and the second data section to be played can be called a first section.
  • the duration of the first interval is equal to the duration of the additional music material, that is, equal to the first duration.
  • The user interface shown in FIG. 6A or FIG. 6B can be called the first user interface; in FIG. 6A or FIG. 6B, the icon representing the "turn left" action of "earphone A" can be called the first icon, and the control 621 next to the first icon for selecting the music material (the name of the music material displayed on the control 621 is "random" in FIG. 6A and "snare drum" in FIG. 6B) can be called the first control.
  • Table 1 may be referred to as the storage table.
  • FIG. 7 exemplarily shows a hardware structural diagram of the master device 100 , the slave device 200 , and the slave device 300 .
  • the following describes the hardware structure of the electronic device involved in the embodiment of the present application with reference to FIG. 7 .
  • the hardware modules of the master device 100 include: a processor 701 , a memory 702 , a sensor 703 , a touch screen 704 , and an audio unit 705 .
  • the hardware modules of the slave device 200 include: a processor 711 , a sensor 712 , and a sounding unit 713 .
  • the hardware modules of the slave device 300 include: a processor 721 and a sensor 722 .
  • the structure shown in the embodiment of the present invention does not constitute a specific limitation on the foregoing electronic device.
  • the above-mentioned electronic device may include more or fewer components than those shown in the illustrations, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • the hardware module structure and the cooperative relationship among the modules of the slave device 200 and the slave device 300 are simpler. Therefore, the hardware structure of the main device 100 is introduced here by taking the main device 100 as an example.
  • the processor 701 may include one or more processing units, for example: the processor 701 may include an application processor (application processor, AP), a modem processor, a graphics processing unit (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), controller, video codec, digital signal processor (digital signal processor, DSP), baseband processor, and/or neural network processor (neural-network processing unit, NPU), etc. Wherein, different processing units may be independent devices, or may be integrated in one or more processors.
  • the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
  • a memory may also be provided in the processor 701 for storing instructions and data.
  • the memory in processor 701 is a cache memory.
  • the memory may hold instructions or data that the processor 701 has just used or recycled. If the processor 701 needs to use the instruction or data again, it can be called directly from the memory. Repeated access is avoided, and the waiting time of the processor 701 is reduced, thus improving the efficiency of the system.
  • processor 701 may include one or more interfaces.
  • the interface may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous transmitter (universal asynchronous receiver/transmitter, UART) interface, mobile industry processor interface (mobile industry processor interface, MIPI), general-purpose input and output (general-purpose input/output, GPIO) interface, subscriber identity module (subscriber identity module, SIM) interface, and /or universal serial bus (universal serial bus, USB) interface, etc.
  • The I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL).
  • processor 701 may include multiple sets of I2C buses.
  • the processor 701 can be respectively coupled to a touch sensor, a charger, a flashlight, a camera, etc. through different I2C bus interfaces.
  • the processor 701 can be coupled to the touch sensor through the I2C interface, so that the processor 701 and the touch sensor can communicate through the I2C bus interface to realize the touch function of the master device 100 .
  • the I2S interface can be used for audio communication.
  • processor 701 may include multiple sets of I2S buses.
  • the processor 701 may be coupled to the audio unit 705 through an I2S bus to implement communication between the processor 701 and the audio unit 705 .
  • the audio unit 705 can transmit audio signals to the wireless communication module through the I2S interface, so as to realize the function of answering calls through the Bluetooth headset.
  • the PCM interface can also be used for audio communication, sampling, quantizing and encoding the analog signal.
  • the audio unit 705 and the wireless communication module may be coupled through a PCM bus interface.
  • the audio unit 705 can also transmit audio signals to the wireless communication module through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication.
  • a UART interface is generally used to connect the processor 701 with the wireless communication module.
  • the processor 701 communicates with the Bluetooth module in the wireless communication module through the UART interface to realize the Bluetooth function.
  • the audio unit 705 can transmit audio signals to the wireless communication module through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
  • the MIPI interface can be used to connect the processor 701 with the touch screen 704, camera and other peripheral devices.
  • The MIPI interface includes the camera serial interface (CSI), the display serial interface (DSI), and the like.
  • the processor 701 communicates with the camera through a CSI interface to realize the shooting function of the main device 100 .
  • the processor 701 communicates with the touch screen 704 through the DSI interface to realize the display function of the main device 100 .
  • the GPIO interface can be configured by software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor 701 with the camera, touch screen 704, wireless communication module, audio unit 705, sensor module 180 and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface is an interface that conforms to the USB standard specification, specifically, it can be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
  • the USB interface can be used to connect a charger to charge the main device 100, and can also be used to transmit data between the main device 100 and peripheral devices. It can also be used to connect headphones and play audio through them. This interface can also be used to connect other electronic devices, such as AR devices.
  • the interface connection relationship between the modules shown in the embodiment of the present invention is only a schematic illustration, and does not constitute a structural limitation of the main device 100 .
  • the main device 100 may also adopt different interface connection methods in the above embodiments, or a combination of multiple interface connection methods.
  • the memory 702 may include one or more random access memories (random access memory, RAM) and one or more non-volatile memories (non-volatile memory, NVM).
  • Random access memory may include static random-access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM; for example, the fifth-generation DDR SDRAM is generally called DDR5 SDRAM), etc.
  • Non-volatile memory may include magnetic disk storage devices and flash memory. Divided by operating principle, flash memory may include NOR FLASH, NAND FLASH, 3D NAND FLASH, etc.; divided by storage cell potential level, it may include single-level cell (SLC), multi-level cell (MLC), triple-level cell (TLC), quad-level cell (QLC), etc.; divided by storage specification, it may include universal flash storage (UFS), embedded multimedia memory card (eMMC), etc.
  • the random access memory can be directly read and written by the processor 701, and can be used to store executable programs (such as machine instructions) of an operating system or other running programs, and can also be used to store data of users and application programs.
  • the non-volatile memory can also store executable programs and data of users and application programs, etc., and can be loaded into the random access memory in advance for the processor 701 to directly read and write.
  • the main device 100 can also include an external memory interface that can be used to connect an external non-volatile memory, so as to expand the storage capacity of the main device 100.
  • the external non-volatile memory communicates with the processor 701 through the external memory interface to realize the data storage function. For example, files such as music and video are stored in an external non-volatile memory.
  • the computer program implementing the sound processing method may be stored in the memory 702 .
  • the sensor 703 includes a plurality of sensors.
  • implementing the method provided in the embodiment of the present application mainly involves an acceleration sensor and a gyroscope sensor.
  • the acceleration sensor can detect the acceleration of the host device 100 in various directions (generally three axes).
  • the magnitude and direction of gravity can be detected when the main device 100 is stationary. It can also be used to identify the posture of electronic devices, and can be used in applications such as horizontal and vertical screen switching, pedometers, etc.
  • the gyro sensor can be used to determine the motion posture of the main device 100 .
  • the angular velocity of the main device 100 around three axes may be determined by a gyro sensor.
  • the gyro sensor can be used for image stabilization. Exemplarily, when the shutter is pressed, the gyro sensor detects the shaking angle of the main device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shaking of the main device 100 through reverse movement to achieve anti-shake.
  • Gyroscope sensors can also be used for navigation and somatosensory game scenes.
  • the master device 100 detects the device actions of the master device 100, the slave device 200 (and the slave device 300) depending on the acceleration sensor and the gyroscope sensor.
  • the master device 100 also relies on the above sensors to determine the azimuth angle between itself and the user.
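  • A very rough sketch of how a "move left" device action might be inferred from accelerometer samples is shown below; the axis convention, the double-integration approach, and the threshold are all assumptions, since the actual detection logic is not spelled out here.

```python
import numpy as np

def detect_left_move(accel_x: np.ndarray, dt: float, dist_threshold: float = 0.2):
    """Detect a leftward move by double-integrating lateral acceleration.

    accel_x: lateral acceleration samples in m/s^2 (negative = leftwards, assumed)
    dt:      sampling interval in seconds
    Returns the estimated leftward displacement in metres, or None if the
    movement is too small to count as a device action.
    """
    velocity = np.cumsum(accel_x) * dt        # integrate acceleration -> velocity
    displacement = np.cumsum(velocity) * dt   # integrate velocity -> displacement
    moved_left = -displacement[-1]            # flip sign so positive = moved left
    return moved_left if moved_left > dist_threshold else None
```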
  • the sensor 703 may also include other sensors, such as a pressure sensor, an air pressure sensor, a magnetic sensor, a distance sensor, a proximity light sensor, an ambient light sensor, a fingerprint sensor, a temperature sensor, a bone conduction sensor, and the like.
  • the pressure sensor is used to sense the pressure signal and convert the pressure signal into an electrical signal.
  • a pressure sensor may be provided on the touch screen 704 .
  • pressure sensors such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors.
  • a capacitive pressure sensor may be comprised of at least two parallel plates with conductive material. When a force is applied to the pressure sensor, the capacitance between the electrodes changes.
  • the main device 100 determines the intensity of pressure according to the change in capacitance.
  • the main device 100 detects the intensity of the touch operation according to the pressure sensor.
  • the host device 100 may also calculate the touched position according to the detection signal of the pressure sensor.
  • touch operations acting on the same touch position but with different touch operation intensities may correspond to different operation instructions. For example: when a touch operation with a touch operation intensity less than the first pressure threshold acts on the short message application icon, an instruction to view short messages is executed. When a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the icon of the short message application, the instruction of creating a new short message is executed.
  • the barometric pressure sensor is used to measure air pressure.
  • the main device 100 calculates the altitude by using the air pressure value measured by the air pressure sensor to assist positioning and navigation.
  • Magnetic sensors include Hall sensors.
  • The main device 100 may detect the opening and closing of a flip holster or flip cover using the magnetic sensor, and set features such as automatic unlocking of the flip cover according to the detected opening or closing state.
  • The distance sensor is used to measure distance. The main device 100 may measure distance by infrared or laser. In some embodiments, when shooting a scene, the main device 100 may use the distance sensor to measure distance to achieve fast focusing.
  • Proximity light sensors may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
  • the light emitting diodes may be infrared light emitting diodes.
  • the main device 100 emits infrared light through the light emitting diode.
  • the main device 100 detects infrared reflected light from nearby objects using a photodiode. When sufficient reflected light is detected, it can be determined that there is an object near the main device 100. When insufficient reflected light is detected, the host device 100 may determine that there is no object near the host device 100 .
  • the main device 100 can use the proximity light sensor to detect that the user holds the main device 100 close to the ear to make a call, so as to automatically turn off the screen to save power.
  • Proximity light sensor can also be used for leather case mode, pocket mode auto unlock and lock screen.
  • the ambient light sensor is used to sense the ambient light brightness.
  • the master device 100 can adaptively adjust the brightness of the touch screen 704 according to the perceived ambient light brightness.
  • the ambient light sensor can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor can also cooperate with the proximity light sensor to detect whether the main device 100 is in the pocket, so as to prevent accidental touch.
  • the fingerprint sensor is used to collect fingerprints.
  • the main device 100 can use the collected fingerprint characteristics to realize fingerprint unlocking, access to the application lock, take pictures with fingerprints, answer incoming calls with fingerprints, and so on.
  • a temperature sensor is used to detect temperature.
  • the master device 100 uses the temperature detected by the temperature sensor to implement a temperature processing strategy. For example, when the temperature reported by the temperature sensor exceeds the threshold, the master device 100 may reduce the performance of a processor located near the temperature sensor, so as to reduce power consumption and implement thermal protection.
  • In some other embodiments, when the temperature is lower than another threshold, the main device 100 heats the battery, so as to avoid abnormal shutdown of the main device 100 caused by the low temperature.
  • the master device 100 boosts the output voltage of the battery to avoid abnormal shutdown caused by low temperature.
  • Bone conduction sensors can pick up vibration signals.
  • the bone conduction sensor can acquire the vibration signal of the vibrating bone mass of the human voice.
  • Bone conduction sensors can also contact the human pulse and receive blood pressure beating signals.
  • the bone conduction sensor can also be disposed in the earphone, combined into a bone conduction earphone.
  • the audio unit 705 can analyze the voice signal based on the vibration signal of the vibrating bone mass of the vocal part acquired by the bone conduction sensor, so as to realize the voice function.
  • the application processor can analyze the heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor, so as to realize the heart rate detection function.
  • the touch screen 704 includes a display screen and a touch sensor (also called “touch device”).
  • the display screen is used to display the user interface.
  • the touch sensor can be arranged on the display screen, and the touch sensor and the display screen form a "touch screen".
  • the touch sensor is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation may be provided through the display screen.
  • the touch sensor may also be disposed on the surface of the main device 100, which is different from the position of the display screen.
  • the user interface shown in FIGS. 6A-6J relies on the touch screen 704 .
  • the audio unit 705 includes audio modules such as a speaker, a receiver, a microphone, an earphone jack, and an application processor to implement audio functions such as music playing and recording.
  • the audio unit 705 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
  • the audio unit 705 may also be used to encode and decode audio signals.
  • the audio unit 705 may be set in the processor 701 , or some functional modules of the audio unit 705 may be set in the processor 701 .
  • The loudspeaker, also called a "horn", is used to convert audio electrical signals into sound signals.
  • the main device 100 can listen to music through a speaker, or listen to a hands-free call.
  • the main device 100 can play audio, such as music, etc. through a speaker.
  • the sound generating unit 713 of the slave device 200 can realize the function of converting audio electrical signals into sound signals.
  • The receiver, also known as the "earpiece", is used to convert audio electrical signals into sound signals.
  • When the master device 100 answers a phone call or plays a voice message, the user can listen to the voice by putting the receiver close to the ear.
  • the headphone jack is used to connect wired headphones.
  • The microphone, also called a "mike" or "sound transducer", is used to convert sound signals into electrical signals. When making a phone call or sending a voice message, the user can speak with the mouth close to the microphone, inputting the sound signal into the microphone.
  • the main device 100 may be provided with at least one microphone. In some other embodiments, the main device 100 may be provided with two microphones, which may also implement a noise reduction function in addition to collecting sound signals. In some other embodiments, the main device 100 can also be provided with three, four or more microphones to realize sound signal collection, noise reduction, identify sound sources, realize directional recording functions, and the like.
  • the earphone jack can be a USB interface, or a 3.5mm open mobile terminal platform (open mobile terminal platform, OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the main device 100 may also include other hardware modules.
  • the main device 100 may further include a communication module.
  • the communication module includes: antenna, mobile communication module, wireless communication module, modem processor and baseband processor, etc.
  • the master device 100 can establish a wireless connection with the slave device 200 through the above communication module. Based on the above wireless connection, the master device 100 can convert audio electrical signals into sound signals through the sound generating unit 713 of the slave device 200 . At the same time, based on the aforementioned wireless connection, the master device 100 can acquire motion data (acceleration data, gyroscope data) collected by the sensor 712 of the slave device 200 .
  • Antennas are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the master device 100 can be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: Antennas can be multiplexed as diversity antennas for wireless LANs. In other embodiments, the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module can provide wireless communication solutions including 2G/3G/4G/5G applied on the main device 100 .
  • the mobile communication module may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
  • the mobile communication module can receive electromagnetic waves through the antenna, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module can also amplify the signal modulated by the modem processor, and convert it into electromagnetic wave and radiate it through the antenna.
  • at least part of the functional modules of the mobile communication module may be set in the processor 701 .
  • at least part of the functional modules of the mobile communication module and at least part of the modules of the processor 701 may be set in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs sound signals through audio equipment (not limited to speakers, receivers, etc.), or displays images or videos through the touch screen 704 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent of the processor 701, and be set in the same device as the mobile communication module or other functional modules.
  • the wireless communication module can provide wireless local area networks (wireless local area networks, WLAN) (such as wireless fidelity (Wi-Fi) network), bluetooth (bluetooth, BT), global navigation satellite system, etc. (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field communication technology (near field communication, NFC), infrared technology (infrared, IR) and other wireless communication solutions.
  • the wireless communication module may be one or more devices integrating at least one communication processing module.
  • the wireless communication module receives electromagnetic waves through the antenna, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 701 .
  • the wireless communication module can also receive the signal to be sent from the processor 701, frequency-modulate it, amplify it, and convert it into electromagnetic wave to radiate through the antenna.
  • the antenna of the main device 100 is coupled to the mobile communication module, and the antenna is coupled to the wireless communication module, so that the main device 100 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (general packet radio service, GPRS), code division multiple access (code division multiple access, CDMA), broadband Code division multiple access (wideband code division multiple access, WCDMA), time division code division multiple access (time-division code division multiple access, TD-SCDMA), long term evolution (long term evolution, LTE), BT, GNSS, WLAN, NFC , FM, and/or IR techniques, etc.
  • the GNSS may include a global positioning system (global positioning system, GPS), a global navigation satellite system (global navigation satellite system, GLONASS), a Beidou navigation satellite system (beidou navigation satellite system, BDS), a quasi-zenith satellite system (quasi -zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
  • the main device 100 also includes a GPU, a touch screen 704, and an application processor.
  • the above hardware modules support the realization of the display function.
  • the GPU is a microprocessor for image processing, connected to the touch screen 704 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 701 may include one or more GPUs that execute program instructions to generate or change display information.
  • the touch screen 704 is used to display images, videos and the like.
  • the touch screen 704 includes a display panel.
  • The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light emitting diodes (QLED), etc.
  • the master device 100 may include 1 or N touch screens 704, where N is a positive integer greater than 1.
  • the main device 100 can realize the shooting function through ISP, camera, video codec, GPU, touch screen 704 and application processor.
  • The ISP is used to process the data fed back by the camera. For example, when taking a picture, the shutter is opened and light is transmitted to the photosensitive element of the camera through the lens, where the light signal is converted into an electrical signal; the photosensitive element then transmits the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin color. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be located in the camera.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the light signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other image signals.
  • the master device 100 may include 1 or N cameras, where N is a positive integer greater than 1.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the master device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the main device 100 may support one or more video codecs.
  • the main device 100 can play or record videos in various encoding formats, for example: moving picture experts group (moving picture experts group, MPEG) 1, MPEG2, MPEG3, MPEG4 and so on.
  • the charging management module is used to receive charging input from the charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module can receive charging input from the wired charger through a USB interface.
  • the charging management module can receive wireless charging input through the wireless charging coil of the main device 100 . While charging the battery, the charging management module can also supply power to the electronic device through the power management module.
  • the power management module is used to connect the battery, the charging management module and the processor 701 .
  • the power management module receives the input of the battery and/or the charging management module, and supplies power to the processor 701, the memory 702, the touch screen 704, the camera, and the wireless communication module.
  • the power management module can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module can also be set in the processor 701 .
  • the power management module and the charging management module can also be set in the same device.
  • the NPU is a neural-network (NN) computing processor.
  • the NPU can quickly process input information and continuously learn by itself.
  • Applications such as intelligent cognition of the main device 100 can be implemented through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • Buttons include power button, volume button, etc.
  • the keys may be mechanical keys. It can also be a touch button.
  • the main device 100 can receive key input and generate key signal input related to user setting and function control of the main device 100 .
  • the motor can generate a vibrating prompt.
  • the motor can be used for incoming call vibration prompts, and can also be used for touch vibration feedback.
  • touch operations applied to different applications may correspond to different vibration feedback effects.
  • the motors may also correspond to different vibration feedback effects for touch operations on different areas of the touch screen 704 .
  • Different application scenarios for example: time reminder, receiving information, alarm clock, games, etc.
  • the touch vibration feedback effect can also support customization.
  • the indicator can be an indicator light, which can be used to indicate the charging status, the change of the power level, and can also be used to indicate messages, missed calls, notifications, etc.
  • the SIM card interface is used to connect a SIM card.
  • the SIM card can be connected to and separated from the main device 100 by inserting it into the SIM card interface or pulling it out from the SIM card interface.
  • the master device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • the SIM card interface can support Nano SIM card, Micro SIM card, SIM card, etc.
  • the same SIM card interface can insert multiple cards at the same time. The types of the multiple cards may be the same or different.
  • the SIM card interface is also compatible with different types of SIM cards.
  • the SIM card interface is also compatible with external memory cards.
  • the master device 100 interacts with the network through the SIM card to implement functions such as call and data communication.
  • the main device 100 adopts an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the main device 100 and cannot be separated from the main device 100 .
  • For the processor 711, sensor 712, and sounding unit 713 of the slave device 200, refer to the introduction of the above-mentioned processor 701, sensor 703, and audio unit 705; the processor 721 and sensor 722 of the slave device 300 may likewise refer to the above introduction, and details are not repeated here.
  • the slave device 200 and the slave device 300 may also include other hardware modules, which is not limited in this embodiment of the present application.
  • the user can drive the electronic device to move through his own actions (such as shaking his head, shaking his hands, etc.) when playing audio.
  • The electronic device can recognize the above-mentioned actions through motion detection, and determine the music material matching the actions according to the preset association relationship, so as to add entertaining interactive effects to the audio being played, increase the fun of the audio playback process, and satisfy the user's need for interaction with the audio being played.
  • The term "user interface (UI)" in the specification, claims, and drawings of this application is a medium interface for interaction and information exchange between an application program or an operating system and a user, which realizes the conversion between the internal form of information and a form acceptable to the user.
  • The user interface of an application program is source code written in specific computer languages such as Java and extensible markup language (XML). The interface source code is parsed and rendered on the electronic device and finally presented as content the user can recognize, such as pictures, text, buttons, and other controls.
  • A control, also known as a widget, is a basic element of the user interface.
  • Typical controls include the toolbar, menu bar, text box, button, scroll bar, images, and text.
  • The properties and contents of the controls in the interface are defined through tags or nodes.
  • For example, XML specifies the controls contained in the interface through nodes such as <Textview>, <ImgView>, and <VideoView>.
  • a node corresponds to a control or property in the interface, and after the node is parsed and rendered, it is presented as the content visible to the user.
  • The interfaces of many applications, such as hybrid applications, usually also include web pages.
  • A web page, also called a page, can be understood as a special control embedded in the application program interface.
  • A web page is source code written in a specific computer language, such as hypertext markup language (HTML), cascading style sheets (CSS), JavaScript (JS), etc.; the web page source code can be loaded and displayed as user-recognizable content by a browser or by a web page display component with browser-like functionality.
  • HTML: hypertext markup language
  • CSS: cascading style sheets
  • JS: JavaScript
  • The specific content contained in the web page is also defined by tags or nodes in the source code of the web page.
  • For example, HTML defines the elements and attributes of the web page through nodes such as <p>, <img>, <video>, and <canvas>.
  • GUI: graphical user interface
  • In the foregoing embodiments, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof.
  • When implemented using software, the implementation may be carried out in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present application will be generated in whole or in part.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) manner.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • The available media may be magnetic media (e.g., a floppy disk, hard disk, or magnetic tape), optical media (e.g., a DVD), or semiconductor media (e.g., a solid-state drive), etc.
  • All or part of the processes in the foregoing method embodiments can be completed by a computer program instructing related hardware.
  • the programs can be stored in computer-readable storage media.
  • When the programs are executed, the processes of the foregoing method embodiments may be included.
  • The aforementioned storage media include various media that can store program code, such as a ROM, a random access memory (RAM), a magnetic disk, or an optical disc.
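
To make the interaction flow summarized in the bullets above concrete, the following is a minimal, hypothetical Python sketch of how a detected device action could be looked up in a preset association table and mixed onto the audio frame being played. None of the names (`ACTION_MATERIAL_TABLE`, `on_motion_detected`, `load_material`) come from the patent; the table entries only mirror the style of Table 1 in the description.

```python
import numpy as np

# Preset association between detected device actions and music materials (in the spirit of
# Table 1 in the description); None stands for "no effect".
ACTION_MATERIAL_TABLE = {
    ("master", "move_left"): "bass_drum.wav",
    ("earbuds", "turn_left"): "waterfall.wav",
    ("watch", "move_up"): None,
}

def on_motion_detected(device: str, action: str, playing_frame: np.ndarray, load_material) -> np.ndarray:
    """Look up the material matched with (device, action) and overlay it on the frame of
    audio that is about to be played; load_material returns samples at the same rate."""
    material_name = ACTION_MATERIAL_TABLE.get((device, action))
    if material_name is None:
        return playing_frame                     # no match or "no effect": play unchanged
    material = load_material(material_name)[: len(playing_frame)]
    mixed = playing_frame.astype(float).copy()
    mixed[: len(material)] += material           # superimpose the interactive effect
    return np.clip(mixed, -1.0, 1.0)             # simple guard against clipping
```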

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

This application provides a sound processing method and an apparatus thereof. By implementing the method, an electronic device such as a mobile phone can learn device actions and the music materials matched with those actions. In this way, when playing audio, the user can drive the electronic device to move through his or her own actions (such as shaking the head or waving a hand). The electronic device can recognize the above actions through motion detection and determine, according to a preset association relationship, the music material matched with the actions, so as to add an entertaining interactive effect to the audio being played, make the audio playback process more enjoyable, and satisfy the user's need to interact with the audio being played.

Description

一种声音处理方法及其装置
本申请要求于2021年6月24日提交中国专利局、申请号为202110705314.1、申请名称为“一种声音处理方法及其装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及终端领域,尤其涉及一种声音处理方法及其装置。
背景技术
目前,在用户使用智能终端播放音频的过程中,终端设备一般都是单纯地进行音频回放。用户无法对正在播放的音频进行处理加工,从而无法获得基于音频的互动体验。
发明内容
本申请提供了一种声音处理方法。实施该方法,电子设备可以通过运动检测识别一个或多个电子设备在用户在播放音频时的动作,并根据预设的关联关系确定与该动作匹配的音乐素材,从而为正在播放的音频附加娱乐性的互动效果,增加音频播放过程的趣味性,满足用户与正在播放的音频的互动需求。
第一方面,本申请提供了一种声音处理方法,该方法应用于第一电子设备,该方法包括:播放第一音频;检测到用户的第一动作;响应于第一动作,获取第二音频,第二音频与第一动作有对应关系,其中对应关系为用户预置的;根据第二音频,对第一音频进行处理,得到第三音频,其中第三音频和第一音频不同,且第三音频和第一音频关联;播放第三音频。
实施第一方面提供的方法,第一电子设备可以在用户播放音乐时,识别被检测电子设备的动作。当被检测电子设备做出预设的动作时,第一电子设备可以确定与上述动作匹配的音频,并将上述音频添加到正在播放的音乐上,与该正在播放的音乐一同播放。
结合第一方面的一些实施例,在一些实施例中,第二音频为预设的、用于为第一音频增加背景音效果的音频。
实施上述实施例提供的方法,第一电子设备可以为正在播放的音乐添加具有娱乐性互动效果的音频,满足用户与正在播放的音乐进行互动需求。
结合第一方面的一些实施例,在一些实施例中,在获取第二音频之后,该方法还包括:对第二音频进行处理,使之成为具有可变立体声播放效果的第二音频,可变立体声播放效果是指立体声播放效果能够随着用户与第一电子设备相对位置的变化而变化;根据第二音频,对第一音频进行处理,得到第三音频,具体包括:将具有可变立体声播放效果的第二音频和第一音频进行叠加得到第三音频。
实施上述实施例提供的方法,第一电子设备可以对添加的具有娱乐性互动效果的音频进行空间渲染处理,使一般的互动音频具备变化的空间立体环绕效果。
结合第一方面的一些实施例,在一些实施例中,对第二音频进行处理,使之成为具有可变立体声播放效果的第二音频,具体包括:获取第一电子设备相对用户的位置;根据位 置确定第一参数,第一参数是从头相关变换函数数据库中获取的,调整第二音频左声道、右声道播放效果的参数;将第二音频与第一参数按频点相乘,得到具有可变立体声播放效果第二音频。
实施上述实施例提供的方法,第一电子设备可以通过第一电子设备相对用户的位置确定对第二音频进行空间渲染处理的参数,从而确认第二音频左声道、右声道的音频数据。这样,用户左右耳听到的左声道、右声道不同,从而形成立体声播放效果。随着相对位置的变化,空间渲染处理的参数也会不断变化,这样,用户听到的添加的具有娱乐性互动效果的音频也是立体的,且是可以随着第一电子设备与用户的相对位置变化而变化的,从而提升用户的沉浸式体验。
结合第一方面的一些实施例,在一些实施例中,根据第二音频,对第一音频进行处理,得到第三音频,具体包括:将第一时长的第二音频叠加到第一音频的第一区间,得到第三音频,第一区间的时长与第一时长相等。播放第三音频,具体包括:播放第三音频的第一区间内的音频。
实施上述实施例提供的方法,第一电子设备可以在检测到预设的设备动作后,在播放的第一音频的同时,播放第二音频。这样,用户可以立即听到正在播放的音频中添加了增加娱乐效果的互动音频。
结合第一方面的一些实施例,在一些实施例中,第一动作包括多个第二动作,多个第二动作为多个第二电子设备在同一时刻做出的动作的组合,第二音频包括多个第四音频,多个第四音频分别与多个第二动作对应。
实施上述实施例提供的方法,第一电子设备可以在检测到由多个电子设备做出的动作组合得到的动作。这样可以增加被检测动作的多样性,为用户提供更多的选择性。多个第二电子设备的动作组合形成的动作也更能准确的描述用户的肢体动作。
结合第一方面的一些实施例,在一些实施例中,在播放第一音频之前,该方法还包括:显示第一用户界面,第一用户界面中显示有一个或多个图标和控件,图标包括第一图标,控件包括第一控件;检测到用户作用于第一控件的第一操作;响应于第一操作,确认第二音频与第一动作关联。
实施上述实施例提供的方法,用户可以预先在第一电子设备中设置设备动作与具有娱乐性互动效果的音频的匹配关系。
结合第一方面的一些实施例,在一些实施例中,获取第二音频,具体包括:查询存储表,确定第二音频,存储表中记录了一个或多个音频,以及音频对应的动作;一个或多个音频包括第二音频,第二音频在存储表中对应第一动作;从本地数据库,或服务器获取第二音频。
实施上述实施例提供的方法,第一电子设备的本地存储中可以存储有存储表中预设的音乐素材,这样,在需要使用上述音乐素材时,第一电子设备可以直接从本地存储空间中获取该音乐素材。第一电子设备也可直接通过互联网从服务器获取存储表中预设的音乐素材,这样可以节省第一电子设备的存储空间。
结合第一方面的一些实施例,在一些实施例中,第二音频包括:乐器声、动物声、环境声或录音中任意一个。
实施上述实施例提供的方法,第一电子设备可以为正在播放的音乐添加这种不同的声音,例如乐器声、动物声、环境声或录音。
结合第一方面的一些实施例,在一些实施例中,乐器声包括:小鼓、大鼓、砂槌、钢琴、手风琴、小号、大号、长笛、大提琴或小提琴中任意一个;动物声包括:鸟鸣、蛙鸣、虫鸣、猫叫、狗叫、羊叫、牛叫、猪叫、马叫或鸡鸣中任意一个;环境声包括:风声、雨声、雷声、流水声、海浪声或瀑布声中任意一个。
结合第一方面的一些实施例,在一些实施例中,第二电子设备包括第一电子设备连接的耳机,第一动作包括耳机检测到的用户的头部动作。
实施上述实施例提供的方法,第一电子设备可以通过检测耳机的设备运动,确定用户的头部运动。当用户跟随正在播放的音乐晃动头部时,第一电子设备可以通过耳机的运动,判断用户做出了上述晃动头部的动作。
结合第一方面的一些实施例,在一些实施例中,头部动作包括头部位移或头部转动中的任意一个;头部位移包括:左移、右移、上移或下移中的任意一个,头部转动包括左转、右转、仰头或低头中的任意一个。
结合第一方面的一些实施例,在一些实施例中,第二电子设备包括第一电子设备连接的手表,第一动作包括手表检测到的用户的手部动作。
实施上述实施例提供的方法,第一电子设备可以通过检测手表的设备运动,确定用户的手部运动。当用户跟随正在播放的音乐晃动手部时,第一电子设备可以通过手表的运动,判断用户做出了上述晃动手部的动作。
结合第一方面的一些实施例,在一些实施例中,手部动作包括手部位移或手部转动中的任意一个;手部位移包括:左移、右移、上移或下移中的任意一个,手部转动包括左转、右转、抬手或垂手中的任意一个。
结合第一方面的一些实施例,在一些实施例中,第二电子设备包括第一电子设备连接的耳机和手表,第一动作包括耳机和手表检测到的用户的头部动作和手部动作的组合。
实施上述实施例提供的方法,第一电子设备可以通过耳机和手表检测与由用户头部动作和手部动作组合形成的动作,从而增加动作类型的多样性,为用户提供更多的选择性。由用户头部动作和手部动作组合形成的动作也更能准确的描述用户的肢体动作。
第二方面,本申请提供了一种电子设备,该电子设备包括一个或多个处理器和一个或多个存储器;其中,一个或多个存储器与一个或多个处理器耦合,一个或多个存储器用于存储计算机程序代码,计算机程序代码包括计算机指令,当一个或多个处理器执行计算机指令时,使得电子设备执行如第一方面以及第一方面中任一可能的实现方式描述的方法。
第三方面,本申请提供一种计算机可读存储介质,包括指令,当上述指令在电子设备上运行时,使得上述电子设备执行如第一方面以及第一方面中任一可能的实现方式描述的方法。
第四方面,本申请提供一种包含指令的计算机程序产品,当上述计算机程序产品在电子设备上运行时,使得上述电子设备执行如第一方面以及第一方面中任一可能的实现方式描述的方法。
可以理解地,上述第二方面提供的电子设备、第三方面提供的计算机存储介质、第四 方面提供的计算机程序产品均用于执行本申请所提供的方法。因此,其所能达到的有益效果可参考对应方法中的有益效果,此处不再赘述。
附图说明
图1是本申请实施例提供的一种声音处理方法的场景图;
图2是本申请实施例提供的一种声音处理方法的软件结构图;
图3是本申请实施例提供的一种声音处理方法的流程图;
图4A是本申请实施例提供的一种主设备识别设备动作的示意图;
图4B是本申请实施例提供的另一种主设备识别设备动作的示意图;
图4C是本申请实施例提供的主设备识别方位角的示意图;
图5A是本申请实施例提供的主设备对音频进行3D空间渲染的流程图;
图5B是本申请实施例提供的一组频域音频进行3D空间渲染的示意图;
图5C是本申请实施例提供的一组时域音频进行3D空间渲染的示意图;
图6A-图6J是本申请实施例提供的一组用户界面;
图7是本申请实施例提供的电子设备的硬件结构图。
具体实施方式
本申请以下实施例中所使用的术语只是为了描述特定实施例的目的,而并非旨在作为对本申请的限制。
在手机连接无线耳机播放音频的过程中,无线耳机可通过跟踪用户的头部动作确定用户的左耳、右耳与手机的距离,从而调整左耳、右耳中输出的音频的音量,以满足用户沉浸式环绕声体验。但是,上述处理仅限于调整原始音频在左耳、右耳中输出的强度,以得到立体环绕声的效果,而不能满足用户在播放上述音频的过程中,与上述音频进行互动的效果。
为了满足用户与正在播放的音频的互动需求,增加音频播放过程的娱乐效果,提升用户使用体验,本申请实施例提供了一种声音处理方法。
该方法可应用于手机等电子设备。实施该方法,手机等电子设备可以建立设备动作与音乐素材的联系。在识别到预设的设备动作后,电子设备可确认与该设备动作关联的音乐素材,然后,将经过三维空间渲染处理后的音乐素材与用户正在播放的音频进行融合,然后输出。
上述设备动作是指由用户运动引起的电子设备的位置、形态的变化,包括位移动作,和/或,转动动作。其中,位移动作是指由于电子设备当前位置相对于前一时刻的位置的变化产生的动作,包括左移、右移、上移、下移。电子设备可通过加速度传感器采集的数据判断该电子设备是否执行上述任一位移动作。转动动作是指由于电子设备当前时刻的方向相对于前一时刻的方向变化产生的动作,包括左转、右转、上转、下转。电子设备可通过陀螺仪传感器采集的数据判断该电子设备是否执行上述任一转动动作。可以理解的,若采取更细致的划分标准,上述位移动作、转动动作还可包括更多的类型。
可选的,上述设备动作还包括组合动作。上述组合动作是指多个电子设备在同一时刻做出的动作的组合。例如,在同一时刻,第一个被检测的电子设备做出了左移的动作,第二个被检测的电子设备做出了左转的动作,此时,上述左移和左转组合而成的动作为一个组合动作。
上述音乐素材是指预设的具备特定内容的音频数据,包括乐器声、动物声、环境声以及用户自定义的录音文件等等。上述乐器声例如小鼓、大鼓、砂槌、钢琴、手风琴、小号、大号、长笛、大提琴、小提琴等。上述动物声例如鸟鸣、蛙鸣、虫鸣、猫叫、狗叫、羊叫、牛叫、猪叫、马叫、鸡鸣等。上述环境声例如风声、雨声、雷声、流水声、海浪声、瀑布等。
三维空间渲染(three-dimensional space rendering)是指利用头相关变换函数(Head Related Transfer Function,HRTF)对音频数据进行处理,使得处理后的音频数据可以在用户的左耳、右耳具备立体环绕效果。后续实施例将简称头相关变换函数为头函数。利用头函数处理音频数据的模块称为头函数滤波器。
实施上述方法,用户可以在播放音频时,可以通过自身运动(例如摇头、晃手等)带动电子设备运动,从而为上述正在播放的音频附加娱乐性的互动效果,增加音频播放过程的趣味性,满足用户与正在播放的音频的互动需求。
图1示例性示出了实施上述声音处理方法的系统10。下面将结合系统10介绍实施上述方法涉及的场景。
如图1所示,系统10可包括主设备100、从设备200。
其中,主设备100可用于获取和处理音频文件。主设备100可以连接从设备200,利用从设备200提供的发声单元的播放能力,在从设备200端播放音频信号。即在主设备100端执行音频文件解析任务,在从设备200端执行播放音频信号的任务。上述系统10包括主设备100和从设备200的场景可称为第一场景。
图1所示的主设备100以手机类型的电子设备为例,从设备200以耳机类型的电子设备为例。不限手机,主设备100还可包括平板电脑、个人电脑(personal computer,PC)、个人数字助理(personal digital assistant,PDA)、智能可穿戴电子设备、增强现实(augmented reality,AR)设备、虚拟现实(virtual reality,VR)设备等。上述电子设备也可为其它便携式电子设备,诸如膝上型计算机(Laptop)等。还应当理解的是,在其他一些实施例中,上述电子设备也可以不是便携式电子设备,而是台式计算机等等。电子设备的示例性实施例包括但不限于搭载
[图PCTCN2022073338-appb-000001] Linux或者其它操作系统的便携式电子设备。
主设备100与从设备200的之间的连接可以为有线连接,也可以为无线连接。其中无线连接包括但不限于高保真无线通信(wireless fidelity,Wi-Fi)连接、蓝牙连接、NFC连接、ZigBee连接。若主设备100与从设备200之间为有线连接,则从设备200的设备类型可以是有线耳机;若主设备100与从设备200之间为无线连接,则从设备200的设备类型可以是无线耳机,包括头戴式无线耳机、颈挂式无线耳机和真无线耳机(True wireless headset,TWS)。本申请实施例对此不做限制。
在第一场景中,主设备100检测的对象包括:主设备100,和/或,从设备200。即在第一场景中,主设备100的检测对象可以只包括自身;也可只包括从设备200;还可以既包括主设备100又包括从设备200。具体主设备100检测的对象可由用户设置。
主设备100中记录中有设备动作与音乐素材的关联关系。主设备100和从设备200在播放音频的过程中,主设备100可实时地检测电子设备的设备动作。当检测到某一设备动作后,根据上述关联关系,主设备100可确定与该动作匹配的音乐素材。参考表1,表1示例性示出了上述设备动作与音乐素材的关联关系。
表1
电子设备 设备动作 音乐素材 设备动作 音乐素材
主设备100 左移 大鼓 左转 雨声
主设备100 右移 猫叫 右转 海浪声
主设备100 上移 长笛 上转 狗叫
主设备100 下移 海浪声 下转 大提琴
从设备200 左移 砂槌 左转 瀑布
从设备200 右移 雨声 右转 大提琴
从设备200 上移 无效果 上转 蛙鸣
…… …… …… …… ……
例如,当主设备100检测到主设备100发生上移时,响应于上述上移动作,主设备100可确定与上述主设备100上移动作关联的音乐素材为长笛。然后,主设备100可在正在播放的音频上添加与该设备动作(上移)对应的音乐素材(长笛),使得正在播放的音频文件还附带有该音乐素材(长笛)的效果,从而增加音频播放过程的趣味性,满足用户与正在播放的音频的互动需求。
其中,“无效果”可指示不匹配音乐素材。例如,当主设备100检测到从设备200发生上移后,主设备100可不为在正在播放的音频上添加任何互动的音乐素材。
可以理解的,若系统10还包括其他电子设备,则表1中记录的设备动作与音乐素材也会相应地增多,本申请实施例对此不再一一例举。当然,表1中记录的设备动作与音乐素材也不一定都是当前被检测电子设备的。例如,表1中记录的设备动作与音乐素材的关联关系包括主设备100的和从设备200的,而在实际的被检测对象可能仅包括主设备100(或从设备200)。
可选的,表1也可记录有多个被检测电子设备的单个动作组合而成的设备动作。例如,主设备100左移+从设备200右移等等。本申请实施例对于动作的类型不做限制。
此外,表1中所例举的设备动作也是可选的。当主设备100检测的设备动作仅包括位移动作时,表1中可仅记录位移动作和音乐素材的关联关系;当主设备100检测的设备动作仅包括转动动作时,表1中可仅记录转动动作和音乐素材的关联关系。
表1示例性示出的设备动作与音乐素材的关联关系是预设的。用户可通过的主设备100提供的用户界面设置与设备动作匹配的音乐素材。后续实施例将会详细介绍上述用户界面,这里先不展开。
在其他实施例中,系统10还可包括从设备300(第二场景)。其中从设备300包括:智能可穿戴设备(例如智能手表、智能手环等)、游戏手持设备(例如游戏手柄等)。
主设备100可记录与从设备300的设备动作匹配的音乐素材。当检测从设备300的做出某一设备动作后,主设备100可为确定与该动作匹配的音乐素材,然后将该音乐素材附加到主设备100正在播放的音频上。
这样,在播放音频(特别是音乐)的过程中,用户跟随音乐挥手的动作可被从设备300捕获到,进一步的,主设备100可以依据从设备300的设备动作为正在播放的音频文件添加更多的互动的音乐素材。
在第二场景中,前述介绍的组合动作还可包括从设备300的动作,例如从设备200上转+从设备300下移等等。
在上述实施例中,智能手表、智能手环等智能可穿戴设备可以作为从设备300。在其他实施例中,智能手表、智能手环等智能可穿戴设备也可以作为主设备100。上述场景例如:智能手表播放音乐、智能手表连接无线耳机播放音乐等等。本申请实施例对此不作限制。
图2示例性示出了实施本申请实施例提供的声音处理方法的软件结构20。下面结合图2具体介绍实施上述方法的软件结构。
如图2所示,软件结构20包括两部分:音频播放模块201和互动音效处理模块202。
音频播放模块201包括:原始音频211、基础音效212、输出音频213以及叠加模块214。互动音效处理模块202可包括:音乐素材库221、个性化设置模块222、运动检测模块223、头函数数据库224以及3D空间渲染模块225。
原始音频211可用于指示主设备100正在播放的音频。例如响应于播放音乐的用户操作,主设备100播放某一歌曲(歌曲A),此时,歌曲A的音频数据可称为主设备100正在播放的音频。
基础音效212可用于为原始音频211附加一些基本的播放效果。基础音效212可以修饰原始音频211,使得用户最后听到的更高品质的音频。上述附加基本的播放效果例如:均衡(调节音乐的音色)、动态范围控制(调节音乐的响度)、限幅(防止算法产生削波)以及低频增强(增强低频的效果)等。
输出音频213可用于指示从设备200实际播放的音频。输出音频213所包含的内容和效果是用户可直接听到或感受到的。例如,当输出音频213经过3D空间渲染后,用户听到的声音可具备空间立体环绕的效果。
在本申请实施例中,音频播放模块201还包括叠加模块214。叠加模块214可用于为原始音频211附加娱乐性的互动效果。具体的,叠加模块214可接收互动音效处理模块202发送的音乐素材,并将上述音乐素材与原始音频211进行融合,使得播放融合后的音频即包括原始音频211的内容,又包括上述音乐素材的内容,即使得原始音频211附带有娱乐性的互动效果。
在叠加模块214接收到互动音效处理模块202发送的音乐素材之前,互动音效处理模 块202需要确定上述互动效果的具体内容,即确定为原始音频211附加哪一些音乐素材。同时,互动音效处理模块202还需对选定的音乐素材进行3D空间渲染,使得该音乐素材具备空间立体环绕的效果,从而提升用户体验。
音乐素材库221中存储了多种音乐素材,包括前述实施例介绍的乐器声、动物声、环境声以及用户自定义的录音文件等等。原始音频211上附加的音乐素材来自于音乐素材库221。
音乐素材库221中包括的音乐素材可以全部存储在主设备100上,也可以存储在服务器中。当存储在主设备100上时,主设备100在使用上述音乐素材时可直接从本地存储器中获取。当存储在服务器上时,主设备100可从服务器上下载需要的音乐素材到本地存储器,然后从本地存储器中读取上述音乐素材。上述服务器是指存储有大量音乐素材的且为终端设备提供获取上述音乐素材服务的设备
上述需要的音乐素材是指:与被检测电子设备的设备动作关联的音乐素材。参考表1,所被检测对象仅包括主设备100,则主设备100的存储器中需要存储的音乐素材包括:大鼓、左转、猫叫、海浪声、长笛、狗叫、海浪声、大提琴。上述音乐素材之外的,主设备100可无需提前从云上下载到本地,从而节省主设备100的存储空间。
个性化设置模块222可用于设置设备动作与音乐素材的关联关系。用户可通过个性化设置模块222将任一设备动作与任一音乐素材匹配,例如,用户可通过个性化设置模块222将主设备100向左移动的动作与大鼓匹配。
经过个性化设置模块222的预设,主设备100可得到记录有上述关联关系的存储表,参考表1。基于上述存储表,主设备100可随时确定与任一设备动作对应的音乐素材。
运动检测模块223可用于检测主设备100、从设备200、从设备300等电子设备是否做出了上述存储表中记录的动作。具体的,上述电子设备中可安装有加速度传感器和陀螺仪传感器。其中,加速度传感器可用于检测上述电子设备是否发生位移动作;陀螺仪传感器可用于检测上述电子设备是否发生转动动作。
当主设备100(或从设备200)发生位移动作时,加速度传感器的3轴的数据会发生变化。上述3轴是指空间直角坐标系中的X轴、Y轴和Z轴。根据上述3轴的数据的变化,主设备100可确定到主设备100(或从设备200)是否发生了位移。同理,根据陀螺仪传感器采集到的数据的变化,主设备100可确定到主设备100(或从设备200)是否发生了转动。加速度传感器和陀螺仪传感器的具体工作原理可参考后续介绍,这里先不展开。
在检测电子设备是否做出特定的设备动作的同时,运动检测模块223还可检测主设备100的方位角的变化。上述方位角是指主设备100相对于用户头部的方位角度。其中,运动检测模块223可将开始播放音频时的主设备100的位置设置为默认值,例如方位角为0°(即默认主设备100在用户的正前方)。然后,主设备100可根据移动后的位置与前一时刻的位置的变化计算新的方位角。具体计算方式可参考后续实施例的介绍,这里先不展开。
在运动检测模块223检测到特定的设备动作后,主设备100可查询个性化设置模块222中的存储表,确定与上述设备动作匹配的音乐素材。在确定音乐素材后,主设备100可从音乐素材库221获取上述音乐素材的音频数据。同时,根据运动检测模块223计算的新的方位角,主设备100可通过查询头函数数据库224确定与该方位角对应的滤波系数。上述 滤波系数是指主设备100利用头函数滤波器确定左耳、右耳输出音频的参数。
例如,在运动检测模块223检测到主设备100做出向左移动的动作后,通过表1所示的存储表,主设备100可确定与该向左移动的动作匹配的音乐素材为大鼓。同时,由于主设备100向左移动的动作,主设备100相对于用户的方位角会由前一方位角(这里假设前一方位角就是初始默认值0°)变化为280°(即正前方偏左80°)。
然后,3D空间渲染模块225可利用上述特定滤波系数的头函数滤波器,对选定的音乐素材进行空间渲染,使之具备立体环绕效果。这样,附加在原始音频211上的音乐素材也是具备立体环绕效果的。
根据系统10中被检测对象的变更,软件结构20中运动检测模块223的检测对象可相应地变更。例如,当系统10不包括从设备300时,运动检测模块223的检测对象不包括从设备300。若系统10包括主设备100、从设备200,但被检测对象仅包括从设备200,此时,运动检测模块223的检测对象仅包括从设备200。
下面结合图3具体介绍本申请实施例提供的一种声音处理方法的流程图。
S101:主设备100记录动作与音乐素材的关联关系。
如图3所示,首先,主设备100需要先确定设备动作与音乐素材的关联关系,即确定什么样的设备动作对应什么样的音乐素材。基于上述关联关系,当检测到某一设备动作后,主设备100才可确定与该动作对应的音乐素材。
具体的,在主设备100确定动作与音乐素材的关联关系时,主设备100可显示第一用户界面。该界面显示有被检测的电子设备、被检测的电子设备的动作类型(设备动作)以及预设的供用户选择音乐素材的按钮。响应于作用在上述按钮的用户操作,主设备100可显示预设的音乐素材库221中记录的音乐素材。
上述被检测的电子设备包括:主设备100,和/或,从设备200,和/或,从设备300。当然,用户也可删除主设备100显示的支持被检测的电子设备。例如,当主设备100检测到从设备300后,主设备100可在上述第一用户界面显示从设备300。此时,若用户确认不需要检测从设备300的用户操作,则用户可以删除从设备300。响应于上述删除操作,主设备100可不显示从设备300。
上述被检测的电子设备的动作类型为预设的设备动作,包括位移动作和转动动作。位移动作可包括左移、右移、上移、下移。同样的,转动动作可包括左转、右转、上转、下转。可以理解的,不限于上述例举的位移动作和转动动作,上述预设的设备动作还可以是其他动作,本申请实施例对此不做限制。
上述可供用户选择的多个音乐素材是指预设的具备特定内容的音频,包括乐器声、动物声、环境声以及用户自定义的录音文件等等,这里不再赘述。
在显示第一用户界面后,用户可设置哪一电子设备的哪一动作匹配哪一音乐素材。响应于上述用户操作,主设备100可记录动作与音乐素材的关联关系。
结合图2,具体的,主设备100可包括音乐素材库221、个性化设置模块222。音乐素材库221中存储有多个可供选择的不同类型的音频数据,即音乐素材。个性化设置模块222 可记录有预设的设备动作。首先,个性化设置模块222可将上述设备动作匹配一个默认音乐素材。上述默认音乐素材包括“无效果”、随机音乐素材。
响应于作用在第一用户界面上的设置关联关系的用户操作,个性化设置模块222可将原来记录的与某一设备动作匹配的音乐素材修改为新的用户指定的音乐素材。参考表1,个性化设置模块222原来记录的与主设备100向左移动匹配的音乐素材为雨声。在用户将雨声修改为大鼓后,个性化设置模块222记录的与主设备100向左移动匹配的音乐素材可变更为大鼓。
这样,当检测到某一设备动作后,查询个性化设置模块222中的记录,主设备100可确认与该动作匹配的音乐素材。
S102:主设备100下载与设备动作关联的音乐素材。
在用户设定与设备动作匹配的音乐素材后,主设备100可首先确定上述音乐素材是否已经存储在本地存储器中。上述本地存储器是指主设备100的存储器。
若本地存储器已经存储有上述音乐素材,则在需要调用上述音乐素材的时候,主设备100可直接从该存储器获取。若本地存储器还未存储有上述音乐素材,则主设备100需要向提供上述音乐素材的服务器获取上述音乐素材,并将上述音乐素材存储在本地存储器中,以便于随时调用。
这样,音乐素材库221可包括大量音乐素材,而主设备100中可根据实际需求获取部分音乐素材,从而降低对主设备100的存储能力的需求。进一步的,主设备100也可每次在实施本申请实施例提供的声音处理方法时下载需要的音乐素材,在不需要的时候,删除已下载的音乐素材。
S102是可选的。若音乐素材库221中记录的音乐素材仅包括主设备100中存储的音乐素材,那么,主设备100也就无需从服务器音乐素材。反之,若音乐素材库221中记录的音乐素材为服务器提供的,则主设备100的本地存储器中可能仅包括部分音乐素材库221中记录的音乐素材。这时,主设备100需要确定用户指定的与设备动作匹配的音乐素材是否均可从本地存储器获取。若不能,则主设备100需要提前将未下载到本地存储器的音乐素材存下载到本地存储器。
例如,在表1中记录了与主设备100向左移动匹配的音乐素材为大鼓后,若主设备100确定本地存储器中尚未存储有上述大鼓的音频数据,则主设备100需要从提供大鼓的服务器下载大鼓的音频数据。这样,当主设备100检测到主设备100做出向左移动的动作时,主设备100可直接从本地存储器中获取大鼓的音频数据。
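
As an illustration of the check-local-then-download behaviour described in S102, here is a hedged Python sketch. The cache directory, server URL, and function name are assumptions, not details from the patent.

```python
import os
import urllib.request

MATERIAL_DIR = "materials"                         # assumed local cache directory
MATERIAL_SERVER = "https://example.com/materials"  # assumed material server

def ensure_material_available(name: str) -> str:
    """Return a local file path for the material, downloading it first if it is not cached."""
    os.makedirs(MATERIAL_DIR, exist_ok=True)
    path = os.path.join(MATERIAL_DIR, f"{name}.wav")
    if not os.path.exists(path):                   # not yet in local storage
        urllib.request.urlretrieve(f"{MATERIAL_SERVER}/{name}.wav", path)
    return path
```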
S103:响应于用户操作,主设备100播放音频。
主设备100可检测到用户播放音频的操作,响应于上述播放操作,主设备100可开始播放原始音频。上述播放音频的操作可以是作用于第三方软件音频软件的操作,也可以是作用于主设备100的系统自带的音频软件的操作。
具体的,当本申请实施例提供的声音处理方法作为一个系统应用时,主设备100的系统自带的音频软件或第三方软件音频软件均可利用该系统应用,为正在播放的音频附加娱 乐性的互动音效。当然,上述方法也可是第三方音频软件提供的一个功能插件。这样,当使用上述第三方音频软件,且启用上述插件时,主设备100可以为正在播放的音频附加娱乐性的互动音效。
主设备100可以根据预设的长度,对正在播放的音频数据进行分割。这样,正在播放的音频数据可被分割成若干数据段。其中,正在播放的数据段可被称为第一数据段。第一数据段之后,即将播放的数据段可被称为第二数据段。
在主设备100播放第一数据段时,主设备100可检测到某一设备动作。在确定与上述设备动作对应的音乐素材,并对该素材进行处理后,主设备100可将处理后的音乐素材的音频数据(附加音频数据)与第二数据段进行融合,使第二数据段不仅包括原始音频的内容,还包括附加的音乐素材的内容。可以理解的,上述附加音频数据的数据长度与第二数据段的数据长度一致。
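
A minimal Python sketch of the segmentation just described: the playing audio is cut into fixed-length segments, and the processed material is fused into the upcoming segment so that both have the same length. The segment length and function names are assumptions for illustration.

```python
import numpy as np

SEGMENT_LEN = 4096  # assumed fixed segment length in samples

def split_segments(audio: np.ndarray) -> list:
    """Split the playing audio into consecutive segments of SEGMENT_LEN samples."""
    return [audio[i:i + SEGMENT_LEN] for i in range(0, len(audio), SEGMENT_LEN)]

def attach_to_next_segment(next_segment: np.ndarray, material: np.ndarray) -> np.ndarray:
    """Fuse the processed material into the segment about to be played; the material is
    trimmed or zero-padded so that its length equals the segment length."""
    extra = np.zeros_like(next_segment, dtype=float)
    n = min(len(next_segment), len(material))
    extra[:n] = material[:n]
    return np.clip(next_segment + extra, -1.0, 1.0)
```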
S104:主设备100获取运动数据,并根据上述运动数据确定设备动作和与该动作关联的音频素材以及方位角。
在开始播放原始音频后,主设备100可开始获取被检测的电子设备的运动数据。上述运动数据包括加速度传感器采集的数据(加速度数据)和陀螺仪传感器采集的数据(陀螺仪数据)。上述运动数据可指示上述被检测的电子设备是否做出了与预设动作匹配的动作。
以被检测设备包括:主设备100、从设备200为例,主设备100可收到自身的加速度数据和陀螺仪数据。同时,主设备100还可收到从设备200的加速度数据和陀螺仪数据。其中,从设备200的加速度数据和陀螺仪数据可通过主设备100与从设备200之间的有线或无线连接发送到主设备100。可以理解的,当被检测的电子设备增多或减少,则主设备100需要获取的运动数据相应增多或减少。
在获取到电子设备的加速度数据和陀螺仪数据之后,主设备100可计算上述运动数据指示的设备动作。
图4A示出了主设备100根据加速度数据确定设备动作的示意图。如图4A所示,加速度传感器可以以主设备100的中心点为原点建立空间直角坐标系。其中,坐标系X轴的正方向水平向右;坐标系Y轴的正方向垂直向上;坐标系Z轴的正方面对用户向前。因此,上述加速度数据具体包括:X轴加速度、Y轴加速度和Z轴加速度。
X轴加速度的取值接近重力加速度g值(9.81)可说明主设备100的左侧边朝下。反之,X轴加速度的取值接近负的g值可说明主设备100的右侧边朝下。同理,Y轴加速度的取值接近g值可说明主设备100的下侧边朝下;Y轴加速度的取值接近负的g值可说明主设备100的上侧边朝下(倒置);Z轴加速度的取值接近g值可说明主设备100的屏幕朝上,即此时的Z轴正方向与图中Y轴正方向一致;Z轴加速度的取值接近负的g值可说明主设备100的屏幕朝下,即此时的Z轴正方向与图中Y轴负方向一致。
基于已确定的设备朝向,主设备100可进一步确定设备动作。具体的,以图4A所示的设备朝向为例(Y轴朝上,X轴朝右),如果X轴加速度的取值为正,则主设备100可确认自身做出了向右移动的动作;如果X轴加速度的取值为负,则主设备100可确认自身做出了向左移动的动作;如果Y轴加速度取值等于A+g,则主设备100正在以A m/s 2的加速 度向上移动;如果Y轴加速度取值等于-A+g,则主设备100正在以-A m/s 2的加速度向下移动。
这样,当加速度传感器采集的数据符合上述预设条件时,主设备100可确定主设备100做出了与该预设条件对应的设备动作(位移动作)。进一步的,主设备100可确定与该位移动作匹配的音乐素材。
图4B示出了主设备100根据陀螺仪数据确定设备动作的示意图。如图4B所示,陀螺仪传感器也可以以主设备100的中心点为原点建立空间直角坐标系,参考图4A的介绍,这里不再赘述。上述陀螺仪数据具体包括:X轴角速度、Y轴角速度和Z轴角速度。
在主设备100发生移动时,主设备100还可能同时发生转动。在转动时,陀螺仪传感器建立的主设备100的中心点为原点空间直角坐标系也会发生变化。根据上述变化,主设备100可确定自身做出了转动动作。
例如,主设备100可以以Y轴为旋转中心,从右向左转动。上述动作可对应表1中的左转。在单位时间内,在左转的过程中,空间直角坐标系的X轴、Z轴的正方向会发生变化。具体的,参考图4C,左转前,X轴的正方向可表示为X1所指的方向;Z轴的正方向可表示为Z1所示所指的方向。左转后,X轴的正方向可表示为X2所指的方向;Z轴的正方向可表示为Z2所示所指的方向。此时,X1与X2之间的转动角记为θ(角速度:θ/s);Z1与Z2之间的转动角也为θ(角速度:θ/s);Y轴的转动角为0(角速度:0/s)。
这样,当陀螺仪传感器采集的数据符合上述预设条件时,主设备100可确定主设备100做出了与该预设条件对应的设备动作(转动动作)。进一步的,主设备100可确定与该转动动作匹配的音乐素材。
主设备100确定从设备200是否做出预设的设备动作的方法可参考前述介绍,这里不再赘述。
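
The following Python sketch illustrates the kind of threshold logic described above for classifying displacement actions from accelerometer data and rotation actions from gyroscope data. The threshold values, axis sign conventions, and function names are assumptions for illustration only; the patent does not specify them.

```python
from typing import Optional

G = 9.81  # gravitational acceleration, m/s^2

def classify_displacement(ax: float, ay: float, thr: float = 1.0) -> Optional[str]:
    """Classify left/right/up/down movement for the orientation of Fig. 4A
    (Y axis pointing up, X axis pointing right); thr is an assumed threshold in m/s^2."""
    if ax > thr:
        return "move_right"
    if ax < -thr:
        return "move_left"
    if ay - G > thr:       # acceleration beyond gravity -> moving up
        return "move_up"
    if ay - G < -thr:      # acceleration below gravity -> moving down
        return "move_down"
    return None

def classify_rotation(wy: float, thr: float = 0.5) -> Optional[str]:
    """Classify rotation about the vertical (Y) axis from gyroscope angular velocity
    in rad/s; the sign convention (positive = turning left) is an assumption."""
    if wy > thr:
        return "turn_left"
    if wy < -thr:
        return "turn_right"
    return None
```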
在检测电子设备的设备动作的同时,主设备100还要确定主设备100相对于用户的方位角。具体的,主设备100可根据两次位置的变化确定主设备100在做出某一特定设备运动后的方位角。
以主设备100做出左移动作为例,图4C示出了主设备100确定主设备100左移后的方位角的示意图。如图4C所示,图标41示出了左移前主设备100的位置。图标42示出了左移后主设备100的位置。
首先,主设备100可将初始的方位(θ 0)设置为0°、距离为d1,即默认主设备100在用户的正前方(图标41所示的位置)。这里的距离是指设备中心点距离听音者双耳连线中点的距离。这是因为用户在完成播放音频的操作时,通常会将手机放置在正对用户的前方,且距离通常在50cm以内(双臂长度),以便于可以正视屏幕并完成作用于手机屏幕的播放操作。
主设备100可通过左移从图标41所示的位置移动到图标42所示的位置。此时,主设备100可确定自身左移的距离,记为d2。这时,主设备100相对于用户的新的方位角θ1可通过上述d1、d2确定。同时,主设备100还可确定此时与用户的距离d3。
依次类推,主设备100可通过运动的距离、方向与前一时刻的位置确定运动后的位置,从而确定与用户的方位角。基于上述方位角,主设备100可确定头函数滤波器使用的滤波 系数。
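
A small geometric sketch of the azimuth update described in this paragraph, assuming the leftward shift d2 is perpendicular to the initial facing direction and using the clockwise-from-front convention under which "80° to the left" reads as 280° (as in the earlier example). The function name is illustrative.

```python
import math

def azimuth_after_left_shift(d1: float, d2: float) -> tuple:
    """Device starts straight ahead of the listener at distance d1 and shifts left by d2
    (perpendicular to the facing direction). Returns (new_azimuth_deg, new_distance)."""
    left_angle = math.degrees(math.atan2(d2, d1))   # angle to the left of straight ahead
    azimuth = (360.0 - left_angle) % 360.0          # e.g. 80 degrees left -> 280
    d3 = math.hypot(d1, d2)                         # updated listener-to-device distance
    return azimuth, d3

# Example: azimuth_after_left_shift(0.5, 0.3) -> (about 329 degrees, about 0.58 m)
```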
在其他实施例中,主设备100还可通过深感摄像头,直接检测主设备100与用户的距离。
S105:将与设备动作匹配的音乐素材输入头函数滤波器进行3D空间渲染,使上述音乐素材的音频数据具备具有空间立体环绕效果。
头函数滤波器是指利用头相关变换函数(HRTF)对音频数据进行处理的装置。头函数滤波器可以模拟声音信号在三维空间中的传播,使得用户双耳听到的声音不同,且具有空间立体环绕效果。
参考S104,在根据电子设备的运动数据确定电子设备做出某一特定的设备动作后,主设备100可通过个性化设置模块222中记录的对应关系,确定与该设备动作匹配的音乐素材。在获取该音乐素材后,主设备100可首先使用头函数滤波器对该音乐素材的音频数据进行3D空间渲染,然后再将处理后的音频数据叠加到原始音频上,从而使用户听到的音频不仅附带互动音效,而且其互动音效还具备空间立体环绕效果。
具体的,头函数滤波器对音乐素材的音频数据进行3D空间渲染的过程可如图5A所示:
S201:将音乐素材的音频数据进行时频域转换。
首先,主设备100可将上述音乐素材的音频数据进行时域转换或频域转换,得到时域音频数据或频域音频数据。
S202:根据方位角度确定头函数滤波器的滤波系数。
在使用头函数滤波器对选定的音乐素材的音频数据进行3D空间渲染之前,主设备100还需要确定头函数滤波器的滤波系数。滤波系数可影响3D空间渲染的渲染效果。如果滤波系数不合适甚至错误,则经过头函数滤波器处理后的声音与实际传播到用户双耳的声音会具有明显差别,从而影响用户的收听体验。
滤波系数可通过方位角确定。具体的,头相关变换函数(HRTF)数据库中记录有方位角与滤波器数据的映射关系。在确定方位角后,主设备100可通过查询HRTF数据库确定头函数滤波器的滤波系数。根据时域和频域的区分,同一方位角对应的滤波系数也相应地分为时域滤波系数和频域滤波系数。
参考S201,若确定从频域上对音乐素材的音频数据进行3D空间渲染,则主设备100可确定频域滤波系数作为头函数滤波器的滤波系数。反之,若确定从时域上对音乐素材的音频数据进行3D空间渲染,则主设备100可确定时域滤波系数作为头函数滤波器的滤波系数。
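
As a hedged illustration of S202, the sketch below selects the left/right filter coefficients for the azimuth closest to the measured one from an HRTF database. The database layout (a dict keyed by integer azimuth) is an assumption; real HRTF databases are typically indexed by both azimuth and elevation.

```python
def lookup_hrtf(hrtf_db: dict, azimuth_deg: float):
    """Return the (left, right) filter coefficients stored for the azimuth closest to
    the measured one, using circular angular distance."""
    nearest = min(hrtf_db, key=lambda a: abs((a - azimuth_deg + 180.0) % 360.0 - 180.0))
    return hrtf_db[nearest]["left"], hrtf_db[nearest]["right"]

# Assumed layout: hrtf_db = {0: {"left": HL0, "right": HR0}, 5: {"left": HL5, "right": HR5}, ...}
```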
S203:将转换后的音乐素材的音频数据输入头函数滤波器进行过滤。
在获得频域(或时域)音频数据、且确定滤波系数后,主设备100可将上述音频数据输入上述滤波系数对应的头函数滤波器。然后,头函数滤波器可将输入的频域(或时域)音频数据与对应地滤波系数相乘,得到渲染后的频域(或时域)音频数据。这时,渲染后的频域(或时域)音频数据可具备空间立体环绕效果。
S204:进行反时频变换,获得经过头函数滤波器处理后的3D空间渲染信号。
参考S201,在将音频数据输入头函数滤波器进行过滤之前(S203),主设备100将该 音频数据进行了时频域转换。因此,在完成过滤后,主设备100还需要将经过了时频域转换的音频数据进行反时频域变换,从而使得经过时频域转换的音频数据恢复成音频播放器可处理的数据格式。
若S201中进行的是时域变换,则主设备100采用反时域变换对渲染后的音频数据进行转换;反之,若S201中进行的是频域变换,则主设备100采用反频域变换对渲染后的音频数据进行转换。
以频域3D空间渲染为例,图5B示例性示出了头函数滤波器使用频域滤波系数对频域音频信号进行3D空间渲染的示意图。
如图5B所示,图表511为某个音频素材的频域信号。其中,纵轴为样点幅度(dB),横轴为频率(Hz)。图表511中的频域信号可作为S201中介绍的经过频域转换后的音乐素材的音频数据。图表512和图表513分别为头函数数据库中某一方位角对应的频域滤波系数。其中图表512为该方位角对应的左声道频域滤波系数;图表513为该方位角对应的右声道频域滤波系数。纵轴为头函数幅度(dB),横轴为频率(Hz)。
将图表511中的音频数据与图表512和图表513所示的频域滤波系数按频点相乘,主设备100可分别得到渲染后的左声道频域音频信号和右声道频域音频信号。图表514和图表515分别示出了左声道频域音频信号和右声道频域音频信号。
然后,进行反频域转换,主设备100可得到渲染后的左声道音频信号和右声道音频信号。进一步的,从设备200的左耳设备可播放上述左声道音频信号;从设备200的右耳设备可播放上述右声道音频信号。这样,用户左右耳听到的附加音乐素材是不同的且具有空间立体环绕效果的。
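
A minimal Python sketch of the frequency-domain rendering path (S201–S204): transform the material, multiply it bin-by-bin with the left and right coefficients, then inverse-transform to obtain the two ear signals. It assumes the coefficients are already sampled on the same rfft frequency grid as the material; the names are illustrative.

```python
import numpy as np

def render_binaural_freq(material: np.ndarray, h_left: np.ndarray, h_right: np.ndarray):
    """material: mono time-domain samples; h_left/h_right: frequency-domain coefficients
    with len(material)//2 + 1 bins (the rfft grid)."""
    spectrum = np.fft.rfft(material)                              # S201: time -> frequency
    left = np.fft.irfft(spectrum * h_left, n=len(material))       # S203 + S204, left ear
    right = np.fft.irfft(spectrum * h_right, n=len(material))     # S203 + S204, right ear
    return left, right
```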
头函数滤波器也可使用时域滤波系数对时域音频信号进行3D空间渲染。参考图5C,图表521示出了某个音频素材的时域信号。其中,纵轴为样点幅度,横轴为按时间的样点序号。图表522和图表523分别为头函数数据库中某一方位角对应的时域滤波系数。其中,图表522为该方位角对应的左声道时域滤波系数;图表523为该方位角对应的右声道时域滤波系数。纵轴为样点幅度,横轴为按时间的样点序号。
将上述时域信号(图表521)分别经过上述时域滤波系数(图表522、图表523)进行卷积(Convolution)后,可以得到3D空间渲染后的左声道时域信号(图表524)和右声道时域信号(图表525)。
基于时域的方法,在滤波器长度较长的情况,计算复杂度比基于频域的方法高。因此,在滤波器长度较长的情况,主设备100可优先采取基于频域的方法,对频域音频信号进行渲染,以降低时间复杂度,节省计算资源。
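
For comparison, the equivalent time-domain path (Fig. 5C) convolves the material with the left/right impulse responses; for long filters the FFT-based sketch above is usually cheaper, which matches the complexity remark in the preceding paragraph. This is again an illustrative sketch, not code from the patent.

```python
import numpy as np

def render_binaural_time(material: np.ndarray, ir_left: np.ndarray, ir_right: np.ndarray):
    """Convolve the material with the left/right impulse responses (time-domain path)."""
    left = np.convolve(material, ir_left, mode="full")[: len(material)]
    right = np.convolve(material, ir_right, mode="full")[: len(material)]
    return left, right
```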
S106:将经过空间渲染的音乐素材附加到主设备100正在播放的音频上。
在得到经过3D空间渲染后的音乐素材后,主设备100可将上述音乐素材附加到主设备100正在播放的音频上。这样,用户便可以同时听到正在播放的音频和附加的音乐素材。
一般的,主设备100可直接将音乐素材附加的到正在播放的音频上。若同时叠加的音频的数量过多,则容易出现叠加后信号过大,从而造成削波。因此,在附加音乐素材的过程中,主设备100还可加权的方法避免叠加后信号过大的情况。
例如,需要附加的音乐素材包括n个,则每个音频素材的权重可以为:
w_i = 1/n
因此,叠加音乐素材后的音频为:
S_output = S_input + Σ_{i=1}^{n} (w_i × r_i)    (原文公式图PCTCN2022073338-appb-000002)
其中S_output是叠加后的输出信号,S_input是原始播放的音乐信号,r_i为第i个音乐素材,w_i为第i个音乐素材的权重。
另外,主设备100还可以给不同的电子设备设定不同的权重,但权重之和为1。例如,当被检测的电子设备数量为3时,包括主设备100、从设备200、从设备300,从设备200的权重W_1可以为0.3、从设备300的权重W_2可以为0.3、主设备100的权重W_3可以为0.4。
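
A hedged Python sketch of the weighted superposition just described: equal weights w_i = 1/n by default, or caller-supplied per-device weights summing to 1, followed by a simple clip guard. Function and parameter names are assumptions.

```python
import numpy as np

def mix_with_weights(s_input: np.ndarray, materials: list, weights: list = None) -> np.ndarray:
    """s_input: original music signal; materials: rendered material signals, each already
    trimmed to len(s_input); weights default to the equal weights w_i = 1/n."""
    if weights is None:
        weights = [1.0 / len(materials)] * len(materials)
    out = s_input.astype(float).copy()
    for w, r in zip(weights, materials):
        out += w * r
    return np.clip(out, -1.0, 1.0)   # guard against clipping after superposition
```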
S107:对附加音频素材的原始音频进行基础音效处理,然后播放。
在附加音乐素材后,主设备100还可对附加音乐素材后的音频进行基础音效处理。上述基础音效具体包括:均衡、动态范围控制、限幅以及低频增强等等。具体可参考图2的介绍,这里不再赘述。经过基础音效处理后的音频具备更好的品质。因此,用户可以获得更好的收听体验。
然后,主设备100可播放上述音频。其中,将电信号转换为声音信号的过程由从设备200完成。此时,用户通过从设备200听到的声音既包括原本用户指定播放的音频,还包括根据设备运动产生的互动性音乐素材。
实施图2所示的声音处理方法,主设备100可以在播放音乐等音频时,检测电子设备的运动状态。当检测到该电子设备做出与预设动作相符的动作时,主设备100可该与上述动作匹配的音乐素材附加到上述正在播放的音乐上。这样,用户可以在听音乐的同时,为该音乐添加互动效果,从而提高音乐播放过程的趣味性,满足用户与正在播放的音频的互动需求。
进一步的,在附加音乐素材的过程中,主设备100还根据电子设备与用户之间的位置变化,对附加的音乐素材进行3D空间渲染,从而使用户听到的附加的音乐素材还具有空间立体环绕的效果。
图6A-图6J是本申请实施例提供的一组用户界面。下面结合图6A-图6J介绍实施本申请实施例提供的一种声音处理方法的用户界面示意图。
图6A示出了主设备100显示第一用户界面示意图。如图6A所示,该第一用户界面包括状态栏601、区域602、区域603。其中,状态栏601具体包括:移动通信信号(又可称为蜂窝信号)的一个或多个信号强度指示符、高保真无线通信(wireless fidelity,Wi-Fi)信号的一个或多个信号强度指示符,电池状态指示符、时间指示符等。区域602可用于显示一些全局性的设置按钮。区域603可用于显示具体的与各个设备动作匹配的音乐素材。
区域603中显示的“耳机A”、“手机B”、“手表C”是可选的。主设备100可检测到作 用于某一电子设备的用户操作,响应于该操作,主设备100可设置不检测该电子设备的设备动作。上述用户操作例如是左滑删除操作等等,本申请实施例对此不作限制。
区域602中可显示按钮611、按钮612。当检测到作用于按钮611的用户操作时,响应于上述操作,主设备100可以随机地匹配设备动作和音乐素材。这样,用户无需再一一地设置与各个设备动作匹配的音乐素材。此时,区域603中显示的与各个设备动作关联的音乐素材为“随机”。
当检测到作用于按钮612的用户操作时,响应于上述操作,主设备100可显示图6B所示的用户界面。此时,用户可一一地设置与各个设备动作匹配的音乐素材。例如图6B中的区域603示出的“耳机A”的左转动作可匹配小鼓类型的音乐素材。
图6A(或图6B)所示的第一用户界面还可包括按钮613、按钮614。按钮613可用于设置用户的心情。根据上述心情,主设备100可对音乐素材库221中提供的音乐素材进行筛选。对于明显不符合用户当前心情的音乐素材,主设备100可不显示出来。这样,用户可以通过按钮613过滤掉一部分自己不需要的音乐素材,从而降低用户指定音乐素材的操作复杂度。
例如,主设备100可检测到作用于按钮613的用户操作,响应于上述操作,主设备100可显示图6C所示的用户界面。此时,主设备100可显示一系列可供用户选择的心情类型,包括喜悦、悲伤、愤怒、恐惧等等。当主设备100检测到作用于任意心情选项的用户操作后,主设备100可根据该心情类型筛选音乐素材库221中提供的所有类型的音乐素材。例如,当主设备100检测到作用于悲伤按钮631的用户操作后,主设备100可根据该悲伤的心情类型将音乐素材库221中提供的符合悲伤心情的音乐素材筛选出来。上述符合悲伤心情的音乐素材例如二胡、雨声等等。上述明显不符合悲伤心情的音乐素材例如唢呐、鸟鸣等等,主设备100可不显示。
图6C所示的用户界面还包括随机按钮632、无效果按钮633。当检测到作用于随机按钮632的用户操作时,响应于上述操作,主设备100可随机地设置用户的心情类型,然后根据随机设置的心情类型筛选与该心情类型匹配的音乐素材。当检测到作用于无效果按钮633的用户操作时,响应于上述操作,主设备100可不执行从心情类型的角度对音乐素材库221中提供的音乐素材进行筛选的操作。
在其他实施例中,上述心情也可以是主设备100自动感知的,即主设备100可通过获取用户的生理数据判断用户当前的心情。例如,图6C所示的用户界面可包括自感知按钮634。
按钮614可用于设置整体的附加音乐素材的音乐风格。同理,根据选定的音乐风格,主设备100可对音乐素材库221中提供的音乐素材进行筛选。对于明显不符合用户当前音乐风格的音乐素材,主设备100可不显示出来。这样,用户可以通过按钮614过滤掉一部分自己不需要的音乐素材,从而降低用户指定音乐素材的操作复杂度。
响应于作用在按钮614的用户操作,主设备100可显示图6D所示的用户界面。此时,主设备100可显示一系列可供用户选择的音乐风格,包括流行、摇滚、电子、民谣、古典 等等。例如,当主设备100检测到作用于摇滚按钮的用户操作后,主设备100可将音乐素材库221中提供的符合摇滚类型的音乐素材筛选出来。上述符合摇滚类型的音乐素材例如吉他、贝斯、架子鼓等等。上述明显不符合摇滚类型的音乐素材例如古筝、琵琶等等,主设备100可不显示。
上述介绍的主设备100显示音乐素材库221中提供的音乐素材的界面可参考图6E-图6J。在主设备100显示第一用户界面的过程中,当主设备100检测到作用于任一音乐素材按钮的用户操作时,主设备100可显示包括多种类型音乐素材的用户界面。
例如,在图6B所示的用户界面中,当检测到作用于按钮621的用户操作时,主设备100可显示图6E所示的用户界面。该界面可显示有多个不同类型的选项按钮,例如按钮651、按钮652、按钮653、按钮654。其中,按钮651可用于展示乐器类型的音乐素材。按钮652可用于展示动物声类型的音乐素材;按钮653可用于展示环境类型的音乐素材;按钮654可用于展示用户的录音。
响应于作用在按钮651的用户操作,主设备100可显示图6F所示的用户界面。该用户界面可显示有多个指示不同乐器类型按钮,例如小鼓、大鼓、砂槌、钢琴、手风琴等等。主设备100可检测到作用于任一按钮的用户操作,响应于该操作,主设备100可将该按钮对应的音乐素材匹配到按钮621对应的设备动作(左转)。这样,当检测到上述设备动作时,主设备100可将上述音乐素材添加到正在播放的音频上。
响应于作用在按钮652的用户操作,主设备100可显示图6G所示的用户界面。该用户界面可显示有多个指示不同动物声类型按钮,例如鸟鸣、蛙鸣、虫鸣、猫叫、狗叫等等。响应于作用在按钮653的用户操作,主设备100可显示图6H所示的用户界面。该用户界面可显示有多个指示不同动物声类型按钮,例如风声、雨声、雷声、流水声等等。响应于作用在按钮654的用户操作,主设备100可显示图6I所示的用户界面。该用户界面可显示有多个指示用户自定义录音的按钮,例如hello、Hi、加油等等。
可以理解的,在选择某一类型的音乐素材后,若检测到作用于另一音乐素材的用户操作时,主设备100可将后一个音乐素材设置为用户选择的音乐素材,即一个设备动作匹配一个类型的音乐素材。例如,当用户在乐器声中选择了小鼓后,若用户又在环境声中选择雨声,此时,主设备100可确定上述雨声为用户选择的音乐素材。
图6F-图6G示出的用户界面还包括随机按钮和无效果按钮。上述随机按钮和无效果按钮可参考图6C的介绍,这里不再赘述。
如图6E所示,主设备100还可在按钮651、按钮652、按钮653、按钮654的右边设置随机按钮。这样,用户可以直接在图6E所示用户界面设置随机音乐素材,从而减少用户操作,降低操作复杂度,提升用户体验。图6E所示的用户界面还可包括按钮655。参考上述随机按钮。按钮655可以在图6E所示的用户界面为用户提供设置无效果的功能,从而减少用户操作,降低操作复杂度,提升用户体验。
此外,图6E所示的用户界面还可包括按钮655。当检测到作用于按钮656的用户操作后,响应于该操作,主设备可显示图6J所示的用户界面。如图6J所示,该界面可包括开始录音按钮、录音试听按钮、保存录音按钮等等。
在保存录音后,在主设备100再次显示图6I所示的用户界面时,该界面可包括指示用户新录制的录音文件的按钮。例如,在用户录制了一个文件名称为“欢迎光临”的录音后,在主设备100再次显示图6I所示的用户界面时,该界面可包括名称为“欢迎光临”的按钮。用户可点击该按钮,选择该音乐素材。
图6I所示的用户界面也可包括新增录音的按钮,参考图6E所示的按钮656的介绍,这里不再赘述。
实施图6A-图6J所示的方法,用户可以自由地选择设置与设备动作匹配的音乐素材。这样,当检测到预设的设备动作时,通过查询用户预设的关联关系,主设备100可确定与上述设备动作关联的音乐素材。
在本申请实施例中:
图2所示的原始音频211可称为第一音频;为原始音频211附加风声、鼓声等音乐素材可称为第二音频;经过3D空间渲染模块225处理过的第二音频可称为具有可变立体声播放效果的第二音频。
用户头部左移的动作可称为第一动作。从设备200左移的设备动作可反映上述用户头部左移的动作。第一动作还可以是组合动作,例如用户同时左移头部和左移手臂的动作可称为第一动作。其中,第一动作中,左移头部可称为一个第二动作;左移手臂可称为另一个第二动作。当第一动作为上述组合动作时,左移头部对应的音乐素材可称为一个第四音频;左移手臂对应的音乐素材可可称为另一个第四音频。此时,第二音频包括上述两个第四音频。
图2所示的输出音频213可称为第三音频。
图5A中根据方位角度确定的头函数滤波器的滤波系数可称为第一参数。
参考S103的介绍,主设备100对正在播放的音频进行切分可得到若干段音频数据,其中即将播放的第二数据段可称为第一区间。第一区间的时长与附加的音乐素材的时长相等,即与第一时长的相等。
图6A或图6B所示的用户界面可称为第一用户界面;图6A或图6B中,“耳机A”中表示“左转”动作的图标可称为第一图标、第一图标后用于选择音乐素材的控件621(图6A中控件621上显示的音乐素材的名称为随机,图6B中显示的音乐素材的名称为小鼓)可称为第一控件。
表1所示的表可称为存储表。
图7示例性示出了主设备100、从设备200、从设备300的硬件结构图。下面结合图7介绍本申请实施例涉及的电子设备的硬件结构。
其中,主设备100的硬件模块包括:处理器701、存储器702、传感器703、触摸屏704、音频单元705。从设备200的硬件模块包括:处理器711、传感器712、发声单元713。从设备300的硬件模块包括:处理器721、传感器722。
可以理解的是,本发明实施例示意的结构并不构成对上述电子设备的具体限定。在本 申请另一些实施例中,上述电子设备可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
相对于主设备100,从设备200和从设备300的硬件模块结构以及模块间的协作关系更加简单。因此,这里以主设备100为例,介绍主设备100的硬件结构。
处理器701可以包括一个或多个处理单元,例如:处理器701可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器701中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器701中的存储器为高速缓冲存储器。该存储器可以保存处理器701刚用过或循环使用的指令或数据。如果处理器701需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器701的等待时间,因而提高了系统的效率。
在一些实施例中,处理器701可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
I2C接口是一种双向同步串行总线,包括一根串行数据线(serial data line,SDA)和一根串行时钟线(derail clock line,SCL)。在一些实施例中,处理器701可以包含多组I2C总线。处理器701可以通过不同的I2C总线接口分别耦合触摸传感器,充电器,闪光灯,摄像头等。例如:处理器701可以通过I2C接口耦合触摸传感器,使处理器701与触摸传感器通过I2C总线接口通信,实现主设备100的触摸功能。
I2S接口可以用于音频通信。在一些实施例中,处理器701可以包含多组I2S总线。处理器701可以通过I2S总线与音频单元705耦合,实现处理器701与音频单元705之间的通信。在一些实施例中,音频单元705可以通过I2S接口向无线通信模块传递音频信号,实现通过蓝牙耳机接听电话的功能。
PCM接口也可以用于音频通信,将模拟信号抽样,量化和编码。在一些实施例中,音频单元705与无线通信模块可以通过PCM总线接口耦合。在一些实施例中,音频单元705也可以通过PCM接口向无线通信模块传递音频信号,实现通过蓝牙耳机接听电话的功能。所述I2S接口和所述PCM接口都可以用于音频通信。
UART接口是一种通用串行数据总线,用于异步通信。该总线可以为双向通信总线。它 将要传输的数据在串行通信与并行通信之间转换。在一些实施例中,UART接口通常被用于连接处理器701与无线通信模块。例如:处理器701通过UART接口与无线通信模块中的蓝牙模块通信,实现蓝牙功能。在一些实施例中,音频单元705可以通过UART接口向无线通信模块传递音频信号,实现通过蓝牙耳机播放音乐的功能。
MIPI接口可以被用于连接处理器701与触摸屏704,摄像头等外围器件。MIPI接口包括摄像头串行接口(camera serial interface,CSI),触摸屏704串行接口(display serial interface,DSI)等。在一些实施例中,处理器701和摄像头通过CSI接口通信,实现主设备100的拍摄功能。处理器701和触摸屏704通过DSI接口通信,实现主设备100的显示功能。
GPIO接口可以通过软件配置。GPIO接口可以被配置为控制信号,也可被配置为数据信号。在一些实施例中,GPIO接口可以用于连接处理器701与摄像头,触摸屏704,无线通信模块,音频单元705,传感器模块180等。GPIO接口还可以被配置为I2C接口,I2S接口,UART接口,MIPI接口等。
USB接口是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口可以用于连接充电器为主设备100充电,也可以用于主设备100与外围设备之间传输数据。也可以用于连接耳机,通过耳机播放音频。该接口还可以用于连接其他电子设备,例如AR设备等。
可以理解的是,本发明实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对主设备100的结构限定。在本申请另一些实施例中,主设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
存储器702可以包括一个或多个随机存取存储器(random access memory,RAM)和一个或多个非易失性存储器(non-volatile memory,NVM)。
随机存取存储器可以包括静态随机存储器(static random-access memory,SRAM)、动态随机存储器(dynamic random access memory,DRAM)、同步动态随机存储器(synchronous dynamic random access memory,SDRAM)、双倍资料率同步动态随机存取存储器(double data rate synchronous dynamic random access memory,DDR SDRAM,例如第五代DDR SDRAM一般称为DDR5 SDRAM)等;
非易失性存储器可以包括磁盘存储器件、快闪存储器(flash memory)。
快闪存储器按照运作原理划分可以包括NOR FLASH、NAND FLASH、3D NAND FLASH等,按照存储单元电位阶数划分可以包括单阶存储单元(single-level cell,SLC)、多阶存储单元(multi-level cell,MLC)、三阶储存单元(triple-level cell,TLC)、四阶储存单元(quad-level cell,QLC)等,按照存储规范划分可以包括通用闪存存储(英文:universal flash storage,UFS)、嵌入式多媒体存储卡(embedded multi media Card,eMMC)等。
随机存取存储器可以由处理器701直接进行读写,可以用于存储操作系统或其他正在运行中的程序的可执行程序(例如机器指令),还可以用于存储用户及应用程序的数据等。
非易失性存储器也可以存储可执行程序和存储用户及应用程序的数据等,可以提前加载到随机存取存储器中,用于处理器701直接进行读写。
主设备100还可包括外部存储器接口可以用于连接外部的非易失性存储器,实现扩展 主设备100的存储能力。外部的非易失性存储器通过外部存储器接口与处理器701通信,实现数据存储功能。例如将音乐,视频等文件保存在外部的非易失性存储器中。
在本申请实施例中,实现该声音处理方法的计算机程序可存储在存储器702中。
传感器703包括多个传感器。在本申请实施例中,实施本申请实施例提供的方法主要涉及加速度传感器和陀螺仪传感器。
加速度传感器可检测主设备100在各个方向上(一般为三轴)加速度的大小。当主设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
陀螺仪传感器可以用于确定主设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器确定主设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器检测主设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消主设备100的抖动,实现防抖。陀螺仪传感器还可以用于导航,体感游戏场景。
在本申请实施例中,主设备100检测主设备100、从设备200(以及从设备300)的设备动作依赖于加速度传感器和陀螺仪传感器。主设备100确定自身与用户的方位角也依赖于上述传感器。
传感器703还可包括其他传感器,例如压力传感器、气压传感器、磁传感器、距离传感器、接近光传感器、环境光传感器、指纹传感器、温度传感器、骨传导传感器等等。
压力传感器用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器可以设置于触摸屏704。压力传感器的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器,电极之间的电容改变。主设备100根据电容的变化确定压力的强度。当有触摸操作作用于触摸屏704,主设备100根据压力传感器检测所述触摸操作强度。主设备100也可以根据压力传感器的检测信号计算触摸的位置。在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
气压传感器用于测量气压。在一些实施例中,主设备100通过气压传感器测得的气压值计算海拔高度,辅助定位和导航。
磁传感器包括霍尔传感器。主设备100可以利用磁传感器检测翻盖皮套的开合。在一些实施例中,当主设备100是翻盖机时,主设备100可以根据磁传感器检测翻盖的开合。进而根据检测到的皮套的开合状态或翻盖的开合状态,设置翻盖自动解锁等特性。
距离传感器,用于测量距离。主设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,主设备100可以利用距离传感器测距以实现快速对焦。
接近光传感器可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。主设备100通过发光二极管向外发射红外光。主设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定主设备 100附近有物体。当检测到不充分的反射光时,主设备100可以确定主设备100附近没有物体。主设备100可以利用接近光传感器检测用户手持主设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器用于感知环境光亮度。主设备100可以根据感知的环境光亮度自适应调节触摸屏704亮度。环境光传感器也可用于拍照时自动调节白平衡。环境光传感器还可以与接近光传感器配合,检测主设备100是否在口袋里,以防误触。
指纹传感器用于采集指纹。主设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器用于检测温度。在一些实施例中,主设备100利用温度传感器检测的温度,执行温度处理策略。例如,当温度传感器上报的温度超过阈值,主设备100执行降低位于温度传感器附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,主设备100对电池加热,以避免低温导致主设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,主设备100对电池的输出电压执行升压,以避免低温导致的异常关机。
骨传导传感器可以获取振动信号。在一些实施例中,骨传导传感器可以获取人体声部振动骨块的振动信号。骨传导传感器也可以接触人体脉搏,接收血压跳动信号。在一些实施例中,骨传导传感器也可以设置于耳机中,结合成骨传导耳机。音频单元705可以基于所述骨传导传感器获取的声部振动骨块的振动信号,解析出语音信号,实现语音功能。应用处理器可以基于所述骨传导传感器获取的血压跳动信号解析心率信息,实现心率检测功能。
触摸屏704包括显示屏和触摸传感器(也称“触控器件”)。显示屏用于显示用户界面。触摸传感器可以设置于显示屏,由触摸传感器与显示屏组成“触控屏”。触摸传感器用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器也可以设置于主设备100的表面,与显示屏所处的位置不同。
在本申请实施例中,图6A-图6J所示用户界面依赖于触摸屏704。
音频单元705包括扬声器,受话器,麦克风,耳机接口,以及应用处理器等音频模块,以实现音频功能,例如音乐播放,录音等。
音频单元705用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频单元705还可以用于对音频信号编码和解码。在一些实施例中,音频单元705可以设置于处理器701中,或将音频单元705的部分功能模块设置于处理器701中。
扬声器,也称“喇叭”,用于将音频电信号转换为声音信号。主设备100可以通过扬声器收听音乐,或收听免提通话。在本申请实施例中,主设备100可通过扬声器播放音频,例如音乐等等。在主设备100和从设备200协作播放音频的过程中,从设备200的发声单元713可实现音频电信号转换为声音信号功能。
受话器,也称“听筒”,用于将音频电信号转换成声音信号。当主设备100接听电话或语音信息时,可以通过将受话器靠近人耳接听语音。耳机接口用于连接有线耳机。
麦克风,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息时,用户可以通过人嘴靠近麦克风发声,将声音信号输入到麦克风。主设备100可以设置至少一个麦克风。在另一些实施例中,主设备100可以设置两个麦克风,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,主设备100还可以设置三个,四个或更多麦克风,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。
耳机接口可以是USB接口,也可以是3.5mm的开放移动电子设备平台(open mobile terminal platform,OMTP)标准接口,美国蜂窝电信工业协会(cellular telecommunications industry association of the USA,CTIA)标准接口。
除上述介绍的硬件模块外,主设备100还可包括其他硬件模块。
主设备100还可包括通信模块。通信模块包括:天线,移动通信模块,无线通信模块,调制解调处理器以及基带处理器等。在本申请实施例中,主设备100可通过上述通信模块与从设备200之间建立无线连接。基于上述无线连接,主设备100可以通过从设备200的发声单元713将音频电信号转换为声音信号。同时,基于上述无线连接,主设备100可获取从设备200的传感器712采集的运动数据(加速度数据、陀螺仪数据)。
天线用于发射和接收电磁波信号。主设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块可以提供应用在主设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块可以由天线接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块还可以对经调制解调处理器调制后的信号放大,经天线转为电磁波辐射出去。在一些实施例中,移动通信模块的至少部分功能模块可以被设置于处理器701中。在一些实施例中,移动通信模块的至少部分功能模块可以与处理器701的至少部分模块被设置在同一个器件中。
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器,受话器等)输出声音信号,或通过触摸屏704显示图像或视频。在一些实施例中,调制解调处理器可以是独立的器件。在另一些实施例中,调制解调处理器可以独立于处理器701,与移动通信模块或其他功能模块设置在同一个器件中。
无线通信模块可以提供应用在主设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术 (infrared,IR)等无线通信的解决方案。无线通信模块可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块经由天线接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器701。无线通信模块还可以从处理器701接收待发送的信号,对其进行调频,放大,经天线转为电磁波辐射出去。
在一些实施例中,主设备100的天线和移动通信模块耦合,天线和无线通信模块耦合,使得主设备100可以通过无线通信技术与网络以及其他设备通信。所述无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(code division multiple access,CDMA),宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。所述GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。
主设备100还包括GPU,触摸屏704,以及应用处理器等。上述硬件模块支持实现显示功能。GPU为图像处理的微处理器,连接触摸屏704和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器701可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
触摸屏704用于显示图像,视频等。触摸屏704包括显示面板。显示面板可以采用液晶触摸屏704(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,主设备100可以包括1个或N个触摸屏704,N为大于1的正整数。
主设备100可以通过ISP,摄像头,视频编解码器,GPU,触摸屏704以及应用处理器等实现拍摄功能。
ISP用于处理摄像头反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度,肤色进行算法优化。ISP还可以对拍摄场景的曝光,色温等参数优化。在一些实施例中,ISP可以设置在摄像头中。
摄像头用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例 中,主设备100可以包括1个或N个摄像头,N为大于1的正整数。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当主设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。主设备100可以支持一种或多种视频编解码器。这样,主设备100可以播放或录制多种编码格式的视频,例如:动态图像专家组(moving picture experts group,MPEG)1,MPEG2,MPEG3,MPEG4等。
充电管理模块用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块可以通过USB接口接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块可以通过主设备100的无线充电线圈接收无线充电输入。充电管理模块为电池充电的同时,还可以通过电源管理模块为电子设备供电。
电源管理模块用于连接电池,充电管理模块与处理器701。电源管理模块接收电池和/或充电管理模块的输入,为处理器701,存储器702,触摸屏704,摄像头,和无线通信模块等供电。电源管理模块还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块也可以设置于处理器701中。在另一些实施例中,电源管理模块和充电管理模块也可以设置于同一个器件中。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现主设备100的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
按键包括开机键,音量键等。按键可以是机械按键。也可以是触摸式按键。主设备100可以接收按键输入,产生与主设备100的用户设置以及功能控制有关的键信号输入。
马达可以产生振动提示。马达可以用于来电振动提示,也可以用于触摸振动反馈。例如,作用于不同应用(例如拍照,音频播放等)的触摸操作,可以对应不同的振动反馈效果。作用于触摸屏704不同区域的触摸操作,马达也可对应不同的振动反馈效果。不同的应用场景(例如:时间提醒,接收信息,闹钟,游戏等)也可以对应不同的振动反馈效果。触摸振动反馈效果还可以支持自定义。
指示器可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。
SIM卡接口用于连接SIM卡。SIM卡可以通过插入SIM卡接口,或从SIM卡接口拔出,实现和主设备100的接触和分离。主设备100可以支持1个或N个SIM卡接口,N为大于1的正整数。SIM卡接口可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口可以同时插入多张卡。所述多张卡的类型可以相同,也可以不同。SIM卡接口也可以兼容不同类型的SIM卡。SIM卡接口也可以兼容外部存储卡。主设备100通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,主设备100采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在主设备100中,不能和主设备100分离。
从设备200的处理器711、传感器712、发声单元713可参考上述处理器701、传感器703、音频单元705的介绍;从设备300的处理器721、传感器722可参考上述处理器701、传感器703的介绍,这里不再赘述。此外,从设备200、从设备300还可包括其他硬件模块,本申请实施例对此不做限制。
实施本申请实施例提供的声音处理方法,用户可以在播放音频时,可以通过自身动作(例如摇头、晃手等)带动电子设备动作。电子设备可以通过运动检测识别上述动作,并根据预设的关联关系确定与上述动作匹配的音乐素材,从而为正在播放的音频附加娱乐性的互动效果,增加音频播放过程的趣味性,满足用户与正在播放的音频的互动需求。
本申请的说明书和权利要求书及附图中的术语“用户界面(user interface,UI)”,是应用程序或操作系统与用户之间进行交互和信息交换的介质接口,它实现信息的内部形式与用户可以接受形式之间的转换。应用程序的用户界面是通过java、可扩展标记语言(extensible markup language,XML)等特定计算机语言编写的源代码,界面源代码在终端设备上经过解析,渲染,最终呈现为用户可以识别的内容,比如图片、文字、按钮等控件。控件(control)也称为部件(widget),是用户界面的基本元素,典型的控件有工具栏(toolbar)、菜单栏(menu bar)、文本框(text box)、按钮(button)、滚动条(scrollbar)、图片和文本。界面中的控件的属性和内容是通过标签或者节点来定义的,比如XML通过<Textview>、<ImgView>、<VideoView>等节点来规定界面所包含的控件。一个节点对应界面中一个控件或属性,节点经过解析和渲染之后呈现为用户可视的内容。此外,很多应用程序,比如混合应用(hybrid application)的界面中通常还包含有网页。网页,也称为页面,可以理解为内嵌在应用程序界面中的一个特殊的控件,网页是通过特定计算机语言编写的源代码,例如超文本标记语言(hyper text markup language,GTML),层叠样式表(cascading style sheets,CSS),java脚本(JavaScript,JS)等,网页源代码可以由浏览器或与浏览器功能类似的网页显示组件加载和显示为用户可识别的内容。网页所包含的具体内容也是通过网页源代码中的标签或者节点来定义的,比如GTML通过<p>、<img>、<video>、<canvas>来定义网页的元素和属性。
用户界面常用的表现形式是图形用户界面(graphic user interface,GUI),是指采用图形方式显示的与计算机操作相关的用户界面。它可以是在电子设备的显示屏中显示的一个图标、窗口、控件等界面元素,其中控件可以包括图标、按钮、菜单、选项卡、文本框、对话框、状态栏、导航栏、Widget等可视的界面元素。
在本申请的说明书和所附权利要求书中所使用的那样,单数表达形式“一个”、“一种”、“所述”、“上述”、“该”和“这一”旨在也包括复数表达形式,除非其上下文中明确地有相反指示。还应当理解,本申请中使用的术语“和/或”是指并包含一个或多个所列出项目的任何或所有可能组合。上述实施例中所用,根据上下文,术语“当…时”可以被解释为意思是“如果…”或“在…后”或“响应于确定…”或“响应于检测到…”。类似地,根据上下文,短语“在确定…时”或“如果检测到(所陈述的条件或事件)”可以被解释为意思 是“如果确定…”或“响应于确定…”或“在检测到(所陈述的条件或事件)时”或“响应于检测到(所陈述的条件或事件)”。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线)或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如DVD)、或者半导体介质(例如固态硬盘)等。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,该流程可以由计算机程序来指令相关的硬件完成,该程序可存储于计算机可读取存储介质中,该程序在执行时,可包括如上述各方法实施例的流程。而前述的存储介质包括:ROM或随机存储记忆体RAM、磁碟或者光盘等各种可存储程序代码的介质。

Claims (19)

  1. 一种声音处理方法,应用于第一电子设备,其特征在于,所述方法包括:
    播放第一音频;
    检测到用户的第一动作;
    响应于所述第一动作,获取第二音频,所述第二音频与所述第一动作有对应关系,其中所述对应关系为用户预置的;
    根据所述第二音频,对所述第一音频进行处理,得到第三音频,其中所述第三音频和所述第一音频不同,且所述第三音频和所述第一音频关联;
    播放所述第三音频。
  2. 根据权利要求1所述的方法,其特征在于,所述第二音频为预设的、用于为所述第一音频增加背景音效果的音频。
  3. 根据权利要求1或2所述的方法,其特征在于,在获取第二音频之后,所述方法还包括:对所述第二音频进行处理,使之成为具有可变立体声播放效果的第二音频,所述可变立体声播放效果是指所述立体声播放效果能够随着用户与所述第一电子设备相对位置的变化而变化;
    所述根据所述第二音频,对所述第一音频进行处理,得到第三音频,具体包括:
    将具有可变立体声播放效果的第二音频和所述第一音频进行叠加得到第三音频。
  4. 根据权利要求3所述的方法,其特征在于,所述对所述第二音频进行处理,使之成为具有可变立体声播放效果的第二音频,具体包括:
    获取所述第一电子设备相对用户的位置;
    根据所述位置确定第一参数,所述第一参数是从头相关变换函数数据库中获取的,用于调整所述第二音频左声道、右声道播放效果的参数;
    将所述第二音频与所述第一参数按频点相乘,得到具有可变立体声播放效果第二音频。
  5. 根据权利要求1-4中任一项所述的方法,其特征在于,所述根据所述第二音频,对所述第一音频进行处理,得到第三音频,具体包括:
    将第一时长的所述第二音频叠加到所述第一音频的第一区间,得到第三音频,所述第一区间的时长与所述第一时长相等;
    所述播放所述第三音频,具体包括:播放所述第三音频的所述第一区间内的音频。
  6. 根据权利要求5所述的方法,其特征在于,所述第一动作包括多个第二动作,所述多个第二动作为多个所述第二电子设备在同一时刻做出的动作的组合,所述第二音频包括多个第四音频,所述多个第四音频分别与所述多个第二动作对应。
  7. 根据权利要求1-6中任一项所述的方法,其特征在于,在播放第一音频之前,所述方法还包括:
    显示第一用户界面,所述第一用户界面中显示有一个或多个图标和控件,所述图标包括第一图标,所述控件包括第一控件;
    检测到用户作用于所述第一控件的第一操作;
    响应于所述第一操作,确认所述第二音频与所述第一动作关联。
  8. 根据权利要求1-7中任一项所述的方法,其特征在于,所述获取第二音频,具体包括:
    查询存储表,确定所述第二音频,所述存储表中记录了一个或多个音频,以及所述音频对应的动作;所述一个或多个音频包括所述第二音频,所述第二音频在所述存储表中对应所述第一动作;
    从本地数据库,或服务器获取所述第二音频。
  9. 根据权利要求1-8中任一项所述的方法,其特征在于,所述第二音频包括:乐器声、动物声、环境声或录音中任意一个。
  10. 根据权利要求9所述的方法,其特征在于,所述乐器声包括:小鼓、大鼓、砂槌、钢琴、手风琴、小号、大号、长笛、大提琴或小提琴中任意一个;所述动物声包括:鸟鸣、蛙鸣、虫鸣、猫叫、狗叫、羊叫、牛叫、猪叫、马叫或鸡鸣中任意一个;所述环境声包括:风声、雨声、雷声、流水声、海浪声或瀑布声中任意一个。
  11. 根据权利要求1-10中任一项所述的方法,其特征在于,所述第二电子设备包括所述第一电子设备连接的耳机,所述第一动作包括所述耳机检测到的用户的头部动作。
  12. 根据权利要求11所述的方法,其特征在于,所述头部动作包括头部位移或头部转动中的任意一个;所述头部位移包括:左移、右移、上移或下移中的任意一个,所述头部转动包括左转、右转、仰头或低头中的任意一个。
  13. 根据权利要求1-10中任一项所述的方法,其特征在于,所述第二电子设备包括所述第一电子设备连接的手表,所述第一动作包括所述手表检测到的用户的手部动作。
  14. 根据权利要求13所述的方法,其特征在于,所述手部动作包括手部位移或手部转动中的任意一个;所述手部位移包括:左移、右移、上移或下移中的任意一个,所述手部转动包括左转、右转、抬手或垂手中的任意一个。
  15. 根据权利要求1-10中任一项所述的方法,其特征在于,所述第二电子设备包括所述第一电子设备连接的耳机和手表,所述第一动作包括所述耳机和所述手表检测到的用户 的头部动作和手部动作的组合。
  16. 一种电子设备,其特征在于,包括一个或多个处理器和一个或多个存储器;其中,所述一个或多个存储器与所述一个或多个处理器耦合,所述一个或多个存储器用于存储计算机程序代码,所述计算机程序代码包括计算机指令,当所述一个或多个处理器执行所述计算机指令时,使得执行如权利要求1-15任一项所述的方法。
  17. 一种芯片系统,所述芯片系统应用于电子设备,所述芯片系统包括一个或多个处理器,所述处理器用于调用计算机指令以使得执行如权利要求1-15中任一项所述的方法。
  18. 一种包含指令的计算机程序产品,当计算机程序产品在电子设备上运行时,使得电子设备执行如权利要求1-15任一项所述的方法。
  19. 一种计算机可读存储介质,包括指令,其特征在于,当所述指令在电子设备上运行时,使得执行如权利要求1-15任一项所述的方法。
PCT/CN2022/073338 2021-06-24 2022-01-22 一种声音处理方法及其装置 WO2022267468A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22826977.5A EP4203447A4 (en) 2021-06-24 2022-01-22 SOUND PROCESSING METHOD AND ASSOCIATED APPARATUS
US18/030,446 US20240031766A1 (en) 2021-06-24 2022-01-22 Sound processing method and apparatus thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110705314.1 2021-06-24
CN202110705314.1A CN113596241B (zh) 2021-06-24 2021-06-24 一种声音处理方法及其装置

Publications (1)

Publication Number Publication Date
WO2022267468A1 true WO2022267468A1 (zh) 2022-12-29

Family

ID=78244496

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/073338 WO2022267468A1 (zh) 2021-06-24 2022-01-22 一种声音处理方法及其装置

Country Status (4)

Country Link
US (1) US20240031766A1 (zh)
EP (1) EP4203447A4 (zh)
CN (2) CN116208704A (zh)
WO (1) WO2022267468A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116208704A (zh) * 2021-06-24 2023-06-02 北京荣耀终端有限公司 一种声音处理方法及其装置
CN114501297B (zh) * 2022-04-02 2022-09-02 北京荣耀终端有限公司 一种音频处理方法以及电子设备
CN118264858A (zh) * 2024-05-29 2024-06-28 深圳爱图仕创新科技股份有限公司 数据处理方法、装置、计算机设备及计算机可读存储介质

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070010589A (ko) * 2005-07-19 2007-01-24 엘지전자 주식회사 턴테이블이 구비되는 이동통신 단말기 및 그 동작방법
JP2007226935A (ja) * 2006-01-24 2007-09-06 Sony Corp 音響再生装置、音響再生方法および音響再生プログラム
CN104640029A (zh) * 2013-11-06 2015-05-20 索尼公司 音频输出方法、装置和电子设备
KR20170019651A (ko) * 2015-08-12 2017-02-22 삼성전자주식회사 음향 제공 방법 및 이를 수행하는 전자 장치
JP6668636B2 (ja) * 2015-08-19 2020-03-18 ヤマハ株式会社 オーディオシステムおよびオーディオ機器
CN105913863A (zh) * 2016-03-31 2016-08-31 乐视控股(北京)有限公司 一种音频播放方法、装置和终端设备
GB201709199D0 (en) * 2017-06-09 2017-07-26 Delamont Dean Lindsay IR mixed reality and augmented reality gaming system
CN111050269B (zh) * 2018-10-15 2021-11-19 华为技术有限公司 音频处理方法和电子设备
CN111405416B (zh) * 2020-03-20 2022-06-24 北京达佳互联信息技术有限公司 立体声录制方法、电子设备及存储介质
CN111930335A (zh) * 2020-07-28 2020-11-13 Oppo广东移动通信有限公司 声音调节方法及装置、计算机可读介质及终端设备
CN112507161A (zh) * 2020-12-14 2021-03-16 华为技术有限公司 一种音乐播放方法及装置

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150170665A1 (en) * 2013-12-16 2015-06-18 Rawles Llc Attribute-based audio channel arbitration
CN103885663A (zh) * 2014-03-14 2014-06-25 深圳市东方拓宇科技有限公司 一种生成和播放音乐的方法及其对应终端
US20170161381A1 (en) * 2015-12-04 2017-06-08 Chiun Mai Communication Systems, Inc. Electronic device and music play system and method
CN106572425A (zh) * 2016-05-05 2017-04-19 王杰 音频处理装置及方法
CN108242238A (zh) * 2018-01-11 2018-07-03 广东小天才科技有限公司 一种音频文件生成方法及装置、终端设备
CN112221137A (zh) * 2020-10-26 2021-01-15 腾讯科技(深圳)有限公司 音频处理方法、装置、电子设备及存储介质
CN113596241A (zh) * 2021-06-24 2021-11-02 荣耀终端有限公司 一种声音处理方法及其装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4203447A4 *

Also Published As

Publication number Publication date
US20240031766A1 (en) 2024-01-25
CN116208704A (zh) 2023-06-02
CN113596241B (zh) 2022-09-20
CN113596241A (zh) 2021-11-02
EP4203447A1 (en) 2023-06-28
EP4203447A4 (en) 2024-03-27

Similar Documents

Publication Publication Date Title
RU2766255C1 (ru) Способ голосового управления и электронное устройство
WO2020244495A1 (zh) 一种投屏显示方法及电子设备
JP7222112B2 (ja) 歌の録音方法、音声補正方法、および電子デバイス
WO2020211701A1 (zh) 模型训练方法、情绪识别方法及相关装置和设备
WO2022267468A1 (zh) 一种声音处理方法及其装置
WO2022002166A1 (zh) 一种耳机噪声处理方法、装置及耳机
WO2021147415A1 (zh) 实现立体声输出的方法及终端
CN111345010A (zh) 一种多媒体内容同步方法及电子设备
CN110989961A (zh) 一种声音处理方法及其装置
CN109003621B (zh) 一种音频处理方法、装置及存储介质
CN109819306B (zh) 一种媒体文件裁剪的方法、电子设备和服务器
CN111276122A (zh) 音频生成方法及装置、存储介质
CN115033313A (zh) 终端应用控制方法、终端设备及芯片系统
CN114222187B (zh) 视频编辑方法和电子设备
CN113747047A (zh) 一种视频播放的方法及设备
WO2023246563A1 (zh) 一种声音处理方法及电子设备
CN114694646A (zh) 一种语音交互处理方法及相关装置
CN115641867B (zh) 语音处理方法和终端设备
CN115359156B (zh) 音频播放方法、装置、设备和存储介质
CN114173184A (zh) 投屏方法和电子设备
WO2022089563A1 (zh) 一种声音增强方法、耳机控制方法、装置及耳机
CN115641870A (zh) 一种音频信号的处理方法及相关电子设备
WO2024051638A1 (zh) 声场校准方法、电子设备及系统
CN117689776B (zh) 一种音频播放方法、电子设备及存储介质
CN117764853B (zh) 人脸图像增强方法和电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22826977

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022826977

Country of ref document: EP

Effective date: 20230322

NENP Non-entry into the national phase

Ref country code: DE