WO2022022536A1 - Audio playback method, audio playback apparatus, and electronic device - Google Patents

Audio playback method, audio playback apparatus, and electronic device

Info

Publication number
WO2022022536A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
target
file
audio file
noise reduction
Prior art date
Application number
PCT/CN2021/108757
Other languages
French (fr)
Chinese (zh)
Inventor
史建兴
Original Assignee
维沃移动通信有限公司
Priority date
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Publication of WO2022022536A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/57: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals

Definitions

  • the embodiments of the present application relate to the field of communication technologies, and in particular, to an audio playback method, an audio playback device, and an electronic device.
  • the shooting functions of electronic devices are becoming more and more powerful, and users can shoot videos to record life through the shooting functions of the electronic devices.
  • a user may shoot a video of a concert, or a video of other users' speeches, etc. through the electronic device.
  • the purpose of the embodiments of the present application is to provide an audio playback method, an audio playback device and an electronic device, which can solve the problem of poor playback effect in the audio playback process.
  • an embodiment of the present application provides an audio playback method, the method including: in the case of playing an audio file in a multimedia file, determining a target noise feature; determining a first audio in the audio file according to the target noise feature; and extracting a target audio from the audio file according to the first audio, and playing the target audio; wherein the target audio is the first audio or a second audio, and the second audio is the audio other than the first audio in the audio file.
  • an embodiment of the present application provides an audio playback device, the audio playback device including a determination module, a noise reduction module, and a playback module; the determination module is configured to determine a target noise feature when an audio file in a multimedia file is played; the noise reduction module is configured to determine a first audio in the audio file according to the target noise feature determined by the determination module, and to extract a target audio from the audio file according to the first audio; the playback module is configured to play the target audio extracted by the noise reduction module; wherein the target audio is the first audio or a second audio, and the second audio is the audio other than the first audio in the audio file.
  • an embodiment of the present application provides an electronic device, the electronic device including a processor, a memory, and a program or instruction stored on the memory and executable on the processor, wherein the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect.
  • an embodiment of the present application provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or instruction is executed by a processor, the steps of the method according to the first aspect are implemented.
  • an embodiment of the present application provides a chip, the chip including a processor and a communication interface, the communication interface being coupled to the processor, and the processor being configured to run a program or an instruction to implement the method according to the first aspect.
  • the audio playback device may determine the target noise feature when playing the audio file in the multimedia file. Then, the audio playback device may determine the first audio in the audio file according to the target noise feature. Then, the audio playback device can extract the target audio in the audio file according to the first audio, and play the target audio; wherein the target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
  • first, the audio playback device can determine the target noise feature corresponding to the audio file. Once the audio playback device has determined the target noise feature, it can accurately determine the first audio in the audio file according to the target noise feature.
  • second, the audio playback device can extract the target audio in the audio file according to the first audio. Because the first audio is determined more accurately, the target audio is also extracted from the audio file more accurately.
  • as a result, the audio playback device can accurately suppress the noise in the audio file and obtain the playback effect the user requires. In this way, the playback effect of the audio file is improved.
  • FIG. 1 is a schematic flowchart of an audio playback method provided by an embodiment of the present application.
  • FIG. 2 is a first schematic diagram of an interface to which an audio playback method provided by an embodiment of the present application is applied.
  • FIG. 3 is a second schematic diagram of an interface to which an audio playback method provided by an embodiment of the present application is applied.
  • FIG. 5 is a fourth schematic diagram of an interface to which an audio playback method provided by an embodiment of the present application is applied.
  • FIG. 6 is a schematic structural diagram of an audio playback device provided by an embodiment of the present application.
  • FIG. 7 is a first schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 8 is a second schematic structural diagram of an electronic device according to an embodiment of the present application.
  • the audio playback method in the embodiments of the present application may be applied in various scenarios, for example, playing a concert video, a child's performance video, a lecture video, ocean audio, a home video, an animal video, or a song.
  • FIG. 1 is a schematic flowchart of an audio playback method provided by an embodiment of the present application, including steps 201 to 203:
  • Step 201 When the audio playback device plays the audio file in the multimedia file, the audio playback device determines the target noise feature.
  • the multimedia file may be a multimedia file collected by the audio playback device, a multimedia file downloaded by the audio playback device, or a multimedia file played online by the audio playback device, which is not limited in this embodiment of the present application.
  • the above-mentioned multimedia file may be a video file or an audio file, which is not limited in the embodiment of the present application.
  • if the multimedia file is a video file, the audio file in the multimedia file is the audio (for example, background music or vocals) in the video file; if the multimedia file is an audio file, the audio file in the multimedia file is that audio file itself.
  • the target noise feature may include: white noise, Gaussian noise, impulse noise, human voice, or other noise, which is not limited in this embodiment of the present application.
  • noise features in the embodiments of the present application may be understood as noise types.
  • the target noise feature may be a noise feature corresponding to a shooting scene or a noise feature corresponding to a noise reduction degree, which is not limited in this embodiment of the present application.
  • the above shooting scene may include any one of the following: outdoor, seaside, bus, concert or home. It should be noted that the shooting scenes in the embodiments of the present application include but are not limited to the aforementioned five scenes, which may be specifically limited according to actual needs, which are not limited in the embodiments of the present application.
  • for example, if the above-mentioned shooting scene is the seaside, the target noise feature may be the sound of ocean waves, or sounds other than the sound of ocean waves; if the above-mentioned shooting scene is a concert, the target noise feature may be background music, or sounds other than the background music and the singer's voice. This may be set according to actual needs and is not limited in this embodiment of the present application.
  • the above-mentioned noise reduction degree refers to the degree of noise suppression, that is, suppression of all or part of the noise.
  • the noise reduction degree can be classified as high, medium, or low. If the audio playback device determines that the noise reduction degree is high, all of the noise can be suppressed; if it determines that the noise reduction degree is medium, 50% of the noise can be suppressed; and if it determines that the noise reduction degree is low, 10% of the noise can be suppressed.
  • the specific value can be set according to actual requirements, which is not limited in this embodiment of the present application.
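  • The degree-to-suppression relationship above can be sketched as a small lookup table. This is only an illustration: the names `SUPPRESSION_BY_DEGREE` and `suppression_fraction` are hypothetical, and the fractions 1.0, 0.5, and 0.1 are simply the example values from the text, which the application says can be set according to actual requirements.

```python
# Hypothetical lookup table: noise reduction degree -> fraction of noise suppressed.
# The fractions mirror the example values in the text (all, 50%, 10%).
SUPPRESSION_BY_DEGREE = {"high": 1.0, "medium": 0.5, "low": 0.1}


def suppression_fraction(degree, overrides=None):
    """Return the fraction of noise to suppress for a given degree.

    `overrides` lets a caller replace the example values, since the text
    states that the specific values can be set according to actual needs.
    """
    table = dict(SUPPRESSION_BY_DEGREE)
    if overrides:
        table.update(overrides)
    if degree not in table:
        raise ValueError(f"unknown noise reduction degree: {degree!r}")
    return table[degree]
```

  • A caller that wants a gentler "medium" setting could pass `overrides={"medium": 0.3}` without changing the defaults.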
  • Step 202 The audio playback device determines the first audio in the audio file according to the above target noise feature.
  • the above-mentioned first audio may be: a target human voice or an ambient sound.
  • the above-mentioned ambient sounds may include: background music, ocean waves, whistling sounds, or other human voices other than the target human voice in the audio file, and the like.
  • Example 1: the audio file includes the sound of ocean waves and a human voice, and the audio playback device determines that the target noise feature is the sound of ocean waves; the audio playback device can then determine, according to the sound of ocean waves, that the first audio is the human voice.
  • Example 2: the audio file includes the sound of ocean waves and a human voice, and the audio playback device determines that the target noise feature is the sound of ocean waves; the audio playback device can then determine, according to the sound of ocean waves, that the first audio is the sound of ocean waves.
  • Example 3: the audio file includes the sound of ocean waves and a human voice, and the audio playback device determines that the target noise feature is the human voice; the audio playback device can then determine, according to the human voice, that the first audio is the sound of ocean waves.
  • Example 4: the audio file includes the sound of ocean waves and a human voice, and the audio playback device determines that the target noise feature is the human voice; the audio playback device can then determine, according to the human voice, that the first audio is the human voice.
  • the above-mentioned target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
  • the second audio may be the target human voice or an ambient sound.
  • if the second audio is the target human voice, the first audio can be the ambient sound; if the second audio is the ambient sound, the first audio can be the target human voice. This can be set according to actual needs and is not limited in the embodiments of the present application.
  • Example 1: the audio playback device extracts the first audio in the audio file as the target audio.
  • for example, the audio file includes the sound of ocean waves and a human voice. If the user needs audio that contains only the human voice (that is, the above-mentioned target audio), the user can trigger the audio playback device to determine that the target noise feature is the sound of ocean waves; the audio playback device can then determine, based on the sound of ocean waves, that the first audio is the human voice, and extract the human voice.
  • Example 2: the audio playback device extracts the second audio in the audio file as the target audio.
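  • The relationship between the first audio, the second audio, and the target audio can be sketched as follows. This is a toy model under stated assumptions: the audio file is represented as a list of labeled components rather than as a signal, and `split_audio` and `select_target` are hypothetical names that do not appear in the application.

```python
def split_audio(components, first_audio):
    """Return (first, second), where second is every component other than first."""
    if first_audio not in components:
        raise ValueError(f"{first_audio!r} is not in the audio file")
    second = [c for c in components if c != first_audio]
    return first_audio, second


def select_target(components, first_audio, use_first):
    """Pick the target audio: the first audio, or the remaining (second) audio."""
    first, second = split_audio(components, first_audio)
    return [first] if use_first else second
```

  • With components `["ocean waves", "human voice"]` and the human voice as the first audio, `use_first=True` corresponds to Example 1 and `use_first=False` corresponds to Example 2.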
  • the audio playback device may extract the target audio in the audio file according to the AI noise reduction model.
  • the audio playback device may extract the target audio in the audio file according to the above-mentioned first audio in at least two possible ways.
  • in the first way, the audio playback device determines the second audio according to the above-mentioned first audio, and directly extracts the second audio from the audio file. For example, the audio playback device may extract the second audio through the first AI noise reduction model.
  • in the second way, the audio playback device first extracts the first audio from the audio file, and then filters the first audio out of the audio file to obtain the second audio.
  • for example, the audio playback device may extract the first audio through the second AI noise reduction model, and filter the first audio out of the audio file to obtain the second audio.
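  • The two extraction ways can be contrasted in a minimal sketch. It assumes the audio is a list of float samples and treats each AI noise reduction model as an opaque callable; the application does not specify the models, so `first_model`, `second_model`, and both function names are hypothetical stand-ins, and plain subtraction is used here only as the simplest possible form of "filtering".

```python
from typing import Callable, List, Sequence

Samples = List[float]
Model = Callable[[Sequence[float]], Samples]


def extract_second_directly(mix: Sequence[float], first_model: Model) -> Samples:
    """Way 1: a model that outputs the second audio directly from the mix."""
    return list(first_model(mix))


def extract_second_by_filtering(mix: Sequence[float], second_model: Model) -> Samples:
    """Way 2: a model estimates the first audio, which is then removed from the mix."""
    first = second_model(mix)
    if len(first) != len(mix):
        raise ValueError("model output must have the same length as the mix")
    # Sample-wise subtraction stands in for the filtering step.
    return [m - f for m, f in zip(mix, first)]
```

  • The second way keeps the estimate of the first audio available, which is what makes partial filtering (removing only some of the first audio) possible later.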
  • the audio playback device can determine the target noise feature in the case of playing the audio file in the multimedia file. Then, the audio playback device may determine the first audio in the audio file according to the target noise feature. Then, the audio playback device can extract the target audio in the audio file according to the first audio, and play the target audio; wherein the target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
  • first, the audio playback device can determine the target noise feature corresponding to the audio file.
  • once the audio playback device has determined the target noise feature, it can accurately determine the first audio in the audio file according to the target noise feature. Second, the audio playback device can extract the target audio in the audio file according to the first audio. Because the first audio is determined more accurately, the target audio is also extracted more accurately, so that the audio playback device can accurately suppress the noise in the audio file and obtain the playback effect the user requires. In this way, the playback effect of the audio file is improved.
  • the audio playback device may correspond to a variety of noise features, and the user selects the noise feature to be determined by the audio playback device.
  • step 201 may specifically include the following steps 201a to 201d:
  • the above-mentioned first input may be a click input by the user on the screen of the audio playback device, a voice command input by the user, or a specific gesture input by the user, which can be determined according to actual use requirements and is not limited in this embodiment of the present application.
  • the user's click input on the screen of the audio playback device may specifically be: the user's click input on a target control on the screen.
  • the first input may be that the user presses the power button, the volume button, a newly added artificial intelligence (AI) button, or the like.
  • for example, the first input may be that the user presses the power button and the AI button for 3 seconds, or that the user presses the volume key "+" and the volume key "-" once each.
  • the target noise feature may be determined through the correspondence.
  • for example, take the audio playback device to be a mobile phone, and video 1 to be a video shot at the seaside that contains user A's singing and the sound of ocean waves, as shown in FIG.
  • if the mobile phone user wants to obtain audio that only includes the singing voice of user A, the mobile phone user can click the "AI noise reduction" control 32 (i.e., the above-mentioned first input).
  • the mobile phone then displays a window 33 (i.e., the above-mentioned first interface).
  • the window 33 displays 5 options corresponding to 5 shooting scenes, namely the "outdoor" option, the "concert" option, the "seaside" option 34, the "transit" option, and the "home" option. Then, the mobile phone user can click the "seaside" option 34 (i.e., the above-mentioned second input). Next, the mobile phone determines the AI noise reduction model A, where the target noise feature corresponding to the AI noise reduction model A is the sound of waves; the mobile phone can then determine, according to the sound of the waves, that the first audio is the singing voice of user A, and extract user A's singing voice according to the AI noise reduction model A.
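  • The correspondence between a selected option, its noise reduction model, and the target noise feature can be sketched as a lookup. Only the "seaside" entry (AI noise reduction model A, sound of waves) comes from the example above. The "concert" entry reuses the background-music example given earlier for the concert scene, and its model name is invented for illustration. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoiseReductionModel:
    name: str
    noise_feature: str  # the noise feature this model corresponds to


# Hypothetical option -> model table; only "seaside" is taken from the example.
MODELS_BY_OPTION = {
    "seaside": NoiseReductionModel("AI noise reduction model A", "sound of waves"),
    "concert": NoiseReductionModel("hypothetical model B", "background music"),
}


def target_noise_feature(option):
    """Resolve the second input (the chosen option) to a target noise feature."""
    model = MODELS_BY_OPTION.get(option)
    if model is None:
        raise KeyError(f"no noise reduction model registered for option {option!r}")
    return model.noise_feature
```

  • Keeping the noise feature on the model object mirrors the description, in which each of the M options corresponds to one noise reduction model and each model has a different noise feature.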
  • the specific application may include a "camera” application, or a “recording” application, etc., or a chat application with a video shooting function, or a shopping application with a video playback function, and the like.
  • the above-mentioned user sliding input on the screen may be: the user sliding left on the screen, or the user sliding right on the screen, or the user sliding up on the screen, or the user sliding down on the screen, This embodiment of the present application does not limit this.
  • the audio playback device can also display at least two function options; the user first selects the function to be implemented, and then selects the corresponding noise reduction model.
  • Step 201b1 In response to the above-mentioned first input, the audio playback device displays N function options.
  • each function option corresponds to a different function
  • N is a positive integer.
  • Step 201b2 The audio playback device receives a third input from the user on the target function option among the above N function options.
  • the above-mentioned third input may be a user's click input on the target function option, a voice command input by the user, or a specific gesture input by the user, which can be determined according to actual use requirements and is not limited in this embodiment of the present application.
  • Step 201b3 In response to the third input, the audio playback device displays M options.
  • Example 7: in conjunction with FIG. 2, after the mobile phone user clicks the "AI noise reduction" control 31 (i.e., the above-mentioned first input), two function options are displayed on the screen of the mobile phone, namely a "voice enhancement" function option 41 and a "voice suppression" function option 42.
  • if the mobile phone user wants to obtain audio that only contains the singing voice of user A, the user can click the "voice enhancement" function option 41.
  • the mobile phone then displays, in the window 33, 5 options corresponding to 5 shooting scenes, namely the "outdoor" option, the "concert" option, the "seaside" option 34, the "transit" option, and the "home" option.
  • Example 8: in conjunction with FIG. 2, after the mobile phone user clicks the "AI noise reduction" control 31 (i.e., the above-mentioned first input), two function options are displayed on the screen of the mobile phone, namely a "voice enhancement" function option 41 and a "voice suppression" function option 42.
  • if the mobile phone user wants to obtain audio that only contains the sound of the ocean waves, the mobile phone user can click the "voice suppression" function option 42.
  • the mobile phone then displays, in the window 51, 3 options corresponding to 3 noise reduction degrees, namely the "high" option 52, the "medium" option, and the "low" option 53.
  • if the mobile phone user wants to suppress the singing voice of user A slightly, the mobile phone user can click the "low" option 53; if the mobile phone user wants to completely suppress the singing voice of user A, the mobile phone user can click the "high" option 52, and the mobile phone keeps only the sound of waves in video 1.
  • the audio playback method provided by the embodiment of the present application can be applied to a scenario where a function is determined by selecting a function option.
  • the audio playback device can display N function options; the user can select a function according to requirements, and then select the corresponding noise reduction model from the M options corresponding to that function.
  • in this way, not only can the user select the noise reduction model more accurately, but the audio playback device can also perform audio noise reduction more flexibly.
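  • The two-level selection (a function option first, then one of that function's options) can be sketched as a nested lookup. The option labels are taken from Examples 7 and 8; the table name and the two handler functions are hypothetical and only illustrate the flow of the third and second inputs.

```python
# Function option -> the M options shown after it is selected (labels from the examples).
OPTIONS_BY_FUNCTION = {
    "voice enhancement": ["outdoor", "concert", "seaside", "transit", "home"],
    "voice suppression": ["high", "medium", "low"],
}


def handle_third_input(function):
    """Return the options to display once a function option has been selected."""
    if function not in OPTIONS_BY_FUNCTION:
        raise KeyError(f"unknown function option: {function!r}")
    return list(OPTIONS_BY_FUNCTION[function])


def handle_second_input(function, option):
    """Validate the chosen option against the selected function."""
    if option not in handle_third_input(function):
        raise ValueError(f"{option!r} is not offered for {function!r}")
    return function, option
```

  • Validating the option against the selected function reflects why the two-step menu helps: the user only sees choices that make sense for the function already chosen.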
  • the audio playback apparatus may extract the target audio by using the determined noise reduction degree.
  • Step 203a The audio playback device filters out part or all of the first audio from the audio file to obtain the second audio in the audio file.
  • Example 1: when the first audio is the target human voice, the audio playback device filters out part or all of the target human voice from the audio file to obtain the second audio in the audio file, that is, it performs vocal suppression on the audio file.
  • Example 2 when the first audio is ambient sound, the audio playback device filters out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, that is, performing vocal enhancement on the audio file.
  • the audio playback apparatus extracts all or part of the first audio from the audio file.
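  • Filtering out "part or all" of the first audio can be sketched with a single fraction parameter. This assumes list-of-float samples and an already available estimate of the first audio; the function name `filter_first_audio` and the use of scaled subtraction are illustrative assumptions, not the method specified in the application.

```python
def filter_first_audio(mix, first, fraction):
    """Remove `fraction` (0.0 to 1.0) of the first audio from the mix.

    fraction=1.0 filters out all of the first audio; smaller values
    leave part of it in the result.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    if len(mix) != len(first):
        raise ValueError("mix and first audio must have the same length")
    return [m - fraction * f for m, f in zip(mix, first)]
```

  • If the first audio is the target human voice, this corresponds to vocal suppression; if it is the ambient sound, the same operation corresponds to vocal enhancement.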
  • the audio playback method in the embodiment of the present application can be applied to the scenario of acquiring target audio with different noise reduction degrees; the audio playback device can flexibly obtain target audio with different noise reduction degrees according to the user's needs, which improves the flexibility of audio playback.
  • the execution body may be an audio playback device, or a control module in the audio playback device for executing the audio playback method.
  • an audio playing method performed by an audio playing device is used as an example to describe the audio playing device provided by the embodiments of the present application.
  • FIG. 6 is a schematic diagram of a possible structure for implementing an audio playback device provided by an embodiment of the present application.
  • the audio playback device 600 includes a determination module 601, a noise reduction module 602, and a playback module 603, wherein: the determination module 601 is configured to determine the target noise feature in the case of playing the audio file in the multimedia file; the noise reduction module 602 is configured to determine the first audio in the audio file according to the target noise feature determined by the determination module 601, and to extract the target audio in the audio file according to the first audio; the playback module 603 is configured to play the target audio extracted by the noise reduction module 602; wherein the target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
  • the audio playback device 600 further includes a receiving module 604 and a display module 605; the receiving module 604 is configured to receive the first input of the user in the case of playing the audio file in the multimedia file; the display module 605 is configured to display M options in response to the first input received by the receiving module 604, where the M options correspond to M noise reduction models, the noise feature of each noise reduction model is different, and M is a positive integer; the receiving module 604 is further configured to receive the second input of the user on the target option among the M options; the determination module 601 is specifically configured to, in response to the second input received by the receiving module 604, determine the target noise reduction model corresponding to the target option, and determine the target noise feature according to the target noise reduction model.
  • the noise reduction module 602 is specifically configured to filter out part or all of the first audio from the audio file to obtain the second audio in the audio file.
  • the target noise feature is a noise feature corresponding to a shooting scene, or a noise feature corresponding to a noise reduction degree.
  • the noise reduction module 602 is specifically configured to: in the case that the first audio is the target human voice, filter out part or all of the target human voice from the audio file to obtain the second audio in the audio file, so as to perform vocal suppression on the audio file; or, in the case that the first audio is ambient sound, filter out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, so as to perform vocal enhancement on the audio file.
  • the modules that must be included in the audio playback device 600 are indicated by solid-line frames, such as the determination module 601; the modules that may or may not be included in the audio playback device 600 are indicated by dotted-line frames, such as the display module 605.
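  • The division of work among the three required modules can be sketched as a small pipeline. The class below is an assumption-laden illustration: it models each module as a plain callable, and the class name, parameter names, and `run` method are hypothetical rather than part of the described device.

```python
class AudioPlaybackDevice:
    """Toy pipeline mirroring the determination, noise reduction, and playback modules."""

    def __init__(self, determine, reduce_noise, play):
        self.determine = determine        # stands in for the determination module
        self.reduce_noise = reduce_noise  # stands in for the noise reduction module
        self.play = play                  # stands in for the playback module

    def run(self, audio_file):
        # Determine the target noise feature, extract the target audio, then play it.
        feature = self.determine(audio_file)
        target = self.reduce_noise(audio_file, feature)
        self.play(target)
        return target
```

  • Optional modules such as the receiving and display modules would sit in front of `determine`, supplying the user's selection that fixes the target noise feature.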
  • the audio playback device can determine the target noise feature in the case of playing the audio file in the multimedia file. Then, the audio playback device may determine the first audio in the audio file according to the target noise feature. Then, the audio playback device can extract the target audio in the audio file according to the first audio, and play the target audio; wherein the target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
  • first, the audio playback device can determine the target noise feature corresponding to the audio file.
  • once the audio playback device has determined the target noise feature, it can accurately determine the first audio in the audio file according to the target noise feature. Second, the audio playback device can extract the target audio in the audio file according to the first audio. Because the first audio is determined more accurately, the target audio is also extracted more accurately, so that the audio playback device can accurately suppress the noise in the audio file and obtain the playback effect the user requires. In this way, the playback effect of the audio file is improved.
  • the audio playback device in this embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal.
  • the apparatus may be a mobile electronic device or a non-mobile electronic device.
  • the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA).
  • the non-mobile electronic device may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine, which is not specifically limited in the embodiments of the present application.
  • the audio playback device in the embodiment of the present application may be a device with an operating system.
  • the operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
  • the audio playback device provided in the embodiment of the present application can implement each process implemented by the method embodiments in FIG. 1 to FIG. 5 , and to avoid repetition, details are not described here.
  • an embodiment of the present application further provides an electronic device 700, including a processor 701, a memory 702, and a program or instruction stored in the memory 702 and executable on the processor 701. When the program or instruction is executed by the processor 701, each process of the above-mentioned audio playback method embodiments can be implemented, and the same technical effect can be achieved. To avoid repetition, details are not described here.
  • the electronic devices in the embodiments of the present application include the aforementioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 8 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
  • the electronic device 100 may also include a power source (such as a battery) for supplying power to various components, and the power source may be logically connected to the processor 110 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
  • the structure of the electronic device shown in FIG. 8 does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than those shown in the figure, or combine some components, or arrange the components differently, which will not be repeated here.
  • the processor 110 is used to determine the target noise feature in the case of playing the audio file in the multimedia file; determine the first audio in the audio file according to the target noise feature; and extract the target audio in the audio file according to the first audio.
  • the audio output unit 103 is used for playing the target audio extracted by the processor 110; wherein the target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
  • the user input unit 107 is used for receiving the first input of the user in the case of playing the audio file in the multimedia file; the display unit 106 is used for displaying M options in response to the first input received by the user input unit 107, where the M options correspond to M noise reduction models, each noise reduction model has a different noise feature, and M is a positive integer; the user input unit 107 is further configured to receive the user's second input on the target option among the M options; the processor 110 is specifically configured to, in response to the second input received by the user input unit 107, determine a target noise reduction model corresponding to the target option, where the target noise reduction model corresponds to the target noise feature.
  • the processor 110 is specifically configured to filter out part or all of the first audio from the audio file to obtain the second audio in the audio file.
  • the target noise feature is a noise feature corresponding to a shooting scene, or a noise feature corresponding to a noise reduction degree.
  • the processor 110 is specifically configured to: in the case that the first audio is a target human voice, filter out part or all of the target human voice from the audio file to obtain the second audio in the audio file, so as to perform vocal suppression on the audio file; or, in the case that the first audio is ambient sound, filter out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, so as to perform vocal enhancement on the audio file.
  • the electronic device can determine the target noise feature in the case of playing the audio file in the multimedia file. Then, the electronic device may determine the first audio in the audio file according to the target noise feature. Then, the electronic device can extract the target audio in the audio file according to the first audio, and play the target audio; wherein the target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
  • the electronic device can determine the target noise feature corresponding to the audio file. When the electronic device determines the target noise feature, it can accurately determine the first audio in the audio file according to the target noise feature.
  • the electronic device can extract the target audio in the audio file according to the first audio. Because the first audio is determined more accurately, the target audio is also extracted from the audio file more accurately, so that the electronic device can accurately suppress the noise in the audio file and obtain the playback effect of the audio file required by the user. In this way, the purpose of improving the playback effect of the audio file is achieved.
  • the input unit 104 may include a graphics processor (Graphics Processing Unit, GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capture apparatus (such as a camera).
  • the display unit 106 may include a display panel 1061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 107 includes a touch panel 1071 and other input devices 1072 .
  • the touch panel 1071 is also called a touch screen.
  • the touch panel 1071 may include two parts, a touch detection device and a touch controller.
  • Other input devices 1072 may include, but are not limited to, physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be repeated here.
  • Memory 109 may be used to store software programs as well as various data including, but not limited to, application programs and operating systems.
  • the processor 110 may integrate an application processor and a modem processor, wherein the application processor mainly processes the operating system, user interface, and application programs, and the like, and the modem processor mainly processes wireless communication. It can be understood that, the above-mentioned modulation and demodulation processor may not be integrated into the processor 110 .
  • the embodiments of the present application further provide a readable storage medium, where a program or an instruction is stored on the readable storage medium.
  • the processor is the processor in the electronic device described in the foregoing embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • An embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the above audio playback method embodiments.
  • the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-a-chip, or the like.
  • the method of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or a CD-ROM) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the various embodiments of this application.

Abstract

Disclosed are an audio playback method and apparatus, and an electronic device. The method comprises: when an audio file in a multimedia file is played back, determining the target noise characteristic; determining a first audio in the audio file according to the target noise characteristic; and extracting a target audio in the audio file according to the first audio, and playing back the target audio, the target audio being the first audio or a second audio, and the second audio being an audio other than the first audio in the audio file.

Description

Audio playback method, audio playback apparatus, and electronic device
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 202010749736.4, filed with the China National Intellectual Property Administration on July 30, 2020 and entitled "Audio playback method, audio playback apparatus, and electronic device", the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of communication technologies, and in particular, to an audio playback method, an audio playback device, and an electronic device.
Background
At present, the shooting functions of electronic devices are becoming more and more powerful, and users can shoot videos with an electronic device to record their lives. For example, a user may use the electronic device to shoot a video of a concert, or a video of other users speaking.
While an electronic device is shooting a video, taking a concert as an example, the electronic device captures all audio within its collection range at the concert site. However, the environment at a concert site is usually noisy, so the electronic device captures not only the singer's voice and the background music but also the noise in the environment. When the electronic device later plays the video, it plays the noise together with the singer's voice and the background music, which results in a poor audio playback effect.
Summary of the invention
The purpose of the embodiments of the present application is to provide an audio playback method, an audio playback device, and an electronic device that can solve the problem of poor playback effect during audio playback.
To solve the above technical problem, this application is implemented as follows:
In a first aspect, an embodiment of the present application provides an audio playback method. The method includes: in the case of playing an audio file in a multimedia file, determining a target noise feature; determining a first audio in the audio file according to the target noise feature; and extracting a target audio in the audio file according to the first audio, and playing the target audio; wherein the target audio is the first audio or a second audio, and the second audio is the audio other than the first audio in the audio file.
In a second aspect, an embodiment of the present application provides an audio playback device. The audio playback device includes a determination module, a noise reduction module, and a playback module. The determination module is configured to determine a target noise feature in the case of playing an audio file in a multimedia file. The noise reduction module is configured to determine a first audio in the audio file according to the target noise feature determined by the determination module, and to extract a target audio in the audio file according to the first audio. The playback module is configured to play the target audio extracted by the noise reduction module; wherein the target audio is the first audio or a second audio, and the second audio is the audio other than the first audio in the audio file.
In a third aspect, an embodiment of the present application provides an electronic device. The electronic device includes a processor, a memory, and a program or instruction stored in the memory and executable on the processor, where the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or instruction is executed by a processor, the steps of the method according to the first aspect are implemented.
In a fifth aspect, an embodiment of the present application provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the method according to the first aspect.
In the embodiments of the present application, the audio playback device may determine a target noise feature in the case of playing an audio file in a multimedia file. Then, the audio playback device may determine the first audio in the audio file according to the target noise feature. Next, the audio playback device may extract the target audio in the audio file according to the first audio, and play the target audio; wherein the target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file. With this solution, first, when the audio playback device plays the audio file, it can determine the target noise feature corresponding to the audio file. Once the target noise feature is determined, the audio playback device can accurately determine the first audio in the audio file according to the target noise feature. Second, the audio playback device can extract the target audio in the audio file according to the first audio. Because the first audio is determined more accurately, the target audio is also extracted more accurately, so that the audio playback device can accurately suppress the noise in the audio file and obtain the playback effect of the audio file required by the user. In this way, the purpose of improving the playback effect of the audio file is achieved.
Description of drawings
FIG. 1 is a schematic flowchart of an audio playback method provided by an embodiment of the present application;
FIG. 2 is a first schematic diagram of an interface to which an audio playback method provided by an embodiment of the present application is applied;
FIG. 3 is a second schematic diagram of an interface to which an audio playback method provided by an embodiment of the present application is applied;
FIG. 4 is a third schematic diagram of an interface to which an audio playback method provided by an embodiment of the present application is applied;
FIG. 5 is a fourth schematic diagram of an interface to which an audio playback method provided by an embodiment of the present application is applied;
FIG. 6 is a schematic structural diagram of an audio playback device provided by an embodiment of the present application;
FIG. 7 is a first schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 8 is a second schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that data so used may be interchanged under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. In addition, the objects distinguished by "first", "second", and the like are usually of one type, and the number of objects is not limited; for example, there may be one first object or more than one. Furthermore, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
The audio playback method provided by the embodiments of the present application is described in detail below through specific embodiments and their application scenarios, with reference to the accompanying drawings.
The audio playback method in the embodiments of the present application may be applied in various scenarios, for example, playing a concert video, playing a video of a child's performance, playing a lecture video, playing audio of the sea, playing a home video, playing an animal video, or playing a song.
Taking the scenario of playing a concert video as an example, when a user plays a concert video on an electronic device and finds that the audio contains a lot of noise (sounds other than the singer's voice and the background music), the user can tap the screen of the electronic device. At this time, the electronic device can determine the noise feature of the audio file in the concert video, and determine that what the user wants to listen to is the singer's voice and the background music in the concert video. Then, the electronic device can extract the singer's voice and the background music from the concert video and play them. The user thus hears the singer's voice and the background music with reduced noise, which improves the playback effect of the audio file in the concert video.
FIG. 1 is a schematic flowchart of an audio playback method provided by an embodiment of the present application. The method includes steps 201 to 203:
Step 201: In the case of playing an audio file in a multimedia file, the audio playback device determines a target noise feature.
In the embodiments of the present application, the multimedia file may be a multimedia file captured by the audio playback device, a multimedia file downloaded by the audio playback device, or a multimedia file played online by the audio playback device. This is not limited in the embodiments of the present application.
In the embodiments of the present application, the multimedia file may be a video file or an audio file. This is not limited in the embodiments of the present application.
It can be understood that, when the multimedia file is a video file, the audio file in the multimedia file is the audio file in the video file (for example, background music or a human voice); when the multimedia file is an audio file, the audio file in the multimedia file is that audio file itself.
In the embodiments of the present application, the target noise feature may include white noise, Gaussian noise, impulse noise, a human voice, or other noise. This is not limited in the embodiments of the present application.
It should be noted that a noise feature in the embodiments of the present application may be understood as a noise type.
Optionally, in the embodiments of the present application, the target noise feature may be a noise feature corresponding to a shooting scene, or a noise feature corresponding to a noise reduction degree. This is not limited in the embodiments of the present application.
The shooting scene may include any one of the following: outdoors, seaside, bus, concert, or home. It should be noted that the shooting scenes in the embodiments of the present application include but are not limited to these five scenes, and may be defined according to actual needs. This is not limited in the embodiments of the present application.
Exemplarily, when the shooting scene is the seaside, the target noise feature may be the sound of ocean waves, or sounds other than the sound of ocean waves; when the shooting scene is a concert, the target noise feature may be the background music, or sounds other than the background music and the singer's voice. The target noise feature may be set according to actual needs, which is not limited in the embodiments of the present application.
In the embodiments of the present application, the noise reduction degree refers to the degree to which noise is suppressed, that is, whether all or part of the noise is suppressed. For example, the noise reduction degree may be high, medium, or low: if the audio playback device determines that the noise reduction degree is high, it may suppress all of the noise; if medium, it may suppress 50% of the noise; if low, it may suppress 10% of the noise. The specific values may be set according to actual needs, which is not limited in the embodiments of the present application.
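The three noise reduction degrees in the example above can be sketched as a simple lookup. This is a minimal illustration only: the names `SUPPRESSION_BY_DEGREE` and `suppress_noise` are hypothetical, the audio is modelled as plain lists of float samples, and "suppressing 50% of the noise" is assumed here to mean subtracting half of a noise estimate in the time domain, which the application does not specify.

```python
# Fraction of the noise to suppress for each noise reduction degree.
# The three levels and their percentages follow the example in the text;
# the dictionary itself is a hypothetical illustration.
SUPPRESSION_BY_DEGREE = {"high": 1.0, "medium": 0.5, "low": 0.1}


def suppress_noise(mixed, noise, degree):
    """Subtract a scaled noise estimate from the mixed samples.

    mixed and noise are equal-length lists of float samples; noise is assumed
    to be an estimate of the noise component already contained in mixed.
    """
    if len(mixed) != len(noise):
        raise ValueError("mixed and noise must have the same length")
    fraction = SUPPRESSION_BY_DEGREE[degree]
    return [m - fraction * n for m, n in zip(mixed, noise)]


# With degree "high" the whole noise estimate is removed.
cleaned = suppress_noise([1.0, 2.0], [0.5, 1.0], "high")
```

A real implementation would more likely scale the noise in a spectral or learned representation; the lookup only shows how a user-facing degree could map to a numeric suppression strength.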
In an example, in the case of playing an audio file in a multimedia file, the audio playback device may determine the target noise feature automatically, or the user may trigger the audio playback device to determine the target noise feature.
Step 202: The audio playback device determines the first audio in the audio file according to the target noise feature.
Optionally, in the embodiments of the present application, the first audio may be a target human voice or ambient sound.
Exemplarily, the ambient sound may include background music, the sound of ocean waves, a whistle sound, or human voices in the audio file other than the target human voice.
Example 1: The audio file includes the sound of ocean waves and a human voice. If the audio playback device determines that the target noise feature is the sound of ocean waves, it may determine, according to the sound of ocean waves, that the first audio is the human voice.
Example 2: The audio file includes the sound of ocean waves and a human voice. If the audio playback device determines that the target noise feature is the sound of ocean waves, it may determine, according to the sound of ocean waves, that the first audio is the sound of ocean waves.
Example 3: The audio file includes the sound of ocean waves and a human voice. If the audio playback device determines that the target noise feature is the human voice, it may determine, according to the human voice, that the first audio is the sound of ocean waves.
Example 4: The audio file includes the sound of ocean waves and a human voice. If the audio playback device determines that the target noise feature is the human voice, it may determine, according to the human voice, that the first audio is the human voice.
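Examples 1 to 4 show that, for a given target noise feature, the first audio may be either the noise itself or everything except the noise. The following sketch makes that choice explicit. The function name `determine_first_audio` and the flag `first_audio_is_noise` are hypothetical and do not appear in the application; sound types are represented as plain string labels.

```python
def determine_first_audio(components, target_noise_feature, first_audio_is_noise):
    """Return the sound types that make up the first audio.

    components: sound types present in the audio file.
    target_noise_feature: the sound type selected as the noise feature.
    first_audio_is_noise: hypothetical flag; True treats the noise itself as
        the first audio, False treats everything except the noise as the
        first audio.
    """
    if target_noise_feature not in components:
        raise ValueError(f"{target_noise_feature!r} is not present in the audio file")
    if first_audio_is_noise:
        return [target_noise_feature]
    return [c for c in components if c != target_noise_feature]


components = ["ocean waves", "human voice"]
example_1 = determine_first_audio(components, "ocean waves", False)  # human voice
example_2 = determine_first_audio(components, "ocean waves", True)   # ocean waves
example_3 = determine_first_audio(components, "human voice", False)  # ocean waves
example_4 = determine_first_audio(components, "human voice", True)   # human voice
```

Which branch applies would depend on how the device is configured or on what the user asks for; the application leaves this open.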
Step 203: The audio playback device extracts the target audio in the audio file according to the first audio, and plays the target audio.
The target audio is the first audio or the second audio, and the second audio is the audio other than the first audio in the audio file.
It should be noted that there is no strict order between extracting the target audio from the audio file and playing the target audio. For example, the audio playback device may play the target audio after extracting it from the audio file, or may play the target audio while extracting it from the audio file. This is not limited in the embodiments of the present application.
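The two orderings described above (extract first and then play, or play while extracting) can be sketched with a chunked pipeline. This is only an illustration under stated assumptions: the audio is assumed to arrive as an iterable of chunks, and `extract` and `play` are hypothetical callables standing in for the noise reduction step and the audio output.

```python
def extract_chunks(chunks, extract):
    """Lazily yield the extracted target audio, one chunk at a time."""
    for chunk in chunks:
        yield extract(chunk)


def play_after_extraction(chunks, extract, play):
    """Extract the whole target audio first, then play it."""
    extracted = list(extract_chunks(chunks, extract))
    for chunk in extracted:
        play(chunk)


def play_during_extraction(chunks, extract, play):
    """Play each chunk as soon as it has been extracted."""
    for chunk in extract_chunks(chunks, extract):
        play(chunk)


def run_with_log(strategy):
    """Record the order of extract and play calls for two chunks."""
    log = []

    def extract(chunk):
        log.append(("extract", chunk))
        return chunk

    def play(chunk):
        log.append(("play", chunk))

    strategy([1, 2], extract, play)
    return log
```

Playing during extraction lowers the delay before sound starts, while extracting first lets the device save the complete target audio before playback; either choice fits the method as described.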
Exemplarily, the second audio may be a target human voice or ambient sound. Specifically, when the second audio is the target human voice, the first audio may be ambient sound; when the second audio is ambient sound, the first audio may be the target human voice. This may be set according to actual needs, which is not limited in the embodiments of the present application.
Example 1: The audio playback device extracts the first audio in the audio file as the target audio.
For instance, the audio file includes the sound of ocean waves and a human voice. A user who wants audio that contains only the human voice (that is, the target audio) can trigger the audio playback device to determine that the target noise feature is the sound of ocean waves. The audio playback device may then determine, according to the sound of ocean waves, that the first audio is the human voice, and extract the human voice.
Example 2: The audio playback device extracts the second audio in the audio file as the target audio.
For instance, the audio file includes the sound of ocean waves and a human voice. A user who wants audio that contains only the human voice (that is, the target audio) can trigger the audio playback device to determine that the target noise feature is the sound of ocean waves. The audio playback device may then determine, according to the sound of ocean waves, that the first audio is the sound of ocean waves, and use it to extract the human voice (that is, the second audio) from the audio file that includes the sound of ocean waves and the human voice.
In an example, the audio playback device may extract the target audio in the audio file by using an AI noise reduction model.
The audio playback device may train the AI noise reduction model on the training samples in a training sample library, where the training sample library includes at least one training sample. If the target audio to be extracted from the audio file is a human voice, each of the at least one training sample contains a human voice; if the target audio to be extracted is ambient sound, each of the at least one training sample contains ambient sound. For example, if the target audio to be extracted is the sound of ocean waves, each of the at least one training sample contains the sound of ocean waves; if the target audio to be extracted is a whistle sound, each of the at least one training sample contains a whistle sound.
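The requirement that every training sample contains the sound type to be extracted can be sketched as a filter over a labelled sample library. The `TrainingSample` type, its fields, and `select_training_samples` are hypothetical names introduced for illustration; the application does not describe how samples are stored or labelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSample:
    """A labelled recording in the training sample library (hypothetical layout)."""
    name: str
    labels: frozenset  # sound types present in the recording


def select_training_samples(library, target_label):
    """Keep only the samples that contain the sound type to be extracted."""
    return [sample for sample in library if target_label in sample.labels]


library = [
    TrainingSample("beach_talk", frozenset({"human voice", "ocean waves"})),
    TrainingSample("surf_only", frozenset({"ocean waves"})),
    TrainingSample("street", frozenset({"whistle", "human voice"})),
]

# Samples usable for training a model that extracts the sound of ocean waves.
wave_samples = select_training_samples(library, "ocean waves")
```

The selected samples would then be passed to whatever training routine the AI noise reduction model uses, which is outside the scope of this sketch.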
Exemplarily, when the target audio is the second audio, the audio playback device may extract the target audio in the audio file according to the first audio in at least two possible ways.
Example 5: The audio playback device determines the second audio according to the first audio, and extracts the second audio directly from the audio file. For example, the audio playback device may extract the second audio by using a first AI noise reduction model.
Example 6: The audio playback device extracts the first audio from the audio file, and then filters the first audio out of the audio file to obtain the second audio. For example, the audio playback device may extract the first audio by using a second AI noise reduction model, and filter the first audio out of the audio file to obtain the second audio.
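The two ways in Examples 5 and 6 can be contrasted in a short sketch. The models are represented by hypothetical callables that take and return lists of float samples, and "filtering out" the first audio is assumed here to be a sample-wise subtraction; the application does not state how the AI noise reduction models or the filtering are implemented.

```python
def extract_second_directly(audio_file, second_audio_model):
    """Example 5: a model returns the second audio directly."""
    return second_audio_model(audio_file)


def extract_second_by_filtering(audio_file, first_audio_model):
    """Example 6: extract the first audio, then filter it out of the file."""
    first_audio = first_audio_model(audio_file)
    if len(first_audio) != len(audio_file):
        raise ValueError("model output must match the input length")
    return [a - f for a, f in zip(audio_file, first_audio)]


# Toy stand-ins for trained models: the file is voice plus waves.
voice = [2.0, 3.0]
waves = [1.0, 2.0]
audio_file = [v + w for v, w in zip(voice, waves)]

second_a = extract_second_directly(audio_file, lambda samples: voice)
second_b = extract_second_by_filtering(audio_file, lambda samples: waves)
```

The direct route needs a model trained for the second audio, whereas the filtering route can reuse a model trained for the first audio at the cost of an extra subtraction step.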
需要说明的是,在音频播放装置提取音频文件中的目标音频之后,可以保存目标音频,还可以在后续的拍摄中利用目标音频进行音频合成,已达到更好的拍摄效果。It should be noted that after the audio playback device extracts the target audio in the audio file, the target audio can be saved, and the target audio can be used for audio synthesis in subsequent shooting, which has achieved a better shooting effect.
本申请实施例提供的音频播放方法,音频播放装置在播放多媒体文件中的音频文件情况下,可以确定出目标噪声特征。然后,音频播放装置可以根据该目标噪声特征,确定音频文件中的第一音频。接着,音频播放装置可以根据该第一音频,提取音频文件中的目标音频,播放目标音频;其中,该目标音频为第一音频或为第二音频,该第二音频为音频文件中除第一音频之外的音频。通过上述方案,首先,在音频播放装置播放该音频文件情况下,音频播放装置可以确定音频文件对应的目标噪声特征。当音频播放装置确定了目标噪声特征,可以根据该目标噪声特征,准确地确定出音频文件中的第一音频。其次,音频播放装置可以根据第一音频,提取音频文件中的目标音频,由于确定第一音频的准确率的提高,使得提取音频文件中的目标音频的准确性也得到提高,从而可以使得音频播放装置准确地抑制音频文件中的噪声,进而得到用户需求的音频文件的播放效果。如此,达到了提高音频文件的播放效果的目的。In the audio playback method provided by the embodiment of the present application, the audio playback device can determine the target noise feature in the case of playing the audio file in the multimedia file. Then, the audio playback device may determine the first audio in the audio file according to the target noise feature. Then, the audio playback device can extract the target audio in the audio file according to the first audio, and play the target audio; wherein, the target audio is the first audio or the second audio, and the second audio is the first audio in the audio file except the first audio. Audio other than audio. Through the above solution, first, in the case that the audio playback device plays the audio file, the audio playback device can determine the target noise feature corresponding to the audio file. When the audio playback device determines the target noise feature, it can accurately determine the first audio in the audio file according to the target noise feature. Secondly, the audio playback device can extract the target audio in the audio file according to the first audio. Due to the improvement of the accuracy rate of determining the first audio, the accuracy of extracting the target audio in the audio file is also improved, so that the audio playback can be made. The device accurately suppresses the noise in the audio file, thereby obtaining the playback effect of the audio file required by the user. In this way, the purpose of improving the playback effect of the audio file is achieved.
Optionally, in the embodiments of the present application, the audio playback apparatus may correspond to multiple noise features, and the user selects the noise feature that the apparatus is to determine.
For example, step 201 above may specifically include the following steps 201a to 201d:
Step 201a: While playing the audio file in the multimedia file, the audio playback apparatus receives a first input from the user.
The first input may be a click input by the user on the screen of the audio playback apparatus, a voice instruction input by the user, or a specific gesture input by the user. This may be determined according to actual use requirements and is not limited in the embodiments of the present application.
The specific gesture in the embodiments of the present application may be any one of a single-click gesture, a slide gesture, a drag gesture, a pressure recognition gesture, a long-press gesture, an area change gesture, a double-press gesture, and a double-click gesture. The click input in the embodiments of the present application may be a single-click input, a double-click input, a click input of any number of clicks, or the like, and may also be a long-press input or a short-press input.
In one example, the click input by the user on the screen of the audio playback apparatus may specifically be a click input by the user on a target control on the screen.
It should be noted that the target control may be an existing control or a newly added control, which is not limited in the embodiments of the present application. The target control may include at least one of the following: a hardware key (also called a physical key or mechanical key) and a virtual key.
Specifically, the first input may be an input in which the user presses the target control for a first preset duration.
For example, the first input may be the user pressing the power key, a volume key, a newly added artificial intelligence (AI) key, or the like. Specifically, the first input may be the user pressing the power key and the AI key for 3 seconds, or the user pressing the volume key "+" and the volume key "-" once each.
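The duration-based first input described above could be checked as follows. This is a sketch only: the function name, the key labels, and the default 3-second threshold (taken from the example above) are illustrative assumptions, not part of the application.

```python
from typing import FrozenSet, Iterable


def is_first_input(pressed_keys: Iterable[str],
                   held_seconds: float,
                   required_keys: FrozenSet[str] = frozenset({"power", "ai"}),
                   first_preset_duration: float = 3.0) -> bool:
    """True when all required keys are held for at least the first preset duration."""
    return required_keys <= set(pressed_keys) and held_seconds >= first_preset_duration
```

For instance, `is_first_input({"power", "ai"}, 3.0)` is accepted, while holding only the power key or releasing after 2 seconds is not.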
Step 201b: In response to the first input, the audio playback apparatus displays M options.
The M options correspond to M noise reduction models, and each noise reduction model has a different noise feature.
For example, the audio playback apparatus may display the M options in the playback interface of the multimedia file, or may display the M options in a first interface, which is not limited in the embodiments of the present application. The first interface may be an existing interface or a newly added interface, which is not limited in the embodiments of the present application.
For example, the first interface may be a window or a menu bar, which is not limited in the embodiments of the present application.
For example, the M noise reduction models may correspond to the same function or to different functions, which is not limited in the embodiments of the present application.
For example, the functions corresponding to the M noise reduction models are all vocal enhancement; or the functions corresponding to the M noise reduction models are all vocal suppression; or the function corresponding to at least one of the M noise reduction models is vocal enhancement, and the function corresponding to the remaining noise reduction models is vocal suppression.
Step 201c: The audio playback apparatus receives a second input from the user on a target option among the M options.
For example, the second input may be a click input by the user on the target option, a voice instruction input by the user, or a specific gesture input by the user. This may be determined according to actual use requirements and is not limited in the embodiments of the present application.
Step 201d: In response to the second input, the audio playback apparatus determines a target noise reduction model corresponding to the target option, and determines the target noise feature according to the target noise reduction model.
For example, there is a one-to-one correspondence between the noise reduction models and the noise features. After determining the target noise reduction model, the audio playback apparatus may determine the target noise feature through this correspondence.
It should be noted that the correspondence may be preset by the system or set by the user, which is not limited in the embodiments of the present application.
For example, assume that the audio playback apparatus is a mobile phone and that video 1 was shot at the seaside and contains user A's singing and the sound of waves. As shown in FIG. 2, while the mobile phone displays the playback interface 31 of video 1 and plays video 1, if the phone user wants audio that contains only user A's singing, the user may first click the "AI noise reduction" control 32 (that is, the first input). Then, as shown in FIG. 3, the mobile phone displays a window 33 (that is, the first interface) superimposed on the playback interface 31. The window 33 shows 5 options corresponding to 5 shooting scenes: an "outdoor" option, a "concert" option, a "seaside" option 34, a "bus" option, and a "home" option. The user may then click the "seaside" option 34 (that is, the second input). Next, the mobile phone determines AI noise reduction model A, whose corresponding target noise feature is the sound of waves. The mobile phone may then determine, according to the sound of waves, that the first audio is user A's singing, and extract user A's singing according to AI noise reduction model A.
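Steps 201b to 201d can be sketched as a simple lookup from option to noise reduction model to noise feature. This is only an illustration under stated assumptions: the class and function names are hypothetical, and only the "seaside" entry (model A, wave sound) comes from the example in this description; the other four noise features are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NoiseReductionModel:
    name: str
    noise_feature: str  # one-to-one with the model


# Only "seaside" reflects the example in the text; the rest are placeholder values.
OPTION_TO_MODEL: Dict[str, NoiseReductionModel] = {
    "outdoor": NoiseReductionModel("model_outdoor", "wind"),
    "concert": NoiseReductionModel("model_concert", "crowd"),
    "seaside": NoiseReductionModel("model_A", "wave sound"),
    "bus": NoiseReductionModel("model_bus", "engine"),
    "home": NoiseReductionModel("model_home", "appliance hum"),
}


def display_options() -> List[str]:
    """Step 201b: the M options shown in response to the first input."""
    return list(OPTION_TO_MODEL)


def determine_target_noise_feature(target_option: str) -> str:
    """Step 201d: option -> target noise reduction model -> target noise feature."""
    target_model = OPTION_TO_MODEL[target_option]
    return target_model.noise_feature
```

Because the correspondence may be preset by the system or set by the user, a plain dictionary is used here so that entries can be replaced at run time.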
It should be noted that, when a specific application is started, the audio playback apparatus may automatically display the "AI noise reduction" control so that the user can choose whether to enable the noise reduction function. Once the "AI noise reduction" control has been displayed for a second preset duration (for example, 3 seconds), the apparatus may stop displaying it. When the user needs the apparatus to display the "AI noise reduction" control again, the user may trigger this through a slide input on the screen of the apparatus.
The specific application may include a "camera" application, a "recording" application, a chat application with a video shooting function, a shopping application with a video playback function, or the like. The slide input by the user on the screen may be a leftward, rightward, upward, or downward slide on the screen, which is not limited in the embodiments of the present application.
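The show-then-hide behavior of the "AI noise reduction" control can be sketched with a small state holder. The class name and its methods are hypothetical, and timestamps are passed in explicitly so the logic does not depend on any particular UI toolkit; the 3-second default follows the example above.

```python
from typing import Optional


class AutoHideControl:
    """Control that stays visible for a preset duration after being shown."""

    def __init__(self, second_preset_duration: float = 3.0) -> None:
        self.second_preset_duration = second_preset_duration
        self._shown_at: Optional[float] = None

    def show(self, now: float) -> None:
        # Called when the specific application starts.
        self._shown_at = now

    def on_slide_input(self, now: float) -> None:
        # A slide input on the screen triggers the control to be displayed again.
        self.show(now)

    def is_visible(self, now: float) -> bool:
        if self._shown_at is None:
            return False
        return (now - self._shown_at) < self.second_preset_duration
```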
The audio playback method provided by the embodiments of the present application can be applied to scenarios in which the noise feature is determined by selecting an option. When the user wants to improve the playback effect of an audio file, the user may, while the audio playback apparatus is playing the audio file, trigger the apparatus to display the M options and select the noise reduction model that matches the user's needs. The apparatus can then determine the target noise feature according to the target noise reduction model. This not only improves the accuracy with which the apparatus suppresses noise in the audio file, but also makes audio noise reduction on the apparatus more flexible.
Further, in the embodiments of the present application, where the M options correspond to the same function, the audio playback apparatus may display at least two function options before displaying the M options, so that the user first selects the desired function and then selects the corresponding noise reduction model.
In one example, step 201b above may specifically include the following steps 201b1 to 201b3:
Step 201b1: In response to the first input, the audio playback apparatus displays N function options.
Each function option corresponds to a different function, and N is a positive integer.
Step 201b2: The audio playback apparatus receives a third input from the user on a target function option among the N function options.
For example, the third input may be a click input by the user on the target function option, a voice instruction input by the user, or a specific gesture input by the user. This may be determined according to actual use requirements and is not limited in the embodiments of the present application.
Step 201b3: In response to the third input, the audio playback apparatus displays the M options.
Example 7: With reference to FIG. 2, after the phone user clicks the "AI noise reduction" control 32 (that is, the first input), the phone screen displays 2 function options as shown in FIG. 4: a "vocal enhancement" function option 41 and a "vocal suppression" function option 42. If the phone user wants audio that contains only user A's singing, the user may click the "vocal enhancement" function option 41. Then, as shown in FIG. 3, the mobile phone displays in the window 33 the 5 options corresponding to 5 shooting scenes: the "outdoor" option, the "concert" option, the "seaside" option 34, the "bus" option, and the "home" option.
Example 8: With reference to FIG. 2, after the phone user clicks the "AI noise reduction" control 32 (that is, the first input), the phone screen displays 2 function options as shown in FIG. 4: a "vocal enhancement" function option 41 and a "vocal suppression" function option 42. If the phone user wants audio that contains only the sound of waves, the user may click the "vocal suppression" function option 42. Then, as shown in FIG. 5, the mobile phone displays in a window 51 the 3 options corresponding to 3 noise reduction degrees: a "high" option 52, a "medium" option, and a "low" option 53. If the phone user wants to suppress user A's singing slightly, the user may click the "low" option 53. If the phone user wants to suppress user A's singing completely, the user may click the "high" option 52, and the mobile phone retains only the sound of waves in video 1.
The audio playback method provided by the embodiments of the present application can be applied to scenarios in which the function is determined by selecting a function option. The audio playback apparatus may display N function options so that the user selects the required function, and the user then selects the noise reduction model from the M options corresponding to that function. This not only improves the accuracy with which the user selects a noise reduction model, but also makes audio noise reduction on the apparatus more flexible.
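The two-stage selection in steps 201b1 to 201b3 can be sketched as a nested lookup. The option labels follow Examples 7 and 8; the dictionary and function names are hypothetical, and a real implementation would render these lists in a window or menu rather than return them.

```python
from typing import Dict, List

# Function option -> the M options displayed after that function is chosen.
FUNCTION_TO_OPTIONS: Dict[str, List[str]] = {
    "vocal enhancement": ["outdoor", "concert", "seaside", "bus", "home"],
    "vocal suppression": ["high", "medium", "low"],
}


def on_first_input() -> List[str]:
    """Step 201b1: display the N function options."""
    return list(FUNCTION_TO_OPTIONS)


def on_third_input(target_function_option: str) -> List[str]:
    """Step 201b3: display the M options for the chosen function."""
    return list(FUNCTION_TO_OPTIONS[target_function_option])
```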
Optionally, in the embodiments of the present application, where the target noise feature is a noise feature corresponding to a noise reduction degree, the audio playback apparatus may extract the target audio according to the determined noise reduction degree.
In one example, extracting the target audio from the audio file according to the first audio in step 203 above may specifically include the following step 203a:
Step 203a: The audio playback apparatus filters out part or all of the first audio from the audio file to obtain the second audio in the audio file.
Example 1: Where the first audio is a target human voice, the audio playback apparatus filters out part or all of the target human voice from the audio file to obtain the second audio in the audio file, that is, performs vocal suppression on the audio file.
Example 2: Where the first audio is ambient sound, the audio playback apparatus filters out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, that is, performs vocal enhancement on the audio file.
In another example, the audio playback apparatus extracts all or part of the first audio from the audio file.
The audio playback method in the embodiments of the present application can be applied to scenarios in which target audio with different noise reduction degrees is obtained. The audio playback apparatus can flexibly obtain target audio with different noise reduction degrees according to the user's needs, which makes audio playback more flexible.
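Filtering out "part or all" of the first audio can be sketched as scaled subtraction. This assumes, for illustration only, that the audio file is an additive mixture and that an estimate of the first audio is already available as samples; the mapping from noise reduction degree to a removal fraction is a hypothetical choice, since the application does not give numeric values.

```python
from typing import List

Samples = List[float]

# Hypothetical mapping from noise reduction degree to the fraction of first audio removed.
DEGREE_TO_FRACTION = {"high": 1.0, "medium": 0.5, "low": 0.25}


def filter_first_audio(mixture: Samples, first_audio: Samples, degree: str) -> Samples:
    """Step 203a: remove part or all of the first audio from the audio file."""
    fraction = DEGREE_TO_FRACTION[degree]
    return [m - fraction * f for m, f in zip(mixture, first_audio)]


def extract_first_audio(first_audio: Samples, fraction: float = 1.0) -> Samples:
    """Alternative example: keep all or part of the first audio instead."""
    return [fraction * f for f in first_audio]
```

With "high", the first audio is removed completely and only the second audio remains; with "low", most of the first audio is still audible.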
It should be noted that the audio playback method provided by the embodiments of the present application may be executed by an audio playback apparatus, or by a control module in the audio playback apparatus for executing the audio playback method. The embodiments of the present application take an audio playback apparatus executing the audio playback method as an example to describe the audio playback apparatus provided by the embodiments of the present application.
FIG. 6 is a schematic diagram of a possible structure of an audio playback apparatus provided by the embodiments of the present application. As shown in FIG. 6, the audio playback apparatus 600 includes a determination module 601, a noise reduction module 602, and a playback module 603. The determination module 601 is configured to determine a target noise feature while an audio file in a multimedia file is being played. The noise reduction module 602 is configured to determine first audio in the audio file according to the target noise feature determined by the determination module 601, and to extract target audio from the audio file according to the first audio. The playback module 603 is configured to play the target audio extracted by the noise reduction module 602. The target audio is the first audio or second audio, and the second audio is the audio in the audio file other than the first audio.
Optionally, as shown in FIG. 6, the audio playback apparatus 600 further includes a receiving module 604 and a display module 605. The receiving module 604 is configured to receive a first input from the user while the audio file in the multimedia file is being played. The display module 605 is configured to display M options in response to the first input received by the receiving module 604, where the M options correspond to M noise reduction models, each noise reduction model has a different noise feature, and M is a positive integer. The receiving module 604 is further configured to receive a second input from the user on a target option among the M options. The determination module 601 is specifically configured to, in response to the second input received by the receiving module 604, determine a target noise reduction model corresponding to the target option, and determine the target noise feature according to the target noise reduction model.
Optionally, the noise reduction module 602 is specifically configured to filter out part or all of the first audio from the audio file to obtain the second audio in the audio file.
Optionally, the target noise feature is a noise feature corresponding to a shooting scene, or a noise feature corresponding to a noise reduction degree.
Optionally, the noise reduction module 602 is specifically configured to: where the first audio is a target human voice, filter out part or all of the target human voice from the audio file to obtain the second audio in the audio file, thereby performing vocal suppression on the audio file; or, where the first audio is ambient sound, filter out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, thereby performing vocal enhancement on the audio file.
It should be noted that, as shown in FIG. 6, modules that the audio playback apparatus 600 must include are drawn with solid boxes, such as the determination module 601, and modules that the audio playback apparatus 600 may or may not include are drawn with dashed boxes, such as the display module 605.
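The module structure of FIG. 6 can be sketched as a set of small Python classes, with the required modules as mandatory constructor arguments and the optional modules defaulting to `None`. All class and method names are hypothetical, and the noise reduction logic assumes the audio file is available as labelled components, which the application does not specify.

```python
from typing import Dict, List, Optional

Samples = List[float]


class DeterminationModule:  # module 601
    def __init__(self, target_noise_feature: str) -> None:
        self.target_noise_feature = target_noise_feature

    def determine(self) -> str:
        return self.target_noise_feature


class NoiseReductionModule:  # module 602
    def extract(self, components: Dict[str, Samples], feature: str,
                keep_first: bool) -> Samples:
        first = components[feature]  # first audio (assumed label lookup)
        if keep_first:
            return list(first)
        # Second audio: sample-wise sum of all components except the first audio.
        others = [s for label, s in components.items() if label != feature]
        return [sum(t[i] for t in others) for i in range(len(first))]


class PlaybackModule:  # module 603
    def __init__(self) -> None:
        self.played: List[Samples] = []

    def play(self, samples: Samples) -> None:
        self.played.append(samples)  # stand-in for real audio output


class AudioPlaybackApparatus:  # apparatus 600
    def __init__(self, determination: DeterminationModule,
                 noise_reduction: NoiseReductionModule,
                 playback: PlaybackModule,
                 receiving: Optional[object] = None,   # optional module 604
                 display: Optional[object] = None) -> None:  # optional module 605
        self.determination = determination
        self.noise_reduction = noise_reduction
        self.playback = playback
        self.receiving = receiving
        self.display = display

    def play_audio_file(self, components: Dict[str, Samples],
                        keep_first: bool = False) -> Samples:
        feature = self.determination.determine()
        target = self.noise_reduction.extract(components, feature, keep_first)
        self.playback.play(target)
        return target
```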
With the audio playback apparatus provided by the embodiments of the present application, the apparatus may determine a target noise feature while playing an audio file in a multimedia file. The apparatus may then determine first audio in the audio file according to the target noise feature, extract target audio from the audio file according to the first audio, and play the target audio, where the target audio is the first audio or second audio, and the second audio is the audio in the audio file other than the first audio. With this solution, first, while playing the audio file, the apparatus can determine the target noise feature corresponding to the audio file and, based on that feature, accurately determine the first audio in the audio file. Second, the apparatus can extract the target audio from the audio file according to the first audio. Because the first audio is determined more accurately, the target audio is also extracted more accurately, so that the apparatus can accurately suppress the noise in the audio file and obtain the playback effect the user requires. In this way, the playback effect of the audio file is improved.
The audio playback apparatus in the embodiments of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal. The apparatus may be a mobile electronic device or a non-mobile electronic device. For example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or the like. The non-mobile electronic device may be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like. This is not specifically limited in the embodiments of the present application.
The audio playback apparatus in the embodiments of the present application may be an apparatus with an operating system. The operating system may be the Android operating system, the iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The audio playback apparatus provided by the embodiments of the present application can implement each process implemented by the method embodiments of FIG. 1 to FIG. 5. To avoid repetition, details are not described here again.
Optionally, as shown in FIG. 7, the embodiments of the present application further provide an electronic device 700, including a processor 701, a memory 702, and a program or instructions stored in the memory 702 and executable on the processor 701. When executed by the processor 701, the program or instructions implement each process of the above audio playback method embodiments and can achieve the same technical effect. To avoid repetition, details are not described here again.
It should be noted that the electronic device in the embodiments of the present application includes the mobile electronic devices and non-mobile electronic devices described above.
FIG. 8 is a schematic diagram of the hardware structure of an electronic device implementing the embodiments of the present application.
The electronic device 100 includes, but is not limited to, components such as a radio frequency unit 101, a network module 102, an audio output unit 103, an input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, and a processor 110.
Those skilled in the art will understand that the electronic device 100 may further include a power supply (such as a battery) that supplies power to the components. The power supply may be logically connected to the processor 110 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The structure of the electronic device shown in FIG. 8 does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown, combine some components, or use a different arrangement of components, which is not described in detail here.
The processor 110 is configured to determine a target noise feature while an audio file in a multimedia file is being played, determine first audio in the audio file according to the target noise feature, and extract target audio from the audio file according to the first audio. The audio output unit 103 is configured to play the target audio extracted by the processor 110. The target audio is the first audio or second audio, and the second audio is the audio in the audio file other than the first audio.
Optionally, the user input unit 107 is configured to receive a first input from the user while the audio file in the multimedia file is being played. The display unit 106 is configured to display M options in response to the first input received by the user input unit 107, where the M options correspond to M noise reduction models, each noise reduction model has a different noise feature, and M is a positive integer. The user input unit 107 is further configured to receive a second input from the user on a target option among the M options. The processor 110 is specifically configured to, in response to the second input received by the user input unit 107, determine a target noise reduction model corresponding to the target option, where the target noise reduction model corresponds to the target noise feature.
Optionally, the processor 110 is specifically configured to filter out part or all of the first audio from the audio file to obtain the second audio in the audio file.
Optionally, the target noise feature is a noise feature corresponding to a shooting scene, or a noise feature corresponding to a noise reduction degree.
Optionally, the processor 110 is specifically configured to: where the first audio is a target human voice, filter out part or all of the target human voice from the audio file to obtain the second audio in the audio file, thereby performing vocal suppression on the audio file; or, where the first audio is ambient sound, filter out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, thereby performing vocal enhancement on the audio file.
With the electronic device provided by the embodiments of the present application, the electronic device may determine a target noise feature while playing an audio file in a multimedia file. The electronic device may then determine first audio in the audio file according to the target noise feature, extract target audio from the audio file according to the first audio, and play the target audio, where the target audio is the first audio or second audio, and the second audio is the audio in the audio file other than the first audio. With this solution, first, while playing the audio file, the electronic device can determine the target noise feature corresponding to the audio file and, based on that feature, accurately determine the first audio in the audio file. Second, the electronic device can extract the target audio from the audio file according to the first audio. Because the first audio is determined more accurately, the target audio is also extracted more accurately, so that the electronic device can accurately suppress the noise in the audio file and obtain the playback effect the user requires. In this way, the playback effect of the audio file is improved.
It should be understood that, in the embodiments of the present application, the input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042. The graphics processing unit 1041 processes image data of still pictures or video obtained by an image capture apparatus (such as a camera) in a video capture mode or an image capture mode. The display unit 106 may include a display panel 1061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 107 includes a touch panel 1071 and other input devices 1072. The touch panel 1071 is also called a touch screen and may include two parts: a touch detection apparatus and a touch controller. The other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick, which are not described in detail here. The memory 109 may be configured to store software programs and various data, including but not limited to application programs and an operating system. The processor 110 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 110.
An embodiment of the present application further provides a readable storage medium storing a program or instructions. When executed by a processor, the program or instructions implement each process of the foregoing audio playback method embodiments and can achieve the same technical effect. To avoid repetition, details are not described here again.
The processor is the processor in the electronic device described in the foregoing embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface coupled to the processor, and the processor is configured to run a program or instructions to implement each process of the foregoing audio playback method embodiments and can achieve the same technical effect. To avoid repetition, details are not described here again.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element. In addition, it should be noted that the scope of the methods and apparatuses in the implementations of the present application is not limited to performing functions in the order shown or discussed; depending on the functions involved, the functions may also be performed substantially simultaneously or in the reverse order. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. Furthermore, features described with reference to some examples may be combined in other examples.
Based on the above description of the implementations, those skilled in the art can clearly understand that the methods of the foregoing embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Inspired by the present application, those of ordinary skill in the art may devise many other forms without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.

Claims (14)

  1. An audio playback method, comprising:
    determining a target noise feature when an audio file in a multimedia file is played;
    determining first audio in the audio file according to the target noise feature; and
    extracting target audio from the audio file according to the first audio, and playing the target audio;
    wherein the target audio is the first audio or second audio, and the second audio is audio in the audio file other than the first audio.
  2. The method according to claim 1, wherein the determining a target noise feature when an audio file in a multimedia file is played comprises:
    receiving a first input from a user when the audio file in the multimedia file is played;
    displaying M options in response to the first input, wherein the M options correspond to M noise reduction models, each noise reduction model has a different noise feature, and M is a positive integer;
    receiving a second input from the user on a target option among the M options; and
    in response to the second input, determining a target noise reduction model corresponding to the target option, and determining the target noise feature according to the target noise reduction model.
  3. The method according to claim 1, wherein the extracting target audio from the audio file according to the first audio comprises:
    filtering out part or all of the first audio from the audio file to obtain the second audio in the audio file.
  4. The method according to any one of claims 1 to 3, wherein the target noise feature is a noise feature corresponding to a shooting scene, or a noise feature corresponding to a degree of noise reduction.
  5. The method according to any one of claims 1 to 3, wherein the extracting target audio from the audio file according to the first audio comprises:
    when the first audio is a target human voice, filtering out part or all of the target human voice from the audio file to obtain the second audio in the audio file, thereby performing human voice suppression on the audio file; or
    when the first audio is ambient sound, filtering out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, thereby performing human voice enhancement on the audio file.
  6. An audio playback apparatus, comprising a determining module, a noise reduction module, and a playback module, wherein:
    the determining module is configured to determine a target noise feature when an audio file in a multimedia file is played;
    the noise reduction module is configured to determine first audio in the audio file according to the target noise feature determined by the determining module, and to extract target audio from the audio file according to the first audio;
    the playback module is configured to play the target audio extracted by the noise reduction module; and
    the target audio is the first audio or second audio, and the second audio is audio in the audio file other than the first audio.
  7. The audio playback apparatus according to claim 6, further comprising a receiving module and a display module, wherein:
    the receiving module is configured to receive a first input from a user when the audio file in the multimedia file is played;
    the display module is configured to display M options in response to the first input received by the receiving module, wherein the M options correspond to M noise reduction models, each noise reduction model has a different noise feature, and M is a positive integer;
    the receiving module is further configured to receive a second input from the user on a target option among the M options; and
    the determining module is specifically configured to, in response to the second input received by the receiving module, determine a target noise reduction model corresponding to the target option, and determine the target noise feature according to the target noise reduction model.
  8. The audio playback apparatus according to claim 6, wherein the noise reduction module is specifically configured to filter out part or all of the first audio from the audio file to obtain the second audio in the audio file.
  9. The audio playback apparatus according to any one of claims 6 to 8, wherein the target noise feature is a noise feature corresponding to a shooting scene, or a noise feature corresponding to a degree of noise reduction.
  10. The audio playback apparatus according to any one of claims 6 to 8, wherein the noise reduction module is specifically configured to:
    when the first audio is a target human voice, filter out part or all of the target human voice from the audio file to obtain the second audio in the audio file, thereby performing human voice suppression on the audio file; or
    when the first audio is ambient sound, filter out part or all of the ambient sound from the audio file to obtain the second audio in the audio file, thereby performing human voice enhancement on the audio file.
  11. An electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the audio playback method according to any one of claims 1 to 5.
  12. A readable storage medium storing a program or instructions, wherein the program or instructions, when executed by a processor, implement the steps of the audio playback method according to any one of claims 1 to 5.
  13. A computer program product, executed by at least one processor to implement the audio playback method according to any one of claims 1 to 5.
  14. An electronic device, configured to perform the audio playback method according to any one of claims 1 to 5.
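
As an informal illustration of the option-selection flow in claims 2 and 7 and the two filtering outcomes in claims 5 and 10, the following sketch maps M displayed options to M noise reduction models and returns the noise feature of the model the user selects. It is an assumption-laden toy and not part of the claims: the option labels, model names, feature strings, and function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NoiseReductionModel:
    """Hypothetical model record; each model carries a different noise feature."""
    name: str
    noise_feature: str


def build_options() -> Dict[str, NoiseReductionModel]:
    """M options mapped to M noise reduction models (labels are made up)."""
    return {
        "Street": NoiseReductionModel("street_model", "traffic noise"),
        "Concert": NoiseReductionModel("concert_model", "crowd noise"),
        "Strong": NoiseReductionModel("strong_model", "all ambient sound"),
    }


def display_options(options: Dict[str, NoiseReductionModel]) -> List[str]:
    """Response to the first input: list the option labels to show the user."""
    return list(options)


def on_second_input(options: Dict[str, NoiseReductionModel], target_option: str) -> str:
    """Response to the second input: pick the target model, return its noise feature."""
    target_model = options[target_option]
    return target_model.noise_feature


def playback_effect(first_audio_kind: str) -> str:
    """Filtering out the first audio suppresses or enhances the voice, depending on its kind."""
    if first_audio_kind == "target_voice":
        return "voice_suppression"
    if first_audio_kind == "ambient_sound":
        return "voice_enhancement"
    raise ValueError(f"unknown first audio kind: {first_audio_kind}")


options = build_options()
labels = display_options(options)
target_noise_feature = on_second_input(options, "Concert")
```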
PCT/CN2021/108757 2020-07-30 2021-07-27 Audio playback method, audio playback apparatus, and electronic device WO2022022536A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010749736.4 2020-07-30
CN202010749736.4A CN111986689A (en) 2020-07-30 2020-07-30 Audio playing method, audio playing device and electronic equipment

Publications (1)

Publication Number Publication Date
WO2022022536A1 true WO2022022536A1 (en) 2022-02-03

Family

ID=73444727

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108757 WO2022022536A1 (en) 2020-07-30 2021-07-27 Audio playback method, audio playback apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN111986689A (en)
WO (1) WO2022022536A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115209219A (en) * 2022-07-19 2022-10-18 深圳市艾酷通信软件有限公司 Video processing method and device and electronic equipment

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111986689A (en) * 2020-07-30 2020-11-24 维沃移动通信有限公司 Audio playing method, audio playing device and electronic equipment
CN113096686B (en) * 2021-03-29 2023-04-14 维沃移动通信有限公司 Audio processing method and device, electronic equipment and storage medium
CN113450755A (en) * 2021-04-30 2021-09-28 青岛海尔科技有限公司 Method, device, storage medium and electronic device for reducing noise
CN113284500B (en) * 2021-05-19 2024-02-06 Oppo广东移动通信有限公司 Audio processing method, device, electronic equipment and storage medium
CN113450752A (en) * 2021-06-28 2021-09-28 青岛海尔科技有限公司 Noise reduction method, noise reduction device, computer-readable storage medium and electronic device
CN114038488B (en) * 2021-11-02 2023-07-18 维沃移动通信有限公司 Method and device for acquiring and playing multimedia information and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102859592A (en) * 2010-06-04 2013-01-02 苹果公司 User-specific noise suppression for voice quality improvements
US9723421B2 (en) * 2014-03-04 2017-08-01 Samsung Electronics Co., Ltd Electronic device and method for controlling video function and call function therefor
CN107077859A (en) * 2014-10-31 2017-08-18 英特尔公司 The complexity based on environment for audio frequency process reduces
CN109584897A (en) * 2018-12-28 2019-04-05 努比亚技术有限公司 Vedio noise reduction method, mobile terminal and computer readable storage medium
US20200075034A1 (en) * 2017-07-03 2020-03-05 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Method and system for enhancing a speech signal of a human speaker in a video using visual information
WO2020148109A1 (en) * 2019-01-15 2020-07-23 Nokia Technologies Oy Audio processing
CN111540370A (en) * 2020-04-21 2020-08-14 闻泰通讯股份有限公司 Audio processing method and device, computer equipment and computer readable storage medium
CN111986689A (en) * 2020-07-30 2020-11-24 维沃移动通信有限公司 Audio playing method, audio playing device and electronic equipment
CN113129917A (en) * 2020-01-15 2021-07-16 荣耀终端有限公司 Speech processing method based on scene recognition, and apparatus, medium, and system thereof

Also Published As

Publication number Publication date
CN111986689A (en) 2020-11-24

Similar Documents

Publication Publication Date Title
WO2022022536A1 (en) Audio playback method, audio playback apparatus, and electronic device
US11030987B2 (en) Method for selecting background music and capturing video, device, terminal apparatus, and medium
CN104967900B (en) A kind of method and apparatus generating video
CN106575361B (en) Method for providing visual sound image and electronic equipment for implementing the method
CN109379641A (en) A kind of method for generating captions and device
CN111177453B (en) Method, apparatus, device and computer readable storage medium for controlling audio playing
US11511200B2 (en) Game playing method and system based on a multimedia file
WO2022161267A1 (en) Video recording method and apparatus
CN104866275B (en) Method and device for acquiring image information
WO2022156709A1 (en) Audio signal processing method and apparatus, electronic device and readable storage medium
CN111883091A (en) Audio noise reduction method and training method of audio noise reduction model
WO2022228377A1 (en) Sound recording method and apparatus, and electronic device and readable storage medium
CN113177134A (en) Music playing method and device, electronic equipment and storage medium
WO2022111458A1 (en) Image capture method and apparatus, electronic device, and storage medium
WO2022068721A1 (en) Screen capture method and apparatus, and electronic device
WO2020057241A1 (en) Method and apparatus for displaying application program, and terminal device
WO2023246823A1 (en) Video playing method, apparatus and device, and storage medium
WO2021169092A1 (en) Information display control method and apparatus, electronic device and storage medium
CN112309449A (en) Audio recording method and device
CN112380362A (en) Music playing method, device and equipment based on user interaction and storage medium
JP7331044B2 (en) Information processing method, device, system, electronic device, storage medium and computer program
CN112887782A (en) Image output method and device and electronic equipment
KR20230120668A (en) Video call method and device
CN114125149A (en) Video playing method, device, system, electronic equipment and storage medium
CN112261470A (en) Audio processing method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21849135

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21849135

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.08.2023)