CN111508531A - Audio processing method and device - Google Patents

Info

Publication number
CN111508531A
Authority
CN
China
Prior art keywords
audio
audio signal
blank
time
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010327785.9A
Other languages
Chinese (zh)
Other versions
CN111508531B (en)
Inventor
肖国坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN202010327785.9A priority Critical patent/CN111508531B/en
Publication of CN111508531A publication Critical patent/CN111508531A/en
Application granted granted Critical
Publication of CN111508531B publication Critical patent/CN111508531B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02165Two microphones, one receiving mainly the noise signal and the other one mainly the speech signal
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • G10L2025/783Detection of presence or absence of voice signals based on threshold decision

Abstract

The embodiment of the application provides an audio processing method, which is applied to the technical field of mobile communication. The method comprises: collecting first audio signals through two microphones, respectively; performing noise reduction processing on the first audio signals to generate a second audio signal; acquiring blank audio segments in the second audio signal; and deleting at least some of the blank audio segments in the second audio signal to obtain the target audio. Because ambient noise and blank audio segments are removed from the collected first audio signals to obtain the target audio, the user is not interrupted by useless audio information while listening to the audio.

Description

Audio processing method and device
Technical Field
The embodiment of the application relates to the technical field of mobile communication, in particular to an audio processing method and device.
Background
With the rapid development of mobile communication technology, more and more electronic devices have a recording function, and users can conveniently record and listen to audio. However, in the audio recording process, from the beginning of recording to the end of recording, the electronic equipment inevitably records a lot of useless audio information. The useless information not only occupies the memory of the electronic equipment, but also reduces the information receiving efficiency of the user.
Summary of the Application
The application provides an audio processing method that aims to solve the problems in the prior art that recorded audio contains useless information and that the user's efficiency in receiving information is therefore low.
To solve this technical problem, the present application is implemented as follows:
In a first aspect, an embodiment of the present application provides an audio processing method, including: collecting two corresponding first audio signals through two microphones; performing noise reduction processing on the first audio signals to generate a second audio signal; acquiring blank audio segments in the second audio signal; and deleting at least some of the blank audio segments in the second audio signal to obtain the target audio.
In a second aspect, an embodiment of the present application provides an audio processing apparatus, including a collecting module, configured to collect two corresponding first audio signals through two microphones; the processing module is used for carrying out noise reduction processing on the first audio signal to generate a second audio signal; the acquisition module is used for acquiring a blank audio clip in the second audio signal; and the deleting module is used for deleting at least part of the blank audio clips in the second audio signal to obtain the target audio.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program stored on the memory and executable on the processor, where the program, when executed by the processor, implements the steps of the audio processing method.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the audio processing method described above.
In the embodiment of the application, two first audio signals are collected through two microphones and subjected to noise reduction processing to obtain a second audio signal from which ambient noise has been removed. The second audio signal is then analyzed to obtain at least one blank audio segment, and the blank audio segment is deleted from the second audio signal to obtain target audio free of ambient noise and blank segments. Because the target audio is obtained by removing the ambient noise and the blank audio segments from the collected first audio signals, the user is not interrupted by useless audio information while listening to the audio.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of an audio processing method according to an embodiment of the present application;
Fig. 2 is a signal flow of audio processing provided by an embodiment of the present application;
Fig. 3 is a signal flow of noise reduction processing provided by an embodiment of the present application;
Fig. 4 is a schematic diagram illustrating a noise reduction principle according to an embodiment of the present application;
Fig. 5 is a schematic diagram of an audio signal before and after denoising according to an embodiment of the present application;
Fig. 6 is a block diagram of an audio processing apparatus according to an embodiment of the present application;
Fig. 7 is a block diagram of an audio processing apparatus according to an embodiment of the present application;
Fig. 8 is a block diagram of an audio processing apparatus according to an embodiment of the present application;
Fig. 9 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As shown in fig. 1, an embodiment of the present application exemplarily provides an audio processing method, which includes the following steps:
Step 101: First audio signals are collected through two microphones, respectively.
In the embodiment of the application, a microphone (MIC) is used to collect sound information and can be arranged on electronic equipment such as a mobile phone, a tablet computer, a notebook computer, a recording pen, a music player, or an earphone. At least two microphones are used to collect the first audio signals. The first audio signals may be at least two signals, one per microphone, or a single overall first audio signal collected by the at least two microphones together; the noise-reduced audio is obtained by processing the first audio signals.
It should be noted that collecting first audio signals through more than two microphones necessarily includes collecting first audio signals through two microphones, and therefore also falls within the scope of the present application. Specifically, the number of microphones may be two, three, four, and so on, and two, three, or four first audio signals are collected correspondingly.
The microphone can collect digital audio signals and also can collect analog audio signals, and when the analog audio signals are collected, the collected signals can be converted into the digital audio signals through analog-to-digital conversion, so that the audio signals can be conveniently further processed subsequently, and high-quality audio files can be obtained. In order to simplify the structure of the electronic device and save the cost, the analog audio signal may also be directly processed, and the embodiment of the present application does not limit the specific form of the signal.
Because different microphones are arranged at different positions, the distance from a sound source at a given position to each microphone is different, and the audio characteristics of the resulting first audio signals therefore differ. Taking microphones on a mobile phone as an example, two microphones may be provided in the inner cavity of the phone at positions corresponding to the top and the bottom of the display screen, or on the two sides of the display screen. As another example, the microphones may be arranged on earphones: one microphone may be provided on each of the left and right earphones, or microphones may be provided on the inner and outer sides of the left earphone and on the inner and outer sides of the right earphone.
Step 102: Noise reduction processing is performed on the first audio signals to generate a second audio signal.
According to the embodiment of the application, noise reduction processing is performed on the first audio signals collected by the microphones to obtain a second audio signal from which ambient noise has been removed. In the specific processing, noise can be removed from the collected first audio signals by collecting an ambient noise signal separately, by obtaining an ambient noise sample according to the characteristics of the ambient noise signal, or by setting the ambient noise to a reasonable parameter value. The collected first audio signals may each be subjected to noise reduction processing and then combined to generate the second audio signal; alternatively, noise reduction processing may be performed directly on a single overall first audio signal, or a plurality of collected first audio signals may first be synthesized into one signal that is then subjected to noise reduction processing. The embodiment of the present application does not limit the specific manner of signal processing.
In an embodiment of the present application, the step 102 may further include the following sub-steps:
Sub-step A: Differential processing is performed on the first audio signals to obtain a denoised audio signal.
Because the microphones cannot occupy exactly the same position, sound emitted by the same sound source (e.g., a user) carries different physical information when it reaches different microphones; that is, the signal characteristic parameters of the collected first audio signals differ. As shown in Fig. 4, when the sound produced by the user propagates to MIC1 and MIC2, the sound information received by the two microphones differs: for each syllable uttered by the user, the amplitude and phase (time) of the sound received by MIC1 and MIC2 are different. For example, the sound received by MIC1 is delayed relative to that received by MIC2 and has a lower amplitude. By contrast, the source of the ambient noise is generally far from the microphones, so when the ambient noise reaches the microphones it is distributed almost uniformly around each of them, and the difference in the sound information that the ambient noise forms at each microphone is small. In other words, because the microphones are at different positions, the collected first audio signals differ, but the sound source to be recorded (such as the user's voice) and the ambient noise contribute to that difference unequally: the ambient noise contained in the different first audio signals differs far less than the sound emitted by the sound source to be recorded. Therefore, by performing differential processing on the different first audio signals, the ambient noise can be cancelled, while the useful sound from the sound source to be recorded, which differs more between the signals, is retained after the differential processing. As can be seen from the waveform diagrams of Fig. 5, the noise portion present in the middle of waveform (a) has been removed in waveform (b), which is the waveform after noise reduction processing.
Specifically, when there are only two microphones, the two first audio signals collected by the two microphones are subjected to differential processing to obtain a denoised audio signal (which may also be regarded as two identical denoised audio signals).
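The two-microphone case can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not taken from the application: both signals are equally long lists of samples, the ambient noise is exactly identical at the two microphones, and the voice differs only in amplitude (the delay between microphones is ignored). The function name `differential_denoise` and all sample values are hypothetical.

```python
def differential_denoise(mic_a, mic_b):
    """Subtract two equally long sample sequences, sample by sample."""
    if len(mic_a) != len(mic_b):
        raise ValueError("both signals must have the same number of samples")
    return [a - b for a, b in zip(mic_a, mic_b)]


# Hypothetical data: the voice is louder at MIC2 than at MIC1, while the
# ambient noise is assumed to be identical at both microphones.
voice = [0.0, 4.0, -4.0, 2.0]
noise = [1.0, -1.0, 1.0, -1.0]
mic1 = [0.5 * v + n for v, n in zip(voice, noise)]
mic2 = [1.0 * v + n for v, n in zip(voice, noise)]

# The common noise term cancels; a scaled copy of the voice remains.
denoised = differential_denoise(mic2, mic1)
print(denoised)
```

In this idealised example the result is a half-amplitude copy of the voice, which is consistent with the text's remark that the denoised signal is relatively small and is amplified afterwards.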
When a plurality of first audio signals are collected through a plurality of microphones, take four microphones collecting four first audio signals as an example. The four microphones can be arranged on the inner and outer sides of the left earphone and on the inner and outer sides of the right earphone. Because the microphones are at different positions, the amplitude and phase of the sound to be recorded differ between the corresponding first audio signals, whereas the ambient noise signals are essentially the same. Therefore, performing differential processing on any two of the four first audio signals removes the common ambient noise and retains the useful sound from the sound source to be recorded.
Optionally, only the two first audio signals collected by the two microphones that are farthest apart among the four may be subjected to differential processing to obtain one denoised audio signal. To obtain a better audio processing effect, optionally, several pairs of the four first audio signals may each be subjected to differential processing to obtain a plurality of denoised audio signals, which may include identical signals. For example, first audio signals collected at positions that are relatively far apart may be differenced: each microphone of the left earphone is farther from the outer microphone of the right earphone than from the inner microphone of the right earphone, so the first audio signals collected by the inner microphone of the left earphone and the outer microphone of the right earphone are differenced to obtain one denoised audio signal; similarly, the first audio signals collected by the outer microphone of the left earphone and the outer microphone of the right earphone are differenced to obtain another denoised audio signal. Optionally, pairs of the four first audio signals are subjected to differential processing to obtain four denoised audio signals, which may include identical signals; for example, differencing the signals of the inner microphone of the left earphone and the outer microphone of the right earphone may be regarded as yielding two identical denoised audio signals. Compared with a single denoised audio signal, a plurality of denoised audio signals retains more sound information and produces a more stereophonic sound effect. After the plurality of denoised audio signals have each been amplified and had their blank audio segments deleted, the processed signals are combined to obtain the target audio.
Or a plurality of denoised audio signals may be combined into one signal and then subjected to subsequent processing, which is not limited in the embodiment of the present application.
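One way to pick the pair of microphones to difference in the multi-microphone case is to choose the two that are farthest apart. The sketch below assumes the microphone coordinates are known; the names `farthest_pair` and `positions`, and the coordinate values, are hypothetical and are not given in the application.

```python
from itertools import combinations
from math import dist


def farthest_pair(positions):
    """Return the names of the two microphones that are farthest apart."""
    return max(
        combinations(sorted(positions), 2),
        key=lambda pair: dist(positions[pair[0]], positions[pair[1]]),
    )


# Hypothetical coordinates (in centimetres) of four earphone microphones.
positions = {
    "left_inner": (-8.0, 0.0),
    "left_outer": (-10.0, 0.0),
    "right_inner": (8.0, 0.0),
    "right_outer": (10.0, 0.0),
}

pair = farthest_pair(positions)
print(pair)
```

With these coordinates the two outer microphones are selected; the same pairwise distances could also be used to rank several pairs when more than one denoised signal is wanted.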
Sub-step B: The denoised audio signal is amplified to obtain the second audio signal.
The denoised audio signal after the difference processing has a relatively small amplitude, and the denoised audio signal is subjected to signal amplification processing in order to obtain an audio signal convenient for listening. Mature signal processing methods and circuits are already available in the prior art to realize signal amplification, and the embodiments of the present application do not specifically limit this.
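The application leaves the amplification method open. As one possible illustration in software, the sketch below applies a single gain so that the largest sample reaches a chosen peak; the function name `amplify_to_peak` and the `target_peak` parameter are assumptions made for this example, and a real device might equally use an analog amplifier circuit.

```python
def amplify_to_peak(signal, target_peak=1.0):
    """Scale the signal so that its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in signal), default=0.0)
    if peak == 0.0:
        return list(signal)  # an all-zero (or empty) signal cannot be scaled
    gain = target_peak / peak
    return [s * gain for s in signal]


# Hypothetical low-amplitude denoised signal.
second_audio = amplify_to_peak([0.0, 2.0, -1.0, 0.5], target_peak=4.0)
print(second_audio)
```

Applying one gain to the whole signal keeps the ratio between valid and blank portions unchanged while enlarging their absolute difference, which suits the threshold comparison described in step 103.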
As shown in Figs. 2 to 4, the method exploits the fact that the sound source to be recorded and the ambient noise differ in how they are received at the microphones: the first audio signals are subjected to differential processing, and the resulting denoised audio signal is then amplified to obtain the noise-reduced second audio signal. Compared with noise reduction that collects the ambient noise separately, the method provided by the embodiment of the application requires a simpler circuit and less data processing. The second audio signal obtained in this way can be further processed to obtain the target audio.
Step 103: Blank audio segments in the second audio signal are acquired.
The second audio signal obtained after noise reduction usually includes valid audio, which is the user's voice (or sound emitted by another specific sound source to be recorded), and blank audio between stretches of valid audio. Valid audio and blank audio differ in their audio characteristics, and this difference can be used to distinguish them. To do so, the audio can be divided into fixed or variable time periods, and an average property of the audio signal within each period, such as its amplitude, is analyzed. Since the second audio signal has already been noise-reduced, the amplitude of the blank audio is smaller than that of the valid audio, and because the second audio signal has also been amplified, the amplitude difference between blank audio and valid audio is larger still. Portions of the second audio signal whose amplitude is greater than a certain threshold are regarded as valid audio, and portions whose amplitude is smaller than the threshold are regarded as blank audio. The threshold may be a specific preset amplitude value or an amplitude ratio, for example 10% of the maximum amplitude in the second audio signal. All blank audio over the whole duration of the second audio signal can be taken as blank audio segments, and all or some of these segments are deleted to obtain the target audio. Acquiring all of the blank audio in the second audio signal requires more computation; optionally, only the blank audio within a specific time period may be acquired as needed, without analyzing the second audio signal in other time periods, so as to reduce the amount of computation.
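The threshold decision can be illustrated with a short sketch. It assumes fixed-length windows and uses the mean absolute sample value as the amplitude measure; both choices, like the names `classify_windows`, `window` and `ratio`, are assumptions for this example rather than details fixed by the application. The 10% ratio is the example value given in the text.

```python
def classify_windows(signal, window, ratio=0.10):
    """Label each fixed-length window of the signal as 'valid' or 'blank'.

    A window is 'blank' when its mean absolute amplitude is below
    ratio * (maximum absolute amplitude of the whole signal).
    """
    peak = max((abs(s) for s in signal), default=0.0)
    threshold = ratio * peak
    labels = []
    for start in range(0, len(signal), window):
        chunk = signal[start:start + window]
        mean_amplitude = sum(abs(s) for s in chunk) / len(chunk)
        labels.append("blank" if mean_amplitude < threshold else "valid")
    return labels


# Hypothetical samples: speech, a pause, then speech again.
signal = [0.9, -1.0, 0.8, 0.01, -0.02, 0.01, 0.7, -0.6, 0.9]
labels = classify_windows(signal, window=3)
print(labels)
```

To analyze only part of a recording, as the text suggests for reducing computation, the same function can be applied to a slice of the signal.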
Some or all of the blank audio segments are deleted as required to obtain the target audio. Alternatively, only part of the blank audio in the second audio signal may be acquired as blank audio segments; for example, for a 30-minute second audio signal, only the audio from the 10th to the 20th minute may be analyzed, and the blank audio found there is taken as the blank audio segments. Practical application scenes differ, such as everyday small talk, one-to-many teaching, or a course recorded by a single person, and the distribution and characteristics of valid audio and blank audio in the second audio signal differ accordingly. Acquiring only part of the blank audio in the second audio signal as needed reduces the amount of computation to a certain extent and saves the software and hardware computing resources of the electronic equipment.
Optionally, a blank audio segment is a continuous audio segment of the second audio signal whose amplitude is smaller than a preset threshold. Specifically, the second audio signal may be analyzed at a preset frame interval, which may be 2 frames, 3 frames, 5 frames, 7 frames, and so on. Taking 5 frames as an example, the electronic device extracts one audio frame from every 5 audio frames for analysis; if the amplitude of that audio frame is smaller than the preset threshold, all 5 corresponding audio frames are determined to be blank audio frames. The preset threshold may be a specific preset amplitude value or an amplitude ratio, for example 10% of the maximum amplitude in the second audio signal. When the electronic device detects the first blank audio frame, it records the corresponding first time point and continues to examine the subsequent frames, or the audio frames at the preset frame interval. When the first audio frame containing valid audio after the first time point is detected, the audio frame immediately preceding it is taken as the second audio frame, and the time point corresponding to the second audio frame is recorded as the second time point. The audio frames between the first time point and the second time point are blank audio frames, and the corresponding continuous audio segment is a blank audio segment. The second audio signal may also be analyzed at a preset time interval, for example 0.2 second, 0.3 second, or 0.5 second. Taking 0.5 second as an example, the electronic device extracts a 0.1-second audio segment from every 0.5 second of audio for analysis; if the average amplitude of that segment is smaller than the preset threshold, the whole corresponding 0.5-second audio segment is determined to be a blank audio segment.
By analyzing the amplitude of audio frames, valid audio and blank audio can be accurately distinguished, so that blank audio segments can be located accurately and valid audio is not deleted by mistake when audio is subsequently deleted.
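The frame-interval scan described above can be sketched as follows. The sketch assumes that each frame has already been reduced to a single amplitude value and that the first frame of each group is the one inspected; the application does not specify which frame of the group is sampled, and the name `find_blank_segments` is hypothetical.

```python
def find_blank_segments(frame_amplitudes, threshold, step=5):
    """Return inclusive (first_index, last_index) pairs of blank frame runs.

    Only one frame in every `step` frames is inspected, and its result is
    applied to the whole group of `step` frames.
    """
    segments = []
    start = None  # index of the first frame of the current blank run
    total = len(frame_amplitudes)
    for group_start in range(0, total, step):
        is_blank = frame_amplitudes[group_start] < threshold
        if is_blank and start is None:
            start = group_start  # corresponds to the "first time point"
        elif not is_blank and start is not None:
            # The frame before the first valid frame ends the blank run
            # (the "second time point").
            segments.append((start, group_start - 1))
            start = None
    if start is not None:
        segments.append((start, total - 1))  # blank run reaches the end
    return segments


# Hypothetical per-frame amplitudes: 5 valid, 10 blank, 5 valid frames.
amplitudes = [0.8] * 5 + [0.01] * 10 + [0.7] * 5
segments = find_blank_segments(amplitudes, threshold=0.08, step=5)
print(segments)
```

Sampling one frame per group trades accuracy at segment boundaries for less computation; a smaller `step` locates the boundaries more precisely.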
Step 104: At least some of the blank audio segments in the second audio signal are deleted to obtain the target audio.
Blank audio segments in the second audio signal are deleted, and the remaining audio is spliced together by audio frame or by time to obtain the required target audio. Depending on specific needs, only some of the blank audio segments may be deleted, or all of them may be deleted. Which blank audio segments are to be deleted, in terms of their position in the second audio signal or of the audio and sound information they correspond to, may be set according to actual needs and is not limited in the embodiment of the present application.
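Deleting and splicing can be illustrated with a minimal sketch that works on a list of frames (or samples) and a list of inclusive index ranges to remove. The function name `delete_segments` and the index-range representation are assumptions made for this example.

```python
def delete_segments(frames, segments):
    """Remove inclusive (start, end) index ranges and splice what remains."""
    kept = []
    cursor = 0  # index of the next frame that has not been handled yet
    for start, end in sorted(segments):
        kept.extend(frames[cursor:start])  # keep audio before the blank run
        cursor = max(cursor, end + 1)      # skip over the blank run
    kept.extend(frames[cursor:])           # keep audio after the last run
    return kept


# Hypothetical frames numbered 0..9, with two blank runs to delete.
frames = list(range(10))
target_audio = delete_segments(frames, [(2, 4), (7, 8)])
print(target_audio)
```

Passing only a subset of the detected blank segments deletes just those segments, which corresponds to deleting "at least some" of the blank audio.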
When a multi-person conversation exists, the audio information of different users included in the second audio signal can be identified through a voiceprint identification mode. At this time, the blank audio pieces in the audio information of the specified user may be deleted as needed, while a part of the blank audio is left, so that the dialog is easier to understand.
Optionally, deleting at least some of the blank audio segments in the second audio signal to obtain the target audio includes:
receiving a first input;
in response to the first input, acquiring an audio signal of a first user in the second audio signal, wherein the second audio signal comprises audio signals of at least two users, and the first user is at least one of the at least two users;
deleting a first blank audio segment among the blank audio segments to obtain the target audio;
wherein the audio segment between a first time and a second time of the second audio signal is the first blank audio segment, the period from a preset time before the first time to the first time is a first time period, the period from the second time to a preset time after the second time is a second time period, and the audio segments of the second audio signal in the first time period and the second time period come from the same first user.
For example, in a conversation among two or more users, the second audio signal includes the users' conversation, and the blank audio segments in the middle of a particular user's speech can be deleted as required. The first input is used to obtain the audio signal of the first user and may be a touch input or a voice input, which is not limited in the embodiment of the present application. On receiving the first input, the electronic device acquires the audio signal of the first user in the second audio signal, where the second audio signal comprises audio signals of at least two users and the first user is at least one of them. The audio signals of the different users taking part in the conversation may be identified by voiceprint recognition or similar means, which is likewise not limited in the embodiment of the present application. For example, given four users A, B, C and D, user A may be selected as the first user. The first blank audio segment starts at the first time and ends at the second time. The period from a preset time before the first time to the first time is the first time period, and the period from the second time to a preset time after the second time is the second time period. If, in the second audio signal, the audio segments of the first time period and the second time period both come from the first user, that is, the first user was speaking without being interrupted, the blank audio segment between them is removed to obtain the target audio. The preset time may be 10 seconds or 20 seconds and may be selected as required, which is not limited in the embodiment of the present application.
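The selection rule for the first blank audio segment can be sketched as follows. The sketch assumes that speaker identification has already produced one label per fixed-length slot of the recording (a user name, or `None` for blank audio), and it interprets "come from the same first user" as requiring every slot of the preset period before and after the blank run to carry the selected user's label. The label format, the function name `deletable_blanks`, and the slot-based `preset` parameter are all assumptions for this example.

```python
def deletable_blanks(labels, selected_user, preset=2):
    """Return inclusive (start, end) blank runs surrounded by selected_user.

    A blank run qualifies only if the `preset` slots immediately before it
    and the `preset` slots immediately after it all belong to selected_user.
    """
    runs = []
    i = 0
    total = len(labels)
    while i < total:
        if labels[i] is not None:
            i += 1
            continue
        j = i
        while j + 1 < total and labels[j + 1] is None:
            j += 1  # extend the run to its last blank slot
        before = labels[max(0, i - preset):i]
        after = labels[j + 1:j + 1 + preset]
        if (len(before) == preset and len(after) == preset
                and all(x == selected_user for x in before + after)):
            runs.append((i, j))
        i = j + 1
    return runs


# Hypothetical per-slot labels for a conversation between users A and B.
labels = ["A", "A", None, None, "A", "A", None, "B", "B", None, None, "B", "B"]
runs = deletable_blanks(labels, selected_user="A", preset=2)
print(runs)
```

In this example only the pause inside user A's uninterrupted speech qualifies; the pause at the hand-over from A to B and the pause inside B's speech are kept.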
As another example, in a question-and-answer scene, user A is a student giving the presentation and user B is a teacher; only A speaks during the first 10 minutes, and B asks questions in turn during the last 3 minutes. The blank audio segments in the first 10 minutes may be deleted. As a further example, in a one-to-one conversation there may be a first speech segment consisting entirely of the voice of user A, a second speech segment consisting entirely of the voice of user B, or a speech segment in which the two speak alternately. When blank audio segments are deleted, only those in the first speech segment may be deleted, and the other blank audio segments may be retained to make the audio content easier to understand. For example, if user A is a student and user B is a teacher, only the blanks in the middle of the student's speech are deleted and the blanks in the teacher's speech are retained. When the student listens to the audio for review, time is saved, the student can think during the pauses in the teacher's speech, and the classroom scene is reproduced.
As another example, when four people A, B, C and D have a conversation, A and B may be selected as the first users and the audio signals of A and B are acquired. If the audio segments of the first time period and the second time period adjacent to a first blank audio segment both come from A, or both come from B, the first blank audio segment is deleted to obtain the target audio.
According to the embodiment of the application, a specific user can be selected as required, and the blank audio clip in the speaking time period of the user is deleted, so that the blank audio clip can be flexibly deleted, and the recording requirement of the user can be better met.
Optionally, the deleting at least a part of the blank audio segments in the second audio signal to obtain the target audio includes:
receiving a second input;
responding to the second input, and acquiring an audio signal of a second user in the second audio signal; the second audio signal comprises audio signals of N users, the second users are M users in the N users, N is an integer larger than 2, and M is an integer larger than 1 and smaller than N;
deleting a second blank audio clip in the blank audio clips to obtain the target audio;
wherein, an audio segment between a third time and a fourth time of the second audio signal is the second blank audio segment, a time period from a preset time before the third time to the third time is a third time period, a time period from the fourth time to a preset time after the fourth time is a fourth time period, and the audio segments of the third time period and the fourth time period of the second audio signal come from the second user.
In this embodiment of the application, the second input may be a touch input or a voice input, which is not limited in this embodiment of the application. The audio signals of different users participating in the conversation may be detected by means of voiceprint recognition and the like, which is not limited in the embodiment of the present application.
Specifically, in a dialog scenario of two or more users, the second audio signal includes dialog information of the users, and blank audio segments in the middle of some user audio information can be deleted as needed.
For example, in a teaching scenario including three students (a first, a second and a third student) and a teacher A, the three students can be selected as the second users. The blank audio segments in the middle of each student's speech and in the conversation among the students are deleted, while the blank audio segments in the middle of teacher A's speech and in the conversation between teacher A and the students are retained. That is, the audio signals of the three students can be acquired, and when the audio segments of the third time period and the fourth time period adjacent to the second blank audio segment both come from these three students, whether from the same student or from any two of them, the second blank audio segment is deleted to obtain the target audio.
In the teaching scenario, the pauses in teacher A's speech are usually time for the students to think and reflect; retaining this blank audio reproduces the teaching scenario and gives the students enough time to think while learning. Compared with the teacher's teaching content, the conversation among students carries less knowledge, so the blank audio segments within it have little value, and deleting them lets the students review without wasting time.
Optionally, blank audio segments between the second users within a specified time period may also be deleted. For example, consider 15 minutes of teaching audio of teacher A, student B and student C: the first 5 minutes are a speech by teacher A, the next 3 minutes are questions and answers among the teacher and the two students, the next 2 minutes are a discussion between students B and C, student B then gives a 2-minute summary, and the last 3 minutes are a question-and-answer session in which teacher A questions student B. Blank audio segments in minutes 6 through 10 can be selected for deletion, with students B and C designated as the users. In that case, only the blank audio segments in the middle of the conversation between students B and C are deleted, and the blank audio segments in the middle of teacher A's speech are retained.
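The group rule, together with the optional time restriction, can be sketched as follows. This is only an illustration under assumptions: speaker labels are taken as given, and `blanks_between_second_users`, `preset`, and `window` are hypothetical names introduced here.

```python
from typing import List, Optional, Set, Tuple

Speech = Tuple[float, float, str]  # (start_s, end_s, speaker label)
Blank = Tuple[float, float]        # (start_s, end_s) of a blank audio segment


def speakers_in(speech: List[Speech], lo: float, hi: float) -> Set[str]:
    """Return the speakers whose speech overlaps the window [lo, hi]."""
    return {who for start, end, who in speech if start < hi and end > lo}


def blanks_between_second_users(speech: List[Speech], blanks: List[Blank],
                                second_users: Set[str], preset: float = 1.0,
                                window: Optional[Tuple[float, float]] = None
                                ) -> List[Blank]:
    """Select blanks whose adjacent periods both come from the second users."""
    selected = []
    for t3, t4 in blanks:
        # Optional restriction to a specified time period.
        if window is not None and not (window[0] <= t3 and t4 <= window[1]):
            continue
        before = speakers_in(speech, t3 - preset, t3)  # third time period
        after = speakers_in(speech, t4, t4 + preset)   # fourth time period
        # The two periods may come from different users, as long as every
        # speaker found in them is one of the selected second users.
        if before and after and before <= second_users and after <= second_users:
            selected.append((t3, t4))
    return selected


# Illustrative data: teacher A, then students B and C, then teacher A again.
speech = [(0, 10, "A"), (12, 20, "A"), (22, 30, "B"), (31, 40, "C"), (42, 50, "A")]
blanks = [(10, 12), (20, 22), (30, 31), (40, 42)]
print(blanks_between_second_users(speech, blanks, {"B", "C"}))
```

Here only the pause between students B and C is selected; pauses inside teacher A's speech and pauses between the teacher and a student are kept.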
Optionally, in this embodiment of the application, the obtaining a blank audio clip in the second audio signal includes: displaying an original waveform map of the second audio signal, displaying a first mark on the original waveform map, the first mark referring to a blank audio segment.
Part or all of the original waveform diagram of the second audio signal is displayed, and first marks are displayed on one or more first segments of the original waveform diagram, the first segments corresponding to blank audio segments. Due to screen size limitations, only a portion of the original waveform diagram may be displayed at a time; the displayed portion may scroll automatically on the electronic device so that the entire original waveform diagram is presented to the user. The electronic device may also display other portions of the original waveform diagram in response to a dragging or sliding operation of the user, which is not limited in the embodiment of the present application. The first mark may be a line type, a color, or the like that is distinguished from the waveform corresponding to the valid audio segments. Specifically, the portion of the original waveform diagram corresponding to a valid audio segment may be shown by a solid line, and the first mark may be a broken line corresponding to a blank audio segment. The abscissas of the start point and the end point of the first mark may coincide with the abscissas of the start point and the end point of the blank audio segment in the original waveform diagram; for example, the first mark is a red rectangle whose two sides perpendicular to the abscissa axis correspond to the start point and the end point, respectively.
By means of the first markers, the user can intuitively determine the specific position of the blank audio piece in the second audio signal.
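As a rough illustration of how the horizontal extent of each first mark might be computed for the currently visible part of the waveform, the sketch below maps blank segments given in seconds to pixel spans. The function name `mark_rectangles` and its parameters are assumptions made for this example; the application does not prescribe a particular mapping.

```python
from typing import List, Tuple


def mark_rectangles(blanks: List[Tuple[float, float]], view_start: float,
                    view_end: float, width_px: int) -> List[Tuple[int, int]]:
    """Map blank segments (seconds) to (x_left, x_right) pixel spans.

    Only the part of each blank segment inside the visible window
    [view_start, view_end] is returned; segments outside it are skipped.
    """
    scale = width_px / (view_end - view_start)  # pixels per second
    rects = []
    for start, end in blanks:
        lo, hi = max(start, view_start), min(end, view_end)
        if lo < hi:
            rects.append((round((lo - view_start) * scale),
                          round((hi - view_start) * scale)))
    return rects


blanks = [(4, 6), (10, 12), (30, 31)]
print(mark_rectangles(blanks, 0, 20, 400))  # whole view from 0 s to 20 s
print(mark_rectangles(blanks, 5, 15, 400))  # after scrolling to 5 s..15 s
```

The left and right edges of each span play the role of the two sides of the mark that are perpendicular to the abscissa axis.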
Deleting at least part of the blank audio segments in the second audio signal to obtain a target audio, including:
receiving a third input;
and in response to the third input, deleting at least part of the blank audio segment referred by the first mark to obtain target audio.
Specifically, a third input is received, where the third input may be a touch operation of clicking and dragging the first segment. The first segments may also be numbered in chronological order, in which case the third input is a sequence-number input, and in response to the sequence-number input, the blank audio segment referred to by that sequence number is deleted. The third input may also take other forms, and at least part of the blank audio segments referred to by the first mark is deleted in response to the input, which is not limited by the embodiment of the present application. Optionally, when at least part of the blank audio segments referred to by the first mark is deleted, the corresponding part of the first mark may also be deleted. Specifically, after at least part of the first mark is deleted, the remaining waveforms in the original waveform diagram may be spliced to form a waveform diagram corresponding to the new target audio.
By displaying the original waveform diagram corresponding to the second audio signal and distinguishing the blank audio segments from the valid audio segments by the first mark, the user can delete the desired blank audio segments through simple operations such as dragging and moving on the original waveform diagram. The embodiment of the application makes the operation more engaging and, at the same time, allows the user to delete blank audio segments flexibly as needed.
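Deleting by sequence number and splicing the remaining audio can be sketched as below. The sketch assumes the blank segments are non-overlapping sample ranges listed in chronological order and numbered from 1; `delete_blanks_by_number` is a hypothetical helper, not a function named in the application.

```python
from typing import List, Sequence, Tuple


def delete_blanks_by_number(samples: Sequence[float],
                            blanks: List[Tuple[int, int]],
                            numbers: Sequence[int]) -> List[float]:
    """Remove the blank segments chosen by 1-based sequence number.

    Each blank is a (start_index, end_index) sample range with the end
    exclusive. The remaining samples are spliced in their original order.
    """
    drop = sorted(blanks[n - 1] for n in numbers)
    kept: List[float] = []
    cursor = 0
    for start, end in drop:
        kept.extend(samples[cursor:start])  # keep audio before this blank
        cursor = end                        # skip over the blank itself
    kept.extend(samples[cursor:])           # keep the tail
    return kept


samples = list(range(10))   # stand-in for audio samples
blanks = [(2, 4), (6, 8)]   # blank segment 1 and blank segment 2
print(delete_blanks_by_number(samples, blanks, [2]))
```

Selecting only number 2 removes samples 6 and 7 and leaves the first blank segment in place.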
Further, the playing duration of the second audio signal is a first duration, and deleting at least a part of the blank audio segments indicated by the first marks in response to the third input to obtain the target audio includes:
deleting a target blank audio clip in response to the third input, the target blank audio clip being a blank audio clip to which the at least part of the first mark refers;
and carrying out variable speed processing on audio signals except the target blank audio clip in the second audio signal to obtain a target audio, wherein the playing time of the target audio is equal to the first time.
Optionally, a waveform diagram of the target audio after the variable speed processing may be displayed.
After the target blank audio segment is deleted from the second audio signal, the playing duration of the resulting audio is shortened and its information density increases, but overly dense information is harder for the user to follow. In the embodiment of the present application, while or after the target blank audio segment is deleted, variable speed processing is performed on the audio signals in the second audio signal other than the target blank audio segment to obtain the target audio, so that the playing duration of the target audio is consistent with that of the second audio signal. In this way, deleting the blank audio ensures that the user's time is not taken up by useless blank audio when listening, while the playing speed of the valid audio is reduced, which makes the audio content easier to understand.
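The duration-preserving step can be illustrated with a deliberately simple sketch. The application does not name a variable-speed algorithm; the linear-interpolation stretch below is an assumption chosen for brevity, and it lowers pitch along with speed, whereas a practical implementation would more likely use a pitch-preserving time-scale modification method. The name `stretch_to_length` is hypothetical.

```python
from typing import List, Sequence


def stretch_to_length(samples: Sequence[float], target_len: int) -> List[float]:
    """Stretch samples to exactly target_len samples by linear interpolation."""
    if target_len <= 0 or not samples:
        return []
    if len(samples) == 1 or target_len == 1:
        return [samples[0]] * target_len
    step = (len(samples) - 1) / (target_len - 1)  # source step per output sample
    out = []
    for i in range(target_len):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


original_len = 5            # length of the second audio signal, in samples
kept = [0.0, 2.0, 4.0]      # audio remaining after the blank was deleted
rate = len(kept) / original_len  # playback speed relative to the original
target = stretch_to_length(kept, original_len)
print(rate, target)
```

The ratio of the remaining length to the original length gives the slowed playback rate, and the stretched result has the same number of samples, and hence the same playing duration, as the second audio signal.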
According to the method provided by the embodiment of the application, the two microphones are used for respectively collecting the first audio signals, and the first audio signals are subjected to noise reduction processing to obtain the second audio signals with the environmental noise removed. And further analyzing the second audio signal to obtain at least one blank audio segment, and deleting the blank audio segment from the second audio signal to obtain the target audio without the environmental noise and the blank segment. According to the embodiment of the application, the target audio is obtained after the environmental noise and the blank audio clip of the collected first audio signal are removed, and therefore it is guaranteed that a user is not interrupted by useless audio information in the process of listening to the audio. In addition, after environmental noise of the collected first audio signal is removed, the blank audio segment is deleted, and at the moment, the difference of the audio characteristic parameters of the effective audio segment and the soundless segment in the second audio signal is large, so that the difference is easy to distinguish. Therefore, the effective audio can be distinguished accurately, and the effective audio is guaranteed not to be deleted by mistake.
As shown in fig. 6, an embodiment of the present application further provides an audio processing apparatus, including: a collection module 201, a processing module 202, an obtaining module 203 and a deleting module 204. Wherein:
the collection module is used for collecting first audio signals through two microphones respectively; the processing module is used for carrying out noise reduction processing on the first audio signal to generate a second audio signal; the obtaining module is used for obtaining a blank audio clip in the second audio signal; and the deleting module is used for deleting at least part of the blank audio clips in the second audio signal to obtain the target audio.
The apparatus provided by the embodiment of the application collects first audio signals through two microphones respectively and performs noise reduction processing on the first audio signals to obtain a second audio signal from which the environmental noise has been removed. The second audio signal is then analyzed to obtain at least one blank audio segment, and the blank audio segment is deleted from the second audio signal to obtain target audio free of environmental noise and blank segments. Because the target audio is obtained after the environmental noise and the blank audio segments of the collected first audio signal are removed, the user is not interrupted by useless audio information while listening to the audio. In addition, the blank audio segments are deleted only after the environmental noise of the collected first audio signal has been removed; at that point the audio characteristic parameters of the valid audio segments and of the silent segments in the second audio signal differ greatly and are easy to tell apart. Therefore, the valid audio can be identified accurately and is not deleted by mistake.
Optionally, as shown in fig. 7, the processing module 202 specifically includes a difference module 2021 and an amplification module 2022. The difference module is used for carrying out difference processing on the first audio signal to obtain a de-noised audio signal; and the amplification module is used for amplifying the de-noised audio signal to obtain the second audio signal.
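One plausible reading of the difference and amplification steps is sketched below. It assumes that the two microphone signals are time-aligned, equal in length, and that "difference processing" means sample-wise subtraction, so that noise picked up equally by both microphones cancels; the function name `differential_denoise` and the `gain` value are hypothetical.

```python
from typing import List, Sequence


def differential_denoise(mic_a: Sequence[float], mic_b: Sequence[float],
                         gain: float = 2.0) -> List[float]:
    """Subtract one microphone signal from the other, then amplify the result."""
    if len(mic_a) != len(mic_b):
        raise ValueError("both microphone signals must have the same length")
    denoised = [a - b for a, b in zip(mic_a, mic_b)]  # common noise cancels
    return [gain * d for d in denoised]               # amplification step


# Illustrative data: the voice reaches microphone B at half the level of
# microphone A, while the ambient noise is identical on both.
voice = [1.0, 0.0, -1.0, 0.0]
noise = [0.5, -0.5, 0.25, -0.25]
mic_a = [v + n for v, n in zip(voice, noise)]
mic_b = [0.5 * v + n for v, n in zip(voice, noise)]
print(differential_denoise(mic_a, mic_b, gain=2.0))
```

In this idealized example the noise cancels completely and the gain restores the voice to its original level; real recordings would only approximate this.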
Optionally, the blank audio segment is a continuous audio segment of the second audio signal, where the amplitude is smaller than a preset threshold.
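Detecting blank audio segments by an amplitude threshold can be sketched as a scan for continuous runs of low-amplitude samples. The `threshold` corresponds to the preset threshold mentioned above; the minimum run length `min_len` and the function name `find_blank_segments` are assumptions added for this example.

```python
from typing import List, Sequence, Tuple


def find_blank_segments(samples: Sequence[float], threshold: float,
                        min_len: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges, end exclusive, of continuous runs
    whose absolute amplitude stays below the threshold."""
    blanks = []
    run_start = None
    for i, value in enumerate(samples):
        if abs(value) < threshold:
            if run_start is None:
                run_start = i          # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                blanks.append((run_start, i))
            run_start = None           # the quiet run (if any) has ended
    if run_start is not None and len(samples) - run_start >= min_len:
        blanks.append((run_start, len(samples)))
    return blanks


samples = [0.9, 0.01, 0.02, 0.0, 0.8, 0.01, 0.7, 0.0, 0.0]
print(find_blank_segments(samples, threshold=0.05, min_len=2))
```

The single quiet sample at index 5 is ignored because it is shorter than the assumed minimum run length, which avoids treating brief dips within speech as blank segments.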
Optionally, as shown in fig. 8, the deleting module 204 specifically includes:
a first receiving submodule 2041 for receiving a first input;
a first obtaining sub-module 2042, configured to obtain, in response to the first input, an audio signal of a first user in the second audio signal; wherein the second audio signal comprises audio signals of at least two users, the first user being at least one of the at least two users;
the first deleting submodule 2043 is configured to delete a first blank audio segment in the blank audio segments, so as to obtain the target audio;
the audio clip between the first time and the second time of the second audio signal is the first blank audio clip, the time period from the preset time before the first time to the first time is a first time period, the time period from the second time to the preset time after the second time is a second time period, and the audio clips of the first time period and the second time period of the second audio signal are from the same one of the first users.
Optionally, as shown in fig. 9, the deleting module 204 specifically includes:
a second receiving submodule for receiving a second input;
the second obtaining submodule is used for responding to the second input and obtaining the audio signal of a second user in the second audio signal; the second audio signal comprises audio signals of N users, the second users are M users in the N users, N is an integer larger than 2, and M is an integer larger than 1 and smaller than N;
the second deleting submodule is used for deleting a second blank audio clip in the blank audio clips to obtain the target audio;
wherein, an audio segment between a third time and a fourth time of the second audio signal is the second blank audio segment, a time period from a preset time before the third time to the third time is a third time period, a time period from the fourth time to a preset time after the fourth time is a fourth time period, and the audio segments of the third time period and the fourth time period of the second audio signal come from the second user.
Optionally, the obtaining module 203 includes a display sub-module, where the display sub-module is configured to display an original waveform diagram of the second audio signal, and display a first mark on the original waveform diagram, where the first mark refers to a blank audio segment;
the deleting module 204 includes a third receiving sub-module for receiving a third input,
and the third deleting submodule is used for responding to the third input and deleting at least part of the blank audio segments indicated by the first marks to obtain the target audio.
Optionally, the playing duration of the second audio signal is a first duration, and the third deletion submodule is specifically configured to delete a target blank audio clip in response to the third input, where the target blank audio clip is a blank audio clip referred to by the at least part of the first mark; and carrying out variable speed processing on audio signals except the target blank audio clip in the second audio signal to obtain a target audio, wherein the playing time of the target audio is equal to the first time.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
It should be noted that, in the audio processing method provided in the embodiment of the present application, the execution subject may be an audio processing apparatus, or a control module in the audio processing apparatus for executing the audio processing method. In the embodiment of the present application, the audio processing apparatus executing the audio processing method is taken as an example to describe the audio processing method provided in the embodiment of the present application.
The audio processing device in the embodiment of the present application may be a device, and may also be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a kiosk, and the like, and the embodiments of the present application are not particularly limited.
The audio processing apparatus in the embodiment of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and embodiments of the present application are not specifically limited.
The audio processing apparatus provided in the embodiment of the present application can implement each process implemented by the audio processing apparatus in the method embodiment of fig. 1, and is not described herein again to avoid repetition.
Fig. 9 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present application.
The electronic device 300 includes, but is not limited to: a radio frequency unit 301, a network module 302, an audio output unit 303, an input unit 304, a sensor 305, a display unit 306, a user input unit 307, an interface unit 308, a memory 309, a processor 310, and a power supply 311. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 9 does not constitute a limitation of the electronic device, and that the electronic device may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. In the embodiment of the present application, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
It should be understood that, in the embodiment of the present application, the audio output unit 303 may convert audio data received by the radio frequency unit 301 or the network module 302 or stored in the memory 309 into an audio signal and output it as sound. The audio output unit 303 may also provide audio output related to a specific function performed by the electronic device 300 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 303 includes a speaker, a buzzer, a receiver, and the like.
The input unit 304 is used to receive audio or video signals. The input Unit 304 may include a Graphics Processing Unit (GPU) 3041 and a microphone 3042, and the Graphics processor 3041 processes image data of a still picture or video obtained by an image capturing apparatus (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 306. The image frames processed by the graphic processor 3041 may be stored in the memory 309 (or other storage medium) or transmitted via the radio frequency unit 301 or the network module 302. The microphone 3042 may receive sounds and may be capable of processing such sounds into audio data. The processed audio data may be converted into a format output transmittable to a mobile communication base station via the radio frequency unit 301 in case of the phone call mode.
The display unit 306 may include a display panel 3061, and the display panel 3061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 307 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 307 includes a touch panel 3071 and other input devices 3072. The touch panel 3071, also referred to as a touch screen, may collect touch operations by a user on or near the touch panel 3071 (e.g., operations by a user on or near the touch panel 3071 using a finger, a stylus, or any suitable object or attachment). The touch panel 3071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects a signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 310, and receives and executes commands sent by the processor 310. In addition, the touch panel 3071 may be implemented using various types, such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 307 may include other input devices 3072 in addition to the touch panel 3071. Specifically, the other input devices 3072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described herein.
Further, the touch panel 3071 may be overlaid on the display panel 3061, and when the touch panel 3071 detects a touch operation on or near the touch panel, the touch operation is transmitted to the processor 310 to determine the type of the touch event, and then the processor 310 provides a corresponding visual output on the display panel 3061 according to the type of the touch event. Although the touch panel 3071 and the display panel 3061 are shown in fig. 9 as two separate components to implement the input and output functions of the electronic device, in some embodiments, the touch panel 3071 and the display panel 3061 may be integrated to implement the input and output functions of the electronic device, which is not limited herein.
The memory 309 may be used to store software programs as well as various data. The memory 309 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 309 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The processor 310 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, performs various functions of the electronic device and processes data by operating or executing software programs and/or modules stored in the memory 309 and calling data stored in the memory 309, thereby performing overall monitoring of the electronic device. Processor 310 may include one or more processing units; alternatively, the processor 310 may integrate an application processor, which mainly handles operating systems, user interfaces, applications, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 310. For example, the processor may include an Audio Digital Signal Processing Unit (Audio DSP) or may include a Central Processing Unit (CPU).
The electronic device 300 may further include a power supply 311 (such as a battery) for supplying power to various components, and optionally, the power supply 311 may be logically connected to the processor 310 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. In addition, the electronic device 300 includes some functional modules that are not shown, and are not described in detail herein.
Optionally, an electronic device is further provided in this embodiment of the present application, and includes a processor 310, a memory 309, and a computer program stored in the memory 309 and capable of running on the processor 310, where the computer program is executed by the processor 310 to implement each process of the above-mentioned audio processing method embodiment, and can achieve the same technical effect, and details are not described here to avoid repetition.
The embodiment of the present application further provides a computer-readable storage medium, where a program is stored on the computer-readable storage medium, and when the program is executed by a processor, the program implements the processes of the audio processing method embodiment, and can achieve the same technical effects, and details are not repeated here to avoid repetition. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (15)

1. An audio processing method, comprising:
respectively acquiring first audio signals through two microphones;
carrying out noise reduction processing on the first audio signal to generate a second audio signal;
acquiring a blank audio clip in the second audio signal;
and deleting at least part of the blank audio segments in the second audio signal to obtain the target audio.
2. The method of claim 1, wherein the performing noise reduction processing on the first audio signal to generate a second audio signal comprises:
carrying out differential processing on the first audio signal to obtain a de-noised audio signal;
and amplifying the de-noised audio signal to obtain the second audio signal.
3. The method according to claim 1 or 2, wherein the blank audio segment is a continuous audio segment of the second audio signal having an amplitude smaller than a preset threshold.
4. The method according to any of claims 1 to 3, wherein the deleting at least a portion of the blank audio segment in the second audio signal to obtain the target audio comprises:
receiving a first input;
in response to the first input, acquiring an audio signal of a first user in the second audio signal; wherein the second audio signal comprises audio signals of at least two users, the first user being at least one of the at least two users;
deleting a first blank audio clip in the blank audio clips to obtain the target audio;
the audio clip between the first time and the second time of the second audio signal is the first blank audio clip, the time period from the preset time before the first time to the first time is a first time period, the time period from the second time to the preset time after the second time is a second time period, and the audio clips of the first time period and the second time period of the second audio signal are from the same one of the first users.
5. The method according to any of claims 1 to 3, wherein the deleting at least a portion of the blank audio segment in the second audio signal to obtain the target audio comprises:
receiving a second input;
responding to the second input, and acquiring an audio signal of a second user in the second audio signal; the second audio signal comprises audio signals of N users, the second users are M users in the N users, N is an integer larger than 2, and M is an integer larger than 1 and smaller than N;
deleting a second blank audio clip in the blank audio clips to obtain the target audio;
wherein, an audio segment between a third time and a fourth time of the second audio signal is the second blank audio segment, a time period from a preset time before the third time to the third time is a third time period, a time period from the fourth time to a preset time after the fourth time is a fourth time period, and the audio segments of the third time period and the fourth time period of the second audio signal come from the second user.
6. The method of claim 1, wherein the obtaining a blank audio segment in the second audio signal comprises:
displaying an original waveform map of the second audio signal, displaying a first mark on the original waveform map, the first mark referring to a blank audio segment;
deleting at least part of the blank audio segments in the second audio signal to obtain a target audio, including:
receiving a third input;
and in response to the third input, deleting at least part of the blank audio segment referred by the first mark to obtain target audio.
7. The method of claim 6, wherein the playing duration of the second audio signal is a first duration,
the deleting, in response to the third input, at least part of the blank audio segments referred to by the first mark to obtain the target audio comprises:
deleting a target blank audio clip in response to the third input, the target blank audio clip being a blank audio clip to which the at least part of the first mark refers;
and carrying out variable speed processing on audio signals except the target blank audio clip in the second audio signal to obtain a target audio, wherein the playing time of the target audio is equal to the first time.
8. An audio processing apparatus, comprising:
the collection module is used for collecting first audio signals through two microphones respectively;
the processing module is used for carrying out noise reduction processing on the first audio signal to generate a second audio signal;
the obtaining module is used for obtaining a blank audio clip in the second audio signal;
and the deleting module is used for deleting at least part of the blank audio clips in the second audio signal to obtain the target audio.
9. The apparatus according to claim 8, wherein the processing module specifically includes:
a difference module, configured to perform difference processing on the first audio signals to obtain a denoised audio signal;
and an amplifying module, configured to amplify the denoised audio signal to obtain the second audio signal.
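As a rough illustration of the difference-then-amplify processing in claim 9, the sketch below subtracts one microphone's samples from the other's and applies a gain. It assumes, beyond what the claim states, that both signals are sample-aligned lists of floats and that the noise is common to both microphones; the name `denoise_and_amplify` and the `gain` parameter are hypothetical.

```python
def denoise_and_amplify(mic_a, mic_b, gain=2.0):
    """Subtract mic_b from mic_a sample by sample, then scale by `gain`.

    Components present equally in both microphones (assumed to be ambient
    noise) cancel in the difference; what remains is amplified.
    """
    if len(mic_a) != len(mic_b):
        raise ValueError("both microphone signals must have the same length")
    return [gain * (a - b) for a, b in zip(mic_a, mic_b)]


# mic_a carries speech plus noise, mic_b carries only the same noise.
second_signal = denoise_and_amplify([1.25, 0.0, -0.875], [0.25, -0.5, 0.125])
```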
10. The apparatus according to claim 8 or 9, wherein the blank audio segment is a continuous audio segment of the second audio signal having an amplitude smaller than a preset threshold.
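The definition of a blank audio segment in claim 10 (a continuous run whose amplitude stays below a preset threshold) can be sketched as a single scan over the samples. The function names, the sample-index representation of segments, and the `min_len` parameter, which ignores very short quiet runs, are assumptions added for illustration.

```python
def find_blank_segments(samples, threshold, min_len=1):
    """Return (start, end) index pairs, end exclusive, of runs in which
    every sample's absolute amplitude is below `threshold`."""
    segments = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if start is None:
                start = i  # a quiet run begins here
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the signal.
    if start is not None and len(samples) - start >= min_len:
        segments.append((start, len(samples)))
    return segments


def delete_segments(samples, segments):
    """Return a copy of `samples` with the given sorted,
    non-overlapping (start, end) ranges removed."""
    out = []
    prev = 0
    for start, end in segments:
        out.extend(samples[prev:start])
        prev = end
    out.extend(samples[prev:])
    return out


signal = [0.5, 0.01, 0.02, 0.0, 0.6, 0.7, 0.01, 0.0]
blanks = find_blank_segments(signal, threshold=0.1, min_len=2)
target = delete_segments(signal, blanks)
```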
11. The apparatus according to any one of claims 8 to 10, wherein the deleting module specifically includes:
a first receiving submodule for receiving a first input;
the first obtaining sub-module is used for responding to the first input and obtaining the audio signal of the first user in the second audio signal; wherein the second audio signal comprises audio signals of at least two users, the first user being at least one of the at least two users;
the first deleting submodule is used for deleting a first blank audio clip in the blank audio clips to obtain the target audio;
wherein an audio segment between a first time and a second time of the second audio signal is the first blank audio segment, a time period from a preset duration before the first time to the first time is a first time period, a time period from the second time to a preset duration after the second time is a second time period, and the audio segments of the second audio signal in the first time period and the second time period come from the same one of the first users.
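The selection rule in claim 11 keeps only those blank segments whose surrounding audio, within a preset window on each side, comes from the same selected user. A minimal sketch follows, assuming that speaker attribution is already available as a list of `(start_s, end_s, speaker)` intervals; that representation, and the names `speaker_in` and `deletable_blanks`, are hypothetical and not described in the patent.

```python
def speaker_in(diarization, t0, t1):
    """Return the speaker whose interval fully covers [t0, t1], else None."""
    for start, end, speaker in diarization:
        if start <= t0 and t1 <= end:
            return speaker
    return None


def deletable_blanks(blanks, diarization, selected_users, preset):
    """Keep a blank (t1, t2) only if the `preset` seconds before t1 and the
    `preset` seconds after t2 belong to the same selected user."""
    result = []
    for t1, t2 in blanks:
        before = speaker_in(diarization, t1 - preset, t1)
        after = speaker_in(diarization, t2, t2 + preset)
        if before is not None and before == after and before in selected_users:
            result.append((t1, t2))
    return result


diarization = [(0.0, 4.0, "A"), (6.0, 10.0, "A"), (12.0, 15.0, "B")]
blanks = [(4.0, 6.0), (10.0, 12.0)]
chosen = deletable_blanks(blanks, diarization, {"A"}, preset=1.0)
```

In this example only the pause between the two stretches of user A's speech is selected; the pause before user B starts speaking is kept.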
12. The apparatus according to any one of claims 8 to 10, wherein the deleting module specifically includes:
a second receiving submodule for receiving a second input;
the second obtaining submodule is used for responding to the second input and obtaining the audio signal of a second user in the second audio signal; the second audio signal comprises audio signals of N users, the second users are M users in the N users, N is an integer larger than 2, and M is an integer larger than 1 and smaller than N;
the second deleting submodule is used for deleting a second blank audio clip in the blank audio clips to obtain the target audio;
wherein an audio segment between a third time and a fourth time of the second audio signal is the second blank audio segment, a time period from a preset duration before the third time to the third time is a third time period, a time period from the fourth time to a preset duration after the fourth time is a fourth time period, and the audio segments of the second audio signal in the third time period and the fourth time period come from the second users.
13. The apparatus of claim 8, wherein the acquisition module comprises a display submodule,
the display submodule is configured to display an original waveform of the second audio signal and to display a first mark on the original waveform, the first mark indicating a blank audio segment;
the deleting module comprises a third receiving submodule and a third deleting submodule,
the third receiving submodule is configured to receive a third input,
and the third deleting submodule is configured to delete, in response to the third input, at least part of the blank audio segments indicated by the first mark to obtain the target audio.
14. The apparatus of claim 13, wherein a playing duration of the second audio signal is a first duration,
the third deleting submodule is specifically configured to delete a target blank audio segment in response to the third input, the target blank audio segment being the at least part of the blank audio segments indicated by the first mark; and to perform variable-speed processing on the audio signals in the second audio signal other than the target blank audio segment to obtain the target audio, wherein a playing duration of the target audio is equal to the first duration.
15. A computer-readable storage medium, wherein a program is stored on the computer-readable storage medium, and the program, when executed by a processor, implements the steps of the audio processing method according to any one of claims 1 to 7.
CN202010327785.9A 2020-04-23 2020-04-23 Audio processing method and device Active CN111508531B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010327785.9A CN111508531B (en) 2020-04-23 2020-04-23 Audio processing method and device

Publications (2)

Publication Number Publication Date
CN111508531A (en) 2020-08-07
CN111508531B (en) 2023-07-07

Family

ID=71864198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010327785.9A Active CN111508531B (en) 2020-04-23 2020-04-23 Audio processing method and device

Country Status (1)

Country Link
CN (1) CN111508531B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104157301A (en) * 2014-07-25 2014-11-19 广州三星通信技术研究有限公司 Method, device and terminal deleting voice information blank segment
CN104602162A (en) * 2014-12-17 2015-05-06 惠州Tcl移动通信有限公司 External noise reduction device for mobile terminal and noise reduction method of external noise reduction device
CN204334562U (en) * 2015-01-15 2015-05-13 厦门市普星电子科技有限公司 A kind of digital handset with ambient noise inhibit feature
CN105845124A (en) * 2016-05-05 2016-08-10 北京小米移动软件有限公司 Audio processing method and device
CN106790882A (en) * 2016-12-29 2017-05-31 贵州财富之舟科技有限公司 Communication Dolby circuit and noise-reduction method
CN106935253A (en) * 2017-03-10 2017-07-07 北京奇虎科技有限公司 The method of cutting out of audio file, device and terminal device
CN107230478A (en) * 2017-05-03 2017-10-03 上海斐讯数据通信技术有限公司 A kind of voice information processing method and system
CN107734426A (en) * 2017-08-28 2018-02-23 深圳市金立通信设备有限公司 Acoustic signal processing method, terminal and computer-readable recording medium
CN109119091A (en) * 2018-08-17 2019-01-01 西安蜂语信息科技有限公司 Voice communication noise-reduction method and device
CN110992989A (en) * 2019-12-06 2020-04-10 广州国音智能科技有限公司 Voice acquisition method and device and computer readable storage medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112423019A (en) * 2020-11-17 2021-02-26 北京达佳互联信息技术有限公司 Method and device for adjusting audio playing speed, electronic equipment and storage medium
CN112423019B (en) * 2020-11-17 2022-11-22 北京达佳互联信息技术有限公司 Method and device for adjusting audio playing speed, electronic equipment and storage medium
CN112509609A (en) * 2020-12-16 2021-03-16 北京乐学帮网络技术有限公司 Audio processing method and device, electronic equipment and storage medium
CN112509609B (en) * 2020-12-16 2022-06-10 北京乐学帮网络技术有限公司 Audio processing method and device, electronic equipment and storage medium
CN112995755A (en) * 2021-03-01 2021-06-18 合肥学院 Automatic editing method for screen recording
CN113038338A (en) * 2021-03-22 2021-06-25 联想(北京)有限公司 Noise reduction processing method and device
CN114005469A (en) * 2021-10-20 2022-02-01 广州市网星信息技术有限公司 Audio playing method and system capable of automatically skipping mute segment

Also Published As

Publication number Publication date
CN111508531B (en) 2023-07-07

Similar Documents

Publication Publication Date Title
CN111508531B (en) Audio processing method and device
CN105872253B (en) Live broadcast sound processing method and mobile terminal
CN107509153B (en) Detection method and device of sound playing device, storage medium and terminal
US20210217433A1 (en) Voice processing method and apparatus, and device
CN109257498B (en) Sound processing method and mobile terminal
CN111370018B (en) Audio data processing method, electronic device and medium
CN106935253A (en) The method of cutting out of audio file, device and terminal device
CN107864353B (en) A kind of video recording method and mobile terminal
CN110097872B (en) Audio processing method and electronic equipment
CN108174236A (en) A kind of media file processing method, server and mobile terminal
CN107734426A (en) Acoustic signal processing method, terminal and computer-readable recording medium
CN104092809A (en) Communication sound recording method and recorded communication sound playing method and device
CN107888965A (en) Image present methods of exhibiting and device, terminal, system, storage medium
CN111445901A (en) Audio data acquisition method and device, electronic equipment and storage medium
CN109040444B (en) Call recording method, terminal and computer readable storage medium
CN109545246A (en) A kind of sound processing method and terminal device
CN113676668A (en) Video shooting method and device, electronic equipment and readable storage medium
CN110830368A (en) Instant messaging message sending method and electronic equipment
CN108763475B (en) Recording method, recording device and terminal equipment
CN110995921A (en) Call processing method, electronic device and computer readable storage medium
CN110111795B (en) Voice processing method and terminal equipment
CN107391076A (en) Audio evaluation display method and device
CN108989554B (en) Information processing method and terminal
CN107809692B (en) A kind of headset control method, equipment and computer readable storage medium
WO2020118560A1 (en) Recording method and apparatus, electronic device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant