CN105845124B - Audio processing method and device - Google Patents
- Publication number
- CN105845124B
- Authority
- CN
- China
- Prior art keywords
- audio
- file
- audio file
- original
- played
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
- G06F16/162—Delete operations
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/22—Means responsive to presence or absence of recorded information signals
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management Or Editing Of Information On Record Carriers (AREA)
Abstract
The disclosure provides an audio processing method and device, and belongs to the technical field of terminals. The method comprises the following steps: when a blank segment deletion triggering operation is detected, analyzing an original audio file to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information; and deleting the at least one audio blank segment from the original audio file to obtain an audio file to be played. Because the audio blank segments found by the analysis are deleted, the audio file to be played includes only the audio segments carrying audio information and none of the audio blank segments, so that less time is needed to play the file and playing efficiency is improved.
Description
Technical Field
The present disclosure relates to the field of terminal technologies, and in particular, to an audio processing method and apparatus.
Background
With the development of science and technology, intelligent terminals that integrate multiple functions have become practical tools in users' daily lives. For example, a user may use the recording function of an intelligent terminal to record a meeting or an interview, and later review its content by playing the recorded audio file.
Generally, in the process of recording audio, an intelligent terminal collects all sounds in the surrounding environment through a microphone, and generates and stores a corresponding audio file when recording is finished. Through the audio file, the user can obtain all sound captured during recording, including segments that carry audio information and audio blank segments that carry none. An audio blank segment is a silent segment or a segment containing only ambient noise.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides an audio processing method and apparatus, where the technical scheme is as follows:
according to a first aspect of embodiments of the present disclosure, there is provided an audio processing method, the method comprising:
when detecting a blank segment deletion triggering operation, analyzing an original audio file to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
and deleting the at least one audio blank segment in the original audio file to obtain an audio file to be played.
Optionally, the analyzing the original audio file to obtain at least one audio blank segment includes:
extracting audio features of each audio frame contained in the original audio file;
acquiring continuous audio frames which do not contain audio information based on the audio features of each audio frame, wherein the continuous audio frames are continuous in time and the total duration of the continuous audio frames exceeds a preset threshold;
and determining the audio segment indicated by the continuous audio frames as an audio blank segment.
Optionally, the deleting the at least one audio blank segment from the original audio file to obtain an audio file to be played includes:
segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
and deleting the designated sub-audio files from the plurality of sub-audio files, and combining the remaining sub-audio files in time order to obtain the audio file to be played, wherein a designated sub-audio file is a sub-audio file comprising an audio blank segment.
Optionally, before analyzing the original audio file, the method further includes:
displaying a blank segment delete button of the original audio file;
and when the clicking operation of the blank segment deleting button is detected, determining that the blank segment deleting triggering operation is detected, and executing the step of analyzing the original audio file.
Optionally, after the at least one audio blank segment is deleted from the original audio file to obtain an audio file to be played, the method further includes:
and adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
Optionally, before analyzing the original audio file, the method further includes:
backing up the original audio file to obtain a backup file of the original audio file;
after the original audio file is analyzed to obtain the audio file to be played, the method further comprises:
displaying deletion prompt information, wherein the deletion prompt information is used for prompting a user whether to delete the backup file;
and when the deletion confirmation operation of the backup file is detected, deleting the backup file.
According to a second aspect of embodiments of the present disclosure, there is provided an audio processing apparatus, the apparatus comprising:
the analysis module is used for analyzing the original audio file to obtain at least one audio blank segment after detecting blank segment deletion triggering operation, wherein the at least one audio blank segment is an audio segment not containing audio information;
and the deleting module is used for deleting the at least one audio blank segment in the original audio file to obtain an audio file to be played.
Optionally, the analysis module is configured to perform audio feature extraction on each audio frame included in the original audio file; acquiring continuous audio frames which do not contain audio information based on the audio features of each audio frame, wherein the continuous audio frames are continuous in time and the total duration of the continuous audio frames exceeds a preset threshold; and determining the audio segment indicated by the continuous audio frames as an audio blank segment.
Optionally, the deleting module is configured to segment the original audio file based on a start time point and an end time point indicated by each audio blank segment to obtain a plurality of sub audio files; and to delete the designated sub-audio files from the plurality of sub-audio files and combine the remaining sub-audio files in time order to obtain the audio file to be played, wherein a designated sub-audio file is a sub-audio file comprising an audio blank segment.
Optionally, the apparatus further comprises:
the display module is used for displaying a blank segment deleting button of the original audio file; and when the clicking operation of the blank segment deleting button is detected, determining that the blank segment deleting triggering operation is detected, and executing the step of analyzing the original audio file.
Optionally, the apparatus further comprises:
and the adding module is used for adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
Optionally, the apparatus further comprises:
the backup module is used for backing up the original audio file to obtain a backup file of the original audio file;
the display module is further used for displaying deletion prompt information, and the deletion prompt information is used for prompting a user whether to delete the backup file;
the deleting module is further configured to delete the backup file when a deletion confirmation operation of the backup file is detected.
According to a third aspect of the embodiments of the present disclosure, there is provided an audio processing apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: when detecting a blank segment deletion triggering operation, analyzing an original audio file to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information; and deleting the at least one audio blank segment in the original audio file to obtain an audio file to be played.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by the user only includes the audio segments carrying the audio information, but does not include the audio blank segments not carrying any useful information, the playing time consumption can be reduced when the audio file to be played is played subsequently, and the playing efficiency can be remarkably improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow diagram illustrating an audio processing method according to an example embodiment.
Fig. 2 is a flow diagram illustrating an audio processing method according to an example embodiment.
Fig. 3 is a diagram illustrating a comparison of an original audio file with an audio file to be played, according to an example embodiment.
Fig. 4 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 5 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 6 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 7 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 8 is a block diagram illustrating an audio processing device according to an example embodiment.
Detailed Description
To make the objects, technical solutions and advantages of the present disclosure more apparent, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating an audio processing method according to an exemplary embodiment, which is used in a terminal, as shown in fig. 1, and includes the steps of:
in step 101, after detecting a blank segment deletion trigger operation, analyzing an original audio file to obtain at least one audio blank segment, where the at least one audio blank segment is an audio segment that does not include audio information.
In step 102, at least one audio blank segment is deleted from the original audio file to obtain an audio file to be played.
According to the method provided by the embodiment of the disclosure, after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by the user only includes the audio segments carrying useful information, but does not include the audio blank segments not carrying any useful information, and thus the playing time consumption can be reduced when the audio file to be played is played subsequently, and the playing efficiency can be remarkably improved.
Optionally, analyzing the original audio file to obtain at least one audio blank segment includes:
extracting audio features of each audio frame contained in the original audio file;
acquiring continuous audio frames which do not contain audio information based on the audio features of each audio frame, wherein the continuous audio frames are a plurality of audio frames which are continuous in time and the total duration of which exceeds a preset threshold;
an audio segment indicated by consecutive audio frames is determined as an audio blank segment.
Optionally, deleting at least one audio blank segment from the original audio file to obtain the audio file to be played includes:
segmenting an original audio file based on a starting time point and an ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
and deleting the designated sub-audio files from the plurality of sub-audio files, and combining the remaining sub-audio files in time order to obtain the audio file to be played, wherein a designated sub-audio file is a sub-audio file comprising an audio blank segment.
Optionally, before analyzing the original audio file, the method further includes:
displaying a blank segment delete button of the original audio file;
and when the clicking operation of the blank segment deleting button is detected, determining that the blank segment deleting triggering operation is detected, and executing the step of analyzing the original audio file.
Optionally, after deleting at least one audio blank segment from the original audio file to obtain an audio file to be played, the method further includes:
and adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file without audio blank segments.
Optionally, before analyzing the original audio file, the method further includes:
backing up the original audio file to obtain a backup file of the original audio file;
after analyzing the original audio file and obtaining the audio file to be played, the method further comprises the following steps:
displaying deletion prompt information, wherein the deletion prompt information is used for prompting a user whether to delete the backup file;
and when the deletion confirmation operation of the backup file is detected, deleting the backup file.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 2 is a flowchart illustrating an audio processing method according to an exemplary embodiment, which is used in a terminal, as shown in fig. 2, and includes the steps of:
in step 201, after detecting a blank segment deletion trigger operation, analyzing an original audio file to obtain at least one audio blank segment.
Wherein at least one audio blank segment is an audio segment that does not contain audio information. The audio information may include human voice, animal sound, natural sound, machine sound, and the like, which are not particularly limited in this disclosure.
In the embodiment of the disclosure, after the terminal finishes audio recording, it generates a recorded original audio file and adds the original audio file to a recording file list. The recording file list may include a plurality of original audio files, each of which may contain at least one audio blank segment. For example, when the terminal records a user's voice, at some moments a person is speaking in the surrounding environment and at others no one is; while no one is speaking, the terminal records an audio blank segment. Audio blank segments include silent segments, ambient noise segments, and other audio segments that do not contain audio information.
After detecting the blank segment deletion triggering operation, the terminal may analyze the original audio file to determine whether it contains audio blank segments and where they are located in the original audio file, so that they can subsequently be deleted. Specifically, for any original audio file, the terminal can display a blank segment deleting button set for that file; when the terminal detects the user's clicking operation on the blank segment deleting button, it determines that the blank segment deletion triggering operation is detected and analyzes the original audio file, wherein the analyzing process can be as follows:
extracting audio features of each audio frame contained in the original audio file; based on the audio features of each audio frame, continuous audio frames without audio information are obtained, and an audio segment indicated by the continuous audio frames is determined as an audio blank segment. The continuous audio frames are a plurality of audio frames which are continuous in time and the total duration of which exceeds a preset threshold value. The preset threshold may be preset by a user or preset by a terminal, which is not specifically limited in this disclosure.
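The per-frame classification described above can be sketched as follows. The disclosure does not name a specific audio feature, so this minimal sketch assumes root-mean-square (RMS) energy as the feature and a fixed energy threshold as the test for "does not contain audio information"; the names `frame_rms`, `is_blank`, `frame_size` and `energy_threshold` are hypothetical.

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[float], frame_size: int) -> List[float]:
    """Split samples into fixed-size frames and return one RMS value per frame.

    Assumption: samples are PCM values normalized to [-1.0, 1.0].
    The last frame may be shorter than frame_size.
    """
    features = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        features.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return features


def is_blank(features: Sequence[float], energy_threshold: float) -> List[bool]:
    """Mark a frame as blank when its RMS energy is below the threshold.

    Assumption: low energy stands in for "no audio information".
    """
    return [f < energy_threshold for f in features]


# Two frames of four samples: a loud frame followed by a near-silent one.
samples = [0.5, -0.5, 0.5, -0.5, 0.0, 0.01, -0.01, 0.0]
flags = is_blank(frame_rms(samples, frame_size=4), energy_threshold=0.05)
```

An energy threshold alone would not recognize an ambient noise segment as blank if the noise is loud; a real implementation would likely need additional features for that case.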
Specifically, when the terminal detects a first audio frame which does not contain audio information, it records a first time point corresponding to that frame and continues to examine the audio frames after the first time point. When the terminal then detects the first audio frame containing audio information after the first time point, it takes the audio frame immediately before it, which does not contain audio information, as a second audio frame and records a second time point corresponding to the second audio frame. If the duration between the first time point and the second time point exceeds the preset threshold, the audio frames between the two time points are determined to be continuous audio frames without audio information, and the audio segment they indicate is determined to be an audio blank segment.
It should be noted that each audio frame in the original audio file corresponds to a time point, which indicates the position of the audio frame in the original audio file. An original audio file may include a plurality of groups of continuous audio frames without audio information; the first time point and the second time point corresponding to each group are the start time point and the end time point of one audio blank segment.
It should be noted that, in order to improve the efficiency of analyzing the original audio file while keeping the analysis result sufficiently accurate, the original audio file may be analyzed in units of a preset frame length. That is, one audio frame is selected from the audio frames covered by each preset frame length and analyzed. The preset frame length can be preset by the terminal according to its analysis capability, and may be, for example, 2 frames, 3 frames, or 5 frames. If the preset frame length is 3 frames, the terminal takes one audio frame out of every 3 audio frames for analysis, and if that audio frame does not contain audio information, all audio frames covered by the preset frame length are determined not to contain audio information.
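The scan for the first and second time points can be sketched as a single pass over per-frame flags. This is an illustrative sketch only: `find_blank_segments`, `frame_duration` and `min_duration` are hypothetical names, every frame is assumed to have the same duration, and the second time point is taken to be the end of the last blank frame in a run.

```python
from typing import List, Sequence, Tuple


def find_blank_segments(
    blank_flags: Sequence[bool],
    frame_duration: float,
    min_duration: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) times of blank runs longer than min_duration."""
    segments = []
    run_start = None
    # A trailing False sentinel closes a run that reaches the end of the file.
    for i, blank in enumerate(list(blank_flags) + [False]):
        if blank and run_start is None:
            run_start = i  # first frame without audio information
        elif not blank and run_start is not None:
            start_time = run_start * frame_duration  # first time point
            end_time = i * frame_duration            # second time point
            # "Exceeds a preset threshold" is read as a strict comparison.
            if end_time - start_time > min_duration:
                segments.append((start_time, end_time))
            run_start = None
    return segments


flags = [False, True, True, True, False, True, False]
segments = find_blank_segments(flags, frame_duration=1.0, min_duration=2.0)
```

In this example the three-frame run is reported as a blank segment, while the single blank frame near the end is too short and is ignored.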
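The sampling scheme above can be sketched as follows. The text does not say which frame of each group is analyzed, so this sketch assumes the first one; `classify_with_stride`, `has_audio_info` and `preset_frame_length` are hypothetical names.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def classify_with_stride(
    frames: Sequence[Frame],
    has_audio_info: Callable[[Frame], bool],
    preset_frame_length: int,
) -> List[bool]:
    """Return one blank flag per frame while analyzing one frame per group."""
    blank_flags: List[bool] = []
    for start in range(0, len(frames), preset_frame_length):
        group = frames[start:start + preset_frame_length]
        # Assumption: the first frame of each group is the one analyzed.
        blank = not has_audio_info(group[0])
        # The result for the analyzed frame is applied to the whole group.
        blank_flags.extend([blank] * len(group))
    return blank_flags


# Frames are represented here by a single energy value each.
energies = [0.0, 0.9, 0.0, 0.8, 0.0, 0.0, 0.0]
flags = classify_with_stride(energies, lambda e: e > 0.1, preset_frame_length=3)
```

With a preset frame length of 3, only three of the seven frames are analyzed. The second frame (energy 0.9) is marked blank because the sampled first frame of its group is silent, which illustrates the accuracy that is traded for speed.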
In step 202, at least one audio blank segment is deleted from the original audio file, so as to obtain an audio file to be played.
According to the analysis process of the original audio file in step 201, the first time point and the second time point corresponding to the continuous audio frames are the start time point and the end time point of the audio blank segment. The process of deleting at least one audio blank segment from the original audio file may then be: segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files; and deleting the designated sub-audio files from the plurality of sub-audio files, and combining the remaining sub-audio files in time order to obtain the audio file to be played, wherein a designated sub-audio file is a sub-audio file comprising an audio blank segment.
For example, suppose that analyzing an original audio file with a duration of 5 minutes yields two audio blank segments: audio blank segment A, with a start time point of 1 minute and an end time point of 1 minute 30 seconds, and audio blank segment B, with a start time point of 3 minutes and an end time point of 4 minutes. Fig. 3 is a schematic diagram illustrating a comparison between the original audio file and the audio file to be played. Audio files are represented by rectangles in fig. 3, and the positions of the two audio blank segments in the original audio file can be found from the start time point and the end time point indicated by each of them, as shown by the hatched areas of the original audio file in fig. 3. The original audio file is then segmented according to the positions of the audio blank segments to obtain 5 sub audio files, shown as sub audio file 1 to sub audio file 5 in fig. 3. For convenience of representation, "sub audio file 1" is represented by "sub 1" in fig. 3. The 5 sub audio files include two designated sub audio files, that is, 2 sub audio files that include audio blank segments, shown as sub audio file 2 and sub audio file 4 in fig. 3. After sub audio file 2 and sub audio file 4 are deleted, sub audio file 1, sub audio file 3 and sub audio file 5 are combined in time order to obtain the audio file to be played, which is 3 minutes 30 seconds long.
In another embodiment, in order to meet the user's requirement on the sound quality of the audio file to be played, the terminal may further generate the audio file to be played according to a preset sound quality. That is, after the designated sub-audio files are deleted from the plurality of sub-audio files, the remaining sub-audio files are combined in time order at the preset sound quality to obtain an audio file to be played that matches the preset sound quality. Specifically, the terminal may provide a plurality of sound quality options for the audio file to be played, such as high, medium, and low sound quality, which is not particularly limited by the embodiments of the present disclosure. When the terminal detects a selection operation on any sound quality option, it determines the sound quality selected by the user as the preset sound quality.
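The arithmetic of this example can be checked with a short sketch that works on time boundaries only, not on audio data. The function name `split_at_blank_segments` is hypothetical, times are in whole seconds, and the blank segments are assumed not to overlap.

```python
from typing import List, Sequence, Tuple


def split_at_blank_segments(
    total: int, blanks: Sequence[Tuple[int, int]]
) -> List[Tuple[int, int, bool]]:
    """Return (start, end, is_blank) pieces that together cover [0, total]."""
    pieces = []
    cursor = 0
    for start, end in sorted(blanks):
        if start > cursor:
            pieces.append((cursor, start, False))  # piece carrying audio information
        pieces.append((start, end, True))          # designated (blank) piece
        cursor = end
    if cursor < total:
        pieces.append((cursor, total, False))
    return pieces


# 5-minute file; blank segment A = 1:00 to 1:30, blank segment B = 3:00 to 4:00.
pieces = split_at_blank_segments(300, [(60, 90), (180, 240)])
kept = [(s, e) for s, e, blank in pieces if not blank]
remaining = sum(e - s for s, e in kept)
```

Here `pieces` has five entries matching sub audio files 1 to 5, `kept` holds sub audio files 1, 3 and 5, and `remaining` is 210 seconds, that is, 3 minutes 30 seconds.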
In another embodiment, before analyzing the original audio file, the original audio file may be backed up to obtain a backup file of the original audio file. That is, after detecting the blank segment deletion triggering operation, the original audio file is backed up, and then the step of analyzing the original audio file is executed. By backing up the original audio file, the user can freely control how to process the original audio file, and the processing flexibility is improved. In addition, in order to avoid the situation that the backup files occupy excessive storage space, the terminal can display deletion prompt information after the original audio files are processed to obtain the files to be played; and when the deletion confirmation operation of the backup file is detected, deleting the backup file. The deletion prompting information is used for prompting a user whether to delete the backup file.
It should be noted that whether to backup the original audio file before analyzing the original audio file may be preset by the user. Specifically, in a setting page of the audio file, a backup option is displayed, and the user can perform an on or off operation on the backup option. When the backup option is detected to be in an on state, executing the step of backing up before analyzing.
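The backup flow described above can be sketched as a wrapper around the processing step. This is a sketch under stated assumptions: `process_with_backup`, the `.bak` suffix, and the `process` and `confirm_delete` callbacks (standing in for the blank segment removal and the deletion prompt) are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def process_with_backup(
    original: Path,
    process: Callable[[Path], Path],
    confirm_delete: Callable[[Path], bool],
    backup_enabled: bool = True,
) -> Path:
    """Back up the original, process it, then offer to delete the backup."""
    backup = None
    if backup_enabled:  # mirrors the backup option on the settings page
        backup = original.with_name(original.name + ".bak")
        shutil.copy2(original, backup)
    result = process(original)
    # Deletion prompt: remove the backup only after the user confirms.
    if backup is not None and confirm_delete(backup):
        backup.unlink()
    return result


def fake_process(path: Path) -> Path:
    """Placeholder for blank segment removal: writes a shortened copy."""
    out = path.with_name("meeting_trimmed.wav")
    out.write_bytes(path.read_bytes()[:4])
    return out


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "meeting.wav"
    original.write_bytes(b"fake audio bytes")
    result = process_with_backup(original, fake_process, confirm_delete=lambda p: True)
    result_exists = result.exists()
    backup_still_exists = (Path(tmp) / "meeting.wav.bak").exists()
```

Passing a `confirm_delete` callback that returns `False` leaves the backup in place, which corresponds to the user declining the deletion prompt.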
In step 203, mark information is added to the audio file to be played.
In order to facilitate the user to distinguish the original audio file from the audio file to be played, after the audio file to be played is obtained, mark information may be added to the audio file to be played, where the mark information is used to indicate that the audio file to be played is an audio file that does not include an audio blank segment.
When the terminal displays audio files, the audio file to be played that carries the mark information can be displayed distinctively. For example, a preset tag is displayed in the entry of the audio file to be played, and the content and style of the preset tag may be preset. Alternatively, the terminal may store the audio file to be played, which carries the mark information, and the original audio file in different folders; when the terminal detects an opening operation on the folder in which the audio file to be played is located, only the audio files to be played are displayed.
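One way to picture the mark information is as a boolean flag on each list entry. The sketch below is illustrative only; the `AudioEntry` type, the `blank_removed` field and the `[trimmed]` tag text are hypothetical and not specified by the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioEntry:
    name: str
    blank_removed: bool = False  # hypothetical representation of the mark information


def display_label(entry: AudioEntry, tag: str = "[trimmed]") -> str:
    """Show a preset tag next to files that no longer contain blank segments."""
    return f"{entry.name} {tag}" if entry.blank_removed else entry.name


def processed_folder_view(entries: List[AudioEntry]) -> List[AudioEntry]:
    """Folder-based alternative: list only the audio files to be played."""
    return [e for e in entries if e.blank_removed]


entries = [
    AudioEntry("meeting.wav"),
    AudioEntry("meeting_trimmed.wav", blank_removed=True),
]
labels = [display_label(e) for e in entries]
visible = [e.name for e in processed_folder_view(entries)]
```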
In step 204, when a playing operation of the audio file to be played is detected, the audio file to be played is played.
In the embodiment of the present disclosure, the terminal may associate the original audio file with the corresponding audio file to be played. When the terminal detects a playing operation on an original audio file, it detects whether an audio file to be played that is associated with the original audio file exists; if so, it displays associated playing prompt information, where the associated playing prompt information is used to prompt the user to play the audio file to be played corresponding to the original audio file. The content of the associated playing prompt information may be, for example, "This file has been processed. Play the processed file?", which is not specifically limited in the embodiment of the present disclosure. When the terminal detects a confirmation operation for the associated playing, it acquires the audio file to be played corresponding to the original audio file and plays the audio file to be played.
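The associated-playing lookup in step 204 can be sketched as below. This is an illustrative sketch under stated assumptions: the association is modeled as a plain dictionary from original file name to processed file name, `confirm` is a hypothetical callback standing in for the prompt dialog, and falling back to the original file when the user declines is an assumption, since the text only describes the confirmed case.

```python
from typing import Callable, Dict


def choose_file_to_play(
    original: str,
    processed_by_original: Dict[str, str],
    confirm: Callable[[str], bool],
) -> str:
    """Return the file that should be played for a play request on `original`.

    `processed_by_original` models the association between original files and
    their audio files to be played (assumed data structure).
    """
    processed = processed_by_original.get(original)
    if processed is None:
        # No associated processed file exists, so play the original directly.
        return original
    prompt = "This file has been processed. Play the processed file?"
    # Assumption: if the user declines, the original file is played instead.
    return processed if confirm(prompt) else original
```

For example, `choose_file_to_play("a.wav", {"a.wav": "a_trimmed.wav"}, lambda _: True)` selects the processed file, while a request for a file with no association returns the original without prompting.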
According to the method provided by the embodiment of the disclosure, after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by the user only includes the audio segments carrying the audio information, but does not include the audio blank segments not carrying any useful information, the playing time is reduced, and the playing efficiency is improved.
Fig. 4 is a block diagram illustrating an audio processing device according to an example embodiment. Referring to fig. 4, the apparatus includes an analysis module 401 and a deletion module 402.
The analysis module 401 is connected to the deletion module 402 and is configured to analyze, after a blank segment deletion trigger operation is detected, an original audio file to obtain at least one audio blank segment, where the at least one audio blank segment is an audio segment that does not contain audio information. The deletion module 402 is configured to delete the at least one audio blank segment from the original audio file to obtain an audio file to be played.
Optionally, the analysis module 401 is configured to: perform audio feature extraction on each audio frame included in the original audio file; acquire, based on the audio features of each audio frame, continuous audio frames that do not contain audio information, where the continuous audio frames are a plurality of audio frames that are continuous in time and whose total duration exceeds a preset threshold; and determine the audio segment indicated by the continuous audio frames as an audio blank segment.
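The frame-based analysis above can be sketched as follows. This is a hedged illustration, not the method claimed: the text does not name a specific audio feature, so root-mean-square (RMS) energy against an `energy_threshold` is used here as an assumed stand-in, and the names `frame_ms` and `min_blank_ms` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def frame_is_blank(frame: Sequence[float], energy_threshold: float) -> bool:
    """Treat a frame as carrying no audio information if its RMS energy is low.

    RMS energy is an assumed feature; the source does not specify one.
    """
    if not frame:
        return True
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms < energy_threshold


def find_blank_segments(
    frames: Sequence[Sequence[float]],
    frame_ms: int,
    energy_threshold: float,
    min_blank_ms: int,
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) for each run of consecutive blank frames
    whose total duration exceeds `min_blank_ms` (the preset threshold)."""
    segments: List[Tuple[int, int]] = []
    run_start = None  # index of the first frame in the current blank run
    for i, frame in enumerate(frames):
        if frame_is_blank(frame, energy_threshold):
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if (i - run_start) * frame_ms > min_blank_ms:
                segments.append((run_start * frame_ms, i * frame_ms))
            run_start = None
    # Close a blank run that extends to the end of the file.
    if run_start is not None and (len(frames) - run_start) * frame_ms > min_blank_ms:
        segments.append((run_start * frame_ms, len(frames) * frame_ms))
    return segments
```

Short pauses below the threshold are deliberately left in place, which matches the requirement that only runs whose total duration exceeds the preset threshold count as audio blank segments.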
Optionally, the deletion module 402 is configured to: segment the original audio file based on the start time point and the end time point indicated by each audio blank segment to obtain a plurality of sub audio files; delete a designated sub audio file from the plurality of sub audio files, where the designated sub audio file is a file comprising an audio blank segment; and combine the remaining sub audio files in time order to obtain the audio file to be played.
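The segment, delete, and merge steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the audio is modeled as an in-memory list of samples rather than actual sub audio files, blank segments are given as non-overlapping (start, end) sample indices rather than time points, and both function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Piece = Tuple[bool, List[float]]  # (is_blank, samples of this sub-piece)


def split_at_blank_segments(
    samples: Sequence[float],
    blank_segments: Sequence[Tuple[int, int]],
) -> List[Piece]:
    """Cut the audio at every blank segment boundary, in time order.

    Assumes the segments do not overlap and lie within the sample range.
    """
    pieces: List[Piece] = []
    cursor = 0
    for start, end in sorted(blank_segments):
        if start > cursor:
            pieces.append((False, list(samples[cursor:start])))
        pieces.append((True, list(samples[start:end])))
        cursor = end
    if cursor < len(samples):
        pieces.append((False, list(samples[cursor:])))
    return pieces


def merge_non_blank(pieces: Sequence[Piece]) -> List[float]:
    """Drop the pieces marked blank and concatenate the rest in time order."""
    merged: List[float] = []
    for is_blank, chunk in pieces:
        if not is_blank:
            merged.extend(chunk)
    return merged
```

Splitting first and filtering afterwards keeps the two stages separate, as in the text: the designated (blank) pieces are identified at split time and removed before the remainder is combined.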
Optionally, referring to fig. 5, the apparatus further comprises:
a display module 403, configured to display a blank segment delete button of the original audio file; and when the clicking operation of the blank segment deleting button is detected, determining that the blank segment deleting triggering operation is detected, and executing the step of analyzing the original audio file.
Optionally, referring to fig. 6, the apparatus further comprises:
an adding module 404, configured to add mark information to the audio file to be played, where the mark information is used to indicate that the audio file to be played is an audio file that does not contain an audio blank segment.
Optionally, referring to fig. 7, the apparatus further comprises:
a backup module 405, configured to back up an original audio file to obtain a backup file of the original audio file;
the display module 403 is further configured to display a deletion prompt message, where the deletion prompt message is used to prompt a user whether to delete the backup file;
the deletion module 402 is further configured to delete the backup file when a deletion confirmation operation for the backup file is detected.
According to the device provided by the embodiment of the disclosure, after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by a user only includes the audio segments carrying audio information, but does not include the audio blank segments not carrying any useful information, and therefore, the playing time can be reduced when the audio file to be played is played subsequently, and the playing efficiency can be remarkably improved.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 8 is a block diagram illustrating an audio processing device according to an example embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 8, the apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors that provide status assessments of various aspects of the device 800. For example, the sensor component 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor component 814 may also detect a change in the position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described audio processing methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium having instructions therein, which when executed by a processor of a mobile terminal, enable the mobile terminal to perform the above-described audio processing method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
Claims (8)
1. A method of audio processing, the method comprising:
generating recorded original audio files, and displaying blank segment deleting buttons set for the original audio files, wherein each original audio file is provided with a corresponding blank segment deleting button;
when the clicking operation of a blank segment deleting button of the original audio file is detected, determining that the blank segment deleting triggering operation is detected, and backing up the original audio file to obtain a backup file of the original audio file;
selecting one audio frame from at least two audio frames contained in a preset frame length as a target frame for analysis;
when the target frame does not contain the audio information, determining that at least two audio frames contained in the preset frame length do not contain the audio information;
determining audio segments indicated by at least two audio frames contained in the preset frame length as audio blank segments to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
deleting a designated sub-audio file from the plurality of sub-audio files, wherein the designated sub-audio file is a file comprising audio blank segments;
according to preset tone quality, combining the rest sub-audio files according to a time sequence to obtain an audio file to be played, which is matched with the preset tone quality;
when the playing operation of the original audio file is detected, detecting whether an audio file to be played related to the original audio file exists or not;
if the audio file exists, displaying associated playing prompt information, wherein the associated playing prompt information is used for prompting a user to play the audio file to be played corresponding to the original audio file;
and when the confirmation operation of the associated playing is detected, acquiring the audio file to be played corresponding to the original audio file, and playing the audio file to be played.
2. The method according to claim 1, wherein after obtaining the audio file to be played that matches the preset sound quality, the method further comprises:
and adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
3. The method of claim 1, wherein after analyzing the original audio file to obtain an audio file to be played, the method further comprises:
displaying deletion prompt information, wherein the deletion prompt information is used for prompting a user whether to delete the backup file;
and when the deletion confirmation operation of the backup file is detected, deleting the backup file.
4. An audio processing apparatus, characterized in that the apparatus comprises:
the display module is used for generating recorded original audio files and displaying blank segment deleting buttons set for the original audio files, and each original audio file is provided with a corresponding blank segment deleting button;
the analysis module is used for determining that blank segment deletion triggering operation is detected when the clicking operation of a blank segment deletion button of the original audio file is detected;
the backup module is used for backing up the original audio file to obtain a backup file of the original audio file;
the analysis module is further configured to select one audio frame from at least two audio frames included in a preset frame length as a target frame for analysis; when the target frame does not contain the audio information, determining that at least two audio frames contained in the preset frame length do not contain the audio information; determining audio segments indicated by at least two audio frames contained in the preset frame length as audio blank segments to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
the deleting module is used for segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files; deleting the appointed sub-audio files from the plurality of sub-audio files, and combining the rest sub-audio files according to a preset tone quality and a time sequence to obtain audio files to be played, wherein the appointed sub-audio files are files comprising audio blank segments;
the apparatus is further configured to: when the playing operation of the original audio file is detected, detecting whether an audio file to be played related to the original audio file exists or not; if the audio file exists, displaying associated playing prompt information, wherein the associated playing prompt information is used for prompting a user to play the audio file to be played corresponding to the original audio file; and when the confirmation operation of the associated playing is detected, acquiring the audio file to be played corresponding to the original audio file, and playing the audio file to be played.
5. The apparatus of claim 4, further comprising:
and the adding module is used for adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
6. The apparatus of claim 4,
the display module is further used for displaying deletion prompt information, and the deletion prompt information is used for prompting a user whether to delete the backup file;
the deleting module is further configured to delete the backup file when a deletion confirmation operation of the backup file is detected.
7. An audio processing apparatus, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
generating recorded original audio files, and displaying blank segment deleting buttons set for the original audio files, wherein each original audio file is provided with a corresponding blank segment deleting button;
when the clicking operation of a blank segment deleting button of the original audio file is detected, determining that the blank segment deleting triggering operation is detected, and backing up the original audio file to obtain a backup file of the original audio file;
selecting one audio frame from at least two audio frames contained in a preset frame length as a target frame for analysis;
when the target frame does not contain the audio information, determining that at least two audio frames contained in the preset frame length do not contain the audio information;
determining audio segments indicated by at least two audio frames contained in the preset frame length as audio blank segments to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
deleting a designated sub-audio file from the plurality of sub-audio files, wherein the designated sub-audio file is a file comprising audio blank segments;
according to preset tone quality, combining the rest sub-audio files according to a time sequence to obtain an audio file to be played, which is matched with the preset tone quality;
when the playing operation of the original audio file is detected, detecting whether an audio file to be played related to the original audio file exists or not;
if the audio file exists, displaying associated playing prompt information, wherein the associated playing prompt information is used for prompting a user to play the audio file to be played corresponding to the original audio file;
and when the confirmation operation of the associated playing is detected, acquiring the audio file to be played corresponding to the original audio file, and playing the audio file to be played.
8. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the steps of the method of any of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610291319.3A CN105845124B (en) | 2016-05-05 | 2016-05-05 | Audio processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105845124A (en) | 2016-08-10 |
CN105845124B (en) | 2020-06-19 |
Family
ID=56591052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610291319.3A Active CN105845124B (en) | 2016-05-05 | 2016-05-05 | Audio processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105845124B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106448702B (en) * | 2016-09-14 | 2019-10-01 | 努比亚技术有限公司 | A kind of recording data processing unit, mobile terminal and method |
CN106657544A (en) * | 2016-10-24 | 2017-05-10 | 广东欧珀移动通信有限公司 | Incoming call recording method and terminal equipment |
CN106935253A (en) * | 2017-03-10 | 2017-07-07 | 北京奇虎科技有限公司 | The method of cutting out of audio file, device and terminal device |
CN108447502B (en) * | 2018-03-09 | 2020-09-22 | 福州米鱼信息科技有限公司 | Memorandum method and terminal based on voice information |
CN110765080A (en) * | 2018-07-26 | 2020-02-07 | 北京搜狗科技发展有限公司 | File data processing method, device and equipment |
CN108986830B (en) * | 2018-08-28 | 2021-02-09 | 安徽淘云科技有限公司 | Audio corpus screening method and device |
CN111128253B (en) * | 2019-12-13 | 2022-03-01 | 北京小米智能科技有限公司 | Audio editing method and device |
CN111508531B (en) * | 2020-04-23 | 2023-07-07 | 维沃移动通信有限公司 | Audio processing method and device |
CN111614423B (en) * | 2020-04-30 | 2021-08-13 | 湖南声广信息科技有限公司 | Method for splicing presiding audio and music of music broadcasting station |
CN111666446B (en) * | 2020-05-26 | 2023-07-04 | 珠海九松科技有限公司 | Method and system for judging automatic video editing material of AI |
CN111932830B (en) * | 2020-07-31 | 2021-11-09 | 成都市美幻科技有限公司 | Earthquake early warning method, device, system and storage medium |
CN112509609B (en) * | 2020-12-16 | 2022-06-10 | 北京乐学帮网络技术有限公司 | Audio processing method and device, electronic equipment and storage medium |
CN112601153B (en) * | 2021-03-01 | 2021-05-07 | 成都大熊猫繁育研究基地 | Automatic sound acquisition and transmission device and use method thereof |
CN114005469A (en) * | 2021-10-20 | 2022-02-01 | 广州市网星信息技术有限公司 | Audio playing method and system capable of automatically skipping mute segment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103440138A (en) * | 2013-09-06 | 2013-12-11 | 网易(杭州)网络有限公司 | Behavior guidance method and device |
CN103702219A (en) * | 2013-12-16 | 2014-04-02 | 乐视网信息技术(北京)股份有限公司 | Video playing control method and equipment |
CN104157301A (en) * | 2014-07-25 | 2014-11-19 | 广州三星通信技术研究有限公司 | Method, device and terminal deleting voice information blank segment |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002093058A (en) * | 2000-09-20 | 2002-03-29 | Toshiba Corp | Signal processing method and device and information recording medium |
CN102394963A (en) * | 2011-08-23 | 2012-03-28 | 上海华勤通讯技术有限公司 | Mobile terminal with instant recording function and recording method thereof |
JP6264059B2 (en) * | 2014-01-23 | 2018-01-24 | ティアック株式会社 | Data recording device |
CN104869233B (en) * | 2015-04-27 | 2019-04-23 | 深圳市金立通信设备有限公司 | A kind of way of recording |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105845124B (en) | Audio processing method and device | |
CN106911961B (en) | Multimedia data playing method and device | |
EP3817395A1 (en) | Video recording method and apparatus, device, and readable storage medium | |
CN106024009B (en) | Audio processing method and device | |
KR101789783B1 (en) | Method, apparatus, program, and recording medium for prompting call | |
CN106409317B (en) | Method and device for extracting dream speech | |
CN107423386B (en) | Method and device for generating electronic card | |
CN106534951B (en) | Video segmentation method and device | |
US9628966B2 (en) | Method and device for sending message | |
CN106777016B (en) | Method and device for information recommendation based on instant messaging | |
EP2988458A1 (en) | Method and device for sending message | |
CN113099297A (en) | Method and device for generating click video, electronic equipment and storage medium | |
CN111666015A (en) | Suspension short message display method and device | |
CN106331328B (en) | Information prompting method and device | |
CN112948704A (en) | Model training method and device for information recommendation, electronic equipment and medium | |
CN110673917A (en) | Information management method and device | |
EP3125514A1 (en) | Method and device for state notification | |
CN110019897B (en) | Method and device for displaying picture | |
CN113905192A (en) | Subtitle editing method and device, electronic equipment and storage medium | |
CN110213062B (en) | Method and device for processing message | |
CN106447747B (en) | Image processing method and device | |
CN112087653A (en) | Data processing method and device and electronic equipment | |
CN109145151B (en) | Video emotion classification acquisition method and device | |
CN106980781B (en) | External equipment and control method and device of external equipment | |
CN105262905A (en) | Method and device for management of contact persons |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||