CN105845124B - Audio processing method and device - Google Patents

Info

Publication number
CN105845124B
Authority
CN
China
Prior art keywords
audio
file
audio file
original
played
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610291319.3A
Other languages
Chinese (zh)
Other versions
CN105845124A (en
Inventor
朱印
杨静松
郝少华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201610291319.3A
Publication of CN105845124A
Application granted
Publication of CN105845124B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/22 Means responsive to presence or absence of recorded information signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

The disclosure provides an audio processing method and device, and belongs to the technical field of terminals. The method comprises the following steps: when detecting a blank segment deletion triggering operation, analyzing an original audio file to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information; and deleting the at least one audio blank segment in the original audio file to obtain an audio file to be played. After the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and an audio file to be played that does not contain the audio blank segments is obtained, so that the audio file obtained by the user includes only the audio segments carrying audio information and does not include the audio blank segments that carry no audio information. The time needed to subsequently play the audio file to be played is thereby reduced, and the playing efficiency can be remarkably improved.

Description

Audio processing method and device
Technical Field
The present disclosure relates to the field of terminal technologies, and in particular, to an audio processing method and apparatus.
Background
With the development of science and technology, intelligent terminals that integrate multiple functions have become very practical tools in users' daily lives. For example, a user may use the recording function of an intelligent terminal to record the audio of a meeting or an interview, so that the content of the meeting or interview can be obtained later by playing the recorded audio file.
Generally, in the process of recording audio, an intelligent terminal collects all sounds in the surrounding environment through a microphone, and generates and stores a corresponding audio file when recording is finished. Through the audio file, the user obtains all sound information captured during the recording, including both audio information segments and audio blank segments that carry no audio information. An audio blank segment is a segment consisting of ambient noise, or a silent segment.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides an audio processing method and apparatus, where the technical scheme is as follows:
according to a first aspect of embodiments of the present disclosure, there is provided an audio processing method, the method comprising:
when detecting a blank segment deletion triggering operation, analyzing an original audio file to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
and deleting the at least one audio blank segment in the original audio file to obtain an audio file to be played.
Optionally, the analyzing the original audio file to obtain at least one audio blank segment includes:
extracting audio features of each audio frame contained in the original audio file;
acquiring continuous audio frames which do not contain audio information based on the audio features of each audio frame, wherein the continuous audio frames are continuous in time and the total duration of the continuous audio frames exceeds a preset threshold;
and determining the audio segment indicated by the continuous audio frames as an audio blank segment.
Optionally, the deleting the at least one audio blank segment from the original audio file to obtain an audio file to be played includes:
segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
and deleting the appointed sub-audio files in the plurality of sub-audio files, and combining the rest sub-audio files according to the time sequence to obtain the audio file to be played, wherein the appointed sub-audio file is a file comprising audio blank segments.
Optionally, before analyzing the original audio file, the method further includes:
displaying a blank segment delete button of the original audio file;
and when the clicking operation of the blank segment deleting button is detected, determining that the blank segment deleting triggering operation is detected, and executing the step of analyzing the original audio file.
Optionally, after the at least one audio blank segment is deleted from the original audio file to obtain an audio file to be played, the method further includes:
and adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
Optionally, before analyzing the original audio file, the method further includes:
backing up the original audio file to obtain a backup file of the original audio file;
after the original audio file is analyzed to obtain the audio file to be played, the method further comprises:
displaying deletion prompt information, wherein the deletion prompt information is used for prompting a user whether to delete the backup file;
and when the deletion confirmation operation of the backup file is detected, deleting the backup file.
According to a second aspect of embodiments of the present disclosure, there is provided an audio processing apparatus, the apparatus comprising:
the analysis module is used for analyzing the original audio file to obtain at least one audio blank segment after detecting blank segment deletion triggering operation, wherein the at least one audio blank segment is an audio segment not containing audio information;
and the deleting module is used for deleting the at least one audio blank segment in the original audio file to obtain an audio file to be played.
Optionally, the analysis module is configured to perform audio feature extraction on each audio frame included in the original audio file; acquiring continuous audio frames which do not contain audio information based on the audio features of each audio frame, wherein the continuous audio frames are continuous in time and the total duration of the continuous audio frames exceeds a preset threshold; and determining the audio segment indicated by the continuous audio frames as an audio blank segment.
Optionally, the deleting module is configured to segment the original audio file based on a start time point and an end time point indicated by each audio blank segment to obtain a plurality of sub audio files; and deleting the appointed sub-audio files in the plurality of sub-audio files, and combining the rest sub-audio files according to the time sequence to obtain the audio file to be played, wherein the appointed sub-audio file is a file comprising audio blank segments.
Optionally, the apparatus further comprises:
the display module is used for displaying a blank segment deleting button of the original audio file; and when the clicking operation of the blank segment deleting button is detected, determining that the blank segment deleting triggering operation is detected, and executing the step of analyzing the original audio file.
Optionally, the apparatus further comprises:
and the adding module is used for adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
Optionally, the apparatus further comprises:
the backup module is used for backing up the original audio file to obtain a backup file of the original audio file;
the display module is further used for displaying deletion prompt information, and the deletion prompt information is used for prompting a user whether to delete the backup file;
the deleting module is further configured to delete the backup file when a deletion confirmation operation of the backup file is detected.
According to a third aspect of the embodiments of the present disclosure, there is provided an audio processing apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: when detecting a blank segment deletion triggering operation, analyzing an original audio file to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information; and deleting the at least one audio blank segment in the original audio file to obtain an audio file to be played.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by the user only includes the audio segments carrying the audio information, but does not include the audio blank segments not carrying any useful information, the playing time consumption can be reduced when the audio file to be played is played subsequently, and the playing efficiency can be remarkably improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow diagram illustrating an audio processing method according to an example embodiment.
FIG. 2 is a flow diagram illustrating an audio processing method according to an example embodiment.
FIG. 3 is a diagram illustrating a comparison of an original audio file with an audio file to be played, according to an example embodiment.
Fig. 4 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 5 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 6 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 7 is a block diagram illustrating an audio processing device according to an example embodiment.
Fig. 8 is a block diagram illustrating an audio processing device according to an example embodiment.
Detailed Description
To make the objects, technical solutions and advantages of the present disclosure more apparent, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating an audio processing method according to an exemplary embodiment, which is used in a terminal, as shown in fig. 1, and includes the steps of:
in step 101, after detecting a blank segment deletion trigger operation, analyzing an original audio file to obtain at least one audio blank segment, where the at least one audio blank segment is an audio segment that does not include audio information.
In step 102, at least one audio blank segment is deleted from the original audio file to obtain an audio file to be played.
According to the method provided by the embodiment of the disclosure, after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by the user only includes the audio segments carrying useful information, but does not include the audio blank segments not carrying any useful information, and thus the playing time consumption can be reduced when the audio file to be played is played subsequently, and the playing efficiency can be remarkably improved.
Optionally, analyzing the original audio file to obtain at least one audio blank segment includes:
extracting audio features of each audio frame contained in the original audio file;
acquiring continuous audio frames which do not contain audio information based on the audio features of each audio frame, wherein the continuous audio frames are a plurality of audio frames which are continuous in time and the total duration of which exceeds a preset threshold;
an audio segment indicated by consecutive audio frames is determined as an audio blank segment.
Optionally, deleting at least one audio blank segment from the original audio file to obtain the audio file to be played includes:
segmenting an original audio file based on a starting time point and an ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
and deleting the appointed sub-audio file from the plurality of sub-audio files, combining the rest sub-audio files according to the time sequence to obtain the audio file to be played, wherein the appointed sub-audio file is a file comprising audio blank segments.
Optionally, before analyzing the original audio file, the method further includes:
displaying a blank segment delete button of the original audio file;
and when the clicking operation of the blank segment deleting button is detected, determining that the blank segment deleting triggering operation is detected, and executing the step of analyzing the original audio file.
Optionally, after deleting at least one audio blank segment from the original audio file to obtain an audio file to be played, the method further includes:
and adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file without audio blank segments.
Optionally, before analyzing the original audio file, the method further includes:
backing up the original audio file to obtain a backup file of the original audio file;
after analyzing the original audio file and obtaining the audio file to be played, the method further comprises the following steps:
displaying deletion prompt information, wherein the deletion prompt information is used for prompting a user whether to delete the backup file;
and when the deletion confirmation operation of the backup file is detected, deleting the backup file.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 2 is a flowchart illustrating an audio processing method according to an exemplary embodiment, which is used in a terminal, as shown in fig. 2, and includes the steps of:
in step 201, after detecting a blank segment deletion trigger operation, analyzing an original audio file to obtain at least one audio blank segment.
Wherein at least one audio blank segment is an audio segment that does not contain audio information. The audio information may include human voice, animal sound, natural sound, machine sound, and the like, which are not particularly limited in this disclosure.
In the embodiment of the disclosure, after the terminal finishes audio recording, a recorded original audio file is generated and added to the recording file list. The recording file list may include a plurality of original audio files, and each original audio file may contain at least one audio blank segment. For example, when the terminal records users' voices, sometimes a person is speaking in the surrounding environment and sometimes nobody is speaking; while nobody is speaking, the terminal records an audio blank segment. Audio blank segments include mute segments, ambient noise segments, and other audio segments that do not contain audio information.
After detecting the blank segment deletion triggering operation, the terminal may analyze the original audio file to determine whether it contains audio blank segments and where those segments are located in the original audio file, so that they can subsequently be deleted. Specifically, for any original audio file, the terminal can display a blank segment deleting button set for that file; when the terminal detects the user's clicking operation on the blank segment deleting button, it determines that the blank segment deletion triggering operation is detected and analyzes the original audio file. The analysis process can be as follows:
extracting audio features of each audio frame contained in the original audio file; based on the audio features of each audio frame, continuous audio frames without audio information are obtained, and an audio segment indicated by the continuous audio frames is determined as an audio blank segment. The continuous audio frames are a plurality of audio frames which are continuous in time and the total duration of which exceeds a preset threshold value. The preset threshold may be preset by a user or preset by a terminal, which is not specifically limited in this disclosure.
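As one possible illustration of the feature-extraction step, the sketch below splits PCM samples into fixed-size frames and uses root-mean-square (RMS) energy as the audio feature. The disclosure does not name a specific feature or decision rule; RMS energy, the energy threshold, and the function names `frame_rms` and `frame_has_audio` are assumptions made here purely for illustration.

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[float], frame_size: int) -> List[float]:
    """Split samples into consecutive frames and return the RMS energy of each frame."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    features = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        features.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return features


def frame_has_audio(features: Sequence[float], energy_threshold: float) -> List[bool]:
    """Assumed rule: a frame contains audio information if its RMS energy
    exceeds the given threshold."""
    return [f > energy_threshold for f in features]
```

A simple energy threshold treats quiet ambient noise as blank but would not by itself separate loud noise from speech; a real terminal might rely on richer features.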
Specifically, when the terminal detects a first audio frame that does not contain audio information, it records a first time point corresponding to that frame. The terminal then continues to examine the audio frames after the first time point. When it detects the next audio frame that contains audio information, it takes the audio frame immediately before that frame, which does not contain audio information, as a second audio frame, and records a second time point corresponding to the second audio frame. If the duration between the first time point and the second time point exceeds a preset threshold, the plurality of audio frames between the first time point and the second time point are determined as continuous audio frames without audio information, and the audio segment indicated by the continuous audio frames is determined as an audio blank segment.
It should be noted that each audio frame in the original audio file corresponds to a time point, and the time point indicates the position of the audio frame in the original audio file. An original audio file may include a plurality of groups of continuous audio frames without audio information, and the first time point and the second time point corresponding to each group of continuous audio frames are the start time point and the end time point of an audio blank segment.
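The scan described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each frame has already been classified as containing audio information or not, that all frames have the same duration, and that the end time point of a run is the end of its last blank frame. The names `find_blank_segments` and `_close_run` are hypothetical.

```python
from typing import List, Sequence, Tuple


def _close_run(segments, first, stop, frame_duration, min_blank_duration):
    """Record the blank run covering frames [first, stop) if it is long enough."""
    duration = (stop - first) * frame_duration
    if duration > min_blank_duration:  # "exceeds a preset threshold"
        segments.append((first * frame_duration, stop * frame_duration))


def find_blank_segments(
    has_audio: Sequence[bool],
    frame_duration: float,
    min_blank_duration: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) time points in seconds of blank segments."""
    segments: List[Tuple[float, float]] = []
    run_start = None  # index of the first frame of the current blank run
    for i, audio in enumerate(has_audio):
        if not audio and run_start is None:
            run_start = i  # first time point
        elif audio and run_start is not None:
            # The frame before i is the last blank frame (second time point).
            _close_run(segments, run_start, i, frame_duration, min_blank_duration)
            run_start = None
    if run_start is not None:  # the file ends inside a blank run
        _close_run(segments, run_start, len(has_audio), frame_duration, min_blank_duration)
    return segments
```

Short pauses below the threshold are kept, which avoids cutting natural gaps between words or sentences.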
It should be noted that, to improve the efficiency of analyzing the original audio file while still meeting the required accuracy of the analysis result, the original audio file may be analyzed according to a preset frame length. That is, one audio frame is selected from the plurality of audio frames covered by each preset frame length and analyzed. The preset frame length can be preset by the terminal according to its analysis capability and may be, for example, 2 frames, 3 frames, or 5 frames. If the preset frame length is 3 frames, the terminal takes one audio frame from every 3 audio frames for analysis, and if that audio frame does not contain audio information, all audio frames within that preset frame length are determined not to contain audio information.
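A minimal sketch of this sampling strategy is shown below. The text does not say which frame of each group is selected, so analyzing the first frame of each group is an assumption, as are the function name `classify_with_stride` and the `contains_audio` callback.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def classify_with_stride(
    frames: Sequence[Frame],
    contains_audio: Callable[[Frame], bool],
    preset_frame_length: int,
) -> List[bool]:
    """Analyze one frame per group and apply its result to the whole group."""
    if preset_frame_length <= 0:
        raise ValueError("preset_frame_length must be positive")
    result: List[bool] = []
    for start in range(0, len(frames), preset_frame_length):
        group_len = min(preset_frame_length, len(frames) - start)
        # Assumption: the first frame of the group is the one analyzed.
        sampled = contains_audio(frames[start])
        result.extend([sampled] * group_len)
    return result
```

With a preset frame length of 3, only about one third of the frames are analyzed. The cost is that audio information present only in the unsampled frames of a group goes unnoticed, which is why the accuracy requirement has to be considered when choosing the length.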
In step 202, at least one audio blank segment is deleted from the original audio file, so as to obtain an audio file to be played.
According to the analysis process of the original audio file in step 201, the first time point and the second time point corresponding to the continuous audio frames are the start time point and the end time point of the audio blank segment. The process of deleting at least one audio blank segment from the original audio file may then be: segmenting the original audio file based on the start time point and the end time point indicated by each audio blank segment to obtain a plurality of sub audio files; deleting the designated sub audio files from the plurality of sub audio files; and combining the remaining sub audio files in time order to obtain the audio file to be played, wherein a designated sub audio file is a file comprising an audio blank segment.
For example, analyzing an original audio file with a duration of 5 minutes yields two audio blank segments: audio blank segment A starts at 1 minute and ends at 1 minute 30 seconds, and audio blank segment B starts at 3 minutes and ends at 4 minutes. Fig. 3 is a schematic diagram comparing the original audio file with the audio file to be played. The audio files are represented by rectangles in fig. 3, and the positions of the two audio blank segments in the original audio file can be found from the start time point and the end time point indicated by each of them, as shown by the hatched areas of the original audio file in fig. 3. The original audio file is then segmented according to the positions of the audio blank segments to obtain 5 sub audio files, shown as sub audio file 1 to sub audio file 5 in fig. 3. For convenience of representation, "sub audio file 1" is labeled "sub 1" in fig. 3. The 5 sub audio files include two designated sub audio files, that is, 2 sub audio files that comprise audio blank segments, namely sub audio file 2 and sub audio file 4 in fig. 3. After sub audio file 2 and sub audio file 4 are deleted, sub audio file 1, sub audio file 3 and sub audio file 5 are combined in time order to obtain the audio file to be played.
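The 5-minute example can be worked through on time ranges alone. The sketch below only manipulates (start, end) pairs in seconds and does not touch real audio data; the function names `split_at_blanks` and `keep_non_blank` are hypothetical, and the blank segments are assumed to be non-overlapping and to lie inside the file.

```python
from typing import List, Tuple

Piece = Tuple[float, float, bool]  # (start, end, is_blank)


def split_at_blanks(total_duration: float, blanks: List[Tuple[float, float]]) -> List[Piece]:
    """Cut [0, total_duration] at the start and end time point of every blank segment."""
    pieces: List[Piece] = []
    cursor = 0.0
    for start, end in sorted(blanks):
        if start > cursor:
            pieces.append((cursor, start, False))
        pieces.append((start, end, True))  # a designated (blank) piece
        cursor = end
    if cursor < total_duration:
        pieces.append((cursor, total_duration, False))
    return pieces


def keep_non_blank(pieces: List[Piece]) -> List[Tuple[float, float]]:
    """Drop the designated pieces; the rest stay in time order."""
    return [(s, e) for s, e, is_blank in pieces if not is_blank]


# Blank segment A: 1:00 to 1:30, blank segment B: 3:00 to 4:00, file length 5:00.
pieces = split_at_blanks(300.0, [(60.0, 90.0), (180.0, 240.0)])
kept = keep_non_blank(pieces)
remaining = sum(e - s for s, e in kept)
print(len(pieces), kept, remaining)
# prints: 5 [(0.0, 60.0), (90.0, 180.0), (240.0, 300.0)] 210.0
```

The three kept ranges correspond to sub audio files 1, 3 and 5, so the audio file to be played in this example lasts 3 minutes 30 seconds instead of 5 minutes.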
In another embodiment, to meet the user's requirement on the sound quality of the audio file to be played, the terminal may further generate an audio file to be played that matches a preset sound quality. That is, after the designated sub audio files are deleted from the plurality of sub audio files, the remaining sub audio files are combined in time order according to the preset sound quality, and an audio file to be played that matches the preset sound quality is obtained. Specifically, the terminal may provide a plurality of sound quality options for the audio file to be played. The sound quality options may include high sound quality, medium sound quality, low sound quality, and the like, which is not particularly limited by the embodiments of the present disclosure. When the terminal detects a selection operation on any sound quality option, it determines the sound quality selected by the user as the preset sound quality.
In another embodiment, before analyzing the original audio file, the original audio file may be backed up to obtain a backup file of the original audio file. That is, after detecting the blank segment deletion triggering operation, the original audio file is backed up, and then the step of analyzing the original audio file is executed. By backing up the original audio file, the user can freely control how to process the original audio file, and the processing flexibility is improved. In addition, in order to avoid the situation that the backup files occupy excessive storage space, the terminal can display deletion prompt information after the original audio files are processed to obtain the files to be played; and when the deletion confirmation operation of the backup file is detected, deleting the backup file. The deletion prompting information is used for prompting a user whether to delete the backup file.
It should be noted that whether to back up the original audio file before analyzing it may be preset by the user. Specifically, a backup option is displayed in a setting page of the audio file, and the user can turn the backup option on or off. When the backup option is detected to be in the on state, the backup step is executed before the analysis.
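The backup behavior described in the two paragraphs above could look roughly like this. It is a sketch under stated assumptions: the backup is stored next to the original with a `.bak` suffix, and the function names `backup_if_enabled` and `delete_backup_if_confirmed` are hypothetical; the disclosure specifies neither a naming scheme nor a storage location.

```python
import shutil
from pathlib import Path
from typing import Optional


def backup_if_enabled(original: Path, backup_enabled: bool) -> Optional[Path]:
    """Copy the original audio file before analysis when the backup option is on."""
    if not backup_enabled:
        return None
    backup = original.with_name(original.name + ".bak")  # assumed naming scheme
    shutil.copy2(original, backup)  # copies file content and metadata
    return backup


def delete_backup_if_confirmed(backup: Optional[Path], user_confirmed: bool) -> bool:
    """Delete the backup only after the user confirms the deletion prompt."""
    if backup is None or not user_confirmed:
        return False
    backup.unlink()
    return True
```

Keeping the deletion behind an explicit confirmation means the backup is never removed silently, at the cost of temporarily using extra storage.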
In step 203, mark information is added to the audio file to be played.
In order to facilitate the user to distinguish the original audio file from the audio file to be played, after the audio file to be played is obtained, mark information may be added to the audio file to be played, where the mark information is used to indicate that the audio file to be played is an audio file that does not include an audio blank segment.
When the terminal displays audio files, an audio file to be played that carries the mark information can be displayed distinctively. For example, a preset tag is displayed in the entry where the audio file to be played is located, and the content and style of the preset tag may be preset. Alternatively, the terminal may store the audio file to be played, which carries the mark information, and the original audio file in different folders; when the terminal detects an opening operation on the folder in which the audio file to be played is located, only the audio file to be played is displayed.
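One simple way to model the mark information is a boolean flag on each list entry. The sketch below is illustrative only: the `AudioEntry` class, the `[trimmed]` tag text, and both helper functions are hypothetical, and the disclosure does not say how the mark is stored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioEntry:
    name: str
    blank_segments_removed: bool = False  # the mark information


def display_label(entry: AudioEntry, tag: str = "[trimmed]") -> str:
    """Show a preset tag next to entries that carry the mark information."""
    return f"{entry.name} {tag}" if entry.blank_segments_removed else entry.name


def list_processed(entries: List[AudioEntry]) -> List[AudioEntry]:
    """Mimic a folder view that shows only the audio files to be played."""
    return [e for e in entries if e.blank_segments_removed]
```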
In step 204, when a playing operation of the audio file to be played is detected, the audio file to be played is played.
In the embodiment of the present disclosure, the terminal may associate the original audio file with the corresponding audio file to be played. When the terminal detects a playing operation on an original audio file, it checks whether an audio file to be played associated with that original audio file exists; if so, it displays associated playing prompt information, which prompts the user to play the audio file to be played corresponding to the original audio file. The content of the associated playing prompt information may be, for example, "This file has been processed. Play the processed file?", which is not specifically limited by the embodiment of the present disclosure. When the terminal detects a confirmation operation for the associated playing, it acquires the audio file to be played corresponding to the original audio file and plays it.
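The association lookup can be sketched with a plain mapping from original files to processed files. The function name `choose_file_to_play`, the mapping, and the `confirm` callback are hypothetical. The text does not state what happens when the user declines the prompt; falling back to the original audio file is an assumption made here.

```python
from typing import Callable, Dict


def choose_file_to_play(
    original: str,
    processed_by_original: Dict[str, str],
    confirm: Callable[[str], bool],
) -> str:
    """Return the file to play when a play operation targets an original file."""
    processed = processed_by_original.get(original)
    if processed is None:
        return original  # no associated audio file to be played exists
    prompt = "This file has been processed. Play the processed file?"
    # Assumption: if the user declines, the original file is played instead.
    return processed if confirm(prompt) else original
```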
According to the method provided by the embodiment of the disclosure, after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by the user only includes the audio segments carrying the audio information, but does not include the audio blank segments not carrying any useful information, the playing time is reduced, and the playing efficiency is improved.
Fig. 4 is a block diagram illustrating an audio processing device according to an example embodiment. Referring to fig. 4, the apparatus includes an analysis module 401 and a deletion module 402.
The analysis module 401 is connected to the deletion module 402 and is configured to analyze an original audio file after a blank segment deletion trigger operation is detected, to obtain at least one audio blank segment, where each audio blank segment is an audio segment that does not contain audio information. The deletion module 402 is configured to delete the at least one audio blank segment from the original audio file to obtain an audio file to be played.
Optionally, the analysis module 401 is configured to: perform audio feature extraction on each audio frame included in the original audio file; acquire, based on the audio features of each audio frame, consecutive audio frames that do not contain audio information, where the consecutive audio frames are a plurality of audio frames that are continuous in time and whose total duration exceeds a preset threshold; and determine the audio segment indicated by the consecutive audio frames as an audio blank segment.
Optionally, the deletion module 402 is configured to: segment the original audio file based on the start time point and the end time point indicated by each audio blank segment to obtain a plurality of sub-audio files; delete the designated sub-audio files from the plurality of sub-audio files, where a designated sub-audio file is a file that consists of an audio blank segment; and combine the remaining sub-audio files in time order to obtain the audio file to be played.
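The analysis step can be sketched as below. This is a sketch under stated assumptions, not the disclosed implementation: root-mean-square energy stands in for the unspecified audio feature, and the names `frame_rms`, `find_blank_segments`, `energy_threshold`, and `min_duration` are hypothetical.

```python
import math


def frame_rms(frame):
    """Root-mean-square energy of one frame (assumed stand-in for the audio feature)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def find_blank_segments(frames, frame_duration, energy_threshold, min_duration):
    """Return (start_time, end_time) pairs for runs of low-energy frames.

    A run counts as a blank segment only if its total duration exceeds
    `min_duration`, mirroring the preset threshold in the text.
    """
    segments = []
    run_start = None
    # A trailing None acts as a sentinel so the final run is flushed.
    for i, frame in enumerate(list(frames) + [None]):
        silent = frame is not None and frame_rms(frame) < energy_threshold
        if silent and run_start is None:
            run_start = i
        elif not silent and run_start is not None:
            if (i - run_start) * frame_duration > min_duration:
                segments.append((run_start * frame_duration, i * frame_duration))
            run_start = None
    return segments
```

For example, with 0.5-second frames and a 1-second minimum, five consecutive silent frames form one blank segment, while a single trailing silent frame is ignored.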
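The segment-and-merge step can be illustrated on a raw sample list. The function name `remove_blank_segments` and the in-memory representation are assumptions; a real terminal would operate on encoded files, and the disclosure does not prescribe a data layout.

```python
def remove_blank_segments(samples, sample_rate, blank_segments):
    """Cut out blank segments and join the remaining parts in time order.

    samples: sequence of audio samples for one channel.
    blank_segments: iterable of (start_seconds, end_seconds) pairs.
    """
    kept_parts = []
    cursor = 0
    for start_s, end_s in sorted(blank_segments):
        start = int(round(start_s * sample_rate))
        end = int(round(end_s * sample_rate))
        # The part before the blank segment is a sub-audio file to keep.
        kept_parts.append(samples[cursor:start])
        # Skip over the blank segment (the designated sub-audio file).
        cursor = max(cursor, end)
    kept_parts.append(samples[cursor:])

    merged = []
    for part in kept_parts:
        merged.extend(part)
    return merged
```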
Optionally, referring to fig. 5, the apparatus further comprises:
a display module 403, configured to display a blank segment deletion button for the original audio file and, when a click operation on the blank segment deletion button is detected, to determine that the blank segment deletion trigger operation is detected and execute the step of analyzing the original audio file.
Optionally, referring to fig. 6, the apparatus further comprises:
an adding module 404, configured to add mark information to the audio file to be played, where the mark information is used to indicate that the audio file to be played is an audio file that does not contain an audio blank segment.
Optionally, referring to fig. 7, the apparatus further comprises:
the backup module 405 is configured to back up the original audio file to obtain a backup file of the original audio file;
the display module 403 is further configured to display a deletion prompt message, where the deletion prompt message is used to prompt a user whether to delete the backup file;
the deletion module 402 is further configured to delete the backup file when a deletion confirmation operation for the backup file is detected.
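The backup, prompt, and delete sequence can be sketched with the Python standard library. The `.bak` suffix, the prompt wording, and the function names `backup_original` and `maybe_delete_backup` are hypothetical choices made for illustration.

```python
import os
import shutil


def backup_original(path):
    """Copy the original audio file and return the path of the backup file."""
    backup_path = path + ".bak"  # hypothetical naming scheme
    shutil.copy2(path, backup_path)
    return backup_path


def maybe_delete_backup(backup_path, confirm):
    """Show the deletion prompt; delete the backup only on confirmation."""
    if confirm("Delete the backup of the original file?"):
        os.remove(backup_path)
        return True
    return False
```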
According to the device provided by the embodiment of the disclosure, after the original audio file is analyzed, the audio blank segments contained in the original audio file can be deleted, and the audio file to be played, which does not contain the audio blank segments, is obtained, so that the audio file obtained by a user only includes the audio segments carrying audio information, but does not include the audio blank segments not carrying any useful information, and therefore, the playing time can be reduced when the audio file to be played is played subsequently, and the playing efficiency can be remarkably improved.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 8 is a block diagram illustrating an audio processing device according to an example embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 8, the apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of the device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as a display and keypad of the device 800. The sensor assembly 814 may also detect a change in the position of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described audio processing methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium having instructions therein, which when executed by a processor of a mobile terminal, enable the mobile terminal to perform the above-described audio processing method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (8)

1. A method of audio processing, the method comprising:
generating recorded original audio files, and displaying blank segment deleting buttons set for the original audio files, wherein each original audio file is provided with a corresponding blank segment deleting button;
when the clicking operation of a blank segment deleting button of the original audio file is detected, determining that the blank segment deleting triggering operation is detected, and backing up the original audio file to obtain a backup file of the original audio file;
selecting one audio frame from at least two audio frames contained in a preset frame length as a target frame for analysis;
when the target frame does not contain the audio information, determining that at least two audio frames contained in the preset frame length do not contain the audio information;
determining audio segments indicated by at least two audio frames contained in the preset frame length as audio blank segments to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
deleting a designated sub-audio file from the plurality of sub-audio files, wherein the designated sub-audio file is a file comprising audio blank segments;
according to preset tone quality, combining the rest sub-audio files according to a time sequence to obtain an audio file to be played, which is matched with the preset tone quality;
when the playing operation of the original audio file is detected, detecting whether an audio file to be played related to the original audio file exists or not;
if the audio file exists, displaying associated playing prompt information, wherein the associated playing prompt information is used for prompting a user to play the audio file to be played corresponding to the original audio file;
and when the confirmation operation of the associated playing is detected, acquiring the audio file to be played corresponding to the original audio file, and playing the audio file to be played.
2. The method according to claim 1, wherein after obtaining the audio file to be played that matches the preset sound quality, the method further comprises:
and adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
3. The method of claim 1, wherein after analyzing the original audio file to obtain an audio file to be played, the method further comprises:
displaying deletion prompt information, wherein the deletion prompt information is used for prompting a user whether to delete the backup file;
and when the deletion confirmation operation of the backup file is detected, deleting the backup file.
4. An audio processing apparatus, characterized in that the apparatus comprises:
the display module is used for generating recorded original audio files and displaying blank segment deleting buttons set for the original audio files, and each original audio file is provided with a corresponding blank segment deleting button;
the analysis module is used for determining that blank segment deletion triggering operation is detected when the clicking operation of a blank segment deletion button of the original audio file is detected;
the backup module is used for backing up the original audio file to obtain a backup file of the original audio file;
the analysis module is further configured to select one audio frame from at least two audio frames included in a preset frame length as a target frame for analysis; when the target frame does not contain the audio information, determining that at least two audio frames contained in the preset frame length do not contain the audio information; determining audio segments indicated by at least two audio frames contained in the preset frame length as audio blank segments to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
the deleting module is used for segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files; deleting the appointed sub-audio files from the plurality of sub-audio files, and combining the rest sub-audio files according to a preset tone quality and a time sequence to obtain audio files to be played, wherein the appointed sub-audio files are files comprising audio blank segments;
the apparatus is further configured to: when the playing operation of the original audio file is detected, detecting whether an audio file to be played related to the original audio file exists or not; if the audio file exists, displaying associated playing prompt information, wherein the associated playing prompt information is used for prompting a user to play the audio file to be played corresponding to the original audio file; and when the confirmation operation of the associated playing is detected, acquiring the audio file to be played corresponding to the original audio file, and playing the audio file to be played.
5. The apparatus of claim 4, further comprising:
and the adding module is used for adding mark information for the audio file to be played, wherein the mark information is used for indicating that the audio file to be played is an audio file which does not contain audio blank segments.
6. The apparatus of claim 4,
the display module is further used for displaying deletion prompt information, and the deletion prompt information is used for prompting a user whether to delete the backup file;
the deleting module is further configured to delete the backup file when a deletion confirmation operation of the backup file is detected.
7. An audio processing apparatus, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
generating recorded original audio files, and displaying blank segment deleting buttons set for the original audio files, wherein each original audio file is provided with a corresponding blank segment deleting button;
when the clicking operation of a blank segment deleting button of the original audio file is detected, determining that the blank segment deleting triggering operation is detected, and backing up the original audio file to obtain a backup file of the original audio file;
selecting one audio frame from at least two audio frames contained in a preset frame length as a target frame for analysis;
when the target frame does not contain the audio information, determining that at least two audio frames contained in the preset frame length do not contain the audio information;
determining audio segments indicated by at least two audio frames contained in the preset frame length as audio blank segments to obtain at least one audio blank segment, wherein the at least one audio blank segment is an audio segment not containing audio information;
segmenting the original audio file based on the starting time point and the ending time point indicated by each audio blank segment to obtain a plurality of sub audio files;
deleting a designated sub-audio file from the plurality of sub-audio files, wherein the designated sub-audio file is a file comprising audio blank segments;
according to preset tone quality, combining the rest sub-audio files according to a time sequence to obtain an audio file to be played, which is matched with the preset tone quality;
when the playing operation of the original audio file is detected, detecting whether an audio file to be played related to the original audio file exists or not;
if the audio file exists, displaying associated playing prompt information, wherein the associated playing prompt information is used for prompting a user to play the audio file to be played corresponding to the original audio file;
and when the confirmation operation of the associated playing is detected, acquiring the audio file to be played corresponding to the original audio file, and playing the audio file to be played.
8. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the steps of the method of any of claims 1-3.
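The target-frame sampling recited in claims 1, 4, and 7 can be illustrated with a short sketch: one frame per group of frames is analyzed, and its result is applied to the whole group. The names `find_blank_groups`, `group_size`, `target_offset`, and the `is_silent` predicate are hypothetical; the claims do not state which frame in the group is selected or how silence is judged.

```python
def find_blank_groups(frames, group_size, is_silent, target_offset=0):
    """Return (first_index, end_index_exclusive) for each group judged blank.

    Only one target frame per group is analyzed; if it carries no audio
    information, every frame in the group is treated as blank.
    """
    if group_size < 2:
        raise ValueError("group_size must cover at least two frames")
    blank = []
    for start in range(0, len(frames), group_size):
        group = frames[start:start + group_size]
        # Clamp the offset so a shorter final group still has a target frame.
        target = group[min(target_offset, len(group) - 1)]
        if is_silent(target):
            blank.append((start, start + len(group)))
    return blank
```

Analyzing one frame per group reduces the number of analyses at the cost of accuracy: frames that carry audio information but share a group with a silent target frame are classified as blank.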
CN201610291319.3A 2016-05-05 2016-05-05 Audio processing method and device Active CN105845124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610291319.3A CN105845124B (en) 2016-05-05 2016-05-05 Audio processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610291319.3A CN105845124B (en) 2016-05-05 2016-05-05 Audio processing method and device

Publications (2)

Publication Number Publication Date
CN105845124A CN105845124A (en) 2016-08-10
CN105845124B true CN105845124B (en) 2020-06-19

Family

ID=56591052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610291319.3A Active CN105845124B (en) 2016-05-05 2016-05-05 Audio processing method and device

Country Status (1)

Country Link
CN (1) CN105845124B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106448702B (en) * 2016-09-14 2019-10-01 努比亚技术有限公司 A kind of recording data processing unit, mobile terminal and method
CN106657544A (en) * 2016-10-24 2017-05-10 广东欧珀移动通信有限公司 Incoming call recording method and terminal equipment
CN106935253A (en) * 2017-03-10 2017-07-07 北京奇虎科技有限公司 The method of cutting out of audio file, device and terminal device
CN108447502B (en) * 2018-03-09 2020-09-22 福州米鱼信息科技有限公司 Memorandum method and terminal based on voice information
CN110765080A (en) * 2018-07-26 2020-02-07 北京搜狗科技发展有限公司 File data processing method, device and equipment
CN108986830B (en) * 2018-08-28 2021-02-09 安徽淘云科技有限公司 Audio corpus screening method and device
CN111128253B (en) * 2019-12-13 2022-03-01 北京小米智能科技有限公司 Audio editing method and device
CN111508531B (en) * 2020-04-23 2023-07-07 维沃移动通信有限公司 Audio processing method and device
CN111614423B (en) * 2020-04-30 2021-08-13 湖南声广信息科技有限公司 Method for splicing presiding audio and music of music broadcasting station
CN111666446B (en) * 2020-05-26 2023-07-04 珠海九松科技有限公司 Method and system for judging automatic video editing material of AI
CN111932830B (en) * 2020-07-31 2021-11-09 成都市美幻科技有限公司 Earthquake early warning method, device, system and storage medium
CN112509609B (en) * 2020-12-16 2022-06-10 北京乐学帮网络技术有限公司 Audio processing method and device, electronic equipment and storage medium
CN112601153B (en) * 2021-03-01 2021-05-07 成都大熊猫繁育研究基地 Automatic sound acquisition and transmission device and use method thereof
CN114005469A (en) * 2021-10-20 2022-02-01 广州市网星信息技术有限公司 Audio playing method and system capable of automatically skipping mute segment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440138A (en) * 2013-09-06 2013-12-11 网易(杭州)网络有限公司 Behavior guidance method and device
CN103702219A (en) * 2013-12-16 2014-04-02 乐视网信息技术(北京)股份有限公司 Video playing control method and equipment
CN104157301A (en) * 2014-07-25 2014-11-19 广州三星通信技术研究有限公司 Method, device and terminal deleting voice information blank segment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002093058A (en) * 2000-09-20 2002-03-29 Toshiba Corp Signal processing method and device and information recording medium
CN102394963A (en) * 2011-08-23 2012-03-28 上海华勤通讯技术有限公司 Mobile terminal with instant recording function and recording method thereof
JP6264059B2 (en) * 2014-01-23 2018-01-24 ティアック株式会社 Data recording device
CN104869233B (en) * 2015-04-27 2019-04-23 深圳市金立通信设备有限公司 A kind of way of recording

Also Published As

Publication number Publication date
CN105845124A (en) 2016-08-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant