CN105895102A - Recording editing method and recording device - Google Patents
- Publication number
- CN105895102A CN105895102A CN201510786352.9A CN201510786352A CN105895102A CN 105895102 A CN105895102 A CN 105895102A CN 201510786352 A CN201510786352 A CN 201510786352A CN 105895102 A CN105895102 A CN 105895102A
- Authority
- CN
- China
- Prior art keywords
- voiceprint
- recording
- current recording
- edited
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0631—Creating reference templates; Clustering
Abstract
The invention provides a recording editing method and a recording device. Acoustic analysis is performed on the current recording, and the current recording is marked according to the acoustic analysis result. An editing instruction for the current recording is received; the instruction carries the mark information of the fragment to be edited and an editing mode. The fragment to be edited is obtained from the marked current recording according to the mark information, and is then edited according to the editing mode. According to the invention, the current recording is marked by voiceprint recognition; after marking, the user edits the current recording based on the marks, so the fragment to be edited can be located quickly, editing time is saved, and the user experience is improved.
Description
Technical field
The present invention relates to the field of electronic technology, and in particular to a recording editing method and a recording device.
Background

Smartphones have gradually become part of people's daily lives: they serve not only as everyday communication devices but also as portable recording devices. Through a recording application (Application, APP for short) on a smartphone, a user can record and save voice information, which makes it easy to quickly preserve a piece of speech that is hard to memorize directly and to replay the recording many times.

Typically, the recording files made by a user contain unwanted segments. These segments not only take up storage space but also hinder the user from finding the information that is really needed. Existing recording APPs allow the user to edit a recording file according to its actual content, but this requires the user to play the file repeatedly in order to determine what should be edited. Clearly, this way of editing takes up a great deal of the user's time and results in a poor user experience.
Summary of the invention

The present invention provides a recording editing method and a recording device, to solve the problems of wasted user time and degraded user experience in existing recording editing.

To achieve these goals, the invention provides a recording editing method, including:

performing acoustic analysis on a current recording and marking the current recording according to the acoustic analysis result;

receiving an editing instruction for the current recording, the editing instruction carrying the mark information of a fragment to be edited and an editing mode;

selecting the fragment to be edited from the marked current recording according to the mark information;

editing the fragment to be edited according to the editing mode.

To achieve these goals, the invention also provides a recording device, including:

a marking module, configured to perform acoustic analysis on a current recording and mark the current recording according to the acoustic analysis result;

an acquisition module, configured to obtain an editing instruction for the current recording, the editing instruction carrying the mark information of a fragment to be edited and an editing mode;

a selection module, configured to select the fragment to be edited from the marked current recording according to the mark information;

an editing module, configured to edit the fragment to be edited according to the editing mode.

In the recording editing method and recording device of the present invention, acoustic analysis is performed on the current recording and the recording is marked according to the analysis result; an editing instruction carrying the mark information of a fragment to be edited and an editing mode is received; the fragment to be edited is obtained from the marked recording according to the mark information and edited according to the editing mode. Because the current recording is marked by voiceprint recognition and, once marking is complete, the user edits the recording based on the marks, the fragment to be edited can be located quickly, editing time is saved, and the user experience is improved.
Brief description of the drawings

Fig. 1 is a schematic flowchart of the recording editing method of Embodiment 1 of the present invention;
Fig. 2 is a first schematic diagram of an application example of the recording editing method of Embodiment 1;
Fig. 3 is a second schematic diagram of an application example of the recording editing method of Embodiment 1;
Fig. 4 is a third schematic diagram of an application example of the recording editing method of Embodiment 1;
Fig. 5 is a fourth schematic diagram of an application example of the recording editing method of Embodiment 1;
Fig. 6 is a schematic flowchart of the recording marking method in Embodiment 1;
Fig. 7 is a first schematic diagram of an application example of the recording marking method in Embodiment 1;
Fig. 8 is a second schematic diagram of an application example of the recording marking method in Embodiment 1;
Fig. 9 is a third schematic diagram of an application example of the recording marking method in Embodiment 1;
Fig. 10 is a schematic flowchart of the voiceprint database establishment method in Embodiment 1;
Fig. 11 is a schematic structural diagram of the recording device of Embodiment 2 of the present invention;
Fig. 12 is a schematic structural diagram of the marking module in Embodiment 2.
Detailed description of the embodiments

The recording editing method and recording device provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Embodiment 1

As shown in Fig. 1, which is a schematic flowchart of the recording editing method of Embodiment 1 of the present invention, the method includes:

Step 101: perform acoustic analysis on the current recording and mark the current recording according to the acoustic analysis result.

The user can open, through the user interface of a smartphone, the recording function of a recording APP installed on the phone. The recording APP starts to capture the current recording and may preprocess the sound during capture. Acoustic analysis is performed on the captured recording to obtain an acoustic analysis result that includes acoustic feature parameters. Because a speaker's voiceprint is unique, the voiceprint can be used as a feature that distinguishes speakers, and the current recording can then be marked according to these acoustic feature parameters. The acoustic feature parameters include: the energy of the sound, formants, Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC).
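The patent names these feature parameters but gives no extraction code. As a minimal sketch, the framing, windowing and per-frame log-energy part of such a pipeline could look like the code below; the energy feature stands in for the fuller set (formants, MFCC, LPC would need a DSP library such as librosa). All function names and frame sizes here are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames
    (400 samples / 160-sample hop = 25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def frame_features(x, frame_len=400, hop=160):
    """Per-frame log-energy after Hamming windowing -- a stand-in for
    the feature set (energy, formants, MFCC, LPC) named in the patent."""
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)

# Example: one second of a 220 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
feats = frame_features(np.sin(2 * np.pi * 220 * t))
print(feats.shape)  # one log-energy value per 10 ms frame
```

In a real system each frame would yield a vector of several coefficients rather than a single energy value, but the framing/windowing structure is the same.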
As shown in Fig. 2, which is a schematic diagram of an application example of this embodiment, suppose a recording contains 5 speakers. Left slashes, right slashes, horizontal lines, vertical lines and cross-hatching are used to mark speakers A, B, C, D and E respectively. If, in this recording, speaker A speaks twice, separated by the speech of other speakers, both utterances are marked with left slashes, to show that they are recording paragraphs of the same speaker. To let the user distinguish the speakers more intuitively, different colors can also be used as marks; for example, red, yellow, blue, green and purple can mark speakers A, B, C, D and E respectively. In that case, if speaker A speaks twice, separated by other speakers, both utterances are marked in red to show that they belong to the same speaker.
Step 102: obtain an editing instruction for the current recording, the editing instruction carrying the mark information of the fragment to be edited and an editing mode.

Further, after the current recording has been marked, the user can see the marked recording on the display interface of the terminal, and can then issue an editing instruction to the recording APP through the terminal according to the marks. The editing instruction carries the mark information of the fragment to be edited and the editing mode for that fragment. The editing mode may include cutting the selected fragment, merging several selected fragments, or deleting the selected fragment.
In this embodiment, obtaining the editing instruction for the current recording includes the following. First, through the terminal, the user clicks at least one mark contained in the waveform diagram of the current recording, thereby selecting the corresponding fragment to be edited. Specifically, when the user clicks a mark, the recording APP detects a first click operation performed on the mark corresponding to at least one fragment to be edited contained in the waveform diagram. Then, after selecting the fragment, the user selects, from the editing modes shown on the display interface of the terminal, the target editing mode to be applied to the fragment to be edited. Specifically, when the user clicks the target editing mode, the recording APP detects a second click operation performed on the target editing mode. After the first and second click operations are detected, the editing instruction is generated according to them.
Step 103: obtain the fragment to be edited from the marked current recording according to the mark information.

Step 104: edit the fragment to be edited according to the editing mode.

After receiving the editing instruction, the recording APP extracts from it the mark information of the fragment to be edited, and selects that fragment from the current recording according to the mark information. The recording APP also extracts the editing mode from the instruction, for example cutting, merging or deleting the fragment. Once the fragment to be edited has been obtained, the APP edits it according to the specified editing mode.
As shown in Fig. 3, which is a schematic diagram of an application example of this embodiment, when the user edits a recording file that has been marked by voiceprint analysis, the waveform of the recording clearly carries different marks distinguishing the speakers. By clicking a mark on the waveform, the user selects the corresponding fragment as the fragment to be edited. In Fig. 3, the user clicks to select the fragment marked with horizontal lines. After selecting the fragment, the user clicks the target editing mode for it in the editing menu, for example "cut selected fragment". These two click operations generate the editing instruction for the fragment, and the fragment is cut according to the instruction.
As shown in Fig. 4, which is a schematic diagram of an application example of this embodiment, a list of the marks in the recording is provided to the user below the waveform diagram. The user can directly select a mark from the list, thereby selecting all fragments of the speaker represented by that mark. For example, suppose a recording contains 3 speakers, marked with left slashes, right slashes and horizontal lines as speakers A, B and C. Speaker A speaks twice in this recording, separated by the speech of other speakers, and both utterances are marked with left slashes. When the user clicks the left-slash option in the list, both fragments are selected at once; the user can then click an individual fragment to deselect it, or keep it selected. After selecting multiple fragments, if the user wants to merge them, "merge selected fragments" can be clicked as the target editing mode in the editing-mode list. After the click operations are completed, the recording APP obtains the editing instruction and merges the selected fragments into one new segment.
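The patent describes cut/merge/delete on marked fragments but gives no data model. A minimal sketch, representing the marked recording as a list of (start, end, mark) fragments, might be (all names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    start: float  # seconds
    end: float
    mark: str     # e.g. "left-slash", identifying the speaker

def select_by_mark(fragments, mark):
    """All fragments of the speaker represented by one mark."""
    return [f for f in fragments if f.mark == mark]

def delete_fragments(fragments, selected):
    """Remove the selected fragments from the recording."""
    return [f for f in fragments if f not in selected]

def merge_fragments(selected):
    """Merge several selected fragments into one new segment
    (concatenated end to end, re-timed from zero)."""
    t, merged = 0.0, []
    for f in selected:
        dur = f.end - f.start
        merged.append(Fragment(t, t + dur, f.mark))
        t += dur
    return merged

# Example: speaker A speaks twice, interrupted by speaker B
rec = [Fragment(0, 4, "A"), Fragment(4, 9, "B"), Fragment(9, 12, "A")]
a_parts = select_by_mark(rec, "A")
print(merge_fragments(a_parts))  # a 4 s and a 3 s part, back to back
```

A real implementation would also carry the audio samples of each fragment; here only the timeline bookkeeping is shown.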
As shown in Fig. 5, which is a schematic diagram of an application example of this embodiment, the user can also select several mark options from the mark list. In Fig. 5 the user has selected both the left-slash and right-slash options, so all recording fragments of speaker A and speaker B are selected. Finally, clicking "merge selected fragments" merges the selected fragments into one new segment. Furthermore, the user can pick part of the dialogue content from all the selected fragments and merge only that part.
In the recording editing method provided by this embodiment, acoustic analysis is performed on the current recording and the recording is marked according to the result; an editing instruction carrying the mark information of a fragment to be edited and an editing mode is received; the fragment is obtained from the marked recording according to the mark information and edited according to the editing mode. Because this embodiment marks the current recording by voiceprint recognition and lets the user edit the recording based on the marks once marking is complete, the fragment to be edited can be located quickly, editing time is saved, and the user experience is improved.
Before the current recording is edited as in Embodiment 1, it must first be marked. The detailed process of step 101 of the above embodiment is shown in Fig. 6, a schematic flowchart of the recording marking method in Embodiment 1 of the present invention. The recording marking method includes the following steps:
Step 201: capture the current recording and extract voiceprint feature parameters from it.

The user can open, through the user interface of a smartphone, the recording function of a recording APP installed on the phone. The recording APP starts to capture the current recording, and may preprocess the sound during capture, for example by framing, windowing and filtering the captured data.

Further, feature analysis is performed on the captured recording to obtain its acoustic feature parameters, which include: the energy of the sound, formants, MFCC and LPC.
Step 202: perform voiceprint clustering training on the voiceprint parameters to obtain the target voiceprint template of the voiceprint parameters.

In this embodiment, in order to identify the template of the recording, a voiceprint clustering trainer is provided. After the voiceprint feature parameters are obtained, the trainer performs voiceprint clustering training on them, yielding the target voiceprint template corresponding to the current recording.

Step 203: judge whether the target voiceprint template is a voiceprint template already in the voiceprint database.

In this embodiment, voiceprint clustering training has previously been performed on sample sounds by the trainer to obtain their sample voiceprint templates, and a voiceprint database built from these sample voiceprint templates is stored in the recording APP in advance. The voiceprint database generally stores multiple sample voiceprint templates, so that the user's recording can be marked during the recording process. After the target voiceprint template is obtained, the recording APP searches the voiceprint database to judge whether the target template is present in it.
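The patent does not say how the lookup of step 203 compares templates. A minimal sketch, assuming each template is a fixed-length feature vector and using cosine similarity with a threshold (both assumptions, not from the patent), could be:

```python
import numpy as np

def find_template(target, database, threshold=0.95):
    """Return the key of the most similar stored template, or None if
    no template exceeds the similarity threshold (steps 203-205)."""
    best_key, best_sim = None, threshold
    for key, template in database.items():
        # Cosine similarity between the target and a stored template.
        sim = np.dot(target, template) / (
            np.linalg.norm(target) * np.linalg.norm(template))
        if sim > best_sim:
            best_key, best_sim = key, sim
    return best_key

db = {
    "Mr. Zhang": np.array([1.0, 0.2, 0.1]),
    "speaker B": np.array([0.1, 1.0, 0.3]),
}
print(find_template(np.array([0.9, 0.25, 0.1]), db))  # Mr. Zhang
```

Returning `None` corresponds to the "template not in database" branch, after which new mark information is generated (step 205).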
If the judgment result is yes, step 204 is performed; otherwise step 205 is performed.

Step 204: obtain, from the voiceprint database, the target mark information corresponding to the target voiceprint template.

The voiceprint database stores not only the sample voiceprint templates but also the mark information corresponding to them; in general, each sample voiceprint template corresponds to its own mark information. When the sample voiceprint template corresponding to the target voiceprint template is found in the database, the target mark information corresponding to the target template can be obtained.

Step 205: generate the target mark information corresponding to the target voiceprint template.

After identifying that the target voiceprint template does not exist in the voiceprint database, the recording APP can set a piece of target mark information for the template, so as to mark the template through that information.

Step 206: mark the current recording with the target mark information.

After the target mark information is obtained, the recording APP automatically uses it to mark the current recording.
In the recording marking method of this embodiment, the voiceprint template corresponding to the current recording is identified by voiceprint recognition, the established voiceprint database is used to obtain the mark information corresponding to the recording, and the recording is then marked. This realizes automatic marking of recordings and saves the user the time of adding marks manually.

For a schematic diagram of a concrete application example of the recording marking method, see Fig. 2 in Embodiment 1; it is not repeated here.
Step 207: establish the mapping relation between the target voiceprint template and the target mark information and store it in the voiceprint database.

Step 208: receive the remark information sent by the user through the terminal.

Step 209: add the remark information to the current recording.

Step 210: update the remark information into the target mark information in the voiceprint database.
The remark information sent by the user through the terminal is received; it may be, for example, the name of the source of the current recording. After the terminal obtains the remark information, it instructs the recording APP to add the remark to the current recording; for example, the recording APP can add a label at the position corresponding to the current recording. Further, the recording APP can also update the obtained remark information into the target mark information in the voiceprint database that corresponds to the target voiceprint template of the current recording, so that the remark can be recalled the next time the same sound source is recorded.
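Steps 207-210 amount to maintaining a template-to-mark mapping and folding the user's remark into it. A minimal sketch using a plain dictionary keyed by a template id (an illustrative simplification — a real store would persist the feature vectors themselves) might be:

```python
def store_mapping(db, template_id, mark):
    """Step 207: store the template -> mark mapping in the database."""
    db[template_id] = {"mark": mark, "remark": None}

def add_remark(db, template_id, remark):
    """Steps 209-210: attach the user's remark (e.g. a speaker name)
    and update it into the stored mark information."""
    db[template_id]["remark"] = remark

voiceprint_db = {}
store_mapping(voiceprint_db, "tpl-A", "left-slash")
add_remark(voiceprint_db, "tpl-A", "Mr. Zhang")
print(voiceprint_db["tpl-A"]["remark"])  # Mr. Zhang
```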
As shown in Fig. 7, which is a schematic diagram of an application example of this embodiment, after the recording APP has marked the current recording automatically, the user can send remark information to the recording APP through the terminal, so as to add a remark for each speaker in the recording. For example, through the recording APP the user can remark speaker A, marked with left slashes, as "Mr. Zhang". The user can add remark information for a new speaker, match it directly with that speaker's voiceprint information, and even use it as the title of the recording.
As shown in Fig. 8, which is a schematic diagram of an application example of this embodiment, when the user makes a new recording that contains the speech of a speaker whose sound title has already been saved, after voiceprint analysis the recording paragraphs of that speaker are directly labeled with the saved mark information. For example, if speaker A of a previously saved recording was remarked as "Mr. Zhang", a new recording containing this speaker no longer shows the mark of speaker A but directly shows "Mr. Zhang".
As shown in Fig. 9, which is a schematic diagram of an application example of this embodiment, when a recording contains mark information corresponding to a speaker the user has saved, the user can locate the needed recording faster according to the marked speaker. For example, if the user is looking for a teaching recording by Mr. Zhang, it is enough to find the "Mr. Zhang" label.
Before step 201 captures the current recording and extracts voiceprint feature parameters from it, a voiceprint database also needs to be established from sample sounds. As shown in Fig. 10, which is a schematic flowchart of the voiceprint database establishment method in Embodiment 1 of the present invention, the method includes:
Step 301: analyze the sample sound and extract the voiceprint feature parameters of the sample sound.

In this embodiment, the sounds recorded by the recording APP before the current recording are used as sample sounds. After each recording is obtained, the recording APP can analyze the sample sound and extract its voiceprint feature parameters, which include: the energy of the sound, formants, MFCC, LPC and so on.

Step 302: perform voiceprint clustering training according to the voiceprint feature parameters of the sample sound to generate sample voiceprint templates.

Before voiceprint clustering training is performed on the extracted voiceprint feature parameters, it must be further determined whether the parameters belong to the same sound source. Specifically, the voiceprint feature parameters of the sample sounds within a preset time period are obtained; when the voiceprint feature parameters of the sample sounds within the preset time period are similar, voiceprint clustering training is performed on them to generate the sample voiceprint template. If the voiceprint feature parameters are judged not to be similar, they are cached first; once it is later judged that the feature parameters are similar, voiceprint clustering training is performed on them to generate the sample voiceprint template.
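The patent does not describe the clustering trainer itself. A minimal sketch of greedy threshold clustering, where each cluster centroid stands in for a sample voiceprint template (the Euclidean distance metric and the threshold value are assumptions), might be:

```python
import numpy as np

def cluster_templates(features, threshold=1.0):
    """Greedy threshold clustering: each feature vector joins the nearest
    existing template (cluster mean) if close enough, otherwise it starts
    a new template -- one template per speaker, ideally."""
    templates, members = [], []
    for f in features:
        if templates:
            d = [np.linalg.norm(f - t) for t in templates]
            i = int(np.argmin(d))
            if d[i] < threshold:
                members[i].append(f)
                templates[i] = np.mean(members[i], axis=0)  # update centroid
                continue
        templates.append(np.asarray(f, dtype=float))
        members.append([f])
    return templates

# Frames from two well-separated "speakers", interleaved in time
feats = [np.array([0.0, 0.0]), np.array([5.0, 5.0]),
         np.array([0.1, 0.0]), np.array([5.0, 5.1])]
print(len(cluster_templates(feats)))  # 2 templates
```

Production diarization systems use far stronger models (e.g. GMM or embedding-based clustering), but the "similar frames merge into one template" behavior is the same idea the patent describes.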
For example, suppose a recording contains 5 speakers; the sounds of these 5 speakers constitute the sample sounds. After voiceprint clustering training, the 5 speakers can be identified as speakers A, B, C, D and E, and corresponding sample voiceprint templates are generated for them.

Step 303: generate the sample mark information corresponding to the sample voiceprint templates.

After the sample voiceprint templates are generated, corresponding sample mark information is generated for the sample sounds; for example, the same speaker is always marked with the same mark. In this embodiment, left slashes, right slashes, horizontal lines, vertical lines and cross-hatching can be used to mark speakers A, B, C, D and E.

Step 304: generate the voiceprint database from the sample voiceprint templates, the sample mark information, and the mapping relations between the sample voiceprint templates and the sample mark information.

To speed up recording marking, in this embodiment the voiceprint database is generated from the sample voiceprint templates, the sample mark information, and the mapping relations between them. The voiceprint template generated by voiceprint clustering training after each recording is saved into the voiceprint database as a sample voiceprint template, and the mark information of that template, together with the mapping relation between the two, is also saved into the database, so that the database is kept up to date. Thus, when the same speaker appears in a recording again, the recording APP can quickly mark that speaker's recording through voiceprint analysis, improving the convenience of recording marking.
Embodiment 2

As shown in Fig. 11, which is a schematic structural diagram of the recording device of Embodiment 2 of the present invention, the device includes: a marking module 11, an acquisition module 12, a selection module 13 and an editing module 14.

The marking module 11 is configured to perform acoustic analysis on the current recording and mark the current recording according to the acoustic analysis result.

The acquisition module 12 is configured to obtain an editing instruction for the current recording, the editing instruction carrying the mark information of a fragment to be edited and an editing mode.

The selection module 13 is configured to select the fragment to be edited from the marked current recording according to the mark information.

The editing module 14 is configured to edit the fragment to be edited according to the editing mode.

As shown in Fig. 12, an optional structure of the marking module 11 in Embodiment 2 includes: an extraction unit 111, a training unit 112, a judging unit 113, an acquiring unit 114, a marking unit 115, a generating unit 116, an establishing unit 117 and a receiving unit 118.
Wherein, extraction unit 111, it is used for gathering current recording and from current recording, extract vocal print special
Levy parameter.
Training unit 112, obtains the mesh of vocal print parameter for vocal print parameter carries out vocal print cluster training
Mark vocal print template.
Judging unit 113, for judging whether target vocal print template is the vocal print mould in voice print database
Plate.
Acquiring unit 114, for when the judged result of judging unit is for being, from voice print database
Obtain the target label information corresponding with target vocal print template.
The marking unit 115 is configured to mark the current recording with the target label information.
The generating unit 116 is configured to, when the judgment result of the judging unit 113 is no, generate the target label information corresponding to the target voiceprint template.
The establishing unit 117 is configured to, after the marking unit 115 marks the current recording with the target label information, establish a mapping relationship between the target voiceprint template and the target label information and store it in the voiceprint database.
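Taken together, the judging, generating, and establishing units form a lookup-or-create flow. A minimal sketch of that flow follows; the `speaker-N` label scheme and the append-to-list storage are hypothetical choices made here for illustration.

```python
def mark_recording(template, voiceprint_db, match_fn):
    """Return the label for `template`, creating and storing a new
    template-to-label mapping when the database holds no match.
    match_fn(template, db) returns an existing label or None."""
    label = match_fn(template, voiceprint_db)
    if label is None:                                  # judging unit: 'no'
        label = f"speaker-{len(voiceprint_db) + 1}"    # generating unit
        voiceprint_db.append((template, label))        # establishing unit
    return label                                       # marking unit uses this

exact_match = lambda t, d: next((lbl for tpl, lbl in d if tpl == t), None)
db = []
first = mark_recording((1.0, 0.0), db, exact_match)    # no match: new label stored
again = mark_recording((1.0, 0.0), db, exact_match)    # match: same label reused
```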
Further, the extraction unit 111 is also configured to, before collecting the current recording and extracting the voiceprint characteristic parameters from it, analyze sample audio and extract the voiceprint characteristic parameters of the sample audio.
The training unit 112 is also configured to perform voiceprint cluster training according to the voiceprint characteristic parameters of the sample audio and generate a sample voiceprint template.
The generating unit 116 is also configured to generate the sample label information corresponding to the sample voiceprint template.
The establishing unit 117 is also configured to generate the voiceprint database using the sample voiceprint template, the sample label information, and the mapping relationship between the sample voiceprint template and the sample label information.
Further, the training unit 112 is specifically configured to obtain the voiceprint characteristic parameters of the sample audio within a preset time period and, when those parameters are similar, perform voiceprint cluster training on them to generate the sample voiceprint template.
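The training unit's behaviour — generate a sample template only when the feature vectors collected in the preset time period are mutually similar — could be sketched as below. The cosine-similarity test, the threshold, and averaging the vectors into a template are assumptions; the patent does not define "similar" or the clustering procedure.

```python
import numpy as np

def train_sample_template(features, threshold=0.95):
    """features: voiceprint feature vectors collected in the preset time
    period. Return their mean as the sample voiceprint template when every
    pair is mutually similar (cosine >= threshold); otherwise return None."""
    vecs = [np.asarray(f, dtype=float) for f in features]
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            sim = np.dot(vecs[i], vecs[j]) / (
                np.linalg.norm(vecs[i]) * np.linalg.norm(vecs[j]))
            if sim < threshold:
                return None        # features in the window disagree: no template
    return np.mean(vecs, axis=0)   # cluster the similar features into one template

template = train_sample_template([[1.0, 0.01], [1.0, 0.02]])   # near-identical
rejected = train_sample_template([[1.0, 0.0], [0.0, 1.0]])     # dissimilar
```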
The receiving unit 118 is configured to, after the marking unit 115 marks the current recording with the target label information, receive remark information sent by the user through a terminal.
The marking unit 115 is also configured to annotate the current recording with the remark information.
The establishing unit 117 is also configured to update the remark information into the target label information in the voiceprint database.
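Updating a user remark into the stored label information reduces to a small database write. The dict-backed record layout used here is a hypothetical choice, not a format the patent prescribes.

```python
def add_remark(voiceprint_db, label, remark):
    """voiceprint_db maps label -> {'template': ..., 'remark': ...}.
    Store the user's remark alongside the target label information."""
    entry = voiceprint_db.get(label)
    if entry is None:
        raise KeyError(f"label {label!r} not in voiceprint database")
    entry["remark"] = remark    # establishing unit: update remark into label info
    return entry

db = {"speaker-A": {"template": (1.0, 0.0), "remark": ""}}
updated = add_remark(db, "speaker-A", "project-review meeting")
```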
Further, the acquisition module 12 is specifically configured to detect a first click operation performed on a marker corresponding to at least one fragment to be edited contained in the waveform graph of the current recording, detect a second click operation performed on the target edit mode to be applied to the fragment to be edited, and generate the edit instruction according to the detected first and second click operations.
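Assembling the edit instruction from the two detected clicks can be modelled as resolving the first click to a marker on the waveform and pairing it with the chosen edit mode. The span representation, the coordinate-based hit test, and the instruction's field names are illustrative assumptions.

```python
def marker_at(click_x, markers):
    """markers: list of (start_x, end_x, label) spans drawn on the waveform
    graph; return the label of the span containing the click, or None."""
    for start, end, label in markers:
        if start <= click_x <= end:
            return label
    return None

def build_edit_instruction(click_x, markers, mode):
    """Combine the first click (marker hit) and the second click (edit mode)
    into one edit instruction carrying label information and edit mode."""
    label = marker_at(click_x, markers)
    if label is None:
        raise ValueError("first click did not hit a fragment marker")
    return {"label": label, "mode": mode}

spans = [(0, 40, "speaker-A"), (41, 90, "speaker-B")]
instr = build_edit_instruction(55, spans, "crop")   # click lands in speaker-B's span
```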
Alternatively, the acquisition module 12 is specifically configured to detect a first click operation selecting at least one marker from a list contained in the current recording, the selected marker indicating the fragment to be edited; detect a second click operation performed on the target edit mode to be applied to the fragment to be edited; and generate the edit instruction according to the detected first and second click operations.
Each functional module of the recording device provided in this embodiment can be used to perform the flow of the recording editing method shown in the above embodiments; its specific working principle is not repeated here, and reference may be made to the description of the method embodiments.
The recording device provided in this embodiment performs acoustic analysis on the current recording, marks the current recording according to the analysis result, receives an edit instruction for editing the current recording (the edit instruction carrying the label information of the fragment to be edited and the edit mode), obtains the fragment to be edited from the marked current recording according to the label information, and edits the fragment according to the edit mode. Because this embodiment marks the current recording by voiceprint recognition and, after marking is complete, lets the user edit the recording based on the markers, the fragment to be edited can be located quickly, saving editing time and improving the user experience.
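Put together, the four steps above (mark, obtain instruction, select fragment, edit) form a short pipeline. The sketch below is one possible reading, assuming fragments are represented as (label, chunk) pairs and only two example edit modes; the patent names neither the representation nor a fixed mode set.

```python
def edit_recording(fragments, edit_instruction):
    """fragments: list of (label, audio_chunk) pairs produced by marking the
    current recording. Apply the instruction's edit mode to every fragment
    whose label matches, and return the edited recording."""
    label = edit_instruction["label"]
    mode = edit_instruction["mode"]
    if mode == "delete":       # remove the labelled speaker's fragments
        return [(l, a) for l, a in fragments if l != label]
    if mode == "keep-only":    # keep only the labelled speaker's fragments
        return [(l, a) for l, a in fragments if l == label]
    raise ValueError(f"unsupported edit mode: {mode}")

recording = [("speaker-A", "chunk1"), ("speaker-B", "chunk2"), ("speaker-A", "chunk3")]
trimmed = edit_recording(recording, {"label": "speaker-B", "mode": "delete"})
```

The selection step the patent describes is implicit here: filtering by label is what "selecting the fragment to be edited according to the label information" amounts to in this toy representation.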
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments can be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes media that can store program code, such as ROM, RAM, magnetic disks, or optical discs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some or all of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solution to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (18)
1. A recording editing method, characterized by comprising:
performing acoustic analysis on a current recording and marking the current recording according to the acoustic analysis result;
obtaining an edit instruction for editing the current recording, the edit instruction carrying label information of a fragment to be edited and an edit mode;
selecting the fragment to be edited from the marked current recording according to the label information;
editing the fragment to be edited according to the edit mode.
2. The recording editing method according to claim 1, wherein performing acoustic analysis on the current recording and marking the current recording according to the acoustic analysis result comprises:
collecting the current recording and extracting voiceprint characteristic parameters from the current recording;
performing voiceprint cluster training on the voiceprint characteristic parameters to obtain a target voiceprint template of the voiceprint characteristic parameters;
judging whether the target voiceprint template is a voiceprint template in a voiceprint database;
if the judgment result is yes, obtaining from the voiceprint database the target label information corresponding to the target voiceprint template;
marking the current recording with the target label information.
3. The recording editing method according to claim 2, wherein before marking the current recording with the target label information, the method further comprises:
if the judgment result is no, generating the target label information corresponding to the target voiceprint template.
4. The recording editing method according to claim 3, wherein after marking the current recording with the target label information, the method further comprises:
establishing a mapping relationship between the target voiceprint template and the target label information and storing it in the voiceprint database.
5. The recording editing method according to any one of claims 1-4, wherein before collecting the current recording and extracting the voiceprint characteristic parameters from the current recording, the method comprises:
analyzing sample audio and extracting the voiceprint characteristic parameters of the sample audio;
performing voiceprint cluster training according to the voiceprint characteristic parameters of the sample audio to generate a sample voiceprint template;
generating sample label information corresponding to the sample voiceprint template;
generating the voiceprint database using the sample voiceprint template, the sample label information, and the mapping relationship between the sample voiceprint template and the sample label information.
6. The recording editing method according to claim 5, wherein performing voiceprint cluster training according to the voiceprint characteristic parameters of the sample audio to generate the sample voiceprint template comprises:
obtaining the voiceprint characteristic parameters of the sample audio within a preset time period;
when the voiceprint characteristic parameters of the sample audio within the preset time period are similar, performing voiceprint cluster training on the voiceprint characteristic parameters of the sample audio to generate the sample voiceprint template.
7. The recording editing method according to any one of claims 1-4, wherein after marking the current recording with the target label information, the method further comprises:
receiving remark information sent by a user through a terminal;
annotating the current recording with the remark information;
updating the remark information into the target label information in the voiceprint database.
8. The recording editing method according to any one of claims 1-4, wherein obtaining the edit instruction for editing the current recording comprises:
detecting a first click operation performed on a marker corresponding to at least one fragment to be edited contained in the waveform graph of the current recording;
detecting a second click operation performed on a target edit mode to be applied to the fragment to be edited;
generating the edit instruction according to the detected first click operation and second click operation.
9. The recording editing method according to any one of claims 1-4, wherein obtaining the edit instruction for editing the current recording comprises:
detecting a first click operation selecting at least one marker from a list contained in the current recording, the selected marker indicating the fragment to be edited;
detecting a second click operation performed on a target edit mode to be applied to the fragment to be edited;
generating the edit instruction according to the detected first click operation and second click operation.
10. A recording device, characterized by comprising:
a marking module, configured to perform acoustic analysis on a current recording and mark the current recording according to the acoustic analysis result;
an acquisition module, configured to obtain an edit instruction for editing the current recording, the edit instruction carrying label information of a fragment to be edited and an edit mode;
a selection module, configured to select the fragment to be edited from the marked current recording according to the label information;
an editing module, configured to edit the fragment to be edited according to the edit mode.
11. The recording device according to claim 10, wherein the marking module comprises:
an extraction unit, configured to collect the current recording and extract voiceprint characteristic parameters from the current recording;
a training unit, configured to perform voiceprint cluster training on the voiceprint characteristic parameters to obtain a target voiceprint template of the voiceprint characteristic parameters;
a judging unit, configured to judge whether the target voiceprint template is a voiceprint template in a voiceprint database;
an acquiring unit, configured to, when the judgment result of the judging unit is yes, obtain from the voiceprint database the target label information corresponding to the target voiceprint template;
a marking unit, configured to mark the current recording with the target label information.
12. The recording device according to claim 11, wherein the marking module further comprises:
a generating unit, configured to, when the judgment result of the judging unit is no, generate the target label information corresponding to the target voiceprint template.
13. The recording device according to claim 12, wherein the marking module further comprises:
an establishing unit, configured to, after the marking unit marks the current recording with the target label information, establish a mapping relationship between the target voiceprint template and the target label information and store it in the voiceprint database.
14. The recording device according to any one of claims 10-13, wherein:
the extraction unit is further configured to, before collecting the current recording and extracting the voiceprint characteristic parameters from the current recording, analyze sample audio and extract the voiceprint characteristic parameters of the sample audio;
the training unit is further configured to perform voiceprint cluster training according to the voiceprint characteristic parameters of the sample audio to generate a sample voiceprint template;
the generating unit is further configured to generate sample label information corresponding to the sample voiceprint template;
the establishing unit is further configured to generate the voiceprint database using the sample voiceprint template, the sample label information, and the mapping relationship between the sample voiceprint template and the sample label information.
15. The recording device according to claim 14, wherein the training unit is specifically configured to obtain the voiceprint characteristic parameters of the sample audio within a preset time period and, when the voiceprint characteristic parameters of the sample audio within the preset time period are similar, perform voiceprint cluster training on the voiceprint characteristic parameters of the sample audio to generate the sample voiceprint template.
16. The recording device according to claim 13, wherein the marking module further comprises:
a receiving unit, configured to, after the marking module marks the current recording with the target label information, receive remark information sent by a user through a terminal;
the marking unit is further configured to annotate the current recording with the remark information;
the establishing unit is further configured to update the remark information into the target label information in the voiceprint database.
17. The recording device according to any one of claims 10-13, wherein the acquisition module is specifically configured to detect a first click operation performed on a marker corresponding to at least one fragment to be edited contained in the waveform graph of the current recording, detect a second click operation performed on a target edit mode to be applied to the fragment to be edited, and generate the edit instruction according to the detected first click operation and second click operation.
18. The recording device according to any one of claims 10-13, wherein the acquisition module is specifically configured to detect a first click operation selecting at least one marker from a list contained in the current recording, the selected marker indicating the fragment to be edited, detect a second click operation performed on a target edit mode to be applied to the fragment to be edited, and generate the edit instruction according to the detected first click operation and second click operation.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510786352.9A CN105895102A (en) | 2015-11-15 | 2015-11-15 | Recording editing method and recording device |
PCT/CN2016/089020 WO2017080235A1 (en) | 2015-11-15 | 2016-07-07 | Audio recording editing method and recording device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510786352.9A CN105895102A (en) | 2015-11-15 | 2015-11-15 | Recording editing method and recording device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105895102A true CN105895102A (en) | 2016-08-24 |
Family
ID=57001979
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510786352.9A Pending CN105895102A (en) | 2015-11-15 | 2015-11-15 | Recording editing method and recording device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105895102A (en) |
WO (1) | WO2017080235A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106356067A (en) * | 2016-08-25 | 2017-01-25 | 乐视控股(北京)有限公司 | Recording method, device and terminal |
CN107403623A (en) * | 2017-07-31 | 2017-11-28 | 努比亚技术有限公司 | Store method, terminal, Cloud Server and the readable storage medium storing program for executing of recording substance |
CN107481743A (en) * | 2017-08-07 | 2017-12-15 | 捷开通讯(深圳)有限公司 | The edit methods of mobile terminal, memory and recording file |
CN107564531A (en) * | 2017-08-25 | 2018-01-09 | 百度在线网络技术(北京)有限公司 | Minutes method, apparatus and computer equipment based on vocal print feature |
CN109545200A (en) * | 2018-10-31 | 2019-03-29 | 深圳大普微电子科技有限公司 | Edit the method and storage device of voice content |
CN110753263A (en) * | 2019-10-29 | 2020-02-04 | 腾讯科技(深圳)有限公司 | Video dubbing method, device, terminal and storage medium |
CN114242120A (en) * | 2021-11-25 | 2022-03-25 | 广东电力信息科技有限公司 | Audio editing method and audio marking method based on DTMF technology |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116132234B (en) * | 2023-01-09 | 2024-03-12 | 天津大学 | Underwater hidden communication method and device using whale animal whistle phase code |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011160390A (en) * | 2010-01-28 | 2011-08-18 | Akitoshi Noda | System utilizing sound recording function of mobile phone for screen display, emergency communication access function or the like |
CN102985965A (en) * | 2010-05-24 | 2013-03-20 | 微软公司 | Voice print identification |
CN103530432A (en) * | 2013-09-24 | 2014-01-22 | 华南理工大学 | Conference recorder with speech extracting function and speech extracting method |
CN103700370A (en) * | 2013-12-04 | 2014-04-02 | 北京中科模识科技有限公司 | Broadcast television voice recognition method and system |
- 2015-11-15: CN application CN201510786352.9A filed (patent/CN105895102A/en, status: pending)
- 2016-07-07: PCT application PCT/CN2016/089020 filed (patent/WO2017080235A1/en, application filing)
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106356067A (en) * | 2016-08-25 | 2017-01-25 | 乐视控股(北京)有限公司 | Recording method, device and terminal |
CN107403623A (en) * | 2017-07-31 | 2017-11-28 | 努比亚技术有限公司 | Store method, terminal, Cloud Server and the readable storage medium storing program for executing of recording substance |
CN107481743A (en) * | 2017-08-07 | 2017-12-15 | 捷开通讯(深圳)有限公司 | The edit methods of mobile terminal, memory and recording file |
WO2019029494A1 (en) * | 2017-08-07 | 2019-02-14 | 捷开通讯(深圳)有限公司 | Mobile terminal, memory and record file editing method |
CN107564531A (en) * | 2017-08-25 | 2018-01-09 | 百度在线网络技术(北京)有限公司 | Minutes method, apparatus and computer equipment based on vocal print feature |
CN109545200A (en) * | 2018-10-31 | 2019-03-29 | 深圳大普微电子科技有限公司 | Edit the method and storage device of voice content |
CN110753263A (en) * | 2019-10-29 | 2020-02-04 | 腾讯科技(深圳)有限公司 | Video dubbing method, device, terminal and storage medium |
CN114242120A (en) * | 2021-11-25 | 2022-03-25 | 广东电力信息科技有限公司 | Audio editing method and audio marking method based on DTMF technology |
CN114242120B (en) * | 2021-11-25 | 2023-11-10 | 广东电力信息科技有限公司 | Audio editing method and audio marking method based on DTMF technology |
Also Published As
Publication number | Publication date |
---|---|
WO2017080235A1 (en) | 2017-05-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105895102A (en) | Recording editing method and recording device | |
US10977299B2 (en) | Systems and methods for consolidating recorded content | |
CN107274916B (en) | Method and device for operating audio/video file based on voiceprint information | |
CN105895077A (en) | Recording editing method and recording device | |
CN1333363C (en) | Audio signal processing apparatus and audio signal processing method | |
CN108305632A (en) | A kind of the voice abstract forming method and system of meeting | |
CN102568478B (en) | Video play control method and system based on voice recognition | |
US20210366488A1 (en) | Speaker Identification Method and Apparatus in Multi-person Speech | |
CN105206258A (en) | Generation method and device of acoustic model as well as voice synthetic method and device | |
CN102436812A (en) | Conference recording device and conference recording method using same | |
CN108257592A (en) | A kind of voice dividing method and system based on shot and long term memory models | |
RU2013140574A (en) | SEMANTIC SOUND TRACK MIXER | |
CN108009303A (en) | Searching method, device, electronic equipment and storage medium based on speech recognition | |
CN109448460A (en) | One kind reciting detection method and user equipment | |
CN101185115A (en) | Voice edition device, voice edition method, and voice edition program | |
CN108010512A (en) | The acquisition methods and recording terminal of a kind of audio | |
Stoeger et al. | Age-group estimation in free-ranging African elephants based on acoustic cues of low-frequency rumbles | |
CN107679196A (en) | A kind of multimedia recognition methods, electronic equipment and storage medium | |
Roy et al. | Fast transcription of unstructured audio recordings | |
CN108364655A (en) | Method of speech processing, medium, device and computing device | |
CN109817223A (en) | Phoneme notation method and device based on audio-frequency fingerprint | |
CN105895079A (en) | Voice data processing method and device | |
CN104240697A (en) | Audio data feature extraction method and device | |
CN108235137B (en) | Method and device for judging channel switching action through sound waveform and television | |
CN114242120B (en) | Audio editing method and audio marking method based on DTMF technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20160824 |