CN105895102A - Recording editing method and recording device - Google Patents

Recording editing method and recording device

Info

Publication number
CN105895102A
CN105895102A (application CN201510786352.9A)
Authority
CN
China
Prior art keywords
vocal print
recording
current recording
edited
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510786352.9A
Other languages
Chinese (zh)
Inventor
蔡竹沁
齐峰岩
牛磊
关彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LeTV Mobile Intelligent Information Technology Beijing Co Ltd
Original Assignee
LeTV Mobile Intelligent Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LeTV Mobile Intelligent Information Technology Beijing Co Ltd filed Critical LeTV Mobile Intelligent Information Technology Beijing Co Ltd
Priority to CN201510786352.9A priority Critical patent/CN105895102A/en
Priority to PCT/CN2016/089020 priority patent/WO2017080235A1/en
Publication of CN105895102A publication Critical patent/CN105895102A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques specially adapted for particular use
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel

Abstract

The invention provides a recording editing method and a recording device. Acoustic analysis is performed on the current recording, and the current recording is marked according to the acoustic analysis result. An editing instruction for the current recording is received; the instruction carries the mark information of a fragment to be edited and an editing mode. The fragment to be edited is retrieved from the marked current recording according to the mark information and is then edited according to the editing mode. Because the current recording is marked by voiceprint recognition and the user edits it on the basis of those marks, the fragment to be edited can be located quickly, editing time is saved, and the user experience is improved.

Description

Recording editing method and recording device
Technical field
The present invention relates to the field of electronic technology, and in particular to a recording editing method and a recording device.
Background technology
Smart phones have gradually become part of people's daily lives; they serve not only as everyday communication devices but also as convenient portable recording devices. With a recording application (Application, abbreviated APP) on a smart phone, a user can record and save voice information, which makes it easy to quickly preserve a piece of spoken information that is hard to memorize directly, and to replay that recording as many times as needed.
Typically, a recording file made by a user contains unwanted segments; these segments both occupy storage space and hinder the user in finding the genuinely needed information. Existing recording APPs let the user edit a recording file according to its actual content, but this requires the user to play the file back repeatedly to determine what should be edited. Clearly, this way of editing consumes a great deal of the user's time and makes for a poor user experience.
Summary of the invention
The present invention provides a recording editing method and a recording device, which are used to solve the problem that editing a recording with existing tools wastes the user's time and degrades the user experience.
To achieve these goals, the invention provides a recording editing method, including:
performing acoustic analysis on the current recording and marking the current recording according to the acoustic analysis result;
receiving an editing instruction for the current recording, the editing instruction carrying the mark information of a fragment to be edited and an editing mode;
selecting the fragment to be edited from the marked current recording according to the mark information;
editing the fragment to be edited according to the editing mode.
To achieve these goals, the invention also provides a recording device, including:
a marking module, configured to perform acoustic analysis on the current recording and mark the current recording according to the acoustic analysis result;
an acquisition module, configured to obtain an editing instruction for the current recording, the editing instruction carrying the mark information of a fragment to be edited and an editing mode;
a selection module, configured to select the fragment to be edited from the marked current recording according to the mark information;
an editing module, configured to edit the fragment to be edited according to the editing mode.
With the recording editing method and recording device of the present invention, acoustic analysis is performed on the current recording, the current recording is marked according to the analysis result, an editing instruction carrying the mark information of the fragment to be edited and an editing mode is received, the fragment to be edited is obtained from the marked current recording according to the mark information, and the fragment is edited according to the editing mode. Because the present invention marks the current recording by voiceprint recognition and the user then edits it on the basis of those marks, the fragment to be edited can be located quickly, editing time is saved, and the user experience is improved.
Brief description of the drawings
Fig. 1 is a flow diagram of the recording editing method of Embodiment 1 of the present invention;
Fig. 2 is a first application-example diagram of the recording editing method of Embodiment 1;
Fig. 3 is a second application-example diagram of the recording editing method of Embodiment 1;
Fig. 4 is a third application-example diagram of the recording editing method of Embodiment 1;
Fig. 5 is a fourth application-example diagram of the recording editing method of Embodiment 1;
Fig. 6 is a flow diagram of the recording marking method in Embodiment 1;
Fig. 7 is a first application-example diagram of the recording marking method in Embodiment 1;
Fig. 8 is a second application-example diagram of the recording marking method in Embodiment 1;
Fig. 9 is a third application-example diagram of the recording marking method in Embodiment 1;
Fig. 10 is a flow diagram of the voiceprint database establishment method in Embodiment 1;
Fig. 11 is a structural diagram of the recording device of Embodiment 2 of the present invention;
Fig. 12 is a structural diagram of the marking module in Embodiment 2.
Detailed description of the embodiments
The recording editing method and recording device provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Embodiment 1
As shown in Fig. 1, a flow diagram of the recording editing method of Embodiment 1 of the present invention, the recording editing method includes:
Step 101: perform acoustic analysis on the current recording and mark the current recording according to the acoustic analysis result.
Through the user interface of a smart phone, the user can start the recording function of a recording APP installed on the phone. The recording APP begins to capture the current recording and may pre-process the sound during capture. Acoustic analysis is performed on the captured current recording to obtain an acoustic analysis result, which includes acoustic characteristic parameters. Because a speaker's voiceprint is unique, the voiceprint can be used as a feature that distinguishes one speaker from another, so the current recording can be marked according to these acoustic characteristic parameters. The acoustic characteristic parameters include: the energy of the sound, formants, Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC).
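As a rough illustration of this front end, the sketch below frames a captured signal and computes a toy per-frame feature vector in pure Python. The 16 kHz rate, the 25 ms/10 ms frame sizes, and the use of log-energy and zero-crossing rate in place of full MFCC/LPC extraction are all illustrative assumptions, not the patent's implementation.

```python
import math

def frame_signal(samples, frame_len=400, hop=160):
    """Split a mono sample list into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_features(frame):
    """Toy per-frame feature vector: log-energy and zero-crossing rate.
    A real voiceprint front end would compute MFCC and LPC here."""
    energy = sum(s * s for s in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / len(frame)
    return (math.log(energy + 1e-12), zcr)

# One second of a 440 Hz tone standing in for captured speech.
samples = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
feats = [frame_features(f) for f in frame_signal(samples)]
```

Each frame contributes one feature vector; the sequence of vectors is what the later clustering and template-matching steps would consume.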
As shown in Fig. 2, an application-example diagram of this embodiment, suppose a recording contains 5 speakers; left slashes, right slashes, horizontal lines, vertical lines and a grid pattern are used respectively to mark speakers A, B, C, D and E. If speaker A speaks twice in this recording, separated by other speakers, both of A's passages are marked with left slashes to show that they are recording paragraphs of the same speaker. To let the user distinguish speakers more intuitively, different colors can be used instead, for example red, yellow, blue, green and purple for speakers A, B, C, D and E; when speaker A speaks twice, separated by other speakers, both passages are marked in red to show that they belong to the same speaker.
Step 102: obtain an editing instruction for the current recording, the editing instruction carrying the mark information of a fragment to be edited and an editing mode.
Further, after the current recording has been marked, the user can see the marked recording on the terminal's display, and can then issue an editing instruction to the recording APP through the terminal according to the marks. The editing instruction carries the mark information of the fragment to be edited as well as the editing mode for that fragment. The editing mode may include cutting the selected fragment, merging several selected fragments, or deleting the selected fragment.
In this embodiment, obtaining the editing instruction for the current recording proceeds as follows. First, through the terminal the user clicks at least one mark contained in the waveform display of the current recording, thereby selecting the corresponding fragment to be edited; after the user clicks the mark, the recording APP detects this as a first click operation on the mark corresponding to at least one fragment to be edited. Then, having selected the fragment, the user picks from the editing modes shown on the terminal's display the target editing mode to apply to the fragment; after the user clicks it, the recording APP detects this as a second click operation on the target editing mode. After the first and second click operations have been detected, the editing instruction is generated from them.
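The two detected click operations can be folded into a single instruction object. The sketch below is a minimal Python model; the class name `EditInstruction` and the mode strings `"cut"`, `"merge"` and `"delete"` are assumed names for illustration, since the patent does not prescribe a concrete data layout.

```python
from dataclasses import dataclass

@dataclass
class EditInstruction:
    """Carries the mark information of the fragment(s) to edit and the editing mode."""
    mark_ids: list   # marks chosen by the first click operation(s)
    edit_mode: str   # target editing mode chosen by the second click operation

def build_instruction(first_clicks, second_click):
    """Generate the editing instruction from the two detected click operations."""
    if second_click not in {"cut", "merge", "delete"}:
        raise ValueError(f"unknown edit mode: {second_click}")
    return EditInstruction(mark_ids=list(first_clicks), edit_mode=second_click)

instr = build_instruction(["horizontal_C"], "cut")
```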
Step 103: obtain the fragment to be edited from the marked current recording according to the mark information.
Step 104: edit the fragment to be edited according to the editing mode.
After receiving the editing instruction, the recording APP extracts from it the mark information of the fragment to be edited and then selects that fragment from the current recording according to the mark information. The recording APP also extracts the editing mode from the instruction, for example cutting, merging or deleting the fragment. Once the fragment to be edited has been obtained, the APP edits it according to the indicated editing mode.
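Steps 103 and 104 can be sketched as selecting the marked segments and dispatching on the editing mode. The segment representation below, `(start_seconds, end_seconds, mark_id)` triples, and the mode names are assumptions for illustration only.

```python
def select_fragments(segments, mark_ids):
    """Pick the fragments whose mark matches the mark information carried
    in the editing instruction. A segment is (start_s, end_s, mark_id)."""
    return [s for s in segments if s[2] in mark_ids]

def apply_edit(segments, mark_ids, mode):
    """Dispatch on the editing mode: cut, merge or delete the chosen fragments."""
    chosen = select_fragments(segments, mark_ids)
    if mode == "delete":
        return [s for s in segments if s not in chosen]
    if mode == "cut":    # keep only the chosen fragments as the result clip
        return chosen
    if mode == "merge":  # splice the chosen fragments into one new segment
        total = sum(end - start for start, end, _ in chosen)
        return [(0.0, total, chosen[0][2])]
    raise ValueError(f"unknown edit mode: {mode}")

# Speaker A speaks twice, interrupted by speaker B.
recording = [(0.0, 4.0, "A"), (4.0, 9.0, "B"), (9.0, 12.0, "A")]
merged = apply_edit(recording, ["A"], "merge")
```

Merging speaker A's two passages (4 s and 3 s) yields one new 7-second segment, mirroring the Fig. 4 scenario.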
As shown in Fig. 3, an application-example diagram of this embodiment, when the user edits a recording file that has been marked by voiceprint analysis, the waveform of the recording clearly carries the different marks. By clicking a mark on the waveform the user selects the corresponding fragment as the fragment to be edited; in Fig. 3 the user has clicked to select the fragment marked with horizontal lines. After selecting the fragment, the user clicks in the edit menu the target editing mode for it, for example "cut selected fragment". These two click operations generate the editing instruction for the fragment to be edited, according to which the fragment is cut.
As shown in Fig. 4, an application-example diagram of this embodiment, a list of the marks in the recording is provided to the user below the waveform. The user can select a mark directly from the list, which selects all fragments of the speaker that the mark represents. For example, a recording has 3 speakers, marked with left slashes, right slashes and horizontal lines for speakers A, B and C respectively, and speaker A speaks twice, separated by other speakers, with both passages marked with left slashes. When the user clicks the left-slash option in the list, both fragments are selected at once; the user can click an individual fragment to deselect it or keep it selected. After selecting several fragments, if the user wants to merge them, the user clicks "merge selected fragments" in the editing-mode list as the target editing mode. Once the click operations are complete, the recording APP obtains the editing instruction and can merge the selected fragments into one new segment.
As shown in Fig. 5, an application-example diagram of this embodiment, the user can also choose several mark options from the mark list. In Fig. 5 the user has selected both the left-slash and the right-slash options, so all recording fragments of speaker A and speaker B are selected. Finally the user clicks "merge selected fragments", and the chosen fragments are merged into one new segment. Furthermore, the user can pick a subset of the selected fragments and merge only those.
With the recording editing method provided by this embodiment, acoustic analysis is performed on the current recording, the recording is marked according to the analysis result, an editing instruction carrying the mark information of the fragment to be edited and an editing mode is received, the fragment is obtained from the marked recording according to the mark information, and the fragment is edited according to the editing mode. Because this embodiment marks the current recording by voiceprint recognition and the user edits on the basis of those marks, the fragment to be edited can be located quickly, editing time is saved, and the user experience is improved.
Before the current recording can be edited in Embodiment 1, it must first be marked. The detailed procedure of step 101 in the above embodiment is shown in Fig. 6, a flow diagram of the recording marking method in Embodiment 1. The recording marking method includes the following steps:
Step 201: capture the current recording and extract voiceprint characteristic parameters from it.
Through the user interface of the smart phone, the user starts the recording function of the recording APP, which begins to capture the current recording; during capture the APP may pre-process the sound, for example by framing, windowing and filtering the captured data.
Feature analysis is then performed on the captured current recording to obtain its acoustic characteristic parameters, which include: the energy of the sound, formants, MFCC and LPC.
Step 202: perform voiceprint cluster training on the voiceprint parameters to obtain the target voiceprint template of the voiceprint parameters.
In this embodiment, a voiceprint cluster trainer is provided in order to identify the template of a recording. After the voiceprint characteristic parameters have been obtained, the trainer performs voiceprint cluster training on them, which yields the target voiceprint template corresponding to the current recording.
Step 203: judge whether the target voiceprint template is a voiceprint template in the voiceprint database.
In this embodiment, the trainer has previously performed voiceprint cluster training on sample sounds to obtain the corresponding sample voiceprint templates, and a voiceprint database built in advance from these sample templates is stored in the recording APP. In general the voiceprint database stores multiple sample voiceprint templates so that recordings can be marked for the user during recording. After obtaining the target voiceprint template, the recording APP looks it up in the voiceprint database to judge whether the target template is present there.
If the judgment is yes, step 204 is performed; otherwise step 205 is performed.
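The database lookup of step 203 is, in essence, a nearest-template search. The sketch below uses cosine similarity over assumed fixed-length feature vectors with an assumed threshold; the patent does not specify the matching metric, so both are illustrative choices.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def lookup_template(target, database, threshold=0.95):
    """Return the mark label of the closest stored sample template, or None
    if nothing in the database is similar enough (the 'no' branch of step 203)."""
    best_label, best_sim = None, threshold
    for template, label in database.items():
        sim = cosine(target, list(template))
        if sim >= best_sim:
            best_label, best_sim = label, sim
    return best_label

# Toy database: two stored sample templates with their mark labels.
db = {(0.9, 0.1, 0.4): "slash_A", (0.1, 0.8, 0.5): "backslash_B"}
label = lookup_template([0.88, 0.12, 0.41], db)
```

A target vector close to speaker A's stored template matches it; a dissimilar vector falls through to `None`, which corresponds to generating fresh mark information in step 205.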
Step 204: obtain from the voiceprint database the target mark information corresponding to the target voiceprint template.
The voiceprint database stores not only the sample voiceprint templates but also the mark information corresponding to each of them; in general, every sample voiceprint template has its own mark information. When the sample voiceprint template corresponding to the target voiceprint template is found in the database, the target mark information corresponding to the target template can be obtained.
Step 205: generate the target mark information corresponding to the target voiceprint template.
After recognizing that the target voiceprint template does not exist in the voiceprint database, the recording APP can set a piece of target mark information for the target voiceprint template, so that the template can be marked by this target mark information.
Step 206: mark the current recording with the target mark information.
After obtaining the target mark information, the recording APP automatically uses it to mark the current recording.
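Taken together, steps 203 through 206 (with the mapping storage of step 207) amount to a get-or-create lookup on the voiceprint database. A minimal sketch, assuming templates can serve as hashable keys and mark labels are generated sequentially:

```python
import itertools

def mark_recording(target_template, db, next_label):
    """Reuse the stored mark when the template is already in the database
    (step 204); otherwise generate new mark information (step 205) and
    store the template -> mark mapping (step 207)."""
    label = db.get(target_template)
    if label is None:
        label = next_label()
        db[target_template] = label
    return label

counter = itertools.count(1)
new_label = lambda: f"speaker_{next(counter)}"

db = {}
first = mark_recording(("tpl_A",), db, new_label)   # unknown template -> new mark
second = mark_recording(("tpl_A",), db, new_label)  # known template  -> same mark
```

This is why the same speaker's later recordings receive the same mark automatically.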
With the recording marking method of this embodiment, the voiceprint template corresponding to the current recording is recognized, the established voiceprint database is used to obtain the mark information corresponding to the current recording, and the current recording is then marked. Automatic marking of recordings is thereby achieved, saving the user the time of adding marks manually.
For an application-example diagram of the recording marking method, see Fig. 2 of this embodiment, which is not repeated here.
Step 207: establish the mapping relation between the target voiceprint template and the target mark information and store it in the voiceprint database.
Step 208: receive the remark information sent by the user through the terminal.
Step 209: annotate the current recording with the remark information.
Step 210: update the remark information into the target mark information in the voiceprint database.
The remark information received from the user through the terminal may be, for example, the name of the source of the current recording. After the terminal obtains the remark information, it instructs the recording APP to annotate the current recording with it; for example, the APP can add a label at the corresponding position of the current recording. Further, the APP can update the remark information into the target mark information that corresponds, in the voiceprint database, to the target voiceprint template of the current recording, so that it can be called up again when the same sound source is recorded.
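Steps 208 through 210 then attach the user-supplied remark to the stored mark information. The entry layout below, a dict with `label` and `remark` fields, is an assumed structure for illustration; the patent leaves the storage format open.

```python
def add_remark(db, template, remark):
    """Annotate the database entry of the matching voiceprint template with
    the user's remark (e.g. the source name of the current recording)."""
    entry = db.setdefault(template, {"label": None, "remark": None})
    entry["remark"] = remark
    return entry

db = {("tpl_A",): {"label": "left_slash", "remark": None}}
add_remark(db, ("tpl_A",), "Mr. Zhang")
```

After this update, a later recording that matches the same template can display "Mr. Zhang" instead of the raw hatch mark, as in the Fig. 8 example.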
As shown in Fig. 7, an application-example diagram of this embodiment, after the recording APP has automatically marked the current recording, the user can send remark information to the APP through the terminal in order to add a remark for each speaker in the recording. For example, through the recording APP the user can annotate speaker A, marked with left slashes, as "Mr. Zhang". The user can add remark information for a new speaker, match it directly to that speaker's voiceprint, and even use it as the title of the recording.
As shown in Fig. 8, an application-example diagram of this embodiment, when the user makes a new recording that contains a speaker whose name has already been saved, the passages of that speaker are, after voiceprint analysis, directly labeled with the saved mark information. If speaker A of an earlier recording was saved as "Mr. Zhang", a new recording containing this speaker no longer shows the plain mark of speaker A but shows "Mr. Zhang" instead.
As shown in Fig. 9, an application-example diagram of this embodiment, when recordings contain the mark information the user has saved for their speakers, the user can locate the wanted recording faster by speaker. For example, if the user is looking for a teaching recording by Mr. Zhang, it suffices to find the label of "Mr. Zhang".
Before step 201 captures the current recording and extracts the voiceprint characteristic parameters from it, a voiceprint database must first be established from sample sounds.
As shown in Fig. 10, a flow diagram of the voiceprint database establishment method in Embodiment 1, the method includes:
Step 301: analyze the sample sounds and extract the voiceprint characteristic parameters of the sample sounds.
In this embodiment, every sound recorded by the recording APP before the current recording is taken as a sample sound. Each time a recording is obtained, the APP can analyze it as a sample sound and extract its voiceprint characteristic parameters, which include: the energy of the sound, formants, MFCC, LPC, and so on.
Step 302: perform voiceprint cluster training on the voiceprint characteristic parameters of the sample sounds to generate the sample voiceprint templates.
Before voiceprint cluster training can be performed on the extracted voiceprint characteristic parameters, it must further be determined whether the parameters come from the same sound source. Specifically, the voiceprint characteristic parameters of the sample sounds within a preset time period are obtained; when the parameters of the sample sounds within the preset time period are similar to one another, voiceprint cluster training is performed on them to generate the sample voiceprint template. If the voiceprint characteristic parameters are judged not to be similar, they are cached, and only after the parameters are later judged similar is voiceprint cluster training performed on them to generate the sample voiceprint template.
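One way to read this cluster-training step is as greedy one-pass clustering with a similarity threshold: each feature vector is folded into the first sufficiently similar centroid, and otherwise starts a new template. The cosine metric and the 0.9 threshold below are assumptions, not the patent's algorithm.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def cluster_templates(feature_vectors, threshold=0.9):
    """Greedy one-pass clustering: each vector joins the first centroid it is
    similar enough to; otherwise it founds a new sample voiceprint template."""
    centroids = []  # list of (running-mean vector, member count)
    for vec in feature_vectors:
        for i, (c, n) in enumerate(centroids):
            if cosine(vec, c) >= threshold:
                # Fold the vector into the running mean of that cluster.
                centroids[i] = ([(ci * n + vi) / (n + 1)
                                 for ci, vi in zip(c, vec)], n + 1)
                break
        else:
            centroids.append((list(vec), 1))
    return [c for c, _ in centroids]

# Two vectors from one speaker plus one from another -> two templates.
vecs = [(1.0, 0.0), (0.98, 0.05), (0.0, 1.0)]
templates = cluster_templates(vecs)
```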
For example, if a recording contains 5 speakers, these 5 speakers provide the sample sounds; after voiceprint cluster training, the 5 speakers can be identified as speakers A, B, C, D and E, and a corresponding sample voiceprint template is generated for each of them.
Step 303: generate the sample mark information corresponding to each sample voiceprint template.
After the sample voiceprint templates have been generated, corresponding sample mark information is generated for the sample sounds; for example, the same speaker is always given the same mark. In this embodiment, left slashes, right slashes, horizontal lines, vertical lines and a grid pattern can be used to mark speakers A, B, C, D and E.
Step 304: generate the voiceprint database from the sample voiceprint templates, the sample mark information, and the mapping relations between the sample voiceprint templates and the sample mark information.
To speed up the marking of recordings, this embodiment generates the voiceprint database from the sample voiceprint templates, the sample mark information, and the mapping relations between the templates and the mark information. The voiceprint template generated by cluster training after each recording is saved into the database as a sample voiceprint template, together with its mark information and the mapping between the two, so that the voiceprint database is kept up to date. When the same speaker is encountered again, the recording APP can, through voiceprint analysis, quickly mark that speaker's recording, which improves the convenience of recording marking.
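Step 304 can be pictured as building the template-to-mark mapping that step 203 later searches. A toy construction, using the five hatch styles from the Fig. 2 example as assumed mark values:

```python
MARK_STYLES = ["left-slash", "right-slash", "horizontal", "vertical", "grid"]

def build_voiceprint_database(sample_templates):
    """Map each sample voiceprint template to a piece of sample mark
    information; the resulting dict is the voiceprint database."""
    return {tuple(t): MARK_STYLES[i % len(MARK_STYLES)]
            for i, t in enumerate(sample_templates)}

db = build_voiceprint_database([[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]])
```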
Embodiment 2
As shown in Fig. 11, a structural diagram of the recording device of Embodiment 2 of the present invention, the device includes: a marking module 11, an acquisition module 12, a selection module 13 and an editing module 14.
The marking module 11 is configured to perform acoustic analysis on the current recording and mark the current recording according to the acoustic analysis result.
The acquisition module 12 is configured to obtain an editing instruction for the current recording, the editing instruction carrying the mark information of the fragment to be edited and the editing mode.
The selection module 13 is configured to select the fragment to be edited from the marked current recording according to the mark information.
The editing module 14 is configured to edit the fragment to be edited according to the editing mode.
As shown in Fig. 12, in an optional structure of the marking module 11 in Embodiment 2, the module includes: an extraction unit 111, a training unit 112, a judging unit 113, an obtaining unit 114, a marking unit 115, a generating unit 116, an establishing unit 117 and a receiving unit 118.
Wherein, extraction unit 111, it is used for gathering current recording and from current recording, extract vocal print special Levy parameter.
Training unit 112, obtains the mesh of vocal print parameter for vocal print parameter carries out vocal print cluster training Mark vocal print template.
Judging unit 113, for judging whether target vocal print template is the vocal print mould in voice print database Plate.
Acquiring unit 114, for when the judged result of judging unit is for being, from voice print database Obtain the target label information corresponding with target vocal print template.
Indexing unit 115, is used for using target label information to be marked current recording.
Signal generating unit 116, for when the result of judging unit 113 is no, generates and target vocal print The target label information that template is corresponding.
Wherein, set up unit 116, for using target label information to currently at indexing unit 115 After recording is marked, set up between target vocal print template and target label information mapping relations also It is stored in voice print database.
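The flow through units 112-117 (train a target template, reuse a known mapping or create and store a new one) might be sketched as follows; the `cluster` and `next_label` callbacks and the use of a plain dict as the voiceprint database are assumptions made only for illustration:

```python
def mark_current_recording(features, db, cluster, next_label):
    """Return marking information for a recording's voiceprint features.

    Mirrors units 112-117: train a target template from the features,
    reuse its marking information if the template is already in the
    database, otherwise generate new marking information and store the
    new template-to-marking mapping.
    """
    target = cluster(features)       # training unit 112: features -> template
    info = db.get(target)            # judging/acquiring units 113-114
    if info is None:
        info = next_label()          # generating unit 116: new marking info
        db[target] = info            # establishing unit 117: store the mapping
    return info                      # marking unit 115 applies `info`
```

The second recording of the same speaker hashes to the same template key, so it receives the same marking information without generating a new label.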
Further, the extraction unit 111 is also configured to, before collecting the current recording and extracting the voiceprint feature parameters from it, analyze sample audio to extract the voiceprint feature parameters of the sample audio.
The training unit 112 is also configured to perform voiceprint clustering training according to the voiceprint feature parameters of the sample audio to generate a sample voiceprint template.
The generating unit 116 is also configured to generate the sample marking information corresponding to the sample voiceprint template.
The establishing unit 117 is also configured to generate the voiceprint database using the sample voiceprint template, the sample marking information, and the mapping relationship between the sample voiceprint template and the sample marking information.
Further, the training unit 112 is specifically configured to acquire the voiceprint feature parameters of the sample audio within a preset time period, and, when the voiceprint feature parameters of the sample audio within the preset time period are similar, perform voiceprint clustering training on those voiceprint feature parameters to generate the sample voiceprint template.
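A minimal sketch of the windowed clustering just described, under the assumption (not stated in the patent) that the sample template is the centroid of the similar feature vectors collected within the preset period; the `are_similar` callback is likewise an illustrative stand-in for whatever similarity test an implementation would use:

```python
def train_sample_template(frames, window, are_similar):
    """Cluster voiceprint feature vectors from a preset time window.

    `frames` is a list of (timestamp, feature_vector) pairs. The vectors
    falling inside the window are clustered into one sample voiceprint
    template (their centroid), but only when they are mutually similar;
    otherwise no template is produced and None is returned.
    """
    windowed = [f for t, f in frames if t <= window]  # features in the preset period
    if not windowed or not are_similar(windowed):
        return None
    dims = zip(*windowed)
    return tuple(sum(d) / len(windowed) for d in dims)  # centroid as the template
```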
The receiving unit 118 is configured to, after the marking unit 115 marks the current recording using the target marking information, receive remark information sent by the user through a terminal.
The marking unit 115 is also configured to annotate the current recording with the remark information.
The establishing unit 117 is also configured to update the remark information into the target marking information in the voiceprint database.
Further, the acquisition module 12 is specifically configured to detect a first click operation performed on the mark corresponding to at least one segment to be edited contained in the waveform graph of the current recording, detect a second click operation performed on the target edit mode to be used for the segment to be edited, and generate the edit instruction according to the detected first click operation and second click operation.
Optionally, the acquisition module 12 is specifically configured to detect a first click operation of selecting at least one mark from a list of marks contained in the current recording, the selected mark indicating the segment to be edited; detect a second click operation performed on the target edit mode to be used for the segment to be edited; and generate the edit instruction according to the detected first click operation and second click operation.
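Assembling the edit instruction from the two detected click operations can be sketched as below; representing the clicks as list indices and the instruction as a dict is an assumption made for illustration only:

```python
def build_edit_instruction(first_click, second_click, mark_list, edit_modes):
    """Assemble an edit instruction from two detected click operations.

    The first click selects a mark from the recording's mark list
    (indicating the segment to be edited); the second click selects the
    target edit mode to be used for that segment.
    """
    return {
        "marking_info": mark_list[first_click],  # first click: chosen mark
        "edit_mode": edit_modes[second_click],   # second click: chosen mode
    }
```

The resulting dict carries exactly the two fields the method claims require of an edit instruction: the marking information and the edit mode of the segment to be edited.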
The functional modules of the recording device provided in this embodiment can be used to perform the flow of the recording editing method shown in the above embodiment; their specific working principles are not repeated here, and reference may be made to the description of the method embodiment.
The recording device provided in this embodiment performs acoustic wave analysis on the current recording and marks the current recording according to the analysis result, receives an edit instruction for editing the current recording which carries the marking information and the edit mode of the segment to be edited, selects the segment to be edited from the marked current recording according to the marking information, and edits that segment according to the edit mode. In this embodiment the current recording is marked through voiceprint recognition, and after marking is completed the current recording is edited based on the marks selected by the user, so that the segment to be edited can be located quickly, saving editing time and improving the user experience.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical discs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (18)

1. A recording editing method, characterized by comprising:
performing acoustic wave analysis on a current recording and marking the current recording according to the acoustic wave analysis result;
acquiring an edit instruction for editing the current recording, the edit instruction carrying the marking information and the edit mode of a segment to be edited;
selecting the segment to be edited from the marked current recording according to the marking information;
editing the segment to be edited according to the edit mode.
2. The recording editing method according to claim 1, characterized in that performing acoustic wave analysis on the current recording and marking the current recording according to the acoustic wave analysis result comprises:
collecting the current recording and extracting voiceprint feature parameters from the current recording;
performing voiceprint clustering training on the voiceprint feature parameters to obtain a target voiceprint template of the voiceprint feature parameters;
judging whether the target voiceprint template is a voiceprint template in a voiceprint database;
if the judgment result is yes, acquiring the target marking information corresponding to the target voiceprint template from the voiceprint database;
marking the current recording using the target marking information.
3. The recording editing method according to claim 2, characterized in that before marking the current recording using the target marking information, the method further comprises:
if the judgment result is no, generating the target marking information corresponding to the target voiceprint template.
4. The recording editing method according to claim 3, characterized in that after marking the current recording using the target marking information, the method further comprises:
establishing the mapping relationship between the target voiceprint template and the target marking information and storing it in the voiceprint database.
5. The recording editing method according to any one of claims 1-4, characterized in that before collecting the current recording and extracting the voiceprint feature parameters from the current recording, the method comprises:
analyzing sample audio to extract the voiceprint feature parameters of the sample audio;
performing voiceprint clustering training according to the voiceprint feature parameters of the sample audio to generate a sample voiceprint template;
generating the sample marking information corresponding to the sample voiceprint template;
generating the voiceprint database using the sample voiceprint template, the sample marking information, and the mapping relationship between the sample voiceprint template and the sample marking information.
6. The recording editing method according to claim 5, characterized in that performing voiceprint clustering training according to the voiceprint feature parameters of the sample audio to generate the sample voiceprint template comprises:
acquiring the voiceprint feature parameters of the sample audio within a preset time period;
when the voiceprint feature parameters of the sample audio within the preset time period are similar, performing voiceprint clustering training on the voiceprint feature parameters of the sample audio to generate the sample voiceprint template.
7. The recording editing method according to any one of claims 1-4, characterized in that after marking the current recording using the target marking information, the method further comprises:
receiving remark information sent by a user through a terminal;
annotating the current recording with the remark information;
updating the remark information into the target marking information in the voiceprint database.
8. The recording editing method according to any one of claims 1-4, characterized in that acquiring the edit instruction for editing the current recording comprises:
detecting a first click operation performed on the mark corresponding to at least one segment to be edited contained in the waveform graph of the current recording;
detecting a second click operation performed on the target edit mode to be used for the segment to be edited;
generating the edit instruction according to the detected first click operation and the detected second click operation.
9. The recording editing method according to any one of claims 1-4, characterized in that acquiring the edit instruction for editing the current recording comprises:
detecting a first click operation of selecting at least one mark from a list contained in the current recording, the selected mark indicating the segment to be edited;
detecting a second click operation performed on the target edit mode to be used for the segment to be edited;
generating the edit instruction according to the detected first click operation and the detected second click operation.
10. A recording device, characterized by comprising:
a marking module, configured to perform acoustic wave analysis on a current recording and mark the current recording according to the acoustic wave analysis result;
an acquisition module, configured to acquire an edit instruction for editing the current recording, the edit instruction carrying the marking information and the edit mode of a segment to be edited;
a selection module, configured to select the segment to be edited from the marked current recording according to the marking information;
an editing module, configured to edit the segment to be edited according to the edit mode.
11. The recording device according to claim 10, characterized in that the marking module comprises:
an extraction unit, configured to collect the current recording and extract voiceprint feature parameters from the current recording;
a training unit, configured to perform voiceprint clustering training on the voiceprint feature parameters to obtain a target voiceprint template of the voiceprint feature parameters;
a judging unit, configured to judge whether the target voiceprint template is a voiceprint template in a voiceprint database;
an acquiring unit, configured to, when the judgment result of the judging unit is yes, acquire the target marking information corresponding to the target voiceprint template from the voiceprint database;
a marking unit, configured to mark the current recording using the target marking information.
12. The recording device according to claim 11, characterized in that the marking module further comprises:
a generating unit, configured to, when the judgment result of the judging unit is no, generate the target marking information corresponding to the target voiceprint template.
13. The recording device according to claim 12, characterized in that the marking module further comprises:
an establishing unit, configured to, after the marking unit marks the current recording using the target marking information, establish the mapping relationship between the target voiceprint template and the target marking information and store it in the voiceprint database.
14. The recording device according to any one of claims 10-13, characterized in that the extraction unit is also configured to, before collecting the current recording and extracting the voiceprint feature parameters from the current recording, analyze sample audio to extract the voiceprint feature parameters of the sample audio;
the training unit is also configured to perform voiceprint clustering training according to the voiceprint feature parameters of the sample audio to generate a sample voiceprint template;
the generating unit is also configured to generate the sample marking information corresponding to the sample voiceprint template;
the establishing unit is also configured to generate the voiceprint database using the sample voiceprint template, the sample marking information, and the mapping relationship between the sample voiceprint template and the sample marking information.
15. The recording device according to claim 14, characterized in that the training unit is specifically configured to acquire the voiceprint feature parameters of the sample audio within a preset time period, and, when the voiceprint feature parameters of the sample audio within the preset time period are similar, perform voiceprint clustering training on the voiceprint feature parameters of the sample audio to generate the sample voiceprint template.
16. The recording device according to claim 13, characterized in that the marking module further comprises:
a receiving unit, configured to, after the marking module marks the current recording using the target marking information, receive remark information sent by a user through a terminal;
the marking unit is also configured to annotate the current recording with the remark information;
the establishing unit is also configured to update the remark information into the target marking information in the voiceprint database.
17. The recording device according to any one of claims 10-13, characterized in that the acquisition module is specifically configured to detect a first click operation performed on the mark corresponding to at least one segment to be edited contained in the waveform graph of the current recording, detect a second click operation performed on the target edit mode to be used for the segment to be edited, and generate the edit instruction according to the detected first click operation and the detected second click operation.
18. The recording device according to any one of claims 10-13, characterized in that the acquisition module is specifically configured to detect a first click operation of selecting at least one mark from a list contained in the current recording, the selected mark indicating the segment to be edited; detect a second click operation performed on the target edit mode to be used for the segment to be edited; and generate the edit instruction according to the detected first click operation and the detected second click operation.
CN201510786352.9A 2015-11-15 2015-11-15 Recording editing method and recording device Pending CN105895102A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510786352.9A CN105895102A (en) 2015-11-15 2015-11-15 Recording editing method and recording device
PCT/CN2016/089020 WO2017080235A1 (en) 2015-11-15 2016-07-07 Audio recording editing method and recording device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510786352.9A CN105895102A (en) 2015-11-15 2015-11-15 Recording editing method and recording device

Publications (1)

Publication Number Publication Date
CN105895102A true CN105895102A (en) 2016-08-24

Family

ID=57001979

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510786352.9A Pending CN105895102A (en) 2015-11-15 2015-11-15 Recording editing method and recording device

Country Status (2)

Country Link
CN (1) CN105895102A (en)
WO (1) WO2017080235A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116132234B (en) * 2023-01-09 2024-03-12 天津大学 Underwater hidden communication method and device using whale animal whistle phase code

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011160390A (en) * 2010-01-28 2011-08-18 Akitoshi Noda System utilizing sound recording function of mobile phone for screen display, emergency communication access function or the like
CN102985965A (en) * 2010-05-24 2013-03-20 微软公司 Voice print identification
CN103530432A (en) * 2013-09-24 2014-01-22 华南理工大学 Conference recorder with speech extracting function and speech extracting method
CN103700370A (en) * 2013-12-04 2014-04-02 北京中科模识科技有限公司 Broadcast television voice recognition method and system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106356067A (en) * 2016-08-25 2017-01-25 乐视控股(北京)有限公司 Recording method, device and terminal
CN107403623A (en) * 2017-07-31 2017-11-28 努比亚技术有限公司 Store method, terminal, Cloud Server and the readable storage medium storing program for executing of recording substance
CN107481743A (en) * 2017-08-07 2017-12-15 捷开通讯(深圳)有限公司 The edit methods of mobile terminal, memory and recording file
WO2019029494A1 (en) * 2017-08-07 2019-02-14 捷开通讯(深圳)有限公司 Mobile terminal, memory and record file editing method
CN107564531A (en) * 2017-08-25 2018-01-09 百度在线网络技术(北京)有限公司 Minutes method, apparatus and computer equipment based on vocal print feature
CN109545200A (en) * 2018-10-31 2019-03-29 深圳大普微电子科技有限公司 Edit the method and storage device of voice content
CN110753263A (en) * 2019-10-29 2020-02-04 腾讯科技(深圳)有限公司 Video dubbing method, device, terminal and storage medium
CN114242120A (en) * 2021-11-25 2022-03-25 广东电力信息科技有限公司 Audio editing method and audio marking method based on DTMF technology
CN114242120B (en) * 2021-11-25 2023-11-10 广东电力信息科技有限公司 Audio editing method and audio marking method based on DTMF technology

Also Published As

Publication number Publication date
WO2017080235A1 (en) 2017-05-18

Similar Documents

Publication Publication Date Title
CN105895102A (en) Recording editing method and recording device
US10977299B2 (en) Systems and methods for consolidating recorded content
CN107274916B (en) Method and device for operating audio/video file based on voiceprint information
CN105895077A (en) Recording editing method and recording device
CN1333363C (en) Audio signal processing apparatus and audio signal processing method
CN108305632A (en) A kind of the voice abstract forming method and system of meeting
CN102568478B (en) Video play control method and system based on voice recognition
US20210366488A1 (en) Speaker Identification Method and Apparatus in Multi-person Speech
CN105206258A (en) Generation method and device of acoustic model as well as voice synthetic method and device
CN102436812A (en) Conference recording device and conference recording method using same
CN108257592A (en) A kind of voice dividing method and system based on shot and long term memory models
RU2013140574A (en) SEMANTIC SOUND TRACK MIXER
CN108009303A (en) Searching method, device, electronic equipment and storage medium based on speech recognition
CN109448460A (en) One kind reciting detection method and user equipment
CN101185115A (en) Voice edition device, voice edition method, and voice edition program
CN108010512A (en) The acquisition methods and recording terminal of a kind of audio
Stoeger et al. Age-group estimation in free-ranging African elephants based on acoustic cues of low-frequency rumbles
CN107679196A (en) A kind of multimedia recognition methods, electronic equipment and storage medium
Roy et al. Fast transcription of unstructured audio recordings
CN108364655A (en) Method of speech processing, medium, device and computing device
CN109817223A (en) Phoneme notation method and device based on audio-frequency fingerprint
CN105895079A (en) Voice data processing method and device
CN104240697A (en) Audio data feature extraction method and device
CN108235137B (en) Method and device for judging channel switching action through sound waveform and television
CN114242120B (en) Audio editing method and audio marking method based on DTMF technology

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20160824)