WO2017080235A1 - Audio recording editing method and recording device - Google Patents


Info

Publication number
WO2017080235A1
Authority
WO
WIPO (PCT)
Prior art keywords
voiceprint
recording
editing
target
sample
Prior art date
Application number
PCT/CN2016/089020
Other languages
French (fr)
Chinese (zh)
Inventor
蔡竹沁
齐峰岩
牛磊
关彬
Original Assignee
乐视控股(北京)有限公司
乐视移动智能信息技术(北京)有限公司
Priority date
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司 and 乐视移动智能信息技术(北京)有限公司
Publication of WO2017080235A1 publication Critical patent/WO2017080235A1/en

Classifications

    • G: PHYSICS
        • G10: MUSICAL INSTRUMENTS; ACOUSTICS
            • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
                • G10L17/00: Speaker identification or verification techniques
                • G10L15/00: Speech recognition
                    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
                    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
                        • G10L15/063: Training
                            • G10L2015/0631: Creating reference templates; Clustering
                • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
                    • G10L25/48: Speech or voice analysis techniques specially adapted for particular use
        • G11: INFORMATION STORAGE
            • G11B: INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
                • G11B27/00: Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
                    • G11B27/10: Indexing; Addressing; Timing or synchronising; Measuring tape travel

Definitions

  • the embodiments of the present application relate to the field of electronic technologies, and in particular, to a recording editing method and a recording device.
  • Smartphones have gradually become part of people's daily lives, serving not only as everyday communication devices but also as easily carried recording devices.
  • The user can record and save voice information through the recording application (APP) of the smartphone, allowing the user to quickly save a piece of voice information that is difficult to memorize directly and to reuse the recording multiple times.
  • The embodiments of the present application provide a recording editing method and a recording device, which are used to solve the problem that editing a recording wastes the user's time and degrades the user experience.
  • an embodiment of the present application provides a recording editing method, including:
  • performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result; acquiring an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; selecting the segment to be edited from the marked current recording according to the marking information; and editing the segment to be edited according to the editing mode.
  • an embodiment of the present application provides a recording apparatus, including:
  • a marking module configured to perform sound wave analysis on the current recording and mark the current recording according to the sound wave analysis result;
  • An obtaining module configured to acquire an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode;
  • a selection module configured to select the to-be-edited segment from the marked current recording according to the marking information
  • an editing module configured to edit the to-be-edited segment according to the editing manner.
  • Embodiments of the present application provide a recording apparatus including a memory, one or more processors, and one or more programs, wherein the one or more programs, when executed by the one or more processors, perform the following operations: performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result; acquiring an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; selecting the segment to be edited from the marked current recording according to the marking information; and editing the segment to be edited according to the editing mode.
  • Embodiments of the present application provide a computer readable storage medium having stored thereon computer executable instructions that, in response to execution, cause a recording device to perform operations, the operations including: performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result; acquiring an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; selecting the segment to be edited from the marked current recording according to the marking information; and editing the segment to be edited according to the editing mode.
  • The recording editing method and the recording device of the embodiments of the present application perform sound wave analysis on the current recording, mark the current recording according to the sound wave analysis result, and receive an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; the segment to be edited is obtained from the marked current recording according to the marking information and edited according to the editing mode.
  • The current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, saving editing time and improving the user experience.
  • FIG. 1 is a schematic flowchart of a recording editing method according to Embodiment 1 of the present application;
  • FIG. 2 is a first schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 3 is a second schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 4 is a third schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 5 is a fourth schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 6 is a schematic flowchart of a recording marking method in Embodiment 1 of the present application;
  • FIG. 7 is a first schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
  • FIG. 8 is a second schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
  • FIG. 9 is a third schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
  • FIG. 10 is a schematic flowchart of a method for establishing a voiceprint database according to Embodiment 1 of the present application;
  • FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application;
  • FIG. 12 is a schematic structural diagram of a marking module according to Embodiment 2 of the present application;
  • FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application;
  • FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application.
  • FIG. 1 is a schematic flowchart of a recording editing method according to Embodiment 1 of the present application; the recording editing method includes:
  • Step 101 Perform sound wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
  • The user can open the recording function of the recording APP downloaded on the smartphone through the user interface, and the recording APP starts collecting the current recording; the recording APP can preprocess the sound during the collection process.
  • Sound wave analysis is performed on the currently collected recording to obtain the sound wave analysis result, which includes the acoustic feature parameters.
  • Since the voiceprint of a speaker is unique, the voiceprint can be used as a unique feature for distinguishing speakers, and the current recording can be marked according to the acoustic feature parameters.
  • The acoustic feature parameters include: sound energy, formants, Mel-frequency cepstral coefficients (MFCC), and linear prediction coefficients (LPC).
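As a hedged illustration of the frame-level features listed above, the following Python sketch computes one of them, short-time energy, over fixed-length frames. The frame and hop sizes are example values, not taken from the application.

```python
# Illustrative sketch (not the application's implementation): short-time
# energy of a mono signal, computed over fixed-length frames.
# frame_len=400 and hop=160 correspond to 25 ms / 10 ms at 16 kHz.

def short_time_energy(samples, frame_len=400, hop=160):
    """Return the energy (sum of squares) of each analysis frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame))
    return energies

signal = [0.0] * 800
print(len(short_time_energy(signal)))  # 3 frames for 800 samples
```

Formants, MFCC, and LPC would be computed from the same framed signal; they are omitted here because their extraction pipelines are considerably longer.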
  • FIG. 2 is a schematic diagram of an application example of this embodiment.
  • For example, a recording has 5 speakers, and left oblique lines, right oblique lines, horizontal lines, vertical lines, and grids are used to mark speakers A, B, C, D, and E respectively.
  • If speaker A has two speeches separated by other speakers, both speeches are marked with the left oblique line to indicate that they are recording segments of the same speaker.
  • Alternatively, the speakers can be marked with different colors; for example, speakers A, B, C, D, and E are marked with red, yellow, blue, green, and purple, respectively.
  • Speaker A has two speeches separated by other speakers in this recording; both speeches are marked in red to indicate that they are recorded passages of the same speaker.
  • Step 102 Acquire an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode.
  • In this embodiment, the editing instruction carries the marking information of the segment to be edited and the editing mode for that segment. Editing modes can include cutting selected segments, merging selected segments, or deleting selected segments.
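The three editing modes named above can be sketched as simple list operations. The function names and the (start, end) segment representation are illustrative assumptions, not the application's implementation.

```python
# Hedged sketch of the three editing modes: cutting (extracting),
# deleting, and merging marked segments. A recording is modelled as a
# list of samples; a segment is a (start, end) index pair.

def cut_segment(recording, seg):
    """Extract the selected segment from the recording."""
    start, end = seg
    return recording[start:end]

def delete_segment(recording, seg):
    """Remove the selected segment, keeping everything else."""
    start, end = seg
    return recording[:start] + recording[end:]

def merge_segments(recording, segs):
    """Concatenate several selected segments (in time order) into one."""
    merged = []
    for start, end in sorted(segs):
        merged.extend(recording[start:end])
    return merged

rec = list(range(10))
assert cut_segment(rec, (2, 5)) == [2, 3, 4]
assert delete_segment(rec, (2, 5)) == [0, 1, 5, 6, 7, 8, 9]
assert merge_segments(rec, [(7, 9), (0, 2)]) == [0, 1, 7, 8]
```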
  • the acquiring an edit instruction for editing the current recording includes:
  • the user can select the corresponding segment to be edited by clicking at least one tag included in the currently recorded waveform graphic.
  • the recording APP can detect a first click operation on the mark corresponding to the at least one piece to be edited included in the currently recorded waveform pattern.
  • The user can select a target editing mode for the segment to be edited from the editing modes displayed on the terminal display interface.
  • the recording APP can detect the second click operation performed by the target editing mode used to edit the segment. After detecting the first click operation and the second click operation, an edit instruction is generated according to the first click operation and the second click operation.
  • Optionally, the obtaining an editing instruction for editing the current recording includes:
  • The user can select the segments to be edited by selecting at least one tag from the tag list included in the current recording; the selected tag indicates the segments to be edited.
  • The recording APP can detect the first click operation performed on the tag list and the second click operation performed on the target editing mode used to edit the segments. After detecting the first click operation and the second click operation, an editing instruction is generated according to the two operations.
  • Step 103 Acquire the to-be-edited segment from the marked current recording according to the marking information.
  • Step 104 Edit the to-be-edited segment according to the editing mode.
  • the recording APP can obtain the marking information of the segment to be edited from the editing instruction, and then select the segment to be edited from the current recording according to the marking information.
  • The recording APP can obtain the editing mode of the segment to be edited from the editing instruction, for example, cutting, merging, or deleting the segment. After the segment to be edited is acquired, the recording APP edits it according to the indicated editing mode.
  • FIG. 3 is a schematic diagram of an application example of this embodiment.
  • As shown, the user can clearly see that the waveform pattern of the recording carries different mark distinctions.
  • The user can select a segment as the segment to be edited by clicking on a mark on the waveform pattern.
  • For example, the user selects a segment marked with a horizontal line as the segment to be edited by clicking on it.
  • Then the user can click the target editing mode for the segment to be edited in the editing menu; for example, the user can click "Cut selected segment" as the target editing mode. The above two click operations generate an editing instruction for editing the segment to be edited, and the segment can be cut according to the editing instruction.
  • FIG. 4 is a schematic diagram of another application example of this embodiment.
  • In this embodiment, a tag list of the recording is provided to the user below the recording waveform, and the user can directly select a tag from the tag list to select all the segments of the corresponding speaker at once.
  • a recording has 3 speakers, using the left slash, the right slash, and the horizontal line to mark the speakers A, B, and C.
  • Speaker A has two speeches separated by other speakers in this recording, and both speeches are marked with the left oblique line. When the user clicks the left-oblique-line option in the tag list, both segments are selected at the same time; the user can click a segment to deselect it or keep it selected.
  • When multiple segments are selected and the user wants to merge them, the user can click "Merge selected segments" in the editing mode list as the target editing mode. After the click operation is completed, the recording APP obtains an editing instruction, and the selected segments are merged into a new segment.
  • the user may also select multiple marking options from the tag list.
  • the user selects the two mark options of the left slash and the right slash, so that all the segments of the speaker A and the speaker B can be selected.
  • click on "Merge selected clips” and the selected clips will be merged into a new clip. Further, the user can select a part of the conversation content from all the selected segments for merging.
  • The recording editing method provided in this embodiment performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; the segment to be edited is obtained from the marked current recording according to the marking information and edited according to the editing mode.
  • The current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, saving editing time and improving the user experience.
  • FIG. 6 is a schematic flow chart of a method for recording marks in the first embodiment of the present application.
  • the recording marking method includes the following steps:
  • Step 201 Acquire a current recording and extract a voiceprint feature parameter from the current recording.
  • the user can open the recording function of the recording APP downloaded in the smart phone through the user interface of the smart phone, and the recording APP starts to collect the current recording.
  • During collection, the recording APP can preprocess the sound; for example, framing, windowing, and filtering are performed on the collected data.
  • Then the voiceprint feature parameters of the current recording are obtained, including sound energy, formants, MFCC, and LPC.
  • Step 202 Perform voiceprint clustering training on the voiceprint feature parameters to obtain the target voiceprint template of the voiceprint feature parameters.
  • In the recording APP, a voiceprint clustering trainer is provided; after the voiceprint feature parameters are obtained, the trainer performs voiceprint clustering training on them to obtain the target voiceprint template corresponding to the current recording.
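The voiceprint clustering trainer is not specified in detail in the text; as an assumed stand-in, the following sketch uses greedy online clustering of feature vectors, with each cluster centroid playing the role of a voiceprint template. The distance threshold is an arbitrary illustrative value.

```python
import math

def euclid(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_frames(frames, threshold=1.0):
    """Greedy online clustering: each frame joins the nearest centroid
    within `threshold`, otherwise it starts a new cluster. The returned
    centroids stand in for the voiceprint templates."""
    centroids, counts = [], []
    for f in frames:
        best, best_d = None, None
        for i, c in enumerate(centroids):
            d = euclid(f, c)
            if best_d is None or d < best_d:
                best, best_d = i, d
        if best is not None and best_d <= threshold:
            # Update the matched centroid as a running mean.
            n = counts[best]
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], f)]
            counts[best] = n + 1
        else:
            centroids.append(list(f))
            counts.append(1)
    return centroids

frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
print(len(cluster_frames(frames)))  # 2 clusters -> 2 templates
```

A production system would use speaker-diarization techniques rather than this toy distance rule, but the overall shape, frames in, one template per detected speaker out, matches the step described above.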
  • Step 203 Determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
  • Before the current recording, voiceprint clustering training is performed on sample sounds by the trainer to obtain the sample voiceprint templates corresponding to the sample sounds, and a voiceprint database is preset in the recording APP using these sample voiceprint templates.
  • Generally, a plurality of sample voiceprint templates are stored in the voiceprint database, so that recordings can be marked for the user during the recording process.
  • the recording APP can search in the voiceprint database to determine whether the target voiceprint template exists in the voiceprint database.
  • If the result of the determination is yes, go to step 204; otherwise, go to step 205.
  • Step 204 Obtain target tag information corresponding to the target voiceprint template from the voiceprint database.
  • In the voiceprint database, not only the sample voiceprint templates are stored but also the marking information corresponding to each sample voiceprint template; generally, each sample voiceprint template has its own corresponding marking information.
  • the target marker information corresponding to the target voiceprint template may be acquired.
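Matching a target voiceprint template against the database (steps 203 and 204) might be sketched as a nearest-template lookup. The cosine-similarity measure and the 0.95 threshold are assumptions for illustration, not taken from the application.

```python
import math

def cosine(a, b):
    """Cosine similarity between two template vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def lookup_tag(target, database, threshold=0.95):
    """Return the tag of the stored template most similar to `target`,
    or None if no template clears the threshold (in which case new
    tag information must be generated, as in step 205)."""
    best_tag, best_sim = None, threshold
    for template, tag in database:
        sim = cosine(target, template)
        if sim >= best_sim:
            best_tag, best_sim = tag, sim
    return best_tag

db = [([1.0, 0.0], "speaker A"), ([0.0, 1.0], "speaker B")]
assert lookup_tag([0.99, 0.05], db) == "speaker A"
assert lookup_tag([1.0, 1.0], db) is None  # no close match -> step 205
```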
  • Step 205 Generate the target tag information corresponding to the target voiceprint template.
  • The recording APP may set target marking information for the target voiceprint template, so as to mark the target voiceprint template with that information.
  • Step 206 Mark the current recording with the target tag information.
  • the recording APP automatically uses the target tag information to mark the current recording.
  • In this embodiment, the voiceprint template corresponding to the current recording is identified through voiceprint recognition, and the marking information corresponding to the current recording is obtained using the established voiceprint database, thereby marking the current recording and realizing automatic marking of recordings.
  • Step 207 Establish a mapping relationship between the target voiceprint template and the target tag information, and store the data in the voiceprint database.
  • Step 208 Receive remark information sent by the user through the terminal.
  • Step 209 Remark the current recording by using the remark information.
  • Step 210 Update the remark information into the target marking information in the voiceprint database.
  • the remark information may be the source name of the current recording.
  • the recording APP is instructed to use the remark information to remark the current recording.
  • When remarking, the recording APP can add a label at the position corresponding to the current recording.
  • The recording APP can also update the obtained remark information into the target marking information corresponding to the target voiceprint template of the current recording in the voiceprint database, so that the remark can be invoked again when the sound source corresponding to the current recording appears in a later recording.
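Steps 207 through 210 amount to storing a template-to-tag mapping and then overwriting the auto-generated tag with a user remark. A minimal sketch, with a plain dict standing in for the voiceprint database and all names hypothetical:

```python
# Hedged sketch of steps 207-210: persist the template/tag mapping,
# then replace the auto-generated tag with the user's remark. A dict
# keyed by the (hashable) template stands in for the database.

voiceprint_db = {}

def store_mapping(template, tag):
    """Step 207: save the template -> tag mapping."""
    voiceprint_db[tuple(template)] = tag

def update_remark(template, remark):
    """Steps 208-210: overwrite the stored tag with the remark."""
    key = tuple(template)
    if key in voiceprint_db:
        voiceprint_db[key] = remark

tmpl = [0.12, 0.80, 0.33]
store_mapping(tmpl, "speaker A")
update_remark(tmpl, "Teacher Zhang")
assert voiceprint_db[tuple(tmpl)] == "Teacher Zhang"
```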
  • FIG. 7 is a schematic diagram of an application example of this embodiment.
  • After the recording APP automatically marks the current recording, the user can send remark information to the recording APP through the terminal to add a remark for each speaker in the recording.
  • For example, the user can remark speaker A, marked with a left oblique line, as "Teacher Zhang" through the recording APP.
  • FIG. 8 is a schematic diagram of another application example of this embodiment.
  • When a speaker whose name has been saved appears in a new recording, the speaker's voice will be directly marked with the saved marking information after voiceprint analysis.
  • For example, speaker A in the previous recording has been saved as "Teacher Zhang".
  • A new recording containing that speaker will then display not speaker A's original mark but "Teacher Zhang".
  • Since the recording includes the marking information of speakers saved by the user, a saved recording can be quickly located according to the marked speaker. For example, if the user wants to find a recording of Teacher Zhang's lecture, the user only needs to look for the "Teacher Zhang" tag.
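Locating saved recordings by speaker tag, as in the example above, could look like the following sketch; the library structure, file names, and (tag, start, end) mark representation are all hypothetical.

```python
# Illustrative sketch of locating recordings by a saved speaker tag.
# Each recording maps to a list of (tag, start, end) speaker marks.

def find_recordings(recordings, tag):
    """Return the names of recordings containing the given tag."""
    return [name for name, marks in recordings.items()
            if any(t == tag for t, _, _ in marks)]

library = {
    "lecture1.wav": [("Teacher Zhang", 0, 120), ("speaker B", 120, 300)],
    "meeting.wav": [("speaker C", 0, 60)],
}
assert find_recordings(library, "Teacher Zhang") == ["lecture1.wav"]
```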
  • Before recordings can be marked in this way, a voiceprint database needs to be created from sample sounds.
  • FIG. 10 is a schematic flowchart of a method for establishing a voiceprint database in Embodiment 1 of the present application; the method includes:
  • Step 301 Analyze the sample sound, and extract the voiceprint feature parameter of the sample sound.
  • each recorded sound of the recording APP before the current recording is taken as the sample sound.
  • The recording APP analyzes the sample sound and extracts the voiceprint feature parameters of the sample sound, including sound energy, formants, MFCC, and LPC.
  • Step 302 Perform voiceprint clustering training according to the voiceprint feature parameters of the sample sound to generate a sample voiceprint template.
  • When the voiceprint feature parameters of the sample sound show similarity within a preset time period, voiceprint clustering training is performed on them to generate a sample voiceprint template.
  • If it is determined that the voiceprint feature parameters show no similarity, they are buffered first; once the buffered parameters are judged to have similarity, voiceprint clustering training is performed on them to generate a sample voiceprint template.
  • For example, if a sample sound contains 5 speakers, then after voiceprint clustering training the 5 speakers can be identified as speakers A, B, C, D, and E, and a corresponding sample voiceprint template is generated for each of the 5 speakers.
  • Step 303 Generate corresponding sample tag information for the sample voiceprint template.
  • the speakers A, B, C, D, and E can be marked using a left oblique line, a right oblique line, a horizontal line, a vertical line, and a grid.
  • Step 304 Generate the voiceprint database by using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
  • In this embodiment, the voiceprint template generated after each voiceprint clustering training of a recording is saved in the voiceprint database as a sample voiceprint template, and its marking information and the mapping relationship between the two are also saved, updating the voiceprint database. In this way, when a recording of the same speaker is encountered again, the recording APP can quickly mark that speaker's recording through voiceprint analysis, improving the convenience of recording marking.
  • FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application.
  • the device comprises: a marking module 11, an obtaining module 12, a selecting module 13 and an editing module 14.
  • the marking module 11 is configured to perform acoustic wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
  • the obtaining module 12 is configured to obtain an editing instruction for editing the current recording, and the editing instruction carries the marking information of the to-be-edited segment and the editing mode.
  • the selecting module 13 is configured to select a segment to be edited from the current recording after the marking according to the marking information.
  • The editing module 14 is configured to edit the segment to be edited according to the editing mode.
  • An optional configuration of the marking module 11 in the second embodiment includes: an extracting unit 111, a training unit 112, a determining unit 113, an obtaining unit 114, a marking unit 115, a generating unit 116, an establishing unit 117, and a receiving unit 118.
  • the extracting unit 111 is configured to collect the current recording and extract the voiceprint feature parameters from the current recording.
  • the training unit 112 is configured to perform voiceprint clustering training on the voiceprint parameters to obtain a target voiceprint template of the voiceprint parameters.
  • the determining unit 113 is configured to determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
  • the obtaining unit 114 is configured to acquire target tag information corresponding to the target voiceprint template from the voiceprint database when the determination result of the determining unit is YES.
  • the marking unit 115 is configured to mark the current recording with the target marking information.
  • the generating unit 116 is configured to generate target tag information corresponding to the target voiceprint template when the result of the determining unit 113 is NO.
  • The establishing unit 117 is configured to establish a mapping relationship between the target voiceprint template and the target marking information after the marking unit 115 marks the current recording with the target marking information, and to store the mapping relationship in the voiceprint database.
  • the extracting unit 111 is further configured to analyze the sample sound to extract the voiceprint feature parameter of the sample sound before acquiring the current recording and extracting the voiceprint feature parameter from the current recording.
  • the training unit 112 is further configured to generate a sample voiceprint template by performing voiceprint clustering training according to the voiceprint feature parameter of the sample sound.
  • the generating unit 116 is further configured to generate corresponding sample tag information for the sample voiceprint template.
  • the establishing unit 117 is further configured to generate a voiceprint database by using a sample voiceprint template, sample mark information, and a mapping relationship between the sample voiceprint template and the sample mark information.
  • Optionally, the training unit 112 is specifically configured to acquire the voiceprint feature parameters of the sample sound within a preset time period and, when the voiceprint feature parameters of the sample sound are similar within the preset time period, perform voiceprint clustering training on them to generate a sample voiceprint template.
  • the receiving unit 118 is configured to receive the remark information sent by the user through the terminal after the marking unit 115 marks the current recording by using the target tag information.
  • The marking unit 115 is further configured to use the remark information to remark the current recording.
  • The establishing unit 117 is further configured to update the remark information into the target marking information in the voiceprint database.
  • Optionally, the obtaining module 12 is specifically configured to detect a first click operation performed on the mark corresponding to at least one segment to be edited included in the currently recorded waveform pattern, detect a second click operation performed on the target editing mode used to edit the segment, and generate an editing instruction according to the detected first click operation and second click operation.
  • Optionally, the obtaining module 12 is specifically configured to detect a first click operation performed by selecting at least one tag from the tag list included in the current recording (the selected tag indicating the segment to be edited), detect a second click operation performed on the target editing mode for the segment to be edited, and generate an editing instruction according to the detected first click operation and second click operation.
  • the function modules of the recording apparatus provided in this embodiment can be used to execute the flow of the recording editing method shown in the above embodiments.
  • the specific working principle is not described here. For details, refer to the description of the method embodiments.
  • In summary, the recording device performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; the segment to be edited is obtained from the marked current recording according to the marking information and edited according to the editing mode.
  • The current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, saving editing time and improving the user experience.
  • FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application.
  • the recording apparatus of the embodiment of the present application includes a memory 61, one or more processors 62, and one or more programs 63.
  • the one or more programs 63, when executed by the one or more processors 62, perform the recording editing method of any of the above-described embodiments.
  • the recording apparatus of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, where the editing instruction carries the marking information of the to-be-edited segment and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marking information and is edited according to the editing mode.
  • the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the clip to be edited can be quickly located, which saves editing time and improves user experience.
  • FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application. As shown in Figure 14,
  • the computer program product 71 for recording editing of the embodiment of the present application may include a signal bearing medium 72.
  • Signal bearing medium 72 may include one or more instructions 73 that, when executed by, for example, a processor, may provide the functionality described above with respect to Figures 1-12.
  • the instructions 73 can include: one or more instructions for performing sound wave analysis on the current recording and marking the current recording based on the result of the sound wave analysis; one or more instructions for obtaining an editing instruction to edit the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode; one or more instructions for selecting the to-be-edited segment from the marked current recording according to the marking information; and one or more instructions for editing the to-be-edited segment according to the editing mode.
  • the recording device can perform one or more of the steps shown in FIG. 1 in response to the instructions 73.
  • signal bearing medium 72 can include computer readable media 74 such as, but not limited to, a hard disk drive, a compact disk (CD), a digital versatile disk (DVD), a digital tape, a memory, and the like.
  • the signal bearing medium 72 can include a recordable medium 75 such as, but not limited to, a memory, a read/write (R/W) CD, an R/W DVD, and the like.
  • the signal bearing medium 72 can include a communication medium 76 such as, but not limited to, a digital and/or analog communication medium (eg, fiber optic cable, waveguide, wired communication link, wireless communication link, etc.).
  • the computer program product 71 can be conveyed by an RF signal bearing medium 72 to one or more modules of the recording apparatus, where the signal bearing medium 72 is transmitted by a wireless communication medium (e.g., a wireless communication medium compliant with the IEEE 802.11 standard).
  • the computer program product of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, where the editing instruction carries the marking information of the clip to be edited and the editing mode; the clip to be edited is obtained from the marked current recording according to the marking information and is edited according to the editing mode.
  • the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the clip to be edited can be quickly located, which saves editing time and improves user experience.

Abstract

An audio recording editing method and recording device: performing sound wave analysis on a current audio recording and tagging it on the basis of the results of said sound wave analysis (101); receiving an editing instruction for editing the current audio recording, said editing instruction carrying tag information for an audio clip to be edited and an editing mode (102); on the basis of said tag information, obtaining the audio clip to be edited from the tagged current audio recording (103); and editing said audio clip according to said editing mode (104). An embodiment of the present invention tags a current recording by means of voiceprint identification and, when tagging is completed, edits the current recording on the basis of the tags, making it possible to quickly locate an audio clip to be edited, saving editing time and improving user experience.

Description

Recording editing method and recording device
This application claims priority to Chinese Patent Application No. 2015107863529, filed on November 15, 2015, which is incorporated herein by reference in its entirety.
Technical field
The embodiments of the present application relate to the field of electronic technologies, and in particular to a recording editing method and a recording device.
Background
At present, smart phones have gradually become part of people's daily lives, serving not only as everyday communication devices but also as easily portable recording devices. Through a recording application (APP) on a smart phone, a user can record and save voice information, quickly preserving speech that would be hard to memorize directly, and can replay the recording many times.
In general, recording files made by users often contain unwanted segments, which both take up space and prevent the user from finding the information actually needed. Existing recording APPs let the user edit a recording file according to its actual content, but this requires the user to replay the file repeatedly to determine the content to be edited. Such an editing approach obviously takes up much of the user's time, resulting in a poor user experience.
Summary of the invention
The embodiments of the present application provide a recording editing method and a recording device, so as to solve the problem that existing recording editing wastes the user's time and degrades the user experience.
To achieve the above object, an embodiment of the present application provides a recording editing method, including:
performing sound wave analysis on a current recording and marking the current recording according to the sound wave analysis result;
receiving an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode;
selecting the to-be-edited segment from the marked current recording according to the marking information; and
editing the to-be-edited segment according to the editing mode.
To achieve the above object, an embodiment of the present application provides a recording apparatus, including:
a marking module, configured to perform sound wave analysis on a current recording and mark the current recording according to the sound wave analysis result;
an obtaining module, configured to obtain an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode;
a selecting module, configured to select the to-be-edited segment from the marked current recording according to the marking information; and
an editing module, configured to edit the to-be-edited segment according to the editing mode.
In another aspect, an embodiment of the present application provides a recording apparatus including a memory, one or more processors, and one or more programs, where the one or more programs, when executed by the one or more processors, perform the following operations: performing sound wave analysis on a current recording and marking the current recording according to the sound wave analysis result; obtaining an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode; selecting the to-be-edited segment from the marked current recording according to the marking information; and editing the to-be-edited segment according to the editing mode.
In another aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions that, in response to execution, cause a recording device to perform operations including: performing sound wave analysis on a current recording and marking the current recording according to the sound wave analysis result; obtaining an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode; selecting the to-be-edited segment from the marked current recording according to the marking information; and editing the to-be-edited segment according to the editing mode.
In the recording editing method and recording device of the embodiments of the present application, sound wave analysis is performed on the current recording, the current recording is marked according to the sound wave analysis result, and an editing instruction for editing the current recording is received, where the editing instruction carries the marking information of the segment to be edited and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marking information, and is edited according to the editing mode. In the embodiments of the present application, the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, which saves editing time and improves user experience.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a recording editing method according to Embodiment 1 of the present application;
FIG. 2 is a first schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 3 is a second schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 4 is a third schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 5 is a fourth schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 6 is a schematic flowchart of a recording marking method in Embodiment 1 of the present application;
FIG. 7 is a first schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
FIG. 8 is a second schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
FIG. 9 is a third schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
FIG. 10 is a schematic flowchart of a method for establishing a voiceprint database in Embodiment 1 of the present application;
FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application;
FIG. 12 is a schematic structural diagram of a marking module in Embodiment 2 of the present application;
FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application;
FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application.
Detailed description
The recording editing method and the recording apparatus provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Embodiment 1
As shown in FIG. 1, which is a schematic flowchart of the recording editing method according to Embodiment 1 of the present application, the recording editing method includes:
Step 101: Perform sound wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
The user can open the recording function of a recording APP downloaded on a smart phone through the phone's user interface; the recording APP then starts capturing the current recording, and may preprocess the sound during capture. Sound wave analysis is performed on the captured recording to obtain a sound wave analysis result that includes sound wave feature parameters. Since a speaker's voiceprint is unique, the voiceprint can be used as the distinguishing feature of a speaker, and the current recording can therefore be marked according to the sound wave feature parameters. The sound wave feature parameters include: sound energy, formants, Mel-frequency cepstrum coefficients (MFCC), and Linear Prediction Coefficients (LPC).
As shown in FIG. 2, which is a schematic diagram of an application example of this embodiment, suppose a recording contains five speakers; speakers A, B, C, D, and E are marked with a left slash, right slash, horizontal line, vertical line, and grid pattern respectively. When speaker A gives two statements separated by other speakers in the recording, both segments are marked with the left slash to show that they are recording passages of the same speaker. To let the user see the speaker distinctions more intuitively, different colors can be used instead, for example red, yellow, blue, green, and purple for speakers A, B, C, D, and E; then speaker A's two separated statements would both be marked red to show that they are recording passages of the same speaker.
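The per-speaker marking described above can be sketched as a mapping from speaker-identified segments to display marks, so that a speaker whose statements are separated by other speakers still receives one consistent mark. This is an illustrative sketch only, not the patented implementation; the segment tuple layout, the `MARKS` palette, and the function name are assumptions for illustration:

```python
# Hypothetical sketch: one display mark per distinct speaker.
MARKS = ["left-slash", "right-slash", "horizontal", "vertical", "grid"]

def mark_segments(segments):
    """segments: list of (start_sec, end_sec, speaker_id) tuples.
    Returns a list of (start_sec, end_sec, speaker_id, mark) tuples,
    reusing the same mark whenever the same speaker reappears."""
    mark_of = {}
    marked = []
    for start, end, speaker in segments:
        if speaker not in mark_of:
            mark_of[speaker] = MARKS[len(mark_of) % len(MARKS)]
        marked.append((start, end, speaker, mark_of[speaker]))
    return marked

# Speaker "A" speaks twice, separated by "B": both segments share one mark.
demo = mark_segments([(0, 5, "A"), (5, 9, "B"), (9, 12, "A")])
```

A display layer would then render each segment of the waveform graphic with its assigned mark (or color).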
Step 102: Obtain an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode.
Further, after the current recording has been marked, the user can see the marked recording on the terminal's display interface, and can then send an editing instruction to the recording APP through the terminal based on the marks. The editing instruction carries the marking information of the segment to be edited and the editing mode for that segment. The editing mode can include cutting the selected segment, merging multiple selected segments, or deleting the selected segment.
In this embodiment, obtaining the editing instruction for editing the current recording includes:
First, the user can select the corresponding segment to be edited by clicking, through the terminal, at least one mark included in the waveform graphic of the current recording. Specifically, after the user clicks on a mark, the recording APP detects a first click operation on the mark corresponding to at least one to-be-edited segment included in the waveform graphic. Further, after the to-be-edited segment is selected, the user can choose, from the editing modes displayed on the terminal's interface, a target editing mode for editing the segment. Specifically, after the user clicks on the target editing mode, the recording APP detects a second click operation on the target editing mode for the to-be-edited segment. When the first click operation and the second click operation have been detected, an editing instruction is generated according to them.
Optionally, obtaining the editing instruction for editing the current recording includes:
First, the user can select at least one tag from the tag list of the current recording displayed on the terminal, where the selected tag indicates the segment to be edited. Specifically, after the user clicks on a tag in the list, the recording APP detects a first click operation performed by selecting the tag. Further, after the to-be-edited segment is selected, the user can choose, from the editing modes displayed on the terminal's interface, a target editing mode for editing the segment; after the user clicks on it, the recording APP detects a second click operation on the target editing mode. When the first click operation and the second click operation have been detected, an editing instruction is generated according to them.
Step 103: Obtain the to-be-edited segment from the marked current recording according to the marking information.
Step 104: Edit the to-be-edited segment according to the editing mode.
After receiving the editing instruction, the recording APP obtains the marking information of the to-be-edited segment from the instruction, and then selects the to-be-edited segment from the current recording according to that information. The recording APP also obtains from the instruction the editing mode for the segment, for example cutting, merging, or deleting it. Once the to-be-edited segment has been obtained, the recording APP edits it according to the indicated editing mode.
As shown in FIG. 3, which is a schematic diagram of an application example of this embodiment, when the user edits a recording file that has been marked through voiceprint analysis, the user can clearly see that the waveform graphic of the recording is distinguished by different marks. By clicking a mark on the waveform graphic, the user selects the corresponding segment as the segment to be edited. As shown in FIG. 3, the user has clicked to select the segment marked with a horizontal line as the segment to be edited. After selecting it, the user can click a target editing mode for that segment in the editing menu, for example "cut selected segment". These two click operations generate an editing instruction for editing the segment, and the segment can then be cut out according to that instruction.
As shown in FIG. 4, which is a schematic diagram of an application example of this embodiment, a tag list for the recording is provided to the user below the recording's waveform graphic, and the user can select a tag directly from the list, thereby selecting all segments of the speaker that the tag represents. For example, suppose a recording has three speakers, with a left slash, right slash, and horizontal line marking speakers A, B, and C respectively. Speaker A gives two statements separated by other speakers, both marked with the left slash. When the user clicks the left-slash option in the tag list, both segments are selected at once; the user can click any segment to deselect it or keep it selected. After selecting multiple segments, if the user wants to merge them, the user can click "merge selected segments" in the editing-mode list as the target editing mode. Once the click operations are complete, the recording APP obtains the editing instruction and merges the selected segments into a new segment.
As shown in FIG. 5, which is a schematic diagram of an application example of this embodiment, the user can also select multiple tag options from the tag list. In FIG. 5 the user has selected both the left-slash and right-slash options, so all recording segments of speaker A and speaker B are selected. Finally, clicking "merge selected segments" merges the selected segments into a new segment. Further, the user can pick out part of the conversation from all the selected segments for merging.
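The cut, merge, and delete behaviors walked through in the examples above can be sketched as one dispatch over mark-selected segments. This is an illustrative sketch under assumptions: segments are modeled as (mark, audio_chunk) pairs and audio chunks as plain lists, which is not the application's actual audio representation:

```python
def apply_edit(segments, instruction_marks, mode):
    """segments: list of (mark, audio_chunk) pairs; a segment is selected
    when its mark is in instruction_marks. Supported modes, per the
    description above: "delete", "cut", "merge"."""
    selected = [s for s in segments if s[0] in instruction_marks]
    rest = [s for s in segments if s[0] not in instruction_marks]
    if mode == "delete":
        return rest, None
    if mode == "cut":                      # remove selected, return them as a clip
        return rest, [chunk for _, chunk in selected]
    if mode == "merge":                    # combine selected into one new clip
        merged = []
        for _, chunk in selected:
            merged.extend(chunk)
        return segments, [merged]          # original kept, new merged clip produced
    raise ValueError("unknown edit mode: %s" % mode)

segs = [("left", [1, 2]), ("right", [3]), ("left", [4])]
remaining, clip = apply_edit(segs, {"left"}, "merge")
remaining_after_delete, _ = apply_edit(segs, {"right"}, "delete")
```

Selecting the "left" tag picks up both of speaker A's separated segments at once, matching the FIG. 4 example.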
In the recording editing method provided by this embodiment, sound wave analysis is performed on the current recording, the current recording is marked according to the sound wave analysis result, and an editing instruction for editing the current recording is received, where the editing instruction carries the marking information of the segment to be edited and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marking information and is edited according to the editing mode. In this embodiment, the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, which saves editing time and improves user experience.
Before the current recording is edited in Embodiment 1, the current recording first needs to be marked; the specific process of step 101 in Embodiment 1 is shown in FIG. 6, which is a schematic flowchart of the recording marking method in Embodiment 1 of the present application. The recording marking method includes the following steps:
Step 201: Capture the current recording and extract voiceprint feature parameters from it.
The user can open the recording function of the recording APP downloaded on a smart phone through the phone's user interface; the recording APP then starts capturing the current recording, and may preprocess the sound during capture, for example by framing, windowing, and filtering the captured data.
Further, feature analysis is performed on the captured recording to obtain its sound wave feature parameters, which include: sound energy, formants, MFCC, and LPC.
Step 202: Perform voiceprint clustering training on the voiceprint feature parameters to obtain the target voiceprint template.
In this embodiment, a voiceprint clustering trainer is provided in order to identify the recording's template. After the voiceprint feature parameters are obtained, voiceprint clustering training is performed on them by the trainer to obtain the target voiceprint template corresponding to the current recording.
Step 203: Determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
In this embodiment, voiceprint clustering training is performed on sample sounds by the trainer to obtain sample voiceprint templates, and a voiceprint database built from these sample templates is preset and stored in the recording APP. The voiceprint database generally stores multiple sample voiceprint templates to support recording marking during the recording process. After obtaining the target voiceprint template, the recording APP can search the voiceprint database to determine whether the target voiceprint template exists in it.
If the result of the determination is yes, step 204 is performed; otherwise, step 205 is performed.
Step 204: Obtain the target marking information corresponding to the target voiceprint template from the voiceprint database.
The voiceprint database stores not only the sample voiceprint templates but also the marking information corresponding to them, generally one piece of marking information per sample template. When the sample voiceprint template matching the target voiceprint template is found in the database, the target marking information corresponding to the target voiceprint template can be obtained.
Step 205: Generate the target marking information corresponding to the target voiceprint template.
After recognizing that the target voiceprint template does not exist in the voiceprint database, the recording APP can set target marking information for the template, so as to mark the target voiceprint template with that information.
Step 206: Mark the current recording with the target marking information.
After the target marking information is obtained, the recording APP automatically marks the current recording with it.
The recording marking method in this embodiment identifies the voiceprint template corresponding to the current recording through voiceprint recognition, obtains the marking information corresponding to the current recording from the established voiceprint database, and then marks the current recording, achieving automatic recording marking and saving the user the time of adding marks manually.
For a schematic diagram of an application example of the recording marking method, see FIG. 2 in this embodiment; details are not repeated here.
步骤207、建立所述目标声纹模板与所述目标标记信息之间映射关系并存储在所述声纹数据库中。Step 207: Establish a mapping relationship between the target voiceprint template and the target tag information, and store the data in the voiceprint database.
步骤208、接收用户通过终端发送的备注信息。Step 208: Receive remark information sent by the user through the terminal.
步骤209、使用所述备注信息对所述当前录音进行备注。Step 209: Remark the current recording by using the remark information.
步骤210、将所述备注信息更新到所述声纹数据库中所述目标标记信息中。Step 210: Update the remark information into the target marker information in the voiceprint database.
接收用户通过终端发送的备注信息，备注信息可以为当前录音的来源名称，在终端获取到备注信息后，指示录音APP使用该备注信息对当前录音进行备注。例如，录音APP可以为当前录音对应的位置添加一个标签。进一步地，录音APP还可以将获取到的备注信息更新到声纹数据库中与当前录音对应的目标声纹模板对应的目标标记信息中，以便录制的声音为当前录音对应的音源时可以再次被调用。The remark information sent by the user through the terminal is received; the remark information may be the name of the source of the current recording. After the terminal obtains the remark information, it instructs the recording APP to annotate the current recording with the remark information. For example, the recording APP may add a label at the position corresponding to the current recording. Further, the recording APP may also update the obtained remark information into the target marker information corresponding to the target voiceprint template of the current recording in the voiceprint database, so that the remark can be invoked again when a later recorded sound comes from the same source as the current recording.
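Steps 208-210 above amount to writing the remark in two places: onto the recording itself and back into the stored marker information. A minimal sketch, with the field names (`"remark"`) and the tag-set representation assumed for illustration:

```python
# Hedged sketch of steps 208-210: the remark from the terminal annotates
# the current recording and is folded back into the marker info stored
# for that speaker's template, so it can be reused later.

def apply_remark(marker_info, recording_tags, remark):
    recording_tags.add(remark)      # step 209: remark the current recording
    marker_info["remark"] = remark  # step 210: update stored marker info
    return marker_info

tags = {"speaker A"}
info = apply_remark({"mark": "left-slash"}, tags, "张老师")
assert "张老师" in tags
assert info == {"mark": "left-slash", "remark": "张老师"}
```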
如图7所示，其为本实施例的应用示例示意图，当录音APP对当前录音进行自动标记后，用户可以通过终端向录音APP发送备注信息，用于给这段录音中每位说话人添加备注信息。比如，用户可以通过录音APP将用左斜线标记的说话人A备注为“张老师”。用户为新说话人添加的备注信息，直接与该说话人的声纹信息匹配，并作为这段录音的名称。As shown in FIG. 7, which is a schematic diagram of an application example of this embodiment, after the recording APP automatically marks the current recording, the user can send remark information to the recording APP through the terminal to add a remark for each speaker in the recording. For example, through the recording APP the user can annotate speaker A, marked with a left slash, as "张老师" (Teacher Zhang). The remark information the user adds for a new speaker is directly matched to that speaker's voiceprint information and serves as the name of this recording.
如图8所示，其为本实施例的应用示例示意图，当用户新建一段录音，如果其中包含已保存声音名称的说话人的录音，在声纹分析后，这位说话人的录音段落会直接标记为已保存的标记信息。比如已经保存了之前一段录音的说话人A为“张老师”，新建一段包含这个说话人的录音不会再显示说话人A的标记，而是显示“张老师”。As shown in FIG. 8, which is a schematic diagram of an application example of this embodiment, when the user creates a new recording, if it contains speech from a speaker whose voice name has already been saved, then after voiceprint analysis that speaker's segments are directly marked with the saved marker information. For example, if speaker A in a previous recording has been saved as "Teacher Zhang", a new recording containing this speaker no longer displays the mark for speaker A but instead displays "Teacher Zhang".
如图9所示，其为本实施例的应用示例示意图，录音中包含用户保存过的讲话人对应的标记信息，按照所标记的说话人，可以更快定位需要寻找的录音。比如用户想要寻找张老师的讲课录音，只要寻找“张老师”的标签即可。As shown in FIG. 9, which is a schematic diagram of an application example of this embodiment, the recording contains the marker information corresponding to speakers the user has saved, so the desired recording can be located more quickly by the marked speaker. For example, if the user wants to find the recording of Teacher Zhang's lecture, it suffices to look for the "Teacher Zhang" label.
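Once recordings carry speaker remarks, the lookup described above is a simple filter over stored tags. The following sketch assumes a plain dictionary of recording names to tag sets; the representation is illustrative, not part of the disclosure:

```python
# Illustrative sketch of locating a recording by a saved speaker tag,
# e.g. finding all recordings labelled "张老师".

def find_recordings(recordings, remark):
    """Return the names of recordings whose tag set contains `remark`."""
    return [name for name, tags in recordings.items() if remark in tags]

recordings = {
    "lecture1": {"张老师", "speaker B"},
    "meeting":  {"speaker C"},
    "lecture2": {"张老师"},
}
assert sorted(find_recordings(recordings, "张老师")) == ["lecture1", "lecture2"]
```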
在步骤201采集当前录音并从所述当前录音中提取声纹特征参数之前，还需要通过样本声音建立一个声纹数据库。Before the current recording is collected in step 201 and the voiceprint feature parameters are extracted from it, a voiceprint database needs to be built from sample sounds.
如图10所示,其为本申请实施例一中的声纹数据库建立方法的流程示意图,该声纹数据库建立方法包括:As shown in FIG. 10, it is a schematic flowchart of a method for establishing a voiceprint database in Embodiment 1 of the present application, and the method for establishing a voiceprint database includes:
步骤301、对样本声音进行分析,提取所述样本声音的所述声纹特征参数。Step 301: Analyze the sample sound, and extract the voiceprint feature parameter of the sample sound.
本实施例中，将录音APP在当前录音之前的每次录制的声音作为样本声音。在获取到每次录音后，录音APP会对录音的样本声音进行分析，提取出该样本声音的声纹特征参数，其中声纹特征参数包括：声音的能量、共振峰、MFCC以及LPC等。In this embodiment, each sound recorded by the recording APP before the current recording is taken as a sample sound. After each recording is obtained, the recording APP analyzes the sample sound of the recording and extracts its voiceprint feature parameters, where the voiceprint feature parameters include the energy of the sound, formants, MFCC (Mel-frequency cepstral coefficients), LPC (linear prediction coefficients), and the like.
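Of the listed parameters, short-time energy is the simplest to illustrate. The sketch below computes per-frame energy from raw samples with plain Python; real systems would additionally compute formants, MFCC and LPC coefficients with a signal-processing library, and the frame length chosen here is an arbitrary assumption:

```python
# A minimal sketch of extracting one voiceprint feature parameter --
# short-time frame energy -- over non-overlapping frames of samples.

def frame_energy(samples, frame_len):
    """Sum of squared samples per complete non-overlapping frame."""
    return [sum(x * x for x in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

assert frame_energy([0, 1, -1, 0, 2, 2, 0, 0], 4) == [2, 8]
```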
步骤302、根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板。Step 302: Perform voiceprint clustering training according to the voiceprint feature parameters of the sample sound to generate a sample voiceprint template.
为了对获取到的样本声音的声纹特征参数进行声纹聚类训练，需要进一步确定该声纹特征参数是否为同一个音源的声音，具体地，获取预设时间段内的所述样本声音的所述声纹特征参数，当所述预设时间内的所述样本声音的所述声纹特征参数具有相似性时，对所述样本声音的所述声纹特征参数进行声纹聚类训练生成所述样本声纹模板。如果确定出样本声音的声纹特征参数不具有相似性，则需要将声纹特征参数进行缓存，再判断出该声纹特征参数具有相似性之后，对声纹特征参数进行声纹聚类训练生成样本声纹模板。In order to perform voiceprint clustering training on the acquired voiceprint feature parameters of the sample sound, it must first be determined whether the feature parameters come from the same sound source. Specifically, the voiceprint feature parameters of the sample sound within a preset time period are acquired; when the voiceprint feature parameters of the sample sound within the preset time are similar, voiceprint clustering training is performed on them to generate the sample voiceprint template. If it is determined that the voiceprint feature parameters of the sample sound are not similar, the feature parameters are buffered first, and only after the feature parameters are judged to be similar is voiceprint clustering training performed on them to generate the sample voiceprint template.
比如，有一段录音中有5个说话人，这5个说话人的声音就可以作为样本声音，在通过声纹聚类训练后，可以识别出这5个说话人分别为说话人A、B、C、D、E，并为5个说话人生成相应的样本声纹模板。For example, if a recording contains 5 speakers, these 5 speakers' voices can serve as the sample sounds. After voiceprint clustering training, the 5 speakers can be identified as speakers A, B, C, D and E, and a corresponding sample voiceprint template is generated for each of them.
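The buffer-until-similar logic described above can be sketched as follows. The cosine-similarity measure, the 0.9 threshold, and averaging the feature vectors into a template are all illustrative assumptions; the patent does not specify the similarity metric or the clustering algorithm.

```python
# Hedged sketch of similarity-gated template training: feature vectors
# are only merged into a template once the buffered vectors are mutually
# similar; otherwise they remain buffered (the function returns None).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def train_template(buffered, threshold=0.9):
    """Average the buffered vectors into a template if they agree, else None."""
    if all(cosine(buffered[0], v) >= threshold for v in buffered[1:]):
        n = len(buffered)
        return [sum(col) / n for col in zip(*buffered)]
    return None  # keep buffering until similarity is reached

assert train_template([[1.0, 0.0], [0.0, 1.0]]) is None          # dissimilar
assert train_template([[1.0, 0.0], [1.0, 0.0]]) == [1.0, 0.0]    # similar
```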
步骤303、为所述样本声纹模板生成对应的样本标记信息。Step 303: Generate corresponding sample tag information for the sample voiceprint template.
在生成样本声纹模板后，为样本声音生成对应的样本标记信息，例如同一个说话人使用相同的标记进行标记。本实施例中，可以使用左斜线、右斜线、横线、竖线以及网格分别标记说话人A、B、C、D、E。After the sample voiceprint templates are generated, corresponding sample marker information is generated for the sample sounds; for example, the same speaker is always marked with the same marker. In this embodiment, a left slash, a right slash, a horizontal line, a vertical line and a grid pattern can be used to mark speakers A, B, C, D and E, respectively.
步骤304、使用所述样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。Step 304: Generate the voiceprint database by using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
为了提高对录音标记的快捷性，本实施例中，使用样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。每次对录音进行声纹聚类训练后生成的声纹模板都会作为样本声纹模板保存到声纹数据库中，而且该样本声纹模板的标记信息以及两者之间的映射关系也会保存到声纹数据库中，以对声纹数据库进行更新。这样当再次遇到同一说话人的录音时，录音APP通过声纹分析，能够很迅速地对该说话人的录音进行标记，提高了录音标记的便捷性。To make recording marking faster, in this embodiment the voiceprint database is generated from the sample voiceprint templates, the sample marker information, and the mapping relationships between the sample voiceprint templates and the sample marker information. The voiceprint template generated after each round of voiceprint clustering training on a recording is saved into the voiceprint database as a sample voiceprint template, together with its marker information and the mapping between the two, thereby keeping the voiceprint database up to date. In this way, when a recording of the same speaker is encountered again, the recording APP can quickly mark that speaker's segments through voiceprint analysis, improving the convenience of recording marking.
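Step 304 can be summarised as a store keyed by template with the marker as the mapped value. A minimal sketch, where the class name, method names and string template identifiers are assumptions for illustration only:

```python
# Illustrative sketch of step 304: the database holds sample templates,
# their marker info, and the template-to-marker mapping, and is updated
# after every training pass so a returning speaker is recognised at once.

class VoiceprintDatabase:
    def __init__(self):
        self._markers = {}            # sample template -> marker info

    def add(self, template, marker):
        """Store a sample template together with its marker information."""
        self._markers[template] = marker

    def lookup(self, template):
        """Return the stored marker info, or None for a new speaker."""
        return self._markers.get(template)

db = VoiceprintDatabase()
db.add("speaker_A", {"mark": "left-slash", "remark": "张老师"})
assert db.lookup("speaker_A")["remark"] == "张老师"
assert db.lookup("speaker_F") is None
```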
实施例二 Embodiment 2
如图11所示,其为本申请实施例二的录音装置的结构示意图。该装置包括:标记模块11、获取模块12、选取模块13和编辑模块14。FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application. The device comprises: a marking module 11, an obtaining module 12, a selecting module 13 and an editing module 14.
其中,标记模块11,用于对当前录音进行声波分析并根据声波分析结果对当前录音进行标记。The marking module 11 is configured to perform acoustic wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
获取模块12,用于获取对当前录音进行编辑的编辑指令,编辑指令中携带待编辑片段的标记信息以及编辑方式。The obtaining module 12 is configured to obtain an editing instruction for editing the current recording, and the editing instruction carries the marking information of the to-be-edited segment and the editing mode.
选取模块13,用于根据标记信息从标记后的当前录音中选取出待编辑片段。The selecting module 13 is configured to select a segment to be edited from the current recording after the marking according to the marking information.
编辑模块14,用于按照编辑方式对待编辑片段进行编辑。The editing module 14 is configured to edit the edited segments according to the editing mode.
如图12所示，为本实施例二中标记模块11的一种可选的结构方式，包括：提取单元111、训练单元112、判断单元113、获取单元114、标记单元115、生成单元116、建立单元117和接收单元118。As shown in FIG. 12, an optional structure of the marking module 11 in the second embodiment includes: an extracting unit 111, a training unit 112, a determining unit 113, an obtaining unit 114, a marking unit 115, a generating unit 116, an establishing unit 117 and a receiving unit 118.
其中,提取单元111,用于采集当前录音并从当前录音中提取声纹特征参数。The extracting unit 111 is configured to collect the current recording and extract the voiceprint feature parameters from the current recording.
训练单元112,用于对声纹参数进行声纹聚类训练得到声纹参数的目标声纹模板。The training unit 112 is configured to perform voiceprint clustering training on the voiceprint parameters to obtain a target voiceprint template of the voiceprint parameters.
判断单元113,用于判断目标声纹模板是否为声纹数据库中的声纹模板。The determining unit 113 is configured to determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
获取单元114,用于在判断单元的判断结果为是时,从声纹数据库中获取与目标声纹模板对应的目标标记信息。The obtaining unit 114 is configured to acquire target tag information corresponding to the target voiceprint template from the voiceprint database when the determination result of the determining unit is YES.
标记单元115,用于使用目标标记信息对当前录音进行标记。The marking unit 115 is configured to mark the current recording with the target marking information.
生成单元116,用于在判断单元113的结果为否时,生成与目标声纹模板对应的目标标记信息。The generating unit 116 is configured to generate target tag information corresponding to the target voiceprint template when the result of the determining unit 113 is NO.
其中，建立单元117，用于在标记单元115使用目标标记信息对当前录音进行标记之后，建立目标声纹模板与目标标记信息之间的映射关系并存储在声纹数据库中。The establishing unit 117 is configured to, after the marking unit 115 marks the current recording with the target marker information, establish a mapping relationship between the target voiceprint template and the target marker information and store the mapping in the voiceprint database.
进一步地,提取单元111,还用于在采集当前录音并从当前录音中提取声纹特征参数之前,对样本声音进行分析提取样本声音的声纹特征参数。 Further, the extracting unit 111 is further configured to analyze the sample sound to extract the voiceprint feature parameter of the sample sound before acquiring the current recording and extracting the voiceprint feature parameter from the current recording.
训练单元112,还用于根据样本声音的声纹特征参数进行声纹聚类训练生成样本声纹模板。The training unit 112 is further configured to generate a sample voiceprint template by performing voiceprint clustering training according to the voiceprint feature parameter of the sample sound.
生成单元116,还用于为样本声纹模板生成对应的样本标记信息。The generating unit 116 is further configured to generate corresponding sample tag information for the sample voiceprint template.
建立单元117,还用于使用样本声纹模板、样本标记信息以及样本声纹模板与样本标记信息之间的映射关系生成声纹数据库。The establishing unit 117 is further configured to generate a voiceprint database by using a sample voiceprint template, sample mark information, and a mapping relationship between the sample voiceprint template and the sample mark information.
进一步地，训练单元112，具体用于获取预设时间段内的样本声音的声纹特征参数，在预设时间内的样本声音的声纹特征参数具有相似性时，对样本声音的声纹特征参数进行声纹聚类训练生成样本声纹模板。Further, the training unit 112 is specifically configured to acquire the voiceprint feature parameters of the sample sound within a preset time period, and when the voiceprint feature parameters of the sample sound within the preset time are similar, perform voiceprint clustering training on them to generate the sample voiceprint template.
其中,接收单元118,用于在标记单元115使用目标标记信息对当前录音进行标记之后,接收用户通过终端发送的备注信息。The receiving unit 118 is configured to receive the remark information sent by the user through the terminal after the marking unit 115 marks the current recording by using the target tag information.
标记单元115,还用于使用备注信息对当前录音进行备注。The marking unit 115 is further configured to use the comment information to comment on the current recording.
建立单元117，还用于将备注信息更新到声纹数据库中目标标记信息中。The establishing unit 117 is further configured to update the remark information into the target marker information in the voiceprint database.
进一步地，获取模块12，具体用于检测对当前录音的波形图形所包含的至少一个待编辑片段对应的标记进行的第一点击操作，并检测对待编辑片段所采用的目标编辑方式进行的第二点击操作，以及根据检测到的第一点击操作和第二点击操作生成编辑指令。Further, the obtaining module 12 is specifically configured to detect a first click operation on the mark corresponding to at least one to-be-edited segment contained in the waveform graph of the current recording, detect a second click operation on the target editing mode to be applied to the to-be-edited segment, and generate the editing instruction according to the detected first click operation and second click operation.
可选地，获取模块12，具体用于检测从当前录音所包含的标记列表中选取至少一个标记进行的第一点击操作，选取的标记用于指示出待编辑片段；并检测对待编辑片段所采用的目标编辑方式进行的第二点击操作，以及根据检测到的第一点击操作和第二点击操作生成编辑指令。Optionally, the obtaining module 12 is specifically configured to detect a first click operation that selects at least one mark from the mark list of the current recording, the selected mark indicating the to-be-edited segment; detect a second click operation on the target editing mode to be applied to the to-be-edited segment; and generate the editing instruction according to the detected first click operation and second click operation.
本实施例提供的录音装置的各功能模块可用于执行上述实施例中所示的录音编辑方法的流程,其具体工作原理不再赘述,详见方法实施例的描述。The function modules of the recording apparatus provided in this embodiment can be used to execute the flow of the recording editing method shown in the above embodiments. The specific working principle is not described here. For details, refer to the description of the method embodiments.
本实施例提供的录音装置，通过对当前录音进行声波分析，并根据声波分析结果对当前录音进行标记，接收对当前录音进行编辑的编辑指令，编辑指令中携带待编辑片段的标记信息以及编辑方式，根据标记信息从标记后的当前录音中获取待编辑片段，按照编辑方式对待编辑片段进行编辑。本实施例通过声纹识别对当前录音进行标记，在标记完成后用户基于标记对当前录音进行编辑，从而能够快捷地定位到待编辑片段，节省了编辑时间，提升了用户感受。The recording apparatus provided in this embodiment performs sound wave analysis on the current recording, marks the current recording according to the analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; the to-be-edited segment is then obtained from the marked current recording according to the marker information and edited according to the editing mode. In this embodiment, the current recording is marked through voiceprint recognition, and after the marking is completed the user edits the current recording based on the marks, so that the to-be-edited segment can be located quickly, which saves editing time and improves user experience.
实施例三Embodiment 3
图13为本申请提供的录音装置的又一个实施例的结构示意图。如图13所示,本申请实施例的录音装置包括:存储器61、一个或多个处理器62以及一个或多个程序63。FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application. As shown in FIG. 13, the recording apparatus of the embodiment of the present application includes a memory 61, one or more processors 62, and one or more programs 63.
其中，所述一个或多个程序63在由一个或多个处理器62执行时执行上述实施例中的任意一种方法。The one or more programs 63, when executed by the one or more processors 62, perform any one of the methods in the above embodiments.
本申请实施例的录音装置，通过对当前录音进行声波分析，并根据所述声波分析结果对所述当前录音进行标记，接收对所述当前录音进行编辑的编辑指令，所述编辑指令中携带待编辑片段的标记信息以及编辑方式，根据所述标记信息从标记后的所述当前录音中获取所述待编辑片段，按照所述编辑方式对所述待编辑片段进行编辑。本申请实施例通过声纹识别对当前录音进行标记，在标记完成后用户基于标记对当前录音进行编辑，从而能够快捷地定位到待编辑片段，节省了编辑时间，提升了用户感受。The recording apparatus of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marker information and edited according to the editing mode. In the embodiment of the present application, the current recording is marked through voiceprint recognition, and after the marking is completed the user edits the current recording based on the marks, so that the to-be-edited segment can be located quickly, which saves editing time and improves user experience.
实施例四Embodiment 4
图14为本申请提供的用于录音编辑的计算机程序产品一个实施例的结构示意图。如图14所示,FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application. As shown in Figure 14,
本申请实施例的用于录音编辑的计算机程序产品71，可以包括信号承载介质72。信号承载介质72可以包括一个或更多个指令73，该指令73在由例如处理器执行时，处理器可以提供以上针对图1-12描述的功能。例如，指令73可以包括：用于对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记的一个或多个指令；用于获取对所述当前录音进行编辑的编辑指令，所述编辑指令中携带待编辑片段的标记信息以及编辑方式的一个或多个指令；用于根据所述标记信息从标记后的所述当前录音中选取出所述待编辑片段的一个或多个指令；以及用于按照所述编辑方式对所述待编辑片段进行编辑的一个或多个指令。因此，例如，参照图11，录音装置可以响应于指令73来进行图1中所示的步骤中的一个或更多个。The computer program product 71 for recording editing of the embodiment of the present application may include a signal bearing medium 72. The signal bearing medium 72 may include one or more instructions 73 that, when executed by, for example, a processor, cause the processor to provide the functionality described above with respect to FIGS. 1-12. For example, the instructions 73 may include: one or more instructions for performing sound wave analysis on the current recording and marking the current recording according to the analysis result; one or more instructions for obtaining an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; one or more instructions for selecting the to-be-edited segment from the marked current recording according to the marker information; and one or more instructions for editing the to-be-edited segment according to the editing mode. Thus, for example, referring to FIG. 11, the recording apparatus may perform one or more of the steps shown in FIG. 1 in response to the instructions 73.
在一些实现中，信号承载介质72可以包括计算机可读介质74，诸如但不限于硬盘驱动器、压缩盘（CD）、数字通用盘（DVD）、数字带、存储器等。在一些实现中，信号承载介质72可以包括可记录介质75，诸如但不限于存储器、读/写（R/W）CD、R/W DVD等。在一些实现中，信号承载介质72可以包括通信介质76，诸如但不限于数字和/或模拟通信介质（例如，光纤线缆、波导、有线通信链路、无线通信链路等）。因此，例如，计算机程序产品71可以通过RF信号承载介质72传送给多指滑动手势的识别装置的一个或多个模块，其中，信号承载介质72由无线通信介质（例如，符合IEEE 802.11标准的无线通信介质）传送。In some implementations, the signal bearing medium 72 may include a computer readable medium 74, such as but not limited to a hard disk drive, a compact disc (CD), a digital versatile disc (DVD), a digital tape, a memory, and the like. In some implementations, the signal bearing medium 72 may include a recordable medium 75, such as but not limited to a memory, a read/write (R/W) CD, an R/W DVD, and the like. In some implementations, the signal bearing medium 72 may include a communication medium 76, such as but not limited to a digital and/or analog communication medium (for example, a fiber optic cable, a waveguide, a wired communication link, a wireless communication link, etc.). Thus, for example, the computer program product 71 may be transmitted via the RF signal bearing medium 72 to one or more modules of the recognition apparatus for multi-finger swipe gestures, where the signal bearing medium 72 is conveyed by a wireless communication medium (for example, a wireless communication medium conforming to the IEEE 802.11 standard).
本申请实施例的计算机程序产品，通过对当前录音进行声波分析，并根据所述声波分析结果对所述当前录音进行标记，接收对所述当前录音进行编辑的编辑指令，所述编辑指令中携带待编辑片段的标记信息以及编辑方式，根据所述标记信息从标记后的所述当前录音中获取所述待编辑片段，按照所述编辑方式对所述待编辑片段进行编辑。本申请实施例通过声纹识别对当前录音进行标记，在标记完成后用户基于标记对当前录音进行编辑，从而能够快捷地定位到待编辑片段，节省了编辑时间，提升了用户感受。The computer program product of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marker information and edited according to the editing mode. In the embodiment of the present application, the current recording is marked through voiceprint recognition, and after the marking is completed the user edits the current recording based on the marks, so that the to-be-edited segment can be located quickly, which saves editing time and improves user experience.
通过以上的实施方式的描述，本领域的技术人员可以清楚地了解到各实施方式可借助软件加必需的通用硬件平台的方式来实现，当然也可以通过硬件。基于这样的理解，上述技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来，该计算机软件产品可以存储在计算机可读存储介质中，如ROM/RAM、磁碟、光盘等，包括若干指令用以使得一台计算机设备（可以是个人计算机，服务器，或者网络设备等）执行各个实施例或者实施例的某些部分所述的方法。From the description of the above embodiments, those skilled in the art can clearly understand that the embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solutions, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments or in parts of the embodiments.
最后应说明的是：以上各实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述各实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分或者全部技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (18)

  1. 一种录音编辑方法,其特征在于,包括:A recording editing method, comprising:
    对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记;Performing sonic analysis on the current recording and marking the current recording according to the result of the acoustic analysis;
    获取对所述当前录音进行编辑的编辑指令,所述编辑指令中携带待编辑片段的标记信息以及编辑方式;Obtaining an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode;
    根据所述标记信息从标记后的所述当前录音中选取出所述待编辑片段;Selecting the to-be-edited segment from the marked current recording according to the marking information;
按照所述编辑方式对所述待编辑片段进行编辑。editing the to-be-edited segment according to the editing mode.
  2. 根据权利要求1所述的录音编辑方法,其特征在于,所述对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记,包括:The recording editing method according to claim 1, wherein the performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result comprises:
    采集所述当前录音并从所述当前录音中提取声纹特征参数;Collecting the current recording and extracting a voiceprint feature parameter from the current recording;
    对所述声纹参数进行声纹聚类训练得到所述声纹参数的目标声纹模板;Performing a voiceprint clustering training on the voiceprint parameter to obtain a target voiceprint template of the voiceprint parameter;
    判断所述目标声纹模板是否为声纹数据库中的声纹模板;Determining whether the target voiceprint template is a voiceprint template in a voiceprint database;
    如果判断结果为是,从所述声纹数据库中获取与所述目标声纹模板对应的目标标记信息;If the determination result is yes, the target mark information corresponding to the target voiceprint template is obtained from the voiceprint database;
    使用所述目标标记信息对所述当前录音进行标记。The current recording is marked using the target tag information.
  3. 根据权利要求2所述的录音编辑方法,其特征在于,所述使用所述目标标记信息对所述当前录音进行标记之前,还包括:The recording editing method according to claim 2, wherein before the marking the current recording using the target marking information, the method further comprises:
    如果判断结果为否,生成与所述目标声纹模板对应的所述目标标记信息。If the determination result is no, the target tag information corresponding to the target voiceprint template is generated.
  4. 根据权利要求3所述的录音编辑方法,其特征在于,所述使用所述目标标记信息对所述当前录音进行标记之后,还包括:The recording editing method according to claim 3, wherein after the marking the current recording using the target marking information, the method further comprises:
    建立所述目标声纹模板与所述目标标记信息之间映射关系并存储在所述声纹数据库中。Establishing a mapping relationship between the target voiceprint template and the target marker information and storing in the voiceprint database.
5. 根据权利要求1-4任一项所述的录音编辑方法，其特征在于，所述采集当前录音并从所述当前录音中提取声纹特征参数之前，包括：The recording editing method according to any one of claims 1-4, wherein before the current recording is collected and the voiceprint feature parameters are extracted from the current recording, the method comprises:
对样本声音进行分析，提取所述样本声音的所述声纹特征参数；analyzing the sample sound, and extracting the voiceprint feature parameters of the sample sound;
    根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板;Performing a voiceprint clustering training according to the voiceprint feature parameter of the sample sound to generate a sample voiceprint template;
    为所述样本声纹模板生成对应的样本标记信息;Generating corresponding sample tag information for the sample voiceprint template;
    使用所述样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。The voiceprint database is generated using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
  6. 根据权利要求5所述的录音编辑方法,其特征在于,所述根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板包括:The recording editing method according to claim 5, wherein the generating the sample voiceprint template by performing the voiceprint clustering training according to the voiceprint feature parameter of the sample sound comprises:
    获取预设时间段内的所述样本声音的所述声纹特征参数;Obtaining the voiceprint feature parameter of the sample sound within a preset time period;
在所述预设时间内的所述样本声音的所述声纹特征参数具有相似性时，对所述样本声音的所述声纹特征参数进行声纹聚类训练生成所述样本声纹模板。when the voiceprint feature parameters of the sample sound within the preset time are similar, performing voiceprint clustering training on the voiceprint feature parameters of the sample sound to generate the sample voiceprint template.
  7. 根据权利要求1-4任一项所述的录音编辑方法,其特征在于,所述使用所述目标标记信息对所述当前录音进行标记之后,还包括:The recording editing method according to any one of claims 1 to 4, wherein after the marking of the current recording using the target tag information, the method further comprises:
    接收用户通过终端发送的备注信息;Receiving remark information sent by the user through the terminal;
    使用所述备注信息对所述当前录音进行备注;Remarking the current recording using the comment information;
将所述备注信息更新到所述声纹数据库中所述目标标记信息中。updating the remark information into the target tag information in the voiceprint database.
  8. 根据权利要求1-4任一项所述的录音编辑方法,其特征在于,所述获取对所述当前录音进行编辑的编辑指令,包括:The recording editing method according to any one of claims 1 to 4, wherein the obtaining an editing instruction for editing the current recording comprises:
    检测对所述当前录音的波形图形所包含的至少一个所述待编辑片段对应的标记进行的第一点击操作;Detecting a first click operation performed on a mark corresponding to at least one of the to-be-edited segments included in the waveform pattern of the current recording;
    检测对所述待编辑片段所采用的目标编辑方式进行的第二点击操作;Detecting a second click operation performed on the target editing mode adopted by the segment to be edited;
    根据检测到的所述第一点击操作和所述第二点击操作生成所述编辑指令。The editing instruction is generated according to the detected first click operation and the second click operation.
  9. 根据权利要求1-4任一项所述的录音编辑方法,其特征在于,所述获取对所述当前录音进行编辑的编辑指令,包括:The recording editing method according to any one of claims 1 to 4, wherein the obtaining an editing instruction for editing the current recording comprises:
检测从所述当前录音所包含的标记列表中选取至少一个标记进行的第一点击操作；所述选取的标记用于指示出所述待编辑片段；detecting a first click operation that selects at least one mark from the mark list included in the current recording, the selected mark being used to indicate the to-be-edited segment;
    检测所述待编辑片段所采用的目标编辑方式进行的第二点击操作;Detecting a second click operation performed by the target editing mode used by the to-be-edited segment;
    根据检测到的所述第一点击操作和所述第二点击操作生成所述编辑指令。The editing instruction is generated according to the detected first click operation and the second click operation.
  10. 一种录音装置,其特征在于,包括:A recording device, comprising:
    标记模块,用于对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记;a marking module, configured to perform sonic analysis on the current recording and mark the current recording according to the result of the acoustic wave analysis;
    获取模块,用于获取对所述当前录音进行编辑的编辑指令,所述编辑指令中携带待编辑片段的标记信息以及编辑方式;An obtaining module, configured to acquire an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode;
    选取模块,用于根据所述标记信息从标记后的所述当前录音中选取出所述待编辑片段;a selection module, configured to select the to-be-edited segment from the marked current recording according to the marking information;
    编辑模块,用于按照所述编辑方式对所述待编辑片段进行编辑。And an editing module, configured to edit the to-be-edited segment according to the editing manner.
  11. 根据权利要求10所述的录音装置,其特征在于,所述标记模块包括:The recording apparatus according to claim 10, wherein the marking module comprises:
    提取单元,用于采集所述当前录音并从所述当前录音中提取声纹特征参数;An extracting unit, configured to collect the current recording and extract a voiceprint feature parameter from the current recording;
    训练单元,用于对所述声纹参数进行声纹聚类训练得到所述声纹参数的目标声纹模板;a training unit, configured to perform voiceprint clustering training on the voiceprint parameter to obtain a target voiceprint template of the voiceprint parameter;
    判断单元,用于判断所述目标声纹模板是否为声纹数据库中的声纹模板;a determining unit, configured to determine whether the target voiceprint template is a voiceprint template in a voiceprint database;
    获取单元,用于在所述判断单元的判断结果为是时,从所述声纹数据库中获取与所述目标声纹模板对应的目标标记信息;An obtaining unit, configured to acquire target tag information corresponding to the target voiceprint template from the voiceprint database when the determination result of the determining unit is YES;
    标记单元,用于使用所述目标标记信息对所述当前录音进行标记。a marking unit for marking the current recording with the target marking information.
  12. 根据权利要求11所述的录音装置,其特征在于,所述标记模块,还包括:The recording device according to claim 11, wherein the marking module further comprises:
    生成单元,用于在所述判断单元的结果为否时,生成与所述目标声纹模板对应的所述目标标记信息。And a generating unit, configured to generate, when the result of the determining unit is no, the target tag information corresponding to the target voiceprint template.
13. 根据权利要求12所述的录音装置，其特征在于，所述标记模块，还包括：The recording apparatus according to claim 12, wherein the marking module further comprises:
    建立单元,用于在所述标记单元使用所述目标标记信息对所述当前录音进行标记之后,建立所述目标声纹模板与所述目标标记信息之间映射关系并存储在所述声纹数据库中。a establishing unit, configured to establish a mapping relationship between the target voiceprint template and the target tag information after the tag unit marks the current recording by using the target tag information, and store the mapping relationship in the voiceprint database in.
14. 根据权利要求10-13任一项所述的录音装置，其特征在于，所述提取单元，还用于在采集当前录音并从所述当前录音中提取声纹特征参数之前，对样本声音进行分析提取所述样本声音的所述声纹特征参数；The recording apparatus according to any one of claims 10-13, wherein the extracting unit is further configured to, before the current recording is collected and the voiceprint feature parameters are extracted from the current recording, analyze the sample sound to extract the voiceprint feature parameters of the sample sound;
    所述训练单元,还用于根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板;The training unit is further configured to generate a sample voiceprint template by performing voiceprint clustering training according to the voiceprint feature parameter of the sample sound;
    所述生成单元,还用于为所述样本声纹模板生成对应的样本标记信息;The generating unit is further configured to generate corresponding sample tag information for the sample voiceprint template;
    所述建立单元,还用于使用所述样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。The establishing unit is further configured to generate the voiceprint database by using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
  15. The recording device according to claim 14, wherein the training unit is specifically configured to acquire the voiceprint feature parameters of the sample sound within a preset time period and, when the voiceprint feature parameters of the sample sound within the preset time period are similar, perform voiceprint clustering training on them to generate the sample voiceprint template.
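Claim 15's gating idea (only cluster the features from a preset time window into a sample template when they are mutually similar, i.e. likely one speaker) can be sketched as follows. The standard-deviation similarity check, the `max_spread` threshold, and the centroid "clustering" are all illustrative simplifications.

```python
import numpy as np

def build_sample_template(frames, window, max_spread=0.1):
    """frames: list of (timestamp, feature_vector) pairs.
    Keep only the features inside the preset time window; if they are
    similar enough (small per-dimension spread), reduce them to a
    sample voiceprint template, otherwise produce no template."""
    windowed = [f for t, f in frames if window[0] <= t <= window[1]]
    if not windowed:
        return None
    arr = np.asarray(windowed, dtype=float)
    spread = float(np.max(np.std(arr, axis=0)))  # crude similarity measure
    if spread > max_spread:
        return None            # features too dissimilar: no template
    return arr.mean(axis=0)    # 'clustering' reduced to the centroid
```

The point of the window-plus-similarity check is to avoid training a template from a stretch of audio where two speakers overlap.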
  16. The recording device according to claim 13, wherein the marking module further comprises:
    a receiving unit, configured to receive remark information sent by a user through a terminal after the marking module marks the current recording with the target tag information;
    the marking unit is further configured to annotate the current recording with the remark information;
    the establishing unit is further configured to update the remark information into the target tag information in the voiceprint database.
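The remark flow of claim 16 (receive user-supplied remark info, annotate the recording, fold the remark into the stored tag information) might be sketched like this; the `template`/`tag` database schema and field names are assumptions.

```python
def add_remark(recording, database, tag, remark):
    """Annotate the current recording with the user's remark and
    merge the remark into the matching database entry's tag
    information, so it travels with that voiceprint in future
    recordings (data layout is an illustrative assumption)."""
    recording["remark"] = remark
    for entry in database:  # entries: {"template": ..., "tag": ...}
        if entry["tag"] == tag:
            entry["remark"] = remark
    return recording
```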
  17. The recording device according to any one of claims 10 to 13, wherein the acquiring module is specifically configured to detect a first click operation on a mark, contained in the waveform graph of the current recording, that corresponds to at least one segment to be edited; detect a second click operation on the target editing mode to be applied to the segment to be edited; and generate the editing instruction according to the detected first click operation and second click operation.
  18. The recording device according to any one of claims 10 to 13, wherein the acquiring module is specifically configured to detect a first click operation selecting at least one mark from the mark list contained in the current recording, the selected mark indicating the segment to be edited; detect a second click operation on the target editing mode to be applied to the segment to be edited; and generate the editing instruction according to the detected first click operation and second click operation.
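Claims 17 and 18 both combine two click operations into one editing instruction: the first click selects marks (and thereby segments to edit), the second selects the editing mode. A minimal sketch of that combination step, with illustrative field names:

```python
def build_edit_instruction(first_click, second_click):
    """Acquiring module sketch: fuse the two detected clicks into an
    editing instruction. first_click carries the selected marks
    (segments to be edited); second_click carries the chosen target
    editing mode. Field names are assumptions."""
    return {
        "segments": list(first_click["marks"]),   # segments to be edited
        "mode": second_click["edit_mode"],        # e.g. 'delete', 'extract'
    }
```

Whether the first click lands on a mark in the waveform graph (claim 17) or in a mark list (claim 18), the resulting instruction is the same.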
PCT/CN2016/089020 2015-11-15 2016-07-07 Audio recording editing method and recording device WO2017080235A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510786352.9A CN105895102A (en) 2015-11-15 2015-11-15 Recording editing method and recording device
CN201510786352.9 2015-11-15

Publications (1)

Publication Number Publication Date
WO2017080235A1 true WO2017080235A1 (en) 2017-05-18

Family

ID=57001979

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/089020 WO2017080235A1 (en) 2015-11-15 2016-07-07 Audio recording editing method and recording device

Country Status (2)

Country Link
CN (1) CN105895102A (en)
WO (1) WO2017080235A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106356067A (en) * 2016-08-25 2017-01-25 乐视控股(北京)有限公司 Recording method, device and terminal
CN107403623A (en) * 2017-07-31 2017-11-28 努比亚技术有限公司 Store method, terminal, Cloud Server and the readable storage medium storing program for executing of recording substance
CN107481743A (en) * 2017-08-07 2017-12-15 捷开通讯(深圳)有限公司 The edit methods of mobile terminal, memory and recording file
CN107564531A (en) * 2017-08-25 2018-01-09 百度在线网络技术(北京)有限公司 Minutes method, apparatus and computer equipment based on vocal print feature
CN109545200A (en) * 2018-10-31 2019-03-29 深圳大普微电子科技有限公司 Edit the method and storage device of voice content
CN110753263A (en) * 2019-10-29 2020-02-04 腾讯科技(深圳)有限公司 Video dubbing method, device, terminal and storage medium
CN114242120B (en) * 2021-11-25 2023-11-10 广东电力信息科技有限公司 Audio editing method and audio marking method based on DTMF technology

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011160390A (en) * 2010-01-28 2011-08-18 Akitoshi Noda System utilizing sound recording function of mobile phone for screen display, emergency communication access function or the like
CN102985965A (en) * 2010-05-24 2013-03-20 微软公司 Voice print identification
CN103530432A (en) * 2013-09-24 2014-01-22 华南理工大学 Conference recorder with speech extracting function and speech extracting method
CN103700370A (en) * 2013-12-04 2014-04-02 北京中科模识科技有限公司 Broadcast television voice recognition method and system


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116132234A (en) * 2023-01-09 2023-05-16 天津大学 Underwater hidden communication method and device using whale animal whistle phase code
CN116132234B (en) * 2023-01-09 2024-03-12 天津大学 Underwater hidden communication method and device using whale animal whistle phase code

Also Published As

Publication number Publication date
CN105895102A (en) 2016-08-24

Similar Documents

Publication Publication Date Title
WO2017080235A1 (en) Audio recording editing method and recording device
WO2017080239A1 (en) Audio recording tagging method and recording device
JP6688340B2 (en) Method and apparatus for entering facial expression icon
WO2020024690A1 (en) Speech labeling method and apparatus, and device
CN108305632A (en) A kind of the voice abstract forming method and system of meeting
CN105632484B (en) Speech database for speech synthesis pause information automatic marking method and system
JP4600828B2 (en) Document association apparatus and document association method
CN1333363C (en) Audio signal processing apparatus and audio signal processing method
CN107274916B (en) Method and device for operating audio/video file based on voiceprint information
WO2019148586A1 (en) Method and device for speaker recognition during multi-person speech
US20200042279A1 (en) Platform for producing and delivering media content
CN107305541A (en) Speech recognition text segmentation method and device
JP2014519058A (en) Automatic creation of mapping between text data and audio data
CN108257592A (en) Human voice segmentation method and system based on long-term and short-term memory model
CN107025913A (en) A kind of way of recording and terminal
CN106373598A (en) Audio replay control method and apparatus
JP5099211B2 (en) Voice data question utterance extraction program, method and apparatus, and customer inquiry tendency estimation processing program, method and apparatus using voice data question utterance
KR102287431B1 (en) Apparatus for recording meeting and meeting recording system
CN109213970B (en) Method and device for generating notes
CN106874684B (en) A kind of image labeling system and method
CN113573096A (en) Video processing method, video processing device, electronic equipment and medium
CN103337247A (en) Data annotation analysis system for electromagnetic pronunciation recorder
CN108364654B (en) Voice processing method, medium, device and computing equipment
CN117174092B (en) Mobile corpus transcription method and device based on voiceprint recognition and multi-modal analysis
CN105069146B (en) Sound searching method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16863413

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16863413

Country of ref document: EP

Kind code of ref document: A1