WO2017080235A1 - Audio recording editing method and recording device - Google Patents


Info

Publication number
WO2017080235A1
Authority
WO
WIPO (PCT)
Prior art keywords
voiceprint
recording
editing
target
sample
Prior art date
Application number
PCT/CN2016/089020
Other languages
French (fr)
Chinese (zh)
Inventor
蔡竹沁
齐峰岩
牛磊
关彬
Original Assignee
乐视控股(北京)有限公司
乐视移动智能信息技术(北京)有限公司
Priority date
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司 and 乐视移动智能信息技术(北京)有限公司
Publication of WO2017080235A1 publication Critical patent/WO2017080235A1/en

Classifications

    • G: PHYSICS
        • G10: MUSICAL INSTRUMENTS; ACOUSTICS
            • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
                • G10L17/00: Speaker identification or verification techniques
                • G10L15/00: Speech recognition
                    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
                    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
                        • G10L15/063: Training
                            • G10L2015/0631: Creating reference templates; Clustering
                • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
                    • G10L25/48: Speech or voice analysis techniques specially adapted for particular use
        • G11: INFORMATION STORAGE
            • G11B: INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
                • G11B27/00: Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
                    • G11B27/10: Indexing; Addressing; Timing or synchronising; Measuring tape travel

Definitions

  • the embodiments of the present application relate to the field of electronic technologies, and in particular, to a recording editing method and a recording device.
  • Smartphones have gradually become part of people's daily lives, serving not only as everyday communication devices but also as easily carried recording devices.
  • The user can record and save voice information through the recording application (APP) of the smartphone, allowing the user to quickly save a piece of voice information that is difficult to memorize directly and to reuse the recording multiple times.
  • The embodiments of the present application provide a recording editing method and a recording device, which are used to solve the problem that editing a recording wastes the user's time and degrades the user experience.
  • an embodiment of the present application provides a recording editing method, including:
  • performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result; acquiring an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; selecting the segment to be edited from the marked current recording according to the marking information; and editing the segment to be edited according to the editing mode.
  • an embodiment of the present application provides a recording apparatus, including:
  • a marking module configured to perform sound wave analysis on the current recording and mark the current recording according to the sound wave analysis result;
  • An obtaining module configured to acquire an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode;
  • a selection module configured to select the to-be-edited segment from the marked current recording according to the marking information
  • an editing module configured to edit the to-be-edited segment according to the editing manner.
  • Embodiments of the present application provide a recording apparatus including a memory, one or more processors, and one or more programs, wherein the one or more programs, when executed by the one or more processors, perform the following operations: performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result; acquiring an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; selecting the segment to be edited from the marked current recording according to the marking information; and editing the segment to be edited according to the editing mode.
  • Embodiments of the present application provide a computer readable storage medium having stored thereon computer executable instructions that, in response to execution, cause a recording device to perform operations, the operations including: performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result; acquiring an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; selecting the segment to be edited from the marked current recording according to the marking information; and editing the segment to be edited according to the editing mode.
  • The recording editing method and the recording device of the embodiments of the present application perform sound wave analysis on the current recording, mark the current recording according to the sound wave analysis result, and receive an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; the segment to be edited is obtained from the marked current recording according to the marking information and edited according to the editing mode.
  • The current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, saving editing time and improving the user experience.
  • FIG. 1 is a schematic flowchart of a recording editing method according to Embodiment 1 of the present application;
  • FIG. 2 is a first schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 3 is a second schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 4 is a third schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 5 is a fourth schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
  • FIG. 6 is a schematic flowchart of a recording marking method in Embodiment 1 of the present application;
  • FIG. 7 is a first schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
  • FIG. 8 is a second schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
  • FIG. 9 is a third schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
  • FIG. 10 is a schematic flowchart of a method for establishing a voiceprint database according to Embodiment 1 of the present application;
  • FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application;
  • FIG. 12 is a schematic structural diagram of a marking module according to Embodiment 2 of the present application;
  • FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application;
  • FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application.
  • FIG. 1 is a schematic flowchart of a recording editing method according to Embodiment 1 of the present application; the recording editing method includes:
  • Step 101 Perform sound wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
  • The user can open the recording function of the recording APP downloaded on the smartphone through the user interface, and the recording APP starts collecting the current recording; the recording APP can preprocess the sound during the collection process.
  • Sound wave analysis is performed on the currently collected recording to obtain the sound wave analysis result, which includes the acoustic feature parameters.
  • Since the voiceprint of a speaker is unique, the voiceprint can be used as a unique feature for distinguishing speakers, and the current recording can be marked according to the acoustic feature parameters.
  • The acoustic feature parameters include: sound energy, formants, Mel-frequency cepstral coefficients (MFCC), and linear prediction coefficients (LPC).
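As a hedged illustration of the frame-level features listed above, the following Python sketch computes one of them, short-time energy, over fixed-length frames. The frame and hop sizes are example values, not taken from the application.

```python
# Illustrative sketch (not the application's implementation): short-time
# energy of a mono signal, computed over fixed-length frames.
# frame_len=400 and hop=160 correspond to 25 ms / 10 ms at 16 kHz.

def short_time_energy(samples, frame_len=400, hop=160):
    """Return the energy (sum of squares) of each analysis frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame))
    return energies

signal = [0.0] * 800
print(len(short_time_energy(signal)))  # 3 frames for 800 samples
```

Formants, MFCC, and LPC would be computed from the same framed signal; they are omitted here because their extraction pipelines are considerably longer.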
  • FIG. 2 is a schematic diagram of an application example of this embodiment.
  • For example, a recording has 5 speakers, and left oblique lines, right oblique lines, horizontal lines, vertical lines, and grids are used to mark speakers A, B, C, D, and E respectively.
  • If speaker A has two speeches separated by other speakers, both speeches are marked with the left oblique line to indicate that they are recording segments of the same speaker.
  • Alternatively, the speakers can be marked with different colors; for example, speakers A, B, C, D, and E are marked with red, yellow, blue, green, and purple, respectively.
  • Speaker A has two speeches separated by other speakers in this recording; both speeches are marked in red to indicate that they are recorded passages of the same speaker.
  • Step 102 Acquire an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode.
  • In this embodiment, the editing instruction carries the marking information of the segment to be edited and the editing mode for that segment. Editing modes can include cutting selected segments, merging selected segments, or deleting selected segments.
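The three editing modes named above can be sketched as simple list operations. The function names and the (start, end) segment representation are illustrative assumptions, not the application's implementation.

```python
# Hedged sketch of the three editing modes: cutting (extracting),
# deleting, and merging marked segments. A recording is modelled as a
# list of samples; a segment is a (start, end) index pair.

def cut_segment(recording, seg):
    """Extract the selected segment from the recording."""
    start, end = seg
    return recording[start:end]

def delete_segment(recording, seg):
    """Remove the selected segment, keeping everything else."""
    start, end = seg
    return recording[:start] + recording[end:]

def merge_segments(recording, segs):
    """Concatenate several selected segments (in time order) into one."""
    merged = []
    for start, end in sorted(segs):
        merged.extend(recording[start:end])
    return merged

rec = list(range(10))
assert cut_segment(rec, (2, 5)) == [2, 3, 4]
assert delete_segment(rec, (2, 5)) == [0, 1, 5, 6, 7, 8, 9]
assert merge_segments(rec, [(7, 9), (0, 2)]) == [0, 1, 7, 8]
```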
  • the acquiring an edit instruction for editing the current recording includes:
  • the user can select the corresponding segment to be edited by clicking at least one tag included in the currently recorded waveform graphic.
  • the recording APP can detect a first click operation on the mark corresponding to the at least one piece to be edited included in the currently recorded waveform pattern.
  • The user can select a target editing mode for the segment to be edited from the editing modes displayed on the terminal display interface.
  • the recording APP can detect the second click operation performed by the target editing mode used to edit the segment. After detecting the first click operation and the second click operation, an edit instruction is generated according to the first click operation and the second click operation.
  • Optionally, the obtaining an editing instruction for editing the current recording includes:
  • The user can select the segments to be edited by selecting at least one tag from the tag list included in the current recording; the selected tag indicates the segments to be edited.
  • The recording APP can detect the first click operation performed on the tag list and the second click operation performed on the target editing mode used to edit the segments. After detecting the first click operation and the second click operation, an editing instruction is generated according to the two operations.
  • Step 103 Acquire the to-be-edited segment from the marked current recording according to the marking information.
  • Step 104 Edit the to-be-edited segment according to the editing mode.
  • the recording APP can obtain the marking information of the segment to be edited from the editing instruction, and then select the segment to be edited from the current recording according to the marking information.
  • The recording APP can obtain the editing mode of the segment to be edited from the editing instruction, for example, cutting, merging, or deleting the segment. After the segment to be edited is acquired, the recording APP edits it according to the indicated editing mode.
  • FIG. 3 is a schematic diagram of an application example of this embodiment.
  • As shown, the user can clearly see that the waveform pattern of the recording carries different mark distinctions.
  • The user can select a segment as the segment to be edited by clicking on a mark on the waveform pattern.
  • For example, the user selects a segment marked with a horizontal line as the segment to be edited by clicking on it.
  • Then the user can click the target editing mode for the segment to be edited in the editing menu; for example, the user can click "Cut selected segment" as the target editing mode. The above two click operations generate an editing instruction for editing the segment to be edited, and the segment can be cut according to the editing instruction.
  • FIG. 4 is a schematic diagram of another application example of this embodiment.
  • In this embodiment, a tag list of the recording is provided to the user below the recording waveform, and the user can directly select a tag from the tag list to select all the segments of the corresponding speaker at once.
  • a recording has 3 speakers, using the left slash, the right slash, and the horizontal line to mark the speakers A, B, and C.
  • Speaker A has two speeches separated by other speakers in this recording, and both speeches are marked with the left oblique line. When the user clicks the left-oblique-line option in the tag list, both segments are selected at the same time; the user can click a segment to deselect it or keep it selected.
  • When multiple segments are selected and the user wants to merge them, the user can click "Merge selected segments" in the editing mode list as the target editing mode. After the click operation is completed, the recording APP obtains an editing instruction, and the selected segments are merged into a new segment.
  • the user may also select multiple marking options from the tag list.
  • the user selects the two mark options of the left slash and the right slash, so that all the segments of the speaker A and the speaker B can be selected.
  • click on "Merge selected clips” and the selected clips will be merged into a new clip. Further, the user can select a part of the conversation content from all the selected segments for merging.
  • The recording editing method provided in this embodiment performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; the segment to be edited is obtained from the marked current recording according to the marking information and edited according to the editing mode.
  • The current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, saving editing time and improving the user experience.
  • FIG. 6 is a schematic flow chart of a method for recording marks in the first embodiment of the present application.
  • the recording marking method includes the following steps:
  • Step 201 Acquire a current recording and extract a voiceprint feature parameter from the current recording.
  • the user can open the recording function of the recording APP downloaded in the smart phone through the user interface of the smart phone, and the recording APP starts to collect the current recording.
  • During collection, the recording APP can preprocess the sound; for example, framing, windowing, and filtering are performed on the collected data.
  • Then the voiceprint feature parameters of the current recording are obtained, including sound energy, formants, MFCC, and LPC.
  • Step 202 Perform voiceprint clustering training on the voiceprint feature parameters to obtain the target voiceprint template of the voiceprint feature parameters.
  • In the recording APP, a voiceprint clustering trainer is provided; after the voiceprint feature parameters are obtained, the trainer performs voiceprint clustering training on them to obtain the target voiceprint template corresponding to the current recording.
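The voiceprint clustering trainer is not specified in detail in the text; as an assumed stand-in, the following sketch uses greedy online clustering of feature vectors, with each cluster centroid playing the role of a voiceprint template. The distance threshold is an arbitrary illustrative value.

```python
import math

def euclid(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_frames(frames, threshold=1.0):
    """Greedy online clustering: each frame joins the nearest centroid
    within `threshold`, otherwise it starts a new cluster. The returned
    centroids stand in for the voiceprint templates."""
    centroids, counts = [], []
    for f in frames:
        best, best_d = None, None
        for i, c in enumerate(centroids):
            d = euclid(f, c)
            if best_d is None or d < best_d:
                best, best_d = i, d
        if best is not None and best_d <= threshold:
            # Update the matched centroid as a running mean.
            n = counts[best]
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], f)]
            counts[best] = n + 1
        else:
            centroids.append(list(f))
            counts.append(1)
    return centroids

frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
print(len(cluster_frames(frames)))  # 2 clusters -> 2 templates
```

A production system would use speaker-diarization techniques rather than this toy distance rule, but the overall shape, frames in, one template per detected speaker out, matches the step described above.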
  • Step 203 Determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
  • Before the current recording, voiceprint clustering training is performed on sample sounds by the trainer to obtain the sample voiceprint templates corresponding to the sample sounds, and a voiceprint database is preset in the recording APP using these sample voiceprint templates.
  • Generally, a plurality of sample voiceprint templates are stored in the voiceprint database, so that recordings can be marked for the user during the recording process.
  • the recording APP can search in the voiceprint database to determine whether the target voiceprint template exists in the voiceprint database.
  • If the result of the determination is yes, go to step 204; otherwise, go to step 205.
  • Step 204 Obtain target tag information corresponding to the target voiceprint template from the voiceprint database.
  • In the voiceprint database, not only the sample voiceprint templates are stored but also the marking information corresponding to each sample voiceprint template; generally, each sample voiceprint template has its own corresponding marking information.
  • the target marker information corresponding to the target voiceprint template may be acquired.
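Matching a target voiceprint template against the database (steps 203 and 204) might be sketched as a nearest-template lookup. The cosine-similarity measure and the 0.95 threshold are assumptions for illustration, not taken from the application.

```python
import math

def cosine(a, b):
    """Cosine similarity between two template vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def lookup_tag(target, database, threshold=0.95):
    """Return the tag of the stored template most similar to `target`,
    or None if no template clears the threshold (in which case new
    tag information must be generated, as in step 205)."""
    best_tag, best_sim = None, threshold
    for template, tag in database:
        sim = cosine(target, template)
        if sim >= best_sim:
            best_tag, best_sim = tag, sim
    return best_tag

db = [([1.0, 0.0], "speaker A"), ([0.0, 1.0], "speaker B")]
assert lookup_tag([0.99, 0.05], db) == "speaker A"
assert lookup_tag([1.0, 1.0], db) is None  # no close match -> step 205
```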
  • Step 205 Generate the target tag information corresponding to the target voiceprint template.
  • The recording APP may set target marking information for the target voiceprint template, so as to mark the target voiceprint template with that information.
  • Step 206 Mark the current recording with the target tag information.
  • the recording APP automatically uses the target tag information to mark the current recording.
  • In this embodiment, the voiceprint template corresponding to the current recording is identified through voiceprint recognition, and the marking information corresponding to the current recording is obtained using the established voiceprint database, thereby marking the current recording and realizing automatic marking of recordings.
  • Step 207 Establish a mapping relationship between the target voiceprint template and the target tag information, and store the data in the voiceprint database.
  • Step 208 Receive remark information sent by the user through the terminal.
  • Step 209 Remark the current recording by using the remark information.
  • Step 210 Update the remark information into the target marking information in the voiceprint database.
  • the remark information may be the source name of the current recording.
  • the recording APP is instructed to use the remark information to remark the current recording.
  • When remarking, the recording APP can add a label at the position corresponding to the current recording.
  • The recording APP can also update the obtained remark information into the target marking information corresponding to the target voiceprint template of the current recording in the voiceprint database, so that the remark can be invoked again when the sound source corresponding to the current recording appears in a later recording.
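Steps 207 through 210 amount to storing a template-to-tag mapping and then overwriting the auto-generated tag with a user remark. A minimal sketch, with a plain dict standing in for the voiceprint database and all names hypothetical:

```python
# Hedged sketch of steps 207-210: persist the template/tag mapping,
# then replace the auto-generated tag with the user's remark. A dict
# keyed by the (hashable) template stands in for the database.

voiceprint_db = {}

def store_mapping(template, tag):
    """Step 207: save the template -> tag mapping."""
    voiceprint_db[tuple(template)] = tag

def update_remark(template, remark):
    """Steps 208-210: overwrite the stored tag with the remark."""
    key = tuple(template)
    if key in voiceprint_db:
        voiceprint_db[key] = remark

tmpl = [0.12, 0.80, 0.33]
store_mapping(tmpl, "speaker A")
update_remark(tmpl, "Teacher Zhang")
assert voiceprint_db[tuple(tmpl)] == "Teacher Zhang"
```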
  • FIG. 7 is a schematic diagram of an application example of this embodiment.
  • After the recording APP automatically marks the current recording, the user can send remark information to the recording APP through the terminal to add a remark for each speaker in the recording.
  • For example, the user can remark speaker A, marked with a left oblique line, as "Teacher Zhang" through the recording APP.
  • FIG. 8 is a schematic diagram of another application example of this embodiment.
  • When a speaker whose name has been saved appears in a new recording, the speaker's voice will be directly marked with the saved marking information after voiceprint analysis.
  • For example, speaker A in the previous recording has been saved as "Teacher Zhang".
  • A new recording containing that speaker will then display not speaker A's original mark but "Teacher Zhang".
  • Since the recording includes the marking information of speakers saved by the user, a saved recording can be quickly located according to the marked speaker. For example, if the user wants to find a recording of Teacher Zhang's lecture, the user only needs to look for the "Teacher Zhang" tag.
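Locating saved recordings by speaker tag, as in the example above, could look like the following sketch; the library structure, file names, and (tag, start, end) mark representation are all hypothetical.

```python
# Illustrative sketch of locating recordings by a saved speaker tag.
# Each recording maps to a list of (tag, start, end) speaker marks.

def find_recordings(recordings, tag):
    """Return the names of recordings containing the given tag."""
    return [name for name, marks in recordings.items()
            if any(t == tag for t, _, _ in marks)]

library = {
    "lecture1.wav": [("Teacher Zhang", 0, 120), ("speaker B", 120, 300)],
    "meeting.wav": [("speaker C", 0, 60)],
}
assert find_recordings(library, "Teacher Zhang") == ["lecture1.wav"]
```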
  • Before recordings can be marked in this way, a voiceprint database needs to be created from sample sounds.
  • FIG. 10 is a schematic flowchart of a method for establishing a voiceprint database in Embodiment 1 of the present application; the method includes:
  • Step 301 Analyze the sample sound, and extract the voiceprint feature parameter of the sample sound.
  • each recorded sound of the recording APP before the current recording is taken as the sample sound.
  • The recording APP analyzes the sample sound and extracts the voiceprint feature parameters of the sample sound, including sound energy, formants, MFCC, and LPC.
  • Step 302 Perform voiceprint clustering training according to the voiceprint feature parameters of the sample sound to generate a sample voiceprint template.
  • When the voiceprint feature parameters of the sample sound show similarity within a preset time period, voiceprint clustering training is performed on them to generate a sample voiceprint template.
  • If it is determined that the voiceprint feature parameters show no similarity, they are buffered first; once the buffered parameters are judged to have similarity, voiceprint clustering training is performed on them to generate a sample voiceprint template.
  • For example, if a sample sound contains 5 speakers, then after voiceprint clustering training the 5 speakers can be identified as speakers A, B, C, D, and E, and a corresponding sample voiceprint template is generated for each of the 5 speakers.
  • Step 303 Generate corresponding sample tag information for the sample voiceprint template.
  • the speakers A, B, C, D, and E can be marked using a left oblique line, a right oblique line, a horizontal line, a vertical line, and a grid.
  • Step 304 Generate the voiceprint database by using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
  • In this embodiment, the voiceprint template generated after each voiceprint clustering training of a recording is saved in the voiceprint database as a sample voiceprint template, and its marking information and the mapping relationship between the two are also saved, updating the voiceprint database. In this way, when a recording of the same speaker is encountered again, the recording APP can quickly mark that speaker's recording through voiceprint analysis, improving the convenience of recording marking.
  • FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application.
  • the device comprises: a marking module 11, an obtaining module 12, a selecting module 13 and an editing module 14.
  • the marking module 11 is configured to perform acoustic wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
  • the obtaining module 12 is configured to obtain an editing instruction for editing the current recording, and the editing instruction carries the marking information of the to-be-edited segment and the editing mode.
  • the selecting module 13 is configured to select a segment to be edited from the current recording after the marking according to the marking information.
  • The editing module 14 is configured to edit the segment to be edited according to the editing mode.
  • An optional configuration of the marking module 11 in the second embodiment includes: an extracting unit 111, a training unit 112, a determining unit 113, an obtaining unit 114, a marking unit 115, a generating unit 116, an establishing unit 117, and a receiving unit 118.
  • the extracting unit 111 is configured to collect the current recording and extract the voiceprint feature parameters from the current recording.
  • the training unit 112 is configured to perform voiceprint clustering training on the voiceprint parameters to obtain a target voiceprint template of the voiceprint parameters.
  • the determining unit 113 is configured to determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
  • the obtaining unit 114 is configured to acquire target tag information corresponding to the target voiceprint template from the voiceprint database when the determination result of the determining unit is YES.
  • the marking unit 115 is configured to mark the current recording with the target marking information.
  • the generating unit 116 is configured to generate target tag information corresponding to the target voiceprint template when the result of the determining unit 113 is NO.
  • The establishing unit 117 is configured to establish a mapping relationship between the target voiceprint template and the target marking information after the marking unit 115 marks the current recording with the target marking information, and to store the mapping relationship in the voiceprint database.
  • the extracting unit 111 is further configured to analyze the sample sound to extract the voiceprint feature parameter of the sample sound before acquiring the current recording and extracting the voiceprint feature parameter from the current recording.
  • the training unit 112 is further configured to generate a sample voiceprint template by performing voiceprint clustering training according to the voiceprint feature parameter of the sample sound.
  • the generating unit 116 is further configured to generate corresponding sample tag information for the sample voiceprint template.
  • the establishing unit 117 is further configured to generate a voiceprint database by using a sample voiceprint template, sample mark information, and a mapping relationship between the sample voiceprint template and the sample mark information.
  • Optionally, the training unit 112 is specifically configured to acquire the voiceprint feature parameters of the sample sound within a preset time period and, when the voiceprint feature parameters of the sample sound are similar within the preset time period, perform voiceprint clustering training on them to generate a sample voiceprint template.
  • the receiving unit 118 is configured to receive the remark information sent by the user through the terminal after the marking unit 115 marks the current recording by using the target tag information.
  • The marking unit 115 is further configured to use the remark information to remark the current recording.
  • The establishing unit 117 is further configured to update the remark information into the target marking information in the voiceprint database.
  • Optionally, the obtaining module 12 is specifically configured to detect a first click operation performed on the mark corresponding to at least one segment to be edited included in the currently recorded waveform pattern, detect a second click operation performed on the target editing mode used to edit the segment, and generate an editing instruction according to the detected first click operation and second click operation.
  • Optionally, the obtaining module 12 is specifically configured to detect a first click operation performed by selecting at least one tag from the tag list included in the current recording (the selected tag indicating the segment to be edited), detect a second click operation performed on the target editing mode for the segment to be edited, and generate an editing instruction according to the detected first click operation and second click operation.
  • the function modules of the recording apparatus provided in this embodiment can be used to execute the flow of the recording editing method shown in the above embodiments.
  • the specific working principle is not described here. For details, refer to the description of the method embodiments.
  • In summary, the recording device performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marking information of the segment to be edited and the editing mode; the segment to be edited is obtained from the marked current recording according to the marking information and edited according to the editing mode.
  • The current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, saving editing time and improving the user experience.
  • FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application.
  • the recording apparatus of the embodiment of the present application includes a memory 61, one or more processors 62, and one or more programs 63.
  • the one or more programs 63, when executed by the one or more processors 62, perform the recording editing method of any of the above-described embodiments.
  • the recording apparatus of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, where the editing instruction carries the marking information of the to-be-edited segment and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marking information and is edited according to the editing mode.
  • the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the clip to be edited can be quickly located, which saves editing time and improves user experience.
  • FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application. As shown in Figure 14,
  • the computer program product 71 for recording editing of the embodiment of the present application may include a signal bearing medium 72.
  • Signal bearing medium 72 may include one or more instructions 73 that, when executed by, for example, a processor, may provide the functionality described above with respect to Figures 1-12.
  • the instructions 73 can include: one or more instructions for performing sound wave analysis on the current recording and marking the current recording based on the result of the sound wave analysis; one or more instructions for obtaining an editing instruction to edit the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode; one or more instructions for selecting the to-be-edited segment from the marked current recording according to the marking information; and one or more instructions for editing the to-be-edited segment according to the editing mode.
  • the recording device can perform one or more of the steps shown in FIG. 1 in response to the instructions 73.
  • signal bearing medium 72 can include computer readable media 74 such as, but not limited to, a hard disk drive, a compact disk (CD), a digital versatile disk (DVD), a digital tape, a memory, and the like.
  • the signal bearing medium 72 can include a recordable medium 75 such as, but not limited to, a memory, a read/write (R/W) CD, an R/W DVD, and the like.
  • the signal bearing medium 72 can include a communication medium 76 such as, but not limited to, a digital and/or analog communication medium (eg, fiber optic cable, waveguide, wired communication link, wireless communication link, etc.).
  • the computer program product 71 can be conveyed by an RF signal bearing medium 72 to one or more modules of the recording apparatus, where the signal bearing medium 72 is transmitted by a wireless communication medium (e.g., a wireless communication medium compliant with the IEEE 802.11 standard).
  • the computer program product of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, where the editing instruction carries the marking information of the clip to be edited and the editing mode; the clip to be edited is obtained from the marked current recording according to the marking information and is edited according to the editing mode.
  • the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the clip to be edited can be quickly located, which saves editing time and improves user experience.

Abstract

An audio recording editing method and recording device: performing sound wave analysis on a current audio recording and tagging it on the basis of the results of said sound wave analysis (101); receiving an editing instruction for editing the current audio recording, said editing instruction carrying tag information for an audio clip to be edited and an editing mode (102); on the basis of said tag information, obtaining the audio clip to be edited from the tagged current audio recording (103); and editing said audio clip according to said editing mode (104). An embodiment of the present invention tags a current recording by means of voiceprint identification and, when tagging is completed, edits the current recording on the basis of the tags, making it possible to quickly locate an audio clip to be edited, saving editing time and improving user experience.

Description

Recording editing method and recording device
This application claims priority to Chinese Patent Application No. 2015107863529, filed on November 15, 2015, which is incorporated herein by reference in its entirety.
Technical field
The embodiments of the present application relate to the field of electronic technologies, and in particular to a recording editing method and a recording device.
Background
At present, smart phones have gradually become part of people's daily lives, serving not only as everyday communication devices but also as easily portable recording devices. Through a recording application (APP) on a smart phone, a user can record and save voice information, quickly preserving speech that would be hard to memorize directly, and can replay the recording many times.
In general, recording files made by users often contain unwanted segments, which both take up space and prevent the user from finding the information actually needed. Existing recording APPs let the user edit a recording file according to its actual content, but this requires the user to replay the file repeatedly to determine the content to be edited. Such an editing approach obviously takes up much of the user's time, resulting in a poor user experience.
Summary of the invention
The embodiments of the present application provide a recording editing method and a recording device, so as to solve the problem that existing recording editing wastes the user's time and degrades the user experience.
To achieve the above object, an embodiment of the present application provides a recording editing method, including:
performing sound wave analysis on a current recording and marking the current recording according to the sound wave analysis result;
receiving an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode;
selecting the to-be-edited segment from the marked current recording according to the marking information; and
editing the to-be-edited segment according to the editing mode.
To achieve the above object, an embodiment of the present application provides a recording apparatus, including:
a marking module, configured to perform sound wave analysis on a current recording and mark the current recording according to the sound wave analysis result;
an obtaining module, configured to obtain an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode;
a selecting module, configured to select the to-be-edited segment from the marked current recording according to the marking information; and
an editing module, configured to edit the to-be-edited segment according to the editing mode.
In another aspect, an embodiment of the present application provides a recording apparatus including a memory, one or more processors, and one or more programs, where the one or more programs, when executed by the one or more processors, perform the following operations: performing sound wave analysis on a current recording and marking the current recording according to the sound wave analysis result; obtaining an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode; selecting the to-be-edited segment from the marked current recording according to the marking information; and editing the to-be-edited segment according to the editing mode.
In another aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions that, in response to execution, cause a recording device to perform operations including: performing sound wave analysis on a current recording and marking the current recording according to the sound wave analysis result; obtaining an editing instruction for editing the current recording, where the editing instruction carries marking information of a segment to be edited and an editing mode; selecting the to-be-edited segment from the marked current recording according to the marking information; and editing the to-be-edited segment according to the editing mode.
In the recording editing method and recording device of the embodiments of the present application, sound wave analysis is performed on the current recording, the current recording is marked according to the sound wave analysis result, and an editing instruction for editing the current recording is received, where the editing instruction carries the marking information of the segment to be edited and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marking information, and is edited according to the editing mode. In the embodiments of the present application, the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, which saves editing time and improves user experience.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a recording editing method according to Embodiment 1 of the present application;
FIG. 2 is a first schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 3 is a second schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 4 is a third schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 5 is a fourth schematic diagram of an application example of the recording editing method according to Embodiment 1 of the present application;
FIG. 6 is a schematic flowchart of a recording marking method in Embodiment 1 of the present application;
FIG. 7 is a first schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
FIG. 8 is a second schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
FIG. 9 is a third schematic diagram of an application example of the recording marking method in Embodiment 1 of the present application;
FIG. 10 is a schematic flowchart of a method for establishing a voiceprint database in Embodiment 1 of the present application;
FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application;
FIG. 12 is a schematic structural diagram of a marking module in Embodiment 2 of the present application;
FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application;
FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application.
Detailed description
The recording editing method and the recording apparatus provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Embodiment 1
As shown in FIG. 1, which is a schematic flowchart of the recording editing method according to Embodiment 1 of the present application, the recording editing method includes:
Step 101: Perform sound wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
The user can open the recording function of a recording APP downloaded on a smart phone through the phone's user interface; the recording APP then starts capturing the current recording, and may preprocess the sound during capture. Sound wave analysis is performed on the captured recording to obtain a sound wave analysis result that includes sound wave feature parameters. Since a speaker's voiceprint is unique, the voiceprint can be used as the distinguishing feature of a speaker, and the current recording can therefore be marked according to the sound wave feature parameters. The sound wave feature parameters include: sound energy, formants, Mel-frequency cepstrum coefficients (MFCC), and Linear Prediction Coefficients (LPC).
As shown in FIG. 2, which is a schematic diagram of an application example of this embodiment, suppose a recording contains five speakers; speakers A, B, C, D, and E are marked with a left slash, right slash, horizontal line, vertical line, and grid pattern respectively. When speaker A gives two statements separated by other speakers in the recording, both segments are marked with the left slash to show that they are recording passages of the same speaker. To let the user see the speaker distinctions more intuitively, different colors can be used instead, for example red, yellow, blue, green, and purple for speakers A, B, C, D, and E; then speaker A's two separated statements would both be marked red to show that they are recording passages of the same speaker.
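The per-speaker marking described above can be sketched as a mapping from speaker-identified segments to display marks, so that a speaker whose statements are separated by other speakers still receives one consistent mark. This is an illustrative sketch only, not the patented implementation; the segment tuple layout, the `MARKS` palette, and the function name are assumptions for illustration:

```python
# Hypothetical sketch: one display mark per distinct speaker.
MARKS = ["left-slash", "right-slash", "horizontal", "vertical", "grid"]

def mark_segments(segments):
    """segments: list of (start_sec, end_sec, speaker_id) tuples.
    Returns a list of (start_sec, end_sec, speaker_id, mark) tuples,
    reusing the same mark whenever the same speaker reappears."""
    mark_of = {}
    marked = []
    for start, end, speaker in segments:
        if speaker not in mark_of:
            mark_of[speaker] = MARKS[len(mark_of) % len(MARKS)]
        marked.append((start, end, speaker, mark_of[speaker]))
    return marked

# Speaker "A" speaks twice, separated by "B": both segments share one mark.
demo = mark_segments([(0, 5, "A"), (5, 9, "B"), (9, 12, "A")])
```

A display layer would then render each segment of the waveform graphic with its assigned mark (or color).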
Step 102: Obtain an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode.
Further, after the current recording has been marked, the user can see the marked recording on the terminal's display interface, and can then send an editing instruction to the recording APP through the terminal based on the marks. The editing instruction carries the marking information of the segment to be edited and the editing mode for that segment. The editing mode can include cutting the selected segment, merging multiple selected segments, or deleting the selected segment.
In this embodiment, obtaining the editing instruction for editing the current recording includes:
First, the user can select the corresponding segment to be edited by clicking, through the terminal, at least one mark included in the waveform graphic of the current recording. Specifically, after the user clicks on a mark, the recording APP detects a first click operation on the mark corresponding to at least one to-be-edited segment included in the waveform graphic. Further, after the to-be-edited segment is selected, the user can choose, from the editing modes displayed on the terminal's interface, a target editing mode for editing the segment. Specifically, after the user clicks on the target editing mode, the recording APP detects a second click operation on the target editing mode for the to-be-edited segment. When the first click operation and the second click operation have been detected, an editing instruction is generated according to them.
Optionally, obtaining the editing instruction for editing the current recording includes:
First, the user can select at least one tag from the tag list of the current recording displayed on the terminal, where the selected tag indicates the segment to be edited. Specifically, after the user clicks on a tag in the list, the recording APP detects a first click operation performed by selecting the tag. Further, after the to-be-edited segment is selected, the user can choose, from the editing modes displayed on the terminal's interface, a target editing mode for editing the segment; after the user clicks on it, the recording APP detects a second click operation on the target editing mode. When the first click operation and the second click operation have been detected, an editing instruction is generated according to them.
Step 103: Obtain the to-be-edited segment from the marked current recording according to the marking information.
Step 104: Edit the to-be-edited segment according to the editing mode.
After receiving the editing instruction, the recording APP obtains the marking information of the to-be-edited segment from the instruction, and then selects the to-be-edited segment from the current recording according to that information. The recording APP also obtains from the instruction the editing mode for the segment, for example cutting, merging, or deleting it. Once the to-be-edited segment has been obtained, the recording APP edits it according to the indicated editing mode.
As shown in FIG. 3, which is a schematic diagram of an application example of this embodiment, when the user edits a recording file that has been marked through voiceprint analysis, the user can clearly see that the waveform graphic of the recording is distinguished by different marks. By clicking a mark on the waveform graphic, the user selects the corresponding segment as the segment to be edited. As shown in FIG. 3, the user has clicked to select the segment marked with a horizontal line as the segment to be edited. After selecting it, the user can click a target editing mode for that segment in the editing menu, for example "cut selected segment". These two click operations generate an editing instruction for editing the segment, and the segment can then be cut out according to that instruction.
As shown in FIG. 4, which is a schematic diagram of an application example of this embodiment, a tag list for the recording is provided to the user below the recording's waveform graphic, and the user can select a tag directly from the list, thereby selecting all segments of the speaker that the tag represents. For example, suppose a recording has three speakers, with a left slash, right slash, and horizontal line marking speakers A, B, and C respectively. Speaker A gives two statements separated by other speakers, both marked with the left slash. When the user clicks the left-slash option in the tag list, both segments are selected at once; the user can click any segment to deselect it or keep it selected. After selecting multiple segments, if the user wants to merge them, the user can click "merge selected segments" in the editing-mode list as the target editing mode. Once the click operations are complete, the recording APP obtains the editing instruction and merges the selected segments into a new segment.
As shown in FIG. 5, which is a schematic diagram of an application example of this embodiment, the user can also select multiple tag options from the tag list. In FIG. 5 the user has selected both the left-slash and right-slash options, so all recording segments of speaker A and speaker B are selected. Finally, clicking "merge selected segments" merges the selected segments into a new segment. Further, the user can pick out part of the conversation from all the selected segments for merging.
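The cut, merge, and delete behaviors walked through in the examples above can be sketched as one dispatch over mark-selected segments. This is an illustrative sketch under assumptions: segments are modeled as (mark, audio_chunk) pairs and audio chunks as plain lists, which is not the application's actual audio representation:

```python
def apply_edit(segments, instruction_marks, mode):
    """segments: list of (mark, audio_chunk) pairs; a segment is selected
    when its mark is in instruction_marks. Supported modes, per the
    description above: "delete", "cut", "merge"."""
    selected = [s for s in segments if s[0] in instruction_marks]
    rest = [s for s in segments if s[0] not in instruction_marks]
    if mode == "delete":
        return rest, None
    if mode == "cut":                      # remove selected, return them as a clip
        return rest, [chunk for _, chunk in selected]
    if mode == "merge":                    # combine selected into one new clip
        merged = []
        for _, chunk in selected:
            merged.extend(chunk)
        return segments, [merged]          # original kept, new merged clip produced
    raise ValueError("unknown edit mode: %s" % mode)

segs = [("left", [1, 2]), ("right", [3]), ("left", [4])]
remaining, clip = apply_edit(segs, {"left"}, "merge")
remaining_after_delete, _ = apply_edit(segs, {"right"}, "delete")
```

Selecting the "left" tag picks up both of speaker A's separated segments at once, matching the FIG. 4 example.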
In the recording editing method provided by this embodiment, sound wave analysis is performed on the current recording, the current recording is marked according to the sound wave analysis result, and an editing instruction for editing the current recording is received, where the editing instruction carries the marking information of the segment to be edited and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marking information and is edited according to the editing mode. In this embodiment, the current recording is marked by voiceprint recognition, and after marking is completed the current recording is edited based on the marks, so that the segment to be edited can be quickly located, which saves editing time and improves user experience.
Before the current recording is edited in Embodiment 1, the current recording first needs to be marked; the specific process of step 101 in Embodiment 1 is shown in FIG. 6, which is a schematic flowchart of the recording marking method in Embodiment 1 of the present application. The recording marking method includes the following steps:
Step 201: Capture the current recording and extract voiceprint feature parameters from it.
The user can open the recording function of the recording APP downloaded on a smart phone through the phone's user interface; the recording APP then starts capturing the current recording, and may preprocess the sound during capture, for example by framing, windowing, and filtering the captured data.
Further, feature analysis is performed on the captured recording to obtain its sound wave feature parameters, which include: sound energy, formants, MFCC, and LPC.
Step 202: Perform voiceprint clustering training on the voiceprint feature parameters to obtain the target voiceprint template.
In this embodiment, a voiceprint clustering trainer is provided in order to identify the recording's template. After the voiceprint feature parameters are obtained, voiceprint clustering training is performed on them by the trainer to obtain the target voiceprint template corresponding to the current recording.
Step 203: Determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
In this embodiment, voiceprint clustering training is performed on sample sounds by the trainer to obtain sample voiceprint templates, and a voiceprint database built from these sample templates is preset and stored in the recording APP. The voiceprint database generally stores multiple sample voiceprint templates to support recording marking during the recording process. After obtaining the target voiceprint template, the recording APP can search the voiceprint database to determine whether the target voiceprint template exists in it.
If the result of the determination is yes, step 204 is performed; otherwise, step 205 is performed.
Step 204: Obtain the target marking information corresponding to the target voiceprint template from the voiceprint database.
The voiceprint database stores not only the sample voiceprint templates but also the marking information corresponding to them, generally one piece of marking information per sample template. When the sample voiceprint template matching the target voiceprint template is found in the database, the target marking information corresponding to the target voiceprint template can be obtained.
Step 205: Generate the target marking information corresponding to the target voiceprint template.
After recognizing that the target voiceprint template does not exist in the voiceprint database, the recording APP can set target marking information for the template, so as to mark the target voiceprint template with that information.
Step 206: Mark the current recording with the target marking information.
After the target marking information is obtained, the recording APP automatically marks the current recording with it.
The recording marking method in this embodiment identifies the voiceprint template corresponding to the current recording through voiceprint recognition, obtains the marking information corresponding to the current recording from the established voiceprint database, and then marks the current recording, achieving automatic recording marking and saving the user the time of adding marks manually.
For a schematic diagram of an application example of the recording marking method, see FIG. 2 in this embodiment; details are not repeated here.
步骤207、建立所述目标声纹模板与所述目标标记信息之间映射关系并存储在所述声纹数据库中。Step 207: Establish a mapping relationship between the target voiceprint template and the target tag information, and store the data in the voiceprint database.
步骤208、接收用户通过终端发送的备注信息。Step 208: Receive remark information sent by the user through the terminal.
步骤209、使用所述备注信息对所述当前录音进行备注。Step 209: Remark the current recording by using the remark information.
步骤210、将所述备注信息更新到所述声纹数据库中所述目标标记信息中。Step 210: Update the remark information into the target marker information in the voiceprint database.
接收用户通过终端发送的备注信息，备注信息可以为当前录音的来源名称，在终端获取到备注信息后，指示录音APP使用该备注信息对当前录音进行备注。例如，录音APP可以为当前录音对应的位置添加一个标签。进一步地，录音APP还可以将获取到的备注信息更新到声纹数据库中与当前录音对应的目标声纹模板对应的目标标记信息中，以便录制的声音为当前录音对应的音源时可以再次被调用。The remark information sent by the user through the terminal is received; the remark information may be the name of the source of the current recording. After the terminal obtains the remark information, it instructs the recording APP to annotate the current recording with the remark information. For example, the recording APP may add a label at the position corresponding to the current recording. Further, the recording APP may also update the obtained remark information into the target marker information corresponding to the target voiceprint template of the current recording in the voiceprint database, so that the remark can be invoked again when a later recorded sound comes from the same source as the current recording.
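Steps 208-210 above amount to writing the remark in two places: onto the recording itself and back into the stored marker information. A minimal sketch, with the field names (`"remark"`) and the tag-set representation assumed for illustration:

```python
# Hedged sketch of steps 208-210: the remark from the terminal annotates
# the current recording and is folded back into the marker info stored
# for that speaker's template, so it can be reused later.

def apply_remark(marker_info, recording_tags, remark):
    recording_tags.add(remark)      # step 209: remark the current recording
    marker_info["remark"] = remark  # step 210: update stored marker info
    return marker_info

tags = {"speaker A"}
info = apply_remark({"mark": "left-slash"}, tags, "张老师")
assert "张老师" in tags
assert info == {"mark": "left-slash", "remark": "张老师"}
```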
如图7所示，其为本实施例的应用示例示意图，当录音APP对当前录音进行自动标记后，用户可以通过终端向录音APP发送备注信息，用于给这段录音中每位说话人添加备注信息。比如，用户可以通过录音APP将用左斜线标记的说话人A备注为“张老师”。用户为新说话人添加的备注信息，直接与该说话人的声纹信息匹配，并作为这段录音的名称。As shown in FIG. 7, which is a schematic diagram of an application example of this embodiment, after the recording APP automatically marks the current recording, the user can send remark information to the recording APP through the terminal to add a remark for each speaker in the recording. For example, through the recording APP the user can annotate speaker A, marked with a left slash, as "张老师" (Teacher Zhang). The remark information the user adds for a new speaker is directly matched to that speaker's voiceprint information and serves as the name of this recording.
如图8所示，其为本实施例的应用示例示意图，当用户新建一段录音，如果其中包含已保存声音名称的说话人的录音，在声纹分析后，这位说话人的录音段落会直接标记为已保存的标记信息。比如已经保存了之前一段录音的说话人A为“张老师”，新建一段包含这个说话人的录音不会再显示说话人A的标记，而是显示“张老师”。As shown in FIG. 8, which is a schematic diagram of an application example of this embodiment, when the user creates a new recording, if it contains speech from a speaker whose voice name has already been saved, then after voiceprint analysis that speaker's segments are directly marked with the saved marker information. For example, if speaker A in a previous recording has been saved as "Teacher Zhang", a new recording containing this speaker no longer displays the mark for speaker A but instead displays "Teacher Zhang".
如图9所示，其为本实施例的应用示例示意图，录音中包含用户保存过的讲话人对应的标记信息，按照所标记的说话人，可以更快定位需要寻找的录音。比如用户想要寻找张老师的讲课录音，只要寻找“张老师”的标签即可。As shown in FIG. 9, which is a schematic diagram of an application example of this embodiment, the recording contains the marker information corresponding to speakers the user has saved, so the desired recording can be located more quickly by the marked speaker. For example, if the user wants to find the recording of Teacher Zhang's lecture, it suffices to look for the "Teacher Zhang" label.
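Once recordings carry speaker remarks, the lookup described above is a simple filter over stored tags. The following sketch assumes a plain dictionary of recording names to tag sets; the representation is illustrative, not part of the disclosure:

```python
# Illustrative sketch of locating a recording by a saved speaker tag,
# e.g. finding all recordings labelled "张老师".

def find_recordings(recordings, remark):
    """Return the names of recordings whose tag set contains `remark`."""
    return [name for name, tags in recordings.items() if remark in tags]

recordings = {
    "lecture1": {"张老师", "speaker B"},
    "meeting":  {"speaker C"},
    "lecture2": {"张老师"},
}
assert sorted(find_recordings(recordings, "张老师")) == ["lecture1", "lecture2"]
```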
在步骤201采集当前录音并从所述当前录音中提取声纹特征参数之前，还需要通过样本声音建立一个声纹数据库。Before the current recording is collected in step 201 and the voiceprint feature parameters are extracted from it, a voiceprint database needs to be built from sample sounds.
如图10所示,其为本申请实施例一中的声纹数据库建立方法的流程示意图,该声纹数据库建立方法包括:As shown in FIG. 10, it is a schematic flowchart of a method for establishing a voiceprint database in Embodiment 1 of the present application, and the method for establishing a voiceprint database includes:
步骤301、对样本声音进行分析,提取所述样本声音的所述声纹特征参数。Step 301: Analyze the sample sound, and extract the voiceprint feature parameter of the sample sound.
本实施例中，将录音APP在当前录音之前的每次录制的声音作为样本声音。在获取到每次录音后，录音APP会对录音的样本声音进行分析，提取出该样本声音的声纹特征参数，其中声纹特征参数包括：声音的能量、共振峰、MFCC以及LPC等。In this embodiment, each sound recorded by the recording APP before the current recording is taken as a sample sound. After each recording is obtained, the recording APP analyzes the sample sound of the recording and extracts its voiceprint feature parameters, where the voiceprint feature parameters include the energy of the sound, formants, MFCC (Mel-frequency cepstral coefficients), LPC (linear prediction coefficients), and the like.
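Of the listed parameters, short-time energy is the simplest to illustrate. The sketch below computes per-frame energy from raw samples with plain Python; real systems would additionally compute formants, MFCC and LPC coefficients with a signal-processing library, and the frame length chosen here is an arbitrary assumption:

```python
# A minimal sketch of extracting one voiceprint feature parameter --
# short-time frame energy -- over non-overlapping frames of samples.

def frame_energy(samples, frame_len):
    """Sum of squared samples per complete non-overlapping frame."""
    return [sum(x * x for x in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

assert frame_energy([0, 1, -1, 0, 2, 2, 0, 0], 4) == [2, 8]
```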
步骤302、根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板。Step 302: Perform voiceprint clustering training according to the voiceprint feature parameters of the sample sound to generate a sample voiceprint template.
为了对获取到的样本声音的声纹特征参数进行声纹聚类训练，需要进一步确定该声纹特征参数是否为同一个音源的声音，具体地，获取预设时间段内的所述样本声音的所述声纹特征参数，当所述预设时间内的所述样本声音的所述声纹特征参数具有相似性时，对所述样本声音的所述声纹特征参数进行声纹聚类训练生成所述样本声纹模板。如果确定出样本声音的声纹特征参数不具有相似性，则需要将声纹特征参数进行缓存，再判断出该声纹特征参数具有相似性之后，对声纹特征参数进行声纹聚类训练生成样本声纹模板。In order to perform voiceprint clustering training on the acquired voiceprint feature parameters of the sample sound, it must first be determined whether the feature parameters come from the same sound source. Specifically, the voiceprint feature parameters of the sample sound within a preset time period are acquired; when the voiceprint feature parameters of the sample sound within the preset time are similar, voiceprint clustering training is performed on them to generate the sample voiceprint template. If it is determined that the voiceprint feature parameters of the sample sound are not similar, the feature parameters are buffered first, and only after the feature parameters are judged to be similar is voiceprint clustering training performed on them to generate the sample voiceprint template.
比如，有一段录音中有5个说话人，这5个说话人的声音就可以作为样本声音，在通过声纹聚类训练后，可以识别出这5个说话人分别为说话人A、B、C、D、E，并为5个说话人生成相应的样本声纹模板。For example, if a recording contains 5 speakers, these 5 speakers' voices can serve as the sample sounds. After voiceprint clustering training, the 5 speakers can be identified as speakers A, B, C, D and E, and a corresponding sample voiceprint template is generated for each of them.
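The buffer-until-similar logic described above can be sketched as follows. The cosine-similarity measure, the 0.9 threshold, and averaging the feature vectors into a template are all illustrative assumptions; the patent does not specify the similarity metric or the clustering algorithm.

```python
# Hedged sketch of similarity-gated template training: feature vectors
# are only merged into a template once the buffered vectors are mutually
# similar; otherwise they remain buffered (the function returns None).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def train_template(buffered, threshold=0.9):
    """Average the buffered vectors into a template if they agree, else None."""
    if all(cosine(buffered[0], v) >= threshold for v in buffered[1:]):
        n = len(buffered)
        return [sum(col) / n for col in zip(*buffered)]
    return None  # keep buffering until similarity is reached

assert train_template([[1.0, 0.0], [0.0, 1.0]]) is None          # dissimilar
assert train_template([[1.0, 0.0], [1.0, 0.0]]) == [1.0, 0.0]    # similar
```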
步骤303、为所述样本声纹模板生成对应的样本标记信息。Step 303: Generate corresponding sample tag information for the sample voiceprint template.
在生成样本声纹模板后，为样本声音生成对应的样本标记信息，例如同一个说话人使用相同的标记进行标记。本实施例中，可以使用左斜线、右斜线、横线、竖线以及网格分别标记说话人A、B、C、D、E。After the sample voiceprint templates are generated, corresponding sample marker information is generated for the sample sounds; for example, the same speaker is always marked with the same marker. In this embodiment, a left slash, a right slash, a horizontal line, a vertical line and a grid pattern can be used to mark speakers A, B, C, D and E, respectively.
步骤304、使用所述样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。Step 304: Generate the voiceprint database by using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
为了提高对录音标记的快捷性，本实施例中，使用样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。每次对录音进行声纹聚类训练后生成的声纹模板都会作为样本声纹模板保存到声纹数据库中，而且该样本声纹模板的标记信息以及两者之间的映射关系也会保存到声纹数据库中，以对声纹数据库进行更新。这样当再次遇到同一说话人的录音时，录音APP通过声纹分析，能够很迅速地对该说话人的录音进行标记，提高了录音标记的便捷性。To make recording marking faster, in this embodiment the voiceprint database is generated from the sample voiceprint templates, the sample marker information, and the mapping relationships between the sample voiceprint templates and the sample marker information. The voiceprint template generated after each round of voiceprint clustering training on a recording is saved into the voiceprint database as a sample voiceprint template, together with its marker information and the mapping between the two, thereby keeping the voiceprint database up to date. In this way, when a recording of the same speaker is encountered again, the recording APP can quickly mark that speaker's segments through voiceprint analysis, improving the convenience of recording marking.
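Step 304 can be summarised as a store keyed by template with the marker as the mapped value. A minimal sketch, where the class name, method names and string template identifiers are assumptions for illustration only:

```python
# Illustrative sketch of step 304: the database holds sample templates,
# their marker info, and the template-to-marker mapping, and is updated
# after every training pass so a returning speaker is recognised at once.

class VoiceprintDatabase:
    def __init__(self):
        self._markers = {}            # sample template -> marker info

    def add(self, template, marker):
        """Store a sample template together with its marker information."""
        self._markers[template] = marker

    def lookup(self, template):
        """Return the stored marker info, or None for a new speaker."""
        return self._markers.get(template)

db = VoiceprintDatabase()
db.add("speaker_A", {"mark": "left-slash", "remark": "张老师"})
assert db.lookup("speaker_A")["remark"] == "张老师"
assert db.lookup("speaker_F") is None
```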
实施例二 Embodiment 2
如图11所示,其为本申请实施例二的录音装置的结构示意图。该装置包括:标记模块11、获取模块12、选取模块13和编辑模块14。FIG. 11 is a schematic structural diagram of a recording apparatus according to Embodiment 2 of the present application. The device comprises: a marking module 11, an obtaining module 12, a selecting module 13 and an editing module 14.
其中,标记模块11,用于对当前录音进行声波分析并根据声波分析结果对当前录音进行标记。The marking module 11 is configured to perform acoustic wave analysis on the current recording and mark the current recording according to the sound wave analysis result.
获取模块12,用于获取对当前录音进行编辑的编辑指令,编辑指令中携带待编辑片段的标记信息以及编辑方式。The obtaining module 12 is configured to obtain an editing instruction for editing the current recording, and the editing instruction carries the marking information of the to-be-edited segment and the editing mode.
选取模块13,用于根据标记信息从标记后的当前录音中选取出待编辑片段。The selecting module 13 is configured to select a segment to be edited from the current recording after the marking according to the marking information.
编辑模块14,用于按照编辑方式对待编辑片段进行编辑。The editing module 14 is configured to edit the edited segments according to the editing mode.
如图12所示，为本实施例二中标记模块11的一种可选的结构方式，包括：提取单元111、训练单元112、判断单元113、获取单元114、标记单元115、生成单元116、建立单元117和接收单元118。As shown in FIG. 12, an optional structure of the marking module 11 in the second embodiment includes: an extracting unit 111, a training unit 112, a determining unit 113, an obtaining unit 114, a marking unit 115, a generating unit 116, an establishing unit 117 and a receiving unit 118.
其中,提取单元111,用于采集当前录音并从当前录音中提取声纹特征参数。The extracting unit 111 is configured to collect the current recording and extract the voiceprint feature parameters from the current recording.
训练单元112,用于对声纹参数进行声纹聚类训练得到声纹参数的目标声纹模板。The training unit 112 is configured to perform voiceprint clustering training on the voiceprint parameters to obtain a target voiceprint template of the voiceprint parameters.
判断单元113,用于判断目标声纹模板是否为声纹数据库中的声纹模板。The determining unit 113 is configured to determine whether the target voiceprint template is a voiceprint template in the voiceprint database.
获取单元114,用于在判断单元的判断结果为是时,从声纹数据库中获取与目标声纹模板对应的目标标记信息。The obtaining unit 114 is configured to acquire target tag information corresponding to the target voiceprint template from the voiceprint database when the determination result of the determining unit is YES.
标记单元115,用于使用目标标记信息对当前录音进行标记。The marking unit 115 is configured to mark the current recording with the target marking information.
生成单元116,用于在判断单元113的结果为否时,生成与目标声纹模板对应的目标标记信息。The generating unit 116 is configured to generate target tag information corresponding to the target voiceprint template when the result of the determining unit 113 is NO.
其中，建立单元117，用于在标记单元115使用目标标记信息对当前录音进行标记之后，建立目标声纹模板与目标标记信息之间的映射关系并存储在声纹数据库中。The establishing unit 117 is configured to, after the marking unit 115 marks the current recording with the target marker information, establish a mapping relationship between the target voiceprint template and the target marker information and store the mapping in the voiceprint database.
进一步地,提取单元111,还用于在采集当前录音并从当前录音中提取声纹特征参数之前,对样本声音进行分析提取样本声音的声纹特征参数。 Further, the extracting unit 111 is further configured to analyze the sample sound to extract the voiceprint feature parameter of the sample sound before acquiring the current recording and extracting the voiceprint feature parameter from the current recording.
训练单元112,还用于根据样本声音的声纹特征参数进行声纹聚类训练生成样本声纹模板。The training unit 112 is further configured to generate a sample voiceprint template by performing voiceprint clustering training according to the voiceprint feature parameter of the sample sound.
生成单元116,还用于为样本声纹模板生成对应的样本标记信息。The generating unit 116 is further configured to generate corresponding sample tag information for the sample voiceprint template.
建立单元117,还用于使用样本声纹模板、样本标记信息以及样本声纹模板与样本标记信息之间的映射关系生成声纹数据库。The establishing unit 117 is further configured to generate a voiceprint database by using a sample voiceprint template, sample mark information, and a mapping relationship between the sample voiceprint template and the sample mark information.
进一步地，训练单元112，具体用于获取预设时间段内的样本声音的声纹特征参数，在预设时间内的样本声音的声纹特征参数具有相似性时，对样本声音的声纹特征参数进行声纹聚类训练生成样本声纹模板。Further, the training unit 112 is specifically configured to acquire the voiceprint feature parameters of the sample sound within a preset time period, and when the voiceprint feature parameters of the sample sound within the preset time are similar, perform voiceprint clustering training on them to generate the sample voiceprint template.
其中,接收单元118,用于在标记单元115使用目标标记信息对当前录音进行标记之后,接收用户通过终端发送的备注信息。The receiving unit 118 is configured to receive the remark information sent by the user through the terminal after the marking unit 115 marks the current recording by using the target tag information.
标记单元115,还用于使用备注信息对当前录音进行备注。The marking unit 115 is further configured to use the comment information to comment on the current recording.
建立单元117，还用于将备注信息更新到声纹数据库中目标标记信息中。The establishing unit 117 is further configured to update the remark information into the target marker information in the voiceprint database.
进一步地，获取模块12，具体用于检测对当前录音的波形图形所包含的至少一个待编辑片段对应的标记进行的第一点击操作，并检测对待编辑片段所采用的目标编辑方式进行的第二点击操作，以及根据检测到的第一点击操作和第二点击操作生成编辑指令。Further, the obtaining module 12 is specifically configured to detect a first click operation on the mark corresponding to at least one to-be-edited segment contained in the waveform graph of the current recording, detect a second click operation on the target editing mode to be applied to the to-be-edited segment, and generate the editing instruction according to the detected first click operation and second click operation.
可选地，获取模块12，具体用于检测从当前录音所包含的标记列表中选取至少一个标记进行的第一点击操作，选取的标记用于指示出待编辑片段；并检测对待编辑片段所采用的目标编辑方式进行的第二点击操作，以及根据检测到的第一点击操作和第二点击操作生成编辑指令。Optionally, the obtaining module 12 is specifically configured to detect a first click operation that selects at least one mark from the mark list of the current recording, the selected mark indicating the to-be-edited segment; detect a second click operation on the target editing mode to be applied to the to-be-edited segment; and generate the editing instruction according to the detected first click operation and second click operation.
本实施例提供的录音装置的各功能模块可用于执行上述实施例中所示的录音编辑方法的流程,其具体工作原理不再赘述,详见方法实施例的描述。The function modules of the recording apparatus provided in this embodiment can be used to execute the flow of the recording editing method shown in the above embodiments. The specific working principle is not described here. For details, refer to the description of the method embodiments.
本实施例提供的录音装置，通过对当前录音进行声波分析，并根据声波分析结果对当前录音进行标记，接收对当前录音进行编辑的编辑指令，编辑指令中携带待编辑片段的标记信息以及编辑方式，根据标记信息从标记后的当前录音中获取待编辑片段，按照编辑方式对待编辑片段进行编辑。本实施例通过声纹识别对当前录音进行标记，在标记完成后用户基于标记对当前录音进行编辑，从而能够快捷地定位到待编辑片段，节省了编辑时间，提升了用户感受。The recording apparatus provided in this embodiment performs sound wave analysis on the current recording, marks the current recording according to the analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; the to-be-edited segment is then obtained from the marked current recording according to the marker information and edited according to the editing mode. In this embodiment, the current recording is marked through voiceprint recognition, and after the marking is completed the user edits the current recording based on the marks, so that the to-be-edited segment can be located quickly, which saves editing time and improves user experience.
实施例三Embodiment 3
图13为本申请提供的录音装置的又一个实施例的结构示意图。如图13所示,本申请实施例的录音装置包括:存储器61、一个或多个处理器62以及一个或多个程序63。FIG. 13 is a schematic structural diagram of still another embodiment of a recording apparatus provided by the present application. As shown in FIG. 13, the recording apparatus of the embodiment of the present application includes a memory 61, one or more processors 62, and one or more programs 63.
其中，所述一个或多个程序63在由一个或多个处理器62执行时执行上述实施例中的任意一种方法。The one or more programs 63, when executed by the one or more processors 62, perform any one of the methods in the above embodiments.
本申请实施例的录音装置，通过对当前录音进行声波分析，并根据所述声波分析结果对所述当前录音进行标记，接收对所述当前录音进行编辑的编辑指令，所述编辑指令中携带待编辑片段的标记信息以及编辑方式，根据所述标记信息从标记后的所述当前录音中获取所述待编辑片段，按照所述编辑方式对所述待编辑片段进行编辑。本申请实施例通过声纹识别对当前录音进行标记，在标记完成后用户基于标记对当前录音进行编辑，从而能够快捷地定位到待编辑片段，节省了编辑时间，提升了用户感受。The recording apparatus of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marker information and edited according to the editing mode. In the embodiment of the present application, the current recording is marked through voiceprint recognition, and after the marking is completed the user edits the current recording based on the marks, so that the to-be-edited segment can be located quickly, which saves editing time and improves user experience.
实施例四Embodiment 4
图14为本申请提供的用于录音编辑的计算机程序产品一个实施例的结构示意图。如图14所示,FIG. 14 is a schematic structural diagram of an embodiment of a computer program product for recording editing provided by the present application. As shown in Figure 14,
本申请实施例的用于录音编辑的计算机程序产品71，可以包括信号承载介质72。信号承载介质72可以包括一个或更多个指令73，该指令73在由例如处理器执行时，处理器可以提供以上针对图1-12描述的功能。例如，指令73可以包括：用于对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记的一个或多个指令；用于获取对所述当前录音进行编辑的编辑指令，所述编辑指令中携带待编辑片段的标记信息以及编辑方式的一个或多个指令；用于根据所述标记信息从标记后的所述当前录音中选取出所述待编辑片段的一个或多个指令；以及用于按照所述编辑方式对所述待编辑片段进行编辑的一个或多个指令。因此，例如，参照图11，录音装置可以响应于指令73来进行图1中所示的步骤中的一个或更多个。The computer program product 71 for recording editing of the embodiment of the present application may include a signal bearing medium 72. The signal bearing medium 72 may include one or more instructions 73 that, when executed by, for example, a processor, cause the processor to provide the functionality described above with respect to FIGS. 1-12. For example, the instructions 73 may include: one or more instructions for performing sound wave analysis on the current recording and marking the current recording according to the analysis result; one or more instructions for obtaining an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; one or more instructions for selecting the to-be-edited segment from the marked current recording according to the marker information; and one or more instructions for editing the to-be-edited segment according to the editing mode. Thus, for example, referring to FIG. 11, the recording apparatus may perform one or more of the steps shown in FIG. 1 in response to the instructions 73.
在一些实现中，信号承载介质72可以包括计算机可读介质74，诸如但不限于硬盘驱动器、压缩盘（CD）、数字通用盘（DVD）、数字带、存储器等。在一些实现中，信号承载介质72可以包括可记录介质75，诸如但不限于存储器、读/写（R/W）CD、R/W DVD等。在一些实现中，信号承载介质72可以包括通信介质76，诸如但不限于数字和/或模拟通信介质（例如，光纤线缆、波导、有线通信链路、无线通信链路等）。因此，例如，计算机程序产品71可以通过RF信号承载介质72传送给多指滑动手势的识别装置的一个或多个模块，其中，信号承载介质72由无线通信介质（例如，符合IEEE 802.11标准的无线通信介质）传送。In some implementations, the signal bearing medium 72 may include a computer readable medium 74, such as but not limited to a hard disk drive, a compact disc (CD), a digital versatile disc (DVD), a digital tape, a memory, and the like. In some implementations, the signal bearing medium 72 may include a recordable medium 75, such as but not limited to a memory, a read/write (R/W) CD, an R/W DVD, and the like. In some implementations, the signal bearing medium 72 may include a communication medium 76, such as but not limited to a digital and/or analog communication medium (for example, a fiber optic cable, a waveguide, a wired communication link, a wireless communication link, etc.). Thus, for example, the computer program product 71 may be transmitted via the RF signal bearing medium 72 to one or more modules of the recognition apparatus for multi-finger swipe gestures, where the signal bearing medium 72 is conveyed by a wireless communication medium (for example, a wireless communication medium conforming to the IEEE 802.11 standard).
本申请实施例的计算机程序产品，通过对当前录音进行声波分析，并根据所述声波分析结果对所述当前录音进行标记，接收对所述当前录音进行编辑的编辑指令，所述编辑指令中携带待编辑片段的标记信息以及编辑方式，根据所述标记信息从标记后的所述当前录音中获取所述待编辑片段，按照所述编辑方式对所述待编辑片段进行编辑。本申请实施例通过声纹识别对当前录音进行标记，在标记完成后用户基于标记对当前录音进行编辑，从而能够快捷地定位到待编辑片段，节省了编辑时间，提升了用户感受。The computer program product of the embodiment of the present application performs sound wave analysis on the current recording, marks the current recording according to the sound wave analysis result, and receives an editing instruction for editing the current recording, the editing instruction carrying the marker information of the to-be-edited segment and the editing mode; the to-be-edited segment is obtained from the marked current recording according to the marker information and edited according to the editing mode. In the embodiment of the present application, the current recording is marked through voiceprint recognition, and after the marking is completed the user edits the current recording based on the marks, so that the to-be-edited segment can be located quickly, which saves editing time and improves user experience.
通过以上的实施方式的描述，本领域的技术人员可以清楚地了解到各实施方式可借助软件加必需的通用硬件平台的方式来实现，当然也可以通过硬件。基于这样的理解，上述技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来，该计算机软件产品可以存储在计算机可读存储介质中，如ROM/RAM、磁碟、光盘等，包括若干指令用以使得一台计算机设备（可以是个人计算机，服务器，或者网络设备等）执行各个实施例或者实施例的某些部分所述的方法。From the description of the above embodiments, those skilled in the art can clearly understand that the embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solutions, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments or in parts of the embodiments.
最后应说明的是：以上各实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述各实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分或者全部技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (18)

  1. 一种录音编辑方法,其特征在于,包括:A recording editing method, comprising:
    对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记;Performing sonic analysis on the current recording and marking the current recording according to the result of the acoustic analysis;
    获取对所述当前录音进行编辑的编辑指令,所述编辑指令中携带待编辑片段的标记信息以及编辑方式;Obtaining an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode;
    根据所述标记信息从标记后的所述当前录音中选取出所述待编辑片段;Selecting the to-be-edited segment from the marked current recording according to the marking information;
按照所述编辑方式对所述待编辑片段进行编辑。editing the to-be-edited segment according to the editing mode.
  2. 根据权利要求1所述的录音编辑方法,其特征在于,所述对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记,包括:The recording editing method according to claim 1, wherein the performing sound wave analysis on the current recording and marking the current recording according to the sound wave analysis result comprises:
    采集所述当前录音并从所述当前录音中提取声纹特征参数;Collecting the current recording and extracting a voiceprint feature parameter from the current recording;
    对所述声纹参数进行声纹聚类训练得到所述声纹参数的目标声纹模板;Performing a voiceprint clustering training on the voiceprint parameter to obtain a target voiceprint template of the voiceprint parameter;
    判断所述目标声纹模板是否为声纹数据库中的声纹模板;Determining whether the target voiceprint template is a voiceprint template in a voiceprint database;
    如果判断结果为是,从所述声纹数据库中获取与所述目标声纹模板对应的目标标记信息;If the determination result is yes, the target mark information corresponding to the target voiceprint template is obtained from the voiceprint database;
    使用所述目标标记信息对所述当前录音进行标记。The current recording is marked using the target tag information.
  3. 根据权利要求2所述的录音编辑方法,其特征在于,所述使用所述目标标记信息对所述当前录音进行标记之前,还包括:The recording editing method according to claim 2, wherein before the marking the current recording using the target marking information, the method further comprises:
    如果判断结果为否,生成与所述目标声纹模板对应的所述目标标记信息。If the determination result is no, the target tag information corresponding to the target voiceprint template is generated.
  4. 根据权利要求3所述的录音编辑方法,其特征在于,所述使用所述目标标记信息对所述当前录音进行标记之后,还包括:The recording editing method according to claim 3, wherein after the marking the current recording using the target marking information, the method further comprises:
    建立所述目标声纹模板与所述目标标记信息之间映射关系并存储在所述声纹数据库中。Establishing a mapping relationship between the target voiceprint template and the target marker information and storing in the voiceprint database.
5. 根据权利要求1-4任一项所述的录音编辑方法，其特征在于，所述采集当前录音并从所述当前录音中提取声纹特征参数之前，包括：The recording editing method according to any one of claims 1-4, wherein before the current recording is collected and the voiceprint feature parameters are extracted from the current recording, the method comprises:
对样本声音进行分析，提取所述样本声音的所述声纹特征参数；analyzing the sample sound, and extracting the voiceprint feature parameters of the sample sound;
    根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板;Performing a voiceprint clustering training according to the voiceprint feature parameter of the sample sound to generate a sample voiceprint template;
    为所述样本声纹模板生成对应的样本标记信息;Generating corresponding sample tag information for the sample voiceprint template;
    使用所述样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。The voiceprint database is generated using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
  6. 根据权利要求5所述的录音编辑方法,其特征在于,所述根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板包括:The recording editing method according to claim 5, wherein the generating the sample voiceprint template by performing the voiceprint clustering training according to the voiceprint feature parameter of the sample sound comprises:
    获取预设时间段内的所述样本声音的所述声纹特征参数;Obtaining the voiceprint feature parameter of the sample sound within a preset time period;
在所述预设时间内的所述样本声音的所述声纹特征参数具有相似性时，对所述样本声音的所述声纹特征参数进行声纹聚类训练生成所述样本声纹模板。when the voiceprint feature parameters of the sample sound within the preset time are similar, performing voiceprint clustering training on the voiceprint feature parameters of the sample sound to generate the sample voiceprint template.
  7. 根据权利要求1-4任一项所述的录音编辑方法,其特征在于,所述使用所述目标标记信息对所述当前录音进行标记之后,还包括:The recording editing method according to any one of claims 1 to 4, wherein after the marking of the current recording using the target tag information, the method further comprises:
    接收用户通过终端发送的备注信息;Receiving remark information sent by the user through the terminal;
    使用所述备注信息对所述当前录音进行备注;Remarking the current recording using the comment information;
将所述备注信息更新到所述声纹数据库中所述目标标记信息中。updating the remark information into the target tag information in the voiceprint database.
  8. 根据权利要求1-4任一项所述的录音编辑方法,其特征在于,所述获取对所述当前录音进行编辑的编辑指令,包括:The recording editing method according to any one of claims 1 to 4, wherein the obtaining an editing instruction for editing the current recording comprises:
    检测对所述当前录音的波形图形所包含的至少一个所述待编辑片段对应的标记进行的第一点击操作;Detecting a first click operation performed on a mark corresponding to at least one of the to-be-edited segments included in the waveform pattern of the current recording;
    检测对所述待编辑片段所采用的目标编辑方式进行的第二点击操作;Detecting a second click operation performed on the target editing mode adopted by the segment to be edited;
    根据检测到的所述第一点击操作和所述第二点击操作生成所述编辑指令。The editing instruction is generated according to the detected first click operation and the second click operation.
  9. 根据权利要求1-4任一项所述的录音编辑方法,其特征在于,所述获取对所述当前录音进行编辑的编辑指令,包括:The recording editing method according to any one of claims 1 to 4, wherein the obtaining an editing instruction for editing the current recording comprises:
检测从所述当前录音所包含的标记列表中选取至少一个标记进行的第一点击操作；所述选取的标记用于指示出所述待编辑片段；detecting a first click operation that selects at least one mark from the mark list included in the current recording, the selected mark being used to indicate the to-be-edited segment;
    检测所述待编辑片段所采用的目标编辑方式进行的第二点击操作;Detecting a second click operation performed by the target editing mode used by the to-be-edited segment;
    根据检测到的所述第一点击操作和所述第二点击操作生成所述编辑指令。The editing instruction is generated according to the detected first click operation and the second click operation.
  10. 一种录音装置,其特征在于,包括:A recording device, comprising:
    标记模块,用于对当前录音进行声波分析并根据声波分析结果对所述当前录音进行标记;a marking module, configured to perform sonic analysis on the current recording and mark the current recording according to the result of the acoustic wave analysis;
    获取模块,用于获取对所述当前录音进行编辑的编辑指令,所述编辑指令中携带待编辑片段的标记信息以及编辑方式;An obtaining module, configured to acquire an editing instruction for editing the current recording, where the editing instruction carries the marking information of the segment to be edited and the editing mode;
    选取模块,用于根据所述标记信息从标记后的所述当前录音中选取出所述待编辑片段;a selection module, configured to select the to-be-edited segment from the marked current recording according to the marking information;
    编辑模块,用于按照所述编辑方式对所述待编辑片段进行编辑。And an editing module, configured to edit the to-be-edited segment according to the editing manner.
  11. 根据权利要求10所述的录音装置,其特征在于,所述标记模块包括:The recording apparatus according to claim 10, wherein the marking module comprises:
    提取单元,用于采集所述当前录音并从所述当前录音中提取声纹特征参数;An extracting unit, configured to collect the current recording and extract a voiceprint feature parameter from the current recording;
    训练单元,用于对所述声纹参数进行声纹聚类训练得到所述声纹参数的目标声纹模板;a training unit, configured to perform voiceprint clustering training on the voiceprint parameter to obtain a target voiceprint template of the voiceprint parameter;
    判断单元,用于判断所述目标声纹模板是否为声纹数据库中的声纹模板;a determining unit, configured to determine whether the target voiceprint template is a voiceprint template in a voiceprint database;
    获取单元,用于在所述判断单元的判断结果为是时,从所述声纹数据库中获取与所述目标声纹模板对应的目标标记信息;An obtaining unit, configured to acquire target tag information corresponding to the target voiceprint template from the voiceprint database when the determination result of the determining unit is YES;
    标记单元,用于使用所述目标标记信息对所述当前录音进行标记。a marking unit for marking the current recording with the target marking information.
  12. 根据权利要求11所述的录音装置,其特征在于,所述标记模块,还包括:The recording device according to claim 11, wherein the marking module further comprises:
    生成单元,用于在所述判断单元的结果为否时,生成与所述目标声纹模板对应的所述目标标记信息。And a generating unit, configured to generate, when the result of the determining unit is no, the target tag information corresponding to the target voiceprint template.
13. 根据权利要求12所述的录音装置，其特征在于，所述标记模块，还包括：The recording apparatus according to claim 12, wherein the marking module further comprises:
    建立单元,用于在所述标记单元使用所述目标标记信息对所述当前录音进行标记之后,建立所述目标声纹模板与所述目标标记信息之间映射关系并存储在所述声纹数据库中。a establishing unit, configured to establish a mapping relationship between the target voiceprint template and the target tag information after the tag unit marks the current recording by using the target tag information, and store the mapping relationship in the voiceprint database in.
14. 根据权利要求10-13任一项所述的录音装置，其特征在于，所述提取单元，还用于在采集当前录音并从所述当前录音中提取声纹特征参数之前，对样本声音进行分析提取所述样本声音的所述声纹特征参数；The recording apparatus according to any one of claims 10-13, wherein the extracting unit is further configured to, before the current recording is collected and the voiceprint feature parameters are extracted from the current recording, analyze the sample sound to extract the voiceprint feature parameters of the sample sound;
    所述训练单元,还用于根据所述样本声音的所述声纹特征参数进行声纹聚类训练生成样本声纹模板;The training unit is further configured to generate a sample voiceprint template by performing voiceprint clustering training according to the voiceprint feature parameter of the sample sound;
    所述生成单元,还用于为所述样本声纹模板生成对应的样本标记信息;The generating unit is further configured to generate corresponding sample tag information for the sample voiceprint template;
    所述建立单元,还用于使用所述样本声纹模板、所述样本标记信息以及所述样本声纹模板与所述样本标记信息之间的映射关系生成所述声纹数据库。The establishing unit is further configured to generate the voiceprint database by using the sample voiceprint template, the sample marker information, and a mapping relationship between the sample voiceprint template and the sample marker information.
  15. The recording device according to claim 14, wherein the training unit is specifically configured to acquire the voiceprint feature parameters of the sample sound within a preset time period and, when the voiceprint feature parameters of the sample sound within the preset time period are similar, perform voiceprint clustering training on them to generate the sample voiceprint template.
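Claim 15's gating idea (only cluster the features from a preset time window into a sample template when they are mutually similar, i.e. likely one speaker) can be sketched as follows. The standard-deviation similarity check, the `max_spread` threshold, and the centroid "clustering" are all illustrative simplifications.

```python
import numpy as np

def build_sample_template(frames, window, max_spread=0.1):
    """frames: list of (timestamp, feature_vector) pairs.
    Keep only the features inside the preset time window; if they are
    similar enough (small per-dimension spread), reduce them to a
    sample voiceprint template, otherwise produce no template."""
    windowed = [f for t, f in frames if window[0] <= t <= window[1]]
    if not windowed:
        return None
    arr = np.asarray(windowed, dtype=float)
    spread = float(np.max(np.std(arr, axis=0)))  # crude similarity measure
    if spread > max_spread:
        return None            # features too dissimilar: no template
    return arr.mean(axis=0)    # 'clustering' reduced to the centroid
```

The point of the window-plus-similarity check is to avoid training a template from a stretch of audio where two speakers overlap.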
  16. The recording device according to claim 13, wherein the marking module further comprises:
    a receiving unit, configured to receive remark information sent by a user through a terminal after the marking module marks the current recording with the target tag information;
    the marking unit is further configured to annotate the current recording with the remark information;
    the establishing unit is further configured to update the remark information into the target tag information in the voiceprint database.
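The remark flow of claim 16 (receive user-supplied remark info, annotate the recording, fold the remark into the stored tag information) might be sketched like this; the `template`/`tag` database schema and field names are assumptions.

```python
def add_remark(recording, database, tag, remark):
    """Annotate the current recording with the user's remark and
    merge the remark into the matching database entry's tag
    information, so it travels with that voiceprint in future
    recordings (data layout is an illustrative assumption)."""
    recording["remark"] = remark
    for entry in database:  # entries: {"template": ..., "tag": ...}
        if entry["tag"] == tag:
            entry["remark"] = remark
    return recording
```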
  17. The recording device according to any one of claims 10 to 13, wherein the acquiring module is specifically configured to detect a first click operation on a mark, contained in the waveform graph of the current recording, that corresponds to at least one segment to be edited; detect a second click operation on the target editing mode to be applied to the segment to be edited; and generate the editing instruction according to the detected first click operation and second click operation.
  18. The recording device according to any one of claims 10 to 13, wherein the acquiring module is specifically configured to detect a first click operation selecting at least one mark from the mark list contained in the current recording, the selected mark indicating the segment to be edited; detect a second click operation on the target editing mode to be applied to the segment to be edited; and generate the editing instruction according to the detected first click operation and second click operation.
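Claims 17 and 18 both combine two click operations into one editing instruction: the first click selects marks (and thereby segments to edit), the second selects the editing mode. A minimal sketch of that combination step, with illustrative field names:

```python
def build_edit_instruction(first_click, second_click):
    """Acquiring module sketch: fuse the two detected clicks into an
    editing instruction. first_click carries the selected marks
    (segments to be edited); second_click carries the chosen target
    editing mode. Field names are assumptions."""
    return {
        "segments": list(first_click["marks"]),   # segments to be edited
        "mode": second_click["edit_mode"],        # e.g. 'delete', 'extract'
    }
```

Whether the first click lands on a mark in the waveform graph (claim 17) or in a mark list (claim 18), the resulting instruction is the same.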
PCT/CN2016/089020 2015-11-15 2016-07-07 Audio recording editing method and recording device WO2017080235A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510786352.9A CN105895102A (en) 2015-11-15 2015-11-15 Recording editing method and recording device
CN201510786352.9 2015-11-15

Publications (1)

Publication Number Publication Date
WO2017080235A1 true WO2017080235A1 (en) 2017-05-18

Family

ID=57001979

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/089020 WO2017080235A1 (en) 2015-11-15 2016-07-07 Audio recording editing method and recording device

Country Status (2)

Country Link
CN (1) CN105895102A (en)
WO (1) WO2017080235A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106356067A (en) * 2016-08-25 2017-01-25 乐视控股(北京)有限公司 Recording method, device and terminal
CN107403623A (en) * 2017-07-31 2017-11-28 努比亚技术有限公司 Store method, terminal, Cloud Server and the readable storage medium storing program for executing of recording substance
CN107481743A (en) * 2017-08-07 2017-12-15 捷开通讯(深圳)有限公司 The edit methods of mobile terminal, memory and recording file
CN107564531A (en) * 2017-08-25 2018-01-09 百度在线网络技术(北京)有限公司 Minutes method, apparatus and computer equipment based on vocal print feature
CN109545200A (en) * 2018-10-31 2019-03-29 深圳大普微电子科技有限公司 Edit the method and storage device of voice content
CN110753263A (en) * 2019-10-29 2020-02-04 腾讯科技(深圳)有限公司 Video dubbing method, device, terminal and storage medium
CN114242120B (en) * 2021-11-25 2023-11-10 广东电力信息科技有限公司 Audio editing method and audio marking method based on DTMF technology

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011160390A (en) * 2010-01-28 2011-08-18 Akitoshi Noda System utilizing sound recording function of mobile phone for screen display, emergency communication access function or the like
CN102985965A (en) * 2010-05-24 2013-03-20 微软公司 Voice print identification
CN103530432A (en) * 2013-09-24 2014-01-22 华南理工大学 Conference recorder with speech extracting function and speech extracting method
CN103700370A (en) * 2013-12-04 2014-04-02 北京中科模识科技有限公司 Broadcast television voice recognition method and system


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116132234A (en) * 2023-01-09 2023-05-16 天津大学 Underwater hidden communication method and device using whale animal whistle phase code
CN116132234B (en) * 2023-01-09 2024-03-12 天津大学 Underwater hidden communication method and device using whale animal whistle phase code

Also Published As

Publication number Publication date
CN105895102A (en) 2016-08-24

Similar Documents

Publication Publication Date Title
WO2017080235A1 (en) Audio recording editing method and recording device
WO2017080239A1 (en) Audio recording tagging method and recording device
JP6688340B2 (en) Method and apparatus for entering facial expression icon
WO2020024690A1 (en) Speech labeling method and apparatus, and device
CN108305632A (en) A kind of the voice abstract forming method and system of meeting
CN105632484B (en) Speech database for speech synthesis pause information automatic marking method and system
JP4600828B2 (en) Document association apparatus and document association method
CN1333363C (en) Audio signal processing apparatus and audio signal processing method
CN107274916B (en) Method and device for operating audio/video file based on voiceprint information
WO2019148586A1 (en) Method and device for speaker recognition during multi-person speech
US20200042279A1 (en) Platform for producing and delivering media content
CN107305541A (en) Speech recognition text segmentation method and device
JP2014519058A (en) Automatic creation of mapping between text data and audio data
CN108257592A (en) Human voice segmentation method and system based on long-term and short-term memory model
CN107025913A (en) A kind of way of recording and terminal
CN106373598A (en) Audio replay control method and apparatus
JP5099211B2 (en) Voice data question utterance extraction program, method and apparatus, and customer inquiry tendency estimation processing program, method and apparatus using voice data question utterance
KR102287431B1 (en) Apparatus for recording meeting and meeting recording system
CN109213970B (en) Method and device for generating notes
CN106874684B (en) A kind of image labeling system and method
CN113573096A (en) Video processing method, video processing device, electronic equipment and medium
CN103337247A (en) Data annotation analysis system for electromagnetic pronunciation recorder
CN108364654B (en) Voice processing method, medium, device and computing equipment
CN117174092B (en) Mobile corpus transcription method and device based on voiceprint recognition and multi-modal analysis
CN105069146B (en) Sound searching method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16863413

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16863413

Country of ref document: EP

Kind code of ref document: A1