CN106782506A - Method for dividing recorded audio into sections - Google Patents

Method for dividing recorded audio into sections

Info

Publication number
CN106782506A
Authority
CN
China
Prior art keywords
node
pause
recorded audio
divided
section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611037945.6A
Other languages
Chinese (zh)
Inventor
张悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Language Network (wuhan) Information Technology Co Ltd
Original Assignee
Language Network (wuhan) Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Language Network (wuhan) Information Technology Co Ltd
Priority to CN201611037945.6A
Publication of CN106782506A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

The invention discloses a method for dividing recorded audio into sections, characterized by comprising the following steps: obtaining and traversing the recorded audio data to obtain speech parts and silent parts; setting the pause points; dividing the audio by time to form several nodes and numbering the nodes; forming a section between each two adjacent nodes; and correcting the nodes. A node is corrected by judging whether it falls at a pause point: if it does not, the node is adjusted to the preceding pause point; if it does, correction continues with the next node until finished. The duration of a silent part is the time difference between two adjacent speech parts. The advantages are: 1. a large recording is split into several sections that are convenient to store and retrieve; 2. the nodes formed during splitting fall at pause points (usually the end of a sentence or paragraph), which avoids audio loss and improves the user experience.

Description

Method for dividing recorded audio into sections
Technical field
The invention relates to the field of audio processing, and more particularly to a method for dividing recorded audio into sections.
Background
With the continuous development of Internet technology, multimedia data such as images, video, and audio have increasingly become the main information media in the field of Internet information processing. Audio data occupies an important position among them. Raw audio data is itself a non-semantic symbolic representation and an unstructured binary stream. Recordings made at meetings or conventions are often very large and very long and involve many speakers, while a user often needs only a short portion of the recording, or only one person's audio. The large recording then needs to be split into several sections for convenient storage and retrieval. During splitting, however, the nodes formed are often not at the end of a sentence or paragraph (defined here as a pause point), which causes audio loss and a poor user experience.
Summary of the invention
The technical problem to be solved by the invention is that the nodes formed when audio is split are often not at pause points, which causes audio loss and a poor user experience. The method addresses this problem by adjusting each split node to a pause point.
To solve the above technical problem, the invention provides a method for dividing recorded audio into sections, characterized by comprising the following steps:
Obtain and traverse the recorded audio data to obtain the speech parts and silent parts;
Set the pause points;
Divide the audio by time to form several nodes, and number the nodes;
Form a section between each two adjacent nodes;
Correct the nodes;
A node is corrected by judging whether it falls at a pause point: if the node does not fall at a pause point, the node is adjusted to the preceding pause point;
If the node falls at a pause point, continue by correcting the next node until finished;
The duration of a silent part is the time difference between two adjacent speech parts.
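The node steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the fixed section length, and the choice to place a corrected node at the midpoint of the preceding pause are all hypothetical, and pauses are assumed to be given as sorted (start, end) intervals in seconds.

```python
def initial_nodes(total_duration, section_length):
    """Divide the recording by time: one node every section_length seconds.

    The position in the returned list (plus one) serves as the node number.
    """
    nodes = []
    t = section_length
    while t < total_duration:
        nodes.append(t)
        t += section_length
    return nodes


def correct_nodes(nodes, pauses):
    """Move every node that is not inside a pause back to the preceding pause.

    pauses: sorted list of (start, end) silent intervals judged to be pauses.
    Assumption: a moved node is placed at the midpoint of the preceding pause;
    a node with no earlier pause is left where it is.
    """
    corrected = []
    for node in nodes:
        if any(start <= node <= end for start, end in pauses):
            corrected.append(node)  # node already falls at a pause point
            continue
        earlier = [p for p in pauses if p[1] < node]
        if earlier:
            start, end = earlier[-1]  # last pause that ends before the node
            corrected.append((start + end) / 2)
        else:
            corrected.append(node)
    return corrected


def sections(nodes, total_duration):
    """Form one section between each two adjacent boundaries."""
    bounds = [0] + list(nodes) + [total_duration]
    return list(zip(bounds[:-1], bounds[1:]))
```

For a 100-second recording with a 30-second section length, nodes start at 30, 60, and 90 seconds; a node already inside a pause stays put, and the others move back to the pause before them. Two nodes could be moved to the same pause in a sparse recording, so a real implementation would probably need to drop such duplicates.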
Further, the pause points are set by calculating the average silence duration from the durations of the silent parts, and judging the silent parts that exceed a threshold based on the average silence duration to be pause points.
Further, the average silence duration is calculated by obtaining the total duration of the silent parts and the number of silent parts, and dividing the total duration of the silent parts by the number of silent parts.
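The average-based rule can be written as a short Python sketch. The source does not say how the threshold is derived from the average, so the `factor` multiplier below is an assumption, as are the function names; silent parts are assumed to be (start, end) pairs in milliseconds.

```python
def average_silence(silences):
    """Total duration of the silent parts divided by their number."""
    durations = [end - start for start, end in silences]
    return sum(durations) / len(durations)


def pauses_by_average(silences, factor=1.0):
    """Keep silent parts longer than factor * average silence duration.

    factor is a hypothetical tuning parameter; factor=1.0 uses the
    average itself as the threshold.
    """
    if not silences:
        return []
    threshold = factor * average_silence(silences)
    return [(s, e) for s, e in silences if (e - s) > threshold]
```

With silent parts of 200 ms, 1000 ms, and 300 ms, the average is 500 ms and only the 1000 ms silence is judged to be a pause.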
Further, the pause points are set by taking the median of the durations of the silent parts and using it to set the pause points.
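A possible reading of the median-based rule, sketched in Python: the source only says that the median of the silence durations is taken and set as the pause, so treating every silent part at least as long as the median as a pause is an assumption, and the function name is hypothetical.

```python
from statistics import median


def pauses_by_median(silences):
    """Keep silent parts whose duration is at least the median duration.

    silences: list of (start, end) pairs in milliseconds.
    """
    if not silences:
        return []
    threshold = median(end - start for start, end in silences)
    return [(s, e) for s, e in silences if (e - s) >= threshold]
```

Compared with the average, the median is less affected by a few very long silences, which may be why it is offered as an alternative.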
Further, the pause points are set by having the speaker record a sample of recorded audio at his or her habitual speaking rate; the sample contains one pause, and the pause in the sample is set as the pause point for the recorded audio.
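The sample-based calibration might look like the following Python sketch. The source does not describe how the pause in the sample is measured or compared, so taking the longest silence in the sample as the reference pause and accepting silences within a `tolerance` fraction of it are both assumptions; all names are hypothetical.

```python
def pause_duration_from_sample(sample_silences):
    """Duration of the pause in the calibration sample.

    Assumption: the sample's single pause is its longest silent part.
    """
    return max(end - start for start, end in sample_silences)


def pauses_by_sample(silences, sample_silences, tolerance=0.8):
    """Keep silent parts at least tolerance * the sample's pause duration."""
    reference = pause_duration_from_sample(sample_silences)
    return [(s, e) for s, e in silences if (e - s) >= tolerance * reference]
```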
Further, the method of correcting a node also includes judging whether the characters before and/or after the node match a label in a node label library; the node label library is a corpus that stores labels for the words with which sentences or paragraphs begin or end.
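A label-library check of this kind could be sketched as below. It assumes that a transcript of the text on either side of the node is already available, which the source does not explain; the example labels and the function name are hypothetical.

```python
# Hypothetical node label library: words that typically begin or end
# a sentence or paragraph.
START_LABELS = ("first of all", "next", "in conclusion")
END_LABELS = ("thank you", "that is all")


def matches_label_library(text_before, text_after,
                          start_labels=START_LABELS, end_labels=END_LABELS):
    """Return True if the text around a node matches a boundary label.

    text_before: transcript immediately before the node.
    text_after: transcript immediately after the node.
    """
    before = text_before.strip().lower().rstrip(".!?,; ")
    after = text_after.strip().lower()
    return before.endswith(tuple(end_labels)) or after.startswith(tuple(start_labels))
```

A match suggests the node sits on a sentence or paragraph boundary, which can serve as a second check alongside the pause test.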
Further, the method of correcting a node also includes judging whether the characters at the node match a speaker-change label; the speaker-change label is a speaker-distinguishing identifier formed from the differences between the voices of the people in the recording.
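The speaker-change check can be illustrated with a short Python sketch. It assumes the speech parts have already been tagged with a speaker identifier by some separate process; how those identifiers are produced is not described in the source, and the names and the `tolerance` parameter are hypothetical.

```python
def speaker_change_times(labeled_segments):
    """Times at which the speaker identifier changes.

    labeled_segments: time-ordered list of (start, end, speaker_id).
    """
    changes = []
    for prev, cur in zip(labeled_segments, labeled_segments[1:]):
        if prev[2] != cur[2]:
            changes.append(cur[0])  # the new speaker starts here
    return changes


def node_at_speaker_change(node, changes, tolerance):
    """True if the node lies within tolerance of a speaker change."""
    return any(abs(node - c) <= tolerance for c in changes)
```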
The above technical solution achieves the following effects:
1. A large recording is split into several sections that are convenient to store and retrieve;
2. The nodes formed during splitting fall at pause points (usually the end of a sentence or paragraph), which avoids audio loss and improves the user experience.
Brief description of the drawings
The accompanying drawing described here provides a further understanding of the invention and forms part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawing:
Fig. 1 shows a flow chart of a method for dividing recorded audio into sections.
Detailed description
The technical solution of the invention is described in further detail below with reference to the accompanying drawing and a specific embodiment.
To solve the above technical problem, as shown in Fig. 1, the invention provides a method for dividing recorded audio into sections, characterized by comprising the following steps:
Obtain and traverse the recorded audio data to obtain the speech parts and silent parts;
Set the pause points;
Divide the audio by time to form several nodes, and number the nodes;
Form a section between each two adjacent nodes;
Correct the nodes;
A node is corrected by judging whether it falls at a pause point: if the node does not fall at a pause point, the node is adjusted to the preceding pause point;
If the node falls at a pause point, continue by correcting the next node until all nodes have been processed;
The duration of a silent part is the time difference between two adjacent speech parts.
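The first step, traversing the audio to separate speech parts from silent parts, is not detailed in the source. One common approach, offered here only as an assumed illustration, is to compare short-frame energy against a threshold. The function name, frame size, and threshold value are hypothetical, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
def find_silences(samples, sample_rate, frame_ms=20, energy_threshold=0.01):
    """Return (start_ms, end_ms) for each silent part between speech parts.

    A frame counts as silent when its mean squared amplitude is below
    energy_threshold. Leading and trailing silence is ignored, because a
    silent part is defined as the gap between two adjacent speech parts.
    """
    frame_len = sample_rate * frame_ms // 1000
    silences = []
    run_start = None      # start time of the current silent run, if any
    seen_speech = False   # becomes True once the first speech frame is found
    for i in range(len(samples) // frame_len):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        t = i * frame_ms
        if energy < energy_threshold:
            if run_start is None and seen_speech:
                run_start = t
        else:
            if run_start is not None:
                silences.append((run_start, t))  # silence ended by speech
                run_start = None
            seen_speech = True
    return silences
```

The duration of each returned interval, end minus start, is then the time difference between the two adjacent speech parts.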
Further, the pause points are set by calculating the average silence duration from the durations of the silent parts, and judging the silent parts that exceed a threshold based on the average silence duration to be pause points.
Further, the average silence duration is calculated by obtaining the total duration of the silent parts and the number of silent parts, and dividing the total duration of the silent parts by the number of silent parts.
Further, the pause points are set by taking the median of the durations of the silent parts and using it to set the pause points.
Further, the pause points are set by having the speaker record a sample of recorded audio at his or her habitual speaking rate; the sample contains one pause, and the pause in the sample is set as the pause point for the recorded audio.
Further, the method of correcting a node also includes judging whether the characters before and/or after the node match a label in a node label library; the node label library is a corpus that stores labels for the words with which sentences or paragraphs begin or end.
Further, the method of correcting a node also includes judging whether the characters at the node match a speaker-change label; the speaker-change label is a speaker-distinguishing identifier formed from the differences between the voices of the people in the recording.
Those skilled in the art should also understand that the foregoing is only a preferred embodiment of the invention and is not intended to limit the invention; for those skilled in the art, the invention may have various modifications and variations. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (7)

1. A method for dividing recorded audio into sections, characterized by comprising the following steps:
Obtain and traverse the recorded audio data to obtain the speech parts and silent parts;
Set the pause points;
Divide the audio by time to form several nodes, and number the nodes;
Form a section between each two adjacent nodes;
Correct the nodes;
A node is corrected by judging whether it falls at a pause point: if the node does not fall at a pause point, the node is adjusted to the preceding pause point;
If the node falls at a pause point, continue by correcting the next node until finished;
The duration of a silent part is the time difference between two adjacent speech parts.
2. The method for dividing recorded audio into sections according to claim 1, characterized in that the pause points are set by calculating the average silence duration from the durations of the silent parts, and judging the silent parts that exceed a threshold based on the average silence duration to be pause points.
3. The method for dividing recorded audio into sections according to claim 2, characterized in that the average silence duration is calculated by obtaining the total duration of the silent parts and the number of silent parts, and dividing the total duration of the silent parts by the number of silent parts.
4. The method for dividing recorded audio into sections according to claim 1, characterized in that the pause points are set by taking the median of the durations of the silent parts and using it to set the pause points.
5. The method for dividing recorded audio into sections according to claim 1, characterized in that the pause points are set by having the speaker record a sample of recorded audio at his or her habitual speaking rate; the sample contains one pause, and the pause in the sample is set as the pause point for the recorded audio.
6. The method for dividing recorded audio into sections according to claim 1, characterized in that the method of correcting a node also includes judging whether the characters before and/or after the node match a label in a node label library; the node label library is a corpus that stores labels for the words with which sentences or paragraphs begin or end.
7. The method for dividing recorded audio into sections according to claim 1, characterized in that the method of correcting a node also includes judging whether the characters at the node match a speaker-change label; the speaker-change label is a speaker-distinguishing identifier formed from the differences between the voices of the people in the recording.
CN201611037945.6A 2016-11-23 2016-11-23 Method for dividing recorded audio into sections Pending CN106782506A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611037945.6A CN106782506A (en) 2016-11-23 2016-11-23 Method for dividing recorded audio into sections

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611037945.6A CN106782506A (en) 2016-11-23 2016-11-23 Method for dividing recorded audio into sections

Publications (1)

Publication Number Publication Date
CN106782506A (en) 2017-05-31

Family

ID=58975016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611037945.6A Pending CN106782506A (en) 2016-11-23 2016-11-23 Method for dividing recorded audio into sections

Country Status (1)

Country Link
CN (1) CN106782506A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108172219A (en) * 2017-11-14 2018-06-15 珠海格力电器股份有限公司 The method and apparatus for identifying voice
CN108962283A (en) * 2018-01-29 2018-12-07 北京猎户星空科技有限公司 A kind of question terminates the determination method, apparatus and electronic equipment of mute time
CN109166570A (en) * 2018-07-24 2019-01-08 百度在线网络技术(北京)有限公司 A kind of method, apparatus of phonetic segmentation, equipment and computer storage medium
CN109389999A (en) * 2018-09-28 2019-02-26 北京亿幕信息技术有限公司 A kind of high performance audio-video is made pauses in reading unpunctuated ancient writings method and system automatically
CN109448455A (en) * 2018-12-20 2019-03-08 广东小天才科技有限公司 A kind of real-time error recites method and private tutor's equipment
CN110473519A (en) * 2018-05-11 2019-11-19 北京国双科技有限公司 A kind of method of speech processing and device
CN112256871A (en) * 2020-10-16 2021-01-22 国网江苏省电力有限公司连云港供电分公司 Material fulfillment system and method

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1624685A (en) * 2003-12-02 2005-06-08 英业达股份有限公司 Paragraph type language learning system and its method
CN1716380A (en) * 2005-07-26 2006-01-04 浙江大学 Audio frequency splitting method for changing detection based on decision tree and speaking person
CN101221762A (en) * 2007-12-06 2008-07-16 上海大学 MP3 compression field audio partitioning method
CN101749951A (en) * 2008-12-17 2010-06-23 上海立谊环保工程技术有限公司 Treatment method for segmentation of metallurgy sintering smoke
CN102591892A (en) * 2011-01-13 2012-07-18 索尼公司 Data segmenting device and method
CN103247317A (en) * 2013-04-03 2013-08-14 深圳大学 Editing method and system for record files
CN102348049B (en) * 2011-09-16 2013-09-18 央视国际网络有限公司 Method and device for detecting position of cut point of video segment
CN103646654A (en) * 2013-12-12 2014-03-19 深圳市金立通信设备有限公司 Recording data sharing method and terminal
CN103812754A (en) * 2012-11-12 2014-05-21 腾讯科技(深圳)有限公司 Contact matching method, instant messaging client, server and system
CN104519401A (en) * 2013-09-30 2015-04-15 华为技术有限公司 Video division point acquiring method and equipment
CN105336321A (en) * 2015-09-25 2016-02-17 百度在线网络技术(北京)有限公司 Phonetic segmentation method and device for speech synthesis
CN105653729A (en) * 2016-01-28 2016-06-08 努比亚技术有限公司 Device and method for indexing sound recording file
CN105719642A (en) * 2016-02-29 2016-06-29 黄博 Continuous and long voice recognition method and system and hardware equipment
WO2016112519A1 (en) * 2015-01-15 2016-07-21 华为技术有限公司 Audio content segmentation method and apparatus
CN105845129A (en) * 2016-03-25 2016-08-10 乐视控股(北京)有限公司 Method and system for dividing sentences in audio and automatic caption generation method and system for video files

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108172219A (en) * 2017-11-14 2018-06-15 珠海格力电器股份有限公司 The method and apparatus for identifying voice
CN108962283A (en) * 2018-01-29 2018-12-07 北京猎户星空科技有限公司 A kind of question terminates the determination method, apparatus and electronic equipment of mute time
CN110473519A (en) * 2018-05-11 2019-11-19 北京国双科技有限公司 A kind of method of speech processing and device
CN110473519B (en) * 2018-05-11 2022-05-27 北京国双科技有限公司 Voice processing method and device
CN109166570A (en) * 2018-07-24 2019-01-08 百度在线网络技术(北京)有限公司 A kind of method, apparatus of phonetic segmentation, equipment and computer storage medium
CN109389999A (en) * 2018-09-28 2019-02-26 北京亿幕信息技术有限公司 A kind of high performance audio-video is made pauses in reading unpunctuated ancient writings method and system automatically
CN109389999B (en) * 2018-09-28 2020-12-11 北京亿幕信息技术有限公司 High-performance audio and video automatic sentence-breaking method and system
CN109448455A (en) * 2018-12-20 2019-03-08 广东小天才科技有限公司 A kind of real-time error recites method and private tutor's equipment
CN112256871A (en) * 2020-10-16 2021-01-22 国网江苏省电力有限公司连云港供电分公司 Material fulfillment system and method
CN112256871B (en) * 2020-10-16 2021-05-07 国网江苏省电力有限公司连云港供电分公司 Material fulfillment system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170531