CN107230478A

CN107230478A - Voice information processing method and system

Info

Publication number: CN107230478A
Application number: CN201710302993.1A
Authority: CN
Inventors: 王泓喆
Original assignee: Shanghai Feixun Data Communication Technology Co Ltd
Current assignee: Taizhou Jiji Intellectual Property Operation Co.,Ltd.
Priority date: 2017-05-03
Filing date: 2017-05-03
Publication date: 2017-10-03

Abstract

The invention provides a voice information processing method and system. The method includes: S100, obtaining the voice information of a user; S200, intercepting the voice information to obtain multiple voice segments; S300, recognizing the voice segments to obtain corresponding speech recognition fragments; and S400, processing the speech recognition fragments to obtain a speech recognition result. The system includes: an acquisition module, which obtains the voice information of the user; an interception module, which intercepts the voice information to obtain multiple voice segments; an identification module, which recognizes the voice segments to obtain corresponding speech recognition fragments; and a first processing module, which processes the speech recognition fragments to obtain a speech recognition result. The invention performs speech recognition while voice recording is in progress, which reduces the time the user must wait, after recording is complete, for speech recognition to run and output a result. It shortens recording latency without affecting the normal recognition result, improving the user experience.

Description

Voice information processing method and system

Technical field

The present invention relates to the technical field of speech recognition, and in particular to a voice information processing method and system.

Background art

With the rapid development of communication technology, speech recognition is applied more and more widely, and network communication tools such as WeChat and Tencent QQ have gradually become one of the main means of everyday communication. Voice messages in particular are popular with users because they are simple and convenient to operate. On current intelligent terminals such as mobile phones and computers, these communication tools provide voice input and output functions.

In the prior art, current speech recognition schemes do not take the length of time before recognition starts into account. For a short utterance the user's waiting time is relatively long; for a long utterance the waiting time is very long and the recognition is also incomplete, which seriously affects the user's needs. Moreover, the prior art sends the recording result to the speech recognition module only after voice recording has ended, so the recording duration is added to the recognition time. This causes unnecessary waiting, wastes time, and degrades the user experience.

Summary of the invention

The object of the present invention is to provide a voice information processing method and system that perform speech recognition while voice recording is in progress, reducing the time the user waits after voice recording is complete.

The technical solution provided by the present invention is as follows:

A voice information processing method, including the steps of: S100, periodically collecting and recognizing the user's voice information while the user is recording, to obtain speech recognition fragments; S200, processing the speech recognition fragments to obtain a speech recognition result.

The present invention performs speech recognition while voice recording is in progress, which reduces the time the user must wait, after recording is complete, for speech recognition to run and output a result. It shortens recording latency without affecting the normal recognition result, improving the user experience.

Further, step S100 includes the steps of: S110, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain a current voice segment; S120, recognizing the current voice segment according to a speech recognition library to obtain a speech recognition fragment; S130, obtaining the next voice segment and performing steps S110-S130 until the user ends the recording; wherein the preset collection rule is collection at equal time intervals.

Further, step S110 also includes the steps of: S111, judging whether the current voice segment is a blank voice segment; if so, performing step S112; otherwise, performing step S120; S112, deleting the current voice segment and performing step S130.

Further, step S200 includes the step of: S210, sorting and integrating the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

Further, step S200 also includes the step of: S220, outputting the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

The present invention also provides a voice information processing system, including a control module and a processing module. The processing module is communicatively connected to the control module. The control module periodically collects and recognizes the user's voice information while the user is recording, to obtain speech recognition fragments. The processing module processes the speech recognition fragments obtained by the control module to obtain a speech recognition result.

Further, the control module includes a collection submodule and a recognition submodule, which are communicatively connected. The collection submodule, while the user is recording, collects the user's voice information according to the preset collection rule to obtain a current voice segment and sends the current voice segment to the recognition submodule. The recognition submodule receives the current voice segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment. The collection submodule also obtains the next voice segment and sends it to the recognition submodule, until the user ends the recording. The recognition submodule also receives the next voice segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording. The preset collection rule is collection at equal time intervals.

Further, the control module also includes a judging submodule and a deletion submodule. The judging submodule is communicatively connected to the collection submodule, the deletion submodule, and the recognition submodule. The judging submodule judges whether the current voice segment is a blank voice segment; if so, it sends the result that the current voice segment is a blank voice segment to the deletion submodule; otherwise, it sends the result that the current voice segment is not a blank voice segment to the recognition submodule. The deletion submodule receives the judgment result sent by the judging submodule and deletes the current voice segment.

Further, the processing module includes a sorting submodule, which is communicatively connected to the control module. The sorting submodule sorts and integrates the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

Further, the processing module also includes an output submodule, which is communicatively connected to the control module. The output submodule outputs the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

The voice information processing method and system provided by the present invention can bring at least one of the following beneficial effects:

1. During recording, the present invention collects the voice segments obtained so far and performs speech recognition on them. Compared with traditional speech recognition methods, the recognition result is produced faster, reducing the time the user waits for voice input and speech recognition.

2. The present invention acquires voice information through a FIFO queue and performs speech recognition through the FIFO queue. (FIFO is short for First Input First Output. A FIFO queue is a traditional sequential execution method in which the instruction that enters first is completed and retired first, after which the second instruction is executed; it is a first-in, first-out data buffer.) For longer recordings, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition.

3. The present invention performs speech recognition while voice recording is in progress, solving the problem that speech recognition could only be carried out after voice recording was complete.

4. The present invention shortens recording latency without affecting the normal recognition result, improving the user experience.

5. The present invention can delete invalid voice segments, helping the user perform speech recognition more quickly.

Brief description of the drawings

The above characteristics, technical features, advantages, and implementation of the voice information processing method and system are further described below in a clear and understandable manner, through preferred embodiments and with reference to the accompanying drawings.

Fig. 1 is a flow chart of one embodiment of the voice information processing method of the present invention;

Fig. 2 is a flow chart of another embodiment of the voice information processing method of the present invention;

Fig. 3 is a flow chart of another embodiment of the voice information processing method of the present invention;

Fig. 4 is a flow chart of another embodiment of the voice information processing method of the present invention;

Fig. 5 is a structural diagram of one embodiment of the voice information processing system of the present invention;

Fig. 6 is a structural diagram of another embodiment of the voice information processing system of the present invention;

Fig. 7 is a structural diagram of another embodiment of the voice information processing system of the present invention;

Fig. 8 is a structural diagram of another embodiment of the voice information processing system of the present invention;

Fig. 9 is a flow chart of an example of the voice information processing method of the present invention.

Detailed description of the embodiments

To explain the embodiments of the present invention and the technical solutions of the prior art more clearly, specific embodiments of the present invention are described below with reference to the accompanying drawings. Obviously, the drawings in the following description show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings, and other embodiments, from these drawings without creative effort.

To keep the drawings concise, each figure only schematically shows the parts related to the present invention; the figures do not represent the actual structure of a product. In addition, to keep the drawings concise and easy to understand, where several components in a figure have the same structure or function, only one of them is schematically drawn or labeled. Herein, "one" does not only mean "only this one" but can also cover the case of "more than one".

Referring to Fig. 1, the present invention provides one embodiment of a voice information processing method, including:

S110, periodically collecting and recognizing the user's voice information while the user is recording, to obtain speech recognition fragments;

S120, processing the speech recognition fragments to obtain a speech recognition result.

This embodiment performs speech recognition while voice recording is in progress, which reduces the time the user must wait, after recording is complete, for speech recognition to run and output a result. It shortens recording latency without affecting the normal recognition result, improving the user experience.

Referring to Fig. 2, the present invention provides another embodiment of a voice information processing method, including:

S210, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain a current voice segment;

S220, recognizing the current voice segment according to a speech recognition library to obtain a speech recognition fragment;

S230, obtaining the next voice segment and performing steps S210-S230 until the user ends the recording;

S240, sorting and integrating the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

The preset collection rule is collection at equal time intervals.

In this embodiment, there are many existing techniques for building the speech recognition library, and they are not described in detail here. During recording, each voice segment is collected and recognized as it is obtained; compared with traditional speech recognition methods, the recognition result is produced faster, reducing the time the user waits for voice input and speech recognition. Voice information is acquired through a FIFO queue, and speech recognition is performed through the FIFO queue. For shorter recordings, the speech recognition module does not need to wait until the speech recognition start time is reached before it begins recognition, which avoids adding unnecessary waiting time. For longer recordings, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition. The user can set the preset collection rule according to personal preference and need. Avoiding unnecessary waiting saves time and improves the user experience.

For example, user A sets the collection rule to intercept the voice information once every 1 s during recording. After user A starts recording, the first 1 s voice segment Y1, the second 1 s voice segment Y2, ..., and the n-th 1 s voice segment Yn are collected according to that rule. After voice segment Y1 is collected, the speech recognition module recognizes it to obtain speech recognition fragment S1; after voice segment Y2 is obtained, the speech recognition module recognizes it to obtain speech recognition fragment S2; and so on. During recording, as soon as a voice segment is collected, it is immediately recognized to obtain the corresponding speech recognition fragment. The fragments are saved and arranged in the chronological order of collection, so that an almost complete speech recognition result is available immediately after recording ends, which improves the efficiency of speech recognition.
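The equal-interval collection and time-ordered integration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the recording is modeled as an in-memory list of samples, and `split_fixed`, `recognize_while_recording`, and the `recognize` callback are hypothetical names standing in for the collection rule and the speech recognition module.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def split_fixed(samples: Sequence[int], sample_rate: int,
                interval_s: float) -> Iterator[Tuple[int, Sequence[int]]]:
    """Yield (index, segment) pairs of equal duration (Y1, Y2, ..., Yn).

    The last segment may be shorter if the recording does not end on a boundary.
    """
    step = int(sample_rate * interval_s)
    if step <= 0:
        raise ValueError("interval too short for the given sample rate")
    for index, start in enumerate(range(0, len(samples), step)):
        yield index, samples[start:start + step]


def recognize_while_recording(samples: Sequence[int], sample_rate: int,
                              interval_s: float,
                              recognize: Callable[[Sequence[int]], str]) -> str:
    """Recognize each segment as soon as it is cut, then join fragments in time order."""
    fragments: List[Tuple[int, str]] = []
    for index, segment in split_fixed(samples, sample_rate, interval_s):
        # Hypothetical recognizer: any callable mapping a segment to text (S1, S2, ...).
        fragments.append((index, recognize(segment)))
    fragments.sort(key=lambda pair: pair[0])  # integrate by collection order
    return " ".join(text for _, text in fragments if text)
```

In a real recorder the segments would arrive from the audio device while the user is still speaking; the loop body is the part that runs once per interval.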

The technology of this embodiment can be applied to areas including indoor equipment control and voice dialogue robots. Because speech recognition is performed while recording is in progress, it solves the problem that speech recognition could only be carried out after voice recording was complete, and it shortens recording latency without affecting the normal recognition result. The user's voice commands are quickly converted into speech recognition commands and input to smart home devices or intelligent robots, which can then be controlled conveniently and quickly according to the recognized commands, without the user operating them by hand. Voice operation is faster than manual operation and improves the user experience. This also avoids the situation on shopping platforms such as Taobao where users prefer to transfer to a human agent because speech recognition is inefficient; it raises the utilization of speech recognition, reduces wasted voice-service resources, lightens the workload of human customer service, and lowers labor costs. This embodiment can also be applied to voice search systems. Baidu voice search, for example, is a new search mode in which the user states the search intent by voice, such as "how is the weather tomorrow" or "how to make Kung Pao chicken". While the user is speaking, the speech is acquired and recognized at the same time, so this embodiment can obtain the desired result immediately and output the text "how is the weather tomorrow" or "how to make Kung Pao chicken". Voice search spares the user the tedium of typing and makes the whole search process smoother and more convenient.

Referring to Fig. 3, the present invention provides another embodiment of a voice information processing method, including:

S310, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain a current voice segment;

S320, recognizing the current voice segment according to a speech recognition library to obtain a speech recognition fragment;

S330, outputting the speech recognition fragment according to the time order of collection to obtain the speech recognition result;

S340, obtaining the next voice segment and performing steps S310-S330 until the user ends the recording.

In this embodiment, each voice segment is collected and recognized during recording, so recognition is fast and the user's waiting time is reduced. Voice information is acquired through a FIFO queue, and speech recognition is performed through the FIFO queue; for longer recordings, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition. For example, the effective duration of typical speech recognition is 30 s. If user B speaks without a break and records for 60 s, then with recognition deferred until the end of the recording, the long recording makes the wait too long, and because the voice information is too long, the speech recognition module cannot fully recognize the substance of user B's recording.

This embodiment can also be applied to fields such as voice dialing, voice navigation, and dictation data entry. For example, during dictation data entry, the speech recognition module outputs what the user says into the entry field as the user speaks. Specifically, after recording starts, the first 0.5 s voice segment X1, the second 0.5 s voice segment X2, ..., and the n-th 0.5 s voice segment Xn are collected according to the collection rule set by user B. After voice segment X1 is collected, the speech recognition module recognizes it to obtain speech recognition fragment B1, and so on. During recording, as soon as a voice segment is collected, it is immediately recognized to obtain the corresponding speech recognition fragment, and the fragments are output in the time order of collection to obtain the speech recognition result. If user B finds that part of the text in the entry field differs from what was said, the misrecognized part can be located according to the time order and recognized again.
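Because segments are cut at a regular interval, a position in time maps directly to a segment index, which is what makes the re-recognition step above practical. The sketch below illustrates this under assumptions not stated in the source: the audio segments are retained after recognition, no segment has been deleted, and `segment_index`, `rerecognize_at`, and the `recognize` callback are hypothetical names.

```python
from typing import Callable, List, Sequence


def segment_index(t_seconds: float, interval_s: float) -> int:
    """Index of the fixed-length segment that contains time t_seconds."""
    if t_seconds < 0 or interval_s <= 0:
        raise ValueError("time must be non-negative and interval positive")
    return int(t_seconds // interval_s)


def rerecognize_at(fragments: Sequence[str], segments: Sequence[object],
                   t_seconds: float, interval_s: float,
                   recognize: Callable[[object], str]) -> List[str]:
    """Return a copy of the fragments with the one covering t_seconds recognized again."""
    i = segment_index(t_seconds, interval_s)
    updated = list(fragments)
    # Hypothetical recognizer call; only the affected segment is processed again.
    updated[i] = recognize(segments[i])
    return updated
```

With a 0.5 s interval, a mistake noticed at 0.7 s into the recording falls in the second segment (index 1), so only that segment needs to be sent back to the recognizer.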

Referring to Fig. 4, the present invention provides another embodiment of a voice information processing method, including:

S410, while the user is recording, collecting the user's voice information according to a preset collection rule to obtain a current voice segment;

S420, judging whether the current voice segment is a blank voice segment; if so, performing step S430; otherwise, performing step S440;

S430, deleting the current voice segment and performing step S450;

S440, recognizing the current voice segment according to a speech recognition library to obtain a speech recognition fragment;

S450, obtaining the next voice segment and performing steps S410-S450 until the user ends the recording.

In this embodiment, invalid voice segments can be deleted, helping the user perform speech recognition more quickly. In the preprocessing before speech recognition, techniques based on, for example, the frequency and fluctuation of sound-wave changes while the user speaks can identify which parts of the user's voice information are valid speech and which are invalid, mark the time points of the user's blank speech, and remove the invalid parts, i.e., the blank voice segments. For example, suppose the voice information of user C is intercepted according to a 2 s collection rule, user C starts speaking at 14:30, and user C does not speak during 14:33-14:36, i.e., 3 s of silence is detected. Under the collection rule of this embodiment, the voice segment intercepted over 14:33-14:35 is a blank voice segment, and it is marked. That voice information can then be regarded as invalid, and the speech recognition module need not perform speech recognition on it.
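The blank-segment check can be sketched with a simple energy threshold. The source only says that sound-wave change frequency and fluctuation are used, so the RMS criterion, the `threshold` value, and the names `rms`, `is_blank`, and `drop_blank` are assumptions chosen for illustration; a real system would likely use a proper voice activity detector.

```python
import math
from typing import List, Sequence, Tuple


def rms(segment: Sequence[float]) -> float:
    """Root-mean-square amplitude of a segment; 0.0 for an empty segment."""
    if not segment:
        return 0.0
    return math.sqrt(sum(x * x for x in segment) / len(segment))


def is_blank(segment: Sequence[float], threshold: float = 0.01) -> bool:
    """Treat a segment as blank when its energy stays below the (assumed) threshold."""
    return rms(segment) < threshold


def drop_blank(segments: Sequence[Sequence[float]],
               threshold: float = 0.01) -> List[Tuple[int, Sequence[float]]]:
    """Keep (index, segment) pairs for non-blank segments only.

    The original index is preserved so later steps can still order by collection time.
    """
    return [(i, s) for i, s in enumerate(segments) if not is_blank(s, threshold)]
```

Keeping the original index alongside each surviving segment matters here: once blank segments are deleted, list position alone no longer reflects collection time.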

Through speech recognition technology, this embodiment can reduce key-press input and enhance interactivity with the user. By using a FIFO queue, multiple microphones can share one speech recognition engine, which improves engine utilization.
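One way to picture several microphones sharing a single recognition engine through a FIFO queue is the sketch below. It is an illustration only: threads stand in for microphone channels, `share_engine` and the `recognize` callback are hypothetical names, and Python's standard `queue.Queue` is used as the FIFO buffer.

```python
import queue
import threading
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def share_engine(streams: Dict[str, Sequence[str]],
                 recognize: Callable[[str], str]) -> Dict[str, List[str]]:
    """Feed segments from several microphones to one engine via a FIFO queue."""
    fifo: queue.Queue = queue.Queue()
    results: Dict[str, List[str]] = defaultdict(list)

    def microphone(mic_id: str, segments: Sequence[str]) -> None:
        # Each channel enqueues its segments in the order they were captured.
        for segment in segments:
            fifo.put((mic_id, segment))

    def engine() -> None:
        # The single engine handles items strictly in arrival (first-in, first-out) order.
        while True:
            item = fifo.get()
            if item is None:  # sentinel: all microphones have finished
                return
            mic_id, segment = item
            results[mic_id].append(recognize(segment))

    engine_thread = threading.Thread(target=engine)
    engine_thread.start()
    mics = [threading.Thread(target=microphone, args=(m, s)) for m, s in streams.items()]
    for t in mics:
        t.start()
    for t in mics:
        t.join()
    fifo.put(None)
    engine_thread.join()
    return dict(results)
```

The interleaving between microphones depends on thread scheduling, but the order of fragments within each microphone is preserved because every channel enqueues sequentially and the queue is first-in, first-out.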

Referring to Fig. 5, the present invention provides one embodiment of a voice information processing system 1000, including a control module and a processing module; the processing module is communicatively connected to the control module;

The control module periodically collects and recognizes the user's voice information while the user is recording, to obtain speech recognition fragments;

The processing module processes the speech recognition fragments obtained by the control module to obtain a speech recognition result.

Referring to Fig. 6, the parts identical to the previous embodiment are not repeated here. The present invention provides another embodiment of a voice information processing system 1000, in which the control module includes a collection submodule and a recognition submodule, which are communicatively connected; the processing module includes a sorting submodule, which is communicatively connected to the control module;

The collection submodule, while the user is recording, collects the user's voice information according to the preset collection rule to obtain a current voice segment and sends the current voice segment to the recognition submodule;

The recognition submodule receives the current voice segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment;

The collection submodule also obtains the next voice segment and sends it to the recognition submodule, until the user ends the recording;

The recognition submodule also receives the next voice segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording;

The sorting submodule sorts and integrates the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

In this embodiment, there are many existing techniques for building the speech recognition library, and they are not described in detail here. During recording, each voice segment is collected and recognized as it is obtained; compared with traditional speech recognition methods, the recognition result is produced faster, reducing the time the user waits for voice input and speech recognition. Voice information is acquired through a FIFO queue, and speech recognition is performed through the FIFO queue; for longer recordings, this not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition. The user can set the preset collection rule according to personal preference and need. Avoiding unnecessary waiting saves time and improves the user experience. The technology of this embodiment can be applied to areas including indoor equipment control and voice dialogue robots. Because speech recognition is performed while recording is in progress, it solves the problem that speech recognition could only be carried out after voice recording was complete, and it shortens recording latency without affecting the normal recognition result. The user's voice commands are quickly converted into speech recognition commands and input to smart home devices or intelligent robots, which can then be controlled conveniently and quickly according to the recognized commands, without the user operating them by hand. Voice operation is faster than manual operation and improves the user experience. For specific examples, see the corresponding method embodiment.

Referring to Fig. 7, the parts identical to the previous embodiment are not repeated here. The present invention provides another embodiment of a voice information processing system 1000, in which the processing module also includes an output submodule, which is communicatively connected to the control module;

The output submodule outputs the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

Specifically, in this embodiment, as soon as a voice segment is collected during recording, it is immediately recognized to obtain the corresponding speech recognition fragment, and the fragments are output in the time order of collection to obtain the speech recognition result. If user B finds that part of the text in the entry field differs from what was said, then, because segments are collected at regular times, the corresponding voice segment can be located according to the time order of collection and recognized again, which greatly improves the user experience. The embodiment performs speech recognition while voice recording is in progress, which reduces the time the user must wait, after recording is complete, for speech recognition to run and output a result, and it shortens recording latency without affecting the normal recognition result.

Referring to Fig. 8, the present invention provides another embodiment of a voice information processing system 1000, in which the control module includes a collection submodule, a recognition submodule, a judging submodule, and a deletion submodule; the judging submodule is communicatively connected to the collection submodule, the deletion submodule, and the recognition submodule;

The collection submodule, while the user is recording, collects the user's voice information according to the preset collection rule to obtain a current voice segment and sends the current voice segment to the judging submodule;

The judging submodule judges whether the current voice segment is a blank voice segment; if so, it sends the result that the current voice segment is a blank voice segment to the deletion submodule; otherwise, it sends the result that the current voice segment is not a blank voice segment to the recognition submodule;

The deletion submodule receives the judgment result sent by the judging submodule and deletes the current voice segment;

The collection submodule also obtains the next voice segment and sends it to the judging submodule, until the user ends the recording;

The recognition submodule also receives the next voice segment sent by the collection submodule and recognizes it according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording.

In this embodiment, invalid voice segments can be deleted, helping the user perform speech recognition more quickly. In the preprocessing before speech recognition, techniques based on, for example, the frequency and fluctuation of sound-wave changes while the user speaks can identify which parts of the user's voice information are valid speech and which are invalid, and remove the invalid parts, i.e., the blank voice segments. The embodiment performs speech recognition while voice recording is in progress, which reduces the time the user must wait, after recording is complete, for speech recognition to run and output a result, and it shortens recording latency without affecting the normal recognition result, improving the user experience.
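The division of work among the submodules of this embodiment can be sketched as two small classes. This is a structural illustration only, with the submodules reduced to pluggable callables; the class names `ControlModule` and `ProcessingModule` follow the text, while the field names, `on_segment`, `integrate`, and the callbacks are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Segment = Sequence[float]


@dataclass
class ControlModule:
    """Routes each collected segment through judging, then deletion or recognition."""
    is_blank: Callable[[Segment], bool]   # stands in for the judging submodule
    recognize: Callable[[Segment], str]   # stands in for the recognition submodule
    fragments: List[Tuple[int, str]] = field(default_factory=list)
    deleted: List[int] = field(default_factory=list)

    def on_segment(self, index: int, segment: Segment) -> None:
        """Called by the collection side once per interval with the current segment."""
        if self.is_blank(segment):
            self.deleted.append(index)    # deletion submodule: the segment is dropped
        else:
            self.fragments.append((index, self.recognize(segment)))


class ProcessingModule:
    """Sorts fragments by collection order and integrates them into one result."""

    def integrate(self, fragments: Sequence[Tuple[int, str]]) -> str:
        ordered = sorted(fragments, key=lambda pair: pair[0])
        return " ".join(text for _, text in ordered)
```

A short usage example, with toy callables in place of real detection and recognition:

```python
control = ControlModule(is_blank=lambda s: not any(s),
                        recognize=lambda s: "n%d" % len(s))
for i, seg in enumerate([[1, 2], [0, 0], [3]]):
    control.on_segment(i, seg)
result = ProcessingModule().integrate(control.fragments)
```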

Referring to Fig. 9, the present invention provides an example of a voice information processing method, including:

1. Recording starts.

2. While recording continues, the recording module intercepts the audio once every 2 s.

3. The file is intercepted.

4. The recording result is sent to the speech recognition module for voice dictation.

5. The voice dictation result is put into the FIFO queue.

6. The semantic recognition module continuously performs semantic recognition on the sentences in the queue, analyzing and understanding them.

7. According to the semantic recognition result, the corresponding instruction is sent or the result is answered, completing the whole speech recognition process.

In this embodiment, intercepting once every 2 s is not a fixed requirement; the interception interval can be set according to the user's preference and need. By using a FIFO (first-in, first-out) queue, multiple microphones can share one speech recognition engine, which improves engine utilization. For shorter recordings, the speech recognition module does not need to wait until the speech recognition start time is reached before it begins recognition, which reduces the waiting time for speech recognition. For longer recordings, the scheme not only effectively reduces the waiting time for voice recording and speech recognition but also allows complete speech recognition. In this scheme the recording interval is two seconds: a recording is taken every two seconds, the recording result is sent to the speech recognition module for recognition, and the recognition result is put into the FIFO queue, so that the successive recording results are all in the queue. The semantic recognition module then recognizes the spliced sentences, achieving fast speech recognition. The embodiment thus performs speech recognition while voice recording is in progress, which reduces the time the user must wait, after recording is complete, for speech recognition to run and output a result, and it shortens recording latency without affecting the normal recognition result, improving the user experience.
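The numbered steps of this example (dictate each intercepted piece, put the result into a FIFO queue, let a semantic stage consume the queue) can be sketched as a small producer-consumer pipeline. The sketch is illustrative only: `run_pipeline`, `dictate`, and `understand` are hypothetical names for the dictation and semantic recognition stages, and the segments are plain strings rather than audio files.

```python
import queue
import threading
from typing import Callable, List, Sequence


def run_pipeline(segments: Sequence[str],
                 dictate: Callable[[str], str],
                 understand: Callable[[str], str]) -> List[str]:
    """Dictation results enter a FIFO queue; a semantic stage drains it in order."""
    fifo: queue.Queue = queue.Queue()
    outputs: List[str] = []

    def semantic_stage() -> None:
        # Continuously take dictation results from the queue (step 6).
        while True:
            sentence = fifo.get()
            if sentence is None:  # sentinel: recording has ended
                return
            outputs.append(understand(sentence))

    worker = threading.Thread(target=semantic_stage)
    worker.start()
    for segment in segments:
        # Steps 4 and 5: dictate the intercepted piece and enqueue the text.
        fifo.put(dictate(segment))
    fifo.put(None)
    worker.join()
    return outputs
```

Because the queue decouples the two stages, dictation of the next 2 s piece does not have to wait for semantic analysis of the previous one to finish.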

It should be noted that the above embodiments may be freely combined as needed. The above is only the preferred embodiment of the present invention. It should be pointed out that those of ordinary skill in the art may make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A voice information processing method, characterized by comprising the steps of:

S100: during user recording, periodically collecting and recognizing the user's voice information to obtain speech recognition fragments;

S200: processing the speech recognition fragments to obtain a speech recognition result.

2. The voice information processing method according to claim 1, characterized in that step S100 comprises the steps of:

S110: during user recording, collecting the user's voice information according to a preset collection rule to obtain a current voice segment;

S120: recognizing the current voice segment according to a speech recognition library to obtain a speech recognition fragment;

S130: obtaining the next voice segment and performing steps S110-S130 until the user ends the recording.

3. The voice information processing method according to claim 2, characterized in that step S110 further comprises the steps of:

S111: judging whether the current voice segment is a blank voice segment; if so, performing step S112; otherwise, performing step S120;

S112: deleting the current voice segment and performing step S130.

4. The voice information processing method according to claim 1, characterized in that step S200 comprises the step of:

S210: sorting and integrating the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

5. The voice information processing method according to any one of claims 1-4, characterized in that step S200 further comprises the step of:

S220: outputting the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

6. A voice information processing system, characterized by comprising: a control module and a processing module; the processing module is communicatively connected to the control module;

the control module periodically collects and recognizes the user's voice information during user recording to obtain speech recognition fragments;

the processing module processes the speech recognition fragments recognized by the control module to obtain a speech recognition result.

7. The voice information processing system according to claim 6, characterized in that the control module comprises: a collection submodule and a recognition submodule; the collection submodule is communicatively connected to the recognition submodule;

the collection submodule, during user recording, collects the user's voice information according to a preset collection rule to obtain a current voice segment, and sends the current voice segment to the recognition submodule;

the recognition submodule receives the current voice segment sent by the collection submodule and recognizes the current voice segment according to a speech recognition library to obtain a speech recognition fragment;

the collection submodule further obtains the next voice segment and sends it to the recognition submodule, until the user ends the recording;

the recognition submodule further receives the next voice segment sent by the collection submodule and recognizes the next voice segment according to the speech recognition library to obtain a speech recognition fragment, until the user ends the recording.

8. The voice information processing system according to claim 7, characterized in that the control module further comprises: a judging submodule and a deletion submodule; the judging submodule is communicatively connected to the collection submodule, the deletion submodule, and the recognition submodule, respectively;

the judging submodule judges whether the current voice segment is a blank voice segment; if so, it sends the result that the current voice segment is a blank voice segment to the deletion submodule; otherwise, it sends the result that the current voice segment is not a blank voice segment to the recognition submodule;

the deletion submodule receives the judgment result sent by the judging submodule and deletes the current voice segment.

9. The voice information processing system according to claim 7, characterized in that the processing module comprises: a sorting submodule; the sorting submodule is communicatively connected to the control module;

the sorting submodule sorts and integrates the speech recognition fragments according to the time order of collection to obtain the speech recognition result.

10. The voice information processing system according to any one of claims 6-9, characterized in that the processing module further comprises: an output submodule, the output submodule being communicatively connected to the control module;

the output submodule outputs the speech recognition fragments according to the time order of collection to obtain the speech recognition result.