CN108733649A

CN108733649A - Method, apparatus and system for inserting speech recognition text into a notes document

Info

Publication number: CN108733649A
Application number: CN201810377108.0A
Authority: CN
Inventors: 卢闪明; 张亚鹏; 李行; 单衍景
Original assignee: BEIJING HUAXIA DIANTONG TECHNOLOGY Co Ltd
Current assignee: BEIJING HUAXIA DIANTONG TECHNOLOGY Co Ltd
Priority date: 2018-04-25
Filing date: 2018-04-25
Publication date: 2018-11-02
Anticipated expiration: 2038-04-25
Also published as: CN108733649B

Abstract

The embodiments of the present application disclose a method, apparatus and system for inserting speech recognition text into a notes document. The method comprises: receiving current text recognition information for a target audio sub-stream, wherein the current text recognition information includes text recognition content, a text recognition status identifier and a text length; and inserting the corresponding text recognition content into the corresponding position of the notes document according to the text recognition status identifier of the current text recognition information. In this scheme, the text recognition content returned by the recognition server is inserted into the notes document promptly, whether or not it has been confirmed. This addresses the problem that differences in speakers' speaking habits cannot be corrected uniformly, and also the problem that text recognition content is inserted into the notes document slowly when network or server issues delay confirmation of the recognized text, thereby greatly improving the user experience.

Description

Method, apparatus and system for inserting speech recognition text into a notes document

Technical field

The present application relates to the technical field of speech recognition, and in particular to a method, apparatus and system for inserting speech recognition text into a notes document.

Background

With the development of speech recognition technology, it has been applied ever more widely across industries. For example, in a court hearing or a meeting, if speech recognition can convert speech into text and insert that text into a notes document in real time, separated by role, the workload of court or meeting recorders is greatly reduced, omissions and transcription errors are avoided, and the recorder's work may even be replaced entirely, saving manpower.

During speech recognition, the recognition server obtains the audio stream of the role currently speaking, slices the audio stream repeatedly, analyzes it in combination with the surrounding context and semantics, and gradually generates the recognized text for the current audio stream. If the text recognition content in the text recognition information cannot be confirmed, the recognition server processes the current audio stream repeatedly until the content is confirmed, and only then is the text recognition content inserted into the notes document. During recognition, if the speaker talks too fast and pauses only briefly, the server's automatic sentence segmentation may go wrong (the audio of two utterances is treated as one). The recognition server then needs more comparison passes over the current audio stream, so it takes longer to obtain the recognized text in its final confirmed state, which ultimately leads to a poor user experience.

Summary of the invention

The purpose of the embodiments of the present application is to provide a method, apparatus and system for inserting speech recognition text into a notes document, solving the technical problem that existing approaches to notes document insertion give a poor user experience.

To achieve the above object, an embodiment of the present application provides a method for inserting speech recognition text into a notes document, comprising:

receiving current text recognition information for a target audio sub-stream, wherein the current text recognition information includes text recognition content, a text recognition status identifier and a text length;

inserting the corresponding text recognition content into the corresponding position of the notes document according to the text recognition status identifier of the current text recognition information.

Preferably, the step of inserting the corresponding text recognition content into the corresponding position of the notes document according to the text recognition status identifier of the current text recognition information comprises:

if the text recognition status identifier in the current text recognition information is an unconfirmed identifier and the text recognition status identifier in the previous text recognition information is an unconfirmed identifier, inserting the text recognition content of the current text recognition information into the corresponding position of the notes document according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information;

if the text recognition status identifier in the current text recognition information is an unconfirmed identifier and the text recognition status identifier in the previous text recognition information is a confirmed identifier, inserting the text recognition content of the current text recognition information into the corresponding position of the notes document;

if the text recognition status identifier in the current text recognition information is a confirmed identifier and the text recognition status identifier in the previous text recognition information is an unconfirmed identifier, inserting the text recognition content of the current text recognition information into the corresponding position of the notes document according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information;

if the text recognition status identifier in the current text recognition information is a confirmed identifier and the text recognition status identifier in the previous text recognition information is a confirmed identifier, inserting the text recognition content of the current text recognition information into the corresponding position of the notes document.

Preferably, the step of inserting the text recognition content of the current text recognition information into the corresponding position of the notes document according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information comprises:

comparing the portion of the current text recognition content that runs from its start position over a length equal to the text length in the previous text recognition information with the text recognition content in the previous text recognition information; if the two are identical, removing that portion from the current text recognition content and inserting the remaining content into the notes document immediately after the text recognition content of the previous text recognition information; if the two differ, deleting the text recognition content of the previous text recognition information from the notes document and inserting the text recognition content of the current text recognition information at the position that content occupied.
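The comparison step above can be illustrated with a minimal Python sketch. The function name `plan_insertion` and the `("append", ...)` / `("replace", ...)` return convention are assumptions made for illustration and do not appear in the application; the text length of the previous text recognition information is taken here to be the character count of its content.

```python
from typing import Tuple


def plan_insertion(prev_content: str, curr_content: str) -> Tuple[str, str]:
    """Decide how the current content should be applied over the previous one.

    Returns ("append", tail) when the current content starts with the previous
    content, so only the new tail has to be added after it in the document.
    Returns ("replace", curr_content) when the leading portion differs, so the
    previous content has to be deleted and the current content put in its place.
    """
    prev_len = len(prev_content)  # assumed equal to the reported text length
    if curr_content[:prev_len] == prev_content:
        return ("append", curr_content[prev_len:])
    return ("replace", curr_content)
```

For example, `plan_insertion("hello", "hello world")` yields an append of `" world"`, while `plan_insertion("hello", "yellow world")` yields a replacement, because the first five characters no longer match.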

Preferably, the step of inserting the text recognition content of the current text recognition information into the corresponding position of the notes document comprises:

if the text recognition status identifier in the previous text recognition information is an unconfirmed identifier and the text recognition status identifier in the current text recognition information is an unconfirmed identifier, obtaining the insertion position of the text recognition content in the current text recognition information from the bookmark used when the text recognition content in the previous text recognition information was inserted, inserting the text recognition content in the current text recognition information at that position, and updating the range of the bookmark;

if the text recognition status identifier in the previous text recognition information is a confirmed identifier, obtaining the insertion position of the text recognition content in the current text recognition information through a positioning function, inserting the text recognition content in the current text recognition information at that position, removing the shading of the text covered by the bookmark used when the text recognition content in the previous text recognition information was inserted, and creating a new bookmark that covers the region occupied by the text recognition content in the current text recognition information.
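The bookmark handling above can be sketched as follows, under a simplifying assumption: a bookmark is modelled as a plain `(start, end)` character range over the document. This is an illustrative model only, not the bookmark API of any particular word processor, and the function name `update_bookmark` is hypothetical.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # (start, end) offsets of the shaded, bookmarked text


def update_bookmark(bookmark: Optional[Range], prev_confirmed: bool,
                    new_start: int, new_end: int) -> Range:
    """Return the bookmark range after the current content has been inserted.

    If there is no bookmark yet, or the previous content was confirmed, the old
    bookmark (and its shading) is dropped and a new bookmark is created that
    covers only the newly inserted content.
    If the previous content was unconfirmed, the existing bookmark keeps its
    start and its range is extended to the end of the current content.
    """
    if bookmark is None or prev_confirmed:
        return (new_start, new_end)
    return (bookmark[0], new_end)
```

In this model the shading always follows the bookmark, so dropping the old range corresponds to removing the shading from text that has already been confirmed.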

To achieve the above object, an embodiment of the present application also provides a method for inserting speech recognition text into a notes document, comprising:

receiving an audio stream;

splitting the audio stream to obtain audio sub-streams;

determining, according to the text recognition status identifier in the previous text recognition information, the target audio sub-stream that currently needs to be recognized;

recognizing the target audio sub-stream to obtain current text recognition information, wherein the current text recognition information includes text recognition content, a text recognition status identifier and a text length;

sending the current text recognition information to the notes insertion end, so that the text recognition content in the current text recognition information is inserted into the notes document.

Preferably, the step of determining the target audio sub-stream that currently needs to be recognized comprises:

if the text recognition status identifier in the previous text recognition information is an unconfirmed identifier, the target audio sub-stream that currently needs to be recognized is the audio sub-stream corresponding to the previous text recognition information;

if the text recognition status identifier in the previous text recognition information is a confirmed identifier, the target audio sub-stream that currently needs to be recognized is the next audio sub-stream.
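The selection rule above can be expressed as a short Python sketch. It assumes that audio sub-streams are addressed by a zero-based index and that the status identifier uses the values 1 (confirmed) and 0 (unconfirmed) from the embodiment described later; the function name `next_target_index` is hypothetical.

```python
from typing import Optional

CONFIRMED = 1    # status identifier value for confirmed text
UNCONFIRMED = 0  # status identifier value for unconfirmed text


def next_target_index(prev_index: Optional[int], prev_status: Optional[int]) -> int:
    """Pick the index of the audio sub-stream to recognize next.

    With no previous result, recognition starts at the first sub-stream.
    An unconfirmed previous result means the same sub-stream is recognized
    again; a confirmed one means recognition moves on to the next sub-stream.
    """
    if prev_index is None:
        return 0
    if prev_status == UNCONFIRMED:
        return prev_index
    return prev_index + 1
```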

To achieve the above object, an embodiment of the present application provides an apparatus for inserting speech recognition text into a notes document, comprising:

a receiving unit, configured to receive current text recognition information for a target audio sub-stream, wherein the current text recognition information includes text recognition content, a text recognition status identifier and a text length;

a notes insertion unit, configured to insert the corresponding text recognition content into the corresponding position of the notes document according to the text recognition status identifier of the current text recognition information.

Preferably, the notes insertion unit comprises:

a first notes insertion module, configured to, if the text recognition status identifier in the current text recognition information is an unconfirmed identifier and the text recognition status identifier in the previous text recognition information is an unconfirmed identifier, insert the text recognition content of the current text recognition information into the corresponding position of the notes document according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information;

a second notes insertion module, configured to, if the text recognition status identifier in the current text recognition information is an unconfirmed identifier and the text recognition status identifier in the previous text recognition information is a confirmed identifier, insert the text recognition content of the current text recognition information into the corresponding position of the notes document;

a third notes insertion module, configured to, if the text recognition status identifier in the current text recognition information is a confirmed identifier and the text recognition status identifier in the previous text recognition information is an unconfirmed identifier, insert the text recognition content of the current text recognition information into the corresponding position of the notes document according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information;

a fourth notes insertion module, configured to, if the text recognition status identifier in the current text recognition information is a confirmed identifier and the text recognition status identifier in the previous text recognition information is a confirmed identifier, insert the text recognition content of the current text recognition information into the corresponding position of the notes document.

To achieve the above object, an embodiment of the present application also provides an apparatus for inserting speech recognition text into a notes document, comprising:

a receiving unit, configured to receive an audio stream;

a splitting unit, configured to split the audio stream to obtain audio sub-streams;

a target audio sub-stream confirmation unit, configured to determine, according to the text recognition status identifier in the previous text recognition information, the target audio sub-stream that currently needs to be recognized;

a recognition unit, configured to recognize the target audio sub-stream to obtain current text recognition information, wherein the current text recognition information includes text recognition content, a text recognition status identifier and a text length;

a sending unit, configured to send the current text recognition information to the notes insertion end, so that the text recognition content in the current text recognition information is inserted into the notes document.

Preferably, the target audio sub-stream confirmation unit comprises:

a first confirmation module, configured to determine, if the text recognition status identifier in the previous text recognition information is an unconfirmed identifier, that the target audio sub-stream that currently needs to be recognized is the audio sub-stream corresponding to the previous text recognition information;

a second confirmation module, configured to determine, if the text recognition status identifier in the previous text recognition information is a confirmed identifier, that the target audio sub-stream that currently needs to be recognized is the next audio sub-stream.

In summary, recognized text is returned slowly and the user experience is poor because of each speaker's individual speaking habits, the configuration of the network and the recognition server, and the practice of returning text recognition content only once recognition is confirmed. The present scheme is proposed on this basis: the text recognition content returned by the recognition server is inserted into the notes document promptly, whether or not it has been confirmed. This addresses the problem that differences in speakers' speaking habits cannot be corrected uniformly, and also the problem that text recognition content is inserted into the notes document slowly when network or server issues delay confirmation of the recognized text, greatly improving the user experience.

Description of the drawings

To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a system for inserting speech recognition text into a notes document proposed by an embodiment of the present application;

Fig. 2 is a first flowchart of a method for inserting speech recognition text into a notes document proposed by an embodiment of the present application;

Fig. 3 is a second flowchart of a method for inserting speech recognition text into a notes document proposed by an embodiment of the present application;

Fig. 4 is a first functional block diagram of an apparatus for inserting speech recognition text into a notes document proposed by an embodiment of the present application;

Fig. 5 is a second functional block diagram of an apparatus for inserting speech recognition text into a notes document proposed by an embodiment of the present application;

Fig. 6 is a schematic diagram of an electronic device proposed by an embodiment of the present application.

Detailed description

To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present application.

Fig. 1 is a schematic diagram of a system for inserting speech recognition text into a notes document proposed by an embodiment of the present application. The system comprises a notes insertion terminal and a speech recognition server. The speech recognition server obtains an audio stream from a voice collector, denoises it, and splits it into multiple audio sub-streams. The speech recognition server performs recognition on each audio sub-stream and builds the recognition result into text recognition information; whether or not the recognition content of the audio sub-stream has been confirmed, the server sends the text recognition information to the notes insertion terminal. If the recognition content of the audio sub-stream currently being recognized is confirmed, the speech recognition server proceeds to recognize the next audio sub-stream. If the recognition content of the audio sub-stream currently being recognized is in the unconfirmed state, the speech recognition server continues recognizing the current audio sub-stream. In either state, the speech recognition server returns the text recognition information to the notes insertion terminal. The notes insertion terminal inserts the text recognition content in the text recognition information returned by the speech recognition server into the corresponding position of the notes document according to the text recognition status.

This technical solution applies to scenarios in which only one role speaks at any given moment. In this solution, the notes insertion terminal creates a storage unit and stores in it the text recognition information returned by the recognition server. The storage unit stores information such as the speech content and the recognition status identifier. Each time recognized text is received in real time, the notes insertion terminal computes the text insertion position from the recognition status identifier in the storage unit, so that the recognized text of a single role's speech is inserted into the corresponding position of the notes document.

In this solution, the unconfirmed state of recognition content has the following meaning: the content is text recognition content generated while the speech recognition server performs recognition operations such as slice analysis on the acquired audio stream; it is part of the final text that recognition of the current audio sub-stream will produce, and individual segments of it still need to be corrected through further recognition. The confirmed state of recognition content has the following meaning: the content is text recognition content generated while the recognition server performs recognition operations such as slice analysis on the acquired audio stream, and it has been finally confirmed through contextual semantic analysis, so no further recognition is needed.

Based on the above description, an embodiment of the present application proposes a method for inserting speech recognition text into a notes document, as shown in Fig. 2. The method is applied to the notes insertion terminal. Specifically, the notes insertion terminal may be, for example, a desktop computer, tablet computer, notebook computer, smartphone, digital assistant, smart wearable device, shopping guide terminal or television set with data processing capability. Alternatively, the notes insertion terminal may be software that runs on such an electronic device. The method is applied to the situation in which multiple roles speak at the same time and may comprise the following steps:

Step 201): Receive current text recognition information for a target audio sub-stream, wherein the current text recognition information includes text recognition content, a text recognition status identifier and a text length.

In this solution, the text recognition content is the speech content in the current target audio sub-stream. The text recognition status identifier indicates whether the speech content recognized from the current target audio sub-stream still needs to be recognized again. In this embodiment, a text recognition status identifier of 1 indicates that the speech content recognized from the current audio sub-stream is final text that has been confirmed through contextual semantic analysis and needs no further recognition. A text recognition status identifier of 0 indicates that the speech content recognized from the current audio sub-stream is part of the final text that recognition of the current audio sub-stream will produce, and that individual segments of the text recognition content still need to be corrected through further recognition. The text length is the length of the speech content that the recognition server has recognized for the current target audio sub-stream.
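The three fields of the text recognition information can be pictured as a small record type. The following Python sketch is illustrative only: the class name `TextRecognitionInfo`, the field names, and the assumption that the text length equals the character count of the content are not taken from the application.

```python
from dataclasses import dataclass

CONFIRMED = 1    # final text, no further recognition needed
UNCONFIRMED = 0  # partial text that may still be corrected


@dataclass(frozen=True)
class TextRecognitionInfo:
    content: str  # text recognition content
    status: int   # text recognition status identifier (1 or 0)
    length: int   # text length of the content

    def __post_init__(self) -> None:
        # Reject records that do not match the format described in the text.
        if self.status not in (CONFIRMED, UNCONFIRMED):
            raise ValueError("status must be 0 (unconfirmed) or 1 (confirmed)")
        if self.length != len(self.content):
            raise ValueError("length must equal the length of content")

    @property
    def is_confirmed(self) -> bool:
        return self.status == CONFIRMED
```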

In this embodiment, the notes insertion terminal sets up a storage unit on the processor, dedicated to storing the current text recognition information returned by the speech recognition server. The storage unit is divided into multiple storage regions, each of which stores a different part of the text recognition information; for example, one region stores the text recognition status identifier and another stores the text recognition content. In this solution, the storage unit holds the previous text recognition information. When the notes insertion terminal receives the current text recognition information returned by the speech recognition server, it inserts the text recognition content of the current text recognition information into the notes document according to the text recognition status identifier in the current text recognition information and the text recognition status identifier in the previous text recognition information, deletes the previous text recognition information from the storage unit, and stores the current text recognition information in the storage unit. The notes insertion terminal also sets up a memory for storing the result of the insertion into the notes document. The previous text recognition information held in the storage unit is used to determine the insertion position accurately when text recognition content is inserted; in this solution, the memory and the storage unit store different content.

Step 202): Insert the corresponding text recognition content into the corresponding position of the notes document according to the text recognition status identifier of the current text recognition information.

In this solution, the step of inserting the corresponding text recognition content into the corresponding position of the notes document according to the text recognition status identifier of the current text recognition information includes:

In this solution, the step of inserting the text recognition content of the current text recognition information into the notes document according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information includes:

In this solution, the step of inserting the text recognition content of the current text recognition information into the corresponding position of the notes document includes:

Specifically, to describe in detail the process of inserting text into the notes document when a single role speaks, the processing flow of the notes insertion terminal is as follows:

1. Role A speaks for the first time. The audio collector captures role A's speech to obtain an audio stream; the recognition server splits the audio stream into audio sub-streams and performs recognition on the first audio sub-stream. The server returns, for the first time, text recognition content Sa1, text recognition status identifier Ta1 and text length L1. A storage unit corresponding to role A is created to store the text recognition content, text recognition status identifier and text length of the text recognition information. If Ta1=1, the text recognition content Sa1 currently returned by the recognition server is confirmed text; Sa1 and Ta1 from the returned text recognition information are stored in the storage unit, Sa1 is inserted into the notes document, and the terminal waits for the recognition server to return the next text recognition information. If Ta1=0, the text recognition content Sa1 currently returned by the recognition server is unconfirmed text; Sa1 and Ta1 from the returned text recognition information are stored in the storage unit, Sa1 is inserted into the notes document, and the terminal waits for the recognition server to return the next text recognition information.

2. The recognition server returns the next text recognition information, giving text recognition content Sa2, text recognition status identifier Ta2 and text length L2. If the text recognition status identifier of the previous text recognition information is Ta1=1, the current text recognition information was obtained by the recognition server recognizing the current audio sub-stream. If the text recognition status identifier of the previous text recognition information is Ta1=0, the current text recognition information was obtained by the recognition server recognizing the audio sub-stream corresponding to the previous text recognition information.

2.1 If Ta2=0, the recognized text in the text recognition information is unconfirmed. The content of length L1 at the start of text recognition content Sa2 is compared with text recognition content Sa1. If the two are identical, the remaining part S21 of Sa2, of length L2-L1, is obtained by string truncation and inserted into the notes document by appending it at the tail. If the two differ, Sa2 is not truncated but is inserted into the notes document directly in overwrite mode (Sa1 is deleted and Sa2 is inserted). The content of the storage unit is then updated: Sa1, Ta1 and L1 of the previous text recognition information are deleted, and Sa2, Ta2 and L2 of the current text recognition information are stored in the storage unit.

2.2 If Ta2=1, the recognized text in the text recognition information is confirmed, with text length L2 and text recognition content Sa2. The content of length L1 at the start of Sa2 is compared with Sa1. If the two are identical, the remaining part of Sa2, of length L2-L1, is obtained by string truncation and inserted into the notes document by appending it at the tail. If the two differ, Sa2 is not truncated but is inserted into the notes document directly in overwrite mode (Sa1 is deleted from the document and Sa2 is inserted). At the same time, Sa1, Ta1 and L1 of the previous text recognition information are deleted, and Sa2, Ta2 and L2 of the current text recognition information are stored in the storage unit.

3. The recognition server returns the third text recognition information and the notes insertion terminal receives it. Regardless of whether the text recognition status identifier Ta3 in the third text recognition information is 1 or 0: if the text recognition status identifier in the previous text recognition information is Ta2=0, the insertion logic is executed according to step 2.1 of the above processing flow; if the text recognition status identifier in the previous text recognition information is Ta2=1, text processing and insertion restart from step 1 of the above processing flow. Finally, Sa2, Ta2 and L2 of the previous text recognition information are deleted, and Sa3, Ta3 and L3 of the current text recognition information are stored in the storage unit.
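The single-role flow in steps 1 to 3 can be simulated with a short Python sketch. It is a simplified model under stated assumptions: the notes document is a plain string, the unconfirmed content always sits at the end of the document, and the text length is the character count of the content. The class name `NotesTerminal` and the method name `receive` are hypothetical.

```python
from typing import Optional, Tuple

CONFIRMED = 1
UNCONFIRMED = 0


class NotesTerminal:
    """Toy model of the notes insertion terminal for one speaking role."""

    def __init__(self) -> None:
        self.document = ""  # the notes document, modelled as a string
        # Storage unit: (content, status, length) of the previous result only.
        self.prev: Optional[Tuple[str, int, int]] = None

    def receive(self, content: str, status: int) -> str:
        """Apply one returned result and return the updated document."""
        if self.prev is None or self.prev[1] == CONFIRMED:
            # Step 1: nothing pending, so the content belongs to a new
            # sub-stream and is simply appended.
            self.document += content
        else:
            # Step 2.1 / 2.2: the previous result was unconfirmed, so the
            # current result covers the same sub-stream.
            prev_content, _, prev_len = self.prev
            if content[:prev_len] == prev_content:
                # Leading part unchanged: append only the new tail.
                self.document += content[prev_len:]
            else:
                # Leading part revised: overwrite the previous content.
                keep = len(self.document) - prev_len
                self.document = self.document[:keep] + content
        # Replace the stored previous result with the current one.
        self.prev = (content, status, len(content))
        return self.document
```

A possible run: receiving `"I obj"` (unconfirmed), then `"I object"` (unconfirmed), then `"I reject"` (confirmed), then `" the claim"` (unconfirmed) leaves the document reading `"I reject the claim"`. The second result extends the first, the third overwrites it, and the fourth is appended because the third was confirmed.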

While text recognition content is being inserted into the notes document, shading is added in real time to the text newly inserted for each role. The terminal detects and removes the shading of text in the document that was returned and inserted last time and whose recognition status is confirmed, ensuring that the shading follows the recognized text most recently inserted. Specifically, the step of inserting the corresponding text recognition content into the corresponding position of the notes document according to the text recognition status identifier of the current text recognition information further comprises:

After the text recognition content in the current text recognition information is inserted at the corresponding position, the text recognition status identifier of the previous text recognition information is checked. If the text recognition status identifier of the previous text recognition information is a confirmed identifier, the shading of the text recognition content of the previous text recognition information is removed, the text recognition content in the current text recognition information is inserted, and shading is set on it. If the text recognition status identifier of the previous text recognition information is an unconfirmed identifier, the text recognition content in the current text recognition information is inserted and shading is set on it.

So, on the basis of examples detailed above, the corresponding poster edge of each text identification content adds in the case of role A speech Set logic：

1. The insertion notes terminal receives the first text identification information of role A returned by the identification server, creates the storage unit corresponding to role A, and stores, in overwrite mode, the text identification content Sa1, text identification status indicator Ta1 and text size L1 of the current text identification information.

1.1 If Ta1 = 0, the insertion position of the text identification content of the current text identification information is computed with the positioning functions provided by the Word API, the text identification content is inserted into the notes document, a bookmark (Bookmark) B<a> corresponding to role A is created to contain Sa1, the corresponding shading color effect is added to the inserted text through the bookmark, and the flow proceeds to step 2 to continue executing the logic.

1.2 If Ta1 = 1, the insertion position of the text identification content of the current text identification information is computed with the positioning functions provided by the Word API, the text identification content is inserted into the notes document, a bookmark (Bookmark) B<a> corresponding to role A is created to contain Sa1, the corresponding shading color effect is added to the inserted text through the bookmark, and the flow proceeds to step 1 to continue executing the logic.

2. The identification server returns the second text identification information to the insertion notes terminal. The insertion notes terminal stores the text identification content Sa2, text identification status indicator Ta2 and text size L2 of the current text identification information in the storage unit in overwrite mode, deleting the text identification content Sa1, text identification status indicator Ta1 and text size L1 of the first text identification information.

2.1 If the text identification status indicator Ta2 = 0, the insertion position of the text identification content Sa2 is computed from bookmark B<a>, Sa2 is inserted into the notes document while the range of bookmark B<a> is updated, the corresponding shading is added for the updated bookmark B<a>, and the flow proceeds to step 3 to continue executing the logic.

2.2 If the text identification status indicator Ta2 = 1, the insertion position of the text identification content Sa2 is computed from bookmark B<a>, Sa2 is inserted into the notes document while the range of bookmark B<a> is updated, the corresponding shading is added for the updated bookmark B<a>, and the flow proceeds to step 3 to continue executing the logic.

3. The identification server returns the third text identification information to the insertion notes terminal. The insertion notes terminal stores the text identification content Sa3, text identification status indicator Ta3 and text size L3 of the current text identification information in the storage unit in overwrite mode, deleting the text identification content Sa2, text identification status indicator Ta2 and text size L2 of the second text identification information.

3.1 If Ta2 = 0 and Ta3 = 0, the operation of step 2.1 is executed.

3.2 If Ta2 = 0 and Ta3 = 1, the operation of step 2.2 is executed.

3.3 If Ta2 = 1 and Ta3 = 0, or Ta2 = 1 and Ta3 = 1, the insertion logic flow restarts from step 1, and the shading of the text contained in bookmark B<a> is cleared.
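The bookmark and shading bookkeeping described in steps 1 to 3.3 can be sketched as follows. This is a simplified Python model under stated assumptions, not the Word API: the document is a plain string, the bookmark B<a> is a `(start, end)` index pair, "shaded text" is whatever the bookmark currently covers, and new segments are assumed to start at the end of the document. The names `ShadedNotes`, `insert` and `shaded` are hypothetical. For brevity the pending segment is replaced wholesale, which yields the same final text as the compare-then-append-or-overwrite procedure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ShadedNotes:
    """Toy model of one role's notes text plus its bookmark (hypothetical)."""
    text: str = ""
    bookmark: Optional[Tuple[int, int]] = None  # half-open range covered by B<a>
    prev_confirmed: bool = True                 # no previous message counts as confirmed

    def insert(self, content: str, confirmed: bool) -> None:
        if self.bookmark is None or self.prev_confirmed:
            # Previous text was confirmed: its shading is cleared by abandoning the
            # old bookmark, and a new bookmark is created for the new segment.
            start = len(self.text)
        else:
            # Previous text was unconfirmed: reuse the bookmark to locate the
            # insertion position and replace the pending segment.
            start = self.bookmark[0]
            self.text = self.text[:start]
        self.text += content
        # Update the bookmark range so that shading follows the newest text.
        self.bookmark = (start, len(self.text))
        self.prev_confirmed = confirmed

    def shaded(self) -> str:
        """Return the text that currently carries shading."""
        if self.bookmark is None:
            return ""
        start, end = self.bookmark
        return self.text[start:end]
```

In this model only one segment per role is shaded at any time, which matches the stated goal that the shading follows the most recently inserted text.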

An embodiment of the present application provides another method for inserting speech recognition text into a notes document, as shown in Figure 3. This technical solution is applied to the speech recognition server. Specifically, the speech recognition server may be an electronic device with data computation, storage and network interaction functions, or software that runs on such an electronic device and supports data processing, storage and network interaction. The number of servers is not specifically limited in this embodiment: the server may be a single server, several servers, or a server cluster formed by several servers. The method for inserting speech recognition text into a notes document may include the following steps:

Step 301): Receive an audio stream.

In this embodiment, in a real-time acquisition scenario, a voice collector captures the user's speech, and the captured speech is passed through noise reduction to obtain the audio stream.

Step 302): Cut the audio stream to obtain audio sub-streams.

In this embodiment, to improve the precision of speech recognition, the audio stream fed back by the voice collector is cut: one long audio stream is divided into multiple short audio sub-streams. Because the amount of audio data handled in each recognition pass is not particularly large, recognition accuracy is greatly improved.

Step 303): According to the text identification status indicator in the previous text identification information, determine the target audio sub-stream that currently needs to be recognized.

In this technical solution, if the identification server cannot confirm the result of recognizing the audio information currently being processed, it still feeds the result back to the insertion notes terminal, which inserts the unconfirmed content into the notes document. The identification server then recognizes the same audio information again and, whether or not the new result is confirmed, again feeds it back to the insertion notes terminal, which inserts it into the notes document. Only when the text identification information for the audio information being processed is confirmed does the identification server go on to recognize the next piece of audio information. If the result of recognizing the current audio information is already confirmed, the result is fed back to the insertion notes terminal, the confirmed content is inserted into the notes document, and the identification server then recognizes the next piece of audio information.

In conventional techniques, if the identification server cannot confirm the result of recognizing the current audio information, the result is not fed back to the insertion notes terminal; only once the result is confirmed is it fed back and inserted. Insertion therefore takes longer than in this technical solution, which greatly degrades the user experience. This technical solution inserts every identification result into the notes document in real time, whether or not it is confirmed, which improves the user experience. Therefore, in this technical solution, the step of determining the target audio sub-stream that currently needs to be recognized includes:

Step 304): Recognize the target audio sub-stream to obtain the current text identification information, wherein the current text identification information includes text identification content, a text identification status indicator and a text size.

Step 305): Send the current text identification information to the insertion notes terminal, so that the text identification content in the current text identification information is inserted into the notes document.

By inserting into the document in real time, whether or not it has been confirmed, the text that the identification server generates while slicing, comparing, analyzing and computing the audio stream, this technical solution solves the problem of poor user experience caused by the slow return of identified text.
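The server-side flow of steps 301 to 305 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the applicant's implementation: the recognizer is passed in as a callable, fixed-size cutting stands in for whatever cutting strategy is actually used, and `max_attempts` is an added safeguard against endless retries that the source does not describe. The names `Recognition`, `split_stream` and `recognize_stream` are hypothetical.

```python
from typing import Callable, Iterator, List, NamedTuple, Sequence


class Recognition(NamedTuple):
    """One text identification information message (hypothetical container)."""
    content: str     # text identification content
    confirmed: bool  # text identification status indicator
    length: int      # text size


def split_stream(samples: Sequence, chunk_size: int) -> List[Sequence]:
    """Step 302: cut the audio stream into fixed-size audio sub-streams."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


def recognize_stream(
    sub_streams: Sequence[Sequence],
    recognize: Callable[[Sequence, int], Recognition],
    max_attempts: int = 10,  # assumption: cap on re-recognition passes
) -> Iterator[Recognition]:
    """Steps 303 to 305: emit every result, confirmed or not."""
    for sub in sub_streams:
        for attempt in range(max_attempts):
            result = recognize(sub, attempt)
            # Step 305: send the result to the insertion notes terminal at once.
            yield result
            if result.confirmed:
                # Confirmed: the next audio sub-stream becomes the target.
                break
            # Unconfirmed: the same audio sub-stream remains the target.
```

Because the generator yields before checking the status indicator, the consumer sees unconfirmed results as soon as they exist, which is the behaviour that distinguishes this scheme from waiting for confirmation.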

Figure 4 shows the first functional block diagram of a device for inserting speech recognition text into a notes document proposed by an embodiment of the present application. In practical applications, the device is the insertion notes terminal. It includes:

Receiving unit 401, configured to receive the current text identification information of the target audio sub-stream, wherein the current text identification information includes text identification content, a text identification status indicator and a text size;

Insertion notes unit 402, configured to insert the corresponding text identification content at the corresponding position in the notes document according to the text identification status indicator of the current text identification information.

In the present embodiment, the insertion notes unit includes：

Figure 5 shows the second functional block diagram of a device for inserting speech recognition text into a notes document proposed by an embodiment of the present application. In practical applications, the device is the identification server. It includes:

Receiving unit 501, configured to receive an audio stream;

Cutting unit 502, configured to cut the audio stream to obtain audio sub-streams;

Target audio sub-stream confirmation unit 503, configured to determine, according to the text identification status indicator in the previous text identification information, the target audio sub-stream that currently needs to be recognized;

Recognition unit 504, configured to recognize the target audio sub-stream to obtain the current text identification information, wherein the current text identification information includes text identification content, a text identification status indicator and a text size;

Transmission unit 505, configured to send the current text identification information to the insertion notes terminal, so that the text identification content in the current text identification information is inserted into the notes document.

Figure 6 is a schematic diagram of an electronic device proposed by an embodiment of the present application. The electronic device includes a memory a and a processor b. A computer program is stored in the memory a, and when the computer program is executed by the processor b, the following functions are implemented:

In this embodiment, for inserting the corresponding text identification content at the corresponding position in the notes document according to the text identification status indicator of the current text identification information, the following functions are implemented when the computer program is executed by the processor b:

If the text identification status indicator in the current text identification information is an unconfirmed indicator and the text identification status indicator in the previous text identification information is an unconfirmed indicator, the text identification content of the current text identification information is inserted at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

If the text identification status indicator in the current text identification information is a confirmed indicator and the text identification status indicator in the previous text identification information is an unconfirmed indicator, the text identification content of the current text identification information is inserted at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

In this embodiment, for inserting the text identification content of the current text identification information at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information, the following functions are implemented when the computer program is executed by the processor b:

An embodiment of the present application proposes another electronic device. The electronic device includes a memory a and a processor b. A computer program is stored in the memory a, and when the computer program is executed by the processor b, the following functions are implemented:

Receive audio stream；

The audio stream is subjected to cutting, obtains audio sub-stream；

In this embodiment, for determining the target audio sub-stream that currently needs to be recognized, the following functions are implemented when the computer program is executed by the processor b:

In this embodiment, the memory includes but is not limited to random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), cache (Cache), hard disk (Hard Disk Drive, HDD) or memory card (Memory Card).

In this embodiment, the processor can be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by that (micro)processor, or the form of logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, an embedded microcontroller, and so on.

The specific functions implemented by the electronic device, its memory and its processor provided in the embodiments of this specification can be explained by reference to the foregoing embodiments in this specification and can achieve the technical effects of those embodiments, so they are not repeated here.

In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors and switches) or a software improvement (an improvement to a method flow). However, with the development of technology, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development. The source code to be compiled must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present. Those skilled in the art will also appreciate that a hardware circuit implementing the logical method flow can readily be obtained simply by lightly logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.

Those skilled in the art also know that, in addition to implementing the client and server purely as computer-readable program code, the method steps can be logic-programmed so that the client and server implement the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a client or server can therefore be regarded as a hardware component, and the devices it contains for implementing various functions can be regarded as structures within the hardware component. Indeed, the devices for implementing various functions can be regarded both as software modules implementing the method and as structures within the hardware component.

As can be seen from the description of the above embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus the necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present application.

The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, for the embodiments of the client, reference may be made to the description of the foregoing method embodiments.

The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

Although the present application has been described through embodiments, those of ordinary skill in the art will appreciate that the application has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the application.

Claims

1. A method for inserting speech recognition text into a notes document, characterized by comprising:

Receiving the current text identification information of a target audio sub-stream, wherein the current text identification information includes text identification content, a text identification status indicator and a text size;

Inserting the corresponding text identification content at the corresponding position in the notes document according to the text identification status indicator of the current text identification information.

2. The method as described in claim 1, characterized in that the step of inserting the corresponding text identification content at the corresponding position in the notes document according to the text identification status indicator of the current text identification information includes:

If the text identification status indicator in the current text identification information is an unconfirmed indicator and the text identification status indicator in the previous text identification information is an unconfirmed indicator, inserting the text identification content of the current text identification information at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

If the text identification status indicator in the current text identification information is an unconfirmed indicator and the text identification status indicator in the previous text identification information is a confirmed indicator, inserting the text identification content of the current text identification information at the corresponding position in the notes document;

If the text identification status indicator in the current text identification information is a confirmed indicator and the text identification status indicator in the previous text identification information is an unconfirmed indicator, inserting the text identification content of the current text identification information at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

If the text identification status indicator in the current text identification information is a confirmed indicator and the text identification status indicator in the previous text identification information is a confirmed indicator, inserting the text identification content of the current text identification information at the corresponding position in the notes document.

3. The method as claimed in claim 2, characterized in that the step of inserting the text identification content of the current text identification information at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information includes:

Comparing the portion of the text identification content of the current text identification information that runs from the initial position to the position equal to the text size in the previous text identification information with the text identification content in the previous text identification information. If the comparison result is identical, that portion is removed from the text identification content of the current text identification information, and the remaining content is inserted in the notes document after the text identification content of the previous text identification information. If the comparison result is not identical, the text identification content of the previous text identification information is deleted, and the text identification content of the current text identification information is inserted in the notes document at the position of the text identification content of the previous text identification information.

4. The method as claimed in claim 3, characterized in that the step of inserting the text identification content of the current text identification information at the corresponding position in the notes document includes:

If the text identification status indicator in the previous text identification information is an unconfirmed indicator and the text identification status indicator in the current text identification information is an unconfirmed indicator, obtaining the insertion position of the text identification content in the current text identification information through the bookmark used when the text identification content in the previous text identification information was inserted, inserting the text identification content in the current text identification information at the corresponding position, and updating the range of the bookmark;

If the text identification status indicator in the previous text identification information is a confirmed indicator, obtaining the insertion position of the text identification content in the current text identification information through a positioning function, inserting the text identification content in the current text identification information at the corresponding position, removing the shading of the text contained in the bookmark used when the text identification content in the previous text identification information was inserted, and re-creating the corresponding bookmark, the bookmark covering the position range of the text identification content in the current text identification information.

5. A method for inserting speech recognition text into a notes document, characterized by comprising:

Receiving an audio stream;

Cutting the audio stream to obtain audio sub-streams;

Determining, according to the text identification status indicator in the previous text identification information, the target audio sub-stream that currently needs to be recognized;

Recognizing the target audio sub-stream to obtain the current text identification information, wherein the current text identification information includes text identification content, a text identification status indicator and a text size;

Sending the current text identification information to the insertion notes terminal, so that the text identification content in the current text identification information is inserted into the notes document.

6. The method as claimed in claim 5, characterized in that the step of determining the target audio sub-stream that currently needs to be recognized includes:

If the text identification status indicator in the previous text identification information is an unconfirmed indicator, the target audio sub-stream that currently needs to be recognized is the audio sub-stream corresponding to the previous text identification information;

If the text identification status indicator in the previous text identification information is a confirmed indicator, the target audio sub-stream that currently needs to be recognized is the next audio sub-stream.

7. A device for inserting speech recognition text into a notes document, characterized by comprising:

A receiving unit, configured to receive the current text identification information of a target audio sub-stream, wherein the current text identification information includes text identification content, a text identification status indicator and a text size;

An insertion notes unit, configured to insert the corresponding text identification content at the corresponding position in the notes document according to the text identification status indicator of the current text identification information.

8. The device as claimed in claim 7, characterized in that the insertion notes unit includes:

A first insertion notes module, configured to, if the text identification status indicator in the current text identification information is an unconfirmed indicator and the text identification status indicator in the previous text identification information is an unconfirmed indicator, insert the text identification content of the current text identification information at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

A second insertion notes module, configured to, if the text identification status indicator in the current text identification information is an unconfirmed indicator and the text identification status indicator in the previous text identification information is a confirmed indicator, insert the text identification content of the current text identification information at the corresponding position in the notes document;

A third insertion notes module, configured to, if the text identification status indicator in the current text identification information is a confirmed indicator and the text identification status indicator in the previous text identification information is an unconfirmed indicator, insert the text identification content of the current text identification information at the corresponding position in the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

A fourth insertion notes module, configured to, if the text identification status indicator in the current text identification information is a confirmed indicator and the text identification status indicator in the previous text identification information is a confirmed indicator, insert the text identification content of the current text identification information at the corresponding position in the notes document.

9. A device for inserting speech recognition text into a notes document, characterized by comprising:

A receiving unit, configured to receive an audio stream;

A target audio sub-stream confirmation unit, configured to determine, according to the text identification status indicator in the previous text identification information, the target audio sub-stream that currently needs to be recognized;

A recognition unit, configured to recognize the target audio sub-stream to obtain the current text identification information, wherein the current text identification information includes text identification content, a text identification status indicator and a text size;

A transmission unit, configured to send the current text identification information to the insertion notes terminal, so that the text identification content in the current text identification information is inserted into the notes document.

10. The device as claimed in claim 9, characterized in that the target audio sub-stream confirmation unit includes:

A first confirmation module, configured such that, if the text identification status indicator in the previous text identification information is an unconfirmed indicator, the target audio sub-stream that currently needs to be recognized is the audio sub-stream corresponding to the previous text identification information;

A second confirmation module, configured such that, if the text identification status indicator in the previous text identification information is a confirmed indicator, the target audio sub-stream that currently needs to be recognized is the next audio sub-stream.