CN108647190A

CN108647190A - Method, apparatus and system for inserting speech recognition text into a notes document

Info

Publication number: CN108647190A
Application number: CN201810377094.2A
Authority: CN
Inventors: 卢闪明; 张亚鹏; 李行; 单衍景
Original assignee: BEIJING HUAXIA DIANTONG TECHNOLOGY Co Ltd
Current assignee: BEIJING HUAXIA DIANTONG TECHNOLOGY Co Ltd
Priority date: 2018-04-25
Filing date: 2018-04-25
Publication date: 2018-10-12
Anticipated expiration: 2038-04-25
Also published as: CN108647190B

Abstract

The embodiments of the application disclose a method, apparatus and system for inserting speech recognition text into a notes document. The method includes: receiving current text recognition information for a target audio substream, where the current text recognition information includes text recognition content, a text recognition status flag, a role identifier and a text length; and inserting the corresponding text recognition content at the corresponding position in the notes document according to the text recognition status flag and role identifier of the current text recognition information. When multiple roles speak at the same time, the speech recognition server returns real-time recognized text for the different roles in interleaved order. The text recognition content is inserted into the notes document correctly, in order and separated by role whether or not it has been confirmed, rather than only once it reaches the confirmed state. This makes the dynamic insertion effect more visible, speeds up the insertion of recognized text into the document, and improves the user experience.

Description

Method, apparatus and system for inserting speech recognition text into a notes document

Technical field

This application relates to the technical field of speech recognition, and in particular to a method, apparatus and system for inserting speech recognition text into a notes document.

Background art

With the development of speech recognition technology, it is being applied ever more widely across industries. For example, in a court hearing or a meeting, if speech recognition can convert speech into text and insert that text into a notes document in real time and by role, the workload of court clerks or minute-takers is greatly reduced, omissions and transcription errors are avoided, and the recording staff may even be replaced entirely, saving manpower.

During speech recognition, the recognition server obtains the audio stream of the role currently speaking, slices the audio stream repeatedly, analyzes it together with its context and semantics, and gradually generates the recognized text for the current audio stream. If the text recognition content cannot be confirmed, the recognition server processes the current audio stream again, repeatedly, until the content is confirmed; only then is the text recognition content inserted into the notes document. If the speaker talks too fast and pauses only briefly, the server's automatic sentence segmentation can go wrong (the audio streams of two utterances are treated as one). The server then needs more comparison passes over the current audio stream, so the time needed to obtain the final confirmed text increases, which ultimately results in a poor user experience.

Summary of the invention

The purpose of the embodiments of the application is to provide a method, apparatus and system for inserting speech recognition text into a notes document, solving the technical problem that existing approaches to inserting text into a notes document give a poor user experience.

To achieve the above object, an embodiment of the application provides a method for inserting speech recognition text into a notes document, including:

Receiving current text recognition information for a target audio substream, where the current text recognition information includes text recognition content, a text recognition status flag, a role identifier and a text length;

Inserting the corresponding text recognition content at the corresponding position in the notes document according to the text recognition status flag and role identifier of the current text recognition information.

Preferably, the step of inserting the corresponding text recognition content at the corresponding position in the notes document according to the text recognition status flag and role identifier of the current text recognition information includes:

Obtaining the first text recognition information of a first role, obtaining the insertion position of its text recognition content through a mapping function, inserting the text recognition content at that position, and setting the first role as the line-feed role;

Obtaining the first text recognition information of a second role, obtaining the insertion position of its text recognition content using the bookmark of the current line-feed role as the reference, inserting the text recognition content at that position, and updating the line-feed role to the second role;

Obtaining the second text recognition information of the first role. If the text recognition status flag in the first role's first text recognition information is the unconfirmed flag, the insertion position of the text recognition content in the second text recognition information is obtained using as the reference the bookmark that was used when the text recognition content of the first role's previous text recognition information was inserted; the content is inserted at that position and the line-feed role is not updated, so the second role remains the line-feed role. If the status flag in the first text recognition information is the confirmed flag, the insertion position is obtained using the bookmark of the current line-feed role as the reference; the content is inserted at that position and the line-feed role is updated to the first role;

Obtaining the second text recognition information of the second role. If the current line-feed role is the first role and the status flag in the second role's first text recognition information is the confirmed flag, the insertion position of the text recognition content in the second role's second text recognition information is obtained using as the reference the bookmark that was used when the text recognition content of the first role's second text recognition information was inserted; the content is inserted at that position and the line-feed role is updated. If the current line-feed role is the first role and the status flag in the second role's first text recognition information is the unconfirmed flag, or if the current line-feed role is the second role, the insertion position is obtained using as the reference the bookmark that was used when the text recognition content of the second role's first text recognition information was inserted; the content is inserted at that position and the line-feed role is not updated;

Obtaining the first text recognition information of any other role, obtaining the insertion position of its text recognition content using the bookmark of the current line-feed role as the reference, inserting the text recognition content at that position, and updating the line-feed role to that other role.

Preferably, the step of inserting the text recognition content in the text recognition information of each role at the corresponding position includes:

For each role, if the text recognition status flag in the current text recognition information is the unconfirmed flag and the status flag in the previous text recognition information is the unconfirmed flag, the text recognition content of the current text recognition information is inserted at the corresponding position in the notes document according to the text length and text recognition content of the previous text recognition information and the text length and text recognition content of the current text recognition information;

For each role, if the text recognition status flag in the current text recognition information is the unconfirmed flag and the status flag in the previous text recognition information is the confirmed flag, the text recognition content of the current text recognition information is inserted at the corresponding position in the notes document;

For each role, if the text recognition status flag in the current text recognition information is the confirmed flag and the status flag in the previous text recognition information is the unconfirmed flag, the text recognition content of the current text recognition information is inserted at the corresponding position in the notes document according to the text length and text recognition content of the previous text recognition information and the text length and text recognition content of the current text recognition information;

For each role, if the text recognition status flag in the current text recognition information is the confirmed flag and the status flag in the previous text recognition information is the confirmed flag, the text recognition content of the current text recognition information is inserted at the corresponding position in the notes document.
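The four cases above can be summarized in a small lookup table. The following Python sketch is only an illustration: the names `STRATEGY`, `"compare_and_merge"` and `"insert"` are hypothetical, and the 0/1 values follow the status flag convention that the embodiment gives for unconfirmed and confirmed text.

```python
UNCONFIRMED = 0  # text may still be revised (flag value used in the embodiment)
CONFIRMED = 1    # text is final

# Keys are (previous status flag, current status flag) for the same role.
# Strategy names are hypothetical labels, not terms from the application.
STRATEGY = {
    (UNCONFIRMED, UNCONFIRMED): "compare_and_merge",
    (CONFIRMED, UNCONFIRMED): "insert",
    (UNCONFIRMED, CONFIRMED): "compare_and_merge",
    (CONFIRMED, CONFIRMED): "insert",
}


def insertion_strategy(previous_status: int, current_status: int) -> str:
    """Pick how the current content is written, per the four cases."""
    return STRATEGY[(previous_status, current_status)]
```

In this reading, the choice depends only on the previous status flag: an unconfirmed predecessor must be reconciled with the new content, while a confirmed predecessor is left untouched.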

Preferably, the step of inserting the text recognition content of the current text recognition information at the corresponding position in the notes document, according to the text length and text recognition content of the previous text recognition information and the text length and text recognition content of the current text recognition information, includes:

Taking the portion of the current text recognition content that runs from its start to the position equal to the text length of the previous text recognition information, and comparing it with the text recognition content of the previous text recognition information. If the two are identical, that leading portion is removed from the current text recognition content and the remaining content is inserted in the notes document immediately after the previous text recognition content. If they differ, the previous text recognition content is deleted and the current text recognition content is inserted at the position that the previous content occupied in the notes document.
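The comparison step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the notes document is modeled as a single string whose tail is the previous text recognition content, and the function names `merge_recognized_text` and `apply_to_line` are hypothetical.

```python
from typing import Tuple


def merge_recognized_text(previous: str, current: str) -> Tuple[str, str]:
    """Compare the leading part of `current` with `previous`.

    Returns ("append", remainder) when the leading part matches, or
    ("replace", current) when it does not.
    """
    prev_len = len(previous)  # text length of the previous information
    if current[:prev_len] == previous:
        return "append", current[prev_len:]
    return "replace", current


def apply_to_line(line: str, previous: str, current: str) -> str:
    """Apply the merge to a line that ends with `previous` (assumption)."""
    action, text = merge_recognized_text(previous, current)
    if action == "append":
        return line + text
    # Delete the previous content, then write the current content in its place.
    return line[: len(line) - len(previous)] + text
```

For example, with the line `"A: hello"`, previous content `"hello"` and current content `"hello world"`, only `" world"` is appended; if the previous content had been the misrecognized `"helo"`, it would be deleted and replaced by the current content.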

Preferably, the step of inserting the corresponding text recognition content at the corresponding position in the notes document according to the text recognition status flag and role identifier of the current text recognition information further includes:

When the text recognition content of the current text recognition information is inserted at the corresponding position, checking the text recognition status flag of the previous text recognition information. If that flag is the confirmed flag, the shading of the previous text recognition content is cleared, the current text recognition content is inserted, and shading is applied to it. If that flag is the unconfirmed flag, the current text recognition content is inserted and shading is applied to it.

Receiving an audio stream;

Splitting the audio stream to obtain audio substreams;

Determining, according to the text recognition status flag in the previous text recognition information, the target audio substream that currently needs to be recognized;

Recognizing the target audio substream to obtain current text recognition information, where the current text recognition information includes text recognition content, a text recognition status flag, a role identifier and a text length;

Sending the current text recognition information to the notes-insertion end, so that the text recognition content in the current text recognition information is inserted into the notes document.

Preferably, the step of determining the target audio substream that currently needs to be recognized includes:

If the text recognition status flag in the previous text recognition information is the unconfirmed flag, the target audio substream that currently needs to be recognized is the audio substream corresponding to the previous text recognition information;

If the text recognition status flag in the previous text recognition information is the confirmed flag, the target audio substream that currently needs to be recognized is the next audio substream.
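The selection rule for the target audio substream can be sketched as below. This Python fragment is illustrative only: substreams are represented by integer indices, and `next_target_index` and `trace_targets` are hypothetical helper names.

```python
from typing import List

UNCONFIRMED = 0
CONFIRMED = 1


def next_target_index(current_index: int, last_status: int) -> int:
    """Stay on the same substream until its text is confirmed."""
    if last_status == CONFIRMED:
        return current_index + 1
    return current_index


def trace_targets(statuses: List[int]) -> List[int]:
    """Return which substream index each recognition pass worked on,
    given the status flag that each pass produced."""
    index = 0
    trace = []
    for status in statuses:
        trace.append(index)
        index = next_target_index(index, status)
    return trace
```

With the status sequence `[0, 0, 1, 1, 0, 1]`, the passes work on substreams `[0, 0, 0, 1, 2, 2]`: the first substream needs three passes, the second one pass, and the third two passes.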

To achieve the above object, an embodiment of the application provides an apparatus for inserting speech recognition text into a notes document, including:

A receiving unit, configured to receive current text recognition information for a target audio substream, where the current text recognition information includes text recognition content, a text recognition status flag, a role identifier and a text length;

A notes-insertion unit, configured to insert the corresponding text recognition content at the corresponding position in the notes document according to the text recognition status flag and role identifier of the current text recognition information.

Preferably, the notes-insertion unit includes:

A first-role first text recognition information insertion module, configured to obtain the first text recognition information of the first role, obtain the insertion position of its text recognition content through a mapping function, insert the text recognition content at that position, and set the first role as the line-feed role;

A second-role first text recognition information insertion module, configured to obtain the first text recognition information of the second role, obtain the insertion position of its text recognition content using the bookmark of the current line-feed role as the reference, insert the text recognition content at that position, and update the line-feed role to the second role;

A first-role second text recognition information insertion module, configured to obtain the second text recognition information of the first role. If the text recognition status flag in the first role's first text recognition information is the unconfirmed flag, the insertion position of the text recognition content in the second text recognition information is obtained using as the reference the bookmark that was used when the text recognition content of the first role's previous text recognition information was inserted; the content is inserted at that position and the line-feed role is not updated, so the second role remains the line-feed role. If the status flag in the first text recognition information is the confirmed flag, the insertion position is obtained using the bookmark of the current line-feed role as the reference; the content is inserted at that position and the line-feed role is updated to the first role;

A second-role second text recognition information insertion module, configured to obtain the second text recognition information of the second role. If the current line-feed role is the first role and the status flag in the second role's first text recognition information is the confirmed flag, the insertion position of the text recognition content in the second role's second text recognition information is obtained using as the reference the bookmark that was used when the text recognition content of the first role's second text recognition information was inserted; the content is inserted at that position and the line-feed role is updated. If the current line-feed role is the first role and the status flag in the second role's first text recognition information is the unconfirmed flag, or if the current line-feed role is the second role, the insertion position is obtained using as the reference the bookmark that was used when the text recognition content of the second role's first text recognition information was inserted; the content is inserted at that position and the line-feed role is not updated;

An other-role first text recognition information insertion module, configured to obtain the first text recognition information of any other role, obtain the insertion position of its text recognition content using the bookmark of the current line-feed role as the reference, insert the text recognition content at that position, and update the line-feed role to that other role.

A receiving unit, configured to receive an audio stream;

A splitting unit, configured to split the audio stream to obtain audio substreams;

A target audio substream determination unit, configured to determine, according to the text recognition status flag in the previous text recognition information, the target audio substream that currently needs to be recognized;

A recognition unit, configured to recognize the target audio substream to obtain current text recognition information, where the current text recognition information includes text recognition content, a text recognition status flag, a role identifier and a text length;

A sending unit, configured to send the current text recognition information to the notes-insertion end, so that the text recognition content in the current text recognition information is inserted into the notes document.

Therefore, compared with the prior art, in the present technical solution, when multiple roles speak at the same time and the speech recognition server returns real-time recognized text for the different roles in interleaved order, the text recognition content is inserted into the notes document correctly, in order and separated by role whether or not it has been confirmed, rather than only once it reaches the confirmed state. This makes the dynamic insertion effect more visible, speeds up the insertion of recognized text into the document, and improves the user experience. In addition, the dynamically added shading color effect broadens the usage scenarios of real-time speech recognition text insertion and expands the range of application of the technology.

Description of the drawings

To describe the technical solutions in the embodiments of the application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments of the application, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a system for inserting speech recognition text into a notes document according to an embodiment of the application;

Fig. 2 is the first flowchart of a method for inserting speech recognition text into a notes document according to an embodiment of the application;

Fig. 3 is the second flowchart of a method for inserting speech recognition text into a notes document according to an embodiment of the application;

Fig. 4 is the first functional block diagram of an apparatus for inserting speech recognition text into a notes document according to an embodiment of the application;

Fig. 5 is the second functional block diagram of an apparatus for inserting speech recognition text into a notes document according to an embodiment of the application;

Fig. 6 is a schematic diagram of an electronic device according to an embodiment of the application.

Detailed description

To help those skilled in the art better understand the technical solutions of the application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the application without creative effort shall fall within the scope of protection of the application.

Fig. 1 is a schematic diagram of a system for inserting speech recognition text into a notes document according to an embodiment of the application. The system includes a notes-insertion terminal and a speech recognition server. The speech recognition server obtains an audio stream from a voice collector, applies noise processing to it, and splits it into multiple audio substreams. The server performs recognition on each audio substream and builds the recognition result into text recognition information; whether or not the recognized content of the audio substream has been confirmed, the server sends the text recognition information to the notes-insertion terminal. If the recognized content of the audio substream currently being recognized has been confirmed, the server can move on to recognizing the next audio substream. If the recognized content is still in the unconfirmed state, the server continues recognizing the current audio substream. In short, whether the recognized content of an audio substream is in the unconfirmed or the confirmed state, the speech recognition server returns the text recognition information to the notes-insertion terminal. The notes-insertion terminal inserts the text recognition content of the returned text recognition information at the corresponding position in the notes document according to the role identifier and the text recognition status flag.

In this technical solution, every recognition result carries the unique identifier that the recognition server uses to distinguish the role. The notes-insertion terminal dynamically creates and maintains a voice content storage unit for each role according to the role identifier; the storage unit stores information such as the voice content and the recognition status flag. Each time recognized text is received in real time, the terminal uses the role identifier carried with it to look up the recognition status flag in that role's storage unit, calculates the text insertion position and distinguishes the role. In this way the recognized text is inserted correctly and in order at the corresponding position in the notes document, whether multiple roles speak at the same moment or a single role speaks.
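The per-role storage described above might be modeled as a dictionary keyed by role identifier. The Python sketch below is an assumption-laden illustration: the class name `RoleStore` and its methods are hypothetical, and each storage unit is reduced to a pair of voice content and status flag.

```python
from typing import Dict, Optional, Tuple

Record = Tuple[str, int]  # (voice content, recognition status flag)


class RoleStore:
    """Hypothetical per-role storage, created lazily on first use."""

    def __init__(self) -> None:
        self._units: Dict[str, Record] = {}

    def previous(self, role_id: str) -> Optional[Record]:
        # None means this role has not spoken before.
        return self._units.get(role_id)

    def save(self, role_id: str, content: str, status: int) -> None:
        # Overwrites the role's previous record with the current one.
        self._units[role_id] = (content, status)
```

Because each role has its own record, interleaved results from different speakers do not overwrite one another, and the terminal can always look up the status flag of the same role's previous result.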

In this technical solution, the unconfirmed state of recognized content means the following: the text recognition content was generated while the speech recognition server was performing recognition operations such as slice analysis on the acquired audio stream; it is part of the final text that recognition of the current audio substream will produce, and individual fields in it still need to be corrected through re-recognition. The confirmed state means the following: the text recognition content was generated during such recognition operations and has been finally confirmed through contextual semantic analysis, so no further recognition is required.

Based on the foregoing description, an embodiment of the application proposes a method for inserting speech recognition text into a notes document, as shown in Fig. 2. The technical solution is applied to the notes-insertion terminal, which may be, for example, a desktop computer, tablet computer, laptop, smartphone, digital assistant, smart wearable device, shopping guide terminal, television or other device with data processing capability. Alternatively, the client may be software that runs on such an electronic device. The method applies to the situation where multiple roles speak at the same time and may include the following steps:

Step 201): Receive current text recognition information for a target audio substream, where the current text recognition information includes text recognition content, a text recognition status flag, a role identifier and a text length.

In this technical solution, the text recognition content is the voice content of the current target audio substream. The text recognition status flag indicates whether the voice content recognized from the current target audio substream needs no further recognition. In this embodiment, a status flag of 1 indicates that the voice content recognized from the current audio substream is final text that has been confirmed through contextual semantic analysis and needs no further recognition. A status flag of 0 indicates that the recognized voice content is part of the final text that recognition of the current audio substream will produce, and that individual fields in it still need to be corrected through re-recognition. The role identifier is the identifier that the recognition server sets for each role, so that the voice content of different roles can be attributed to the corresponding role. The text length is the length of the voice content that the recognition server recognized from the current target audio substream.
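The four fields of the text recognition information can be pictured as a small record type. The Python sketch below is illustrative: the class and field names are hypothetical, and measuring the text length in characters is an assumption, since the unit is not specified here.

```python
from dataclasses import dataclass

UNCONFIRMED = 0  # individual fields may still be corrected by re-recognition
CONFIRMED = 1    # final text, no further recognition needed


@dataclass(frozen=True)
class TextRecognitionInfo:
    content: str   # voice content of the target audio substream
    status: int    # CONFIRMED or UNCONFIRMED
    role_id: str   # identifier set by the recognition server for the role
    length: int    # length of `content` (characters, by assumption)

    @property
    def is_confirmed(self) -> bool:
        return self.status == CONFIRMED


def make_info(content: str, status: int, role_id: str) -> TextRecognitionInfo:
    """Build a record, rejecting status flags other than 0 and 1."""
    if status not in (UNCONFIRMED, CONFIRMED):
        raise ValueError(f"unknown status flag: {status}")
    return TextRecognitionInfo(content, status, role_id, len(content))
```

A call such as `make_info("hello", 1, "A")` yields a confirmed record of length 5 for role A.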

In this embodiment, the notes-insertion terminal provides a storage unit on its processor, dedicated to storing the text recognition information returned by the speech recognition server. The storage unit is divided into several storage regions, each storing a different part of the text recognition information. In this technical solution the storage unit holds the previous text recognition information. When the terminal receives the current text recognition information returned by the speech recognition server, it inserts the text recognition content of the current text recognition information into the corresponding notes document according to the current and previous text recognition information, then deletes the previous text recognition information from the storage unit and stores the current text recognition information there. The terminal also has a memory for storing the result of insertion into the notes document; the previous text recognition information held in the storage unit is used to determine the insertion position accurately when text recognition content is inserted.

Step 202): Insert the corresponding text recognition content at the corresponding position in the notes document according to the text recognition status flag and role identifier of the current text recognition information.

In this technical solution, this insertion step proceeds as follows.

Specifically, to describe in detail how text is inserted into the notes document when multiple roles speak at the same time, take three roles speaking in some application scenario as an example: two of the roles speak simultaneously and the third role speaks at some point. The processing flow of the notes-insertion terminal is:

1. Roles A and B speak simultaneously for the first time.

2. The first recognition of role A's speech returns text recognition content Sa1 and text recognition status identifier Ta1. Sa1 is inserted and A is set as the line-break role (LastRole=A).

3. The first recognition of role B's speech returns text recognition content Sb1 and text recognition status identifier Tb1. Taking the bookmark of the current line-break role (LastRole) A as the reference, the insertion position of role B's text recognition content Sb1 in the document is calculated (the line following role A). Role B's text is inserted and the line-break role is updated to B (LastRole=B).

4. The second recognition of role A's speech returns text recognition content Sa2 and text recognition status identifier Ta2.

4.1 If Ta1=0, the insertion position is calculated from role A's bookmark and the text is inserted. The line-break role is not updated (LastRole=B).

4.2 If Ta1=1, taking the bookmark of the current line-break role (LastRole) B as the reference, the insertion position of role A's text recognition content Sa2 in the document is calculated (the line following role B). Role A's text is inserted and the line-break role is updated to A (LastRole=A).

5. The second recognition of role B's speech returns text recognition content Sb2 and text recognition status identifier Tb2.

5.1 If LastRole==A and Tb1=1, the insertion position of text recognition content Sb2 is calculated with role A's bookmark as the reference, and the line-break role is updated to role B (LastRole=B).

5.2 If LastRole==A and Tb1=0, the insertion position of text recognition content Sb2 is calculated with role B's bookmark as the reference, and the line-break role does not need to change.

5.3 If LastRole==B, the insertion position of text recognition content Sb2 is calculated with role B's bookmark as the reference, and the line-break role does not need to be replaced.

6. If a new role C speaks for the first time, then regardless of whether the current line-break role is A or B, the insertion position of role C's returned text is calculated with the bookmark of the current line-break role as the reference, and the line-break role is updated to C (LastRole=C).
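The line-break rules in steps 1 to 6 reduce to a small decision about whose bookmark anchors the insertion and who becomes the line-break role afterwards. The function below is a sketch of that decision only, not of the document editing itself; the name `choose_anchor` and the use of `None` for "this role has not spoken before" are assumptions made for illustration.

```python
from typing import Optional, Tuple


def choose_anchor(
    role: str,
    prev_confirmed: Optional[bool],
    last_role: Optional[str],
) -> Tuple[Optional[str], str]:
    """Return (role whose bookmark anchors the insertion, new LastRole).

    prev_confirmed is the status of this role's previous return
    (True for 1, False for 0), or None if the role has not spoken yet.
    """
    if last_role is None:
        # Step 2: very first text in the document, no bookmark exists yet.
        return None, role
    if last_role == role:
        # Step 5.3: the speaker already is the line-break role.
        return role, role
    if prev_confirmed is None or prev_confirmed:
        # Steps 3, 4.2, 5.1 and 6: start on the line after the line-break role.
        return last_role, role
    # Steps 4.1 and 5.2: keep refining this role's own unconfirmed text.
    return role, last_role
```

For example, `choose_anchor("A", True, "B")` corresponds to step 4.2: the text is placed relative to B's bookmark and A becomes the line-break role.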

In this embodiment, the step of inserting the text recognition content in each role's text recognition information at the corresponding position includes:

In this embodiment, the step of inserting the text recognition content of the current text recognition information at the corresponding position of the notes document, according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information, includes:

Specifically, for each role among the multiple roles, the insertion logic flow is:

1. Role A speaks for the first time. The audio collection device collects role A's speech to obtain an audio stream. The recognition server cuts the audio stream into audio substreams and recognizes the first audio substream, returning for the first time text recognition content Sa1, text recognition status identifier Ta1 and text length L1. A storage unit corresponding to role A is created to store the text recognition content, text recognition status identifier and text length of the text recognition information. If Ta1=1, the recognition server regards the currently returned text recognition content Sa1 as confirmed text. Sa1 and Ta1 from the currently returned text recognition information are stored in the storage unit, Sa1 is inserted into the notes document, and the terminal waits for the recognition server to return the next text recognition information. If Ta1=0, the recognition server regards the currently returned text recognition content Sa1 as unconfirmed text. Sa1 and Ta1 are likewise stored in the storage unit, Sa1 is inserted into the notes document, and the terminal waits for the recognition server to return the next text recognition information.

2. The recognition server returns another text recognition information, yielding text recognition content Sa2, text recognition status identifier Ta2 and text length L2. If the status identifier of the previous text recognition information is Ta1=1, the current text recognition information was obtained by recognizing the current audio substream. If the status identifier of the previous text recognition information is Ta1=0, the current text recognition information was obtained by recognizing again the audio substream that corresponds to the previous text recognition information.

2.1 If Ta2=0, the recognition status in the text recognition information is unconfirmed. The first L1 characters of text recognition content Sa2, counted from its start position, are compared with text recognition content Sa1. If they are identical, the remaining L2-L1 characters of Sa2, denoted S21, are obtained by string truncation, and S21 is inserted into the notes document by appending it at the tail. If they differ, Sa2 is not truncated but is inserted into the notes document directly in overwrite mode (Sa1 is deleted and Sa2 is inserted). The storage unit is then updated: Sa1, Ta1 and L1 of the previous text recognition information are deleted, and Sa2, Ta2 and L2 of the current text recognition information are stored in the storage unit.

2.2 If Ta2=1, the recognition status in the text recognition information is confirmed, the text length is L2 and the text recognition content is Sa2. The first L1 characters of Sa2, counted from its start position, are compared with text content Sa1. If they are identical, the remaining L2-L1 characters of Sa2 are obtained by string truncation and inserted into the notes document by appending them at the tail. If they differ, Sa2 is not truncated but is inserted into the notes document directly in overwrite mode (Sa1 is deleted from the document and Sa2 is inserted). At the same time, Sa1, Ta1 and L1 of the previous text recognition information are deleted, and Sa2, Ta2 and L2 of the current text recognition information are stored in the storage unit.

3. The recognition server returns a third text recognition information, which the notes insertion terminal receives. Regardless of whether the status identifier Ta3 in the third text recognition information is 1 or 0: if the status identifier in the previous text recognition information is Ta2=0, the insertion logic is executed according to step 2.1 of the terminal processing flow above; if the status identifier in the previous text recognition information is Ta2=1, text processing and insertion restart from step 1 of the terminal processing flow above. Finally, Sa2, Ta2 and L2 of the previous text recognition information are deleted, and Sa3, Ta3 and L3 of the current text recognition information are stored in the storage unit.
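The per-role flow above comes down to a prefix comparison between the previous and the current return. The sketch below is one reading of steps 1 to 3, with assumptions stated plainly: the function name `plan_insertion` and the mode labels are hypothetical, Python's `len()` stands in for the text length field L1, and a confirmed previous return is treated as the start of a new segment (the "restart from step 1" case).

```python
from typing import Optional, Tuple


def plan_insertion(
    prev: Optional[Tuple[str, bool]], current: str
) -> Tuple[str, str]:
    """Decide how the current text should be written into the document.

    prev is (content, confirmed) of the role's previous return, or None.
    Returns (mode, text) where mode is "new", "append" or "overwrite".
    """
    if prev is None or prev[1]:
        # No earlier text, or the earlier text was confirmed:
        # the current text belongs to a new audio substream.
        return "new", current
    prev_text = prev[0]
    l1 = len(prev_text)
    if current[:l1] == prev_text:
        # Same leading L1 characters: only the L2-L1 tail is appended.
        return "append", current[l1:]
    # The recognizer revised earlier characters: replace the old text.
    return "overwrite", current
```

If the server first returns "hello" unconfirmed and then "hello world", only " world" is appended; if it instead returns "hallo" first, the whole earlier text is replaced.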

While text recognition content is being inserted into the notes document, a shading color is added to the new text that each role inserts in real time, and the shading color is removed from previously returned and inserted text recognition content whose recognition status is confirmed, ensuring that the shading color follows the most recently inserted recognized text. Specifically, the step of inserting the corresponding text recognition content at the corresponding position of the notes document according to the text recognition status identifier and the role identifier of the current text recognition information further includes:

In this embodiment, the logic flow for inserting recognized text in real time when multiple roles speak simultaneously is:

1. Roles A, B and C speak simultaneously.

2. After the speech of role A/B/C is processed, the first recognition returns text recognition content Sa1/Sb1/Sc1 and role identifier A/B/C.

2.1 If the text recognition status identifier Ta1/Tb1/Tc1=0, the insertion position PA1/PB1/PC1 is calculated according to role identifier A/B/C using the positioning function provided by WordAPI. The shading color is added after the text recognition content is inserted, and the content stored in each role's storage unit is updated.

2.2 If the text recognition status identifier Ta1/Tb1/Tc1=1, the operation of step 2.1 is executed, and the insertion flow for the corresponding role's next text recognition information starts directly from step 3.

3. After the speech of role A/B/C is processed, the second recognition returns text recognition content Sa2/Sb2/Sc2 and role identifier A/B/C.

3.1 If the text recognition status identifier Ta2/Tb2/Tc2=0, the insertion position PA2/PB2/PC2 is obtained from the bookmark of the corresponding role according to role identifier A/B/C. The shading color is added after the text recognition content is inserted, and the content stored in each role's storage unit is updated.

3.2 If the text recognition status identifier Ta1=1, the operation of step 3.1 is executed, and the insertion flow for the corresponding role's next text recognition information starts directly from step 3.

4. After the speech of role A/B/C is processed, the third recognition returns text recognition content Sa3/Sb3/Sc3 and role identifier A/B/C.

If the text recognition status identifier Ta3/Tb3/Tc3 is still 0, the flow of step 3 continues to be executed until Ta3/Tb3/Tc3 is 1, which completes the text insertion for the first speech of the multiple roles.

On this basis, the logic for adding the shading color that corresponds to each role's text recognition content when multiple roles speak simultaneously is:

1. Roles A, B and C speak simultaneously for the first time.

2. The first recognition of role A's speech returns text recognition content Sa1 and text recognition status identifier Ta1. The text is inserted into the notes document and its shading color is set.

3. The first recognition of role B's speech returns text recognition content Sb1 and text recognition status identifier Tb1.

3.1 If the text recognition status identifier Ta1=0, text recognition content Sb1 is inserted normally and the shading color for Sb1 is set.

3.2 If the text recognition status identifier Ta1=1, the shading color of text recognition content Sa1 is removed, text recognition content Sb1 is inserted, and the shading color for Sb1 is set.

4. The second recognition of role A's speech returns text recognition content Sa2 and text recognition status identifier Ta2.

4.1 If the text recognition status identifier Tb1=0, the insertion of text recognition content Sa2 is calculated normally (Sa2 is either appended to the tail of Sa1 with the shading color added, or replaces Sa1 entirely with the shading color added).

4.2 If the text recognition status identifier Tb1=1, the shading color of text recognition content Sb1 is removed and the insertion of text recognition content Sa2 is calculated normally (Sa2 is either appended to the tail of Sa1 with the shading color added, or replaces Sa1 entirely with the shading color added).

Each time the text recognition content of a new text recognition information is inserted, the status identifier of the previous text recognition information is checked. If it indicates confirmed, the shading color of the text recognition content in the previous text recognition information is removed and the text recognition content in the current text recognition information is inserted normally. If it indicates unconfirmed, the text recognition content in the current text recognition information is inserted normally without removing the shading color of the text recognition content in the previous text recognition information. When the task is closed, the system removes the bookmarks and shading colors of all roles in the text.
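The shading rule summarized above depends only on the status of the text that was inserted immediately before the current one. The class below is a sketch of that bookkeeping under assumptions: `ShadingTracker` is a hypothetical name, segments are identified by plain labels such as "Sa1", and real document formatting calls are deliberately left out.

```python
from typing import Dict, Optional, Tuple


class ShadingTracker:
    """Tracks which inserted segments currently carry the shading color."""

    def __init__(self) -> None:
        self.shaded: Dict[str, bool] = {}
        # (label, confirmed) of the most recent insertion, if any.
        self._last: Optional[Tuple[str, bool]] = None

    def insert(self, label: str, confirmed: bool) -> None:
        # A confirmed predecessor loses its shading; an unconfirmed one keeps it.
        if self._last is not None and self._last[1]:
            self.shaded[self._last[0]] = False
        # Newly inserted text is always shaded.
        self.shaded[label] = True
        self._last = (label, confirmed)

    def close(self) -> None:
        # Closing the task clears the shading of every segment.
        for label in self.shaded:
            self.shaded[label] = False
```

With insertions Sa1 (confirmed), Sb1 (unconfirmed) and Sa2, the tracker clears Sa1 when Sb1 arrives and leaves Sb1 shaded when Sa2 arrives, which matches steps 3.2 and 4.1.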

An embodiment of the present application provides another method for inserting speech recognition text into a notes document, as shown in Figure 3. This technical solution is applied to a speech recognition server. Specifically, the speech recognition server may be an electronic device with data computation, storage and network interaction functions, or software that runs on such an electronic device and supports data processing, storage and network interaction. The number of servers is not specifically limited in this embodiment. The server may be one server, several servers, or a server cluster formed by several servers. The method for inserting speech recognition text into a notes document may include the following steps:

Step 301): Receive an audio stream.

In this embodiment, a voice collector captures the user's voice in real time in the application scenario, and the captured voice undergoes noise reduction to obtain the audio stream.

Step 302): Cut the audio stream to obtain audio substreams.

In this embodiment, to improve the precision of speech recognition, the audio stream fed back by the voice collector is cut: one large audio stream is cut into multiple small audio stream segments. Because the amount of audio data recognized each time is not very large, recognition accuracy is greatly improved.
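The text does not say how the stream is cut, so the following only illustrates the idea of turning one large stream into small segments. Fixed-size slicing, the function name `cut_stream` and the `chunk_size` parameter are all assumptions; a real system might instead cut on silence or on time windows.

```python
from typing import Iterator


def cut_stream(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive fixed-size pieces of an audio buffer."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        # The final piece may be shorter than chunk_size.
        yield audio[start:start + chunk_size]
```

Joining the yielded pieces in order reproduces the original buffer, so no audio is lost by the cut.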

Step 303): Determine the target audio substream that currently needs to be recognized according to the text recognition status identifier in the previous text recognition information.

In this technical solution, if the recognition server cannot confirm the result of recognizing the audio information currently being processed, the result is still fed back to the notes insertion terminal and the unconfirmed content is inserted into the notes document. The recognition server then recognizes the same audio information again; whether or not this result is confirmed, it is again fed back to the notes insertion terminal and inserted into the notes document. Only when the text recognition information for that audio information has been confirmed does the server move on to recognize the next audio information. If the result of recognizing the audio information currently being processed is confirmed, the result is fed back to the notes insertion terminal, the confirmed content is inserted into the notes document, and the recognition server then recognizes the next audio information.
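The server-side behavior just described, returning every pass and advancing only after a confirmed result, can be sketched as a generator. Everything named here is hypothetical: `recognize_stream`, the `recognize(chunk, attempt)` callback signature, and in particular `max_attempts`, which is a safety bound added for the sketch and is not part of the described method.

```python
from typing import Callable, Iterable, Iterator, Tuple


def recognize_stream(
    substreams: Iterable[str],
    recognize: Callable[[str, int], Tuple[str, bool]],
    max_attempts: int = 5,
) -> Iterator[Tuple[int, str, bool]]:
    """Yield (substream index, text, confirmed) for every recognition pass."""
    for index, chunk in enumerate(substreams):
        for attempt in range(max_attempts):
            text, confirmed = recognize(chunk, attempt)
            # Every result is fed back, whether or not it is confirmed.
            yield index, text, confirmed
            if confirmed:
                # Only a confirmed result lets the server move on.
                break
```

The target substream for the next pass is therefore the same substream while the last status is 0 and the following substream once the status is 1.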

In conventional techniques, if the recognition server cannot confirm the result of recognizing the audio information currently being processed, the result is not fed back to the notes insertion terminal; only once the result has been confirmed is it fed back to the notes insertion terminal and inserted. Conventional techniques therefore take longer to insert text than this technical solution, which greatly degrades the user experience. This technical solution inserts every piece of recognition information into the notes document in real time, whether or not it is confirmed, which improves the user experience. Therefore, in this technical solution, the step of determining the target audio substream that currently needs to be recognized includes:

Step 304): Recognize the target audio substream to obtain current text recognition information, wherein the current text recognition information includes text recognition content, a text recognition status identifier, a role identifier and a text length;

Step 305): Send the current text recognition information to the notes insertion terminal, so that the text recognition content in the current text recognition information is inserted into the notes document.

This technical solution inserts into the document, in real time and whether or not it is confirmed, the text that the recognition server generates while slicing, comparing and analyzing the audio stream, which solves the poor user experience caused by the slow return of recognized text. When roles speak one at a time during speech recognition, the recognized text returned in real time by the recognition server can be processed and inserted in role order. In a court trial or a meeting, however, multiple roles may speak simultaneously. In that case the recognition server slices the audio stream of each speaking role concurrently and returns each role's real-time recognized text interleaved, in an order determined by concurrent processing speed. If text were still inserted using the one-at-a-time logic, the order of the inserted text and the roles would become confused, which would in turn disorder the dynamically added shading color effect, and the resulting notes or meeting document would ultimately lose its meaning and become invalid. For this reason, in this technical solution the recognition server attaches a unique identifier that distinguishes the role each time it returns recognized text. Each time recognized text is received in real time, the attached role identifier is used to dynamically obtain that role's current text recognition status identifier, and the text insertion position is calculated and the role is distinguished according to these identifiers. This resolves the confusion of text insertion position, role distinction and shading color effect when multiple roles speak simultaneously.

Figure 4 shows the first functional block diagram of a device for inserting speech recognition text into a notes document proposed by an embodiment of the present application. In practical applications the device is the notes insertion terminal. It includes:

A receiving unit 401, configured to receive the current text recognition information of a target audio substream, wherein the current text recognition information includes text recognition content, a text recognition status identifier, a role identifier and a text length;

A notes insertion unit 402, configured to insert the corresponding text recognition content at the corresponding position of the notes document according to the text recognition status identifier and the role identifier of the current text recognition information.

In this embodiment, the notes insertion unit includes:

Figure 5 shows the second functional block diagram of a device for inserting speech recognition text into a notes document proposed by an embodiment of the present application. In practical applications the device is the notes insertion terminal. It includes:

A receiving unit 501, configured to receive an audio stream;

A cutting unit 502, configured to cut the audio stream to obtain audio substreams;

A target audio substream confirmation unit 503, configured to determine the target audio substream that currently needs to be recognized according to the text recognition status identifier in the previous text recognition information;

A recognition unit 504, configured to recognize the target audio substream to obtain current text recognition information, wherein the current text recognition information includes text recognition content, a text recognition status identifier, a role identifier and a text length;

A sending unit 505, configured to send the current text recognition information to the notes insertion terminal, so that the text recognition content in the current text recognition information is inserted into the notes document.

Figure 6 shows a schematic diagram of an electronic device proposed by an embodiment of the present application. The electronic device includes a memory a and a processor b. The memory a stores a computer program which, when executed by the processor b, implements the following functions:

In this embodiment, for inserting the corresponding text recognition content at the corresponding position of the notes document according to the text recognition status identifier and the role identifier of the current text recognition information, the computer program, when executed by the processor b, implements the following functions:

In this embodiment, for inserting the text recognition content in each role's text recognition information at the corresponding position, the computer program, when executed by the processor b, implements the following functions:

In this embodiment, for inserting the text recognition content of the current text recognition information at the corresponding position of the notes document according to the text length and text recognition content in the previous text recognition information and the text length and text recognition content in the current text recognition information, the computer program, when executed by the processor b, implements the following functions:

The portion of the text recognition content of the current text recognition information that runs from its start position to the position equal to the text length in the previous text recognition information is compared with the text recognition content in the previous text recognition information. If the two are identical, that leading portion is removed from the text recognition content of the current text recognition information, and the remaining content is inserted into the notes document immediately after the text recognition content of the previous text recognition information. If the two differ, the text recognition content of the current text recognition information is inserted at the position of the text recognition content of the previous text recognition information in the notes document, and the text recognition content of the previous text recognition information is deleted.

An embodiment of the present application proposes another electronic device. The electronic device includes a memory a and a processor b. The memory a stores a computer program which, when executed by the processor b, implements the following functions:

Receiving an audio stream;

Cutting the audio stream to obtain audio substreams;

In this embodiment, for determining the target audio substream that currently needs to be recognized, the computer program, when executed by the processor b, implements the following functions:

In this embodiment, the memory includes but is not limited to random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), cache (Cache), hard disk (Hard Disk Drive, HDD) or memory card (Memory Card).

In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, an embedded microcontroller, and so on.

For the specific functions implemented by the electronic device, the memory and the processor provided in the embodiments of this specification, reference may be made to the foregoing embodiments of this specification. They can achieve the technical effects of the foregoing embodiments and are not described again here.

In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, improvements to many of today's method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compiler used in program development. The source code to be compiled must be written in a specific programming language called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog2. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can be readily obtained merely by logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.

It is also known to those skilled in the art that, in addition to implementing the client and server purely as computer-readable program code, the method steps can be logically programmed so that the client and server implement the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a client or server can therefore be regarded as a hardware component, and the devices it contains for implementing various functions can be regarded as structures within the hardware component. Indeed, the devices for implementing various functions can be regarded both as software modules that implement the method and as structures within the hardware component.

From the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.

The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the embodiments of the client can be explained with reference to the description of the foregoing method embodiments.

The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.

Although the present application has been described through embodiments, those of ordinary skill in the art will appreciate that the application has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the application.

Claims

1. A method for inserting speech recognition text into a notes document, characterized by comprising:

receiving current text recognition information of a target audio substream, wherein the current text recognition information includes text recognition content, a text recognition status identifier, a role identifier and a text length;

inserting the corresponding text recognition content at the corresponding position of the notes document according to the text recognition status identifier and the role identifier of the current text recognition information.

2. The method according to claim 1, characterized in that the step of inserting the corresponding text recognition content at the corresponding position of the notes document according to the text recognition status identifier and the role identifier of the current text recognition information comprises:

The first textual identification for obtaining first role obtains the text in the first textual identification by mapping function and knows Text identification content in first textual identification of the first role is inserted into corresponding positions by the insertion position of other content It sets, setting first role is line feed role；

The first textual identification for obtaining second role, second role is obtained on the basis of the corresponding bookmark of the role that currently enters a new line The first textual identification in text identification content insertion position, by the first textual identification of the second role In text identification content be inserted into corresponding position, update line feed role, with second role be line feed role；

The second textual identification for obtaining first role, if the text identification status indicator in the first textual identification is Non-acknowledgement identifies, and the bookmark used when being inserted by the text identification content in a upper textual identification for first role is base Standard obtains the insertion position of the text identification content in the second textual identification of first role, by the first role Text identification content in second textual identification is inserted into corresponding position, and without updating line feed role, second role is line feed Role；If the text identification status indicator in the first textual identification is to confirm to identify, corresponding with the role that currently enters a new line The insertion position that text identification content in the second textual identification of first role is obtained on the basis of bookmark, by described first jiao Text identification content in second textual identification of color is inserted into corresponding position, and update line feed role is to change with first role Row role；

obtaining second text identification information of the second role; if the current line feed role is the first role and the text identification status indicator in the first text identification information of the second role is a confirmation mark, obtaining the insertion position of the text identification content in the second text identification information of the second role with, as the reference, the bookmark used when the text identification content in the second text identification information of the first role was inserted, inserting the text identification content in the second text identification information of the second role into the corresponding position, and updating the line feed role; if the current line feed role is the first role and the text identification status indicator in the first text identification information of the second role is a non-confirmation mark, or if the current line feed role is the second role, obtaining the insertion position of the text identification content in the second text identification information of the second role with, as the reference, the bookmark used when the text identification content in the first text identification information of the second role was inserted, and inserting the text identification content in the second text identification information of the second role into the corresponding position, without updating the line feed role;

obtaining first text identification information of another role, obtaining the insertion position of the text identification content in the first text identification information of the other role with the bookmark corresponding to the current line feed role identification as the reference, inserting the text identification content in the first text identification information of the other role into the corresponding position, and updating the line feed role so that the other role is the line feed role.
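The bookmark selection recited in claim 2 can be read as one rule applied to every incoming result. The Python sketch below illustrates that reading only; it is not the applicant's implementation. The class `RoleTracker`, the method `choose_reference`, and the placeholder string `<mapping-function>` are hypothetical names. The sketch assumes a role reuses its own bookmark when its previous result was unconfirmed or when it already is the line feed role; otherwise it takes the line feed role's bookmark and becomes the new line feed role.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RoleTracker:
    """Tracks whose bookmark anchors the next insertion (hypothetical helper)."""

    line_feed_role: Optional[str] = None
    # role -> True if that role's most recent result carried a confirmation mark
    last_confirmed: Dict[str, bool] = field(default_factory=dict)

    def choose_reference(self, role: str, confirmed: bool) -> str:
        """Return the role whose bookmark is the reference for this insertion."""
        if self.line_feed_role is None:
            # Very first result: the position comes from the mapping function.
            reference = "<mapping-function>"
            self.line_feed_role = role
        elif role in self.last_confirmed and (
            not self.last_confirmed[role] or self.line_feed_role == role
        ):
            # The role is still extending its own line: reuse its own
            # bookmark and leave the line feed role unchanged.
            reference = role
        else:
            # The role starts a new line after the current line feed role
            # and then becomes the line feed role itself.
            reference = self.line_feed_role
            self.line_feed_role = role
        self.last_confirmed[role] = confirmed
        return reference
```

For example, with roles "A" and "B" speaking alternately, an unconfirmed first result from "A" followed by a second result from "A" keeps using the bookmark of "A", even if "B" has become the line feed role in between.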

3. The method according to claim 2, characterized in that the step of inserting the text identification content in the text identification information of each role into the corresponding position comprises:

for each role, if the text identification status indicator in the current text identification information is a non-confirmation mark and the text identification status indicator in the previous text identification information is a non-confirmation mark, inserting the text identification content of the current text identification information into the corresponding position of the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

for each role, if the text identification status indicator in the current text identification information is a non-confirmation mark and the text identification status indicator in the previous text identification information is a confirmation mark, inserting the text identification content of the current text identification information into the corresponding position of the notes document;

for each role, if the text identification status indicator in the current text identification information is a confirmation mark and the text identification status indicator in the previous text identification information is a non-confirmation mark, inserting the text identification content of the current text identification information into the corresponding position of the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information;

for each role, if the text identification status indicator in the current text identification information is a confirmation mark and the text identification status indicator in the previous text identification information is a confirmation mark, inserting the text identification content of the current text identification information into the corresponding position of the notes document.

4. The method according to claim 3, characterized in that the step of inserting the text identification content of the current text identification information into the corresponding position of the notes document according to the text size and text identification content in the previous text identification information and the text size and text identification content in the current text identification information comprises:

comparing the portion of the text identification content of the current text identification information that runs from the start position to the position equal to the text size in the previous text identification information with the text identification content in the previous text identification information; if the comparison result is identical, removing that portion from the text identification content of the current text identification information and inserting the remaining content after the text identification content of the previous text identification information in the notes document; if the comparison result is not identical, deleting the text identification content of the previous text identification information and inserting the text identification content of the current text identification information at the position that the text identification content of the previous text identification information occupied in the notes document.
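The four cases of claim 3 and the comparison of claim 4 can be sketched together in Python. This is an illustrative sketch under stated assumptions, not the claimed implementation: the notes document is modelled as a plain string, the function name `insert_result` and its parameters are hypothetical, the text size is taken to be `len()` of the content, and a result that follows a confirmed result is assumed to be placed directly after it (the claims only say "the corresponding position").

```python
from typing import Tuple


def insert_result(
    doc: str,
    prev_start: int,
    prev_content: str,
    prev_confirmed: bool,
    cur_content: str,
) -> Tuple[str, int]:
    """Return the updated document and the start offset of the role's segment."""
    prev_size = len(prev_content)  # assumed to equal the reported text size
    prev_end = prev_start + prev_size

    if prev_confirmed:
        # Previous result is final: insert the current content as new text.
        return doc[:prev_end] + cur_content + doc[prev_end:], prev_end

    if cur_content[:prev_size] == prev_content:
        # Current result only extends the previous one: append the remainder.
        remainder = cur_content[prev_size:]
        return doc[:prev_end] + remainder + doc[prev_end:], prev_start

    # Previous result was revised: delete it and insert the current content
    # at the position it occupied.
    return doc[:prev_start] + cur_content + doc[prev_end:], prev_start
```

Appending only the remainder when the prefix matches avoids rewriting text that is already on screen, which is consistent with the faster dynamic insertion the abstract describes.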

5. The method according to claim 2, characterized in that the step of inserting the corresponding text identification content into the corresponding position of the notes document according to the text identification status indicator and the role identification of the current text identification information further comprises:

after the text identification content in the current text identification information is inserted into the corresponding position, judging the text identification status indicator of the previous text identification information; if the text identification status indicator of the previous text identification information is a confirmation mark, removing the bookmark of the text identification content of the previous text identification information, inserting the text identification content in the current text identification information, and setting a bookmark; if the text identification status indicator of the previous text identification information is a non-confirmation mark, inserting the text identification content in the current text identification information, and setting a bookmark.

6. A method for inserting speech recognition text into a notes document, characterized by comprising:

receiving an audio stream;

segmenting the audio stream to obtain audio sub-streams;

determining, according to the text identification status indicator in the previous text identification information, the target audio sub-stream that currently needs to be identified;

identifying the target audio sub-stream to obtain current text identification information, wherein the current text identification information includes text identification content, a text identification status indicator, a role identification and a text size;

sending the current text identification information to the notes insertion end, so that the text identification content in the current text identification information is inserted into the notes document.

7. The method according to claim 6, characterized in that the step of determining the target audio sub-stream that currently needs to be identified comprises:

if the text identification status indicator in the previous text identification information is a non-confirmation mark, the target audio sub-stream that currently needs to be identified is the audio sub-stream corresponding to the previous text identification information;

if the text identification status indicator in the previous text identification information is a confirmation mark, the target audio sub-stream that currently needs to be identified is the next audio sub-stream.
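The selection rule of claim 7 is small enough to show directly. The Python sketch below assumes the audio sub-streams are addressed by a zero-based index and that the first call has no previous result; the function name `next_target_index` and both assumptions are illustrative and do not come from the claims.

```python
from typing import Optional


def next_target_index(prev_index: Optional[int], prev_confirmed: bool) -> int:
    """Pick the index of the audio sub-stream to identify next."""
    if prev_index is None:
        # No previous result yet: start with the first sub-stream (assumption).
        return 0
    if prev_confirmed:
        # Previous result was final: move on to the next sub-stream.
        return prev_index + 1
    # Previous result was not final: identify the same sub-stream again.
    return prev_index
```

Under this reading, an unconfirmed result keeps the recognizer on the same sub-stream until a confirmation mark is returned.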

8. A device for inserting speech recognition text into a notes document, characterized by comprising:

a receiving unit, configured to receive current text identification information of a target audio sub-stream, wherein the current text identification information includes text identification content, a text identification status indicator, a role identification and a text size;

a notes insertion unit, configured to insert the corresponding text identification content into the corresponding position of the notes document according to the text identification status indicator and the role identification of the current text identification information.

9. The device according to claim 8, characterized in that the notes insertion unit comprises:

a first-role first text identification information insertion module, configured to obtain first text identification information of a first role, obtain through a mapping function the insertion position of the text identification content in the first text identification information, insert the text identification content in the first text identification information of the first role into the corresponding position, and set the first role as the line feed role;

a second-role first text identification information insertion module, configured to obtain first text identification information of a second role, obtain the insertion position of the text identification content in the first text identification information of the second role with the bookmark corresponding to the current line feed role as the reference, insert the text identification content in the first text identification information of the second role into the corresponding position, and update the line feed role so that the second role is the line feed role;

a first-role second text identification information insertion module, configured to obtain second text identification information of the first role; if the text identification status indicator in the first text identification information is a non-confirmation mark, obtain the insertion position of the text identification content in the second text identification information of the first role with, as the reference, the bookmark used when the text identification content in the previous text identification information of the first role was inserted, and insert the text identification content in the second text identification information of the first role into the corresponding position, without updating the line feed role, so that the second role remains the line feed role; if the text identification status indicator in the first text identification information is a confirmation mark, obtain the insertion position of the text identification content in the second text identification information of the first role with the bookmark corresponding to the current line feed role as the reference, insert the text identification content in the second text identification information of the first role into the corresponding position, and update the line feed role so that the first role is the line feed role;

a second-role second text identification information insertion module, configured to obtain second text identification information of the second role; if the current line feed role is the first role and the text identification status indicator in the first text identification information of the second role is a confirmation mark, obtain the insertion position of the text identification content in the second text identification information of the second role with, as the reference, the bookmark used when the text identification content in the second text identification information of the first role was inserted, insert the text identification content in the second text identification information of the second role into the corresponding position, and update the line feed role; if the current line feed role is the first role and the text identification status indicator in the first text identification information of the second role is a non-confirmation mark, or if the current line feed role is the second role, obtain the insertion position of the text identification content in the second text identification information of the second role with, as the reference, the bookmark used when the text identification content in the first text identification information of the second role was inserted, and insert the text identification content in the second text identification information of the second role into the corresponding position, without updating the line feed role;

an other-role first text identification information insertion module, configured to obtain first text identification information of another role, obtain the insertion position of the text identification content in the first text identification information of the other role with the bookmark corresponding to the current line feed role identification as the reference, insert the text identification content in the first text identification information of the other role into the corresponding position, and update the line feed role so that the other role is the line feed role.

10. A device for inserting speech recognition text into a notes document, characterized by comprising:

a receiving unit, configured to receive an audio stream;

a target audio sub-stream determination unit, configured to determine, according to the text identification status indicator in the previous text identification information, the target audio sub-stream that currently needs to be identified;

a recognition unit, configured to identify the target audio sub-stream to obtain current text identification information, wherein the current text identification information includes text identification content, a text identification status indicator, a role identification and a text size;

a sending unit, configured to send the current text identification information to the notes insertion end, so that the text identification content in the current text identification information is inserted into the notes document.