CN108364653A - Voice data processing method and processing apparatus

Info

Publication number: CN108364653A
Application number: CN201810145265.9A
Authority: CN
Inventors: 王磊
Original assignee: Individual
Current assignee: Individual
Priority date: 2018-02-12
Filing date: 2018-02-12
Publication date: 2018-08-03
Anticipated expiration: 2038-02-12
Also published as: CN108364653B

Abstract

The present invention provides a voice data processing method and a processing apparatus. The method includes: obtaining voice information; synchronously converting the obtained voice information into first text information, and displaying the first text information in a first area of a display device; in response to a first operation, performing text processing on the first text information to be edited, and displaying the processed text information in a second area of the display device; in response to a second operation, synchronizing the first text information that has not been modified to the second area, and generating second text information; and in response to a third operation, generating third text information based on the second text information, and displaying the third text information in a third area of the display device. The voice data processing method according to embodiments of the present invention can convert voice information into text information accurately and in real time.

Description

Voice data processing method and processing apparatus

Technical field

The present invention relates to the field of audio and video data processing, and in particular to a method and an apparatus for processing procuratorial interrogation records.

Background art

Interrogation by the procuratorate is an indispensable link in solving a case. In the prior art, when a case is interrogated, the case handlers often cannot keep a comprehensive and accurate record of the entire conversation during the interrogation; instead, a recording clerk has to type the record sentence by sentence in real time as the case proceeds. The workload is heavy, efficiency is low, and deviations may occur, so the transcript of the case may not be sufficiently complete or accurate. In addition, the transcript material of a single case is hard to grasp as a whole, which greatly reduces interrogation efficiency and hinders progress in solving the case.

Interrogation record keeping, which plays a vital role in current interrogation work, is facing a trend of upgrading and transformation. As the strategy of strengthening procuratorial work through science and technology deepens and modern technology continues to advance, procuratorial organs place higher demands on the convenience, accuracy, and sophistication of interrogation record keeping.

Summary of the invention

In view of this, an object of the present invention is to provide a voice data processing method and a processing apparatus that convert voice information into text information accurately and in real time.

To solve the above problem, according to one aspect of the present invention, a voice data processing method is provided. The method includes: obtaining voice information; synchronously converting the obtained voice information into first text information, and displaying the first text information in a first area of a display device; in response to a first operation, performing text processing on the first text information to be edited, and displaying the processed text information in a second area of the display device; in response to a second operation, synchronizing the first text information that has not been modified to the second area, and generating second text information; and in response to a third operation, generating third text information based on the second text information, and displaying the third text information in a third area of the display device.

To solve the above problem, according to another aspect of the present invention, a voice data processing method is provided. The method includes: obtaining voice information; synchronously converting the obtained voice information into first text information, and displaying the first text information in a first area of a display device; in response to a fourth operation, performing text processing on the first text information to generate second text information, and displaying the second text information in a second area of the display device; and in response to a third operation, generating third text information based on the second text information, and displaying the third text information in a third area of the display device.

To solve the above problem, according to a further aspect of the present invention, a voice data processing apparatus is provided. The apparatus includes: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform: obtaining voice information; synchronously converting the obtained voice information into first text information, and displaying the first text information in a first area of a display device; in response to a first operation, performing text processing on the first text information to be edited, and displaying the processed text information in a second area of the display device; in response to a second operation, synchronizing the first text information that has not been modified to the second area, and generating second text information; and in response to a third operation, generating third text information based on the second text information, and displaying the third text information in a third area of the display device.

The voice data processing method and processing system according to embodiments of the present invention can display the live voice transcript, the proofread transcript, and the final transcript on the same display device, and can also perform same-case retrieval on the same screen during the interrogation to look up related information, which greatly facilitates producing the on-site transcript when procuratorial organs handle a case.

It should be understood that the voice data processing method and apparatus of the present invention are not limited to case interrogation. They can be applied in any other setting that requires converting voice information into text information in real time and that places high demands on the accuracy of the text information.

Description of the drawings

The above and other objects, features, and advantages of the present invention will become more apparent from the following description of embodiments with reference to the accompanying drawings, in which:

Fig. 1 is a flow chart of a voice data processing method according to a specific embodiment of the present invention;

Fig. 2 is a structural block diagram of a procuratorate case-handling system according to an embodiment of the present invention;

Fig. 3A is a schematic diagram of the first text information displayed in the first area;

Fig. 3B is a schematic diagram of the processed text information displayed in the second area;

Fig. 4 schematically shows a flow chart of querying related information in the first, second, and/or third text information; and

Fig. 5 is a structural block diagram of a voice data processing apparatus according to an embodiment of the present invention.

Detailed description

Embodiments of the present invention are described below with reference to the drawings. Elements and features described in one drawing or one embodiment of the invention may be combined with elements and features shown in one or more other drawings or embodiments. It should be noted that, for clarity, representations and descriptions of components and processes that are unrelated to the invention and known to those of ordinary skill in the art are omitted from the drawings and the description.

Those skilled in the art will understand that terms such as "first" and "second" in the present invention are used only to distinguish different units, modules, or steps. They carry no particular technical meaning, imply no necessary logical order, and do not indicate the relative importance of the units, modules, or steps they qualify.

In each embodiment, only the features that differ from the other embodiments are described in detail; features that are the same as or similar to those of other embodiments are omitted.

The inventor found that displaying the live voice transcript data, the proofread transcript, and the final on-site transcript generated from the proofread transcript in split-screen form on the same display device greatly facilitates editing the on-site transcript.

Referring to Fig. 1, which shows a flow chart of a voice data processing method according to an embodiment of the present invention, the method S100 includes the following steps: obtaining voice information (S110); synchronously converting the obtained voice information into first text information, and displaying the first text information in a first area of a display device (S130); in response to a first operation, performing text processing on the first text information to be edited, and displaying the processed text information in a second area of the display device (S150); in response to a second operation, synchronizing the first text information that has not been modified to the second area to generate second text information (S170); and in response to a third operation, generating third text information based on the second text information, and displaying the third text information in a third area of the display device (S190).

Hereinafter, the voice data processing method S100 of this embodiment is described in detail using live transcription during a procuratorate interrogation as the scenario. It should be understood that the method is not limited to the interrogation scenario; it can be applied in any setting in which voice input is converted into editable text information.

Fig. 2 is a structural block diagram of a procuratorate case-handling system according to an embodiment of the present invention. As shown in Fig. 2, the case-handling system may include multiple acquisition devices arranged in the interrogation room to capture the images and voice information of the case handler and the suspect, respectively. Specifically, the acquisition devices may be multiple cameras arranged at different positions in the interrogation room.

The interrogator can use the interrogation host, which serves as the controller, to switch the different cameras on or off and to send the captured voice signals to the voice data processing apparatus.

The voice data processing apparatus can convert the captured voice signal, perform text processing, and finally present the required transcript information on the display device.

In another possible example, a dedicated voice server may be configured to process the captured voice signals.

With reference to Fig. 1, in step S110, in response to a start operation by the operator, the voice data processing apparatus begins receiving the input voice data. In this embodiment, the voice data may be the dialogue between the prosecutor and the suspect captured in real time by an acquisition device, such as a camera, arranged in the interrogation room.

In a possible example, the start operation may be the operator clicking or touching a virtual start button provided on a control device. The control device may be a dedicated control device developed specifically for the system, or may be implemented by installing software on existing equipment such as a smart mobile terminal or a tablet computer.

After the operator clicks the start button, the acquisition devices arranged in the interrogation room are activated and begin to capture image information and voice information in the room; the captured voice information is then transmitted to the voice data processing apparatus.

In step S130, the voice data processing apparatus can use any of the various speech recognition engines stored therein to synchronously convert the obtained voice information into text information. Hereinafter, for convenience, the text information produced by the speech recognition engine is referred to as the first text information.

For example, with reference to the related art, speech recognition engines built on models such as Markov speech recognition models, neural network algorithms, and support vector machines may be used.

Then, the first text information is displayed in real time in a specific area of the display device. Hereinafter, for convenience, this specific area is referred to as the first area. Meanwhile, a dedicated identifier (ID) and storage space may be allocated in the processing apparatus for storing the first text information.

In a possible example, a dedicated window may be provided to display the converted text information in real time, and the operator can choose where to place the window as needed, for example by dragging it. In another possible example, the display area may be divided into multiple regions as needed, with different information displayed in different regions. For example, the display area may be permanently divided into three regions, namely a left region, a middle region, and a right region, with the recognized text information displayed in one of them, for example the left region.
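The fixed three-region division can be illustrated with a minimal sketch. It assumes a pixel-based display and equal-width regions; the function name `split_three` and the `(x, y, width, height)` tuple layout are illustrative choices and are not specified by this embodiment.

```python
def split_three(width, height):
    """Divide a display of the given size into left, middle, and right regions.

    Each region is returned as an (x, y, width, height) tuple. Any pixels left
    over after dividing the width by three go to the leftmost regions.
    """
    base, extra = divmod(width, 3)
    widths = [base + (1 if i < extra else 0) for i in range(3)]
    panes, x = {}, 0
    for name, w in zip(("left", "middle", "right"), widths):
        panes[name] = (x, 0, w, height)
        x += w
    return panes


panes = split_three(1920, 1080)
print(panes["left"])    # region for the first text information
print(panes["middle"])  # region for the text being edited
print(panes["right"])   # region for the final transcript
```

A freely movable window, as in the first example, would replace the fixed rectangles with positions chosen by the operator.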

In this embodiment, the interrogation record may be displayed based on the identity information of the persons taking part in the interrogation. For example, when the interrogator speaks, the speech recognition engine begins recognizing the speaker's voice information, and the processing apparatus displays the recognized text together with the speaker's identity information (Person A) in the first area. When the speaker changes from the interrogator to the suspect, the speech recognition engine detects the change of speaker role from the change in voice; at that point a new paragraph is started to display the newly recognized voice information of the other person (Person B).
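The paragraph switching on a change of speaker can be sketched as follows. The sketch assumes the recognition engine already tags each recognized segment with a speaker label; how the engine tells voices apart is outside its scope. The function name `to_paragraphs` and the sample dialogue are made up for illustration.

```python
from itertools import groupby


def to_paragraphs(segments):
    """Merge consecutive segments from the same speaker into one paragraph.

    `segments` is a list of (speaker, text) pairs in the order they were
    recognized. A new paragraph starts whenever the speaker label changes.
    """
    paragraphs = []
    for speaker, group in groupby(segments, key=lambda seg: seg[0]):
        paragraphs.append((speaker, " ".join(text for _, text in group)))
    return paragraphs


segments = [
    ("Person A", "State your name."),
    ("Person A", "And your occupation."),
    ("Person B", "I am a driver."),
    ("Person A", "Go on."),
]
for speaker, text in to_paragraphs(segments):
    print(f"{speaker}: {text}")
```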

In a possible example, the identity information of the recognized speakers can be set by the operator. For example, the operator may change "Person A" shown in the first text information to "interrogator" and "Person B" to "suspect". In another possible example, different roles can be distinguished according to which audio acquisition device captured the voice information.
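Both ways of assigning roles can be sketched briefly. The dictionary-based entry format, the function names `relabel` and `role_for_device`, and the device identifiers such as `mic-1` are assumptions made for illustration only.

```python
def relabel(entries, names):
    """Return new entries whose speaker label is replaced according to `names`.

    Labels that do not appear in `names` are kept unchanged, and the input
    entries are not modified.
    """
    return [{**e, "speaker": names.get(e["speaker"], e["speaker"])} for e in entries]


def role_for_device(device_id, device_roles, default="unknown"):
    """Pick a role from the capture device that produced the audio."""
    return device_roles.get(device_id, default)


entries = [
    {"speaker": "Person A", "text": "Where were you that night?"},
    {"speaker": "Person B", "text": "At home."},
]
renamed = relabel(entries, {"Person A": "interrogator", "Person B": "suspect"})
print([e["speaker"] for e in renamed])

device_roles = {"mic-1": "interrogator", "mic-2": "suspect"}
print(role_for_device("mic-2", device_roles))
```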

In a possible embodiment, the first text information includes a timestamp of the obtained voice information. Fig. 3A is a schematic diagram of the first text information displayed in the first area. As shown in Fig. 3A, the voice data processing apparatus can record the timestamp at which the voice information was obtained and display it in the first area together with the recognized speaker's identity information and the voice information.
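One entry of the first text information, carrying a timestamp, a speaker label, and the recognized text, might be modeled as in the following sketch. The class name `Utterance`, its field names, and the display format are hypothetical; the actual layout of Fig. 3A is not reproduced here.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Utterance:
    """One entry of the first text information (field names are illustrative)."""

    uid: int       # identifier (ID) used to look the entry up later
    stamp: time    # time at which the voice information was obtained
    speaker: str   # for example "Person B" until the operator renames it
    text: str      # text produced by the speech recognition engine

    def render(self) -> str:
        # One display line: timestamp, speaker label, recognized text.
        return f"{self.stamp.strftime('%H:%M:%S')} {self.speaker}: {self.text}"


entry = Utterance(uid=7, stamp=time(10, 50, 55), speaker="Person B", text="I was at home.")
print(entry.render())
```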

Then, in step S150, text processing is performed on the first text information in response to a first operation by the operator.

In a possible example, the first operation may be a double-click performed after the first text information to be edited has been selected. Alternatively, after the first text information to be edited has been selected, the edit operation can be triggered by clicking a preset button. Once the user has selected the first text information to be edited, that text information can be presented in a specific area of the display device. Hereinafter, for convenience, this specific area is referred to as the second area.

As described above, a dedicated window may be provided to display the selected text information to be edited in real time. Likewise, the three-pane display mode described above may be used, with the text information to be edited displayed in the middle region of the display device.

Then, the operator can perform operations such as adding to or deleting from the text information in the second area, and the modified text information is displayed. In a possible example, the modification operation may be carried out by the processing apparatus calling a back-end interface and using the identifier (ID) of the first text information to modify the text stored in the processing apparatus.
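Modification by identifier can be sketched with a small in-memory store. This is only a stand-in under stated assumptions: the class `TranscriptStore` and its methods are hypothetical and do not describe the back-end interface of the embodiment, which is not detailed in the text.

```python
class TranscriptStore:
    """In-memory stand-in for the storage held by the processing apparatus."""

    def __init__(self):
        self._entries = {}
        self._next_id = 1

    def add(self, text):
        """Store a newly recognized entry and return its identifier (ID)."""
        uid = self._next_id
        self._entries[uid] = text
        self._next_id += 1
        return uid

    def update(self, uid, new_text):
        """Overwrite the stored text for an existing identifier."""
        if uid not in self._entries:
            raise KeyError(f"unknown entry id: {uid}")
        self._entries[uid] = new_text

    def get(self, uid):
        """Return the text currently stored under the identifier."""
        return self._entries[uid]


store = TranscriptStore()
uid = store.add("Haha, I was at home.")
store.update(uid, "I was at home.")  # the operator's edit, applied by ID
print(store.get(uid))
```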

Fig. 3B is a schematic diagram of the processed text information displayed in the second area. The process of preprocessing the recorded interrogation information is described in detail below with reference to Fig. 3A and Fig. 3B.

As shown in Fig. 3A, the interrogation transcript of the interrogator and the suspect is displayed in the first area of the display device. In the example of Figs. 3A and 3B, the suspect's statement needs preliminary processing to remove information unrelated to the interrogation, namely the two-character interjection "haha". To modify this text, the operator selects the corresponding entry, for example by touch or mouse click; in the figure this is the entry with the timestamp "10:50:55". After selecting the entry to be edited, the operator performs the first operation, for example double-clicking the entry, so that the whole entry, including its timestamp and identity information, is extracted into the second area. The operator can then delete "haha" in the second area. As a result, as shown in Fig. 3B, the modified text information is displayed in the second area.

As described above, the operator preprocesses the text information to be edited through the first operation, and the preprocessed text information is displayed in the second area. At this point, entries that were not selected for modification are not displayed in the second area. In step S170, in response to a second operation, the first text information that has not been modified is synchronized to the second area, producing a complete preprocessed transcript. Hereinafter, the complete preprocessed text information presented in the second area is referred to as the second text information.
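The interplay of the first and second operations can be sketched as follows. The sketch assumes that entries are keyed by identifier and kept in recording order; the function names, the dictionary representation, and the sample dialogue are illustrative assumptions rather than the embodiment's data model.

```python
def first_operation(first_text, second_area, uid, edit):
    """Copy the selected entry into the second area and apply the operator's edit."""
    second_area[uid] = edit(first_text[uid])


def second_operation(first_text, second_area):
    """Build the second text information.

    Edited entries come from the second area; every entry that was not edited
    is taken unchanged from the first text information. The original order of
    the first text information is preserved.
    """
    return [second_area.get(uid, text) for uid, text in first_text.items()]


first_text = {
    1: "Where were you that night?",
    2: "Haha, at home.",
    3: "Who was with you?",
}
second_area = {}

# First operation: only entry 2 is selected and edited.
first_operation(first_text, second_area, 2, lambda s: s.replace("Haha, ", ""))

# Second operation: unmodified entries 1 and 3 are synchronized in.
second_text = second_operation(first_text, second_area)
print(second_text)
```

Keeping the first text information untouched means the raw recognition result remains available for comparison in the first area.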

The second operation may be performed, for example, by the operator clicking a preset virtual button on the control device.

In a possible embodiment, the second text information is editable. After the second text information is generated, the operator can edit it in detail.

Then, in step S190, in response to a third operation by the operator, third text information is generated based on the second text information and displayed in the third area of the display device.

As with the second operation, the third operation may be performed by the operator clicking a preset virtual button on the control device.

In a possible embodiment, the third text information is editable.

For example, after the operator triggers the third operation, a window can be created in a dedicated area, namely the third area, to display the generated final transcript, that is, the third text information. Similarly, the three-pane display mode described above may be used: the first text information is displayed in the left region of the display device, the second text information in the middle region, and the third text information in the right region.

It should be understood that, in this embodiment, the third area is not used exclusively for displaying the third text information. For example, after the third text information has been generated and saved, the operator can close the window displaying the third text information and use the third area to display other information.

A voice data processing method according to an embodiment of the present invention has been described above with reference to Figs. 1 to 3B. With the method of this embodiment, the live voice transcript, the proofread transcript, and the final transcript can be displayed on the same display device, which greatly facilitates producing the on-site transcript when procuratorial organs handle a case. It should be understood that the voice data processing method of this embodiment is not limited to case interrogation; it can be applied in any other setting that requires converting voice information into text information in real time and that places high demands on the accuracy of the text information.

In another embodiment of the present invention, related information can be queried while the transcript information is displayed in split-screen form. Fig. 4 schematically shows a flow chart of querying related information in the first, second, and/or third text information. In this embodiment, any sentence in the first, second, or third text information can be selected in order to query related information throughout the entire transcript. Hereinafter, taking the selection of a sentence in the second text information as an example, the operation of querying related information in the same-case transcript is explained in detail with reference to Fig. 4.

The operator can freely select, sentence by sentence, the text of interest in the second text information displayed in the second area. For example, in step S410, the operator clicks to select the sentence "Then Zhang San transferred 1,000,000 to my Bank of Communications account" in the second area. In step S430, the selected sentence is segmented into words and, according to preset rules, words with no practical meaning are removed and the key words are extracted, namely "Zhang San", "Bank of Communications", "account", and "1,000,000".

Then, in step S450, the first, second, and/or third text information is searched for sentences that mention "Zhang San", "Bank of Communications", "account", or "1,000,000", and the associated text information found is displayed. For example, in one possible example, all associated text information retrieved can be displayed in a specific part of the third area. In another possible example, the related information found can be highlighted in the first, second, and/or third text information shown on the same screen.
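Steps S410 to S450 can be sketched in simplified form. The sketch makes several assumptions that are not part of the embodiment: it works on English text split on word boundaries instead of using a Chinese word segmenter, it treats multi-word names as separate keywords, and the stop-word list stands in for the preset rules. All function names and sample sentences are illustrative.

```python
import re

# Stand-in for the preset rules that drop words with no practical meaning.
STOP_WORDS = {"then", "transferred", "to", "my", "of", "the", "a", "i", "he", "in", "was", "for"}


def tokenize(sentence):
    """Split a sentence into lowercase tokens, keeping numbers like 1,000,000 whole."""
    return re.findall(r"\d[\d,]*\d|\w+", sentence.lower())


def keywords(sentence):
    """S430: segment the selected sentence and keep only the key words."""
    return [t for t in tokenize(sentence) if t not in STOP_WORDS]


def related(sentence, transcripts):
    """S450: find every line in any transcript that shares a key word."""
    keys = set(keywords(sentence))
    hits = []
    for name, lines in transcripts.items():
        for line in lines:
            if keys & set(tokenize(line)):
                hits.append((name, line))
    return hits


selected = "Then Zhang San transferred 1,000,000 to my Bank of Communications account"
transcripts = {
    "first": ["I met Zhang San in March.", "The weather was cold."],
    "second": ["He asked for my account number."],
    "third": [],
}
print(keywords(selected))
for name, line in related(selected, transcripts):
    print(f"[{name}] {line}")
```

The returned pairs identify which text information each hit came from, so a caller could either list the hits in the third area or highlight them in place.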

In this embodiment, presenting same-case retrieval and analysis on the same screen enables a comprehensive record of the interrogation and helps solve cases quickly. It should be understood that when the voice data processing method of this embodiment is used in other text processing scenarios, it can likewise quickly analyze and locate all the information the operator wishes to find.

In another embodiment of the present invention, the first text information can be preprocessed according to preset text processing rules to generate the second text information.

Specifically, the operator can add different words to a rule base. When the second text information is generated, operations such as replacement and deletion that match the preset rules can first be applied directly to the first text information.

For example, in a possible example, in response to a fourth operation by the user, such as clicking or touching a virtual button on the operation interface that triggers the preprocessing function, operations matching the rules in a rule file can be applied automatically to the first text information to generate the second text information directly. For example, under the preset rules, when the selected text contains laughter words such as "haha", these words can be deleted automatically.
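Rule-based preprocessing of this kind can be sketched with regular expressions. The rule list below is an assumed example of what a rule file might contain; the patterns, the filler words other than "haha", and the function name `preprocess` are illustrative and do not come from the embodiment.

```python
import re

# Each rule is (compiled pattern, replacement). An empty replacement deletes the match.
RULES = [
    # Delete filler words together with a directly following comma or period.
    (re.compile(r"\b(?:haha|um+|uh+)\b[,.]?\s*", re.IGNORECASE), ""),
    # Collapse runs of whitespace left behind by deletions.
    (re.compile(r"\s{2,}"), " "),
]


def preprocess(line):
    """Apply every rule in order and return the cleaned line."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line.strip()


first_text = ["Haha, I was at home, um, all night.", "Who was with you?"]
second_text = [preprocess(line) for line in first_text]
print(second_text)
```

Because the rules are data, an operator-facing rule base could extend the list without changing the function.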

In another possible example, as described in the foregoing embodiment, after the operator selects a sentence to be edited from the first text information, predetermined processing such as replacement and deletion can first be applied, according to the preset rules, to the words in the selection that match words in the rule file; the operator then confirms whether further modification is needed. After the modification is complete, in response to the second operation by the operator, the first text information that has not been modified is synchronized to the second area, as described for step S170 in the foregoing embodiment.

In this embodiment, obtaining the voice information and generating and displaying the first text information and the third text information are similar to what is described above and are not repeated here.

The voice data processing method of this embodiment can preprocess the first text information according to preset rules, thereby simplifying generation of the second text information.

The voice data processing method of the embodiments of the present invention has been described above with reference to Figs. 1 to 4. The present invention also provides a voice data processing apparatus for executing the above voice data processing method. Fig. 5 is a structural block diagram of a voice data processing apparatus according to an embodiment of the present invention. With reference to Fig. 5, the voice data processing apparatus includes:

a memory 53 and one or more processors 51;

wherein the memory 53 is communicatively connected to the one or more processors 51, the memory 53 stores instructions executable by the one or more processors 51, and the instructions are executed by the one or more processors 51 so that the one or more processors 51 perform: obtaining voice information; synchronously converting the obtained voice information into first text information, and displaying the first text information in a first area of a display device; in response to a first operation, performing text processing on the first text information to be edited, and displaying the processed text information in a second area of the display device; in response to a second operation, synchronizing the first text information that has not been modified to the second area, and generating second text information; and in response to a third operation, generating third text information based on the second text information, and displaying the third text information in a third area of the display device.

The processor can execute the voice data processing method of the embodiment described with reference to Figs. 1 to 3B, or the voice data processing method of the other embodiment described with reference to Fig. 4; the details are not repeated here.

Those skilled in the art will understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) containing computer-usable program code.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of a flow chart and/or one or more blocks of a block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flow chart and/or one or more blocks of a block diagram.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention.

Finally, it should be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

Claims

1. A voice data processing method, comprising:

obtaining voice information;

synchronously converting the obtained voice information into first text information, and displaying the first text information in a first area of a display device;

in response to a first operation, performing text processing on the first text information to be edited, and displaying the processed text information in a second area of the display device;

in response to a second operation, synchronizing the first text information that has not been modified to the second area, and generating second text information;

in response to a third operation, generating third text information based on the second text information, and displaying the third text information in a third area of the display device.

2. The voice data processing method according to claim 1, characterized in that: the first text information includes a timestamp at which the voice information was obtained.

3. The voice data processing method according to claim 1, characterized in that:

the second text information is editable.

4. The voice data processing method according to claim 3, characterized in that:

when specific text is selected in the first, second, or third text information, word segmentation is performed on the selected text, a query on the segmented words is performed in the first, second, and/or third text information, and all associated text information found is displayed.

5. A voice data processing method, comprising:

obtaining voice information;

in response to a fourth operation, performing text processing on the first text information to generate second text information, and displaying the second text information in the second area of the display device;

6. A voice data processing apparatus, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method according to any one of claims 1 to 5.