CN109559744A

CN109559744A - Voice data processing method, device, and readable storage medium

Info

Publication number: CN109559744A
Application number: CN201811517309.2A
Authority: CN
Inventors: 周笑涵
Original assignee: Taikang Health Industry Klc Holdings Ltd; Taikang Insurance Group Co Ltd
Current assignee: Taikang Health Industry Klc Holdings Ltd; Taikang Insurance Group Co Ltd
Priority date: 2018-12-12
Filing date: 2018-12-12
Publication date: 2019-04-02
Anticipated expiration: 2038-12-12
Also published as: CN109559744B

Abstract

The present invention provides a voice data processing method, a device, and a readable storage medium. Voice data to be processed is received; speech recognition is performed on the voice data to be processed to determine voice keyword information corresponding to the voice data; a language recognition support package of the type corresponding to the voice data to be processed is determined according to the voice keyword information; and the voice data to be processed is converted according to the determined language recognition support package to obtain processed voice conversion data. The voice keyword information of the voice data to be processed is obtained by filtering, and the language recognition support package that best matches the voice data to be processed is selected from among the different types of language recognition support packages according to the voice keyword information, so that the voice data to be processed is handled with that package. The process is simple and convenient to use.

Description

Voice data processing method, device, and readable storage medium

Technical field

The present invention relates to computer technology, and in particular to a voice data processing method, a device, and a readable storage medium.

Background art

With the development of electronic technology, paperless office work has become a trend, and working by voice is one such approach.

Working by voice requires the voice to be recognized, and because different services have different needs, the type of language support package used for recognition also differs. In the prior art, a user must first select the language support package that matches the voice he or she is about to input, and only then can the corresponding voice be input, so that the service platform processes the voice according to the selected language support package.

This way of processing voice data is cumbersome and very inconvenient to use.

Summary of the invention

To address the above-mentioned problem in the prior art, namely that a user must select the language support package corresponding to the content of the voice to be input before voice input can begin, which makes the process cumbersome and inconvenient to use, the present invention provides a voice data processing method, a device, and a readable storage medium.

In one aspect, the present invention provides a voice data processing method, comprising:

receiving voice data to be processed;

performing speech recognition on the voice data to be processed to determine voice keyword information corresponding to the voice data;

determining, according to the voice keyword information, a language recognition support package of the type corresponding to the voice data to be processed; and

converting the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.

In an optional embodiment, determining, according to the voice keyword information, the language recognition support package of the type corresponding to the voice data to be processed comprises:

selecting, by means of a preset naive Bayes algorithm, from among the language recognition support packages of all types, the language recognition support package with the highest degree of matching with the voice keyword information as the language recognition support package for the voice data to be processed.

In an optional embodiment, selecting, by means of the preset naive Bayes algorithm, the language recognition support package with the highest degree of matching with the voice keyword information comprises:

determining, according to a preset keyword vector table, the probability that each keyword in the voice keyword information belongs to each type of language recognition support package; and

determining, according to those per-keyword probabilities, the probability that the voice data to be processed belongs to each type of language recognition support package, and selecting the type with the highest probability as the language recognition support package for the voice data to be processed.
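One way to read the naive Bayes selection described above is as the following decision rule. This is a sketch under stated assumptions, not a formula given in the source: w_1, ..., w_m denote the distinct keywords, n_i the number of occurrences of keyword w_i, C the set of package types, and p_{i,c} the keyword vector table entry, interpreted here as P(c | w_i). A uniform prior over package types is also assumed.

```latex
\hat{c}
  \;=\; \arg\max_{c \in C} \; P(c) \prod_{i=1}^{m} P(w_i \mid c)^{\,n_i}
  \;=\; \arg\max_{c \in C} \; \sum_{i=1}^{m} n_i \,\log p_{i,c}
```

The second equality holds under the uniform-prior assumption because, by Bayes' rule, P(w_i | c) = p_{i,c} P(w_i) / P(c), and the factor P(w_i) / P(c) is then the same for every type c, so it does not affect which type attains the maximum.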

In an optional embodiment, performing speech recognition on the voice data to be processed to determine the voice keyword information corresponding to the voice data comprises:

filtering the voice data according to a prestored vocabulary package to determine the keywords in the voice data to be processed; and

determining at least one of: the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, the number of occurrences of each keyword in the voice data, and the weight value of each keyword.

Correspondingly, determining, according to the voice keyword information, the language recognition support package of the type corresponding to the voice data to be processed comprises: determining that package according to at least one of the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, the number of occurrences of each keyword in the voice data, and the weight value of each keyword.

In an optional embodiment, the types of language recognition support package include: a medical language recognition support package, a customer service language recognition support package, and a meeting language recognition support package.

In an optional embodiment, after the processed voice conversion data is obtained, the method further comprises:

storing the voice data to be processed in association with the corresponding voice conversion data.

In an optional embodiment, receiving the voice data to be processed comprises:

calling an interface of a service platform to receive the voice data to be processed that has been uploaded to the service platform from each client.

In another aspect, the present invention provides a voice data processing device, comprising:

a receiving module, configured to receive voice data to be processed;

an identification module, configured to perform speech recognition on the voice data to be processed to determine voice keyword information corresponding to the voice data; and

a language pack conversion module, configured to determine, according to the voice keyword information, a language recognition support package of the type corresponding to the voice data to be processed, and further configured to convert the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.

In yet another aspect, the present invention provides an electronic device, comprising: a memory, a processor connected to the memory, and a computer program stored in the memory and executable on the processor, characterized in that

the processor executes the method of any one of the foregoing aspects when running the computer program.

In a final aspect, the present invention provides a readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1-7.

With the voice data processing method, device, and readable storage medium provided by the present invention, voice data to be processed is received; speech recognition is performed on it to determine the corresponding voice keyword information; a language recognition support package of the type corresponding to the voice data to be processed is determined according to the voice keyword information; and the voice data to be processed is converted according to the determined package to obtain processed voice conversion data. The voice keyword information is obtained by filtering, and the best-matching language recognition support package is selected from among the different types of packages according to that information and used to process the voice data, so the process is simple and convenient to use.

Brief description of the drawings

The accompanying drawings show specific embodiments of the present disclosure, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the disclosed concept in any way, but to explain the concept of the disclosure to those skilled in the art by reference to specific embodiments.

Fig. 1 is a schematic diagram of the network architecture on which the present invention is based;

Fig. 2 is a flow diagram of a voice data processing method provided by Embodiment 1 of the present invention;

Fig. 3 is a flow diagram of a voice data processing method provided by Embodiment 2 of the present invention;

Fig. 4 is a structural diagram of a voice data processing device provided by Embodiment 3 of the present invention;

Fig. 5 is a hardware diagram of a voice data processing device provided by Embodiment 4 of the present invention.

The drawings herein are incorporated into and form part of this specification, show embodiments consistent with the present disclosure, and together with the specification serve to explain the principles of the disclosure.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments.

To address the technical problem mentioned above, the present invention provides a voice data processing method, a device, and a readable storage medium. It should be noted that the voice data processing method, device, and readable storage medium provided by the present application can be used widely in application scenarios that require office work or intelligent data entry and storage by voice, including but not limited to: voice recording of meetings, entry of medical cases, and entry of voice customer service data.

Fig. 1 is a schematic diagram of the network architecture on which the present invention is based. As shown in Fig. 1, the network architecture includes at least a service platform 1, a voice data processing device 2, and a terminal 3. The service platform 1 can be connected by wireless communication to the terminal 3 and to the voice data processing device 2 respectively, and exchanges data with them. A user can upload input voice to the service platform 1 through the terminal 3; the voice data processing device 2 then obtains the voice from the service platform 1, processes it, and stores the result in the service platform 1 for subsequent use. The voice data processing device 2 provided by the present application can process various types of voice data; correspondingly, the user can upload voice data of any type to the service platform 1 through the terminal 3 for processing by the voice data processing device 2.

Fig. 2 is a flow diagram of a voice data processing method provided by Embodiment 1 of the present invention.

As shown in Fig. 2, the voice data processing method includes:

Step 101: receive voice data to be processed.

Step 102: perform speech recognition on the voice data to be processed to determine voice keyword information corresponding to the voice data.

Step 103: determine, according to the voice keyword information, a language recognition support package of the type corresponding to the voice data to be processed.

Step 104: convert the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.
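Steps 101 to 104 can be sketched as a single function. This is a minimal illustration under stated assumptions, not the implementation of the invention: the name `process_voice_data`, the callable parameters, and the use of a domain-neutral first recognition pass that yields rough text for keyword extraction are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical type: a recognizer turns raw audio bytes into text.
Recognizer = Callable[[bytes], str]


def process_voice_data(
    audio: bytes,
    generic_recognizer: Recognizer,
    extract_keywords: Callable[[str], List[str]],
    select_package: Callable[[List[str]], str],
    packages: Dict[str, Recognizer],
) -> str:
    """Run steps 101-104 on one item of received voice data."""
    # Step 101: `audio` is the voice data to be processed, already received.
    # Step 102 (assumed form): a first recognition pass yields rough text,
    # from which the voice keyword information is extracted.
    rough_text = generic_recognizer(audio)
    keywords = extract_keywords(rough_text)
    # Step 103: choose the package type from the keyword information.
    package_type = select_package(keywords)
    # Step 104: convert the audio with the selected package.
    return packages[package_type](audio)
```

The selection logic is passed in as a parameter so that any scoring rule can be substituted without changing the overall flow.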

It should be noted that the voice data processing method provided by the present invention may specifically be executed by the voice data processing device shown in Fig. 1.

Specifically, the present invention provides a voice data processing method. First, a user can, through a terminal, upload voice data collected in real time, or voice data recorded in advance, to the service platform. The voice data processing device then either fetches the voice data to be processed from the service platform or receives voice data to be processed that is actively pushed by the service platform.

Then, the voice data processing device performs speech recognition on the voice data to be processed to determine the voice keyword information corresponding to the voice data, and determines, according to the voice keyword information, the language recognition support package of the type corresponding to the voice data to be processed.

Specifically, unlike the prior art, in this embodiment the voice data processing device can process multiple types of voice data, and multiple types of language recognition support packages are deployed in it in advance. These include at least: a medical language recognition support package, a customer service language recognition support package, and a meeting language recognition support package. Each language recognition support package can be used to convert voice to text based on a different specialized vocabulary, so as to obtain more accurate converted data. By using the voice keyword information, the language recognition support package that better matches the voice data to be processed can be determined from among the packages of all types and used for recognition. The voice keyword information may specifically include one or more of: the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, and the number of occurrences of each keyword in the voice data.

Performing speech recognition on the voice data to be processed to determine the corresponding voice keyword information may specifically include: filtering the voice data according to a prestored vocabulary package to determine the keywords in the voice data to be processed; and determining at least one of the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, the number of occurrences of each keyword in the voice data, and the weight value of each keyword. Correspondingly, determining, according to the voice keyword information, the language recognition support package of the type corresponding to the voice data to be processed includes: determining that package according to at least one of the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, the number of occurrences of each keyword in the voice data, and the weight value of each keyword. This determination can be implemented by using a preset algorithm to calculate a composite score.
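The source does not specify the preset algorithm for the composite score. One plausible form, shown here purely as an assumption, sums occurrences multiplied by weight for each professional type and picks the type with the largest total. The function names and the default weight of 1.0 are hypothetical.

```python
from typing import Dict


def composite_scores(
    occurrences: Dict[str, int],
    professional_types: Dict[str, str],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Sum occurrences x weight per professional type (assumed scoring rule)."""
    scores: Dict[str, float] = {}
    for keyword, count in occurrences.items():
        ptype = professional_types[keyword]
        # Keywords without an explicit weight default to 1.0 (assumption).
        scores[ptype] = scores.get(ptype, 0.0) + count * weights.get(keyword, 1.0)
    return scores


def best_package_type(scores: Dict[str, float]) -> str:
    """Return the professional type with the highest composite score."""
    if not scores:
        raise ValueError("no keywords matched; no package can be selected")
    return max(scores, key=scores.get)
```

Raising an error when no keyword matches is a design choice of this sketch; a deployment might instead fall back to a default package.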

Finally, the voice data processing device converts the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.

Take a medical scenario as an example: when a clinician or other user needs to write a document or a medical record in the course of work, the user can open the intelligent voice software or switch on the microphone and speak within the microphone's effective range. The voice is uploaded to the service platform through the terminal, so that the voice data processing device recognizes it, automatically calls the medical language recognition support package, converts the voice into text, and displays the text in the document. The text can be stored centrally via the service platform or analyzed subsequently.

With the voice data processing method provided by Embodiment 1 of the present invention, voice data to be processed is received; speech recognition is performed on it to determine the corresponding voice keyword information; a language recognition support package of the type corresponding to the voice data to be processed is determined according to the voice keyword information; and the voice data to be processed is converted according to the determined package to obtain processed voice conversion data. The voice keyword information is obtained by filtering, and the best-matching language recognition support package is selected from among the different types of packages according to that information and used to process the voice data, so the process is simple and convenient to use.

On the basis of Embodiment 1, in order to further improve the conversion accuracy for voice data, Fig. 3 is a flow diagram of a voice data processing method provided by Embodiment 2 of the present invention. As shown in Fig. 3, the voice data processing method includes:

Step 201: call an interface of the service platform to receive the voice data to be processed that has been uploaded to the service platform from each client.

Step 202: perform speech recognition on the voice data to be processed to determine voice keyword information corresponding to the voice data.

Step 203: select, by means of a preset naive Bayes algorithm, from among the language recognition support packages of all types, the language recognition support package with the highest degree of matching with the voice keyword information as the language recognition support package for the voice data to be processed.

Step 204: convert the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.

Step 205: store the voice data to be processed in association with the corresponding voice conversion data, so that the service platform can manage them.

As in Embodiment 1, the voice data processing method provided by the present invention may specifically be executed by the voice data processing device shown in Fig. 1.

In this embodiment, first, after a user has uploaded voice data collected in real time or recorded in advance to the service platform through a terminal, the voice data processing device can call the interface of the service platform to receive the voice data to be processed that each client has uploaded to the service platform. To make it easier for the voice data processing device to fetch and receive each item of voice data to be processed, the service platform reserves a unified interface, through which the voice data processing device retrieves the voice data.
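The unified interface can be pictured as a single entry point through which the processing device drains whatever the clients have uploaded. The sketch below is an in-memory stand-in under stated assumptions; the class `InMemoryServicePlatform`, the record `VoiceItem`, and the methods `upload` and `fetch_pending` are hypothetical names, and a real platform would expose this over a network.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class VoiceItem:
    client_id: str  # which client uploaded the audio
    audio: bytes    # raw voice data to be processed


class InMemoryServicePlatform:
    """Toy service platform holding uploads until the device fetches them."""

    def __init__(self) -> None:
        self._pending: Deque[VoiceItem] = deque()

    def upload(self, client_id: str, audio: bytes) -> None:
        # Called on behalf of a terminal/client.
        self._pending.append(VoiceItem(client_id, audio))

    def fetch_pending(self) -> List[VoiceItem]:
        # The unified interface: return all pending items in upload order
        # and clear the queue so each item is handed out once.
        items = list(self._pending)
        self._pending.clear()
        return items
```

Handing each item out exactly once keeps the processing device from converting the same upload twice, which is one reason to route all clients through a single interface.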

In this embodiment, vocabulary packages are stored in the voice data processing device, and each word has a professional type (medical vocabulary, customer service vocabulary, or meeting vocabulary) and a corresponding weight value. The voice data to be processed is filtered with these vocabulary packages to determine the keywords that occur in it. The device then counts and determines the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, and the number of occurrences of each keyword in the voice data.
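The filtering and counting described above can be sketched as follows. The vocabulary entries, the weights, and the names `VocabEntry`, `VOCABULARY`, and `extract_keyword_info` are illustrative assumptions. The sketch also assumes that recognized text is already available and can be split on whitespace; Chinese text would need a word segmenter instead.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VocabEntry:
    professional_type: str  # "medical", "customer_service" or "meeting"
    weight: float


# Hypothetical vocabulary package; a real one would be far larger.
VOCABULARY: Dict[str, VocabEntry] = {
    "patient": VocabEntry("medical", 1.0),
    "prescription": VocabEntry("medical", 2.0),
    "refund": VocabEntry("customer_service", 1.5),
    "agenda": VocabEntry("meeting", 1.0),
}


def extract_keyword_info(recognized_text: str) -> dict:
    """Filter text against the vocabulary and collect keyword statistics."""
    tokens = recognized_text.lower().split()
    # Keep only tokens that appear in the vocabulary package.
    counts = Counter(t for t in tokens if t in VOCABULARY)
    return {
        "distinct_keywords": len(counts),
        "occurrences": dict(counts),
        "professional_types": {k: VOCABULARY[k].professional_type for k in counts},
        "weights": {k: VOCABULARY[k].weight for k in counts},
    }
```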

Then, the best-matching language recognition support package is obtained using this voice keyword information and a preset algorithm. Specifically, a preset keyword vector table is also prestored in the voice data processing device; the table represents the probability that each keyword in the voice keyword information belongs to each type of language recognition support package. Using the keyword vector table, the voice data processing device determines, for each voice keyword occurring in the voice keyword information, the probability that it belongs to each type of package. Then, combining the number of occurrences of each voice keyword in the whole of the voice data to be processed with the number of voice keywords in that voice data, it selects the type with the highest probability as the language recognition support package for the voice data to be processed. That is, the voice data processing device increases the weight of a language support package according to how often its words occur in the analyzed speech, accumulates these matches, and selects the preferred language support package from the accumulated totals.
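A minimal sketch of this selection follows, assuming the keyword vector table stores, for each keyword, the probability that it belongs to each package type, and assuming a uniform prior over types. The table values and the names `KEYWORD_VECTOR_TABLE`, `PACKAGE_TYPES`, and `select_package` are invented for illustration. Log probabilities are summed, weighted by occurrence count, so that long inputs do not underflow.

```python
import math
from collections import Counter
from typing import Dict, Iterable

PACKAGE_TYPES = ("medical", "customer_service", "meeting")

# Hypothetical keyword vector table: keyword -> {package type: probability}.
KEYWORD_VECTOR_TABLE: Dict[str, Dict[str, float]] = {
    "diagnosis": {"medical": 0.70, "customer_service": 0.10, "meeting": 0.20},
    "refund": {"medical": 0.05, "customer_service": 0.80, "meeting": 0.15},
    "agenda": {"medical": 0.05, "customer_service": 0.15, "meeting": 0.80},
}


def select_package(keywords: Iterable[str]) -> str:
    """Pick the package type with the highest accumulated log probability."""
    # Occurrence count of each keyword that the table knows about.
    counts = Counter(k for k in keywords if k in KEYWORD_VECTOR_TABLE)
    scores = {}
    for pkg in PACKAGE_TYPES:
        # Naive Bayes independence assumption: per-keyword probabilities
        # multiply, so their logs add; each keyword counts once per occurrence.
        scores[pkg] = sum(
            n * math.log(KEYWORD_VECTOR_TABLE[k][pkg]) for k, n in counts.items()
        )
    return max(scores, key=scores.get)
```

Because every occurrence adds its log probability again, a keyword that is repeated often pulls the result toward its own package type, which matches the accumulation by word frequency described above.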

Further, because the voice data processing device can process multiple types of voice data, multiple types of language recognition support packages are deployed in it in advance. These include at least: a medical language recognition support package, a customer service language recognition support package, and a meeting language recognition support package.

Finally, the voice data processing device converts the voice data to be processed according to the determined language recognition support package and, after obtaining the processed voice conversion data, also stores the voice data to be processed together with the corresponding voice conversion data, so that the service platform can manage them.
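Associated storage can be sketched with a single table that keeps the original audio and its converted text in the same row. The source does not name a storage technology; SQLite, the table name `voice_records`, and the helper functions below are assumptions chosen only to make the idea concrete.

```python
import sqlite3
from typing import Optional, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; the schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS voice_records ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "package_type TEXT NOT NULL, "
        "audio BLOB NOT NULL, "
        "converted_text TEXT NOT NULL)"
    )
    return conn


def store_record(
    conn: sqlite3.Connection, audio: bytes, package_type: str, converted_text: str
) -> int:
    """Store the audio and its conversion result in one associated row."""
    cur = conn.execute(
        "INSERT INTO voice_records (package_type, audio, converted_text) "
        "VALUES (?, ?, ?)",
        (package_type, audio, converted_text),
    )
    conn.commit()
    return cur.lastrowid


def load_record(
    conn: sqlite3.Connection, record_id: int
) -> Optional[Tuple[str, bytes, str]]:
    """Fetch (package_type, audio, converted_text) for a record, if present."""
    return conn.execute(
        "SELECT package_type, audio, converted_text "
        "FROM voice_records WHERE id = ?",
        (record_id,),
    ).fetchone()
```

Keeping both items in one row means the platform can later retrieve the transcript from the audio record, or the audio from the transcript, with a single lookup.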

With the voice data processing method provided by the present invention, voice data to be processed is received; speech recognition is performed on it to determine the corresponding voice keyword information; a language recognition support package of the type corresponding to the voice data to be processed is determined according to the voice keyword information; and the voice data to be processed is converted according to the determined package to obtain processed voice conversion data. The voice keyword information is obtained by filtering, and the best-matching language recognition support package is selected from among the different types of packages according to that information and used to process the voice data, so the process is simple and convenient to use.

Fig. 4 is a structural diagram of a voice data processing device provided by Embodiment 3 of the present invention. As shown in Fig. 4, the voice data processing device includes:

a receiving module 10, configured to receive voice data to be processed;

an identification module 20, configured to perform speech recognition on the voice data to be processed to determine voice keyword information corresponding to the voice data; and

a language pack conversion module 30, configured to determine, according to the voice keyword information, a language recognition support package of the type corresponding to the voice data to be processed, and further configured to convert the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.

In an optional embodiment, the identification module 20 is specifically configured to:

determine, according to the probability of each voice keyword, the probability that the voice data to be processed belongs to each type of language recognition support package, and select the type with the highest probability as the language recognition support package for the voice data to be processed; and

determine the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, and the number of occurrences of each keyword in the voice data.

Correspondingly, determining, according to the voice keyword information, the language recognition support package of the type corresponding to the voice data to be processed includes: determining that package according to the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, and the number of occurrences of each keyword in the voice data.

In an optional embodiment, the device further includes a storage unit, configured to store, after the processed voice conversion data is obtained, the voice data to be processed and the corresponding voice conversion data, so that the service platform can manage them.

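In an optional embodiment, the receiving module 10 is specifically configured to:

call an interface of the service platform to receive the voice data to be processed that has been uploaded to the service platform from each client.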

With the voice data processing device provided by the present invention, voice data to be processed is received; speech recognition is performed on it to determine the corresponding voice keyword information; a language recognition support package of the type corresponding to the voice data to be processed is determined according to the voice keyword information; and the voice data to be processed is converted according to the determined package to obtain processed voice conversion data. The voice keyword information is obtained by filtering, and the best-matching language recognition support package is selected from among the different types of packages according to that information and used to process the voice data, so the process is simple and convenient to use.

Fig. 5 is a hardware diagram of a voice data processing device provided by Embodiment 4 of the present invention. As shown in Fig. 5, the voice data processing device includes a memory 41, a processor 42, and a computer program stored in the memory 41 and executable on the processor 42; the processor 42 executes the method of the above embodiments when running the computer program.

The present invention also provides a readable storage medium including a program which, when run on a terminal, causes the terminal to execute the method of any of the above embodiments.

Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be completed by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments; the storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical discs.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A voice data processing method, characterized by comprising:

receiving voice data to be processed;

performing speech recognition on the voice data to be processed to determine voice keyword information corresponding to the voice data;

determining, according to the voice keyword information, a language recognition support package of the type corresponding to the voice data to be processed; and

converting the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.

2. The voice data processing method according to claim 1, characterized in that determining, according to the voice keyword information, the language recognition support package of the type corresponding to the voice data to be processed comprises:

selecting, by means of a preset naive Bayes algorithm, from among the language recognition support packages of all types, the language recognition support package with the highest degree of matching with the voice keyword information as the language recognition support package for the voice data to be processed.

3. The voice data processing method according to claim 2, characterized in that selecting, by means of the preset naive Bayes algorithm, from among the language recognition support packages of all types, the language recognition support package with the highest degree of matching with the voice keyword information as the language recognition support package for the voice data to be processed comprises:

determining, according to a preset keyword vector table, the probability that each keyword in the voice keyword information belongs to each type of language recognition support package; and

determining, according to the probability that each keyword in the voice keyword information belongs to each type of language recognition support package, the probability that the voice data to be processed belongs to each type of language recognition support package, and selecting the type with the highest probability as the language recognition support package for the voice data to be processed.

4. The voice data processing method according to claim 1, characterized in that performing speech recognition on the voice data to be processed to determine the voice keyword information corresponding to the voice data comprises:

filtering the voice data according to a prestored vocabulary package to determine the keywords in the voice data to be processed; and

determining at least one of the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, the number of occurrences of each keyword in the voice data, and the weight value of each keyword;

correspondingly, determining, according to the voice keyword information, the language recognition support package of the type corresponding to the voice data to be processed comprises:

determining the language recognition support package of the type corresponding to the voice data to be processed according to at least one of the number of different keywords occurring in the voice data to be processed, the professional type to which each keyword belongs, the number of occurrences of each keyword in the voice data, and the weight value of each keyword.

5. The voice data processing method according to claim 1, characterized in that the types of language recognition support package include: a medical language recognition support package, a customer service language recognition support package, and a meeting language recognition support package.

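6. The voice data processing method according to any one of claims 1-5, characterized in that, after the processed voice conversion data is obtained, the method further comprises:

storing the voice data to be processed in association with the corresponding voice conversion data.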

7. The voice data processing method according to any one of claims 1-5, characterized in that receiving the voice data to be processed comprises:

calling an interface of a service platform to receive the voice data to be processed that has been uploaded to the service platform from each client.

8. A voice data processing device, characterized by comprising:

a receiving module, configured to receive voice data to be processed;

an identification module, configured to perform speech recognition on the voice data to be processed to determine voice keyword information corresponding to the voice data; and

a language pack conversion module, configured to determine, according to the voice keyword information, a language recognition support package of the type corresponding to the voice data to be processed, and further configured to convert the voice data to be processed according to the determined language recognition support package to obtain processed voice conversion data.

9. An electronic device, characterized by comprising: a memory, a processor connected to the memory, and a computer program stored in the memory and executable on the processor, characterized in that

the processor executes the method of any one of claims 1-7 when running the computer program.

10. A readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1-7.