CN110956956A - Voice recognition method and device based on policy rules

Publication number
CN110956956A
Authority
CN
China
Prior art keywords: keyword, preset, quality inspection, policy, template
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911284678.6A
Other languages
Chinese (zh)
Inventor
崔晶晶
左琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jeo Polymerization Beijing Artificial Intelligence Technology Co ltd
Original Assignee
Jeo Polymerization Beijing Artificial Intelligence Technology Co ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/187 Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/26 Speech to text systems

Abstract

The application provides a voice recognition method and device based on policy rules. The method comprises: receiving a target audio file and a preset policy set; judging whether the dialog text of the target audio file matches the keyword tags of each policy in the preset policy set and, if so, acquiring a corresponding quality inspection score; and determining a quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold. The method and device can perform voice recognition in a personalized and configurable manner and can improve the accuracy of voice recognition.

Description

Voice recognition method and device based on policy rules
Technical Field
The application relates to the field of data processing, in particular to a voice recognition method and device based on policy rules.
Background
In the prior art, voice quality inspection mainly aims to judge whether violations exist based on the conversation between the two parties of a telephone call. Different enterprises define violations differently, and the prior art has difficulty meeting the requirements of personalization, configurability and high accuracy in voice quality inspection.
Disclosure of Invention
Aiming at the problems in the prior art, the application provides a voice recognition method and device based on policy rules, which can perform voice recognition in an individualized and configurable manner and improve the accuracy of the voice recognition.
In order to solve at least one of the above problems, the present application provides the following technical solutions:
in a first aspect, the present application provides a speech recognition method based on policy rules, including:
receiving a target audio file and a preset strategy set;
judging whether the dialog text of the target audio file is matched with the keyword labels of all the strategies in the preset strategy set, if so, acquiring a corresponding quality inspection score;
and determining a quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold value.
Further, the determining whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set includes:
and judging whether the dialog text is matched with each first keyword tag of the keyword template in the preset strategy set, wherein the first keyword tag of the keyword template is preset by a user.
Further, the determining whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set further includes:
and judging whether the dialog text is matched with each second keyword label of the NLP template in the preset strategy set, wherein the second keyword label of the NLP template is obtained after a user carries out model training on a preset NLP model according to sample data added with labels in advance.
Further, the determining whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set further includes:
and judging whether the dialog text is matched with each regular expression of the regular template in the preset strategy set, wherein the regular expression of the regular template is preset by a user according to condition numbers, quality inspection types, roles, range thresholds, comparison relations and regular keyword categories.
In a second aspect, the present application provides a speech recognition device based on policy rules, comprising:
the parameter input module is used for receiving a target audio file and a preset strategy set;
the strategy matching module is used for judging whether the dialog text of the target audio file matches the keyword labels of all strategies in the preset strategy set or not, and if yes, acquiring a corresponding quality inspection score;
and the result output module is used for determining the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold value.
Further, the policy matching module includes:
and the keyword template matching unit is used for judging whether the dialog text is matched with each first keyword tag of the keyword template in the preset strategy set, wherein the first keyword tag of the keyword template is preset by a user.
Further, the policy matching module further comprises:
and the NLP template matching unit is used for judging whether the dialog text is matched with each second keyword label of the NLP template in the preset strategy set, wherein the second keyword label of the NLP template is obtained after a user performs model training on a preset NLP model according to sample data added with labels in advance.
Further, the policy matching module further comprises:
and the regular template matching unit is used for judging whether the dialog text is matched with each regular expression of the regular templates in the preset strategy set, wherein the regular expressions of the regular templates are preset by the user according to condition numbers, quality inspection types, roles, range thresholds, comparison relations and regular keyword categories.
In a third aspect, the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the policy rule based speech recognition method when executing the program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the policy rule based speech recognition method.
According to the technical scheme, the voice recognition method and the voice recognition device based on the strategy rules are characterized in that a target audio file and a preset strategy set are received; judging whether the dialog text of the target audio file is matched with the keyword labels of all the strategies in the preset strategy set, if so, acquiring a corresponding quality inspection score; and determining the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold value, so that the voice recognition can be carried out in an individualized and configurable manner, and the accuracy of the voice recognition can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart illustrating a method for speech recognition based on policy rules according to an embodiment of the present application;
FIG. 2 is a block diagram of a voice recognition device based on policy rules according to an embodiment of the present application;
FIG. 3 is a second block diagram of a policy rules based speech recognition apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the prior art, the main purpose of voice quality inspection is to judge whether violations exist based on the conversation between the two parties of a telephone call; different enterprises define violations differently, and the prior art has difficulty meeting the requirements of personalization, configurability and high accuracy in voice quality inspection. In view of this, the application provides a voice recognition method and device based on policy rules, which receive a target audio file and a preset policy set; judge whether the dialog text of the target audio file matches the keyword tags of each policy in the preset policy set and, if so, acquire a corresponding quality inspection score; and determine the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold, so that voice recognition can be carried out in a personalized and configurable manner and the accuracy of voice recognition can be improved.
In order to perform speech recognition in a personalized and configurable manner and improve the accuracy of speech recognition, the present application provides an embodiment of a speech recognition method based on policy rules, and referring to fig. 1, the speech recognition method based on policy rules specifically includes the following contents:
Step S101: receiving a target audio file and a preset policy set.
It is understood that the inputs in the present application may be the recorded dialog text obtained by converting the target audio file, together with the policy set. The policy set can be passed in as a data structure with the following one-to-many relationships: one policy set can contain a plurality of policies; one policy can contain a plurality of rules; one rule can contain a plurality of rule conditions; and one rule condition can contain a plurality of templates. A template has one of only three template types: keyword template, NLP template, or regular template.
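The nesting described above can be sketched as a small set of Python dataclasses. This is only an illustration under stated assumptions: the class names, field names and the `all_templates` helper are hypothetical and do not come from the application, which says only that the policy set is transmitted as a data structure with these one-to-many relationships and three template types.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class TemplateType(Enum):
    """The three template types named in the description."""
    KEYWORD = "keyword"
    NLP = "nlp"
    REGULAR = "regular"


@dataclass
class Template:
    name: str                      # hypothetical field: a label for the template
    type: TemplateType
    tags: List[str] = field(default_factory=list)


@dataclass
class RuleCondition:
    templates: List[Template] = field(default_factory=list)


@dataclass
class Rule:
    conditions: List[RuleCondition] = field(default_factory=list)


@dataclass
class Policy:
    name: str
    rules: List[Rule] = field(default_factory=list)


@dataclass
class PolicySet:
    policies: List[Policy] = field(default_factory=list)


def all_templates(policy_set: PolicySet) -> List[Template]:
    """Flatten the four one-to-many levels into a flat list of templates."""
    return [
        template
        for policy in policy_set.policies
        for rule in policy.rules
        for condition in rule.conditions
        for template in condition.templates
    ]


# A tiny policy set: one policy, one rule, one condition, one keyword template.
greeting = Template("greeting", TemplateType.KEYWORD, ["hello"])
example_set = PolicySet([Policy("opening", [Rule([RuleCondition([greeting])])])])
```

Flattening the hierarchy this way is one convenient option for the matching step, since each template can then be checked against the dialog text in turn.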
Step S102: judging whether the dialog text of the target audio file matches the keyword tags of each policy in the preset policy set and, if so, acquiring a corresponding quality inspection score.
It will be appreciated that each of the three templates described above provides a set of tags (i.e. the keyword tags). The next step is to find out whether the recorded dialog text contains a tag; if it does, this is called a hit, and what needs to be obtained is the name of the hit tag and the number of times it was hit. Tags are individual words such as "hello", "ask", "at".
Step S103: determining a quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold.
Optionally, the quality inspection score is the change applied to the score each time a keyword tag is hit. For example, if the quality inspection score is set to -5, the score decreases by 5 for every keyword tag hit and is unchanged when no keyword tag is hit. The default value is 0, that is, the score is increased or decreased starting from 0, and the final score can be negative. The user can also set upper and lower limits so that the score stays within the interval the user requires.
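As a worked illustration of the scoring just described, the following minimal sketch accumulates a per-hit change starting from 0 and then clamps the result to optional limits. The function names and the dictionary layout are hypothetical. The comparison in `passes_inspection` (score at or above the threshold counts as passing) is an assumption, since the application does not state the direction of the threshold comparison.

```python
from typing import Dict, Optional


def quality_score(hits: Dict[str, int],
                  deltas: Dict[str, float],
                  lower: Optional[float] = None,
                  upper: Optional[float] = None) -> float:
    """Start from 0 and apply each tag's per-hit change once per hit."""
    score = 0.0
    for tag, count in hits.items():
        # Tags without a configured change leave the score untouched.
        score += deltas.get(tag, 0.0) * count
    # Optional user-defined limits keep the score inside an interval.
    if lower is not None:
        score = max(score, lower)
    if upper is not None:
        score = min(score, upper)
    return score


def passes_inspection(score: float, threshold: float) -> bool:
    """Assumption: a score at or above the threshold passes."""
    return score >= threshold


# Three hits on a tag configured with -5 per hit.
raw = quality_score({"rude": 3}, {"rude": -5})
limited = quality_score({"rude": 3}, {"rude": -5}, lower=-10)
```

With a change of -5 and three hits, the unclamped score is -15; a lower limit of -10 raises it to -10.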
As can be seen from the above description, the voice recognition method based on policy rules provided in the embodiments of the present application can receive a target audio file and a preset policy set; judging whether the dialog text of the target audio file is matched with the keyword labels of all the strategies in the preset strategy set, if so, acquiring a corresponding quality inspection score; and determining the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold value, so that the voice recognition can be carried out in an individualized and configurable manner, and the accuracy of the voice recognition can be improved.
In order to accurately and flexibly perform audio quality inspection on a target audio file, in an embodiment of the speech recognition method based on the policy rules, the following may be specifically included:
and judging whether the dialog text is matched with each first keyword tag of the keyword template in the preset strategy set, wherein the first keyword tag of the keyword template is preset by a user.
Optionally, keyword templates are created first and a name is set for each one; names cannot be repeated, so that a user can quickly find a keyword template by its name. The keywords in a keyword template are added manually according to the user's service requirements. After the keywords are added, the whole keyword template is passed to the engine, which traverses the recorded text word by word and finds the matched keywords and the number of matches. For example, if the keyword is 'asking you for yes', the traversal treats every four words as one unit when scanning the text. Each time the text contains 'asking you for yes', the hit keyword is recorded and the counter is increased by one; the loop repeats in this way, and after the traversal is finished the keyword tag and the number of hits are returned.
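The traversal can be sketched as a sliding window whose width equals the keyword length. This is a minimal sketch, not the engine described in the application: the function name is hypothetical, and the window here moves over characters rather than words, which is an assumption made to keep the example short. Overlapping occurrences are each counted.

```python
from typing import Dict, List


def count_keyword_hits(text: str, keywords: List[str]) -> Dict[str, int]:
    """Return each hit keyword together with its number of hits."""
    hits: Dict[str, int] = {}
    for keyword in keywords:
        if not keyword:
            continue  # an empty keyword would match everywhere
        width = len(keyword)
        counter = 0
        # Slide a window of the keyword's width across the text.
        for start in range(len(text) - width + 1):
            if text[start:start + width] == keyword:
                counter += 1
        if counter:
            hits[keyword] = counter
    return hits


sample = "hello, may I ask who is calling? hello"
result = count_keyword_hits(sample, ["hello", "ask", "bye"])
```

Keywords that never occur are left out of the result, so the returned dictionary holds only hit tags and their counts.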
In order to accurately and flexibly perform audio quality inspection on a target audio file, in an embodiment of the speech recognition method based on the policy rules, the following may be specifically included:
and judging whether the dialog text is matched with each second keyword label of the NLP template in the preset strategy set, wherein the second keyword label of the NLP template is obtained after a user carries out model training on a preset NLP model according to sample data added with labels in advance.
Optionally, the user creates an NLP template and adds keyword tags to it according to the user's service requirements. The recorded dialog text is processed by the NLP model to obtain predicted keyword tags, which are put into a set; the set is then traversed to check whether any predicted keyword tag matches a keyword tag in the NLP template created by the user. If there is a match, the name of the hit tag is returned; otherwise NULL is returned.
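The NLP template check can be sketched as a comparison between the tags a model predicts and the tags configured in the template. Everything named here is hypothetical: `toy_predictor` is a rule-based stand-in for the trained NLP model, which the application does not specify, and `None` plays the role of the NULL return value.

```python
from typing import Callable, Iterable, List, Optional


def match_nlp_template(text: str,
                       predict_tags: Callable[[str], Iterable[str]],
                       template_tags: Iterable[str]) -> Optional[List[str]]:
    """Return the predicted tags that also appear in the template, or None."""
    predicted = set(predict_tags(text))   # tags output by the model
    wanted = set(template_tags)           # tags the user added to the template
    hits = [tag for tag in sorted(predicted) if tag in wanted]
    return hits or None


def toy_predictor(text: str) -> List[str]:
    """Stand-in for a trained model; a real system would call the model here."""
    lowered = text.lower()
    tags = []
    if "refund" in lowered:
        tags.append("refund_request")
    if "thank" in lowered:
        tags.append("politeness")
    return tags


matched = match_nlp_template("I want a refund, thanks",
                             toy_predictor,
                             ["refund_request", "complaint"])
```

Passing the predictor in as a callable keeps the matching logic independent of whichever model is actually trained on the labelled sample data.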
In order to accurately and flexibly perform audio quality inspection on a target audio file, in an embodiment of the speech recognition method based on the policy rules, the following may be specifically included:
and judging whether the dialog text is matched with each regular expression of the regular template in the preset strategy set, wherein the regular expression of the regular template is preset by a user according to condition numbers, quality inspection types, roles, range thresholds, comparison relations and regular keyword categories.
Optionally, one significant feature that distinguishes the regular template from the NLP template and the keyword template is that it distinguishes the roles of the two parties in a recorded conversation. The first two templates traverse the whole recorded dialog text directly. The regular template first splits the dialog into the agent (seat) side and the client side, then checks the role selected by the user and traverses accordingly: if the selected role is the agent, the agent-side recorded dialog is traversed; if the selected role is the client, the client-side recorded dialog is traversed.
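The role split can be sketched as a simple filter over speaker-labelled sentences. The representation of the dialog as `(role, sentence)` pairs and the role names "agent" (the call-centre seat) and "client" are assumptions for illustration; the application does not say how speaker labels are stored.

```python
from typing import List, Tuple

# Hypothetical representation: each utterance is a (role, sentence) pair.
Utterance = Tuple[str, str]


def sentences_for_role(dialog: List[Utterance], role: str) -> List[str]:
    """Keep only the sentences spoken by the selected role, in order."""
    return [sentence for speaker, sentence in dialog if speaker == role]


dialog = [
    ("agent", "hello"),
    ("client", "hi, I have a question"),
    ("agent", "how can I help"),
]
agent_side = sentences_for_role(dialog, "agent")
client_side = sentences_for_role(dialog, "client")
```

Only the list for the role selected by the user would then be traversed by the regular template.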
Optionally, regular expressions are introduced in the regular template. Here a regular expression is an expression assembled by combining conditions with AND, OR and NOT operations. Because the expression is matched against a text and each condition must be judged relative to a specific sentence, the sentence for each condition has to be located first. Specifically, the text of the corresponding role is traversed in a for loop, the position subscript of the sentence is found according to the user-defined condition limits, and the subscript value is returned. Once each sentence subscript is found, the scope of the corresponding condition in the regular expression can be determined, so that more accurate condition hit processing is performed around that sentence. Whether the sentence serves as the starting subscript or the ending subscript of the scope is determined by the user-defined regular condition. For example, if the user condition is to find whether the keyword 'hello' appears in the three sentences of dialog text after the sentence, then the sentence is used as the starting subscript; the opposite case works in the same way. Each scope defined by the sentence is then traversed, and the keyword tags meeting the condition are found and returned.
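The scope logic can be sketched as follows: locate an anchor sentence, build a window of sentences that starts or ends at it, and test a keyword inside that window. The function names and parameters are hypothetical. Whether the anchor sentence itself belongs to the scope is not stated in the application; this sketch assumes it does. Conditions are combined with Python's own `and`, `or` and `not` in place of an expression parser.

```python
from typing import List, Optional


def find_anchor(sentences: List[str], anchor_keyword: str) -> Optional[int]:
    """Return the subscript of the first sentence containing the anchor."""
    for index, sentence in enumerate(sentences):
        if anchor_keyword in sentence:
            return index
    return None


def keyword_in_scope(sentences: List[str], anchor_keyword: str, keyword: str,
                     span: int, after: bool = True) -> bool:
    """Check for a keyword within `span` sentences after or before the anchor."""
    anchor = find_anchor(sentences, anchor_keyword)
    if anchor is None:
        return False
    if after:
        # Anchor is the starting subscript (assumed to be part of the scope).
        scope = sentences[anchor:anchor + span + 1]
    else:
        # Anchor is the ending subscript (assumed to be part of the scope).
        scope = sentences[max(0, anchor - span):anchor + 1]
    return any(keyword in sentence for sentence in scope)


agent = ["thanks for calling", "hello", "how can I help", "please hold", "goodbye"]
greeted = keyword_in_scope(agent, "thanks for calling", "hello", span=3)
held = keyword_in_scope(agent, "thanks for calling", "please hold", span=2)
rule_hit = greeted and not held  # conditions combined with AND / NOT
```

In this example the greeting falls inside the three-sentence scope and the hold request falls outside the two-sentence scope, so the combined rule is hit.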
In order to perform speech recognition in a personalized and configurable manner and improve the accuracy of speech recognition, the present application provides an embodiment of a speech recognition apparatus based on policy rules, which is used for implementing all or part of the contents of the speech recognition method based on policy rules, and referring to fig. 2, the speech recognition apparatus based on policy rules specifically includes the following contents:
and the parameter input module 10 is used for receiving the target audio file and a preset strategy set.
And the strategy matching module 20 is configured to determine whether the dialog text of the target audio file matches the keyword tag of each strategy in the preset strategy set, and if so, obtain a corresponding quality inspection score.
And the result output module 30 is used for determining the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold value.
As can be seen from the foregoing description, the voice recognition apparatus based on policy rules provided in the embodiments of the present application can receive a target audio file and a preset policy set; judging whether the dialog text of the target audio file is matched with the keyword labels of all the strategies in the preset strategy set, if so, acquiring a corresponding quality inspection score; and determining the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold value, so that the voice recognition can be carried out in an individualized and configurable manner, and the accuracy of the voice recognition can be improved.
In order to accurately and flexibly perform audio quality inspection on a target audio file, in an embodiment of the speech recognition apparatus based on policy rules of the present application, referring to fig. 3, the policy matching module 20 includes:
and a keyword template matching unit 21, configured to determine whether the dialog text matches each first keyword tag of the keyword templates in the preset policy set, where the first keyword tag of the keyword template is preset by a user.
The NLP template matching unit 22 is configured to determine whether the dialog text matches each second keyword tag of the NLP template in the preset policy set, where the second keyword tag of the NLP template is obtained after a user performs model training on a preset NLP model according to sample data to which a tag is added in advance.
The regular template matching unit 23 is configured to determine whether the dialog text matches each regular expression of the regular templates in the preset policy set, where the regular expression of the regular template is preset by the user according to the condition number, the quality inspection type, the role, the range threshold, the comparison relationship, and the category of the regular keyword.
In order to perform speech recognition in a personalized and configurable manner and improve accuracy of speech recognition, an embodiment of an electronic device for implementing all or part of contents in the speech recognition method based on policy rules is provided in the present application, where the electronic device specifically includes the following contents:
a processor, a memory, a communication interface, and a bus; the processor, the memory and the communication interface communicate with one another through the bus; the communication interface is used for transmitting information between the voice recognition device based on policy rules and related equipment such as a core service system, a user terminal and a related database; the logic controller may be a desktop computer, a tablet computer, a mobile terminal, and the like, but the embodiment is not limited thereto. In this embodiment, the logic controller may be implemented with reference to the embodiment of the policy rule based speech recognition method and the embodiment of the policy rule based speech recognition apparatus, the contents of which are incorporated herein and are not repeated.
It is understood that the user terminal may include a smart phone, a tablet electronic device, a network set-top box, a portable computer, a desktop computer, a Personal Digital Assistant (PDA), an in-vehicle device, a smart wearable device, and the like. The smart wearable device may include smart glasses, a smart watch, a smart bracelet, etc.
In practical applications, part of the voice recognition method based on policy rules may be performed on the electronic device side as described above, or all operations may be performed in the client device. The selection may be specifically performed according to the processing capability of the client device, the limitation of the user usage scenario, and the like. This is not a limitation of the present application. The client device may further include a processor if all operations are performed in the client device.
The client device may have a communication module (i.e., a communication unit), and may be communicatively connected to a remote server to implement data transmission with the server. The server may include a server on the task scheduling center side, and in other implementation scenarios, the server may also include a server on an intermediate platform, for example, a server on a third-party server platform that is communicatively linked to the task scheduling center server. The server may include a single computer device, or may include a server cluster formed by a plurality of servers, or a server structure of a distributed apparatus.
Fig. 4 is a schematic block diagram of a system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 4, the electronic device 9600 can include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, this fig. 4 is exemplary; other types of structures may also be used in addition to or in place of the structure to implement telecommunications or other functions.
In one embodiment, the policy rule based speech recognition method functionality may be integrated into the central processor 9100. The central processor 9100 may be configured to control as follows:
Step S101: receiving a target audio file and a preset policy set.
Step S102: judging whether the dialog text of the target audio file matches the keyword tags of each policy in the preset policy set and, if so, acquiring a corresponding quality inspection score.
Step S103: determining a quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold.
As can be seen from the above description, the electronic device provided in the embodiment of the present application receives a target audio file and a preset policy set; judging whether the dialog text of the target audio file is matched with the keyword labels of all the strategies in the preset strategy set, if so, acquiring a corresponding quality inspection score; and determining the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold value, so that the voice recognition can be carried out in an individualized and configurable manner, and the accuracy of the voice recognition can be improved.
In another embodiment, the voice recognition device based on the policy rules may be configured separately from the central processor 9100, for example, the voice recognition device based on the policy rules may be configured as a chip connected to the central processor 9100, and the voice recognition method based on the policy rules may be controlled by the central processor to implement the function.
As shown in fig. 4, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 also does not necessarily include all of the components shown in fig. 4; further, the electronic device 9600 may further include components not shown in fig. 4, which may be referred to in the art.
As shown in fig. 4, a central processor 9100, sometimes referred to as a controller or operational control, can include a microprocessor or other processor device and/or logic device, which central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.
The memory 9140 can be, for example, one or more of a buffer, a flash memory, a hard drive, a removable media, a volatile memory, a non-volatile memory, or other suitable device. The information relating to the failure may be stored, and a program for executing the information may be stored. And the central processing unit 9100 can execute the program stored in the memory 9140 to realize information storage or processing, or the like.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. Power supply 9170 is used to provide power to electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.
The memory 9140 can be a solid state memory, e.g., Read Only Memory (ROM), Random Access Memory (RAM), a SIM card, or the like. There may also be a memory that holds information even when power is off, can be selectively erased, and is provided with more data, an example of which is sometimes called an EPROM or the like. The memory 9140 could also be some other type of device. Memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer). The memory 9140 may include an application/function storage portion 9142, the application/function storage portion 9142 being used for storing application programs and function programs or for executing a flow of operations of the electronic device 9600 by the central processor 9100.
The memory 9140 can also include a data store 9143, the data store 9143 being used to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers for the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, contact book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling locally stored sounds to be played through the speaker 9131.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps of the policy-rule-based speech recognition method of the above embodiments, with the server or the client as the execution subject. The computer-readable storage medium stores a computer program which, when executed by a processor, implements all the steps of that method. For example, when executing the computer program, the processor implements the following steps:
Step S101: receive a target audio file and a preset policy set.
Step S102: determine whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set and, if so, obtain the corresponding quality inspection score.
Step S103: determine a quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold.
As can be seen from the foregoing description, the computer-readable storage medium provided in the embodiments of the present application receives a target audio file and a preset policy set; determines whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set and, if so, obtains the corresponding quality inspection score; and determines the quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold. Speech recognition can thereby be carried out in a personalized and configurable manner, and its accuracy can be improved.
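By way of non-limiting illustration, steps S102 and S103 may be sketched in Python as follows. The sketch assumes that the dialog text has already been transcribed from the target audio file. The names `Policy`, `keyword_policy`, `regex_policy`, and `quality_inspect` are hypothetical and do not appear in the application. Summing the scores of the matched policies and treating a total at or above the threshold as a pass are likewise assumptions, since the application does not specify how scores are aggregated or compared.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Policy:
    """One hypothetical policy: a matcher over the dialog text plus a score."""
    name: str
    matches: Callable[[str], bool]
    score: int


def keyword_policy(name: str, keywords: List[str], score: int) -> Policy:
    # Keyword template: matches if any user-preset keyword occurs in the text.
    return Policy(name, lambda text: any(k in text for k in keywords), score)


def regex_policy(name: str, pattern: str, score: int) -> Policy:
    # Regular template: matches if the user-preset regular expression is found.
    compiled = re.compile(pattern)
    return Policy(name, lambda text: compiled.search(text) is not None, score)


def quality_inspect(dialog_text: str, policies: List[Policy], threshold: int) -> dict:
    # S102: collect every policy whose tag matches the dialog text.
    hits = [p for p in policies if p.matches(dialog_text)]
    # Assumption: the quality inspection score is the sum of the matched scores.
    total = sum(p.score for p in hits)
    # S103: compare the score with the preset threshold (assumed: >= passes).
    return {
        "score": total,
        "hits": [p.name for p in hits],
        "passed": total >= threshold,
    }


# Hypothetical policy set and transcribed dialog text.
policies = [
    keyword_policy("greeting", ["hello"], 10),
    regex_policy("id_check", r"\bID number\b", 20),
]
result = quality_inspect("hello, may I have your ID number", policies, threshold=25)
```

In this sketch an NLP template could be added as a third kind of `Policy` whose `matches` function wraps a trained classifier; that is omitted here because the application does not specify the model.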
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Specific embodiments are used herein to explain the principle and implementation of the invention; the description of these embodiments is intended only to aid understanding of the method and its core idea. A person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (10)

1. A speech recognition method based on policy rules, the method comprising:
receiving a target audio file and a preset policy set;
determining whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set and, if so, obtaining a corresponding quality inspection score; and
determining a quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold.
2. The method according to claim 1, wherein the determining whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set comprises:
determining whether the dialog text matches each first keyword tag of the keyword template in the preset policy set, wherein the first keyword tags of the keyword template are preset by a user.
3. The method according to claim 1, wherein the determining whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set further comprises:
determining whether the dialog text matches each second keyword tag of the NLP template in the preset policy set, wherein the second keyword tags of the NLP template are obtained after a user trains a preset NLP model on sample data labeled in advance.
4. The method according to claim 1, wherein the determining whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set further comprises:
determining whether the dialog text matches each regular expression of the regular template in the preset policy set, wherein the regular expressions of the regular template are preset by a user according to a condition number, a quality inspection type, a role, a range threshold, a comparison relation, and a regular keyword category.
5. A policy-rule-based speech recognition apparatus, comprising:
a parameter input module configured to receive a target audio file and a preset policy set;
a policy matching module configured to determine whether the dialog text of the target audio file matches the keyword tag of each policy in the preset policy set and, if so, to obtain a corresponding quality inspection score; and
a result output module configured to determine a quality inspection result of the target audio file according to the quality inspection score and a preset quality inspection threshold.
6. The policy-rule-based speech recognition apparatus of claim 5, wherein the policy matching module comprises:
a keyword template matching unit configured to determine whether the dialog text matches each first keyword tag of the keyword template in the preset policy set, wherein the first keyword tags of the keyword template are preset by a user.
7. The policy-rule-based speech recognition apparatus of claim 5, wherein the policy matching module further comprises:
an NLP template matching unit configured to determine whether the dialog text matches each second keyword tag of the NLP template in the preset policy set, wherein the second keyword tags of the NLP template are obtained after a user trains a preset NLP model on sample data labeled in advance.
8. The policy-rule-based speech recognition apparatus of claim 5, wherein the policy matching module further comprises:
a regular template matching unit configured to determine whether the dialog text matches each regular expression of the regular template in the preset policy set, wherein the regular expressions of the regular template are preset by a user according to a condition number, a quality inspection type, a role, a range threshold, a comparison relation, and a regular keyword category.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the policy-rule-based speech recognition method according to any one of claims 1 to 4.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the policy-rule-based speech recognition method according to any one of claims 1 to 4.
CN201911284678.6A 2019-12-13 2019-12-13 Voice recognition method and device based on policy rules Pending CN110956956A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911284678.6A CN110956956A (en) 2019-12-13 2019-12-13 Voice recognition method and device based on policy rules

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911284678.6A CN110956956A (en) 2019-12-13 2019-12-13 Voice recognition method and device based on policy rules

Publications (1)

Publication Number Publication Date
CN110956956A true CN110956956A (en) 2020-04-03

Family

ID=69981813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911284678.6A Pending CN110956956A (en) 2019-12-13 2019-12-13 Voice recognition method and device based on policy rules

Country Status (1)

Country Link
CN (1) CN110956956A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491388A (en) * 2018-03-22 2018-09-04 平安科技(深圳)有限公司 Data set acquisition methods, sorting technique, device, equipment and storage medium
CN111627461A (en) * 2020-05-29 2020-09-04 平安医疗健康管理股份有限公司 Voice quality inspection method and device, server and storage medium
CN111753675A (en) * 2020-06-08 2020-10-09 北京天空卫士网络安全技术有限公司 Picture type junk mail identification method and device
CN112380831A (en) * 2020-11-11 2021-02-19 锐捷网络股份有限公司 Bidding method and device for configuration manual
CN112465399A (en) * 2020-12-16 2021-03-09 作业帮教育科技(北京)有限公司 Intelligent quality inspection method and device based on automatic strategy iteration and electronic equipment
CN112541774A (en) * 2020-12-08 2021-03-23 四川众信佳科技发展有限公司 AI quality inspection method, device, system, electronic device and storage medium
CN112562652A (en) * 2020-12-02 2021-03-26 湖南翰坤实业有限公司 Voice processing method and system based on Unity engine
CN112632238A (en) * 2020-12-11 2021-04-09 浙江百应科技有限公司 Dialogue method and system for templated robot dialect
CN112836042A (en) * 2020-10-13 2021-05-25 讯飞智元信息科技有限公司 Harmful audio recognition method and device, electronic equipment and computer readable medium
CN113065328A (en) * 2021-04-06 2021-07-02 浙江百应科技有限公司 Conversation content analysis method based on regular and text truncation
CN113204630A (en) * 2021-05-31 2021-08-03 平安科技(深圳)有限公司 Text matching method and device, computer equipment and readable storage medium
CN113593533A (en) * 2021-09-10 2021-11-02 平安科技(深圳)有限公司 Flow node skipping method, device, equipment and medium based on intention recognition

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120191454A1 (en) * 2011-01-26 2012-07-26 TrackThings LLC Method and Apparatus for Obtaining Statistical Data from a Conversation
WO2017160762A1 (en) * 2016-03-15 2017-09-21 Global Tel*Link Corp. Detection and prevention of inmate to inmate message relay
CN108737667A (en) * 2018-05-03 2018-11-02 平安科技(深圳)有限公司 Voice quality detecting method, device, computer equipment and storage medium
CN108962282A (en) * 2018-06-19 2018-12-07 京北方信息技术股份有限公司 Speech detection analysis method, apparatus, computer equipment and storage medium
CN109271489A (en) * 2018-10-25 2019-01-25 第四范式(北京)技术有限公司 A kind of Method for text detection and device
CN109327632A (en) * 2018-11-23 2019-02-12 深圳前海微众银行股份有限公司 Intelligent quality inspection system, method and the computer readable storage medium of customer service recording
CN110197672A (en) * 2018-02-27 2019-09-03 招商信诺人寿保险有限公司 A kind of voice call quality detection method, server, storage medium
CN110378562A (en) * 2019-06-17 2019-10-25 中国平安人寿保险股份有限公司 Voice quality detecting method, device, computer equipment and storage medium

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491388B (en) * 2018-03-22 2021-02-23 平安科技(深圳)有限公司 Data set acquisition method, classification method, device, equipment and storage medium
CN108491388A (en) * 2018-03-22 2018-09-04 平安科技(深圳)有限公司 Data set acquisition methods, sorting technique, device, equipment and storage medium
CN111627461A (en) * 2020-05-29 2020-09-04 平安医疗健康管理股份有限公司 Voice quality inspection method and device, server and storage medium
CN111753675A (en) * 2020-06-08 2020-10-09 北京天空卫士网络安全技术有限公司 Picture type junk mail identification method and device
CN111753675B (en) * 2020-06-08 2024-03-26 北京天空卫士网络安全技术有限公司 Picture type junk mail identification method and device
CN112836042A (en) * 2020-10-13 2021-05-25 讯飞智元信息科技有限公司 Harmful audio recognition method and device, electronic equipment and computer readable medium
CN112380831B (en) * 2020-11-11 2023-07-25 锐捷网络股份有限公司 Calibration method and device for configuration manual
CN112380831A (en) * 2020-11-11 2021-02-19 锐捷网络股份有限公司 Bidding method and device for configuration manual
CN112562652A (en) * 2020-12-02 2021-03-26 湖南翰坤实业有限公司 Voice processing method and system based on Unity engine
CN112541774A (en) * 2020-12-08 2021-03-23 四川众信佳科技发展有限公司 AI quality inspection method, device, system, electronic device and storage medium
CN112632238A (en) * 2020-12-11 2021-04-09 浙江百应科技有限公司 Dialogue method and system for templated robot dialect
CN112632238B (en) * 2020-12-11 2022-05-13 浙江百应科技有限公司 Dialogue method and system for templated robot dialect
CN112465399A (en) * 2020-12-16 2021-03-09 作业帮教育科技(北京)有限公司 Intelligent quality inspection method and device based on automatic strategy iteration and electronic equipment
CN113065328A (en) * 2021-04-06 2021-07-02 浙江百应科技有限公司 Conversation content analysis method based on regular and text truncation
CN113204630A (en) * 2021-05-31 2021-08-03 平安科技(深圳)有限公司 Text matching method and device, computer equipment and readable storage medium
CN113593533A (en) * 2021-09-10 2021-11-02 平安科技(深圳)有限公司 Flow node skipping method, device, equipment and medium based on intention recognition
CN113593533B (en) * 2021-09-10 2023-05-02 平安科技(深圳)有限公司 Method, device, equipment and medium for jumping flow node based on intention recognition

Similar Documents

Publication Publication Date Title
CN110956956A (en) Voice recognition method and device based on policy rules
US10546067B2 (en) Platform for creating customizable dialog system engines
US10885529B2 (en) Automated upsells in customer conversations
CN106372059A (en) Information input method and information input device
US8856007B1 (en) Use text to speech techniques to improve understanding when announcing search results
US10929606B2 (en) Method for follow-up expression for intelligent assistance
CN112579733B (en) Rule matching method, rule matching device, storage medium and electronic equipment
CN111582360B (en) Method, apparatus, device and medium for labeling data
CN111462726B (en) Method, device, equipment and medium for answering out call
CN111048115A (en) Voice recognition method and device
CN113342948A (en) Intelligent question and answer method and device
CN111813900A (en) Multi-turn conversation processing method and device, electronic equipment and storage medium
CN110209768B (en) Question processing method and device for automatic question answering
CN111339282A (en) Intelligent online response method and intelligent customer service system
CN111625629B (en) Task type dialogue robot response method and device, robot and storage medium
US20200328990A1 (en) Intelligent Scheduler for Chatbot Sessions
CN112052316A (en) Model evaluation method, model evaluation device, storage medium and electronic equipment
CN112925895A (en) Natural language software operation and maintenance method and device
CN111666408A (en) Method and device for screening and displaying important clauses
CN115798458A (en) Classified language identification method and device
CN110931014A (en) Speech recognition method and device based on regular matching rule
CN114662452A (en) Privacy-removing text label analysis method and device
CN115495519A (en) Report data processing method and device
CN114840576A (en) Data standard matching method and device
KR102276391B1 (en) Computer program for providing a way to respond to customers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200403)