CN113158692A - Multi-intention processing method, system, equipment and storage medium based on semantic recognition - Google Patents

Multi-intention processing method, system, equipment and storage medium based on semantic recognition

Info

Publication number
CN113158692A
Authority
CN
China
Prior art keywords
intention
type
target
intent
operation flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110435537.0A
Other languages
Chinese (zh)
Other versions
CN113158692B (en)
Inventor
陈林
何浩峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202110435537.0A
Publication of CN113158692A
Application granted
Publication of CN113158692B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the field of speech recognition and provides a multi-intention processing method based on semantic recognition, comprising the following steps: receiving speech to be recognized, and transcribing it to obtain a text to be recognized; splitting the text to be recognized into a plurality of target texts; identifying, through a pre-trained semantic recognition model, the target intention corresponding to each target text, to obtain a plurality of target intentions; acquiring the intention type of each target intention, and determining a target operation strategy according to those intention types; and determining one or more operation flows based on the target operation strategy, so as to execute a response operation based on the one or more operation flows. The invention improves the efficiency and accuracy of intention recognition and the efficiency of voice interaction, makes the voice reply system more human-like, and improves user experience.

Description

Multi-intention processing method, system, equipment and storage medium based on semantic recognition
Technical Field
The embodiments of the invention relate to the field of speech recognition, and in particular to a multi-intention processing method, system, equipment and storage medium based on semantic recognition.
Background
At present, intention recognition faces several difficulties. First, it is difficult for a speech recognition system to recognize the user's intention accurately. Second, if a user's utterance carries multiple meanings, the speech recognition system cannot process them well, or the meaning it processes is not the user's core meaning. Third, processing is incomplete, so the user's multiple meanings are not all handled. As a result, conventional speech recognition systems often give irrelevant answers, dodge the important point, or fail to address the user's core meaning during voice interaction, or, when several intentions occur, reply to only one of them and cannot handle them all. Therefore, how to improve the accuracy of intention recognition, and thereby the efficiency of voice interaction, has become a technical problem in urgent need of a solution.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, a system, a device, and a readable storage medium for processing multiple intents based on semantic recognition, so as to solve the problems of low accuracy of intent recognition and low efficiency of voice interaction.
In order to achieve the above object, an embodiment of the present invention provides a multi-intent processing method based on semantic recognition, where the method includes:
receiving a voice to be recognized, and performing transcription operation on the voice to be recognized to obtain a text to be recognized;
splitting the text to be recognized into a plurality of target texts;
identifying a target intention corresponding to each target text through a pre-trained semantic recognition model to obtain a plurality of target intentions;
acquiring the intention type of each target intention, and determining a target operation strategy according to the intention type of each target intention; and
determining one or more operational flows based on the target operational policy to perform responsive operations based on the one or more operational flows.
Illustratively, the intent type includes at least one of an ending-type intent, a backbone-type intent, and an objection-type intent, wherein the objection-type intent includes at least one of a resolvable intent, an unresolvable intent, and an intent that needs no resolution (a no-resolution-needed intent);
the step of determining a target operation strategy according to the intention type of each target intention comprises:
determining the target operation strategy according to the intention type of each target intention and a pre-configured operation priority;
the operation priority is the execution order of the operation flows corresponding to the target intentions, and the execution order is as follows: the operation flow of the ending-type intent, the operation flow of the unresolvable intent, the operation flow of the backbone-type intent, the operation flow of the resolvable intent, and the operation flow of the no-resolution-needed intent.
Illustratively, the step of determining a target operation policy based on the intent type of each of the target intents includes:
when the ending type intention exists in the intention types of the target intentions, determining the operation flow for executing the ending type intention as the target operation strategy.
Illustratively, the step of determining a target operation policy based on the intent type of each of the target intents includes:
when the ending type intention does not exist in the intention types of the target intentions, judging whether the objection type intention exists or not;
if the objection-type intent is present, detecting whether the objection-type intent includes the resolvable intent;
if the objection-type intent includes the resolvable intent, executing the operation flow of the resolvable intent first, and then executing, according to the operation priority, the operation flow with the highest current operation priority among the operation flow of the unresolvable intent, the operation flow of the backbone-type intent, and the operation flow of the no-resolution-needed intent, these flows being determined as the target operation strategy; and
if the objection-type intent does not include the resolvable intent, determining, as the target operation strategy, the operation flow with the highest current operation priority among the operation flow of the unresolvable intent, the operation flow of the backbone-type intent, and the operation flow of the no-resolution-needed intent.
Illustratively, the step of executing, according to the operation priority, the operation flow with the highest current operation priority among the operation flow of the unresolvable intent, the operation flow of the backbone-type intent, and the operation flow of the no-resolution-needed intent includes:
when the operation flow with the highest current operation priority is the operation flow of the unresolvable intent, and the target intents include a plurality of unresolvable intents arranged in a preset order, executing the operation flow of the last of the unresolvable intents, wherein the preset order includes the order of positions, in the text to be recognized, of the target texts corresponding to the unresolvable intents;
when the operation flow with the highest current operation priority is the operation flow of the backbone-type intent, and the target intents include a plurality of backbone-type intents arranged in the preset order, executing the operation flow of the last of the backbone-type intents, wherein the preset order further includes the order of positions, in the text to be recognized, of the target texts corresponding to the backbone-type intents; and
when the operation flow with the highest current operation priority is the operation flow of the no-resolution-needed intent, and the target intents include a plurality of no-resolution-needed intents arranged in the preset order, executing the operation flow of the last of the no-resolution-needed intents, wherein the preset order further includes the order of positions, in the text to be recognized, of the target texts corresponding to the no-resolution-needed intents.
Illustratively, the response operation comprises a play operation and a subsequent operation, wherein each operation flow of the intention type corresponds to one play operation, and each play operation corresponds to one subsequent operation;
the step of determining one or more operation flows based on the target operation policy to perform responsive operations based on the one or more operation flows comprises:
executing corresponding playing operation according to each operation flow; and
determining and executing the subsequent operation corresponding to the last playing operation among the playing operations.
Illustratively, the method further includes: uploading the target operation strategy to a blockchain.
In order to achieve the above object, an embodiment of the present invention further provides a system for processing multiple intents based on semantic recognition, including:
the receiving module is used for receiving the voice to be recognized and performing transcription operation on the voice to be recognized so as to obtain a text to be recognized;
the splitting module is used for splitting the text to be recognized into a plurality of target texts;
the recognition module is used for recognizing the target intention corresponding to each target text through a pre-trained semantic recognition model so as to obtain a plurality of target intents;
the determining module is used for acquiring the intention type of each target intention and determining a target operation strategy according to the intention type of each target intention; and
an execution module to determine one or more operational flows based on the target operational policy to execute responsive operations based on the one or more operational flows.
To achieve the above object, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when executed by the processor, the computer program implements the steps of the semantic recognition-based multi-intent processing method as described above.
To achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program is executable by at least one processor to cause the at least one processor to execute the steps of the semantic recognition based multi-intent processing method as described above.
According to the semantic recognition-based multi-intention processing method, system, computer device and computer-readable storage medium provided by the embodiments of the present invention, the text to be recognized is split into sentences and intention recognition is performed on the resulting target short sentences. This avoids the difficulty of recognizing intentions when the text to be recognized contains many characters or contains intersecting, contradictory or inverted intentions, which improves the accuracy of intention recognition for the text to be recognized, reduces its difficulty, and improves its efficiency. The target operation strategy is configured according to the intention type of each target intention, and the operation flow executed for each target intention is adjusted according to the target operation strategy, so that different reply operation flows are executed for different intentions: for example, a focused reply for a more important intention type, a selective reply for a secondary intention type, and no reply for an unimportant intention type. Replies to some voice intentions are thereby omitted, which improves the efficiency of voice interaction, makes the voice reply system more human-like, and improves user experience.
Drawings
FIG. 1 is a flow chart illustrating a multi-intent processing method based on semantic recognition according to an embodiment of the present invention;
FIG. 2 is a block diagram of a second embodiment of a semantic recognition based multi-intent processing system according to the present invention;
FIG. 3 is a schematic diagram of the hardware structure of a third embodiment of the computer device according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that descriptions involving "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, but only on the basis that a person skilled in the art can realize the combination; when technical solutions are contradictory or the combination cannot be realized, the combination should be considered not to exist and is not within the protection scope of the present invention.
Example one
Referring to fig. 1, a flowchart illustrating steps of a semantic recognition-based multi-intent processing method according to an embodiment of the present invention is shown. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed. The multi-intent processing system based on semantic recognition in the present embodiment may be executed in the computer device 2, and the following exemplary description will be made with the computer device 2 as an execution subject. The details are as follows.
Step S100, receiving a voice to be recognized, and performing transcription operation on the voice to be recognized to obtain a text to be recognized.
The computer device 2 may receive the speech to be recognized and transcribe it to obtain the text to be recognized. The speech to be recognized may also be obtained by the computer device 2 in real time; for example, in an intelligent customer service scenario, the computer device 2 can obtain, in real time, the call speech of the service object (target customer) served by the intelligent customer service. The speech to be recognized may be the n-th utterance of the target customer, or it may be a speech segment.
For example, the computer device 2 may transcribe the speech to be recognized to obtain a plurality of speech texts, calculate the matching degree between each speech text and the speech to be recognized, and take the speech text with the highest matching degree as the text to be recognized. Specifically, denoting the acoustic features of the speech to be recognized by x and a candidate speech text by w, the text to be recognized is the candidate

$$w^{*} = \arg\max_{w} p(w \mid x).$$

By Bayes' formula, p(w | x) can be computed from p(x | w) p(w):

$$w^{*} = \arg\max_{w} \frac{p(x \mid w)\, p(w)}{p(x)} = \arg\max_{w} p(x \mid w)\, p(w),$$

where p(x | w) represents the acoustic model, p(w) represents the language model, and p(x) represents the probability of the acoustic features; p(x) is the same for every w and can therefore be ignored when computing w*. In this embodiment, the text to be recognized with the highest matching degree with the speech to be recognized is obtained from the plurality of speech texts, so that the transcription accuracy of the speech recognition is improved.
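The decision rule above can be sketched in a few lines. This is only an illustration: the function name `pick_transcript` and the probability values are hypothetical, and a real system would obtain p(x | w) and p(w) from its acoustic and language models. Scores are summed in log space, and p(x) is omitted because it does not depend on w.

```python
import math


def pick_transcript(candidates):
    """Return the candidate text w that maximizes log p(x|w) + log p(w).

    `candidates` maps each candidate text to a pair
    (acoustic likelihood p(x|w), language-model prior p(w)).
    p(x) is identical for every candidate, so it is left out of the score.
    """
    def score(item):
        _, (p_x_given_w, p_w) = item
        return math.log(p_x_given_w) + math.log(p_w)

    best_text, _ = max(candidates.items(), key=score)
    return best_text


# Hypothetical scores for three candidate transcripts of one utterance.
candidates = {
    "I am in a meeting": (0.020, 0.30),
    "I am in a meating": (0.025, 0.01),
    "eye am in a meeting": (0.018, 0.02),
}
best = pick_transcript(candidates)
```

Here the second candidate has the highest acoustic likelihood, but the language-model prior makes the first candidate the overall best match.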
Step S102, the text to be recognized is split into a plurality of target texts.
After obtaining the text to be recognized, the computer device 2 may perform sentence-breaking operation on the text to be recognized to obtain the target texts.
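The source does not specify how sentence boundaries are found. As a minimal sketch, assuming the transcript already carries punctuation, the split could be done on punctuation marks; the function name `split_into_target_texts` is hypothetical, and an unpunctuated transcript would need a separate boundary-detection step that is not shown here.

```python
import re

# Western and Chinese sentence/clause punctuation used as break points (an assumption).
_BREAKS = re.compile(r"[,.;!?，。；！？]+")


def split_into_target_texts(text):
    """Split a punctuated transcript into non-empty, stripped target texts."""
    return [part.strip() for part in _BREAKS.split(text) if part.strip()]


target_texts = split_into_target_texts(
    "Who are you? What are you calling me for? I'm in a meeting, it's not convenient to talk."
)
```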
Step S104, identifying the target intention corresponding to each target text through a pre-trained semantic recognition model to obtain a plurality of target intentions.
After obtaining the target texts, the computer device 2 may input the target texts into the semantic recognition model, perform semantic recognition on each target text through the semantic recognition model, and output a plurality of target intents corresponding to the target texts. The semantic recognition model may be a trained NLP (Natural Language Processing) recognition model.
Step S106, obtaining the intention type of each target intention, and determining a target operation strategy according to the intention type of each target intention.
The intent types of the plurality of target intents include at least one of an ending-type intent, a backbone-type intent, and an objection-type intent, wherein the objection-type intent includes at least one of a resolvable intent, an unresolvable intent, and a no-resolution-needed intent. Wherein:
the ending-type intent is: an intent corresponding to speech that contains call-ending language;
the backbone-type intent is: the intent corresponding to the main topic of the text to be recognized as a whole;
the resolvable intent is: a query-like intent, which in most cases can be replied to with a fixed answer script;
the unresolvable intent is: an intent corresponding to a difficult problem, that is, an intent that cannot currently be resolved;
the no-resolution-needed intent is: an intent, such as one corresponding to filler or interjection-like speech, that has little practical meaning, or a cause-and-consequence-like intent.
Illustratively, the computer device 2 may determine the intent type of each target intent from the tag carried by that target intent. In this embodiment, when identifying the target intent corresponding to a target text, the semantic recognition model may also determine the intent type of that intent and configure a corresponding intent tag for each target intent according to its intent type, where one intent type corresponds to one intent tag.
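One way to picture the tagging is a small record per target intent that carries its type tag. The sketch below is illustrative only: the class names, the label strings, and the lookup table `INTENT_TYPE_BY_LABEL` are hypothetical, and in the described method the type would come from the semantic recognition model rather than from a fixed table.

```python
from dataclasses import dataclass
from enum import Enum


class IntentType(Enum):
    ENDING = "ending"
    BACKBONE = "backbone"
    RESOLVABLE = "resolvable"
    UNRESOLVABLE = "unresolvable"
    NO_RESOLUTION_NEEDED = "no_resolution_needed"


@dataclass(frozen=True)
class TargetIntent:
    position: int            # index of the target text within the text to be recognized
    label: str               # intent name produced by the recognition model
    intent_type: IntentType  # the tag carried by the intent


# Hypothetical label-to-type table standing in for the model's type decision.
INTENT_TYPE_BY_LABEL = {
    "ask identity": IntentType.RESOLVABLE,
    "ask purpose of call": IntentType.RESOLVABLE,
    "in a meeting": IntentType.UNRESOLVABLE,
    "no time": IntentType.NO_RESOLUTION_NEEDED,
    "hang up": IntentType.ENDING,
}


def tag_intents(labels):
    """Attach an intent-type tag to each recognized label, keeping text order."""
    return [
        TargetIntent(position, label, INTENT_TYPE_BY_LABEL[label])
        for position, label in enumerate(labels)
    ]


tagged = tag_intents(["ask identity", "in a meeting"])
```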
In an exemplary embodiment, the step S106 may further include: determining the target operation strategy according to the intent type of each target intent and a pre-configured operation priority. The operation priority is the execution order of the operation flows corresponding to the target intents, and the execution order is as follows: the operation flow of the ending-type intent, the operation flow of the unresolvable intent, the operation flow of the backbone-type intent, the operation flow of the resolvable intent, and the operation flow of the no-resolution-needed intent. In this embodiment, the voice intents are classified and different priorities are configured for different intent types, so that the higher-priority intents among the multiple target intents are replied to first, which improves the intelligent reply efficiency of the voice system.
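A pre-configured priority of this kind can be held as an ordered tuple. The sketch below assumes the reading of the priority list in which the unresolvable type ranks second and the type that needs no resolution ranks last; the type names are hypothetical string labels, not identifiers from the source.

```python
# Assumed reading of the pre-configured operation priority, highest first.
OPERATION_PRIORITY = (
    "ending",
    "unresolvable",
    "backbone",
    "resolvable",
    "no_resolution_needed",
)


def order_by_priority(intent_types):
    """Return the distinct intent types present, highest operation priority first."""
    present = set(intent_types)
    return [t for t in OPERATION_PRIORITY if t in present]


ordered = order_by_priority(
    ["resolvable", "no_resolution_needed", "unresolvable", "resolvable"]
)
```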
In an exemplary embodiment, the step S106 may further include: when the ending-type intent exists among the intent types of the target intents, determining the operation flow of the ending-type intent as the target operation strategy. Specifically, one of the voice intents in the text to be recognized may be the ending-type intent. For example, in an intelligent customer service scenario, the ending-type intent is the target user's intention to end the call: when the text to be recognized contains phrases such as "I'm hanging up", "I'll hang up first" or "goodbye", the text to be recognized has the ending-type intent, that is, the target user cannot continue the call at present. In this case, the computer device 2 may execute the call-ending flow, in which a preset reply voice (script) is played if the target user does not hang up immediately.
In an exemplary embodiment, the step S106 may further include steps S200 to S204:
step S200, when the ending-type intent does not exist among the intent types of the target intents, judging whether the objection-type intent exists, and, if the objection-type intent exists, detecting whether it includes the resolvable intent;
step S202, if the objection-type intent includes the resolvable intent, executing the operation flow of the resolvable intent first, and then executing, according to the operation priority, the operation flow with the highest current operation priority among the operation flow of the unresolvable intent, the operation flow of the backbone-type intent and the operation flow of the no-resolution-needed intent, these flows being determined as the target operation strategy; and
step S204, if the objection-type intent does not include the resolvable intent, determining, as the target operation strategy, the operation flow with the highest current operation priority among the operation flow of the unresolvable intent, the operation flow of the backbone-type intent and the operation flow of the no-resolution-needed intent.
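The branching in steps S200 to S204, together with the ending-type shortcut, can be sketched as a single function. This is an interpretation, not the patented implementation: the function name and the string type names are hypothetical, and the case in which no objection-type intent is present at all (which the text does not spell out) is assumed to fall through to the highest-priority remaining flow.

```python
def determine_strategy(intent_types):
    """Return the intent types whose operation flows should run, in order."""
    present = set(intent_types)

    # An ending-type intent overrides everything else.
    if "ending" in present:
        return ["ending"]

    # Highest-priority flow among the three non-resolvable candidates, if any.
    candidates = ("unresolvable", "backbone", "no_resolution_needed")
    top = [t for t in candidates if t in present][:1]

    # S202: resolvable intents are handled first, then the top remaining flow.
    if "resolvable" in present:
        return ["resolvable"] + top

    # S204: no resolvable intent, so only the top remaining flow runs.
    return top


strategy = determine_strategy(
    ["resolvable", "resolvable", "unresolvable", "no_resolution_needed"]
)
```

With this sketch, a text containing both a backbone-type intent and an unresolvable intent yields only the unresolvable flow, since the remaining flows are not executed once the highest-priority one is chosen.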
Illustratively, the computer device 2 may pre-configure a voice reply library, wherein the voice reply library comprises a plurality of answer-type voices and a plurality of reply-type voices. The answer-type voices are used for replying the resolvable intentions, and the reply-type voices are used for replying the voice intentions which cannot be solved by the answer-type voices.
When it is detected that the objection-type intent includes the resolvable intent, the operation flow of the resolvable intent is as follows: the computer device 2 first matches, from the plurality of answer-type voices, one or more answer-type voices corresponding to the resolvable intent; then matches, from the plurality of reply-type voices and according to the operation priority, a reply-type voice corresponding to the voice intent of the current operation priority; and plays the one or more answer-type voices first, followed by the reply-type voice corresponding to the voice intent of the current operation priority. The voice intent of the current operation priority may be one of an unresolvable intent, a backbone-type intent, a resolvable intent, or a no-resolution-needed intent. The speech text corresponding to a resolvable intent generally takes the form of a question whose answer has been prepared in advance, e.g., "Who are you?" (corresponding intent: ask identity) or "What are you calling me for?" (corresponding intent: ask purpose of call).
When it is detected that the objection-type intent does not include the resolvable intent, the current operation flow is as follows: the computer device 2 matches, from the plurality of reply-type voices and according to the operation priority, the reply-type voice corresponding to the voice intent of the current operation priority, and plays that reply-type voice.
In an exemplary embodiment, the step S204 may further include steps S300 to S304:
step S300, when the operation flow with the highest current operation priority is the operation flow of the unresolvable intent, and the plurality of target intents include a plurality of unresolvable intents arranged in a preset order, executing the operation flow of the last of the unresolvable intents, where the preset order includes the order of positions, in the text to be recognized, of the target texts corresponding to the unresolvable intents;
step S302, when the operation flow with the highest current operation priority is the operation flow of the backbone-type intent, and the target intents include a plurality of backbone-type intents arranged in the preset order, executing the operation flow of the last of the backbone-type intents, where the preset order further includes the order of positions, in the text to be recognized, of the target texts corresponding to the backbone-type intents; and
step S304, when the operation flow with the highest current operation priority is the operation flow of the no-resolution-needed intent, and the target intents include a plurality of no-resolution-needed intents arranged in the preset order, executing the operation flow of the last of the no-resolution-needed intents, where the preset order further includes the order of positions, in the text to be recognized, of the target texts corresponding to the no-resolution-needed intents.
In this embodiment, voice reply efficiency is improved by executing only the operation flow of the last intent among a plurality of intents of the same type.
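Selecting the last intent of a given type by its position in the text is a small lookup. The sketch below is illustrative: the tuple layout, the function name, and the sample labels ("renew policy", "driving") are hypothetical and not taken from the source.

```python
def last_intent_of_type(intents, intent_type):
    """Return the intent of `intent_type` whose target text appears last.

    `intents` is an iterable of (position, intent_type, label) tuples, where
    `position` is the index of the target text in the text to be recognized.
    Returns None when no intent of that type is present.
    """
    same_type = [item for item in intents if item[1] == intent_type]
    if not same_type:
        return None
    return max(same_type, key=lambda item: item[0])


intents = [
    (0, "unresolvable", "in a meeting"),
    (1, "backbone", "renew policy"),
    (2, "unresolvable", "driving"),
]
chosen = last_intent_of_type(intents, "unresolvable")
```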
Step S108, determining one or more operation flows based on the target operation strategy, and executing a response operation based on the one or more operation flows.
In an exemplary embodiment, the response operation includes a play operation and a subsequent operation, wherein each operation flow of the intention type corresponds to one play operation, and each play operation corresponds to one subsequent operation; the step S108 may further include a step S400 to a step S402, where: step S400, executing corresponding playing operation according to each operation flow; and step S402, determining and executing the subsequent operation corresponding to the last playing operation according to the last playing operation in the playing operations. According to the embodiment, the voice reply efficiency is improved by executing the operation flow of the last intention in the plurality of intentions of the same type.
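Steps S400 and S402 amount to playing something for every selected flow and then running the follow-up of the last play operation only. A minimal sketch, assuming each flow is identified by a hypothetical string name and the actions are simply recorded as tuples rather than executed:

```python
def build_response(flows):
    """Return the ordered actions for the selected operation flows.

    Every flow contributes one play operation (S400); only the last play
    operation contributes a subsequent operation (S402).
    """
    actions = [("play", flow) for flow in flows]
    if flows:
        actions.append(("follow_up", flows[-1]))
    return actions


actions = build_response(["resolvable", "unresolvable"])
```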
In this embodiment, a long sentence (the text to be recognized) is broken into sentences, and intention recognition is performed on the resulting target short sentences. Recognizing intentions on short sentences avoids several problems of long sentences: an intent library for long sentences is hard to maintain, and a long sentence with many characters, or with intersecting, contradictory or inverted intentions, is hard to recognize or easily confused. Intention recognition efficiency and accuracy are thereby improved. This embodiment may further adjust the operation flow of each target intention through the operation priority corresponding to that intention (for example, answer-type or reply-type voices provide targeted and/or diversified replies to each type of intention, and the operation priority configuration provides diversified control of the operation flows), so that the voice reply system is more human-like and the user experience of the target user is improved.
In order to make the embodiment clearer, the embodiment further provides a specific example table of the target operation policy, as shown in table 1:
[Table 1: example target operation strategies (table image not reproduced in this text)]
In Table 1, "end" means that the plurality of target intents corresponding to the text to be recognized include one or more ending-type intents; "backbone" means that they include one or more backbone-type intents; "objection" means that they include one or more objection-type intents; "none" means that they include one or more unresolvable intents; the resolvable entry means that they include one or more resolvable intents; and "not" means that they include one or more no-resolution-needed intents.
In the play operations of Table 1, play one is: playing, in sequence, the answer-type voices corresponding to the resolvable intents; play two is: playing the reply-type voice corresponding to the last unresolvable intent; play three is: playing the reply-type voice corresponding to the last no-resolution-needed intent; play four is: playing the reply-type voice corresponding to the last backbone-type intent; and play five is: playing the reply-type voice corresponding to the ending-type intent and ending the call.
In the subsequent operations of Table 1, operation one is the subsequent operation corresponding to play one; operation two is the subsequent operation corresponding to play two; operation three is the subsequent operation corresponding to play three; and operation four is the subsequent operation corresponding to play four.
For convenience of understanding, the present embodiment also provides a specific example:
The text to be recognized is: "who are you what are you calling me for I am in a meeting right now it is not convenient to talk".
The plurality of target texts corresponding to the text to be recognized: who are you / what are you calling me for / I am in a meeting right now / it is not convenient to talk ("/" marks a sentence break).
The intentions corresponding to the respective target texts: who are you (ask for identity) / what are you calling me for (ask for the purpose of the call) / I am in a meeting right now (in a meeting) / it is not convenient to talk (no time).
Resolvable-type intentions: ask for identity / ask for the purpose of the call.
Unresolvable-type intention: in a meeting.
Unresolved-type intention: no time.
Answer voice: "I am from company XX, and I am calling about XXXX. If you are in a meeting now, I will contact you later."
The target operation strategy corresponding to the text to be recognized is as follows: the computer device 2 first matches, from the plurality of answer-type voices, the two answer-type voices corresponding to "ask for identity" and "ask for the purpose of the call", then matches, from the plurality of reply-type voices, the reply-type voice corresponding to "in a meeting", and plays the two answer-type voices followed by the reply-type voice.
The voice played by the play operation is: "I am from company XX, and I am calling about XXXX. If you are in a meeting now, I will contact you later."
Subsequent operation: wait for the target user to reply again.
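The first half of this example (target texts, intentions, and intention types) can be traced in a few lines of code. The following is only a minimal sketch under stated assumptions: the dictionary `INTENT_LOOKUP` is a hypothetical stand-in for the pre-trained semantic recognition model, the English phrases are illustrative renderings of the example sentences, and the sentence breaks are taken as given (marked with "/") rather than computed.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical stand-in for the semantic recognition model:
# maps each target text to (intention, intention type).
INTENT_LOOKUP = {
    "who are you": ("ask for identity", "resolvable"),
    "what are you calling me for": ("ask for the purpose of the call", "resolvable"),
    "I am in a meeting right now": ("in a meeting", "unresolvable"),
    "it is not convenient to talk": ("no time", "unresolved"),
}


def split_target_texts(broken_text: str) -> List[str]:
    """Split a text whose sentence breaks are already marked with '/'."""
    return [part.strip() for part in broken_text.split("/") if part.strip()]


def group_intents_by_type(target_texts: List[str]) -> Dict[str, List[str]]:
    """Look up one intention per target text and group them by type, in text order."""
    grouped = defaultdict(list)
    for text in target_texts:
        intent, intent_type = INTENT_LOOKUP[text]
        grouped[intent_type].append(intent)
    return dict(grouped)


broken = (
    "who are you / what are you calling me for / "
    "I am in a meeting right now / it is not convenient to talk"
)
target_texts = split_target_texts(broken)
grouped = group_intents_by_type(target_texts)
print(grouped)
```

The grouping reproduces the three lists given in the example: two resolvable-type intentions, one unresolvable-type intention, and one unresolved-type intention.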
Illustratively, the semantic recognition-based multi-intent processing method further comprises: uploading the target operation strategy to a blockchain.
For example, uploading the target operation strategy to the blockchain can ensure its security as well as fairness and transparency. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, in which each data block contains information on a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
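The idea of data blocks linked by cryptographic methods can be illustrated with a hash-linked list. This is a sketch only, assuming SHA-256 and a JSON payload; it has no network, no consensus mechanism, and no transactions, and the function names `make_block` and `verify_chain` are hypothetical rather than taken from this embodiment.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_digest(payload: Dict[str, Any], prev_hash: str) -> str:
    """Hash the payload together with the previous block's hash."""
    body = json.dumps({"payload": payload, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def make_block(payload: Dict[str, Any], prev_hash: str) -> Dict[str, Any]:
    """Create a block that records a payload and links to its predecessor."""
    return {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": block_digest(payload, prev_hash),
    }


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Check that every block's hash and back-link are consistent."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        expected = block_digest(block["payload"], block["prev_hash"])
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


first = make_block({"policy": ["resolvable", "unresolvable"]}, GENESIS_PREV_HASH)
second = make_block({"policy": ["ending"]}, first["hash"])
chain = [first, second]

# Altering a stored policy invalidates the recorded hash.
tampered = [dict(block) for block in chain]
tampered[0]["payload"] = {"policy": ["backbone"]}

print(verify_chain(chain), verify_chain(tampered))
```

Because each block's hash covers the previous hash, changing a stored target operation strategy is detectable when the chain is re-verified.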
Example two
FIG. 2 is a schematic diagram of the program modules of a semantic recognition-based multi-intent processing system according to a second embodiment of the present invention. The semantic recognition-based multi-intent processing system 20 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to implement the present invention and the semantic recognition-based multi-intent processing method described above. The program modules referred to in the embodiments of the present invention are series of computer program instruction segments capable of performing specific functions, and are better suited than the program itself to describing the execution of the semantic recognition-based multi-intent processing system 20 in the storage medium. The functions of the program modules of this embodiment are described below:
A receiving module 200, configured to receive a voice to be recognized and perform a transcription operation on the voice to be recognized to obtain a text to be recognized.
A splitting module 202, configured to split the text to be recognized into multiple target texts.
The identifying module 204 is configured to identify a target intention corresponding to each target text through a pre-trained semantic identification model, so as to obtain a plurality of target intents.
A determining module 206, configured to obtain an intention type of each of the target intentions, and determine a target operation policy according to the intention type of each of the target intentions.
Illustratively, the intent type includes at least one of an ending-type intent, a backbone-type intent, and an objection-type intent, wherein the objection-type intent includes at least one of a resolvable-type intent, an unresolvable-type intent, and an unresolved-type intent; the determining module 206 is further configured to: determine the target operation strategy according to the intent type of each target intention and the pre-configured operation priority; the operation priority is the priority execution right of the operation flow of each target intention, and the order of intent types corresponding to the operation priority is: ending-type intent, unresolvable-type intent, backbone-type intent, resolvable-type intent, unresolved-type intent.
Illustratively, the determining module 206 is further configured to: when the ending type intention exists in the intention types of the target intentions, the target operation strategy is as follows: executing the operation flow of the ending type intention.
Illustratively, the determining module 206 is further configured to: when no ending-type intent exists among the intent types of the target intentions, judge whether an objection-type intent exists; if an objection-type intent exists, detect whether the objection-type intent includes a resolvable-type intent; if the objection-type intent includes a resolvable-type intent, the target operation strategy is: preferentially execute the operation flow of the resolvable-type intent, and then execute one of the operation flow of the unresolvable-type intent, the operation flow of the backbone-type intent, and the operation flow of the unresolved-type intent according to the operation priority; and if the objection-type intent does not include a resolvable-type intent, the target operation strategy is: execute one of the operation flow of the unresolvable-type intent, the operation flow of the backbone-type intent, and the operation flow of the unresolved-type intent according to the operation priority.
Illustratively, the determining module 206 is further configured to: when the operation flow with the highest current operation priority is the operation flow of the unresolvable-type intent, and the target intentions include a plurality of unresolvable-type intents arranged in a preset order, execute the operation flow of the last unresolvable-type intent among them, wherein the preset order includes the position order, in the text to be recognized, of the target texts corresponding to the unresolvable-type intents; when the operation flow with the highest current operation priority is the operation flow of the backbone-type intent, and the target intentions include a plurality of backbone-type intents arranged in the preset order, execute the operation flow of the last backbone-type intent among them, wherein the preset order further includes the position order, in the text to be recognized, of the target texts corresponding to the backbone-type intents; and when the operation flow with the highest current operation priority is the operation flow of the unresolved-type intent, and the target intentions include a plurality of unresolved-type intents arranged in the preset order, execute the operation flow of the last unresolved-type intent among them, wherein the preset order further includes the position order, in the text to be recognized, of the target texts corresponding to the unresolved-type intents.
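The branching just described can be sketched as a small function. This is an illustrative reading, not the patented implementation: the type names are plain strings chosen for the sketch, `FALLBACK_PRIORITY` assumes that the unresolvable-type flow outranks the backbone-type flow, which outranks the unresolved-type flow, and the case in which no objection-type intent exists is assumed to fall through to the same priority selection.

```python
from typing import List

# Assumed priority among the flows that follow (or replace) the resolvable flow,
# highest first.
FALLBACK_PRIORITY = ["unresolvable", "backbone", "unresolved"]


def determine_policy(intent_types: List[str]) -> List[str]:
    """Return, in execution order, the intent types whose operation flows run."""
    present = set(intent_types)

    # An ending-type intent overrides everything else.
    if "ending" in present:
        return ["ending"]

    policy: List[str] = []

    # Resolvable-type intents are handled first when present.
    if "resolvable" in present:
        policy.append("resolvable")

    # Then exactly one of the remaining flows, chosen by priority.
    for intent_type in FALLBACK_PRIORITY:
        if intent_type in present:
            policy.append(intent_type)
            break

    return policy


example = determine_policy(["resolvable", "resolvable", "unresolvable", "unresolved"])
print(example)
```

For the intention types of the call example in the first embodiment (two resolvable, one unresolvable, one unresolved), the sketch selects the resolvable-type flow followed by the unresolvable-type flow, which matches the answer voices and the "in a meeting" reply played there.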
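Choosing the last intention of a given type by position reduces to a maximum over position indices. The sketch below assumes each target intention carries the index of its target text within the text to be recognized; the class `TargetIntent`, the function name, and the "driving" intention are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TargetIntent:
    name: str
    intent_type: str
    position: int  # index of the target text within the text to be recognized


def last_intent_of_type(
    intents: List[TargetIntent], intent_type: str
) -> Optional[TargetIntent]:
    """Return the intention of the given type whose target text appears last."""
    candidates = [intent for intent in intents if intent.intent_type == intent_type]
    if not candidates:
        return None
    return max(candidates, key=lambda intent: intent.position)


intents = [
    TargetIntent("ask for identity", "resolvable", 0),
    TargetIntent("in a meeting", "unresolvable", 1),
    TargetIntent("no time", "unresolved", 2),
    TargetIntent("driving", "unresolvable", 3),
]
chosen = last_intent_of_type(intents, "unresolvable")
print(chosen)
```

Only the operation flow of the selected intention would be executed; earlier intentions of the same type are skipped.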
An execution module 208 configured to determine one or more operation flows based on the target operation policy, and execute a response operation based on the one or more operation flows.
Illustratively, the response operation comprises a play operation and a subsequent operation, wherein each operation flow of the intention type corresponds to one play operation, and each play operation corresponds to one subsequent operation; the executing module 208 is further configured to: executing corresponding playing operation according to each operation flow; and determining and executing subsequent operations corresponding to the last playing operation according to the last playing operation in the playing operations.
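The relationship between play operations and subsequent operations can be sketched as follows: every selected flow is played in order, and only the subsequent operation attached to the last play operation runs. The pair-of-callables representation and the log strings are assumptions made for this illustration.

```python
from typing import Callable, List, Tuple

# Each operation flow is modeled as (play operation, subsequent operation).
OperationFlow = Tuple[Callable[[], str], Callable[[], str]]


def execute_response(flows: List[OperationFlow]) -> List[str]:
    """Run every play operation, then the subsequent operation of the last one."""
    log: List[str] = []
    for play, _unused_subsequent in flows:
        log.append(play())
    if flows:
        _last_play, last_subsequent = flows[-1]
        log.append(last_subsequent())
    return log


answer_flow: OperationFlow = (
    lambda: "play: answer voices for resolvable intents",
    lambda: "subsequent: continue the main dialogue",
)
reply_flow: OperationFlow = (
    lambda: "play: reply voice for 'in a meeting'",
    lambda: "subsequent: wait for the target user to reply again",
)

log = execute_response([answer_flow, reply_flow])
print(log)
```

In this run the subsequent operation of the answer flow is never executed, because the reply flow supplies the last play operation.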
Illustratively, the semantic recognition based multi-intent processing system 20 further comprises an upload module for uploading the target operation policy into a blockchain.
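The division into program modules can be pictured as a pipeline of five stages, one per module 200 to 208. The sketch below is one possible arrangement under stated assumptions: each module is reduced to a plain callable, the class name `MultiIntentPipeline` is hypothetical, and the stub stages at the bottom are placeholders rather than real transcription or recognition components.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MultiIntentPipeline:
    transcribe: Callable[[bytes], str]            # receiving module 200
    split: Callable[[str], List[str]]             # splitting module 202
    recognize: Callable[[str], str]               # recognition module 204
    determine: Callable[[List[str]], List[str]]   # determining module 206
    execute: Callable[[List[str]], List[str]]     # execution module 208

    def handle(self, speech: bytes) -> List[str]:
        """Pass one utterance through all five stages in order."""
        text = self.transcribe(speech)
        target_texts = self.split(text)
        intent_types = [self.recognize(t) for t in target_texts]
        policy = self.determine(intent_types)
        return self.execute(policy)


# Placeholder stages, for illustration only.
STUB_TYPES = {"who are you": "resolvable", "goodbye": "ending"}

pipeline = MultiIntentPipeline(
    transcribe=lambda speech: speech.decode("utf-8"),
    split=lambda text: [part.strip() for part in text.split("/")],
    recognize=lambda target_text: STUB_TYPES.get(target_text, "backbone"),
    determine=lambda types: ["ending"] if "ending" in types else types,
    execute=lambda policy: ["play " + intent_type for intent_type in policy],
)

result = pipeline.handle(b"who are you / goodbye")
print(result)
```

Keeping each stage behind a callable makes it possible to replace a stub with a real component without touching the others.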
Example three
Fig. 3 is a schematic diagram of a hardware architecture of a computer device according to a third embodiment of the present invention. In the present embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing in accordance with instructions set or stored in advance. The computer device 2 may be a rack server, a blade server, a tower server or a cabinet server (including an independent server or a server cluster composed of a plurality of servers), and the like. As shown in Fig. 3, the computer device 2 includes, but is not limited to, at least a memory 21, a processor 22, a network interface 23, and a semantic recognition-based multi-intent processing system 20 communicatively coupled to each other via a system bus.
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like provided on the computer device 2. Of course, the memory 21 may also comprise both an internal storage unit and an external storage device of the computer device 2. In this embodiment, the memory 21 is generally used for storing an operating system and various types of application software installed on the computer device 2, such as the program code of the semantic recognition-based multi-intent processing system 20 of the second embodiment. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to execute the program code stored in the memory 21 or process data, for example, execute the multi-intent processing system 20 based on semantic recognition, so as to implement the multi-intent processing method based on semantic recognition according to the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is typically used for establishing a communication connection between the computer device 2 and other electronic apparatuses. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, and the like.
It is noted that Fig. 3 only shows the computer device 2 with components 20-23, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
In this embodiment, the semantic recognition based multi-intent processing system 20 stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 22) to accomplish the present invention.
For example, Fig. 2 is a schematic diagram of the program modules implementing the semantic recognition-based multi-intent processing system 20 according to the second embodiment of the present invention, in which the semantic recognition-based multi-intent processing system 20 can be divided into a receiving module 200, a splitting module 202, a recognition module 204, a determining module 206 and an execution module 208. The program modules referred to herein are series of computer program instruction segments capable of performing specific functions, and are better suited than the program itself to describing the execution of the semantic recognition-based multi-intent processing system 20 in the computer device 2. The specific functions of the program modules 200 to 208 have been described in detail in the second embodiment and are not repeated here.
Example four
The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an app store, etc., on which a computer program is stored, which when executed by a processor implements corresponding functions. The computer-readable storage medium of this embodiment is used for storing the semantic recognition-based multi-intent processing system 20, which, when executed by a processor, implements the semantic recognition-based multi-intent processing method of the first embodiment.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A multi-intent processing method based on semantic recognition, the method comprising:
receiving a voice to be recognized, and performing transcription operation on the voice to be recognized to obtain a text to be recognized;
splitting the text to be recognized into a plurality of target texts;
identifying a target intention corresponding to each target text through a pre-trained semantic identification model to obtain a plurality of target intents;
acquiring the intention type of each target intention, and determining a target operation strategy according to the intention type of each target intention; and
determining one or more operational flows based on the target operational policy to perform responsive operations based on the one or more operational flows.
2. The multi-intent processing method based on semantic recognition according to claim 1, wherein the intent types comprise at least one of an ending-type intent, a backbone-type intent, and an objecting-type intent, wherein the objecting-type intent comprises at least one of a resolvable-type intent, an unresolvable-type intent, and an unresolved-type intent;
the step of determining a target operation strategy according to the intention type of each target intention comprises:
determining the target operation strategy according to the intention type and the pre-configured operation priority of each target intention;
the operation priority is the execution order corresponding to the operation flows of the target intentions, and the execution order of the operation flows is, in sequence: the operation flow of the ending-type intent, the operation flow of the unresolvable-type intent, the operation flow of the backbone-type intent, the operation flow of the resolvable-type intent, and the operation flow of the unresolved-type intent.
3. The multi-intent processing method based on semantic recognition according to claim 2, wherein the step of determining a target operation strategy according to the intent type of each of the target intents comprises:
when the ending type intention exists in the intention types of the target intentions, determining the operation flow for executing the ending type intention as the target operation strategy.
4. The multi-intent processing method based on semantic recognition according to claim 2, wherein the step of determining a target operation strategy according to the intent type of each of the target intents comprises:
when the ending type intention does not exist in the intention types of the target intentions, judging whether the objection type intention exists or not;
if the objection-type intent is present, detecting whether the objection-type intent includes the resolvable intent;
if the objection-type intent includes the resolvable-type intent, determining as the target operation strategy: preferentially executing the operation flow of the resolvable-type intent, and then executing, according to the operation priority, whichever of the operation flow of the unresolvable-type intent, the operation flow of the backbone-type intent, and the operation flow of the unresolved-type intent has the highest current operation priority; and
if the objection-type intent does not include the resolvable-type intent, determining as the target operation strategy whichever of the operation flow of the unresolvable-type intent, the operation flow of the backbone-type intent, and the operation flow of the unresolved-type intent has the highest current operation priority.
5. The multi-intent processing method based on semantic recognition according to claim 4, wherein the step of executing, according to the operation priority, whichever of the operation flow of the unresolvable-type intent, the operation flow of the backbone-type intent, and the operation flow of the unresolved-type intent has the highest current operation priority comprises:
when the operation flow with the highest current operation priority is the operation flow of the unresolvable-type intent, and the target intentions include a plurality of unresolvable-type intents arranged in a preset order, executing the operation flow of the last unresolvable-type intent among them, wherein the preset order includes the position order, in the text to be recognized, of the target texts corresponding to the unresolvable-type intents;
when the operation flow with the highest current operation priority is the operation flow of the backbone-type intent, and the target intentions include a plurality of backbone-type intents arranged in the preset order, executing the operation flow of the last backbone-type intent among them, wherein the preset order further includes the position order, in the text to be recognized, of the target texts corresponding to the backbone-type intents; and
when the operation flow with the highest current operation priority is the operation flow of the unresolved-type intent, and the target intentions include a plurality of unresolved-type intents arranged in the preset order, executing the operation flow of the last unresolved-type intent among them, wherein the preset order further includes the position order, in the text to be recognized, of the target texts corresponding to the unresolved-type intents.
6. The multi-intent processing method based on semantic recognition according to any one of claims 1 to 5, wherein the response operation comprises a play operation and a subsequent operation, wherein the operation flow of each intent type corresponds to one play operation, and each play operation corresponds to one subsequent operation;
the step of determining one or more operation flows based on the target operation policy to perform responsive operations based on the one or more operation flows comprises:
executing corresponding playing operation according to each operation flow; and
determining and executing the subsequent operation corresponding to the last play operation among the play operations.
7. The multi-intent processing method based on semantic recognition according to any one of claims 1 to 6, further comprising: uploading the target operation strategy to a blockchain.
8. A semantic recognition based multi-intent processing system, comprising:
the receiving module is used for receiving the voice to be recognized and performing transcription operation on the voice to be recognized so as to obtain the text to be recognized;
the splitting module is used for splitting the text to be recognized into a plurality of target texts;
the recognition module is used for recognizing the target intention corresponding to each target text through a pre-trained semantic recognition model so as to obtain a plurality of target intents;
the determining module is used for acquiring the intention type of each target intention and determining a target operation strategy according to the intention type of each target intention; and
an execution module to determine one or more operational flows based on the target operational policy to execute responsive operations based on the one or more operational flows.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when executed by the processor, carries out the steps of the semantic recognition based multi-intent processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which is executable by at least one processor to cause the at least one processor to perform the steps of the semantic recognition based multi-intent processing method according to any one of claims 1 to 7.
CN202110435537.0A 2021-04-22 2021-04-22 Semantic recognition-based multi-intention processing method, system, equipment and storage medium Active CN113158692B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110435537.0A CN113158692B (en) 2021-04-22 2021-04-22 Semantic recognition-based multi-intention processing method, system, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113158692A true CN113158692A (en) 2021-07-23
CN113158692B CN113158692B (en) 2023-09-12

Family

ID=76869466


Country Status (1)

Country Link
CN (1) CN113158692B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant