CN115099242A - Intention recognition method, system, computer and readable storage medium - Google Patents


Info

Publication number
CN115099242A
Authority
CN
China
Prior art keywords
intention, vector, actual, vectors, characters
Prior art date
Legal status: Granted
Application number
CN202211040262.1A
Other languages
Chinese (zh)
Other versions
CN115099242B (en
Inventor
罗序俊
陶俊
张琳
朱嘉欣
尧德鹏
Current Assignee
Jiangxi Telecom Information Industry Co ltd
Original Assignee
Jiangxi Telecom Information Industry Co ltd
Priority date
Filing date
Publication date
Application filed by Jiangxi Telecom Information Industry Co ltd filed Critical Jiangxi Telecom Information Industry Co ltd
Priority to CN202211040262.1A
Publication of CN115099242A
Application granted
Publication of CN115099242B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue


Abstract

The invention provides an intention recognition method, system, computer and readable storage medium. The method comprises: preprocessing a voice instruction to split it into a plurality of characters and obtain the corresponding initial word vectors; acquiring positive and negative samples; performing word-cutting processing and acquiring the corresponding target word vectors; acquiring the position vector corresponding to each character, and merging the target word vector and position vector of each character as input to a first preset model to obtain a high-dimensional text semantic vector; acquiring a text intention vector from the high-dimensional text semantic vector, and multiplying the text intention vector by the intention vector to obtain the corresponding scores; and acquiring an intention distance vector based on the scores, and inputting the intention distance vector into a BPR loss function for training to obtain an intention vector containing intention information. By this method, an intention appearing in a user's utterance can be accurately recognized, so that the machine customer service can generate a reply that satisfies the user.

Description

Intention recognition method, system, computer and readable storage medium
Technical Field
The invention relates to the technical field of big data, in particular to an intention identification method, an intention identification system, a computer and a readable storage medium.
Background
With the rise of AI, intelligent outbound systems have emerged. Intelligent outbound calling is a technology that combines Automatic Speech Recognition (ASR), Text-To-Speech (TTS) and Natural Language Understanding (NLU), forming a customer-facing intelligent customer-service robot product. An intelligent outbound robot can automatically initiate outbound telephone call tasks according to a service scenario, collect service results through voice conversation between the user and the robot, and perform statistical processing on the data.
Understanding the user's true intent is the most important link in an intelligent outbound system: only after accurately understanding the user's intent can the system generate a response that satisfies the user. In the prior art, most algorithm models are built on a framework of a text feature extractor plus a cross-entropy classifier and then trained. However, when a user's utterance contains an intention that was absent from the training data, i.e., a new intention, such an algorithm model cannot recognize it, so the subsequent machine customer service cannot generate a satisfactory reply, which greatly degrades the user's experience with the machine customer service.
Disclosure of Invention
Based on this, the present invention provides an intention recognition method, system, computer and readable storage medium, so as to solve the problem that algorithm models in the prior art cannot effectively recognize new intentions appearing in user utterances, leaving the subsequent machine customer service unable to generate a reply that satisfies the user.
The first aspect of the embodiments of the present invention provides an intention identification method, where the method includes:
when a voice instruction sent by a user is received, preprocessing the voice instruction to split a plurality of characters carried by the voice instruction, and acquiring initial character vectors corresponding to the characters respectively;
acquiring positive samples and negative samples corresponding to the intentions of a plurality of preset categories respectively, and initializing intention vectors;
performing word cutting processing on the positive sample and the negative sample, and indexing the initial word vector to obtain target word vectors corresponding to all characters in the positive sample and the negative sample respectively;
acquiring position vectors corresponding to the characters in the positive sample and the negative sample based on the positions of the target word vectors in the positive sample and the negative sample, and merging and inputting the target word vectors and the position vectors corresponding to the characters into a first preset model to acquire corresponding high-dimensional text semantic vectors;
acquiring a corresponding text intention vector according to the high-dimensional text semantic vector, and multiplying the text intention vector and the intention vector to acquire scores corresponding to the positive sample and the negative sample respectively;
and acquiring a corresponding intention distance vector based on the scores of the positive sample and the negative sample, and inputting the intention distance vector into a preset BPR loss function for training to acquire an intention vector containing intention information.
The invention has the beneficial effects that: a voice instruction sent by a user is preprocessed and split into a plurality of corresponding characters, and the initial word vectors corresponding to those characters are obtained; the acquired positive and negative samples are word-cut, and the initial word vectors are looked up to obtain the corresponding target word vectors. Further, the position vector corresponding to each character is obtained; the target word vector and position vector of each character are merged and input into a first preset model to obtain the corresponding high-dimensional text semantic vector; a corresponding text intention vector is obtained from the high-dimensional text semantic vector; an intention distance vector is obtained from the high-dimensional text semantic vector and the text intention vector; and the intention distance vector is input into a preset BPR loss function for training, finally yielding an intention vector containing intention information. Through this method, an intention appearing in a user's utterance can be accurately recognized, so that the machine customer service can generate a reply that satisfies the user, greatly improving the user experience.
Preferably, when a voice instruction sent by a user is received, the step of preprocessing the voice instruction to split a plurality of characters carried by the voice instruction includes:
when a voice instruction sent by the user is received, converting the voice instruction into a corresponding text through ASR, and labeling an intention label on the text;
and performing word cutting processing on the text to obtain a plurality of corresponding characters.
Preferably, the step of acquiring initial word vectors corresponding to the plurality of characters respectively includes:
performing one-hot encoding on the plurality of characters one by one to obtain the one-hot code x_k corresponding to each character, and respectively acquiring the one-hot codes y_ij corresponding to the preceding and following adjacent characters of each character;
inputting the one-hot code x_k and the one-hot codes y_ij into a preset Word2Vector algorithm model to respectively obtain the corresponding initial word vectors.
Preferably, the step of merging and inputting the target word vector and the position vector corresponding to each word into a first preset model to obtain a corresponding high-dimensional text semantic vector includes:
adding the target word vector and the position vector corresponding to each character, and simultaneously inputting the target word vector and the position vector into a preset bert-base-Chinese model;
and performing feature extraction on the target word vector and the position vector in the bert-base-Chinese model to obtain a corresponding high-dimensional text semantic vector.
Preferably, after the step of inputting the intention distance vector into a preset BPR loss function for training to obtain the intention vector containing intention information, the method further includes:
when an actual positive sample or an actual negative sample is obtained, merging the actual target word vector and the actual position vector corresponding to each actual character in the actual positive sample or the actual negative sample, and inputting them into the bert-base-Chinese model to obtain the corresponding actual high-dimensional text semantic vector;
multiplying the actual high-dimensional text semantic vector by the intention vector containing intention information to obtain an actual score corresponding to the actual positive sample or the actual negative sample, and judging the actual score;
if the actual score is high, determining that the actual positive sample or the actual negative sample does not contain a new intention;
and if the actual score is low, determining that the actual positive sample or the actual negative sample contains a new intention.
A second aspect of an embodiment of the present invention provides an intention identifying system, where the system includes:
the splitting module is used for preprocessing a voice instruction when the voice instruction sent by a user is received so as to split a plurality of characters carried by the voice instruction and obtain initial character vectors corresponding to the characters respectively;
the first acquisition module is used for acquiring positive samples and negative samples corresponding to the intents of a plurality of preset categories respectively and initializing intention vectors;
the second obtaining module is configured to perform word cutting processing on the positive sample and the negative sample, and index the initial word vector to obtain target word vectors corresponding to characters in the positive sample and the negative sample, respectively;
the first processing module is used for acquiring position vectors corresponding to the characters in the positive sample and the negative sample based on the positions of the target word vectors in the positive sample and the negative sample, and merging and inputting the target word vectors and the position vectors corresponding to the characters into a first preset model to acquire corresponding high-dimensional text semantic vectors;
the second processing module is used for acquiring a corresponding text intention vector according to the high-dimensional text semantic vector, and multiplying the text intention vector by the intention vector to acquire scores corresponding to the positive sample and the negative sample respectively;
and the training module is used for acquiring a corresponding intention distance vector based on the scores of the positive sample and the negative sample, and inputting the intention distance vector into a preset BPR loss function for training so as to acquire an intention vector containing intention information.
In the intention identification system, the splitting module is specifically configured to:
when a voice instruction sent by the user is received, converting the voice instruction into a corresponding text through ASR, and labeling an intention label on the text;
and performing word cutting processing on the text to obtain a plurality of corresponding characters.
In the intention identifying system, the splitting module is further specifically configured to:
performing one-hot encoding on the plurality of characters one by one to obtain the one-hot code x_k corresponding to each character, and respectively acquiring the one-hot codes y_ij corresponding to the preceding and following adjacent characters of each character;
inputting the one-hot code x_k and the one-hot codes y_ij into a preset Word2Vector algorithm model to respectively obtain the corresponding initial word vectors.
In the intention identification system, the first processing module is specifically configured to:
adding the target word vector and the position vector corresponding to each character, and simultaneously inputting the target word vector and the position vector into a preset bert-base-Chinese model;
and performing feature extraction on the target word vector and the position vector in the bert-base-Chinese model to obtain a corresponding high-dimensional text semantic vector.
In the intention recognition system, the intention recognition system further includes a scoring module, and the scoring module is specifically configured to:
when an actual positive sample or an actual negative sample is obtained, merging the actual target word vector and the actual position vector corresponding to each actual character in the actual positive sample or the actual negative sample, and inputting them into the bert-base-Chinese model to obtain the corresponding actual high-dimensional text semantic vector;
multiplying the actual high-dimensional text semantic vector by the intention vector containing intention information to obtain an actual score corresponding to the actual positive sample or the actual negative sample, and judging the actual score;
if the actual score is high, determining that the actual positive sample or the actual negative sample does not contain a new intention;
and if the actual score is low, determining that the actual positive sample or the actual negative sample contains a new intention.
A third aspect of the embodiments of the present invention provides a computer, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the intent recognition method as described above when executing the computer program.
A fourth aspect of the embodiments of the present invention proposes a readable storage medium on which a computer program is stored, which when executed by a processor, implements the intent recognition method as described above.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flow chart of an intention recognition method according to a first embodiment of the invention;
fig. 2 is a block diagram of an intention recognition system according to a second embodiment of the present invention.
The following detailed description will further illustrate the invention in conjunction with the above-described figures.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully hereinafter with reference to the accompanying drawings. Several embodiments of the invention are presented in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, an intention recognition method according to the first embodiment of the present invention is shown. The method can accurately recognize an intention appearing in a user's utterance, so that the machine customer service can generate a reply that satisfies the user, thereby greatly improving the user experience.
Specifically, the intention identifying method provided in this embodiment specifically includes the following steps:
step S10, when a voice instruction sent by a user is received, preprocessing the voice instruction to split a plurality of characters carried by the voice instruction, and acquiring initial character vectors corresponding to the characters respectively;
specifically, in this embodiment, it should be noted that the intention identifying method provided in this embodiment is specifically applied to a device installed with an intelligent outbound system, for example, an intelligent device such as an intelligent sound box and a machine service. The intelligent equipment can execute corresponding actions by receiving voice instructions sent by the user or reply required contents so as to facilitate the life of the user.
Therefore, in the embodiment, in order to accurately recognize the intention in the voice command issued by the user, that is, to accurately recognize the purpose corresponding to the voice command issued by the user, the embodiment trains various intentions in advance based on the BPR loss function, and receives and executes the voice command issued by the user through the trained algorithm, so as to improve the user experience of the user.
Therefore, in this step, it should be noted that, when the intention identification method provided in this embodiment receives a voice instruction sent by a user, the step immediately preprocesses the currently received voice instruction so as to correspondingly split a plurality of characters carried by the currently received voice instruction.
Specifically, in this step, it should be noted that, when a voice instruction sent by a user is received, the step of preprocessing the voice instruction to split a plurality of characters carried by the voice instruction includes:
when a voice instruction sent by a user is received, the currently received voice instruction is converted into a corresponding text through an ASR (automatic speech recognition module), and an intention label is marked on the current text, wherein the text which is correspondingly converted from the current voice instruction comprises a plurality of characters;
therefore, in this step, the characters in the current text are further subjected to word cutting processing to obtain a plurality of corresponding characters one by one, that is, the sentence in the current text is split into individual characters.
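The word-cutting step described above can be sketched as follows; this is an illustrative sketch, in which the helper name and sample sentence are hypothetical and not taken from the patent:

```python
# Illustrative sketch of the word-cutting step: split an ASR transcript into
# individual characters. The helper name and sample sentence are hypothetical.
def cut_chars(text: str) -> list[str]:
    """Split a sentence into single characters, dropping whitespace."""
    return [ch for ch in text if not ch.isspace()]

chars = cut_chars("请稍等 您拨打")  # one element per character
```

For Chinese text, per-character splitting is sufficient here because the method operates on characters rather than segmented words.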
Further, in this step, after the plurality of characters are split, initial word vectors corresponding to the plurality of current characters respectively are further obtained in this step.
Specifically, in this step, it should be noted that the step of acquiring initial word vectors corresponding to the plurality of characters respectively includes:
specifically, in this step, a plurality of currently split characters are further subjected to one-hot coding one by one to obtain a one-hot code x corresponding to each character respectively k And respectively acquiring one-hot codes y corresponding to upper and lower adjacent characters of each character ij
On the basis, the step further encodes x the one-hot code k And the above one-hot code y ij All the initial Word vectors are input into a preset Word2Vector algorithm model to respectively obtain corresponding initial Word vectors.
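The one-hot encoding of each character together with its adjacent characters can be sketched as follows; this is a minimal illustration assuming a toy vocabulary built from the input itself (the subsequent Word2Vector model that consumes these codes is omitted):

```python
# Hedged sketch: build the one-hot code x_k of each character and collect the
# one-hot codes y_ij of its adjacent (preceding/following) characters, as the
# description outlines. Vocabulary and helper names are illustrative.
def one_hot(index: int, size: int) -> list[int]:
    vec = [0] * size
    vec[index] = 1
    return vec

def encode_with_context(chars: list[str]):
    """Return [(x_k, [y_ij, ...])] for each position k."""
    vocab = sorted(set(chars))
    idx = {c: i for i, c in enumerate(vocab)}
    encoded = []
    for k, ch in enumerate(chars):
        x_k = one_hot(idx[ch], len(vocab))
        # neighbours: preceding and following character, when they exist
        y = [one_hot(idx[chars[j]], len(vocab))
             for j in (k - 1, k + 1) if 0 <= j < len(chars)]
        encoded.append((x_k, y))
    return encoded
```

In the patent's pipeline these codes would then be fed into the Word2Vector model to produce the initial word vectors.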
Step S20, acquiring positive samples and negative samples corresponding to the intentions of a plurality of preset categories respectively, and initializing intention vectors;
further, in this step, it should be noted that several types of intentions are preset in this embodiment, for example: taking the scene of the arrearage payment of the user as an example, and classifying the intentions under the current scene of the arrearage payment into four categories, specifically, the intention list is as follows:
busy Busy user;
QA _ condition _ qf arrears amount;
already has paid a fee;
the Agree agrees to pay;
correspondingly, the step also obtains the positive sample and the negative sample corresponding to each intention category, and initializes the intention vector, wherein the number of the positive samples and the number of the negative samples corresponding to each intention category should be kept the same. Specifically, for a certain class K, a positive sample indicates that the intention label L belongs to the current class K, and a negative sample indicates that the intention label L does not belong to the current class K (other classes except the class K).
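The positive/negative sampling rule just described can be sketched as follows; the data, labels, and helper name are illustrative assumptions, not the patent's own dataset:

```python
# Illustrative sketch of the sampling rule: for a class K, positives are
# utterances whose intention label belongs to K, and negatives are drawn in
# equal number from the other classes. Texts and labels are made up.
import random

def build_samples(data, k, seed=0):
    """data: list of (text, label); returns equal-sized positives/negatives for class k."""
    pos = [t for t, lbl in data if lbl == k]
    neg_pool = [t for t, lbl in data if lbl != k]
    rng = random.Random(seed)
    neg = rng.sample(neg_pool, min(len(pos), len(neg_pool)))
    return pos, neg
```

Keeping the counts equal, as the description requires, avoids biasing the later BPR training toward either side.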
Step S30, performing word cutting processing on the positive sample and the negative sample, and indexing the initial word vector to obtain target word vectors corresponding to the characters in the positive sample and the negative sample, respectively;
specifically, in this step, after the positive sample and the negative sample corresponding to each intention category are obtained, the step further performs word cutting processing on the current positive sample and negative sample, and indexes out the corresponding initial word vectors at the same time, so as to further obtain target word vectors corresponding to each character in the current positive sample and negative sample.
Step S40, acquiring position vectors corresponding to the characters in the positive sample and the negative sample based on the positions of the target word vectors in the positive sample and the negative sample, and merging and inputting the target word vectors and the position vectors corresponding to the characters into a first preset model to acquire corresponding high-dimensional text semantic vectors;
further, in this step, it should be noted that after the target word vectors corresponding to the respective characters are obtained through the above steps, the step further obtains the position vectors corresponding to the respective characters in the current positive sample and the current negative sample based on the positions of the current target word vectors in the positive sample and the negative sample, and combines and inputs the target word vectors and the position vectors corresponding to the respective characters into the first preset model to obtain the corresponding high-dimensional text semantic vectors.
Specifically, in this step, it should be noted that the step of combining and inputting the target word vector and the position vector corresponding to each word into the first preset model to obtain the corresponding high-dimensional text semantic vector includes:
the step further adds the target word vector and the position vector corresponding to each character, that is, merges them, and inputs the merged result into the preset bert-base-Chinese model;
further, feature extraction is carried out on the target word vector and the position vector in the current bert-base-Chinese model to obtain a corresponding high-dimensional text semantic vector.
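The merge of word vectors and position vectors before the encoder can be sketched as follows; this is a toy-dimension illustration of the element-wise addition only (the actual encoder, bert-base-Chinese, is not invoked here):

```python
# Minimal sketch of the merge step: the target word vector and the position
# vector of each character are added element-wise before being fed to the
# encoder. Dimensions are toy-sized; real vectors would be 768-dimensional.
def merge_embeddings(word_vecs, pos_vecs):
    """Element-wise sum of per-character word vectors and position vectors."""
    assert len(word_vecs) == len(pos_vecs)
    return [[w + p for w, p in zip(wv, pv)]
            for wv, pv in zip(word_vecs, pos_vecs)]
```

This mirrors how BERT-style models combine token and position embeddings by addition rather than concatenation.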
Step S50, acquiring a corresponding text intention vector according to the high-dimensional text semantic vector, and multiplying the text intention vector by the intention vector to acquire scores corresponding to the positive sample and the negative sample respectively;
specifically, in this step, after the high-dimensional text semantic vector corresponding to each character is obtained, the step further performs fully-connected processing on the obtained high-dimensional text semantic vector to correspondingly obtain a text intention vector, and immediately multiplies the currently obtained text intention vector by the initialized intention vector to obtain the scores corresponding to the current positive sample and negative sample respectively.
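The scoring step can be sketched as follows; the dot product of the text intention vector with each class's intention vector yields one score per class (the vectors and class names shown are toy values):

```python
# Sketch of the scoring step: multiply (dot product) the text intention
# vector with each class's intention vector, giving one score per class.
# Class names and vector values are illustrative.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def score_against_intents(text_vec, intent_vecs):
    """intent_vecs: dict name -> vector; returns dict name -> score."""
    return {name: dot(text_vec, v) for name, v in intent_vecs.items()}
```

A higher score indicates that the text intention vector points in a similar direction to that class's intention vector.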
Step S60, obtaining a corresponding intention distance vector based on the scores of the positive sample and the negative sample, and inputting the intention distance vector into a preset BPR loss function for training to obtain an intention vector containing intention information.
Finally, in this step, after the scores corresponding to the positive sample and the negative sample in each intention category are respectively obtained through the above steps, this step further obtains a corresponding intention distance vector based on the scores corresponding to the current positive sample and the negative sample, inputs the current intention distance vector into a preset BPR loss function for training, and after a plurality of rounds of training, can finally obtain an intention vector containing intention information, thereby being able to effectively complete the training of the intention of the user.
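The BPR objective referred to above has the standard form L = -ln σ(ri - rj), which encourages the positive-sample score ri to exceed the negative-sample score rj; a minimal sketch follows (batching and hyperparameters are omitted, as the patent does not specify them):

```python
# Hedged sketch of the BPR loss used for training: for each class, the
# positive-sample score ri should exceed the negative-sample score rj,
# so the loss is -ln(sigmoid(ri - rj)). This is the standard BPR form.
import math

def bpr_loss(r_i: float, r_j: float) -> float:
    """Bayesian Personalized Ranking loss for one (positive, negative) pair."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_i - r_j))))
```

The loss shrinks toward 0 as ri grows past rj, so gradient descent on it pushes the intention vector to score positives above negatives.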
In addition, in this embodiment, it should be further noted that, after the step of inputting the intention distance vector into a preset BPR loss function for training to obtain the intention vector containing intention information, the method further includes:
when an actual positive sample or an actual negative sample is obtained, merging the actual target word vector and the actual position vector corresponding to each actual character in the actual positive sample or the actual negative sample, and inputting them into the bert-base-Chinese model to obtain the corresponding actual high-dimensional text semantic vector;
multiplying the actual high-dimensional text semantic vector by the intention vector containing intention information to obtain an actual score corresponding to the actual positive sample or the actual negative sample, and judging the actual score;
if the actual score is high, determining that the actual positive sample or the actual negative sample does not contain a new intention;
and if the actual score is low, determining that the actual positive sample or the actual negative sample contains a new intention.
In actual use, it should be further described that the intention recognition method provided in this embodiment is illustrated below in combination with an actual user's arrears payment scenario, taking the four categories of intention under that scenario as an example:
The intention list:
Busy: the user is busy;
QA_condition_qf: asks about the arrears amount;
Already: has already paid;
Agree: agrees to pay;
The user texts:
User 1: "please wait a moment, the user you are calling is using the call hold function, please do not hang up; please wait a moment, the user you are calling is using the call hold function, please do not hang up";
User 2: "may I ask how much money I pay each month";
User 3: "didn't I just pay 100";
User 4: "OK, I will go pay right away";
implementing a model training process;
step 1: the specific steps of labeling the text of the users 1-4 with intention labels are as follows: (ii) a
User 1: busy;
and (4) a user 2: QA _ condition _ qf;
user 3: alreagy;
and 4, the user: agree;
step 2: cutting characters of the text of the users 1-4, and then removing the characters to obtain a character table C = { 1: please, 2: slightly, 3: etc., 4: you, 5: dialing, 6: beat.
And step 3: the user text is sent to a Word2Vector feature extractor to obtain a Word Vector Wvxn of each Word in each Word table C, wherein the Word Vector is assumed to be 2 dimensions (actually 768 dimensions).
C_vector={1:[0.9234,0.023],2:[0.056,0.03],...,n:[0.346,0.7589]}。
Step 4: construct positive and negative samples for each intention. Busy: {positive sample: "Please wait a moment, the user you are calling is using the call hold function, please do not hang up; please wait a moment, the user you are calling is using the call hold function, please do not hang up", negative sample: "Didn't I just pay 100?"};
QA_condition_qf: {positive sample: "May I ask how much I pay each month", negative sample: "OK, I will go pay right away"}; the remaining categories are omitted.
Step 5: initialize four intention vectors V_label = {Vbusy, Vqa_condition, Valready, Vagree};
Step 6: perform word cutting on the positive and negative samples corresponding to Busy, then look up C_vector to find the word vector of each character and add the position vector of the character in the text, obtaining the semantic vectors Vsem (Vp_sem, Vn_sem) of the positive and negative samples under the Busy label.
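The lookup-and-add construction of the per-sample semantic vectors can be sketched as follows. The sinusoidal position encoding is an illustrative assumption, since the embodiment does not specify how the position vectors are produced; the function and argument names are also illustrative:

```python
import numpy as np

def semantic_vectors(text, c_vector, char_index, dim=2):
    """Vsem sketch: each row is word vector + position vector for one character."""
    rows = []
    for pos, ch in enumerate(text):
        word_vec = c_vector[char_index[ch]]      # lookup in C_vector
        # simple sinusoidal position vector (assumption, for illustration)
        pos_vec = np.array(
            [np.sin(pos / 10000 ** (2 * i / dim)) for i in range(dim)]
        )
        rows.append(word_vec + pos_vec)
    return np.stack(rows)                        # shape: (len(text), dim)
```

Applied to a positive and a negative sample this yields Vp_sem and Vn_sem, which are then passed to the pretrained encoder.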
Step 7: feed the Vsem into the pre-trained bert-base-Chinese model to obtain the high-dimensional semantic vectors Vful (Vp_ful, Vn_ful).
Step 8: multiply Vful by Vbusy to obtain the scores ri and rj;
Step 9: substitute (ri - rj) into the preset BPR loss function for training; through continuous model iteration, Vbusy is gradually trained, and the intention vectors of the other categories are trained in the same way.
The model prediction process is implemented as follows:
Step 1: perform word cutting on the text spoken by the user in real time, then look up C_vector to find the word vector of each character and add the position vector of the character in the text, obtaining the text semantic vector Vsem_predict.
Step 2: feed Vsem_predict into the bert-base-Chinese model fine-tuned through the above training steps to obtain the high-dimensional semantic vector Vful_predict.
Step 3: multiply Vful_predict in turn by the trained {Vbusy, Vqa_condition, Valready, Vagree} to obtain four scores for the spoken text, {busy_score, qa_condition_score, already_score, agree_score}, and take the maximum to obtain the intention of the spoken text. When a text spoken by a user belongs to none of the four categories, all of the scores {busy_score, qa_condition_score, already_score, agree_score} are close to 0, so the user's voice instruction can be considered to contain a new intention.
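The scoring-and-thresholding logic of this prediction step can be sketched as follows. The 0.5 cutoff for flagging a new intention and the function name are illustrative assumptions; the embodiment only states that all scores are close to 0 for an unseen intention:

```python
import numpy as np

def predict_intent(vful_predict, intent_vectors, new_intent_threshold=0.5):
    """Score the utterance vector against every trained intention vector."""
    scores = {name: float(vful_predict @ v) for name, v in intent_vectors.items()}
    best = max(scores, key=scores.get)           # intention with the maximum score
    if scores[best] < new_intent_threshold:      # all scores near 0 -> new intention
        return "new_intention", scores
    return best, scores
```

For example, an utterance vector aligned with Vbusy yields a dominant busy_score, while an off-topic utterance leaves every score near zero and is routed to the new-intention branch.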
It should be noted that the above implementation procedure is only intended to illustrate the applicability of the present application and does not imply that the intention recognition method of the present application has only this one implementation flow; on the contrary, any flow in which the intention recognition method of the present application can be implemented falls within the feasible embodiments of the present application.
In summary, the intention recognition method provided by the embodiments of the present invention can accurately recognize the intention appearing in the user utterance, so that the machine customer service can generate a reply satisfying the user, thereby greatly improving the user experience.
Referring to fig. 2, an intention recognition system according to a second embodiment of the present invention is shown, the system including:
the splitting module 12 is configured to, when a voice instruction sent by a user is received, pre-process the voice instruction to split a plurality of characters carried by the voice instruction, and acquire initial word vectors corresponding to the plurality of characters respectively;
a first obtaining module 22, configured to obtain positive samples and negative samples corresponding to intents of a plurality of preset categories, respectively, and initialize an intention vector;
a second obtaining module 32, configured to perform word segmentation on the positive sample and the negative sample, and index the initial word vector to obtain target word vectors corresponding to respective characters in the positive sample and the negative sample;
a first processing module 42, configured to obtain a position vector corresponding to each of the words in the positive sample and the negative sample based on the position of the target word vector in the positive sample and the negative sample, and merge and input the target word vector and the position vector corresponding to each of the words into a first preset model, so as to obtain a corresponding high-dimensional text semantic vector;
the second processing module 52 is configured to obtain a corresponding text intention vector according to the high-dimensional text semantic vector, and multiply the text intention vector and the intention vector to obtain scores corresponding to the positive sample and the negative sample respectively;
and a training module 62, configured to obtain a corresponding intention distance vector based on the scores of the positive sample and the negative sample, and input the intention distance vector into a preset BPR loss function for training, so as to obtain an intention vector containing intention information.
In the intention identifying system, the splitting module 12 is specifically configured to:
when a voice instruction sent by the user is received, converting the voice instruction into a corresponding text through ASR, and labeling an intention label on the text;
and performing word cutting processing on the text to obtain a plurality of corresponding characters.
In the intention identifying system, the splitting module 12 is further specifically configured to:
one-hot encoding is performed on the plurality of characters one by one to obtain the one-hot code x_k corresponding to each character, and the one-hot codes y_ij corresponding to the upper and lower adjacent characters of each character are respectively acquired;
and the one-hot code x_k and the one-hot codes y_ij are all input into a preset Word2Vector algorithm model to respectively obtain the corresponding initial word vectors.
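The one-hot pairing described above can be sketched as follows. The function name and the window of exactly one adjacent character on each side are illustrative assumptions; the resulting (x_k, y_ij) pairs would serve as training data for a Word2Vector-style model:

```python
import numpy as np

def one_hot_pairs(chars, vocab):
    """Build (x_k, y_ij) one-hot pairs for each character and its neighbours."""
    index = {ch: i for i, ch in enumerate(vocab)}
    eye = np.eye(len(vocab))
    pairs = []
    for k, ch in enumerate(chars):
        x_k = eye[index[ch]]                  # one-hot code of the character
        for j in (k - 1, k + 1):              # upper / lower adjacent character
            if 0 <= j < len(chars):
                pairs.append((x_k, eye[index[chars[j]]]))
    return pairs
```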
In the intention identifying system, the first processing module 42 is specifically configured to:
adding the target word vector and the position vector corresponding to each character, and simultaneously inputting the target word vector and the position vector into a preset bert-base-Chinese model;
and performing feature extraction on the target word vector and the position vector in the bert-base-Chinese model to obtain a corresponding high-dimensional text semantic vector.
In the intention recognition system, the intention recognition system further includes a scoring module 72, and the scoring module 72 is specifically configured to:
when an actual positive sample or an actual negative sample is obtained, merge and input the actual target word vector and the actual position vector corresponding to each actual character in the actual positive sample or the actual negative sample into the bert-base-Chinese model to obtain a corresponding actual high-dimensional text semantic vector;
multiplying the actual high-dimensional text semantic vector by the intention vector containing intention information to obtain an actual score corresponding to the actual positive sample or the actual negative sample, and judging the actual score;
if the actual score is high, determining that the actual positive sample or the actual negative sample does not contain a new intention;
and if the actual score is low, determining that the actual positive sample or the actual negative sample contains a new intention.
A third embodiment of the present invention provides a computer, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the intention identifying method provided in the first embodiment as described above when executing the computer program.
A fourth embodiment of the present invention provides a readable storage medium, on which a computer program is stored, which when executed by a processor implements the intention identifying method provided in the first embodiment described above.
In summary, the intention recognition method, system, computer and readable storage medium provided by the embodiments of the present invention can accurately recognize the intention appearing in the user utterance, so that the machine customer service can generate a reply satisfying the user, thereby greatly improving the user experience.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Further, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent should be subject to the appended claims.

Claims (10)

1. An intent recognition method, the method comprising:
when a voice instruction sent by a user is received, preprocessing the voice instruction to split a plurality of characters carried by the voice instruction, and acquiring initial character vectors corresponding to the characters respectively;
acquiring positive samples and negative samples corresponding to the intentions of a plurality of preset categories respectively, and initializing intention vectors;
performing word cutting processing on the positive sample and the negative sample, and indexing the initial word vector to obtain target word vectors corresponding to characters in the positive sample and the negative sample respectively;
acquiring position vectors corresponding to the characters in the positive sample and the negative sample based on the positions of the target word vectors in the positive sample and the negative sample, and merging and inputting the target word vectors and the position vectors corresponding to the characters into a first preset model to acquire corresponding high-dimensional text semantic vectors;
acquiring corresponding text intention vectors according to the high-dimensional text semantic vectors, and multiplying the text intention vectors by the intention vectors to acquire scores corresponding to the positive sample and the negative sample respectively;
and acquiring a corresponding intention distance vector based on the scores of the positive sample and the negative sample, and inputting the intention distance vector into a preset BPR loss function for training to acquire an intention vector containing intention information.
2. The intention recognition method according to claim 1, characterized in that: when a voice instruction sent by a user is received, the voice instruction is preprocessed to split a plurality of characters carried by the voice instruction, and the steps comprise:
when a voice instruction sent by the user is received, converting the voice instruction into a corresponding text through ASR, and labeling an intention label on the text;
and performing word cutting processing on the text to obtain a plurality of corresponding characters.
3. The intention recognition method according to claim 1, characterized in that: the step of obtaining initial word vectors corresponding to the plurality of words respectively comprises:
one-hot encoding processing is carried out on the plurality of characters one by one to obtain the one-hot code x_k corresponding to each character, and the one-hot codes y_ij corresponding to the upper and lower adjacent characters of each character are respectively acquired;
and the one-hot code x_k and the one-hot codes y_ij are all input into a preset Word2Vector algorithm model to respectively obtain the corresponding initial word vectors.
4. The intention recognition method according to claim 1, characterized in that: the step of merging and inputting the target word vector and the position vector corresponding to each word into a first preset model to obtain the corresponding high-dimensional text semantic vector comprises the following steps:
adding the target word vector and the position vector corresponding to each character, and simultaneously inputting the target word vector and the position vector into a preset bert-base-Chinese model;
and performing feature extraction on the target word vector and the position vector in the bert-base-Chinese model to obtain a corresponding high-dimensional text semantic vector.
5. The intention recognition method according to claim 4, characterized in that: after the step of inputting the intention distance vector into a preset BPR loss function for training to obtain the intention vector containing intention information, the method further includes:
when an actual positive sample or an actual negative sample is obtained, merging and inputting an actual target word vector and an actual position vector corresponding to each actual character in the actual positive sample or the actual negative sample into the bert-base-Chinese model to obtain a corresponding actual high-dimensional text semantic vector;
multiplying the actual high-dimensional text semantic vector by the intention vector containing intention information to obtain an actual score corresponding to the actual positive sample or the actual negative sample, and judging the actual score;
if the actual score is high, judging that the actual positive sample or the actual negative sample does not contain a new intention;
and if the actual score is low, determining that the actual positive sample or the actual negative sample contains a new intention.
6. An intent recognition system, the system comprising:
the splitting module is used for preprocessing a voice instruction when the voice instruction sent by a user is received so as to split a plurality of characters carried by the voice instruction and acquire initial character vectors corresponding to the characters respectively;
the first acquisition module is used for acquiring positive samples and negative samples corresponding to the intents of a plurality of preset categories respectively and initializing intention vectors;
the second obtaining module is configured to perform word cutting processing on the positive sample and the negative sample, and index the initial word vector to obtain target word vectors corresponding to characters in the positive sample and the negative sample, respectively;
the first processing module is used for acquiring position vectors corresponding to the characters in the positive sample and the negative sample based on the positions of the target word vectors in the positive sample and the negative sample, and merging and inputting the target word vectors and the position vectors corresponding to the characters into a first preset model to acquire corresponding high-dimensional text semantic vectors;
the second processing module is used for acquiring a corresponding text intention vector according to the high-dimensional text semantic vector, and multiplying the text intention vector by the intention vector to acquire scores corresponding to the positive sample and the negative sample respectively;
and the training module is used for acquiring a corresponding intention distance vector based on the scores of the positive sample and the negative sample, and inputting the intention distance vector into a preset BPR loss function for training so as to acquire an intention vector containing intention information.
7. The intent recognition system of claim 6, wherein: the splitting module is specifically configured to:
when a voice instruction sent by the user is received, converting the voice instruction into a corresponding text through ASR, and labeling an intention label on the text;
and performing word cutting processing on the text to obtain a plurality of corresponding characters.
8. The intent recognition system of claim 6, wherein: the splitting module is further specifically configured to:
one-hot encoding processing is carried out on the plurality of characters one by one to obtain the one-hot code x_k corresponding to each character, and the one-hot codes y_ij corresponding to the upper and lower adjacent characters of each character are respectively acquired;
and the one-hot code x_k and the one-hot codes y_ij are all input into a preset Word2Vector algorithm model to respectively obtain the corresponding initial word vectors.
9. A computer comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the intention recognition method of any one of claims 1 to 5 when executing the computer program.
10. A readable storage medium on which a computer program is stored which, when being executed by a processor, carries out the intention recognition method of any one of claims 1 to 5.
CN202211040262.1A 2022-08-29 2022-08-29 Intention recognition method, system, computer and readable storage medium Active CN115099242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211040262.1A CN115099242B (en) 2022-08-29 2022-08-29 Intention recognition method, system, computer and readable storage medium


Publications (2)

Publication Number Publication Date
CN115099242A true CN115099242A (en) 2022-09-23
CN115099242B CN115099242B (en) 2022-11-15


Country Status (1)

Country Link
CN (1) CN115099242B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107492377A (en) * 2017-08-16 2017-12-19 北京百度网讯科技有限公司 Method and apparatus for controlling self-timer aircraft
CN111310008A (en) * 2020-03-20 2020-06-19 北京三快在线科技有限公司 Search intention recognition method and device, electronic equipment and storage medium
US20200250270A1 (en) * 2019-02-01 2020-08-06 International Business Machines Corporation Weighting features for an intent classification system
CN111931513A (en) * 2020-07-08 2020-11-13 泰康保险集团股份有限公司 Text intention identification method and device
US20210012199A1 (en) * 2019-07-04 2021-01-14 Zhejiang University Address information feature extraction method based on deep neural network model
US20210081729A1 (en) * 2019-09-16 2021-03-18 Beijing Baidu Netcom Science Technology Co., Ltd. Method for image text recognition, apparatus, device and storage medium
US20210125605A1 (en) * 2019-10-29 2021-04-29 Lg Electronics Inc. Speech processing method and apparatus therefor
CN113688244A (en) * 2021-08-31 2021-11-23 中国平安人寿保险股份有限公司 Text classification method, system, device and storage medium based on neural network
CN114528844A (en) * 2022-01-14 2022-05-24 中国平安人寿保险股份有限公司 Intention recognition method and device, computer equipment and storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
THAI-LE LUONG ET AL: "Intent extraction from social media texts using sequential segmentation and deep learning models", 《IEEE》 *
UDHAYAKUMAR SHANMUGAM ET AL: "Human-Computer text conversation through NLP in Tamil using Intent Recognition", 《IEEE》 *
吴纯青等: "基于语义的网络大数据组织与搜索", 《计算机学报》 *
赵小虎等: "基于多特征语义匹配的知识库问答系统", 《计算机应用》 *


Similar Documents

Publication Publication Date Title
CN107492379B (en) Voiceprint creating and registering method and device
EP0621531B1 (en) Interactive computer system recognizing spoken commands
CN110990543A (en) Intelligent conversation generation method and device, computer equipment and computer storage medium
Walker et al. Using natural language processing and discourse features to identify understanding errors in a spoken dialogue system
CN110610705B (en) Voice interaction prompter based on artificial intelligence
CN111241357A (en) Dialogue training method, device, system and storage medium
CN110704618B (en) Method and device for determining standard problem corresponding to dialogue data
CN110704590B (en) Method and apparatus for augmenting training samples
CN110890088B (en) Voice information feedback method and device, computer equipment and storage medium
CN113223560A (en) Emotion recognition method, device, equipment and storage medium
CN111429157A (en) Method, device and equipment for evaluating and processing complaint work order and storage medium
CN111159375A (en) Text processing method and device
CN115146124A (en) Question-answering system response method and device, equipment, medium and product thereof
CN113593565A (en) Intelligent home device management and control method and system
CN115099242B (en) Intention recognition method, system, computer and readable storage medium
CN110795531B (en) Intention identification method, device and storage medium
CN110047473B (en) Man-machine cooperative interaction method and system
CN115022471B (en) Intelligent robot voice interaction system and method
CN113505606B (en) Training information acquisition method and device, electronic equipment and storage medium
CN114707515A (en) Method and device for judging dialect, electronic equipment and storage medium
Lee et al. A study on natural language call routing
CN117648408B (en) Intelligent question-answering method and device based on large model, electronic equipment and storage medium
CN112714220B (en) Business processing method and device, computing equipment and computer readable storage medium
CN113889149B (en) Speech emotion recognition method and device
CN117041428A (en) Multi-language AI customer service automatic voice interaction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant