CN113886543A - Method, apparatus, medium, and program product for generating an intent recognition model - Google Patents

Method, apparatus, medium, and program product for generating an intent recognition model

Info

Publication number
CN113886543A
Authority
CN
China
Prior art keywords
vector
search information
question
user type
answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111152002.9A
Other languages
Chinese (zh)
Inventor
谭云飞
刘晓庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111152002.9A priority Critical patent/CN113886543A/en
Publication of CN113886543A publication Critical patent/CN113886543A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method, an apparatus, a medium, and a program product for generating an intention recognition model, which relate to artificial intelligence fields such as natural language processing, intelligent search, and deep learning. One embodiment of the method comprises: acquiring a vector of search information, a vector of a core word in the search information, and a vector of a non-core word in the search information; obtaining a user type identification feature vector according to the vector of the search information and the vector of the core word; obtaining a question-answer demand identification feature vector according to the vector of the search information and the vector of the non-core word; and training with the user type identification feature vector and the corresponding user type label, as well as the question-answer demand identification feature vector and the corresponding question-answer demand label, to obtain an intention recognition model.

Description

Method, apparatus, medium, and program product for generating an intent recognition model
Technical Field
The present disclosure relates to the field of computers, in particular to natural language processing, intelligent search, and deep learning, and more particularly to a method, apparatus, medium, and program product for generating an intent recognition model.
Background
Intention recognition belongs to the field of natural language understanding and refers to recognizing and processing a user's expression to output the user's real intention. Current methods of intention recognition mainly include:
(1) Machine learning based methods. Such a method segments the search information, performs feature crossing by combining features of the search information such as Uniform Resource Locator (URL) features and category features, and finally performs binary classification with machine learning algorithms such as Logistic Regression (LR) or Random Forest (RF) to achieve intention recognition.
(2) Deep learning based methods. Recognition is performed based on Long Short-Term Memory (LSTM) networks or pre-trained BERT (Bidirectional Encoder Representations from Transformers) models to realize intention recognition.
Disclosure of Invention
The disclosed embodiments provide a method, apparatus, medium, and program product for generating an intent recognition model.
In a first aspect, an embodiment of the present disclosure provides a method for generating an intent recognition model, including: acquiring a vector of search information, a vector of a core word in the search information, and a vector of a non-core word in the search information; obtaining a user type identification characteristic vector according to the vector of the search information and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information and the vector of the non-core word; and training by utilizing the user type identification characteristic vector and the corresponding user type label as well as the question and answer demand identification characteristic vector and the corresponding question and answer demand label to obtain an intention identification model.
In a second aspect, an embodiment of the present disclosure provides a method for generating a question and answer reply, including: acquiring search information to be predicted, and core words and non-core words in the search information to be predicted; respectively inputting the search information to be predicted, the core words, and the non-core words into a pre-trained word vector model to obtain a vector of the search information to be predicted, vectors of the core words, and vectors of the non-core words; obtaining a user type identification characteristic vector according to the vector of the search information to be predicted and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information to be predicted and the vector of the non-core word; inputting the user type identification characteristic vector and the question and answer demand identification characteristic vector into a pre-trained intention identification model respectively to obtain a user type label corresponding to the user type identification characteristic vector and a question and answer demand label corresponding to the question and answer demand identification characteristic vector; and in response to the question and answer demand label indicating that there is a question and answer demand, determining corresponding reply information according to the user type label, the core words, and the non-core words.
In a third aspect, an embodiment of the present disclosure provides an apparatus for generating an intent recognition model, including: a vector acquisition module configured to acquire a vector of the search information, a vector of core words in the search information, and a vector of non-core words in the search information; the vector obtaining module is configured to obtain a user type identification feature vector according to the vector of the search information and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information and the vector of the non-core word; and the model training module is configured to train by utilizing the user type identification characteristic vector and the corresponding user type label as well as the question and answer demand identification characteristic vector and the corresponding question and answer demand label to obtain an intention identification model.
In a fourth aspect, an embodiment of the present disclosure provides an apparatus for generating a question and answer reply, including: an information acquisition module configured to acquire search information to be predicted and core words and non-core words in the search information to be predicted; a vector extraction module configured to input the search information to be predicted, the core words, and the non-core words into a pre-trained word vector model respectively to obtain vectors of the search information to be predicted, the core words, and the non-core words; a vector obtaining module configured to obtain a user type identification characteristic vector according to the vector of the search information to be predicted and the vector of the core word, and obtain a question-answer demand identification characteristic vector according to the vector of the search information to be predicted and the vector of the non-core word; a label obtaining module configured to input the user type identification characteristic vector and the question-answer demand identification characteristic vector into a pre-trained intention identification model respectively to obtain a user type label corresponding to the user type identification characteristic vector and a question-answer demand label corresponding to the question-answer demand identification characteristic vector, wherein the question-answer demand label is used for representing whether the search information to be predicted has a question-answer demand; and a reply determining module configured to, in response to the question-answer demand label corresponding to the question-answer demand identification characteristic vector being a preset label indicating that there is a question-answer demand, determine corresponding reply information according to the user type label corresponding to the user type identification characteristic vector, the core words, and the non-core words.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the first or second aspect.
In a sixth aspect, embodiments of the present disclosure propose a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method as described in the first or second aspect.
In a seventh aspect, the disclosed embodiments propose a computer program product comprising a computer program that, when executed by a processor, implements the method as described in the first or second aspect.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
Other features, objects, and advantages of the disclosure will become apparent from a reading of the following detailed description of non-limiting embodiments which proceeds with reference to the accompanying drawings. The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is an exemplary system architecture diagram in which the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a method of generating an intent recognition model according to the present disclosure;
FIG. 3 is a schematic diagram of a method of generating an intent recognition model;
FIG. 4 is a flow diagram of another embodiment of a method of generating an intent recognition model according to the present disclosure;
FIG. 5 is a diagram illustrating core word extraction;
FIG. 6 is a flow diagram for one embodiment of a method of generating a question-answer reply in accordance with the present disclosure;
FIG. 7 is a schematic diagram illustrating the structure of one embodiment of an apparatus for generating an intent recognition model according to the present disclosure;
FIG. 8 is a schematic diagram illustrating an embodiment of an apparatus for generating a question-answer reply in accordance with the present disclosure;
FIG. 9 is a block diagram of an electronic device used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the method and apparatus for generating an intent recognition model or the method and apparatus for generating a question-answer reply of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to obtain a vector of search information, a vector of core words in the search information, a vector of non-core words in the search information, and so on. The terminal devices 101, 102, 103 may have various client applications, intelligent interactive applications installed thereon, such as search applications, search software, and so on.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be electronic products that perform human-computer interaction with a user through one or more of a keyboard, a touch pad, a touch screen, a remote controller, voice interaction, or handwriting equipment, such as a PC (Personal Computer), a mobile phone, a smart phone, a PDA (Personal Digital Assistant), a wearable device, a PPC (Pocket PC, palmtop), a tablet computer, a smart in-vehicle device, a smart television, a smart speaker, a laptop computer, a desktop computer, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the above-described electronic apparatuses. They may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. This is not specifically limited herein.
The server 105 may provide various services. For example, server 105 may obtain a vector of search information, a vector of core words in the search information, and a vector of non-core words in the search information; obtaining a user type identification characteristic vector according to the vector of the search information and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information and the vector of the non-core word; and training by utilizing the user type identification characteristic vector and the corresponding user type label as well as the question and answer demand identification characteristic vector and the corresponding question and answer demand label to obtain an intention identification model.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the method for generating the intention recognition model or the method for generating the question and answer reply provided by the embodiment of the present disclosure is generally executed by the server 105, and accordingly, the device for generating the intention recognition model or the device for generating the question and answer reply is generally disposed in the server 105.
It should be understood that the number of electronic devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of electronic devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method of generating an intent recognition model in accordance with the present disclosure is shown. The method of generating an intent recognition model may include the steps of:
step 201, obtaining a vector of search information, a vector of core words in the search information, and a vector of non-core words in the search information.
In the present embodiment, an executing subject (e.g., the server 105 shown in fig. 1) of the method of generating the intention recognition model may first acquire search information from a terminal device (e.g., the terminal devices 101, 102, 103 shown in fig. 1); then perform word segmentation on the search information to obtain core words and non-core words in the search information, or extract the core words and the non-core words from the search information; and then convert the search information, the core words, and the non-core words into corresponding vectors to obtain the vector of the search information, the vectors of the core words, and the vectors of the non-core words. Alternatively, the executing body (e.g., the terminal devices 101, 102, 103 shown in fig. 1) of the method of generating the intention recognition model may first acquire search information input by the user in a search box or interface of map software or navigation software on the terminal device; then perform word segmentation on the search information to obtain core words and non-core words in the search information, or extract the core words and the non-core words from the search information; and then convert the search information, the core words, and the non-core words into corresponding vectors to obtain the vector of the search information, the vectors of the core words, and the vectors of the non-core words.
Here, the search information may be search information input by a user of a terminal device (e.g., terminal devices 101, 102, 103 shown in fig. 1) in a search box or interface thereof, or search information input by the user by voice; the search information can be used for representing the search requirement of the user. The search information may include information composed of any characters, such as Chinese characters, English characters, symbols, and the like, and the vector of the search information may be obtained by vector conversion of the search information. The core word can be used for representing an entity word required by a user query, and such an entity word generally has a specific meaning, for example, "ball pen"; the vector of the core word can be obtained by vector conversion of the core word. The non-core words may be used to represent words reflecting the user's question-answering requirement, for example, question words (such as interrogative or modal words) in the search information; the vectors of the non-core words may be obtained by vector conversion of the non-core words.
In one example, the search information, the core words, and the non-core words are converted into the vector of the search information, the vectors of the core words, and the vectors of the non-core words using a pre-trained word vector model, such as Word2Vec, a CBOW (Continuous Bag-of-Words) model, or a Skip-Gram model.
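Purely as an illustration, the following Python sketch obtains such vectors with gensim's Word2Vec; the toy corpus, the 64-dimensional vector size, the CBOW setting (sg=0), and the mean-pooling of token vectors into a query-level vector are assumptions rather than details fixed by this disclosure.

    import numpy as np
    from gensim.models import Word2Vec

    # Toy corpus of segmented search queries; in practice a pre-trained model would be loaded.
    corpus = [["what", "is", "a", "ball", "pen"], ["ball", "pen", "wholesale", "price"]]
    w2v = Word2Vec(sentences=corpus, vector_size=64, window=5, min_count=1, sg=0)  # sg=0 selects CBOW

    def embed(tokens):
        """Mean-pool the word vectors of a token sequence into a single vector."""
        vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
        return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

    query_vec = embed(["what", "is", "a", "ball", "pen"])  # vector of the search information
    core_vec = embed(["ball", "pen"])                      # vector of the core word(s)
    non_core_vec = embed(["what", "is", "a"])              # vector of the non-core words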
In the technical scheme of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and other handling of the search information, the core words, and the non-core words all comply with relevant laws and regulations and do not violate public order and good morals.
Step 202, obtaining a user type identification feature vector according to the vector of the search information and the vector of the core word; and obtaining the question-answer demand identification feature vector according to the vector of the search information and the vector of the non-core word.
In an embodiment, the execution main body may obtain a user type identification feature vector according to a vector of the search information and a vector of the core word; and obtaining a question-answer demand identification feature vector according to the vector of the search information and the vector of the non-core word.
Here, the vector of the search information and the vector of the core word are combined to obtain the user type identification feature vector. The user type identification feature vector may be used to identify the user type of a terminal device (e.g., terminal devices 101, 102, 103 shown in fig. 1); for example, To B (To Business) may refer to business-oriented or specific user groups, such as enterprise users, while To C (To Customer) may refer to end customers, such as individual consumers.
Taking the vector of the search information and the vector of the core word as an example, the vector merging method may include: concatenating the vector of the search information and the vector of the core word end to end; averaging the vector of the search information and the vector of the core word; or randomly selecting vector components from the vector of the search information and the vector of the core word and splicing the randomly selected components.
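For illustration only, the three merging strategies above can be sketched in Python roughly as follows; the 64-dimensional vectors and the way the random selection is spliced are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    query_vec = rng.normal(size=64)   # vector of the search information (assumed 64-dimensional)
    core_vec = rng.normal(size=64)    # vector of the core word

    # 1) end-to-end concatenation -> 128-dimensional user type identification feature vector
    merged_concat = np.concatenate([query_vec, core_vec])

    # 2) element-wise mean -> 64-dimensional feature vector
    merged_mean = (query_vec + core_vec) / 2.0

    # 3) random selection and splicing: take a random subset of components from each vector
    idx = rng.choice(64, size=32, replace=False)
    merged_random = np.concatenate([query_vec[idx], core_vec[idx]])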
Here, the vector of the search information and the vector of the non-core word are combined to obtain the question-answer requirement identification feature vector. The question-answer requirement identification feature vector can be used for representing whether the search information has a question-answer requirement or not.
In one example, in fig. 3, core word extraction is performed on the search information based on BERT (Bidirectional Encoder Representations from Transformers) to obtain core words and non-core words; then, vector conversion is performed on the core words, the search information, and the non-core words based on BERT to obtain the vectors of the core words, the vector of the search information, and the vectors of the non-core words; then, the vector of the core word and the vector of the search information are combined to obtain a user type identification characteristic vector, which can be used for identifying the user type that the search information is oriented to; and the vector of the non-core word and the vector of the search information are combined to obtain a question-answer requirement identification characteristic vector, which can be used for identifying whether the search information has a question-answer requirement.
Correspondingly, in this example, the search information and the core words and non-core words extracted by BERT may be fed into a BERT pre-training model, and the vectorization results (i.e., the vector of the search information, the vectors of the core words, and the vectors of the non-core words) are extracted from it. Then, in order to identify the user type and the question-answer requirement simultaneously, the vector of the search information and the vector of the core word can be combined to obtain the user type identification characteristic vector, and the vector of the search information and the vector of the non-core word can be combined to obtain the question-answer requirement identification characteristic vector. In this way, the characteristic vector of the search information is retained; meanwhile, whether the search information belongs to a user-type industry is determined to a great extent by the core words in the search information, so using the vector of the core word as a constraint on the vector of the search information allows the user type of the search information to be identified more accurately. Because the question-answer requirement is expressed to a large extent by words other than the core words, and these words largely determine whether the search information is question-answer related, using the vector of the non-core word as a constraint on the vector of the search information makes it possible to accurately identify, at the same time, the user type of the search information and whether the search information carries a question-answer requirement.
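The BERT-based vectorization and merging described in this example might look roughly like the following sketch using the Hugging Face transformers library; the bert-base-chinese checkpoint, the [CLS]-token pooling, and the English placeholder texts are assumptions, not choices made by this disclosure.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    bert = BertModel.from_pretrained("bert-base-chinese")
    bert.eval()

    def bert_vec(text):
        """Use the [CLS] hidden state as a sentence-level vector for a piece of text."""
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=32)
        with torch.no_grad():
            out = bert(**inputs)
        return out.last_hidden_state[:, 0, :].squeeze(0)  # shape (768,)

    query_vec = bert_vec("ball pen wholesale price")  # vector of the search information
    core_vec = bert_vec("ball pen")                   # vector of the core word
    non_core_vec = bert_vec("wholesale price")        # vector of the non-core words

    user_type_feature = torch.cat([query_vec, core_vec])    # user type identification feature vector
    qa_need_feature = torch.cat([query_vec, non_core_vec])  # question-answer requirement identification feature vector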
In the embodiment of the present disclosure, feature extraction is performed on the same search information, and the extracted features are shared. On the basis of the shared features, user type identification features are added to identify the user type, and question and answer requirement identification features are added to identify the question and answer requirement; finally, the user type identification and the question and answer requirement identification are combined and trained to obtain an intention identification model capable of identifying the user type and the question and answer requirement with high accuracy. Therefore, the user type that the search information is oriented to can be identified, and whether the search information has a question and answer requirement can also be identified.
Step 203, training by using the user type identification characteristic vector and the corresponding user type label, and the question-answer requirement identification characteristic vector and the corresponding question-answer requirement label to obtain an intention identification model.
In this embodiment, the executing entity may perform training by using the user type identification feature vector and the corresponding user type tag, and the question-answer requirement identification feature vector and the corresponding question-answer requirement tag, so as to obtain the intention identification model. During training, the executing body can respectively take the user type identification feature vector and the question-answer demand identification feature vector as the input of the intention identification model, and take the user type label corresponding to the user type identification feature vector and the question-answer demand label corresponding to the question-answer demand identification feature vector as the expected output to obtain the intention identification model. The machine learning model may be a probability model, a classification model, or other classifier in the prior art or future development technology, for example, the machine learning model may include any one of the following: decision tree model (XGBoost), logistic regression model (LR), deep neural network model (DNN).
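As a minimal sketch of the classical-classifier option listed above (logistic regression here), one could fit a separate classifier per label set on the two kinds of feature vectors; the random toy data and the use of two separate models are assumptions, and the shared end-to-end variant described next is sketched under the FIG. 4 embodiment below.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_user = rng.random((200, 128))   # user type identification feature vectors
    y_user = rng.integers(0, 2, 200)  # user type labels (e.g. 0 = To C, 1 = To B)
    X_qa = rng.random((200, 128))     # question-answer requirement identification feature vectors
    y_qa = rng.integers(0, 2, 200)    # question-answer requirement labels (1 = has requirement)

    user_type_clf = LogisticRegression(max_iter=1000).fit(X_user, y_user)
    qa_need_clf = LogisticRegression(max_iter=1000).fit(X_qa, y_qa)

    print(user_type_clf.predict(X_user[:3]), qa_need_clf.predict(X_qa[:3]))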
It should be noted that, in the above intention recognition model, the trained data is shared (i.e., the same search information), and further, a user type recognition feature vector is added on the basis of shared features (i.e., the same search information) for user type recognition, and a question and answer requirement recognition feature vector is added on the basis of shared features for question and answer requirement recognition; thus, the intent recognition model is an end-to-end model. The intention recognition model may be a multi-label recognition model.
The method for generating the intention recognition model provided by the embodiment of the disclosure comprises the steps of firstly obtaining a vector of search information, a vector of a core word in the search information and a vector of a non-core word in the search information; then, according to the vector of the search information and the vector of the core word, a user type identification feature vector is obtained; obtaining a question-answer demand identification characteristic vector according to the vector of the search information and the vector of the non-core word; and finally, training by utilizing the user type identification characteristic vector and the corresponding user type label as well as the question and answer demand identification characteristic vector and the corresponding question and answer demand label to obtain an intention identification model. The extracted features can be shared by performing feature extraction on the same search information; on the basis of sharing the characteristics, user type identification characteristic vectors are respectively added to identify the user types, question and answer requirements are identified by adding question and answer requirement identification characteristic vectors, and finally the identified user types and the identified question and answer requirements are combined and trained to obtain an intention identification model for identifying the user types and the question and answer requirements with high accuracy.
With further reference to fig. 4, fig. 4 illustrates a flow 400 of another embodiment of a method of generating an intent recognition model according to the present disclosure. The method of generating an intent recognition model may include the steps of:
step 401, obtaining a vector of the search information, a vector of a core word in the search information, and a vector of a non-core word in the search information.
Step 402, obtaining a user type identification feature vector according to the vector of the search information and the vector of the core word; and obtaining the question-answer demand identification feature vector according to the vector of the search information and the vector of the non-core word.
Step 403, inputting the user type identification feature vector and the question and answer demand identification feature vector into the intention identification model respectively, and obtaining a prediction result corresponding to the user type identification feature vector and a prediction result corresponding to the question and answer demand identification feature vector.
In the present embodiment, an executing subject of the method for generating an intention recognition model (for example, the terminal devices 101, 102, 103 or the server 105 shown in fig. 1) inputs the user type recognition feature vector and the question-answer requirement recognition feature vector into the intention recognition model respectively, and obtains a prediction result corresponding to the user type recognition feature vector and a prediction result corresponding to the question-answer requirement recognition feature vector.
In one example, the respectively inputting the user type identification feature vector and the question-answer requirement identification feature vector into the intention identification model to obtain a prediction result corresponding to the user type identification feature vector and a prediction result corresponding to the question-answer requirement identification feature vector may include: inputting the user type identification characteristic vector into a first network of an intention identification model to obtain a prediction result corresponding to the user type identification characteristic vector; and inputting the question-answer requirement identification characteristic vector into a second network of the intention identification model to obtain a prediction result corresponding to the question-answer requirement identification characteristic vector. The first network may be a neural network for predicting a user type identification feature vector, for example, a user type identification network. The second network may be a neural network for predicting the question-answer requirement identification feature vector, for example, a question-answer requirement identification network.
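One possible reading of the "first network" and "second network" is a neural model with a shared trunk and two classification heads, for instance as in the following PyTorch sketch; the layer sizes, the ReLU trunk, and the reuse of one trunk for both merged feature vectors are assumptions.

    import torch
    from torch import nn

    class IntentRecognitionModel(nn.Module):
        """Shared trunk with a user type head ("first network") and a question-answer head ("second network")."""

        def __init__(self, feat_dim=1536, hidden=256, n_user_types=2, n_qa_labels=2):
            super().__init__()
            self.shared = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
            self.user_type_head = nn.Linear(hidden, n_user_types)  # first network
            self.qa_need_head = nn.Linear(hidden, n_qa_labels)     # second network

        def forward(self, user_type_feature, qa_need_feature):
            user_logits = self.user_type_head(self.shared(user_type_feature))
            qa_logits = self.qa_need_head(self.shared(qa_need_feature))
            return user_logits, qa_logits

    model = IntentRecognitionModel()
    user_logits, qa_logits = model(torch.randn(4, 1536), torch.randn(4, 1536))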
Step 404, determining a loss function according to the prediction result and the user type label corresponding to the user type identification characteristic vector and the prediction result and the question-answer demand label corresponding to the question-answer demand identification characteristic vector.
In this embodiment, the execution subject may determine the loss function according to the prediction result and the user type tag corresponding to the user type identification feature vector, and the prediction result and the question-answer requirement tag corresponding to the question-answer requirement identification feature vector.
In one example, an initial loss function may be pre-established; then, the prediction result and the user type label corresponding to the user type identification characteristic vector, and the prediction result and the question-answer demand label corresponding to the question-answer demand identification characteristic vector, are respectively substituted into the initial loss function, so as to adjust the parameters of the initial function and obtain a final loss function.
And 405, adjusting parameters of the intention recognition model based on the loss function until the loss function is converged to obtain the trained intention recognition model.
In this embodiment, the executing entity may calculate whether the output of the loss function satisfies a preset iteration cutoff condition; if not, the parameters of the intention recognition model are adjusted, and this is repeated until the output of the adjusted loss function satisfies the preset iteration cutoff condition, at which point the model is taken as the model finally obtained through training. The preset iteration cutoff condition may be a preset number of iterations, or may be set according to the recognition accuracy of the intention recognition model.
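A hypothetical training loop matching steps 403-405 is sketched below, reusing the IntentRecognitionModel class from the earlier sketch; the random toy batch, the Adam optimizer, the equal weighting of the two losses, and the loss-change tolerance used as the iteration cutoff condition are all assumptions.

    import torch
    from torch import nn

    # Toy training batch (assumed); IntentRecognitionModel is the two-head sketch shown earlier.
    user_type_feats = torch.randn(32, 1536)        # user type identification feature vectors
    qa_need_feats = torch.randn(32, 1536)          # question-answer requirement identification feature vectors
    user_type_labels = torch.randint(0, 2, (32,))  # user type labels
    qa_need_labels = torch.randint(0, 2, (32,))    # question-answer requirement labels

    model = IntentRecognitionModel()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    prev_loss, tol = float("inf"), 1e-4
    for epoch in range(100):
        user_logits, qa_logits = model(user_type_feats, qa_need_feats)
        loss = criterion(user_logits, user_type_labels) + criterion(qa_logits, qa_need_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if abs(prev_loss - loss.item()) < tol:  # preset iteration cutoff condition (assumed)
            break
        prev_loss = loss.item()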
In the present embodiment, the specific operations of steps 401-402 have been described in detail in steps 201-202 of the embodiment shown in fig. 2, and are not described herein again.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the method for generating the intention recognition model in the present embodiment highlights the step of training the intention recognition model. Therefore, in the scheme described in this embodiment, the user type identification feature vector and the question and answer demand identification feature vector are respectively input into the intention identification model, and the prediction result corresponding to the user type identification feature vector and the prediction result corresponding to the question and answer demand identification feature vector are obtained; then, determining a loss function according to a prediction result and a user type label corresponding to the user type identification characteristic vector and a prediction result and a question-answer demand label corresponding to the question-answer demand identification characteristic vector; then, adjusting parameters of the intention recognition model based on the loss function until the loss function is converged to obtain the trained intention recognition model; the user type can be identified based on the user type identification features, the question answering requirement can be identified based on the question answering requirement identification features, and finally the user type identification and the question answering requirement identification are combined and trained to obtain the intention identification model capable of identifying the user type and the question answering requirement with high accuracy. Therefore, the user type oriented to the search information can be identified, and whether the search information has a question and answer requirement can be identified.
In some optional implementation manners of this embodiment, determining the loss function according to the prediction result and the user type tag corresponding to the user type identification feature vector and the prediction result and the question-answer requirement tag corresponding to the question-answer requirement identification feature vector may include: and determining a loss function according to the prediction result corresponding to the user type identification characteristic vector, the user type label and the corresponding first weight, and the prediction result corresponding to the question-answer demand identification characteristic vector, the question-answer demand label and the corresponding second weight.
In this implementation manner, the execution subject may determine the loss function according to the prediction result corresponding to the user type identification feature vector, the user type tag and the corresponding first weight, and the prediction result corresponding to the question-answer requirement identification feature vector, the question-answer requirement tag and the corresponding second weight.
In this implementation, the execution subject may set the first weight and the second weight according to a user requirement or recognition accuracy of the intention recognition model.
In one example, the formula of the loss function is as follows:
loss_all = a * loss_1 + b * loss_2
where a (i.e., the first weight) and b (i.e., the second weight) are the weights of the user type identification loss loss_1 and the question-answer demand identification loss loss_2, respectively.
In some optional implementations of this embodiment, a sum of the first weight and the second weight is 1.
In the present implementation, the sum of the first weight and the second weight may be set to 1, so that the recognition accuracy of the intention recognition model may be flexibly adjusted.
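A numeric illustration of the weighted total loss with a + b = 1 might look as follows; the example weights 0.6/0.4 and the random logits are assumptions.

    import torch
    from torch import nn

    criterion = nn.CrossEntropyLoss()
    user_logits = torch.randn(8, 2)               # assumed outputs of the user type head
    user_labels = torch.randint(0, 2, (8,))
    qa_logits = torch.randn(8, 2)                 # assumed outputs of the question-answer head
    qa_labels = torch.randint(0, 2, (8,))

    a, b = 0.6, 0.4                               # first weight and second weight, a + b = 1
    loss_1 = criterion(user_logits, user_labels)  # user type identification loss
    loss_2 = criterion(qa_logits, qa_labels)      # question-answer requirement identification loss
    loss_all = a * loss_1 + b * loss_2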
In some optional implementation manners of this embodiment, the obtaining a vector of the search information, a vector of a core word in the search information, and a vector of a non-core word in the search information may include: acquiring search information; extracting core words and non-core words in the search information; and respectively inputting the search information, the core words and the non-core words into a pre-trained word vector model to obtain the vectors of the search information, the core words and the non-core words.
In this implementation manner, the execution main body may preprocess a plurality of pieces of search information, for example, filtering out abnormal symbols in the search information, or directly filtering out search information whose character count is smaller than a first preset character number threshold or larger than a second preset character number threshold; then, core word extraction is performed on the preprocessed search information by using BERT to obtain the core words, the non-core words, and the preprocessed search information. The word vector model may be Word2Vec, a CBOW (Continuous Bag-of-Words) model, or a Skip-Gram model.
It should be noted that the first preset character number threshold and the second preset character number threshold may be randomly set numbers or may be set according to the accuracy of search information identification. They are used to filter out search information whose character count is smaller than the first preset character number threshold or larger than the second preset character number threshold.
In this implementation, the plurality of search information may be preprocessed to obtain processed search information; then, a vector of the search information, a vector of the core word in the search information, and a vector of the non-core word in the search information are determined through a pre-trained word vector model.
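The preprocessing described above could be sketched as follows; the threshold values, the character classes treated as "abnormal symbols", and the decision to keep question marks are assumptions.

    import re

    MIN_CHARS, MAX_CHARS = 2, 38  # first/second preset character number thresholds (assumed values)

    def preprocess(queries):
        cleaned = []
        for q in queries:
            q = re.sub(r"[^\w\s\u4e00-\u9fff?？]", "", q)  # filter out abnormal symbols (assumed rule)
            if MIN_CHARS <= len(q) <= MAX_CHARS:           # drop queries outside the thresholds
                cleaned.append(q)
        return cleaned

    print(preprocess(["ball pen wholesale price###", "a", "what is a ball pen???"]))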
In some optional implementation manners of this embodiment, if the number of the search information is multiple, extracting the core words and the non-core words in the search information may include: and extracting core words and non-core words in the target search information in response to the non-Chinese ratio in the target search information being less than or equal to a second preset value, wherein the target search information is search information in which the ratio of the number of the core words extracted from the target search information to the number of the plurality of search information is greater than a first preset value.
In this implementation manner, in the process of extracting the core words, the execution main body needs to determine in advance whether the ratio of the number of the core word to the number of the plurality of pieces of search information is greater than a first preset value; when the ratio for the target search information is greater than the first preset value, it determines whether the non-Chinese ratio in the target search information is greater than a second preset value; and when the non-Chinese ratio in the target search information is greater than the second preset value, the core words of the target search information are not extracted.
It should be noted that before extracting the core word from the search information, it may be determined whether a ratio of the number of the core word in the target search information to the number of the plurality of search information is greater than a first preset value; and when the ratio is determined to be larger than the first preset value, determining whether the non-Chinese ratio in the target search information is larger than a second preset value. The first preset value and the second preset value may be set by a user or set according to the recognition accuracy of the intention recognition model. Optionally, the first preset value may be 0.85, and the second preset value may be 0.85.
In one example, in fig. 5, in the process of extracting the core words from the plurality of search information, if the ratio of the number of the core words in the target search information to the number of the plurality of search information is greater than a first preset value and the non-chinese ratio in the target search information is greater than a second preset value, the core words are not extracted from the target search information; and if the non-Chinese ratio in the target search information is less than or equal to a second preset value, extracting core words from the target search information to obtain core words and non-core words.
It should be noted that the non-Chinese ratio may be the proportion of all characters other than Chinese characters among all characters of the target search information.
In this implementation, in order to retain all the information of the search information, core word extraction is performed on the search information by BERT, and the extraction result is split into a core word part and a non-core word part. For user-type search information in industries such as electronics and chemical engineering, the search information often contains many numbers and English characters, so the core word extraction results tend to be poor. A commonly adopted remedy is to add training samples; however, search information composed of a mixture of Chinese, English, and numbers is difficult to learn from samples, and the extraction effect cannot be improved even if more samples are added. For the core word extraction results, this implementation first calculates, for each piece of search information, whether the ratio of the number of extracted core words divided by the number of the plurality of pieces of search information is greater than the first preset value; for such target search information (i.e., search information for which the ratio of the number of core words to the number of the plurality of pieces of search information is greater than the first preset value), the non-Chinese ratio in the target search information is then calculated; and if the non-Chinese ratio in the target search information is greater than the second preset value, core word extraction is not performed on the target search information. This efficiently and conveniently handles the problem of inaccurate core word extraction caused by the mixed composition of search information in industries such as electronics and chemical engineering.
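The filtering rule above can be sketched as follows, using the 0.85 values mentioned earlier; the exact definition of the per-query core-word-count ratio is an assumption based on the description.

    FIRST_PRESET = 0.85   # first preset value (example value from the description)
    SECOND_PRESET = 0.85  # second preset value (example value from the description)

    def non_chinese_ratio(text):
        """Proportion of characters other than Chinese characters in the text."""
        if not text:
            return 0.0
        non_cn = sum(1 for ch in text if not ("\u4e00" <= ch <= "\u9fff"))
        return non_cn / len(text)

    def should_extract_core_words(query, core_word_count, total_query_count):
        """Skip core word extraction for 'target' queries whose non-Chinese ratio is too high."""
        is_target = (core_word_count / total_query_count) > FIRST_PRESET
        if is_target and non_chinese_ratio(query) > SECOND_PRESET:
            return False
        return True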
With further reference to fig. 6, fig. 6 illustrates a flow 600 of one embodiment of a method of generating a question-answer reply in accordance with the present disclosure. The method for generating the question-answer reply may comprise the following steps:
step 601, obtaining search information to be predicted, and core words and non-core words in the search information to be predicted.
In this implementation, an executing body (e.g., terminal devices 101, 102, 103 shown in fig. 1) of the method for generating a question-answer reply may first obtain search information to be predicted; and then, extracting core words and non-core words in the search information to be predicted. The search information to be predicted can be used for predicting the user type and the question and answer requirement by a pre-trained intention recognition model.
It should be noted that the extraction of the core words and the non-core words in the search information to be predicted may be performed by using a pre-trained BERT model.
In one example, extracting core words and non-core words in the search information to be predicted may include: and extracting core words and non-core words in the search information to be predicted by utilizing a pre-trained word vector model.
It should be noted that the execution subject of the method for generating the question-answer reply and the execution subject of the method for generating the intention recognition model may be the same or different; when the execution subject of the method of generating the question-answer reply is different from the execution subject of the method of generating the intention recognition model, the intention recognition model may be transmitted to the execution subject of the method of generating the question-answer reply by the execution subject of the method of generating the intention recognition model.
Step 602, inputting the search information to be predicted, the core word and the non-core word into a pre-trained word vector model respectively to obtain a vector of the search information to be predicted, a vector of the core word and a vector of the non-core word.
In this embodiment, the execution main body inputs the search information to be predicted and the core word and the non-core word in the search information to be predicted into the pre-trained word vector model respectively, so as to obtain the vector of the search information to be predicted, the vector of the core word, and the vector of the non-core word.
It should be noted that, for the pre-trained word vector model, reference may be made to the foregoing description of the pre-trained word vector model, which is not repeated here.
Step 603, obtaining a user type identification feature vector according to the vector of the search information to be predicted and the vector of the core word; and obtaining a question-answer demand identification characteristic vector according to the vector of the search information to be predicted and the vector of the non-core word.
In this embodiment, the execution main body may merge a vector of search information to be predicted and a vector of a core word to obtain a user type identification feature vector; and combining the vector of the search information to be predicted with the vector of the non-core word to obtain a question-answer demand identification characteristic vector.
Step 604, inputting the user type identification feature vector and the question-answer demand identification feature vector into a pre-trained intention identification model respectively to obtain a user type label corresponding to the user type identification feature vector and a question-answer demand label corresponding to the question-answer demand identification feature vector.
In this embodiment, the executing entity may respectively input the user type identification feature vector and the question-answer requirement identification feature vector into a pre-trained intention identification model, so as to obtain a user type tag corresponding to the user type identification feature vector and a question-answer requirement tag corresponding to the question-answer requirement identification feature vector. The question-answer requirement tag can be used for representing whether the search information to be predicted has a question-answer requirement. The question-answer requirement tag may be represented by 0 and 1, for example, 1 represents that there is a question-answer requirement, and 0 represents that there is no question-answer requirement.
Step 605, in response to the question-answer requirement label indicating that there is a question-answer requirement, determining corresponding reply information according to the user type label, the core word, and the non-core word.
In this embodiment, when it is determined that the question-answering demand label has a question-answering demand, the corresponding reply information is determined according to the user type label, the core word and the non-core word corresponding to the user type identification feature vector.
In one example, a corresponding knowledge graph can be established in advance based on the user type and the question and answer requirement; and when a user type label corresponding to the user type identification characteristic vector and a question-answer demand label corresponding to the question-answer demand characteristic vector are obtained and the question-answer demand label has a question-answer demand, acquiring corresponding reply information from the knowledge graph. The reply message can be presented in different styles according to the user type, for example, presented in a mail sending mode, or presented in a short message mode, or presented in a new interface, and presented to the user in a customized mode.
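Steps 604-605 might be wired together roughly as in the following sketch; the label coding (1 = has question-answer requirement), the hard-coded logits, and the dictionary standing in for a knowledge graph are assumptions.

    import torch

    user_logits = torch.tensor([[0.2, 1.3]])  # assumed output of the user type head (0 = To C, 1 = To B)
    qa_logits = torch.tensor([[0.1, 2.0]])    # assumed output of the question-answer requirement head

    user_type_label = int(user_logits.argmax(dim=-1))
    qa_need_label = int(qa_logits.argmax(dim=-1))  # 1 = has question-answer requirement, 0 = none

    core_word, non_core_word = "ball pen", "wholesale price"
    knowledge_base = {                             # toy stand-in for the pre-built knowledge graph
        (1, "ball pen"): "Wholesale ball pen suppliers and bulk prices: ...",
    }

    if qa_need_label == 1:                         # question-answer requirement present
        reply = knowledge_base.get((user_type_label, core_word), "No matching answer found.")
        print(reply)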
According to the method for generating the question-answer reply provided by the embodiment of the disclosure, the user type and the question-answer requirement of the search information to be predicted can be identified through the pre-trained intention identification model, and when the question-answer requirement label has the question-answer requirement, the corresponding question-answer reply can be determined according to the user type label, the core words and the non-core words.
With further reference to fig. 7, as an implementation of the methods illustrated in the above figures, the present disclosure provides an embodiment of an apparatus for generating an intent recognition model, which corresponds to the method embodiment illustrated in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 7, the apparatus 700 for generating an intention recognition model of the present embodiment may include: a vector acquisition module 701, a vector obtaining module 702, and a model training module 703. The vector acquisition module 701 is configured to acquire a vector of the search information, a vector of a core word in the search information, and a vector of a non-core word in the search information; the vector obtaining module 702 is configured to obtain a user type identification feature vector according to the vector of the search information and the vector of the core word, and obtain a question-answer demand identification feature vector according to the vector of the search information and the vector of the non-core word; and the model training module 703 is configured to train with the user type identification feature vector and the corresponding user type label, and the question-answer demand identification feature vector and the corresponding question-answer demand label, to obtain an intention identification model.
In the present embodiment, in the apparatus 700 for generating an intention recognition model: for the specific processing of the vector acquisition module 701, the vector obtaining module 702, and the model training module 703 and the technical effects thereof, reference may be made to the related descriptions of steps 201 to 203 in the embodiment corresponding to fig. 2, which are not repeated herein.
In some optional implementations of this embodiment, the model training module 703 includes: the result prediction unit is configured to input the user type identification feature vector and the question and answer demand identification feature vector into the intention identification model respectively to obtain a prediction result corresponding to the user type identification feature vector and a prediction result corresponding to the question and answer demand identification feature vector; a function determination unit configured to determine a loss function according to a prediction result and a user type tag corresponding to the user type identification feature vector, and a prediction result and a question-and-answer demand tag corresponding to the question-and-answer demand identification feature vector; and the model training unit is configured to adjust parameters of the intention recognition model based on the loss function until the loss function is converged to obtain the trained intention recognition model.
In some optional implementations of this embodiment, the function determination unit is further configured to: determine the loss function according to the prediction result corresponding to the user type identification feature vector, the user type label, and the corresponding first weight, as well as the prediction result corresponding to the question-answer demand identification feature vector, the question-answer demand label, and the corresponding second weight.
In some optional implementations of this embodiment, a sum of the first weight and the second weight is 1.
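The weighted combination described in these two optional implementations could be written, for example, as follows, where the second weight is taken as 1 minus the first weight so that the two weights sum to 1 (the concrete weight value is a placeholder):

    def weighted_loss(user_type_loss, qa_demand_loss, first_weight=0.5):
        # first_weight + second_weight = 1
        second_weight = 1.0 - first_weight
        return first_weight * user_type_loss + second_weight * qa_demand_loss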
In some optional implementations of this embodiment, the vector acquisition module 701 includes: a search information acquisition unit configured to acquire search information; a word extraction unit configured to extract a core word and a non-core word in the search information; and a vector obtaining unit configured to input the search information, the core word and the non-core word into a pre-trained word vector model respectively, to obtain the vector of the search information, the vector of the core word and the vector of the non-core word.
In some optional implementations of this embodiment, the word extraction unit is further configured to: skip extracting the core words and the non-core words in the search information in response to the ratio of the number of the extracted core words to the number of pieces of search information being larger than a first preset value and the non-Chinese ratio in the search information being larger than a second preset value.
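A minimal sketch of this filtering condition, assuming the non-Chinese ratio is computed character by character and using placeholder threshold values (the disclosure does not give concrete preset values):

    def should_extract(core_word_count, num_search_items, search_text,
                       first_preset=0.5, second_preset=0.5):
        core_ratio = core_word_count / max(num_search_items, 1)
        non_chinese = sum(1 for ch in search_text if not ('\u4e00' <= ch <= '\u9fff'))
        non_chinese_ratio = non_chinese / max(len(search_text), 1)
        # Do not extract when the core-word ratio exceeds the first preset value
        # but the non-Chinese ratio also exceeds the second preset value.
        return not (core_ratio > first_preset and non_chinese_ratio > second_preset)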
With further reference to fig. 8, as an implementation of the method shown in the above-mentioned figures, the present disclosure provides an embodiment of an apparatus for generating a question-answer reply, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 6, and the apparatus may be applied to various electronic devices.
As shown in fig. 8, the apparatus 800 for generating a question-answer reply in this embodiment may include: an information obtaining module 801, a vector extraction module 802, a vector obtaining module 803, a label obtaining module 804 and a reply determining module 805. The information obtaining module 801 is configured to obtain search information to be predicted, and a core word and a non-core word in the search information to be predicted; the vector extraction module 802 is configured to input the search information to be predicted, the core word and the non-core word into a pre-trained word vector model respectively, to obtain a vector of the search information to be predicted, a vector of the core word and a vector of the non-core word; the vector obtaining module 803 is configured to obtain a user type identification feature vector according to the vector of the search information to be predicted and the vector of the core word, and to obtain a question-answer demand identification feature vector according to the vector of the search information to be predicted and the vector of the non-core word; the label obtaining module 804 is configured to input the user type identification feature vector and the question-answer demand identification feature vector respectively into the intention recognition model generated in advance by the above-described apparatus, to obtain a user type label corresponding to the user type identification feature vector and a question-answer demand label corresponding to the question-answer demand identification feature vector, where the question-answer demand label is used to represent whether the search information to be predicted has a question-answer demand; the reply determining module 805 is configured to determine, in response to the question-answer demand label indicating that there is a question-answer demand, corresponding reply information according to the user type label, the core word and the non-core word.
In the present embodiment, in the apparatus 800 for generating a question-answer reply: the detailed processing of the information obtaining module 801, the vector extraction module 802, the vector obtaining module 803, the label obtaining module 804 and the reply determining module 805 and the technical effects thereof may refer to the related descriptions of steps 601 to 605 in the embodiment corresponding to fig. 6, which are not repeated herein.
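Putting the pieces together, an illustrative end-to-end sketch of the prediction flow of apparatus 800 might look as follows. The word vector model, the intention recognition model, the word extraction routine and the reply templates are all placeholders passed in as parameters; only the control flow mirrors the description above.

    def generate_reply(query, word_vec_model, intent_model, reply_templates,
                       extract_words, build_feature_vectors):
        core_words, non_core_words = extract_words(query)
        # Vectors from the pre-trained word vector model
        query_vec = word_vec_model(query)
        core_vec = word_vec_model(" ".join(core_words))
        non_core_vec = word_vec_model(" ".join(non_core_words))
        user_feat, qa_feat = build_feature_vectors(query_vec, core_vec, non_core_vec)
        # Labels from the pre-trained intention recognition model
        user_type_label, has_qa_demand = intent_model.predict(user_feat, qa_feat)
        if not has_qa_demand:
            return None  # no question-answer demand, so no reply is generated
        # Choose a reply template for the user type and fill it with the extracted words
        template = reply_templates.get(user_type_label, "{core}")
        return template.format(core=" ".join(core_words),
                               non_core=" ".join(non_core_words))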
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
FIG. 9 illustrates a schematic block diagram of an example electronic device 900 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the device 900 includes a computing unit 901, which can perform various appropriate actions and processes in accordance with a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902 and the RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
A number of components in the device 900 are connected to the I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, and the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, optical disk, or the like; and a communication unit 909 such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 901 performs the respective methods and processes described above, such as the method of generating an intention recognition model or the method of generating a question-answer reply. For example, in some embodiments, the method of generating an intention recognition model or the method of generating a question-answer reply may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the above-described method of generating an intention recognition model or method of generating a question-answer reply may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured by any other suitable means (e.g., by means of firmware) to perform the method of generating an intention recognition model or the method of generating a question-answer reply.
Various implementations of the systems and techniques described herein above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from a storage system, at least one input device, and at least one output device, and transmits data and instructions to the storage system, the at least one input device, and the at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
Artificial intelligence is the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, planning, etc.), and it involves technologies at both the hardware level and the software level. Artificial intelligence hardware technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision technology, speech recognition technology, natural language processing technology, machine learning/deep learning, big data processing technology, knowledge graph technology, and the like.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in this disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions mentioned in this disclosure can be achieved, and are not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. A method of generating an intent recognition model, comprising:
acquiring a vector of search information, a vector of a core word in the search information, and a vector of a non-core word in the search information;
obtaining a user type identification characteristic vector according to the vector of the search information and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information and the vector of the non-core word;
and training by utilizing the user type identification characteristic vector and the corresponding user type label as well as the question and answer demand identification characteristic vector and the corresponding question and answer demand label to obtain an intention identification model.
2. The method according to claim 1, wherein the training with the user type recognition feature vector and the corresponding user type tag and the question-answer requirement recognition feature vector and the corresponding question-answer requirement tag to obtain the intention recognition model comprises:
respectively inputting the user type identification characteristic vector and the question and answer demand identification characteristic vector into an intention identification model to obtain a prediction result corresponding to the user type identification characteristic vector and a prediction result corresponding to the question and answer demand identification characteristic vector;
determining a loss function according to a prediction result and a user type label corresponding to the user type identification characteristic vector and a prediction result and a question and answer demand label corresponding to the question and answer demand identification characteristic vector;
and adjusting parameters of the intention recognition model based on the loss function until the loss function is converged to obtain the trained intention recognition model.
3. The method according to claim 2, wherein the determining a loss function according to the prediction result and the user type tag corresponding to the user type identification feature vector and the prediction result and the question-answer requirement tag corresponding to the question-answer requirement identification feature vector comprises:
and determining a loss function according to the prediction result corresponding to the user type identification characteristic vector, the user type label and the corresponding first weight, and the prediction result corresponding to the question-answer demand identification characteristic vector, the question-answer demand label and the corresponding second weight.
4. The method of claim 3, wherein a sum of the first weight and the second weight is 1.
5. The method of any one of claims 1-4, wherein the obtaining a vector of search information, a vector of core words in search information, and a vector of non-core words in search information comprises:
acquiring search information;
extracting core words and non-core words in the search information;
and respectively inputting the search information, the core words and the non-core words into a pre-trained word vector model to obtain the vectors of the search information, the core words and the non-core words.
6. The method of claim 5, wherein, when there are a plurality of pieces of search information, the extracting core words and non-core words in the search information comprises:
and extracting core words and non-core words in the target search information in response to the non-Chinese ratio in the target search information being smaller than or equal to a second preset value, wherein the target search information is search information in which the ratio of the number of the core words extracted from the target search information to the number of the plurality of search information is larger than a first preset value.
7. A method of generating a question-answer reply, comprising:
acquiring search information to be predicted, and core words and non-core words in the search information to be predicted;
respectively inputting search information to be predicted, core words and non-core words into a pre-trained word vector model to obtain vectors of the search information to be predicted, vectors of the core words and vectors of the non-core words;
obtaining a user type identification characteristic vector according to the vector of the search information to be predicted and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information to be predicted and the vector of the non-core word;
respectively inputting the user type identification feature vector and the question-answer demand identification feature vector into the intention identification model generated by the method according to any one of claims 1 to 6, and obtaining a user type label corresponding to the user type identification feature vector and a question-answer demand label corresponding to the question-answer demand identification feature vector, wherein the question-answer demand label is used for representing whether the search information to be predicted has a question-answer demand;
and responding to the question-answering requirement label as having the question-answering requirement, and determining corresponding reply information according to the user type label, the core words and the non-core words.
8. An apparatus to generate an intent recognition model, comprising:
a vector acquisition module configured to acquire a vector of the search information, a vector of core words in the search information, and a vector of non-core words in the search information;
the vector obtaining module is configured to obtain a user type identification feature vector according to the vector of the search information and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information and the vector of the non-core word;
and the model training module is configured to train by utilizing the user type identification characteristic vector and the corresponding user type label as well as the question and answer demand identification characteristic vector and the corresponding question and answer demand label to obtain an intention identification model.
9. The apparatus of claim 8, wherein the model training module comprises:
the result prediction unit is configured to input the user type identification feature vector and the question and answer demand identification feature vector into the intention identification model respectively to obtain a prediction result corresponding to the user type identification feature vector and a prediction result corresponding to the question and answer demand identification feature vector;
a function determination unit configured to determine a loss function according to a prediction result and a user type tag corresponding to the user type identification feature vector, and a prediction result and a question-and-answer demand tag corresponding to the question-and-answer demand identification feature vector;
and the model training unit is configured to adjust parameters of the intention recognition model based on the loss function until the loss function is converged to obtain the trained intention recognition model.
10. The apparatus of claim 9, wherein the function determination unit is further configured to:
and determining a loss function according to the prediction result corresponding to the user type identification characteristic vector, the user type label and the corresponding first weight, and the prediction result corresponding to the question-answer demand identification characteristic vector, the question-answer demand label and the corresponding second weight.
11. The apparatus of claim 10, wherein a sum of the first weight and the second weight is 1.
12. The apparatus of any of claims 8-11, wherein the vector acquisition module comprises:
a search information acquisition unit configured to acquire search information;
a word extraction unit configured to extract a core word and a non-core word in the search information;
and the vector obtaining unit is configured to input the search information, the core words and the non-core words into a word vector model trained in advance respectively to obtain the vectors of the search information, the core words and the non-core words.
13. The apparatus of claim 12, wherein the word extraction unit is further configured to:
and extracting core words and non-core words in the target search information in response to the non-Chinese ratio in the target search information being smaller than or equal to a second preset value, wherein the target search information is search information in which the ratio of the number of the core words extracted from the target search information to the number of the plurality of search information is larger than a first preset value.
14. An apparatus to generate a question-answer reply, comprising:
the information acquisition module is configured to acquire search information to be predicted and core words and non-core words in the search information to be predicted;
the vector extraction module is configured to input search information to be predicted, core words and non-core words into a pre-trained word vector model respectively to obtain vectors of the search information to be predicted, the core words and the non-core words;
the vector obtaining module is configured to obtain a user type identification feature vector according to the vector of the search information to be predicted and the vector of the core word; obtaining a question-answer demand identification characteristic vector according to the vector of the search information to be predicted and the vector of the non-core word;
a tag obtaining module configured to input the user type identification feature vector and the question and answer demand identification feature vector into the intention identification model generated by the apparatus according to any one of claims 8 to 13, respectively, to obtain a user type tag corresponding to the user type identification feature vector and a question and answer demand tag corresponding to the question and answer demand identification feature vector, where the question and answer demand tag is used for representing whether the search information to be predicted has a question and answer demand;
and the reply determining module is configured to respond to the question-answer requirement label as having the question-answer requirement, and determine corresponding reply information according to the user type label, the core words and the non-core words.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.
CN202111152002.9A 2021-09-29 2021-09-29 Method, apparatus, medium, and program product for generating an intent recognition model Pending CN113886543A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111152002.9A CN113886543A (en) 2021-09-29 2021-09-29 Method, apparatus, medium, and program product for generating an intent recognition model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111152002.9A CN113886543A (en) 2021-09-29 2021-09-29 Method, apparatus, medium, and program product for generating an intent recognition model

Publications (1)

Publication Number Publication Date
CN113886543A (en) 2022-01-04

Family

ID=79008090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111152002.9A Pending CN113886543A (en) 2021-09-29 2021-09-29 Method, apparatus, medium, and program product for generating an intent recognition model

Country Status (1)

Country Link
CN (1) CN113886543A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521841A (en) * 2023-04-18 2023-08-01 百度在线网络技术(北京)有限公司 Method, device, equipment and medium for generating reply information
CN116521841B (en) * 2023-04-18 2024-05-14 百度在线网络技术(北京)有限公司 Method, device, equipment and medium for generating reply information

Similar Documents

Publication Publication Date Title
CN113326764A (en) Method and device for training image recognition model and image recognition
CN112487173B (en) Man-machine conversation method, device and storage medium
CN113901907A (en) Image-text matching model training method, image-text matching method and device
EP4113357A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN113590776A (en) Text processing method and device based on knowledge graph, electronic equipment and medium
CN114548110A (en) Semantic understanding method and device, electronic equipment and storage medium
CN114881129A (en) Model training method and device, electronic equipment and storage medium
CN117114063A (en) Method for training a generative large language model and for processing image tasks
CN113268560A (en) Method and device for text matching
CN112580732A (en) Model training method, device, equipment, storage medium and program product
CN113360711A (en) Model training and executing method, device, equipment and medium for video understanding task
CN114186681A (en) Method, apparatus and computer program product for generating model clusters
CN114037059A (en) Pre-training model, model generation method, data processing method and data processing device
CN112560461A (en) News clue generation method and device, electronic equipment and storage medium
CN116955561A (en) Question answering method, question answering device, electronic equipment and storage medium
CN115062718A (en) Language model training method and device, electronic equipment and storage medium
CN112926308A (en) Method, apparatus, device, storage medium and program product for matching text
CN113705192B (en) Text processing method, device and storage medium
CN113392920B (en) Method, apparatus, device, medium, and program product for generating cheating prediction model
CN113792876A (en) Backbone network generation method, device, equipment and storage medium
CN113157877A (en) Multi-semantic recognition method, device, equipment and medium
CN112948584A (en) Short text classification method, device, equipment and storage medium
CN117370524A (en) Training method of reply generation model, reply sentence generation method and device
CN113886543A (en) Method, apparatus, medium, and program product for generating an intent recognition model
CN116383382A (en) Sensitive information identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination